flowchart LR
  A(Link to video) --> B(Praat TextGrid)
From videos to TextGrids
2024-03-28
Phonetic Extraction and Alignment of Subtitled YouTube Videos
#! /bin/sh
SPPAS (Bigi 2012; Bigi and Hirst 2012)
P2FA (Yuan and Liberman 2008)
PRAAT (Boersma and Weenink 2019)
R (R Core Team 2023)
yt-dlp (yt-dlp 2022)
ffmpeg (Developers 2021)
the Longman Pronunciation Dictionary (Wells 2008)
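As a rough illustration of how the first two command-line tools in this list might be called, here is a minimal Python sketch that only builds the argument lists. The function names, the `%(id)s.%(ext)s` output template, the WebVTT subtitle format, and the 16 kHz mono WAV target are assumptions made for the example, not settings taken from the actual pipeline.

```python
from pathlib import Path


def ytdlp_command(links_file: str, out_dir: str, lang: str = "en") -> list[str]:
    """Build a yt-dlp call that fetches each listed video and its subtitles."""
    return [
        "yt-dlp",
        "--batch-file", links_file,   # text file with one URL per line
        "--write-subs",               # also save the subtitle track
        "--sub-langs", lang,          # assumed: English subtitles
        "--sub-format", "vtt",        # assumed: WebVTT output
        "--output", f"{out_dir}/%(id)s.%(ext)s",
    ]


def ffmpeg_command(video: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg call that extracts a mono WAV track from a video."""
    wav = str(Path(video).with_suffix(".wav"))
    # -vn drops the video stream, -ac 1 downmixes to mono, -ar sets the rate.
    return ["ffmpeg", "-i", video, "-vn", "-ac", "1", "-ar", str(sample_rate), wav]
```

Either list can be passed to `subprocess.run(cmd, check=True)` once both tools are installed; keeping the commands as lists avoids shell-quoting problems with unusual file names.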
flowchart LR
  subgraph S1[Step 1]
    direction TB
    A(fa:fa-file-lines list of links) --> |yt-dlp| B(fa:fa-video video)
    A --> |yt-dlp| C(fa:fa-closed-captioning subtitles)
    B -->|ffmpeg| D(fa:fa-file-audio fa:fa-regular far:fa-square Main Audio TG)
    C -->|praat| D
    D -->|praat| E1(fa:fa-file-audio far:fa-square)
    D -->|praat| E2(fa:fa-file-audio far:fa-square)
    D -->|praat| E3(fa:fa-file-audio far:fa-square)
  end
  subgraph S2[Step 2]
    direction LR
    F1(fa:fa-file-audio far:fa-square) -->|SPPAS| G1(fa:fa-table-cells-large Segm TG)
    F2(fa:fa-file-audio far:fa-square) -->|SPPAS| G2(fa:fa-table-cells-large Segm TG)
    F3(fa:fa-file-audio far:fa-square) -->|SPPAS| G3(fa:fa-table-cells-large Segm TG)
    F1(fa:fa-file-audio far:fa-square) -->|fa:fa-align P2FA| H1(fa:fa-table-cells-large Segm TG)
    F2(fa:fa-file-audio far:fa-square) -->|fa:fa-align P2FA| H2(fa:fa-table-cells-large Segm TG)
    F3(fa:fa-file-audio far:fa-square) -->|fa:fa-align P2FA| H3(fa:fa-table-cells-large Segm TG)
    G1--> GH(fa:fa-file-audio fa:fa-table-cells-large)
    H1--> GH
    G2--> GH
    H2--> GH
    G3--> GH
    H3--> GH
  end
  subgraph S3[Step 3]
    direction TB
    I(fa:fa-file-audio fa:fa-table-cells-large) -->|praat| K(fa:fa-table-cells Segm Syll TG)
    J(fa:fa-book LPD) -->|praat|K
    K -->|R| L(fa:fa-file-csv spreadsheets)
    L -->|R| M(fa:fa-chart-line vocalic diagnoses)
  end
  S1 --> S2
  S2 --> S3
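In the diagram, the subtitles and the extracted audio meet in a main TextGrid, a step the pipeline performs with Praat. To make the target file format concrete, the sketch below writes subtitle cues into a one-tier TextGrid in Praat's long text format using Python instead. The function name, the tier name `subtitles`, and the `(start, end, text)` cue tuples are hypothetical; parsing the subtitle file itself is left out.

```python
def cues_to_textgrid(cues, duration, tier="subtitles"):
    """Return Praat long-format TextGrid text for (start, end, text) cues."""
    # Interval tiers must cover the whole time span without gaps,
    # so silences between cues become empty intervals.
    intervals, cursor = [], 0.0
    for start, end, text in sorted(cues):
        if start > cursor:
            intervals.append((cursor, start, ""))
        start = max(start, cursor)  # clip cues that overlap the previous one
        if end > start:
            intervals.append((start, end, text))
            cursor = end
    if cursor < duration:
        intervals.append((cursor, duration, ""))

    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {duration}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier}"',
        "        xmin = 0",
        f"        xmax = {duration}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (xmin, xmax, text) in enumerate(intervals, start=1):
        escaped = text.replace('"', '""')  # Praat escapes quotes by doubling
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {xmin}",
            f"            xmax = {xmax}",
            f'            text = "{escaped}"',
        ]
    return "\n".join(lines) + "\n"
```

For example, `cues_to_textgrid([(0.5, 1.0, "hello")], 3.0)` yields three intervals: leading silence, the cue, and trailing silence. Each labelled interval can then serve as the cut point for one audio chunk.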
Two aligners are better than one
Step 2 prevents cascading alignment errors
Added values:
Data aligned by SPPAS.
Number of TextGrids: 453
Total length of the videos: 172:39:22 (hh:mm:ss)
Data on monophthongs.
References: Deterding (1997)
References: Hillenbrand et al. (1995)
Next are the formant tracks for monophthongs.
Data on diphthongs.
Next are the formant tracks for diphthongs.
Data aligned by P2FA.
Data on monophthongs.
References: Deterding (1997)
References: Hillenbrand et al. (1995)
Next are the formant tracks for monophthongs.
Data on diphthongs.
Next are the formant tracks for diphthongs.
Let’s now compare the data obtained with the two aligners.
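One simple way to put numbers on such a comparison is to measure how far the two aligners place the boundaries of the same word. The sketch below is an illustration under stated assumptions, not the analysis behind the figures: it assumes each aligner's word tier has already been read into a list of `(start, end, word)` tuples for the same audio chunk, and the function names are hypothetical.

```python
from statistics import mean


def boundary_shifts(words_a, words_b):
    """Pair words in order and return (word, start shift, end shift) in seconds."""
    shifts = []
    for (start_a, end_a, word_a), (start_b, end_b, word_b) in zip(words_a, words_b):
        if word_a.lower() != word_b.lower():
            continue  # skip positions where the two tiers disagree on the word
        shifts.append((word_a.lower(), abs(start_a - start_b), abs(end_a - end_b)))
    return shifts


def mean_shift(shifts):
    """Average absolute boundary shift, or None if no words were paired."""
    values = [d for _, d_start, d_end in shifts for d in (d_start, d_end)]
    return mean(values) if values else None


sppas = [(0.0, 0.5, "the"), (0.5, 1.0, "cat")]
p2fa = [(0.0, 0.25, "THE"), (0.25, 1.0, "cat")]
print(mean_shift(boundary_shifts(sppas, p2fa)))  # 0.125
```

Pairing by position is deliberately naive: if one aligner drops or splits a word, later pairs are skipped rather than realigned, so a low pairing rate is itself a sign that the two segmentations diverge.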
but potential issues with Wells (2008)
create interactive website
upload interactive diagnoses
adrienmeli@gmail.com
ALOES 2024 pre-conference workshop