Lasisi Adeiza Isiaka


Exploiting machine algorithms in vocalic quantification of African English corpora
Lasisi Adeiza Isiaka
Proceedings of the 2019 Workshop on Widening NLP

Towards procedural fidelity in the processing of African English speech corpora, this work demonstrates how the adaptation of machine-assisted segmentation of phonemes and automatic extraction of acoustic values can significantly speed up the processing of naturalistic data and make the vocalic analysis of the varieties less impressionistic. Research in African English phonology has, till date, been least data-driven – much less the use of comparative corpora for cross-varietal assessments. Using over 30 hours of naturalistic data (from 28 speakers in 5 Nigerian cities), the procedures for segmenting audio files into phonemic units via the Munich Automatic Segmentation System (MAUS), and the extraction of their spectral values in Praat are explained. Evidence from the speech corpora supports a more complex vocalic inventory than attested in previous auditory/manual-based accounts – thus reinforcing the resourcefulness of the algorithms for the current data and cognate varieties. Keywords: machine algorithms; naturalistic data; African English phonology; vowel segmentation