Audio to phone transcription – UROP Spring Symposium 2021

Jack Moeser


Pronouns: He/him

Research Mentor(s): San Duanmu, Professor
Research Mentor School/College/Department: Linguistics, College of Literature, Science, and the Arts
Presentation Date: Thursday, April 22, 2021
Session: Session 4 (2pm-2:50pm)
Breakout Room: Room 15
Presenter: 8

By the end of this century, at least 50% of the world’s 6,000 languages are projected to go extinct (UNESCO 2013), meaning that they will have no living, fluent speakers and will not have been transcribed. Linguists can transcribe a language phonetically from recorded samples, but the process is highly time-consuming and requires a large quantity of recordings, so humans cannot transcribe languages efficiently. This process should be automated, but no adequate program is currently available. The goal of this project is to develop a program that transcribes audio files into phonetic symbols in order to facilitate the preservation of endangered languages. To accomplish this, short audio samples of Australian English, a known language, will be segmented and phonetically transcribed by hand using the software Praat (Boersma & Weenink 2020). Major classes of phonetic sounds will be classified using the acoustic information in the audio file, including amplitude, pitch, and frequency spectrum, as well as data that can be derived from it, such as the rate at which the waveform crosses the horizontal axis (the zero-crossing rate). By analyzing this data in Microsoft Excel, we hope to distinguish and notate different phonetic sounds automatically, using only information that can be obtained without any knowledge of the language.

References:
Boersma, Paul, & David Weenink. 2020. Praat: doing phonetics by computer [Computer program]. Version 6.1.16, retrieved 6 June 2020.
UNESCO. 2013. Endangered languages. Accessed 21 August 2013.
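As an illustration of the derived measure mentioned in the abstract (the rate at which the waveform crosses the horizontal axis), the following is a minimal Python sketch. It is not the project's method, which relies on Praat and Microsoft Excel. The function names, the 8000 Hz sample rate, and the synthetic sine tones standing in for recorded speech are all assumptions made for this example. The general idea is that higher-frequency or noise-like sounds tend to cross zero more often than lower-frequency periodic ones, so the crossing rate could plausibly serve as one cue for separating broad sound classes.

```python
import math


def zero_crossing_rate(samples, sample_rate):
    """Return the number of sign changes per second in a list of amplitudes."""
    if len(samples) < 2:
        return 0.0
    crossings = 0
    prev = samples[0]
    for cur in samples[1:]:
        # A crossing occurs when consecutive samples lie on opposite
        # sides of the horizontal axis (zero is treated as non-negative).
        if (prev < 0 <= cur) or (cur < 0 <= prev):
            crossings += 1
        prev = cur
    duration_s = len(samples) / sample_rate
    return crossings / duration_s


def rms_amplitude(samples):
    """Return the root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def synth_tone(freq_hz, sample_rate, n_samples):
    """Hypothetical helper: a pure sine tone used in place of real audio."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


# Assumed values for the demonstration only.
SAMPLE_RATE = 8000
low_tone = synth_tone(100, SAMPLE_RATE, SAMPLE_RATE)    # 1 s at 100 Hz
high_tone = synth_tone(1000, SAMPLE_RATE, SAMPLE_RATE)  # 1 s at 1000 Hz

print("low tone ZCR:", zero_crossing_rate(low_tone, SAMPLE_RATE))
print("high tone ZCR:", zero_crossing_rate(high_tone, SAMPLE_RATE))
print("low tone RMS:", rms_amplitude(low_tone))
```

A pure tone crosses zero about twice per cycle, so the crossing rate of the 1000 Hz tone comes out roughly ten times that of the 100 Hz tone. On real recordings the measure would normally be computed over short frames rather than over a whole file, so that it can be compared against hand-labelled segments.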

Authors: Jack Moeser, San Duanmu
Research Method: Qualitative Study
