Research Mentor(s): Yoonjeong Lee
Authors: Yoojin Kwon, Yoonjeong Lee, Jelena Krivokapić
In everyday conversation, speakers move their vocal tract speech organs differently to express various linguistic intentions, grouping spoken words together into chunks (i.e., prosodic phrasing) and highlighting important information by emphasizing a certain word (i.e., prominence). Speakers also often use other body parts (e.g., the hands, fingers, eyebrows, and head) to communicate; these movements are referred to as co-speech gestures. Previously, only a few studies have investigated the temporal relationship between co-speech body gestures and vocal tract gestures, and most have focused on languages typologically similar to English, in which speakers adjust their voice pitch when emphasizing an important word in the spoken utterance. This study focuses on co-speech eyebrow gestures in Seoul Korean, a language that is typologically different from English and expresses prominence by starting a new phrase rather than by raising pitch. Our recent study (Lee, Krivokapić & Purse, 2022) has shown that manual gestures are temporally coordinated with speech phrases in Seoul Korean, in that up-and-down hand movements start and end synchronously with the phrase. This differs from the speech-to-co-speech coordination patterns reported in previous studies of English, where the point of highest pitch aligns with the movement peak of the hand. We hypothesize that eyebrow movements are coordinated with phrases in Seoul Korean just like manual movements, but in a different way due to the different spatiotemporal scopes of manual (larger) and eyebrow (smaller) movements. This study used electromagnetic articulography and camcorders to collect multimodal speech data (video, audio, and speech kinematic data) from eight Seoul Korean speakers (5F, 3M) reading short stories, and analyzed the temporal coordination between vertical eyebrow movements and the accompanying acoustic signals.
The perceived eyebrow movements of each participant recorded on the face camera were first labeled using Adobe Premiere Pro; the kinematic data of the visible eyebrow movements and the nearby vocal tract movements were analyzed using MATLAB; and the acoustic signals of the spoken phrases, including pitch, were further labeled using Praat. The labeled multimodal signals were then temporally aligned for the coordination analysis. Overall, up-and-down eyebrow movements are observed when speakers highlight certain words, and they are temporally synchronous with the spoken phrases, as hypothesized. However, eyebrow gestures differ from manual gestures in that the small and rapid eyebrow movements are local, occurring briefly at the beginning and end of a phrase rather than spanning the entire phrase. Additionally, eyebrow movements are often observed when speakers produce wh-words (e.g., who, what, when, where, why, and how) and interrogative phrases. The results reveal that the coordination targets in Korean are indeed the phonetic correlates of phrase edges, not the highest pitch of the phrase. This study contributes to our understanding of how different modalities are recruited and temporally coordinated in expressing prosodic phrasing and prominence across languages.
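The coordination analysis described above rests on comparing the timestamps of labeled eyebrow movement intervals against the edges of the spoken phrases they accompany. A minimal sketch of that comparison is below; the interval values, variable names, and function are illustrative assumptions, not the study's actual MATLAB pipeline or its data.

```python
# Minimal sketch (hypothetical data, not from the study): measuring how a
# labeled eyebrow movement interval aligns with the edges of a spoken phrase.
# All timestamps are in seconds relative to a shared multimodal timeline.

# Hypothetical labeled (onset, offset) intervals after temporal alignment
eyebrow_intervals = [(0.42, 0.78), (2.10, 2.45)]
phrase_intervals = [(0.40, 1.95), (2.05, 3.60)]

def boundary_lags(gesture, phrase):
    """Lags of a gesture's onset/offset relative to the phrase edges.

    A positive onset lag means the gesture starts after the phrase begins;
    a negative offset lag means the gesture ends before the phrase ends.
    """
    onset_lag = gesture[0] - phrase[0]
    offset_lag = gesture[1] - phrase[1]
    return onset_lag, offset_lag

for eyebrow, phrase in zip(eyebrow_intervals, phrase_intervals):
    onset_lag, offset_lag = boundary_lags(eyebrow, phrase)
    print(f"onset lag: {onset_lag:+.2f} s, offset lag: {offset_lag:+.2f} s")
```

Under this framing, a near-zero onset lag combined with a large negative offset lag would correspond to the reported pattern of short, local eyebrow movements anchored at phrase edges rather than spanning the whole phrase.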