LSLT - Joselyn Rodriguez / Probing Computational Speech Representations for Cross-Linguistic Differences

Linguistics PhD student Joselyn Rodriguez, standing in front of a chalkboard, smiling at the camera.
Linguistics · Thursday, March 28, 2024, 12:30 pm – 1:30 pm · H. J. Patterson Hall

On March 28, Joselyn Rodriguez presents an LSLT talk on her research, asking how computational speech models might be used as research tools in the cognitive science of speech perception.


Recent advances in computational speech models have led to interest in their efficacy as cognitive models of human speech perception. While recent research in machine learning has examined their effectiveness on downstream tasks such as speech recognition and speaker identification, much remains unknown about the structure of these models' learned representations. Whether they display patterns similar to human speech representations is still uncertain. The current work explores whether self-supervised speech models share one property of human speech representations: language specificity. Previous research has suggested that pre-trained transformer-based speech models display "universal"-like performance in cross-linguistic categorization tasks; however, what it means for a model to display universal performance cross-linguistically remains underexplored. The current work examines the performance of a Hindi-trained speech model and an English-trained speech model, showing that both perform worse on a cross-linguistically challenging contrast (Hindi dental and retroflex sounds), but that this performance does not necessarily mirror the pattern of difficulties shown by naive non-native human listeners.
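The kind of cross-linguistic categorization probe described above is often run as an ABX discrimination test on a model's learned representations: given tokens A and B from two phone categories and a held-out token X, the model "discriminates" the contrast if X lies closer to the same-category token. The sketch below is illustrative only, using synthetic vectors in place of real model representations (the actual models, layers, and distance metrics used in the talk are not specified here):

```python
import numpy as np

def abx_score(a_reps, b_reps, x_reps):
    """Fraction of (A, B, X) triplets in which X lies closer to A
    (the same-category token) than to B (the other category)."""
    correct, total = 0, 0
    for x in x_reps:
        for a in a_reps:
            for b in b_reps:
                correct += np.linalg.norm(x - a) < np.linalg.norm(x - b)
                total += 1
    return correct / total

# Synthetic stand-ins for frame-level representations of two phone
# categories (e.g. dental vs. retroflex stops); real probes would use
# activations extracted from a pre-trained speech model.
rng = np.random.default_rng(0)
dental = rng.normal(0.0, 0.1, size=(5, 16))
retroflex = rng.normal(1.0, 0.1, size=(5, 16))

# X tokens drawn from the dental category: a representation space that
# separates the contrast should score near 1.0; chance is 0.5.
score = abx_score(dental, retroflex, rng.normal(0.0, 0.1, size=(5, 16)))
```

A language-specific model would be expected to score lower on contrasts absent from its training language (as non-native listeners do), which is exactly the comparison the talk draws between the Hindi- and English-trained models.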