Today's Terrapins up at BU
November 01, 2024
On acquiring determiners, connectives and A'-dependencies, in the human way.
November is BUCLD season, and several Terrapins are up in Beantown presenting their work. There are four talks involving current Marylanders, abstracted below. Jack Ying has a talk on "Overcoming performance issues: Children respect presuppositions of “the”-expressions," co-authored with Alexander, Valentine and Jeff, which won the Jean Berko Gleason Award for "outstanding student paper". Elizabeth Swanson has a talk showing that "Children’s difficulty comprehending ‘but’ is linked to revision," joint work with Ana Antonio and (former Maryland visitor) Alex de Carvalho. Katherine Howitt is presenting her work with Sathvik Nair, Allison Dods and undergraduate RA Robert Hopkins in a talk called "LMs are not good proxies for human language learners." And recent RA Nathalie Fernandez has a talk with Naomi, Rose Griffin and Patrick Shafto on "Characterizing language learning trajectories with optimal transport." In addition, Naomi and Leslie Famularo are discussants for a special Symposium on "Language Models and Language Acquisition", together with Virginia Valian, Judit Gervain, Qihui Xu and Xiaomeng Ma.
There are also several presentations by alumni. Tyler Knowlton *21 has a poster with John Trueswell and Anna Papafragou on "(Im)possible determiners and their learnability." Laurel Perkins *19 has a poster with Victoria Mateu and Nina Hyams, showing that "28-month-olds use inferred thematic relations to bootstrap intransitive verb meanings." Julie Gerard *16 has a talk with several collaborators arguing that "(All) pronouns are difficult, but not delayed: Evidence in favour of early Principle B acquisition." Utako Minai *06 has a talk on "Pragmatic factors facilitating children’s universal quantification: Evidence from child Turkish." Acrisio Pires *01 is co-author of a talk on "Knowledge of morphological case in adult Heritage Western Armenian." Former Baggett Fellow Margaret Kandel has a paper with Nan Li and Jesse Snedeker giving "Evidence for top-down constraints and form-based prediction in 4–5 year-olds’ lexical processing." And former undergraduate RA Lilliana Righter '20 has two talks with former Baggett Fellow Elika Bergelson, in whose lab Lily is currently lab manager: "Connections between real-time point comprehension and overall gesture and word knowledge in infancy," and "Emerging phonological and semantic specificity in infant’s lexical processing."
Overcoming performance issues: Children respect presuppositions of “the”-expressions / Ying, Williams, Hacquard and Lidz
Elicited production studies have suggested that children up to the age of 5 sometimes use singular “the”-expressions in non-adult-like ways, i.e., when the referent is not familiar to interlocutors or not unique in the domain. However, unnatural settings in prior studies may be responsible for children’s production errors. We present a new elicitation-through-conversation study that creates a more natural production setting. By controlling for referent salience and incorporating natural turn-taking, we find that 4-year-olds, just like adults, respect the presuppositions of “the” even in elicited production. They never used definites when intended referents were unfamiliar to the listener; after introducing referents, they mostly used definites to refer to a unique referent and much less so with a non-unique one. This implies, contrary to prior work, that we have little reason to believe that children have the wrong meaning for “the” or lack the pragmatic capacity to use it properly.
Children’s difficulty comprehending ‘but’ is linked to revision / Swanson, Antonio and de Carvalho
Children have been found to struggle with interpreting the connective ‘but’ before age seven. However, we provide new evidence that their difficulty is not merely due to a misinterpretation of ‘but’ itself, but rather involves difficulty with revising their initial expectations. In two experiments, we examined French-speaking 6-to-9-year-olds’ interpretations of novel words in sentences containing ‘alors’ (‘so’) or ‘mais’ (‘but’), e.g., “Lea wanted to heat up her food, [so/but] she put it in the rane.” By varying whether the final interpretation of the novel word required revising the expectations created by the initial clause, we found that children successfully interpreted both ‘so’ and ‘but’ in sentences that did not require revision, but they failed in sentences requiring revision. Regardless of connective type, even by age nine, children struggled to regulate revision. We discuss possible factors contributing to variation in children’s success, including the effect of socioeconomic status.
LMs are not good proxies for human language learners / Nair, Howitt, Dods and Hopkins
We look for a shared underlying representation of diverse surface forms of filler-gap dependencies (FGDs) that could be exploited for learning. We test a large language model (LM), explicitly controlling its input, to uncover whether it posits such a generalization. Recent successes of LMs call into question what aspects of linguistic theory are necessary for acquiring language: they can generate natural language and determine relative acceptability of sentence strings. Some researchers argue these successes replace syntactic theory in explaining how children arrive at a grammar. We show that while LMs have limited success differentiating grammatical from ungrammatical FGDs (Table 1), they rely on specific properties of the input, rather than making a human-like generalization across FGDs. Our work reiterates the importance of constrained hypothesis spaces for acquiring human-like structural generalizations and reflects on the limited potential of LMs, which lack specific linguistic biases, to model language acquisition.
Characterizing language learning trajectories with optimal transport / Fernandez, Griffin, Shafto and Feldman
Children are often characterized as learning language efficiently from remarkably little data. We formalize the idea of efficient learning using optimal transport, an area of mathematics that formalizes how to efficiently move between probability distributions, through a case study on determiner acquisition. To apply optimal transport to language acquisition, we characterize language as a distribution over linguistic features and measure the child’s movement toward the caregiver’s distribution. Specifically, we characterize a speaker’s determiner production as a multinomial distribution over three possible outcomes: omitting the required determiner, producing a definite article (the), or producing an indefinite article (a or an). Contrary to the predictions of previous theories, results showed that the child moved toward the caregiver’s distribution, but the trajectory veered strongly in the direction of the indefinite determiner, relative to the parent’s distribution, before moving back toward producing the definite determiner.
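For readers curious what "moving between distributions" can look like in practice, here is a minimal Python sketch of an optimal transport computation over the three determiner outcomes named in the abstract. It is not the authors' method: the abstract does not say which cost function they use, so the sketch assumes a simple 0/1 cost (moving probability mass between two different outcomes costs 1, leaving it in place costs 0), under which the transport cost equals the total variation distance. The function names and the two example distributions are invented purely for illustration and are not data from the study.

```python
from typing import List, Sequence

# The three outcomes named in the abstract.
OUTCOMES = ("omit", "definite", "indefinite")


def transport_plan(p: Sequence[float], q: Sequence[float]) -> List[List[float]]:
    """Return a plan moving distribution p onto q.

    Assumes a 0/1 cost between outcomes (an assumption of this sketch),
    for which keeping as much mass in place as possible is optimal.
    plan[i][j] is the mass moved from outcome i to outcome j.
    """
    n = len(p)
    plan = [[0.0] * n for _ in range(n)]
    # Mass that stays on the same outcome is free, so keep all of it.
    for i in range(n):
        plan[i][i] = min(p[i], q[i])
    surplus = [p[i] - plan[i][i] for i in range(n)]  # mass p must give away
    deficit = [q[j] - plan[j][j] for j in range(n)]  # mass q still needs
    i = j = 0
    eps = 1e-12
    # Under a 0/1 cost every off-diagonal move costs the same,
    # so any matching of surplus to deficit is optimal.
    while i < n and j < n:
        if surplus[i] <= eps:
            i += 1
            continue
        if deficit[j] <= eps:
            j += 1
            continue
        moved = min(surplus[i], deficit[j])
        plan[i][j] += moved
        surplus[i] -= moved
        deficit[j] -= moved
    return plan


def transport_cost(plan: Sequence[Sequence[float]]) -> float:
    """Total mass moved between different outcomes (the 0/1-cost total)."""
    n = len(plan)
    return sum(plan[i][j] for i in range(n) for j in range(n) if i != j)


# Hypothetical distributions over (omit, definite, indefinite).
child = [0.60, 0.10, 0.30]
caregiver = [0.05, 0.55, 0.40]

plan = transport_plan(child, caregiver)
cost = transport_cost(plan)
print(dict(zip(OUTCOMES, child)), "->", dict(zip(OUTCOMES, caregiver)))
print("transport cost:", round(cost, 3))
```

Computing this cost between a child's distribution at successive ages and the caregiver's distribution would give one rough way to track a trajectory. A different ground cost, for instance one that treats the two articles as closer to each other than to omission, would need a general solver rather than this shortcut.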