
Computational Linguistics

At Maryland we use computation in two ways: to build formal models of language structure, processing and learning, and to build technologies that make use of human languages.

Computational linguistics at Maryland has two aspects. The first, known as "computational psycholinguistics," uses computational models to better understand how people comprehend, produce and learn language, and to characterize the human language capacity as a formal computational system. Researchers at Maryland have particular interests in using models to investigate problems in phonetics and phonology, psycholinguistics and language acquisition.

Computational linguistics also has a practical side, sometimes referred to as "natural language processing" or "human language technology." Here the goal is to make computers smarter about human language, improving the automated analysis and generation of text, with results that can interact effectively with other information systems.

These two strands of computational linguistics are connected by shared methods (such as Bayesian models), a shared concern with grounding theories in naturally occurring linguistic data and a shared view of language as a fundamentally computational system for which formally explicit models and theories can be specified, designed and tested.
 
Our department has close ties to the Computational Linguistics and Information Processing Laboratory (CLIP Lab) at UMD's Institute for Advanced Computer Studies, where colleagues from linguistics, computer science and the College of Information Studies (iSchool) work together to advance the state of the art in such areas as machine translation, automatic summarization, information retrieval, question answering and computational social science.

Modeling the learning of the Person Case Constraint

Adam Liter and Naomi Feldman show that the Person Case Constraint can be learned on the basis of significantly less data, if the constraint is represented in terms of feature bundles.

Linguistics

Contributor(s): Adam Liter, Naomi Feldman

Many domains of linguistic research posit feature bundles as an explanation for various phenomena. Such hypotheses are often evaluated on their simplicity (or parsimony). We take a complementary approach. Specifically, we evaluate different hypotheses about the representation of person features in syntax on the basis of their implications for learning the Person Case Constraint (PCC). The PCC refers to a phenomenon where certain combinations of clitics (pronominal bound morphemes) are disallowed with ditransitive verbs. We compare a simple theory of the PCC, where person features are represented as atomic units, to a feature-based theory of the PCC, where person features are represented as feature bundles. We use Bayesian modeling to compare these theories, using data based on realistic proportions of clitic combinations from child-directed speech. We find that both theories can learn the target grammar given enough data, but that the feature-based theory requires significantly less data, suggesting that developmental trajectories could provide insight into syntactic representations in this domain.
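To illustrate the kind of Bayesian comparison involved, here is a simplified sketch, not the authors' model: it compares two hypothetical hypothesis spaces over (indirect object, direct object) clitic person combinations, a large "atomic" space in which any subset of combinations is a candidate grammar, and a small "feature-based" space containing only a few grammars statable with person-feature bundles. With a size-principle likelihood, the smaller space concentrates posterior probability on the target (PCC-obeying) grammar after fewer observations. All grammars and data below are assumptions made for illustration.

# Simplified sketch of Bayesian grammar comparison for the PCC (illustrative only).
from itertools import chain, combinations

PERSONS = ["1", "2", "3"]
ALL_COMBOS = [(io, do) for io in PERSONS for do in PERSONS]   # (IO, DO) clitic person pairs
TARGET = {c for c in ALL_COMBOS if c[1] == "3"}               # strong PCC: direct object must be 3rd person

def powerset(xs):
    return chain.from_iterable(combinations(xs, r) for r in range(1, len(xs) + 1))

# Hypothetical hypothesis spaces: every non-empty subset of combinations ("atomic")
# versus a handful of grammars statable with person-feature bundles ("feature-based").
atomic_space = [frozenset(s) for s in powerset(ALL_COMBOS)]
feature_space = [frozenset(TARGET),
                 frozenset(ALL_COMBOS),
                 frozenset(c for c in ALL_COMBOS if c[0] == "3")]

def posterior_on_target(space, data):
    """Posterior probability of the target grammar after observing `data`,
    with a uniform prior over the space and size-principle likelihoods
    (each grammar assigns uniform probability to the combinations it licenses)."""
    weights = {}
    for g in space:
        if all(d in g for d in data):
            weights[g] = (1.0 / len(space)) * (1.0 / len(g)) ** len(data)
    total = sum(weights.values())
    return weights.get(frozenset(TARGET), 0.0) / total

# Hypothetical licit combinations a learner might hear in child-directed speech.
data = [("1", "3"), ("2", "3"), ("3", "3"), ("1", "3")]
print("atomic space:       ", posterior_on_target(atomic_space, data))
print("feature-based space:", posterior_on_target(feature_space, data))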

Learning an input filter for argument structure acquisition

How do children learn a verb's argument structure when their input contains non-basic clauses that obscure verb transitivity? Laurel Perkins shows that it might be enough for them to make a good guess about how likely they are to be wrong.

Linguistics

Non-ARHU Contributor(s): Laurel Perkins
How do children learn a verb's argument structure when their input contains non-basic clauses that obscure verb transitivity? Here we present a new model that infers verb transitivity by learning to filter out non-basic clauses that were likely parsed in error. In simulations with child-directed speech, we show that this model accurately categorizes the majority of 50 frequent transitive, intransitive and alternating verbs, and jointly learns appropriate parameters for filtering parsing errors. Our model is thus able to filter out problematic data for verb learning without knowing in advance which data need to be filtered.
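As a rough illustration of jointly inferring verb classes and a filtering parameter (a simplified sketch under my own assumptions, not the published implementation), the snippet below treats each clause as either correctly parsed or misparsed with probability epsilon, with misparsed clauses uninformative about whether the verb takes an object, and infers both epsilon and each verb's class from hypothetical counts.

# Simplified sketch of joint inference of transitivity class and an error-filtering rate.
import numpy as np

# Assumed object rates per class: transitive verbs always take an object,
# intransitives never, alternating verbs half the time.
CLASSES = {"transitive": 1.0, "intransitive": 0.0, "alternating": 0.5}

def joint_inference(verb_counts, eps_grid=np.linspace(0.01, 0.5, 50)):
    """verb_counts: dict verb -> (clauses with an object, clauses without).
    Returns the MAP error rate and each verb's posterior over classes at that rate."""
    log_post_eps = np.zeros_like(eps_grid)
    for i, eps in enumerate(eps_grid):
        for verb, (with_obj, without_obj) in verb_counts.items():
            lik = []
            for rate in CLASSES.values():
                # With probability eps the clause is misparsed, in which case
                # an object appears with uninformative probability 0.5.
                p_obj = (1 - eps) * rate + eps * 0.5
                lik.append(p_obj ** with_obj * (1 - p_obj) ** without_obj)
            log_post_eps[i] += np.log(np.mean(lik))   # uniform prior over classes
    eps_hat = eps_grid[int(np.argmax(log_post_eps))]
    per_verb = {}
    for verb, (with_obj, without_obj) in verb_counts.items():
        scores = {}
        for name, rate in CLASSES.items():
            p_obj = (1 - eps_hat) * rate + eps_hat * 0.5
            scores[name] = p_obj ** with_obj * (1 - p_obj) ** without_obj
        z = sum(scores.values())
        per_verb[verb] = {k: round(v / z, 3) for k, v in scores.items()}
    return eps_hat, per_verb

# Hypothetical (object, no-object) clause counts from child-directed speech.
eps_hat, posteriors = joint_inference({"hit": (40, 6), "sleep": (5, 45), "eat": (25, 20)})
print(eps_hat)
print(posteriors)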

A unified account of categorical effects in phonetic perception

A statistical model that explains both the strong categorical effects in the perception of consonants and the very weak effects in the perception of vowels.

Linguistics

Contributor(s): Naomi Feldman
Non-ARHU Contributor(s): Yakov Kronrod, Emily Coppess
Categorical effects are found across speech sound categories, with the degree of these effects ranging from extremely strong categorical perception in consonants to nearly continuous perception in vowels. We show that both strong and weak categorical effects can be captured by a unified model. We treat speech perception as a statistical inference problem, assuming that listeners use their knowledge of categories as well as the acoustics of the signal to infer the intended productions of the speaker. Simulations show that the model provides close fits to empirical data, unifying past findings of categorical effects in consonants and vowels and capturing differences in the degree of categorical effects through a single parameter.
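A minimal sketch of this style of inference (a simplification in the spirit of the model, not the published implementation): the listener assumes the speaker's intended target is drawn from a Gaussian category and then corrupted by Gaussian noise, and perceives the posterior mean, which pulls the stimulus toward the category mean. The ratio of noise variance to category variance then controls how categorical perception looks; all numbers below are illustrative.

# Simplified sketch of perception as Bayesian inference of the intended target.
import numpy as np

def percept(stimulus, cat_means, cat_var, noise_var):
    """Posterior mean of the intended target given a 1-D acoustic stimulus,
    marginalizing over which category the speaker intended."""
    cat_means = np.asarray(cat_means, dtype=float)
    # P(category | stimulus): each category predicts S ~ N(mu_c, cat_var + noise_var).
    log_p = -0.5 * (stimulus - cat_means) ** 2 / (cat_var + noise_var)
    p = np.exp(log_p - log_p.max())
    p /= p.sum()
    # Within a category, the posterior mean shrinks the stimulus toward the category mean.
    shrunk = (cat_var * stimulus + noise_var * cat_means) / (cat_var + noise_var)
    return float(p @ shrunk)

means = [0.0, 10.0]                       # two categories along one acoustic dimension
stimuli = np.linspace(0.0, 10.0, 11)

# Consonant-like setting: noise variance large relative to category variance -> strong warping.
print([round(percept(s, means, cat_var=1.0, noise_var=9.0), 2) for s in stimuli])
# Vowel-like setting: noise variance small relative to category variance -> nearly continuous.
print([round(percept(s, means, cat_var=9.0, noise_var=1.0), 2) for s in stimuli])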


Primary Faculty

Naomi Feldman

Associate Professor, Linguistics

1413 A Marie Mount Hall
College Park, MD 20742

(301) 405-5800

Philip Resnik

Professor, Linguistics

1401 C Marie Mount Hall
College Park, MD 20742

(301) 405-6760

Amy Weinberg

Professor Emerita, Linguistics

Secondary Faculty

William Idsardi

Professor, Linguistics

1401 A Marie Mount Hall
College Park, MD 20742

(301) 405-8376