
Research

Research at our top-ranked department spans syntax, semantics, phonology, language acquisition, computational linguistics, psycholinguistics and neurolinguistics. 

Connections between our core competencies are strong, with theoretical, experimental and computational work typically pursued in tandem.

A network of collaboration at all levels sustains a research climate that is both vigorous and friendly. Here new ideas develop in conversation, stimulated by the steady activity of our labs and research groups, frequent student meetings with faculty, regular talks by local and invited scholars and collaborations with the broader University of Maryland language science community, the largest and most integrated language science research community in North America.


A Formal Model of Ambiguity and its Applications in Machine Translation

A model of language processing for machine translation that copes with ambiguity by trafficking in weighted sets of multiple inputs and outputs, and choosing a single analysis only as a last resort.

Linguistics

Non-ARHU Contributor(s): Chris Dyer
Dates:
Systems that process natural language must cope with and resolve ambiguity. In this dissertation, a model of language processing is advocated in which multiple inputs and multiple analyses of inputs are considered concurrently and a single analysis is only a last resort. Compared to conventional models, this approach can be understood as replacing single-element inputs and outputs with weighted sets of inputs and outputs. Although processing components must deal with sets (rather than individual elements), constraints are imposed on the elements of these sets, and the representations from existing models may be reused. However, to deal efficiently with large (or infinite) sets, compact representations of sets that share structure between elements, such as weighted finite-state transducers and synchronous context-free grammars, are necessary. These representations and algorithms for manipulating them are discussed in depth. To establish the effectiveness and tractability of the proposed processing model, it is applied to several problems in machine translation. Starting with spoken language translation, it is shown that translating a set of transcription hypotheses yields better translations compared to a baseline in which a single (1-best) transcription hypothesis is selected and then translated, independent of the translation model formalism used. More subtle forms of ambiguity that arise even in text-only translation (such as decisions conventionally made during system development about how to preprocess text) are then discussed, and it is shown that the ambiguity-preserving paradigm can be employed in these cases as well, again leading to improved translation quality.
Also introduced is a model for supervised learning that learns from training data in which sets (rather than single elements) of correct labels are provided for each training instance. It is used to learn a model of compound word segmentation, which serves as a preprocessing step in machine translation.
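
As a rough illustration of the contrast drawn in this abstract, the sketch below compares a 1-best pipeline with one that carries a weighted set of transcription hypotheses into translation. It is a toy under stated assumptions, not the dissertation's system: the hypothesis names, weights, and candidate translations are invented, and the set is enumerated explicitly rather than stored in a compact representation such as a weighted finite-state transducer.

```python
# Toy illustration only: all hypotheses, weights, and translations are invented.

# Weighted set of transcription hypotheses from a hypothetical speech recognizer.
TRANSCRIPTIONS = {"hyp_a": 0.55, "hyp_b": 0.45}

# Hypothetical translation model: weighted target candidates per source hypothesis.
TRANSLATION_MODEL = {
    "hyp_a": {"x": 0.6, "y": 0.4},
    "hyp_b": {"z": 0.9, "w": 0.1},
}


def translate_one_best(transcriptions, translation_model):
    """Baseline: commit to the single best transcription, then translate it."""
    best_src = max(transcriptions, key=transcriptions.get)
    candidates = translation_model[best_src]
    best_tgt = max(candidates, key=candidates.get)
    return best_src, best_tgt, transcriptions[best_src] * candidates[best_tgt]


def translate_weighted_set(transcriptions, translation_model):
    """Keep every transcription hypothesis and choose the best joint score."""
    scored = [
        (p_src * p_tgt, src, tgt)
        for src, p_src in transcriptions.items()
        for tgt, p_tgt in translation_model[src].items()
    ]
    score, src, tgt = max(scored)
    return src, tgt, score


if __name__ == "__main__":
    print(translate_one_best(TRANSCRIPTIONS, TRANSLATION_MODEL))
    print(translate_weighted_set(TRANSCRIPTIONS, TRANSLATION_MODEL))
```

With these invented numbers the two procedures disagree: the baseline commits early to `hyp_a`, while the set-based version selects a translation of `hyp_b` because its joint score is higher. The sketch shows only that deferring the choice can change the outcome; it says nothing about translation quality.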

Control as Movement

Norbert Hornstein reduces the arsenal of syntactic relations, with a defense of his analysis of Control as Movement. Thus the basic relation between "Sam" and "to leave" is the same whether the two are separated by "promised" or "seemed."

Linguistics

Contributor(s): Norbert Hornstein
Non-ARHU Contributor(s):

Cedric Boeckx, Jairo Nunes

Dates:
Publisher: Cambridge University Press

The Movement Theory of Control (MTC) makes one major claim: that control relations in sentences like 'John wants to leave' are grammatically mediated by movement. This goes against the traditional view that such sentences involve not movement, but binding, and analogizes control to raising, albeit with one important distinction: whereas the target of movement in control structures is a theta position, in raising it is a non-theta position; however, the grammatical procedures underlying the two constructions are the same. This book presents the main arguments for MTC and shows it to have many theoretical advantages, the biggest being that it reduces the kinds of grammatical operations that the grammar allows, an important advantage in a minimalist setting. It also addresses the main arguments against MTC, using examples from control shift, adjunct control, and the control structure of 'promise', showing MTC to be conceptually, theoretically, and empirically superior to other approaches.

Language Learning and Language Universals

How do patterns in the environment interact with our innate capacities to produce our first languages?

Linguistics

Contributor(s): Jeffrey Lidz
Dates:
This paper explores the role of learning in generative grammar, highlighting interactions between distributional patterns in the environment and the innate structure of the language faculty. Reviewing three case studies, it shows how learners use their language faculties to leverage the environment, making inferences from distributions to grammars that would not be licensed in the absence of a richly structured hypothesis space.

Restrictions on the Meaning of Determiners: Typological Generalisations and Learnability

Are nonconservative meanings for determiners unlearnable? And what about a determiner that means 'less than half'?

Linguistics

Contributor(s): Jeffrey Lidz
Non-ARHU Contributor(s): Tim Hunter, Alexis Wellwood, Anastasia Conroy
Dates:
In this paper we examine the relationship between learnability and typology in the area of determiner meanings. We begin with two generalisations about the meanings that determiners of the world’s languages are found to have, and investigate the learnability of fictional determiners with unattested meanings. If participants in our experiments fail to learn such determiners, then this would suggest that they are unattested because they are unlearnable. If, on the other hand, participants are able to learn the determiners in question, then some other explanation for their absence in the languages of the world is necessary.
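
The summary above mentions nonconservative determiner meanings. On the standard definition, a determiner D is conservative when D(A)(B) holds exactly when D(A)(A ∩ B) holds. The sketch below is an illustration only, not the paper's materials: it brute-forces that condition over a tiny universe, and the function names and the `only_like` example are assumptions introduced here. Passing the check on a small universe is evidence, not a proof, of conservativity.

```python
from itertools import chain, combinations


def subsets(universe):
    """Return every subset of the universe as a frozenset."""
    items = list(universe)
    all_combos = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [frozenset(combo) for combo in all_combos]


def is_conservative(det, universe):
    """Check det(A, B) == det(A, A & B) for all subsets A, B of the universe."""
    sets = subsets(universe)
    return all(det(a, b) == det(a, a & b) for a in sets for b in sets)


def every(a, b):
    """'Every A is B': A is a subset of B."""
    return a <= b


def less_than_half(a, b):
    """'Less than half of the As are B'."""
    return 2 * len(a & b) < len(a)


def only_like(a, b):
    """Hypothetical nonconservative meaning: every B is an A."""
    return b <= a


if __name__ == "__main__":
    universe = range(3)
    for det in (every, less_than_half, only_like):
        print(det.__name__, is_conservative(det, universe))
```

Here `every` and `less_than_half` satisfy the condition on this universe, while `only_like` does not, because its truth depends on members of B that lie outside A.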

Cross-linguistic representations of numerals and number marking

Numerals are restrictive modifiers.

Linguistics

Non-ARHU Contributor(s):

Michaël Gagnon, Alan Bale, Hrayr Khanjian

Dates:

Inspired by Partee (2010), this paper defends a broad thesis that all modifiers, including numeral modifiers, are restrictive in the sense that they can only restrict the denotation of the NP or VP they modify. However, the paper concentrates more narrowly on numeral modification, demonstrating that the evidence that motivated Ionin & Matushansky (2006) to assign non-restrictive, privative interpretations to numerals – assigning them functions that map singular sets to sets containing groups – is in fact consistent with restrictive modification. Ionin & Matushansky’s (2006) argument for this type of interpretation is partly based on the distribution of Turkish numerals, which exclusively combine with singular bare nouns. Section 2 demonstrates that Turkish singular bare nouns are not semantically singular, but rather are unspecified for number. Western Armenian has similar characteristics. Building on some of the observations in section 2, section 3 demonstrates that restrictive modification can account for three different types of languages with respect to the distribution of numerals and plural nouns: (i) languages where numerals exclusively combine with plural nouns (e.g., English), (ii) languages where they exclusively combine with singular bare nouns (e.g., Turkish), and (iii) languages where they optionally combine with either type of noun (e.g., Western Armenian). Accounting for these differences crucially involves making a distinction between two kinds of restrictive modification among the numerals: subsective vs. intersective modification. Section 3 also discusses why privative interpretations of numerals have trouble accounting for these different language types.

The topology of infixation and reduplication

Postdoc Bridget Samuels gives a theory of reduplication and affixation, based on Raimy's proposal that phonological representations are directed graphs ordered by precedence.

Linguistics

Non-ARHU Contributor(s):

Bridget Samuels

Dates:

This article is concerned with how to characterize and constrain the typology of reduplication and affixation, given Raimy’s (1999 et seq.) precedence-based theory of phonological representations as directed graphs. First, we establish a typology of attested reduplication and infixation anchor points based on an empirical survey. We then extend the SEARCH and COPY algorithms proposed by Mailhot & Reiss (2007) for long-distance assimilation (harmony) processes to the morphological domain, proposing modifications to reconcile this formalism with Raimy’s. Finally, we argue for an amended version of a proposal by Idsardi & Shorey (2007) regarding the process by which ‘looped’ representations created during the course of morphological concatenation are linearized.
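
To make the idea of a "looped" precedence graph concrete, the sketch below encodes a form as a chain of segments plus extra precedence links and reads off a linear string. It is a deliberately simplified illustration, not the linearization procedure of Idsardi & Shorey (2007) or the amended version argued for in the article: the function name, the index-based encoding, and the rule "follow each added link once, otherwise follow the base order" are assumptions made here.

```python
def linearize(segments, added_links):
    """Read a string off a base chain of segments plus added precedence links.

    segments: segments in their base order, e.g. ["k", "a", "t"].
    added_links: (from_index, to_index) pairs added on top of the base chain;
        a link pointing backwards creates a loop.
    Simplifying assumption: an added link leaving the current segment is
    followed once, in preference to the base link.
    """
    start, end = -1, len(segments)  # stand-ins for the start and end symbols
    pending = list(added_links)
    output = []
    position = start
    while True:
        link = next((l for l in pending if l[0] == position), None)
        if link is not None:
            pending.remove(link)  # each added link is traversed only once
            position = link[1]
        else:
            position += 1  # follow the base precedence relation
        if position == end:
            break
        output.append(segments[position])
    return "".join(output)


if __name__ == "__main__":
    base = ["k", "a", "t"]
    print(linearize(base, []))         # no added links
    print(linearize(base, [(2, 0)]))   # loop from the last segment to the first
    print(linearize(base, [(1, 0)]))   # loop from the second segment to the first
```

Under these assumptions, a loop from the final segment back to the first yields total reduplication, and a shorter loop yields a partial copy, which is the sense in which reduplication reduces to an extra precedence link.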

Relating Movement and Adjunction in Syntax and Semantics

Tim Hunter renders movement and adjunction as different combinations of more primitive operations, in a way that unifies adjunct islands with freezing effects, and supports a new view of the argument/adjunct distinction in a neo-Davidsonian semantics.

Linguistics

Non-ARHU Contributor(s):

Timothy Hunter

Dates:

In this thesis I explore the syntactic and semantic properties of movement and adjunction in natural language, and suggest that these two phenomena are related in a novel way. In a precise sense, the basic pieces of grammatical machinery that give rise to movement also give rise to adjunction. In the system I propose, there is no atomic movement operation and no atomic adjunction operation; the terms "movement" and "adjunction" serve only as convenient labels for certain combinations of other, primitive operations. As a result the system makes non-trivial predictions about how movement and adjunction should interact, since we do not have the freedom to stipulate arbitrary properties of movement while leaving the properties of adjunction unchanged, or vice-versa. I focus first on the distinction between arguments and adjuncts, and propose that the differences between these two kinds of syntactic attachment can be thought of as a transparent reflection of the differing ways in which they contribute to neo-Davidsonian logical forms. The details of this proposal rely crucially on a distinctive treatment of movement, and from it I derive accurate predictions concerning the equivocal status of adjuncts as optionally included in or excluded from a maximal projection, and the possibility of counter-cyclic adjunction. The treatment of movement and adjunction as interrelated phenomena furthermore enables us to introduce a single constraint that subsumes two conditions on extraction, namely adjunct island effects and freezing effects. The novel conceptions of movement and semantic composition that underlie these results raise questions about the system's ability to handle semantic variable-binding. I give an unconventional but descriptively adequate account of basic quantificational phenomena, to demonstrate that this important empirical ground is not given up.
More generally, this thesis constitutes a case study in (i) deriving explanations for syntactic patterns from a restrictive, independently motivated theory of compositional semantics, and (ii) using a computationally explicit framework to rigorously investigate the primitives and consequences of our theories. The emerging picture is one where some central facts about the syntax and semantics of natural language hang together in a way that they otherwise would not.

Priming of abstract logical representations in 4-year-olds

"Every horse did not jump over the fence." Preschoolers tend to hear this as meaning that none did. But the preference is not grammatical, as it can be reduced either by priming or by changes to the context.

Linguistics

Contributor(s): Jeffrey Lidz
Non-ARHU Contributor(s):

Joshua Viau, Julien Musolino

Dates:

Though preschoolers in certain experimental contexts strongly prefer to interpret ambiguous sentences containing quantified NPs and negation on the basis of surface syntax (e.g., Musolino’s (1998) “observation of isomorphism”), contextual manipulations can lead to more adult-like behavior. But is isomorphism a purely pragmatic phenomenon, as recently proposed? In Experiment 1, we begin by isolating the contextual factor responsible for children’s improvement in Musolino and Lidz (2006). We then demonstrate in Experiment 2 that this factor can be used to prime inverse scope interpretations. To remove pragmatics from the equation altogether, we show in Experiment 3 that the same effect can be achieved via semantic priming. Our results represent the first clear evidence for priming of the abstract logico-syntactic structures underlying these interpretations and, thus, highlight the importance of language processing alongside pragmatic reasoning during children’s linguistic development.

On the Event-Relativity of Modal Auxiliaries

The syntactic position of modal auxiliaries restricts interpretations of their uses. Valentine Hacquard explains why, with a modification of the standard Kratzerian assumptions: modal auxiliaries are evaluated with respect to an event, not a world.

Linguistics

Contributor(s): Valentine Hacquard
Dates:

Crosslinguistically, the same modal words can be used to express a wide range of interpretations. This crosslinguistic trend supports a Kratzerian analysis, where each modal has a core lexical entry and where the difference between an epistemic and a root interpretation is contextually determined. A long-standing problem for such a unified account is the equally robust crosslinguistic correlation between a modal’s interpretation and its syntactic behavior: epistemics scope high (in particular higher than tense and aspect) and roots low, a fact which has led to proposals that hardwire different syntactic positions for epistemics and roots (cf. Cinque’s hierarchy). This paper argues that the range of interpretations a modal receives is even more restricted: a modal must be keyed to certain time-individual pairs, but not others. I show that this can be captured straightforwardly by minimally modifying the Kratzerian account: modals are relative to an event, rather than a world, of evaluation, which readily provides a time (the event’s running time) and (an) individual(s) (the event’s participants). I propose that this event relativity of modals can in turn explain the correlation between type of interpretation and syntactic position, without having to stipulate an interpretation-specific height for modals.

Negative Concord and (Multiple) Agree: A Case Study of West Flemish

A conservative, Agree-based account of negative concord in West Flemish.

Linguistics

Non-ARHU Contributor(s): Liliane Haegeman, Terje Lohndal
Dates:
This paper examines the formalization of negative concord in terms of the Minimalist Program, focusing entirely on negative concord in West Flemish. It is shown that a recent analysis of negative concord which advocates Multiple Agree is empirically inadequate. Instead of Multiple Agree, it is argued that a particular implementation of the simpler and less powerful binary Agree is superior in deriving the data in question.