Learning Objectives

  • Define syntactic head and explain how it determines constituent structure and external distribution
  • Distinguish arguments from adjuncts using the tests of obligatoriness and entailment
  • Describe valency as a fundamental lexical property and give examples from multiple word classes
  • Analyse how adjunct attachment ambiguity challenges information extraction and semantic interpretation

Reading

Read Chapter 7 (Heads, Arguments, and Adjuncts) of Bender, E. M. (2013). Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax. Morgan & Claypool. Use the course materials below to activate and consolidate the concepts from that chapter.

1

Core Input

Read through each tab. Take notes on the key ideas before moving to the activities.

A constituent is a group of words that functions as a single unit within a sentence (#51). Not every sequence of words forms a constituent: the old dog is a constituent, but old dog barked (which leaves out the determiner and straddles the subject–verb boundary) is not.

Evidence that a sequence forms a constituent includes: it can be replaced by a single pronoun (the large brown dog → it); it can be moved as a unit (fronted, relativised, clefted); it can answer a constituent question as a whole unit.

Every constituent has a syntactic head (#52) — the word that determines what the phrase is. The head governs two fundamental properties of the constituent:

  1. Internal structure — what other elements can appear inside the constituent. A verb head determines that its arguments must appear; a noun head licences determiners and adjectival modifiers; a preposition head selects a following NP complement.
  2. External distribution — where the constituent as a whole can appear in a larger sentence. An NP can appear in any position a bare noun could appear: subject, object, after a preposition. A VP can serve as the predicate of a sentence. The phrase inherits its distributional possibilities from its head (#50 from Unit 6).

The head-dependent relationship is the central organising principle of dependency grammar, which represents all syntactic structure as a network of head–dependent links, without phrase-structure nodes. Many modern NLP parsers, including most neural dependency parsers, build on this framework.

Arguments are the participants semantically required by the head's meaning (#53, #54). The verb give, for instance, requires three participants: someone who gives (the giver), something that is given (the theme), and someone who receives (the recipient). These three participants are the arguments of give.

The number of arguments a head requires is its valency (#54), a term borrowed from chemistry. Valency is a lexical property — it must be stored in the lexicon for each individual head:

  • Intransitive verbs (valency 1 — subject only): sleep, arrive, sneeze
  • Transitive verbs (valency 2 — subject + object): see, chase, read
  • Ditransitive verbs (valency 3 — subject + object + recipient or second object): give, send, show

In many (perhaps all) languages, arguments can be left unexpressed in context (#55). In English, the object of eat can be omitted: She was eating (eating what? — left implicit from context). In Japanese, both subject and object arguments are freely dropped when contextually recoverable. In Spanish, the subject pronoun is routinely dropped because verb agreement morphology identifies the subject sufficiently. NLP systems performing semantic analysis must recover omitted arguments — the task of implicit argument resolution.

Importantly, not only verbs select arguments (#56). Nouns (the arrival of the train), adjectives (proud of her), and prepositions (on the table) can all select arguments. Argument extraction in NLP must work across all POS heads.

Adjuncts are optional modifiers that are not selected by the head (#53, #57). Unlike arguments, they are not required by the head's valency; they can be added and removed without affecting grammaticality; and they can be stacked iteratively:

She ran. → She ran quickly. → She ran quickly in the park. → She ran quickly in the park on Tuesday. → She ran quickly in the park on Tuesday despite the rain.

Each added element (quickly, in the park, on Tuesday, despite the rain) is an adjunct to the verb phrase. None is required by the valency of run; each adds information; and they all stack up without making the sentence ungrammatical.

There is an important semantic asymmetry (#58): although adjuncts are syntactically dependents of the head, they are semantically predicates that take the head as their argument. She ran quickly means: the running was quick — quickly is a predicate whose argument is the running event. This may seem counter-intuitive (syntactically, quickly modifies ran), but it is the correct semantic analysis and it matters for NLP event extraction and semantic role labelling.

Two formal tests distinguish arguments from adjuncts:

  • Obligatoriness test (#59): Remove the element. If the sentence becomes ungrammatical, or now expresses a different proposition rather than merely a less informative one, the element is an argument. If the sentence remains grammatical and is only less informative, the element is an adjunct.
  • Entailment test (#60): Adjuncts introduce entailments of their own. She ran in the park entails she was in the park; removing the adjunct removes the entailment. Arguments also take part in entailments, but those follow from the head's own meaning rather than from an optional modifier.
2

Key Concepts A: Constituents, Heads, Arguments, and Tests (#51–#60)

Expand each item. Think about your answer before reading the explanation.

Words in a sentence are not related to each other in a flat, unstructured list. They form hierarchical groups — constituents — that function as units at intermediate levels between the word and the full sentence.

Evidence for constituent structure:

  • Pronoun replacement: The large brown dog barked at the postman. → It barked at the postman. The whole sequence the large brown dog is replaced by it, which is evidence that the sequence is a constituent.
  • Movement: Constituents can move as units. At the postman, the dog barked. Here the PP moves as a whole.
  • Constituency questions: What did the dog bark at? → the postman. A constituent answers as a unit.
  • Coordination: Only constituents of the same type can be coordinated: the large brown dog and the small grey cat — both NPs.

NLP relevance: Constituent identification (parsing) is a prerequisite for named entity recognition, coreference resolution, semantic role labelling, and question answering. A system that cannot identify constituent boundaries will make systematic errors in all of these downstream tasks.

The head is the most important word in a constituent. It determines both what the constituent contains (internal structure) and where it can appear (external distribution).

Tests for identifying the head:

  1. Distribution test: The phrase can appear in the same positions as the head alone. A very good teacher arrived. → A teacher arrived. ✓ but *A very good arrived. ✗ So teacher is the head.
  2. Agreement/feature percolation: The morphological features of the head determine the features of the whole phrase. In a big cat, cat is singular, so the phrase takes singular agreement (a big cat is, not *a big cat are).
  3. Obligatoriness: The head is the most obligatory element; a constituent cannot exist without its head.

Dependency grammar places the head-dependent relationship at the centre of all syntactic description. Every word (except the root) is a dependent of exactly one head; the entire sentence structure is a tree of head–dependent arcs. This formalism underlies many modern NLP parsers, as well as the Universal Dependencies annotation framework.
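The one-head-per-word constraint can be made concrete with a small sketch. It assumes a common convention in which each word stores the position of its head and 0 stands for an artificial root; the function name is_valid_tree and the example analysis are illustrative, not the output of any particular parser.

```python
from typing import List


def is_valid_tree(heads: List[int]) -> bool:
    """Check that head indices form a single-rooted dependency tree.

    heads[i] is the 1-based position of the head of word i+1; 0 marks the root.
    """
    n = len(heads)
    # Exactly one word may depend on the artificial root.
    if heads.count(0) != 1:
        return False
    for start in range(1, n + 1):
        seen = set()
        node = start
        # Follow head links upward; every word must reach the root without looping.
        while node != 0:
            if node in seen or not (1 <= node <= n):
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


# "The dog barked": The -> dog, dog -> barked, barked -> root
tokens = ["The", "dog", "barked"]
heads = [2, 3, 0]
print(is_valid_tree(heads))      # True
print(is_valid_tree([2, 1, 0]))  # False: "The" and "dog" point at each other
```

Because each word stores exactly one head, the "exactly one head" property holds by construction; the function only has to check for a unique root and the absence of cycles.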

Every dependent of a head belongs to one of two fundamental types:

  • Arguments — semantically required participants whose properties are specified in the head's lexical entry. They are part of the head's valency frame.
  • Adjuncts — optional modifiers that are not listed in the head's lexical entry. They add information without being required.

Importantly, the same phrase can be an argument in one construction and an adjunct in another:

  • She arrived in London. Here in London is an adjunct: she arrived is a complete, grammatical sentence, and the PP adds locative information optionally. Arrive is intransitive (valency 1); it does not select a location argument.
  • She lives in London. Here in London is an argument (or at least strongly expected): *She lives is awkward without a location because live in this sense strongly selects a locative complement. Whether this is a true argument or a very strong adjunct is debated, but it illustrates that the boundary is not always sharp.

NLP relevance: Distinguishing arguments from adjuncts determines how a semantic role labeller populates the argument slots of a predicate. Confusing an adjunct with an argument produces incorrect knowledge base entries.

Valency is the number of semantic arguments a head provides for — the number of participant slots it opens. It is stored lexically: knowing that a word is a verb tells you only that it selects a subject; the full valency frame specifies how many additional arguments are required and what semantic roles they bear.

English verb valency examples:

Verb | Valency | Arguments | Example
arrive | 1 | subject | The train arrived.
sleep | 1 | subject | The child slept.
see | 2 | subject + object | She saw the film.
chase | 2 | subject + object | The dog chased the cat.
give | 3 | subject + object + recipient | She gave him the book.
send | 3 | subject + object + recipient | She sent the committee the report.
put | 3 (obligatory) | subject + object + location | She put the book on the table. (*She put the book.)

NLP systems performing semantic role labelling consult valency lexicons — VerbNet, FrameNet, PropBank — to know how many arguments each verb takes and what semantic roles (Agent, Patient, Theme, Goal, etc.) they bear. A system without valency information cannot reliably distinguish argument slots from adjunct positions.
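As a rough illustration of what a valency lexicon lookup involves, the sketch below stores the frames from the table above in a plain dictionary. The entries are hand-written for this example and are not drawn from VerbNet, FrameNet, or PropBank; the names VALENCY, valency, and unfilled_slots are hypothetical.

```python
from typing import Dict, List, Set

# Hand-written frames copied from the table above. A real system would load
# frames from a resource such as VerbNet, FrameNet, or PropBank instead.
VALENCY: Dict[str, List[str]] = {
    "arrive": ["subject"],
    "sleep": ["subject"],
    "see": ["subject", "object"],
    "chase": ["subject", "object"],
    "give": ["subject", "object", "recipient"],
    "send": ["subject", "object", "recipient"],
    "put": ["subject", "object", "location"],
}


def valency(verb: str) -> int:
    """Number of argument slots the verb opens."""
    return len(VALENCY[verb])


def unfilled_slots(verb: str, expressed: Set[str]) -> List[str]:
    """Slots in the verb's frame that the clause leaves unexpressed."""
    return [slot for slot in VALENCY[verb] if slot not in expressed]


print(valency("give"))                               # 3
print(unfilled_slots("put", {"subject", "object"}))  # ['location']
```

An unfilled slot does not by itself mean the clause is ungrammatical: some arguments can be left implicit, so a fuller lexicon would also need to record which slots may be omitted.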

Even obligatory arguments can be omitted under certain conditions — when they are recoverable from context, from discourse, or from pragmatic defaults.

English implicit arguments:

  • She was eating. — the object (what she was eating) is implicit; eat is used intransitively. The participant slot is still there semantically; the argument is simply unexpressed.
  • Already read it. (diary entry): the subject I is omitted, which is acceptable in certain registers.

Pro-drop languages:

  • Japanese — both subject and object arguments are freely dropped when contextually recoverable. Verb morphology does not mark agreement, but shared context makes the referents clear.
  • Spanish — subject pronouns are routinely dropped (Habla español, "He/she speaks Spanish") because verb agreement suffixes identify person and number.
  • Mandarin — arguments are frequently omitted in discourse when topics are established; no morphological agreement compensates.

NLP relevance: Unexpressed arguments must be recovered for semantic role labelling, coreference resolution, and entailment inference. In English, this is the task of implicit argument resolution or argument ellipsis resolution. In Japanese and Mandarin, zero-pronoun resolution is a major research challenge with significantly lower system performance than English coreference resolution.

Verbs are the prototypical argument-selecting heads, but they are not the only POS that selects arguments. Nouns, adjectives, and prepositions can also open argument slots.

Nouns selecting arguments:

  • the arrival of the train: arrival is a deverbal noun derived from arrive; it inherits the verb's argument, the thing that arrives.
  • a student of linguistics: student selects a PP complement specifying the field of study.
  • the destruction of the city: the object argument of destroy is preserved in the nominalisation.

Adjectives selecting arguments:

  • proud of her: proud selects a PP argument specifying what the subject is proud of.
  • similar to the original: similar selects a PP argument specifying what it is similar to.
  • aware that prices had risen: aware selects a clausal argument.

Prepositions selecting arguments:

  • on the table: on selects an NP argument.
  • for the children: for selects an NP argument.
  • despite the rain: despite selects an NP argument.

NLP relevance: Information extraction and semantic role labelling systems must handle argument structure across all POS heads, not just verbs. Nominalisation is particularly challenging: event nouns inherit verbal argument structure, and an NLP system that only looks at verb predicates will miss these argument relations.

Adjuncts have two diagnostic properties that distinguish them from arguments:

  1. Not required: The head is grammatically and semantically complete without its adjuncts. Removing an adjunct from a grammatical sentence always yields a grammatical sentence (though a less informative one).
  2. Iterable: Any number of adjuncts can be stacked onto the same head without creating ungrammaticality.

Demonstration:

She ran. ✓
She ran quickly. ✓
She ran quickly in the park. ✓
She ran quickly in the park on Tuesday. ✓
She ran quickly in the park on Tuesday despite the rain. ✓
She ran quickly in the park on Tuesday despite the rain in her new shoes. ✓

Contrast with arguments:

She put the book on the table. ✓
*She put on the table. ✗ (object argument missing)
*She put the book. ✗ (location argument missing; unacceptable in most dialects)

NLP relevance: The optionality diagnostic can be operationalised in information extraction pipelines: if removing an element still yields a grammatical predicate-argument structure, the element is likely an adjunct and should not be added to the argument slots in the extracted relation.

This is one of the most conceptually surprising insights about adjuncts. Syntactically, an adjunct is a dependent: it depends on its head. But semantically, the direction is reversed: the adjunct introduces a predicate, and the head (or its referent) is the argument of that predicate.

Example:

She ran quickly.

Syntactically: quickly modifies ran (dependent → head).
Semantically: quickly introduces the predicate QUICK(x); the running event is x. The meaning is: the running was quick.

Verbal adjuncts:

  • She left before he arrived. — BEFORE(leaving-event, arriving-event).
  • She ran because she was late. — CAUSE(being-late, running).

Nominal adjuncts:

  • the dog in the garden. — IN(dog, garden): the PP introduces a predicate IN(x, garden) where x = the dog.
  • the dog that bit the man. — the relative clause introduces BIT(dog, man) with the dog as the semantic argument of the embedded predicate.

NLP relevance: Accurate event extraction and knowledge graph construction require treating adjuncts as introducing additional propositions, not merely qualifying a single event representation. A system that represents she ran quickly in the park as a single event with properties attached will miss the fact that adjuncts contribute separate predicate-argument structures.
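One way to picture adjuncts as separate predications is a flat, conjunctive event representation in which the verb, each argument, and each adjunct contribute their own predicate over a shared event variable. The sketch below assumes that style of representation; the Event class, the role label AGENT, and the predicate names are illustrative choices rather than a fixed standard.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Predication = Tuple[str, Tuple[str, ...]]


@dataclass
class Event:
    var: str                      # event variable, e.g. "e1"
    predicate: str                # contributed by the head verb
    arguments: List[Tuple[str, str]] = field(default_factory=list)  # (role, filler)
    adjuncts: List[Tuple[str, Tuple[str, ...]]] = field(default_factory=list)

    def predications(self) -> List[Predication]:
        """One predication for the verb, one per argument, one per adjunct."""
        result: List[Predication] = [(self.predicate, (self.var,))]
        result += [(role, (self.var, filler)) for role, filler in self.arguments]
        # Each adjunct is its own predicate that takes the event as an argument.
        result += [(pred, (self.var,) + extra) for pred, extra in self.adjuncts]
        return result


# "She ran quickly in the park."
run = Event("e1", "RUN", arguments=[("AGENT", "she")],
            adjuncts=[("QUICK", ()), ("IN", ("park",))])

for name, args in run.predications():
    print(f"{name}({', '.join(args)})")
# RUN(e1)
# AGENT(e1, she)
# QUICK(e1)
# IN(e1, park)
```

Adding or removing an adjunct changes only the list of adjunct predications; the verb's own predication and its argument slots are untouched.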

The obligatoriness test: remove the element in question from the sentence.

  • If the result is ungrammatical, or if the meaning shifts from "less informative" to "expressing a different or incomplete proposition," the element is (likely) an argument.
  • If the result is grammatical and still expresses a complete proposition (just a less informative one), the element is (likely) an adjunct.

Examples:

Original | After removal | Grammatical? | Classification
She put the book on the table. | *She put the book. | No | on the table = argument of put
She ran in the park. | She ran. | Yes | in the park = adjunct of ran
She gave the children a book. | *She gave a book. | No (recipient missing) | the children = argument of gave
She read on Tuesday. | She read. | Yes | on Tuesday = adjunct

Caveats: The test is not always decisive. Some arguments can be left implicit (#55): She ate is grammatical without an object, even though eat can take one. And some very strong adjuncts (e.g. location with live) feel obligatory even though they are not strictly required by valency. The test is a diagnostic, not a proof.

The entailment test exploits the fact that adjuncts, as semantic predicates (#58), introduce entailments that are absent when the adjunct is removed.

Adjunct entailments:

  • She ran in the park. entails She was in the park.
    She ran. does not entail She was in any specific place.
    → the entailment is introduced by the adjunct PP.
  • She ran on Tuesday. entails The running occurred on Tuesday.
    She ran. does not entail this. → the temporal adjunct introduces the entailment.

Argument entailments:

  • She gave him the book. entails He received the book. This entailment is part of the semantics of give — it is introduced by the verb, not an optional modifier.

The pattern: adjuncts introduce entailments that are not present in the adjunct-free sentence; arguments participate in entailments that are part of the verb's core semantics.

NLP relevance: Textual entailment systems — which must determine whether one sentence entails another — can use the argument/adjunct distinction to identify which propositions are contributed by optional modifiers versus core argument structure. This supports more accurate inference and question answering.
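The adjunct side of the entailment test can be sketched under a strong simplifying assumption: a sentence's meaning is treated as a set of atomic predications, and one sentence entails another when the second set is contained in the first. The atoms and names below are illustrative, and the sketch deliberately ignores lexical entailments such as give implying receive, which come from the verb's meaning.

```python
from typing import FrozenSet, Tuple

Atom = Tuple[str, ...]


def entails(premise: FrozenSet[Atom], hypothesis: FrozenSet[Atom]) -> bool:
    """Conjunctive meanings: the premise entails any subset of its atoms."""
    return hypothesis <= premise


# "She ran." and "She ran in the park." share the event variable e.
she_ran = frozenset({("RUN", "e"), ("AGENT", "e", "she")})
she_ran_in_park = she_ran | {("IN", "e", "park")}
located_in_park = frozenset({("IN", "e", "park")})

print(entails(she_ran_in_park, she_ran))          # True: dropping the adjunct is safe
print(entails(she_ran_in_park, located_in_park))  # True: the adjunct contributes this
print(entails(she_ran, located_in_park))          # False: no adjunct, no entailment
```

In this picture, the locative entailment appears exactly when the adjunct's atom is present, which mirrors the pattern described above for in the park and on Tuesday.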

3

Key Concepts B: Types and Distribution of Adjuncts (#61–#67)

Expand each item.

Adjuncts are not a single syntactic type — they can be realised at any level of structural complexity.

  • Single word (adverb): She ran quickly.
  • Phrase (PP): She ran in the park.
  • Participial clause: She ran, breathing hard.
  • Relative clause (nominal adjunct): the dog that bit the man
  • Adverbial clause: She left before he arrived.
  • Infinitival clause (purpose): She ran to catch the bus.

In all cases, the adjunct is structurally optional and semantically predicative (#58). The internal complexity of the adjunct does not change its status as an adjunct; a long relative clause is still an adjunct to its head noun.

NLP relevance: Parsers must identify adjuncts at all levels of complexity. Long clausal adjuncts are a major source of parsing errors because they introduce deeply embedded structures and multiple potential attachment points.

Nominal adjuncts modify noun phrases and contribute predications about the referent of the NP head.

  • Adjective: the old dog — OLD(dog)
  • Prepositional phrase: the dog in the garden — IN(dog, garden)
  • Relative clause: the dog that bit the man — BIT(dog, man)
  • Participial phrase: the dog running away — RUNNING-AWAY(dog)
  • Genitive: the dog's owner — OWNS(owner, dog)
  • Noun-noun compound modifier: the garden dog — located in the garden

NLP relevance: Nominal modification is central to information extraction. The attributes, properties, and relationships of named entities in text are typically expressed as nominal adjuncts. A system that fails to parse nominal adjuncts correctly will miss entity attributes and relationships. Relative clause attachment is a major source of parsing ambiguity (see Activity 4, Attachment Ambiguity tab).

Verbal adjuncts modify the verb phrase and contribute predications about the event or state described by the verb.

  • Manner: She ran quickly. — QUICK(running)
  • Frequency: She runs often. — FREQUENT(running)
  • Temporal: She ran on Tuesday. — ON-TUESDAY(running)
  • Locative: She ran in the park. — IN-PARK(running)
  • Reason: She ran because she was late. — CAUSE(being-late, running)
  • Purpose: She ran in order to catch the bus. — PURPOSE(running, catch-bus)
  • Concession: She ran despite the rain. — DESPITE(running, rain)

NLP relevance: Verbal adjuncts carry the event properties that are essential for event extraction and temporal knowledge base construction. Timeline systems and question answering over events (when? where? why?) depend entirely on accurate verbal adjunct extraction and semantic interpretation.

Adjuncts are not limited to modifying NPs and VPs. They can also modify adjective phrases, adverb phrases, and even entire sentences.

  • Adjective phrase (degree modifier): very fast — VERY modifies the AdjP fast.
  • Adverb phrase (degree modifier): extremely quickly — EXTREMELY modifies the AdvP.
  • Sentence adverb (speaker-oriented): Fortunately, she arrived on time. — the adverb scopes over the entire sentence; it expresses the speaker's evaluation of the proposition.
  • Domain/frame adverb: In general, the system works. — scopes over the whole claim.
  • Evidential adverb: Apparently, the meeting was cancelled. — indicates the speaker's information source.

Sentence-level adjuncts are semantically distinct from VP adjuncts: they evaluate the proposition expressed by the sentence, or characterise the speaker's epistemic state, rather than describing a property of the event.

NLP relevance: Sentence-level adjuncts are important for sentiment analysis (fortunately, unfortunately, disappointingly), hedging detection (apparently, seemingly, in general), and claim scope analysis. Systems that misanalyse these as VP adjuncts will misidentify their semantic contribution.

Adjuncts encode a rich variety of semantic relations between the adjunct predicate and its head argument. The principal semantic types are:

Semantic type | Examples | Relevance for NLP
Temporal | yesterday, on Tuesday, at noon, since 2020 | Timeline construction, temporal relation extraction
Locative | in the park, nearby, at the station | Geolocation, event grounding, named entity linking
Manner | quickly, with care, in a hurry | Event description, sentiment-bearing adverbials
Reason / cause | because she was late, due to the strike | Causal relation extraction, explanation mining
Purpose | in order to arrive, to submit the report | Goal/intention extraction
Concession | despite the rain, although it was difficult | Contrast and concession relation detection
Condition | if she runs, provided that the data is correct | Conditional extraction, rule mining
Frequency | often, rarely, twice a week | Temporal frequency in event representations
Degree | very, extremely, barely, almost | Scalar semantics, sentiment intensity

Each semantic type introduces different implications for event representation and knowledge base construction. Temporal and locative adjuncts ground events in space and time; causal adjuncts create causal links between propositions; concessive adjuncts signal that the expected causal or logical relationship is overridden.

Not every phrase can modify every other type. The syntactic category of a constituent determines where it can appear as a modifier:

  • Adverbs (AdvP) modify verbs and adjectives, not nouns directly: She ran quickly ✓; *the quickly dog ✗
  • Adjectives (AdjP) modify nouns directly, not verbs: the large dog ✓; *she ran large ✗ (as a manner modifier)
  • Prepositional phrases (PP) can modify both NPs and VPs: the dog in the garden ✓ (NP); she ran in the park ✓ (VP)
  • Relative clauses modify NPs, not VPs or AdjPs: the dog that barked ✓; *she ran that barked ✗

The modification potential is not a pragmatic convention but a structural property of the constituent's syntactic category. This is why the category (POS) assigned to a word or phrase directly constrains which attachment positions are available to it.

NLP relevance: POS tags constrain attachment possibilities in the parser. A POS tagger that misidentifies a category will produce downstream attachment errors. For example, if a word that is an adverb is mistagged as an adjective, the parser may attempt to attach it to a noun rather than a verb, producing an incorrect parse and downstream information extraction errors.
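The way category constrains attachment can be sketched as a lookup from modifier category to the head categories it may attach to. The table below encodes only the four generalisations listed above; the tag names (NOUN, VERB, ADJ), the label RelCl, and the function candidate_heads are assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# Which head categories each modifier category may attach to,
# following the four generalisations listed above.
MODIFIES: Dict[str, Set[str]] = {
    "AdvP": {"VERB", "ADJ"},
    "AdjP": {"NOUN"},
    "PP": {"NOUN", "VERB"},
    "RelCl": {"NOUN"},
}


def candidate_heads(modifier_cat: str,
                    tagged_tokens: List[Tuple[str, str]]) -> List[str]:
    """Words in the sentence that the modifier could attach to, given its category."""
    allowed = MODIFIES[modifier_cat]
    return [word for word, pos in tagged_tokens if pos in allowed]


sentence = [("the", "DET"), ("dog", "NOUN"), ("ran", "VERB")]

print(candidate_heads("AdvP", sentence))  # ['ran']
print(candidate_heads("AdjP", sentence))  # ['dog']
print(candidate_heads("PP", sentence))    # ['dog', 'ran']
```

The first two calls show the effect of a tagging error: if quickly is analysed as an AdvP it can only attach to ran, but if it is mistagged as an AdjP the only candidate left is dog.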

While many heads select NP arguments, some heads select clausal arguments — full embedded sentences (or sentence-like structures) as their complements.

Verbs with clausal arguments in English:

  • Declarative complement (that-clause): She believed [that he was right]. — the entire bracketed clause is the object argument of believe.
  • Interrogative complement (wh-clause): She knew [where he was].
  • Polar interrogative complement: She asked [whether he would come].
  • Infinitival complement: She wanted [him to leave].
  • Counter-factual complement: She wished [that she had studied harder].

Cross-linguistically, clausal arguments are expressed differently: English uses complementisers (that, whether); some languages use subjunctive morphology on the embedded verb; others nominalise the embedded clause (the embedded proposition appears as a noun phrase). All of these are argument structures — the head selects a clausal complement — but the morphosyntactic realisation varies.

NLP relevance: Clausal argument extraction is substantially more complex than NP argument extraction and is a central challenge for relation extraction and knowledge base construction. Identifying the boundaries of a clausal complement, resolving the internal structure of the embedded clause, and determining the semantic relationship between the matrix verb and its clausal argument all require robust parsing and deep syntactic understanding. LLMs handle common clausal argument patterns reasonably well in English but struggle with deeply embedded clauses and with cross-lingual transfer.

4

Worked Examples

Study each worked example. Connect each case to the relevant concept numbers.

Verbs differ in how many arguments they require and what types those arguments are. The same verb can sometimes appear with different valency frames (intransitive/transitive alternations). Valency frames must be stored in the lexicon (#54).

Verb | Frame | Valency | Argument types | Obligatory?
sleep | She slept. | 1 | NP-subj | Subject only
eat | She ate. / She ate the soup. | 1 or 2 | NP-subj (+ NP-obj optional) | Object optional (#55)
see | She saw the film. | 2 | NP-subj + NP-obj | Both obligatory
give | She gave him the book. / She gave the book to him. | 3 | NP-subj + NP-obj + NP/PP-recipient | All obligatory
put | She put the book on the table. | 3 | NP-subj + NP-obj + PP-loc | All obligatory (*She put the book.)
consider | She considered the proposal. / She considered him honest. | 2 | NP-subj + NP-obj (or NP + AdjP) | Object obligatory
read | She reads. / She reads the paper. | 1 or 2 | NP-subj (+ NP-obj optional) | Intransitive use permitted

NLP systems that perform semantic role labelling consult valency lexicons such as VerbNet, FrameNet, and PropBank to identify the expected arguments of each predicate. These lexicons encode the argument frames shown above, along with semantic role labels (Agent, Patient, Theme, Goal, Source, etc.) for each argument slot.

Adjunct iteration demonstrates both the optionality and the non-selectivity of adjuncts. Any number of adjuncts can be added to the same head without creating ungrammaticality, and they can appear in varying orders (#57). Compare this with the strict constraints on arguments.

Adjunct stacking on a VP:

She ran.
She ran quickly.
She ran quickly in the park.
She ran quickly in the park on Tuesday.
She ran quickly in the park on Tuesday despite the rain.
She ran quickly in the park on Tuesday despite the rain to catch the bus.

Adjunct types, in the order added: manner (quickly), locative (in the park), temporal (on Tuesday), concessive (despite the rain), purpose (to catch the bus).

Argument contrast — put (obligatory location argument):

She put the book on the table. ✓
*She put on the table. ✗ — the object argument (the book) is missing.
*She put the book. ✗ — the location argument (on the table) is missing.
She put the book on the table carefully on Tuesday. ✓ — adjuncts added freely.

The contrast illustrates the diagnostic value of iterability (#57): if an element can be added or removed without affecting grammaticality, and if multiple such elements can stack, the element is an adjunct. Arguments resist removal and do not iterate (you cannot have two object arguments of a single transitive verb in the same clause).

Ditransitive verbs (give, send, show) take three arguments cross-linguistically, but the encoding of those arguments varies: case morphology, particles, or word order. This demonstrates that valency (#54) is a cross-linguistic property, but its surface realisation differs systematically.

Japanese — あげる (ageru, 'give'):

私はジョンに本をあげた。

Watashi-wa Jon-ni hon-wo age-ta.

I-TOPIC John-DAT book-ACC give-PAST

“I gave John the book.”

Arguments are marked by particles: -wa (topic/subject), -ni (dative/recipient), -wo (accusative/theme). Word order is relatively free because the particles identify each argument's role.

German — geben ('give'):

Ich gab ihm das Buch.

I.NOM gave him.DAT the.ACC book

“I gave him the book.”

Arguments are marked by case endings on articles and pronouns: nominative subject (Ich), dative recipient (ihm), accusative theme (das Buch).

Urdu — دینا (dena, 'give'):

اس نے جان کو کتاب دی۔

Us ne Jan ko kitāb dī.

He.ERG John.DAT book.NOM give.PAST

“He gave John the book.”

Urdu uses postpositions (ne for ergative subject, ko for dative recipient) rather than prepositions or case inflections on the noun itself.

NLP implication: Cross-lingual semantic role labelling must learn language-specific surface realisations of the same underlying valency frames. A system trained only on English subject–verb–object patterns will not transfer reliably to particle-marking or case-marking languages.
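To see why surface patterns do not transfer, compare how roles would be read off the Japanese and Urdu examples above. The sketch works only on the romanised lines shown in this unit and uses only the markers described there; the function names and the dictionary are hypothetical, and a real system would operate on native script with proper morphological analysis.

```python
from typing import Dict, List, Tuple

# Marker-to-role mappings taken from the two examples above.
MARKER_ROLES: Dict[str, Dict[str, str]] = {
    "ja": {"wa": "topic/subject", "ni": "recipient", "wo": "theme"},
    "ur": {"ne": "ergative subject", "ko": "recipient"},
}


def label_japanese(romanised: str) -> List[Tuple[str, str]]:
    """Read roles off hyphenated particles, e.g. 'Jon-ni' -> ('Jon', 'recipient')."""
    roles = MARKER_ROLES["ja"]
    labelled = []
    for token in romanised.rstrip(".").split():
        stem, _, marker = token.partition("-")
        if marker in roles:
            labelled.append((stem, roles[marker]))
    return labelled


def label_urdu(romanised: str) -> List[Tuple[str, str]]:
    """Read roles off postpositions that follow the noun, e.g. 'Jan ko'."""
    roles = MARKER_ROLES["ur"]
    words = romanised.rstrip(".").split()
    return [(words[i - 1], roles[w])
            for i, w in enumerate(words) if w in roles and i > 0]


print(label_japanese("Watashi-wa Jon-ni hon-wo age-ta."))
# [('Watashi', 'topic/subject'), ('Jon', 'recipient'), ('hon', 'theme')]
print(label_urdu("Us ne Jan ko kitāb dī."))
# [('Us', 'ergative subject'), ('Jan', 'recipient')]
```

Neither function relies on word order, which is the cue an English-trained subject–verb–object model would lean on. In the Urdu example the theme kitāb carries no marker at all, so it has to be identified by elimination.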

Attachment ambiguity arises when a modifier — typically a PP or relative clause — can be attached to more than one head in the sentence structure. The argument/adjunct distinction is crucial for understanding which attachments are constrained by valency and which are free (#57, #58, #62, #63).

Classic PP attachment ambiguity:

I saw the man with the telescope.

  • Reading A: with the telescope modifies the man (NP adjunct): I saw the man who had the telescope.
  • Reading B: with the telescope modifies saw (VP adjunct): I used the telescope to see the man.

In both readings, with the telescope is an adjunct — neither see nor man selects it as an obligatory argument.

Contrast — obligatory argument vs adjunct:

She put the book in the bag.

Here, in the bag is the obligatory location argument of put (valency 3 — subject + object + location). It cannot be omitted (*She put the book), and it cannot freely attach to the NP the book — it must attach to the VP headed by put. The argument/adjunct distinction constrains the attachment: arguments attach to their selecting head; adjuncts attach more freely.

More complex case:

The researcher studied the protein that causes cancer in the laboratory.

Possible attachments for in the laboratory:

  1. VP adjunct of studied — she studied in the laboratory
  2. VP adjunct of causes — cancer is caused in the laboratory
  3. NP adjunct of cancer — the cancer that is in the laboratory

All three are adjunct attachments (none is required by valency). Statistical parsers, including neural models, choose the most probable attachment based on distributional patterns learned from training data. Errors in this choice propagate directly to information extraction: if reading 2 is chosen, the extracted relation might be "the protein causes cancer in the laboratory" rather than "the researcher studied the protein in the laboratory."
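The effect of the attachment choice on extraction can be sketched with a head-index representation of the sentence in which only one arc varies. The analysis below is hand-built for illustration and assumes a Universal Dependencies style in which the noun laboratory heads the PP; attach_pp and extract_location are hypothetical helper names.

```python
from typing import List, Optional, Tuple

TOKENS = ["The", "researcher", "studied", "the", "protein", "that",
          "causes", "cancer", "in", "the", "laboratory"]

# 1-based head position for each token; 0 = root.
# The head of "laboratory" (the PP) is left open: it is the ambiguous arc.
BASE_HEADS: List[Optional[int]] = [2, 3, 0, 5, 3, 7, 5, 7, 11, 11, None]


def attach_pp(site: str) -> List[Optional[int]]:
    """Return a full analysis with the PP attached to the given word."""
    heads = list(BASE_HEADS)
    heads[TOKENS.index("laboratory")] = TOKENS.index(site) + 1
    return heads


def extract_location(heads: List[Optional[int]]) -> Tuple[str, str, str]:
    """Read a (head, preposition, place) relation off the PP's attachment arc."""
    lab = TOKENS.index("laboratory")
    head_word = TOKENS[heads[lab] - 1]
    return (head_word, "in", "laboratory")


for site in ("studied", "causes", "cancer"):
    print(extract_location(attach_pp(site)))
# ('studied', 'in', 'laboratory')
# ('causes', 'in', 'laboratory')
# ('cancer', 'in', 'laboratory')
```

The three analyses differ in a single head index, yet each yields a different located entity or event, which is how one attachment error becomes a wrong knowledge base entry.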

5

Check Your Understanding

Select the best answer for each question.

A linguist applies the obligatoriness test to the phrase 'in the park' in the sentence 'She met him in the park.' The sentence 'She met him' is grammatical. What does this tell us?

Answer: 'in the park' is an adjunct (#59). Obligatoriness is a test for distinguishing arguments from adjuncts. 'She met him' is grammatical without 'in the park' and still expresses a complete proposition, so 'in the park' adds locative information but is not required by the valency of met. Compare *'She put it', which is not grammatical without a location: the location is an argument of put, not an adjunct.

The verb 'give' in English requires three semantic arguments: a giver, a recipient, and a theme (the thing given). Which concept states that this property is a fundamental lexical fact about 'give' that must be stored in the lexicon?

Answer: #54. The number of semantic arguments provided for by a head is a fundamental lexical property. Give has valency 3, and this must be encoded in the lexical entry for give. NLP systems that perform semantic role labelling draw on lexical resources such as VerbNet and FrameNet to determine how many arguments each verb takes and what semantic roles they bear.
AI Dimension

The concepts from this unit — heads, arguments, adjuncts, and valency — connect directly to core challenges in NLP system design.

  • Predicate-argument extraction. Information extraction systems aim to identify who did what to whom — the core predicate-argument structure of sentences. This requires knowing the valency of verbs (#54) and distinguishing arguments from adjuncts (#53, #59, #60). LLMs learn predicate-argument patterns statistically from large corpora and are reasonably accurate on high-frequency verbs in English, but degrade significantly on low-frequency verbs, nominalisations (#56), and languages where argument structure is encoded differently.
  • Modifier attachment ambiguity. A persistent failure mode for NLP parsers. The researcher studied the protein that causes cancer in the laboratory — does in the laboratory modify studied, causes, or cancer? LLMs resolve this statistically based on learned co-occurrence patterns, not by reasoning about argument structure. Errors propagate to information extraction and knowledge base construction (#57, #58, #63).
  • Clausal argument extraction. Complex sentence understanding requires identifying clausal arguments of verbs — believe that, know whether, ask if, wish that (#67). LLMs handle simple examples well but struggle with deeply embedded clauses, non-canonical constructions, and cross-lingual transfer to languages that express clausal arguments through nominalisation or subjunctive morphology.
  • Unexpressed arguments and reference resolution. When arguments are omitted — implicit objects in English, zero subjects and objects in Japanese and Mandarin (#55) — NLP systems must identify the referent. This is a coreference and zero-pronoun resolution problem. LLMs perform reasonably well on common English patterns but less reliably in zero-pronoun languages, where the absence of overt argument marking creates systematic ambiguities that require deep discourse-level processing to resolve.
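
As a minimal sketch of the last point, the Python below flags argument slots that a clause leaves unexpressed, which is the first step before any referent can be resolved. The `FRAMES` table and the role labels are hand-written assumptions for illustration, not entries from VerbNet, FrameNet, or any other real resource, and `unexpressed_roles` is a hypothetical helper.

```python
from typing import Dict, List

# Hand-written toy frames (assumptions for illustration only).
FRAMES: Dict[str, List[str]] = {
    "eat": ["eater", "thing eaten"],
    "give": ["giver", "recipient", "theme"],
}


def unexpressed_roles(verb: str, expressed: Dict[str, str]) -> List[str]:
    """Return the roles in the verb's frame that have no overt filler."""
    return [role for role in FRAMES[verb] if role not in expressed]


# "She already ate." The object is implicit and must be recovered from context.
print(unexpressed_roles("eat", {"eater": "she"}))
```

A real system would then pass each flagged slot to a coreference or zero-pronoun resolution component; that step is deliberately left out here.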
6

Activities

Individual task

For each sentence below, identify: (a) the head of each major phrase; (b) the arguments of the main verb — state the valency; (c) any adjuncts. Justify your argument/adjunct distinctions using the obligatoriness test (#59) and the entailment test (#60).

  1. The student submitted the essay late.
  2. She gave the children a book about butterflies.
  3. They met on Tuesday.
  4. He put the report on the table.

For each sentence, record your analysis in a table with columns: Element, Type (argument / adjunct), Test applied, Result of test.

Pair task

Use an information extraction tool or NLP demo to extract predicate-argument triples from the following sentences:

  1. The CEO gave the board an update on the quarterly results last Thursday.
  2. She sent the report to the committee by email.

For each extracted triple, check:

  1. Are all obligatory arguments extracted? Use the valency information from this unit to decide what arguments you expect.
  2. Are any adjuncts incorrectly listed as arguments? Apply the obligatoriness test to each extracted element.
  3. Are any arguments omitted — either because the system missed them, or because they were implicit (#55)?

Compare your findings with your partner. Where you disagree, apply the formal tests to settle the analysis. Write a short (150-word) summary of the system's error types.

Group task

Choose a language other than English: Japanese, Urdu, German, or Turkish. For five verbs in that language, carry out a valency analysis.

Present your findings as a valency table with the following columns:

  • Verb — the verb in the target language (with romanisation if needed) and English gloss
  • Valency — number of arguments
  • Argument 1 — semantic role and morphological/syntactic marking
  • Argument 2 — semantic role and morphological/syntactic marking (if applicable)
  • Argument 3 — semantic role and morphological/syntactic marking (if applicable)
  • Example sentence — with gloss and translation

After completing the table, discuss: how would an NLP system designed for English argument extraction need to be adapted to handle your chosen language? Consider: word order, morphological marking, dropped arguments (#55), and any categories from Unit 6 (#49) that are relevant to the argument structure of your language.

Review

  • #51 — Words within sentences form intermediate groupings called constituents: hierarchical units that can be replaced by pronouns, moved, and answered as wholes.
  • #52 — A syntactic head determines both the internal structure (what can appear inside the constituent) and the external distribution (where the constituent can appear) of the phrase it projects.
  • #53 — Syntactic dependents are classified as arguments (semantically required participants) or adjuncts (optional modifiers). The same phrase can be an argument in one construction and an adjunct in another.
  • #54 — The number of semantic arguments a head provides for (its valency) is a fundamental lexical property that must be stored in the lexicon — it cannot be derived from the POS alone.
  • #55 — In many (perhaps all) languages, some arguments can be left unexpressed when recoverable from context: implicit objects in English; free argument drop in Japanese and Mandarin; pro-drop in Spanish.
  • #56 — Nouns, adjectives, and prepositions, as well as verbs, can serve as heads that select arguments. Argument extraction must work across all POS heads.
  • #57 — Adjuncts are not required by the head and generally can iterate: any number can be stacked onto the same head without ungrammaticality. Iterability is a diagnostic for adjunct-hood.
  • #58 — Adjuncts are syntactically dependents of the head but semantically predicates that take the head (or its referent) as their argument. The direction of predication is reversed relative to the direction of syntactic dependency.
  • #59 — Obligatoriness is a test to distinguish arguments from adjuncts: remove the element; if the sentence becomes ungrammatical, the element is (likely) an argument.
  • #60 — Entailment is a test to distinguish arguments from adjuncts: adjuncts introduce entailments absent from the adjunct-free sentence; arguments participate in entailments that are part of the head's core semantics.
  • #61 — Adjuncts can be single words, phrases, or clauses, regardless of internal structural complexity.
  • #62 — Adjuncts can modify nominal constituents: adjectives, PPs, relative clauses, participials, and genitives all modify NPs.
  • #63 — Adjuncts can modify verbal constituents: manner, temporal, locative, reason, purpose, and concession adjuncts all modify VPs.
  • #64 — Adjuncts can modify other constituent types: degree modifiers modify AdjPs and AdvPs; sentence adverbs scope over entire propositions and express speaker-oriented meanings.
  • #65 — Adjuncts express a wide range of semantic types: temporal, locative, manner, reason, purpose, concession, condition, frequency, degree. Each type introduces different propositions into the event representation.
  • #66 — The potential to be a modifier is inherent to the syntax of a constituent: adverbs modify verbs and adjectives, not nouns; adjectives modify nouns, not verbs. POS category determines modification potential and therefore constrains parser attachment decisions.
  • #67 — Just about anything can be an argument, for some head: some heads (believe, know, ask, wish) select clausal arguments — full embedded sentences. Cross-linguistically, clausal arguments are expressed through complementisers, subjunctive morphology, or nominalisation.

The argument-adjunct distinction is one of the most consequential structural distinctions for NLP applications that go beyond surface text matching.

Defining the distinction:

Arguments are the semantically required participants whose presence is specified by the head's valency frame (#53, #54). Adjuncts are optional modifiers that contribute additional predicative content but are not selected by the head (#53, #57). Every dependent of a head falls into one of these two classes.

The tests:

  • Obligatoriness (#59): Remove the element. Ungrammatical result → argument; grammatical result → likely adjunct, although some arguments can also be left unexpressed when recoverable from context (#55).
  • Entailment (#60): Adjuncts introduce new entailments absent from the adjunct-free sentence; arguments participate in entailments that are part of the head's core semantics.
  • Iterability (#57): Adjuncts can be stacked; arguments cannot be doubled.

Valency as a lexical property (#54):

Valency is not derivable from POS alone; it is stored in the lexicon. Semantic role labelling systems typically rely on lexical resources such as VerbNet, FrameNet, and PropBank, which encode each verb's argument frames and the semantic roles of each argument slot.
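
One way to picture valency as stored lexical information is a small lookup table. The sketch below is a toy lexicon under stated assumptions: `LexicalEntry` and `LEXICON` are hypothetical names, the slot labels are informal, and the entries are hand-written from the examples in this unit (with meet treated as a two-argument verb, as in 'She met him') rather than taken from any real resource.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LexicalEntry:
    lemma: str
    pos: str
    roles: Tuple[str, ...]  # one informal label per argument slot

    @property
    def valency(self) -> int:
        """Valency is the number of argument slots the entry lists."""
        return len(self.roles)


# Toy lexicon (assumption: hand-written entries based on this unit's examples).
LEXICON = {
    "meet": LexicalEntry("meet", "VERB", ("subject", "object")),
    "put": LexicalEntry("put", "VERB", ("subject", "object", "location")),
    "give": LexicalEntry("give", "VERB", ("giver", "recipient", "theme")),
}

# All three entries share a POS, yet their valency differs,
# so the number of slots has to be listed per lemma.
for entry in LEXICON.values():
    print(entry.lemma, entry.pos, entry.valency)
```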

NLP consequences:

  • Information extraction: Only arguments populate predicate-argument triples in knowledge bases; adjuncts provide contextual enrichment but should not be represented as core argument slots.
  • Semantic role labelling: The system must identify which tokens fill obligatory argument roles (Agent, Patient, Theme, Goal) and which are adjuncts (manner, temporal, locative).
  • Attachment ambiguity: Adjuncts attach more freely than arguments; understanding whether a PP is an argument (constrained by the head's valency) or an adjunct (freely attachable) is essential for resolving attachment ambiguity (#57, #62, #63).
  • Cross-lingual transfer: Different languages encode the same valency frames through different morphosyntactic means (case, particles, word order). NLP systems must learn these surface realisations independently for each language.
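
The information extraction point can be illustrated with a short sketch that keeps arguments and adjuncts in separate parts of the extracted record. Everything here is hypothetical and simplified: `Extraction` and `build_extraction` are names introduced for illustration, the dependents are supplied by hand rather than by a parser, and the set of argument roles is passed in directly instead of being looked up in a lexicon.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Extraction:
    predicate: str
    arguments: Dict[str, str] = field(default_factory=dict)  # core slots
    adjuncts: Dict[str, str] = field(default_factory=dict)   # contextual enrichment


def build_extraction(predicate: str, dependents: Dict[str, str],
                     argument_roles: Set[str]) -> Extraction:
    """File each dependent as an argument or an adjunct of the predicate."""
    record = Extraction(predicate)
    for role, filler in dependents.items():
        target = record.arguments if role in argument_roles else record.adjuncts
        target[role] = filler
    return record


# "She met him in the park." The location is not selected by meet.
met = build_extraction(
    "meet",
    {"subject": "she", "object": "him", "location": "in the park"},
    {"subject", "object"},
)

# "He put the report on the table." The location is selected by put.
put = build_extraction(
    "put",
    {"subject": "he", "object": "the report", "location": "on the table"},
    {"subject", "object", "location"},
)

print(met)
print(put)
```

The same kind of phrase, a locative PP, lands in a core slot for put and in the adjunct field for meet, which is why the head's valency has to be consulted before a triple is written to a knowledge base.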

Proceed to Unit 8 when ready.