A GUIDE TO LINK TYPES

Davy Temperley

INTRODUCTION

This file contains an alphabetical list of all link types used in the dictionary "2.0.dict". All subscript types are also explained; however, only the actual link names are listed alphabetically. (For example, the subscript type "Ds" is explained in the entry for "D".) All post-processing features are also explained, under the entries for the links that are involved.

Since the original release of the parser, a number of the link-names have been changed. Our aim has been to make the link-names and subscript names as mnemonically significant as possible, while still keeping them concise. In many cases the mnemonic significance is obvious: "A" stands for adjective, "D" stands for determiner, and so on. In a few cases we have adopted a character to stand arbitrarily for something in a number of different link-names. "G" generally stands for proper nouns; "B" stands for fronted objects; "J" stands for prepositional objects; "E" stands for adverbs; and "X" stands for punctuation.

The first section below contains brief (1-2 sentence) descriptions of each link type. The second section contains much more thorough explanations.

I. LINK TYPES AT A GLANCE

A connects pre-noun ("attributive") adjectives to following nouns: "The BIG DOG chased me", "The BIG BLACK UGLY DOG chased me".

AA is used in the construction "How [adj] a [noun] was it?". It connects the adjective to the following "a".

AF connects adjectives to verbs in cases where the adjective is fronted, such as questions and indirect questions: "How BIG IS it?"

AL connects a few determiners like "all" or "both" to following determiners: "ALL THE people are here".

AN connects noun-modifiers to following nouns: "The TAX PROPOSAL was rejected".

AZ connects the word "as" back to certain verbs that can take "[obj] as [adj]" as a complement: "He VIEWED him AS stupid".

B serves various functions involving relative clauses and questions.
It connects transitive verbs back to their objects in cases like relative clauses and questions ("WHO did you HIT?"); it also connects the main noun to the finite verb in subject-type relative clauses ("The DOG who CHASED me was black").

BI connects forms of the verb "be" to certain idiomatic expressions: for example, cases like "He IS PRESIDENT of the company".

BT is used with time expressions acting as fronted objects: "How many YEARS did it LAST?".

BW connects "what" to various verbs like "think", which are not really transitive but can connect back to "what" in questions: "WHAT do you THINK?"

C links conjunctions to subjects of subordinate clauses ("He left WHEN HE saw me"). It also links certain verbs to subjects of embedded clauses ("He SAID HE was sorry").

CC connects clauses to following coordinating conjunctions ("SHE left BUT we stayed").

CO connects "openers" to subjects of clauses: "APPARENTLY / ON Tuesday , THEY went to a movie".

CQ connects to auxiliaries in comparative constructions involving s-v inversion: "SHE has more money THAN DOES Joe".

CX is used in comparative constructions where the right half of the comparative contains only an auxiliary: "She has more money THAN he DOES".

D connects determiners to nouns: "THE DOG chased A CAT and SOME BIRDS".

DD connects definite determiners ("the", "his") to certain things like number expressions and adjectives acting as nouns: "THE POOR", "THE TWO he mentioned".

DG connects the word "the" with proper nouns: "the Riviera", "the Mississippi".

DP connects possessive determiners to gerunds: "YOUR TELLING John to leave was stupid".

DT connects determiners to nouns in idiomatic time expressions: "NEXT WEEK", "NEXT THURSDAY".

E is used for verb-modifying adverbs which precede the verb: "He is APPARENTLY not COMING".

EA connects adverbs to adjectives: "She is a VERY GOOD player".

EB connects adverbs to forms of "be" before an object or prepositional phrase: "He IS APPARENTLY a good programmer".

EC connects adverbs to comparative adjectives: "It is MUCH BIGGER".

EE connects adverbs to other adverbs: "He ran VERY QUICKLY".

EF connects the word "enough" to preceding adjectives and adverbs: "He didn't run QUICKLY ENOUGH".

EI connects a few adverbs to "after" and "before": "I left SOON AFTER I saw you".

EN connects certain adverbs to expressions of quantity: "The class has NEARLY FIFTY students".

ER is used in the expression "The x-er..., the y-er...". It connects the two halves of the expression together, via the comparative words (e.g. "The FASTER it is, the MORE they will like it").

FM connects the preposition "from" to various other prepositions: "We heard a scream FROM INSIDE the house".

G connects proper noun words together in series: "GEORGE HERBERT WALKER BUSH is here."

GN (stage 2 only) connects a proper noun to a preceding common noun which introduces it: "The ACTOR Eddie MURPHY attended the event".

H connects "how" to "much" or "many": "HOW MUCH money do you have".

I connects certain words with infinitive verb forms, such as modal verbs and "to": "You MUST DO it", "I want TO DO it".

IN connects the preposition "in" to certain time expressions: "We did it IN DECEMBER".

J connects prepositions to their objects: "The man WITH the HAT is here".

JG connects certain prepositions to proper-noun objects: "The Emir OF KUWAIT is here".

JQ connects prepositions to question-word determiners in "prepositional questions": "IN WHICH room were you sleeping?"

JT connects certain conjunctions to time-expressions like "last week": "UNTIL last WEEK, I thought she liked me".

K connects certain verbs with particles like "in", "out", "up" and the like: "He STOOD UP and WALKED OUT".

L connects certain determiners to superlative adjectives: "He has THE BIGGEST room".

LE is used in comparative constructions to connect an adjective to the second half of the comparative expression beyond a complement phrase: "It is more LIKELY that Joe will go THAN that Fred will go".

M connects nouns to various kinds of post-noun modifiers: prepositional phrases ("The MAN WITH the hat"), participle modifiers ("The WOMAN CARRYING the box"), prepositional relatives ("The MAN TO whom I was speaking"), and other kinds.

MG allows certain prepositions to modify proper nouns: "The EMIR OF Kuwait is here".

MV connects verbs and adjectives to modifying phrases that follow, like adverbs ("The dog RAN QUICKLY"), prepositional phrases ("The dog RAN IN the yard"), subordinating conjunctions ("He LEFT WHEN he saw me"), comparatives, participle phrases with commas, and other things.

MX connects modifying phrases with commas to preceding nouns: "The DOG, a POODLE, was black", "JOHN, IN a black suit, looked great".

N connects the word "not" to preceding auxiliaries: "He DID NOT go".

ND connects numbers with expressions that require numerical determiners: "I saw him THREE WEEKS ago".

NF is used with NJ in idiomatic number expressions involving "of": "He lives two THIRDS OF a mile from here".

NI is used in a few special idiomatic number phrases: "I have BETWEEN 5 AND 20 dogs".

NN connects number words together in series: "FOUR HUNDRED THOUSAND people live here".

NR connects fraction words with superlatives: "It is the THIRD BIGGEST city in China".

NS connects singular numbers (one, 1, a) to idiomatic expressions requiring number determiners: "I saw him ONE WEEK ago".

NW is used in idiomatic fraction expressions: "TWO THIRDS of the students were women".

O connects transitive verbs to their objects, direct or indirect: "She SAW ME", "I GAVE HIM the BOOK".

OD is used for verbs like "rise" and "fall" which can take expressions of distance as complements: "It FELL five FEET".

OF connects certain verbs and adjectives to the word "of": "She ACCUSED him OF the crime", "I'm PROUD OF you".

OT is used for verbs like "last" which can take time expressions as objects: "It LASTED five HOURS".

P connects forms of the verb "be" to various words that can be its complements: prepositions, adjectives, and passive and progressive participles: "He WAS [ ANGRY / IN the yard / CHOSEN / RUNNING ]".

PF is used in certain questions with "be", when the complement need of "be" is satisfied by a preceding question word: "WHERE ARE you?", "WHEN will it BE?"

PP connects forms of "have" with past participles: "He HAS GONE".

Q is used in questions. It connects the wall to the auxiliary in simple yes-no questions ("///// DID you go?"); it connects the question word to the auxiliary in where-when-how questions ("WHERE DID you go").

QI connects certain verbs and adjectives to question-words, forming indirect questions: "He WONDERED WHAT she would say".

R connects nouns to relative clauses. In subject-type relatives, it connects to the relative pronoun ("The DOG WHO chased me was black"); in object-type relatives, it connects either to the relative pronoun or to the subject of the relative clause ("The DOG THAT we chased was black", "The DOG WE chased was black").

RS is used in subject-type relative clauses to connect the relative pronoun to the verb: "The dog WHO CHASED me was black".

RW connects the right-wall to the left-wall in cases where the right-wall is not needed for punctuation purposes.

S connects subject nouns to finite verbs: "The DOG CHASED the cat", "The DOG [ IS chasing / HAS chased / WILL chase ] the cat".

SF is a special connector used to connect "filler" subjects like "it" and "there" to finite verbs: "THERE IS a problem", "IT IS likely that he will go".

SFI connects "filler" subjects like "it" and "there" to verbs in cases with subject-verb inversion: "IS THERE a problem?", "IS IT likely that he will go?"

SI connects subject nouns to finite verbs in cases of subject-verb inversion: "IS JOHN coming?", "Who DID HE see?"

TA is used to connect adjectives like "late" to month names: "We did it in LATE DECEMBER".

TD connects day-of-the-week words to time expressions like "morning": "We'll do it MONDAY MORNING".

TH connects words that take "that [clause]" complements with the word "that". These include verbs ("She TOLD him THAT..."), nouns ("The IDEA THAT..."), and adjectives ("We are CERTAIN THAT").

TI is used for titles like "president", which can be used in certain circumstances without a determiner: "AS PRESIDENT of the company, it is my decision".

TM is used to connect month names to day numbers: "It happened on JANUARY 21".

TO connects verbs and adjectives which take infinitival complements to the word "to": "We TRIED TO start the car", "We are EAGER TO do it".

TQ is the determiner connector for time expressions acting as fronted objects: "How MANY YEARS did it last".

TS connects certain verbs that can take subjunctive clauses as complements - "suggest", "require" - to the word "that": "We SUGGESTED THAT he go".

TY is used for certain idiomatic usages of year numbers: "I saw him on January 21 , 1990 ". (In this case it connects the day number to the year number.)

U is a special connector on nouns, which is disjoined with both the determiner and subject-object connectors. It is used in idiomatic expressions like "What KIND_OF DOG did you buy?"

UN connects the words "until" and "since" to certain time phrases like "after [clause]": "You should wait UNTIL AFTER you talk to me".

V connects various verbs to idiomatic expressions that may be non-adjacent: "We TOOK him FOR_GRANTED", "We HELD her RESPONSIBLE".

W connects the subjects of main clauses to the wall, in ordinary declaratives, imperatives, and most questions (except yes-no questions). It also connects coordinating conjunctions to following clauses: "We left BUT SHE stayed".

WN connects the word "when" to time nouns like "year": "The YEAR WHEN we lived in England was wonderful".

WR connects the word "where" to a few verbs like "put" in questions like "WHERE did you PUT it?".

X is used with punctuation, to connect punctuation symbols either to words or to each other. For example, in this case, POODLE connects to commas on either side: "The dog , a POODLE , was black."

Y is used in certain idiomatic time and place expressions, to connect quantity expressions to the head word of the expression: "He left three HOURS AGO", "She lives three MILES FROM the station".

YP connects plural noun forms ending in s to "'" in possessive constructions: "The STUDENTS ' rooms are large".

YS connects nouns to the possessive suffix "'s": "JOHN 'S dog is black".

Z connects the preposition "as" to certain verbs: "AS we EXPECTED, he was late".

II. A DETAILED EXPLANATION OF LINK TYPES

A connects pre-noun ("attributive") adjectives to nouns. Any number of adjectives can be used; all connect to the noun.

         +----A----+
         |    +-A--+
         |    |    |
    The big black dog ran

Nouns thus have optional "@A-" connectors, conjoined with "D-" connectors and with their main "S/O/J" complex.

Many adjectives take complements such as clauses or infinitival phrases; but such complements may not be used when the adjective is being used prenominally ("The man was eager to go", "*The eager to go man is here"). Thus the A+ connectors on adjectives must be disjoined with complement connectors like TO+, TH+, and QI+.

Some adjectives, such as superlatives and number adjectives ("biggest", "first"), must be used with a definite determiner such as "The" or "His". See "L".

All hyphenated expressions (e.g., "bone-headed") are treated as adjectives, and may be used either attributively or predicatively; they therefore carry both A+ and Pa-.

A few adjectives can act only as predicative adjectives, not prenominal ones ("asleep", "alone"). These have no A+ connector.

Many participles can also act as prenominal adjectives; these, also, have A+ connectors. The situation here is complicated.
Present participles of intransitive verbs have A+ connectors; those of transitive verbs do not ("The sleeping child", "*The hitting child"). Many passive participles have A+ connectors; this applies only to transitive verbs, since only transitive verbs have passive forms ("The destroyed building"). (A few intransitive past participles also take A+: "The fallen horse".) Past participles of complex verbs which require phrasal complements do not carry A+ ("*The hoped agreement"); those of complex verbs which can take direct objects carry A+ in some cases, not in others ("The reported incident", "*The seen man"). (Note: Do not confuse participle-adjectives with full-fledged adjectives which happen to be participles, like "amused" and "annoying": "It is very annoying", "*It is very destroyed". Full-fledged adjectives can take modifiers such as "very"; see "EA".)

A+ connectors on adjectives (but not participle-adjectives) are conjoined with an optional Xc+; this allows commas to be inserted after any prenominal adjective in a list ("The big, bad, ugly bear"). (This also allows doubtful usages such as "The big, bear", "The big black ugly, bear".) See "Xc".

Attributive adjectives are also allowed on proper nouns with cost 2. See "D: Determiners and Adjectives on Proper Nouns".

AA is used in the construction "How big a dog was it?"

     +---HA---+
     +EAh+-AA-+-Ds-+
     |   |    |    |
    How big   a   dog was it

The article "a" thus carries "Ds+ & {AA- & HA-}". Adjectives carry "AA+", disjoined with their A+ (used in prenominal adjectives) and "Pa-" (used in predicative adjectives). This construction uses the "EA-" on adjectives, which is also used for modifying adverbs like "very" and for questions like "How big is it". Similarly, the "EAh+" on "how" is also used for ordinary adjectival questions like "How big is it". "How" therefore carries "EAh+ & {HA+}". Since the HA+ is optional, there is a danger of the unwanted construction "*How big dogs were they".
This is prohibited in post-processing: see "EAh".

AF connects adjectives to verbs in cases where the adjective is "fronted", such as questions and indirect questions.

         +-AF-+
         |    |
    How big   is it

                  +------AF---------+
                  |                 |
    I wonder how big he wants it to be

Verbs that can take adjectival complements, like "be", "seem", and "make [obj]", have AF- disjoined with their Pa+ connectors (and any other complement connectors). Adjectives have AF+ connectors disjoined with their A+ and Pa-.

The constructions above can only be used with the question-word "how" modifying the adjective: "*Very big is it". This is enforced by connector logic: the sentence must connect to the wall somehow. "How" has a W- connector; adjectives and adjectival adverbs like "very" do not. "How" also has a QI- connector, for use in indirect questions like the second example above. Connector logic ensures that the only way the AF connector can be used is if it connects through "how" to something on the left. Moreover, as shown by the examples above, if the AF occurs in the main clause, s-v inversion must occur; if not (i.e., if it is an indirect question), s-v inversion may not occur. This is enforced in post-processing; see "SI".

If the adjective has complements, these must occur after the subject of the sentence: "HOW certain ARE you that he is coming?" This is enforced by the ordering of the elements on adjective expressions: "AF+" precedes "(TH+ or TO+...)". ("*How certain that he is coming are you" seems questionable; we reject it.)

AL connects a few determiners like "all" and "both" to following determiners.

     +--------Sp-----+
     +-----Jp--+     |
     +-AL-+-D--+     |
     |    |    |     |
    All  the people are here

Words like "all" are unusual. They act like determiners in a way; they must agree with the noun in number ("All" may take mass or plural but not singular nouns: "*All the dog died"). But they may precede another determiner like "the" or "his".
Thus we allow them to make a J+ connection to the noun, treating it like a prepositional object. We also subscript J- connectors on nouns, giving singular nouns "Js-" and mass/plural nouns "Jp-"; "all" is then given "Jp+". This prevents "*All the dog died".

A further problem here is that not just any determiner may be used with "all": "*All some/many people are here". Thus we require "all" to make a connection to the determiner as well; determiners that can be used with it carry "{AL-} & D+".

Note that "all" can also be used with no additional determiner: "All people are good". For this we give "all" an ordinary D+ connector.

AN connects noun-modifiers to nouns.

     +-----------D-----+
     |    +-----AN-----+
     |    |     +--AN--+--S--+
     |    |     |      |     |
    The income tax proposal was rejected

Any singular noun may be used as a noun-modifier. All the noun-modifiers of a phrase are connected to the main noun of the phrase. Any number of noun modifiers may be connected. Thus nouns have a "@AN-" connector, conjoined with the rest of their expressions. Noun modifiers therefore always connect in parallel, rather than serially (sometimes this is counterintuitive, as in "income tax proposal").

Noun-modifiers must normally occur after ordinary adjectival modifiers: "The stupid tax proposal was rejected", "*The tax stupid proposal was rejected". However, there are exceptions to this: "city clerical worker", "New York municipal bonds". For this reason, we require in stage 1 that any AN connections be made closer to the noun than any A connections; at stage 2, however, we allow AN connections further away than A connections. Nouns therefore have the following:

    dog: ({@AN-} & {@A- & {[[@AN-]]}} & ...

In general, noun-modifiers may not be plural: "*The taxes proposal was rejected", "*I made an eggs sandwich". However, one does sometimes see plural noun modifiers: "arms control", "sales division", "weapons violations charges".
Here again, we use stage 2, giving plural nouns "AN+" as a stage 2 connector:

    dog: ({@A-} & D- or (S+....)....) or AN+
    dogs: ({@A-} & D- or (S+....)....) or [[AN+]]

Noun modifiers may also not take determiners or post-nominal modifiers: "*The tax on liquor proposal was rejected" (unless hyphenated: see "Pa"). Thus the AN+ on nouns is disjoined from the rest of the expression.

Proper nouns also have an AN connector, allowing them to act as modifiers to nouns: "The Smith tax proposal was rejected". (Proper-noun modifiers are allowed to follow ordinary noun modifiers, which is incorrect: "*The tax Smith proposal was rejected" is accepted.)

AZ connects the word "as" back to certain verbs that can take "object-as-adjective" as a complement:

         +---AZ---+
         +-O--+   +--Pa+
         |    |   |    |
    He viewed him as stupid

"As" thus has AZ- conjoined with Pa+ (used in ordinary predicate uses of adjectives). Verbs like "view" and "characterize" have "O+ & {AZ+}", disjoined with other complement connectors.

B is used in a number of situations, involving relative clauses and questions. It is most often used with transitive verbs. Transitive verbs have an O+ connector, which can be satisfied by a noun to the right. However, there are also various ways in which a word to the left may satisfy this need, such as relative clauses and questions; thus transitive verbs have a "B-" disjoined with their "O+". B connectors are also used to link the main noun of a subject-type relative clause to the verb. And they are used for so-called prepositional prepositioning, in which the object-need of a preposition is satisfied by a preceding word: "The MAN we talked TO is here".

B connectors interact very heavily with post-processing. It will be noted that there are many kinds of subscripted B+ connectors; many of these subscript distinctions are used only for post-processing, not for controlling actual linkages.
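
The difference between subscripts that control linkages and subscripts that are only labels can be sketched in a few lines of Python. This is an illustration only, not the parser's code. It assumes the usual link-grammar matching convention, which this section does not spell out: two connectors can link when their upper-case names are identical, the left one points right ("+") and the right one points left ("-"), and their subscripts agree position by position, where "*" or a missing position matches anything. The function names are hypothetical.

```python
import re

# Hypothetical helper names; the matching rule itself is an assumption
# (the usual link-grammar convention), not something quoted from the guide.
CONNECTOR = re.compile(r"([A-Z]+)([a-z*]*)([+-])")


def split_connector(conn):
    """Split a simple connector such as 'Bsw+' into ('B', 'sw', '+')."""
    m = CONNECTOR.fullmatch(conn)
    if m is None:
        raise ValueError(f"not a simple connector: {conn!r}")
    return m.group(1), m.group(2), m.group(3)


def connectors_match(left, right):
    """Return True if `left` (on the left-hand word) can link to `right`."""
    lname, lsub, ldir = split_connector(left)
    rname, rsub, rdir = split_connector(right)
    # The left word must reach rightward and the right word leftward.
    if ldir != "+" or rdir != "-":
        return False
    # Upper-case link names must be identical.
    if lname != rname:
        return False
    # Pad the shorter subscript with wildcards, then compare by position.
    width = max(len(lsub), len(rsub))
    lsub = lsub.ljust(width, "*")
    rsub = rsub.ljust(width, "*")
    return all(a == b or "*" in (a, b) for a, b in zip(lsub, rsub))


if __name__ == "__main__":
    for pair in [("B*w+", "B-"), ("Bsm+", "Bs-"), ("Bsm+", "Bp-")]:
        print(pair, connectors_match(*pair))
```

Under this assumed rule a plain "B-" on a verb accepts any subscripted B+, so subscripts such as "w" and "m" serve only to label the link for post-processing, whereas an "s"/"p" mismatch in the first position blocks the link outright.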

_B- on Verbs_

Transitive verbs have B- connectors disjoined with their O+ (and any other complement connectors they may have such as TO+ or TH+). Every finite verb also has a B- disjoined with its S-, for use in subject-type relative clauses. This B- is conjoined with RS-. Ordinary transitive verbs thus have the following:

    destroyed: (S- or (RS- & B-)) & (O+ or B-);

_Relative Clauses_

B is used in restrictive relative clauses (i.e. those without commas), to connect the main noun to the verb of the relative clause (whether the relative clause is subject-type or object-type):

         +--B------+
         +-R-+-S---+
         |   |     |
    The dog  I   chased was black

         +----B----+
         +-R-+-RS--+
         |   |     |
    The dog who chased me was black

Bs and Bp are used to enforce noun-verb agreement in subject-type relative clauses; these are exactly analogous to "Ss" and "Sp". So, "The dog who chases me is black", "The dogs who chase me are black" are accepted; "*The dog who chase me is black" is rejected. See "R" for a fuller explanation of relative clauses.

_Questions: B#w, B#m_

Bsw and Bpw connectors are used for object-type questions, in which the object is a simple question word like "which", "what", "who", or "whom". Such words therefore have "B*w+" connectors.

     +-----Bsw---+
     |   +---I---+
     |   +SI-+   |
     |   |   |   |
    Who did you see yesterday

Bsm and Bpm connectors are used for object-type questions, in which the object-phrase contains a noun:

            +----Bsm----+
            |   +---I---+
      +D**w-+   +-SI+   |
      |     |   |   |   |
    Which  dog did you buy

This construction uses the same B- connectors on verbs that object-type relative clauses use. However, the B#m+ connector on nouns used is _not_ the one used in relative clauses. Note that in the above construction, the Bsm satisfies the main requirement of "dog"; "dog" need not (in fact may not) also serve as a subject or object of a clause ("*Which dog ran in the park did you buy?"). Thus, whereas the "R+ & B#+" complex on nouns is optionally conjoined with the main "S+ or O-..."
complex, the B#m+ is disjoined with the main complex.

    dog: (R+ & Bs+) ...& ((S+ & {Wd-...}) or O- or Bsm+...)

B#m connectors may only be used with question-word determiners ("what", "which", "whose", "how[many/much]"): "*The dog did you buy", "*The dog you bought". This is enforced by the fact that the wall has to connect to the sentence somehow. Normally, this is done through the subject of the sentence; nouns have a "Wd-" conjoined with their "S+". Notice, however, that the "Bsm+" on nouns is disjoined from the Wd-. When the B#m is being used, the sentence cannot connect to the wall unless it does so through the determiner - and the only determiners that have W- connectors are question-word determiners. (Question-word determiners also have QI- connectors, used in indirect questions: see below.) (Object-type questions also require subject-verb inversion: see "SI".)

"What" and "which" can act either as question-word determiners or as complete noun-phrases; and as complete noun-phrases they may occur in either subject- or object-type questions. Thus they carry (B*w+ or S**w+ or D**w+) & (QI- or W-). (See also "S**w".)

_Indirect questions_

The above discussion of B connectors in questions applies to indirect questions as well.

              +---B(s)--+
    +-S--+-QI-+    +S(s)+
    |    |    |    |    |
    I wonder who Dave  hit

Noun-phrase question-words have "(W- or QI-) & B*w+": the W- is used in questions, the QI- in indirect questions. See "R".

_B links involving dependent clauses_

Suppose the verb making the B connection is in a dependent clause:

     +------------------B--------+
     |   +---I---+               |
     |   +SI+    +-C--+-S--+--I--+
     |   |  |    |    |    |     |
    Who do you think Bill will bring

This construction is handled perfectly well under the current arrangement. Similarly with relative clauses and indirect questions: "I wonder who you think Bill will bring", "The man who I think you met yesterday is here".

There is a problem, however: there are constraints on the way that a B link can be made out of a dependent clause.
Specifically, a B link cannot be made to a word that is within a subordinate clause (1), an indirect question (2), or a relative clause (3).

         +----------------B(s)------------+
         |                   +--Cs-+--S(s)+
         |                   |     |      |
    1.* Who did you leave because Bill mentioned?

         +-----------B(s)------------+
         |                 +--Cs+S(s)+
         |                 |    |    |
    2.* Who do you wonder why Bill  hit?

         +--------------B(r)-----------+
         |                 +-----B(r)--+
         |                 +--R--+RS(r)+
         |                 |     |     |
    3.* Who do you know someone who likes?

This holds true whether the outer construction is a direct question, an indirect question (ex. 4-6 below) or a relative clause (ex. 7-9).

    4. *I wonder who you left because Bill mentioned
    5. *I wonder who you wonder why Bill hit
    6. *I wonder who you know someone who likes
    7. *The man who you left because Bill mentioned is here
    8. *The man who you wonder why Bill hit is here
    9. *The man who you know someone who likes is here

Linkages are found for all of these sentences, under the current arrangement. They are weeded out in post-processing. In each case, a domain is started at the beginning of the dependent clause. Notice that the domain started will then spread back through the B connector. However, the subordinate domains started in these cases are of different kinds. Verbs that take clausal complements (like "said") have Ce connectors, which start 'e' domains; question words and conjunctions, however, have Cs connectors, which start 's' domains, and relative pronouns have "R" connectors, starting 'r' domains. We then dictate that 's' and 'r' domains simply are not allowed to stretch back before the root word (the left end of the starting link). (We call such domains "bounded domains".) Thus the incorrect sentences above are prohibited.

As described above, finite verbs also have "B-" connectors for use in subject-type relative clauses. These cannot be used, however, unless the finite verb can also make an RS connection to the left.
The only way this can happen in questions is if the question contains a subordinate clause: "Who do you think will come?" (See "RS".)

_More about Dependent Clauses in Indirect Questions_

Dependent clauses within indirect questions are, again, handled quite naturally:

              +------Bsw(s(e))----------+
    +-S--+-QI-+    +S(s)+-Ce(s)-+S(s(e))+
    |    |    |    |    |       |       |
    I wonder who  Dave thinks  Bill     hit

Note that in this case, it is very important that the B#w link be a "restricted link": the 'e' domain begun by the Ce must not be traced through it, otherwise it would spread to the rest of the sentence.

Another false positive arises here at the linkage level:

       +-----------O(s)----------+
       |            +-QI-+D**w(s)+
       |            |    |       |
    I read I don't know which   book

Again, this is handled by post-processing. Since 's' domains are "bounded", they may not reach back before the root word. Thus the above construction is prohibited in post-processing.

_Transitive adjectives_

Certain adjectives, when used predicatively, can take transitive infinitives as complements: "It is easy to use". Such adjectives take B+ conjoined with TOt+; see "TOt".

_"Whatever", "whoever"_

Bsd is used for words like "whatever" and "whoever". These words may take object-type relative clauses: however, they simultaneously serve as the subject or object of a main clause. Therefore, they take "(Bsd+ or Ss*d+) & (Os- or Ss+)".

       +-----------Ss-------+
       +------Bsd--------+  |
       |                 |  |
    Whatever you want to do is fine

Since the "whatever" phrase seems to constitute its own clause here, a domain must be started for such constructions. Therefore we make "Bsd" domain-starting. A domain must also be started for the corresponding subject construction:

       +-------Ss--------+
       +--Ss*d--+        |
       |        |        |
    Whatever pleases you is fine

Therefore we make "Ss*d" domain-starting as well.

_Comparatives_

In comparatives, the second half of the comparative can take a transitive verb with no object: "I have more books than she has", "She has as many friends as I have".
A B+ is needed here to satisfy the "B- or O+" requirement on transitive verbs. Thus "than" and "as" have a "Bc" connector. Post-processing ensures that this is only used in certain kinds of comparatives: "*I am smarter than she has". See "MV: Comparatives".

_Prepositional Prepositioning_

99% of the uses of B connectors involve hooking to a verb on the right: either satisfying the object-need of a transitive verb (most often) or the subject-need in "RS" constructions (see "RS"). B- connectors appear analogously on prepositions, however, when the object-need of a preposition can be satisfied by a word on the left. For this construction to be valid, the preposition involved must be modifying a verb, not a noun (i.e., it must be using an MVp- connector, not an Mp-):

    1. Who did you talk to
    2. Which room did you sleep in
    3. *What country did you meet a man from
    4. *How many legs did you see an insect with
    5. Who did you take a picture of (!)

Sentences 3 and 4 are perhaps grammatical, but not if we take the preposition to modify the noun. For this reason, we do not directly disjoin B- with J+, as we disjoin B- and O+; rather, prepositions carry

    (J+ & (Mp- or MVp- or ...)) or (MVp- & B-)

(The exception is "of", as shown by ex. 5; here, B- must be conjoined with Mp-.)

Beyond this, B- on prepositions can be used in all the ways it is used on verbs: in relative clauses ("The man who I talked to is here"), with transitive adjectives ("He is easy to talk to"), with "whatever" ("Whatever you like to work with is fine"), and so on.

_Prepositional-Object Relative Clauses_

A final use of B occurs in relative clauses where the focus of the clause is the object of a preposition modifying the subject.

                  +----Bpj-----+
           +--MX--+-M--+-Jr+RS-+
           |      |    |   |   |
    The doctors, many of whom are surgeons, were angry

                     +--------Bpj-----+
          +-----MX---+-M-+-Jr-+Cr-+-S-+
          |          |   |    |   |   |
    The book, the author of which I know personally, is good

We already have a mechanism for allowing noun phrases surrounded by commas to modify other noun phrases: this is the MX connector (e.g., "The doctor, a good friend of mine, is here"). We use that same mechanism here. In this case, however, the modifying noun acts like the antecedent of a relative clause. It may either act as a subject, in which case it must be followed by an ordinary finite verb phrase; or it may act as an object, in which case it must be followed by a phrase containing a transitive verb but no object. The "B/RS" system is already in place to handle this. In subject-type relative clauses, the relative pronoun provides the needed "RS+" to allow for the following finite verb; in object-type clauses, it makes a "Cr" link to the subject of the relative clause. In this case, "which" and "whom" can serve the same function. The only difference between this usage and an ordinary relative pronoun is that here the relative pronoun is also acting as a prepositional object. Thus we give it "Jr".

    which: (R- or Jr-) & (RS+ or Cr+);

We must then allow the main noun of a comma-modifier phrase to act as a relative antecedent. Thus we give nouns an optional "B#j+", conjoined with their MX-:

    dog: ...(S+ or O- or J- or ({Bsj+} & Xc+ & Xd- & MXs-))

There are several false positives to be avoided here. First, we must prevent the Bsj being used without the appropriate relative pronoun. Subject constructions ("The doctors, many of them are surgeons") will be prevented anyway, because there is no "RS" to connect to the verb.
But object constructions must be prevented: +---MX----+---Bsj-+ | | | The book, the author I know personally, is excellent (Actually this construction will be accepted anyway; the comma phrase will be treated as an ordinary noun-phrase, a noun followed by a relative clause, as indeed it is.) Secondly, we must prevent the "which" from being used in the wrong place: +-Mp-+-Jr+----Cr--+ | | | | I saw the author of which the book was excellent (This parse too will be accepted anyway, using an Mj construction, but in any case the Jr parse is redundant and should be avoided.) We solve these problems in post-processing: we require that the Jr must occur in the same group as B*j. (This goes both ways: a Jr requires a B*j, and a B*j requires a Jr.) BI is used to connect forms of the verb "be" to some idiomatic expressions. Forms of "be" therefore have BI+, directly disjoined with their other connectors: are: S- & (Pg+ or Pv+ or Pp+ or .... or BI+) Some title words like "chairman", "president", "head", etc., can serve as complements of "be" without taking an article. For such words, special dictionary entries have been created: president.i chairman.i: {@AN-} & BI-; Indirect questions can act as complements of "be" as well, but only when the subject of "be" is a word like "question": +-Ss*q+BIq+ | | | The question is who killed Nicole ?The murderer is who killed Nicole The S connectors on words like "question" are thus given a special "s*q" subscript. Question words are given a BIq- connector. P.P. then dictates that these BIq connectors cannot be used unless a Ss*q is present in the group. Question words have BIq- directly disjoined with QI- and W- connectors, used in direct and indirect questions. The phrase "because (clause)" can be a complement with "be" as well, but only if the subject is "this" or "that". "This" and "that" are thus given Ss*b+; "because" is given BIh-; and P.P. dictates that a BIh requires a Ss*b. 
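The "X requires Y" rules that post-processing applies within a group can be pictured as a small table lookup. The Python sketch below is only an illustration of that idea, not the parser's own post-processor: the names REQUIRES, label_fits and violations are invented here, a group is simply assumed to be a list of link labels, and the Ss*b alternative for BIq anticipates the remark that follows.

```python
import re

# Hypothetical rule table (an assumption for illustration): a link whose label
# fits the key may only appear in a group that also holds a link fitting one
# of the listed patterns.
REQUIRES = {
    "Jr": ["B*j"],            # a Jr requires a B*j ...
    "B*j": ["Jr"],            # ... and a B*j requires a Jr
    "BIq": ["Ss*q", "Ss*b"],  # indirect question after "be"
    "BIh": ["Ss*b"],          # "because" clause after "be"
}


def split_label(label):
    """Split a link label such as 'Bpj' into ('B', 'pj')."""
    m = re.fullmatch(r"([A-Z]+)([a-z*]*)", label)
    if m is None:
        raise ValueError(f"not a link label: {label!r}")
    return m.group(1), m.group(2)


def label_fits(label, pattern):
    """True if label has the pattern's name and every non-'*' subscript place."""
    l_name, l_sub = split_label(label)
    p_name, p_sub = split_label(pattern)
    if l_name != p_name:
        return False
    for i, want in enumerate(p_sub):
        if want == "*":
            continue  # wildcard place in the pattern
        if i >= len(l_sub) or l_sub[i] != want:
            return False
    return True


def violations(group):
    """Return the labels in one group whose required partner link is missing."""
    bad = []
    for label in group:
        for trigger, needed in REQUIRES.items():
            if not label_fits(label, trigger):
                continue
            if not any(label_fits(other, p) for other in group for p in needed):
                bad.append(label)
    return bad


if __name__ == "__main__":
    print(violations(["Ss*q", "BIq"]))    # "The question is who killed Nicole"
    print(violations(["Mp", "Jr", "Cr"]))  # Jr with no B*j in the group
```

With the links of "The question is who killed Nicole" the check finds nothing to reject; if the subject link were a plain Ss (assumed here for a word like "murderer"), the BIq link would be reported as unsupported.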
("This" and "that" are also among the subjects that can take "(be) (indirect question)" as a predicate; Ss*b is thus added to Ss*q in post-processing as a connector that permits BIq.)

BT is used with time expressions acting as fronted objects, as in "how many years did it last". See "OT".

BW connects "what" (as a direct or indirect question word) to various verbs like "think", "decide", which are not really transitive verbs but which can connect to "what" in object-type questions.

      +----BW-----+
      |  +----I---+
      |  +SI-+    |
      |  |   |    |
    What do you think?

    *I thought a good idea today.

BW is also used with verbs like "tell" and "ask", which can connect to "what" as if it were an indirect object: "what did you tell him?" "Whatever" can be used in this way as well: "Whatever you think is fine".

C connects conjunctions and certain verbs with subjects of clauses. It is therefore used only in embedded and subordinate clauses, not main clauses.

        +---C--+
        |      |
    I told him I was angry

              +-C-+
              |   |
    Call me when you are ready

Every noun, nominative pronoun, and every other potential subject has a "C-" conjoined with its "S+" connector (but not its O-, J-, etc.). The C- is directly disjoined with a Wd-, which is used in main clauses:

    dog: ({C- or Wd-} & S+) or O- or J-...

When a dependent clause is begun, the subject usually makes a C connection to the left. There are two exceptions. When an object-type relative clause occurs with an omitted relative pronoun ("The man you met is here"), the subject of the relative clause makes an R- connection, not a C- connection. The reason for this concerns the use of "CO": see "CO". Secondly, in indirect object-type questions, the subject of the indirect-question clause makes no left connection at all. The "C-" connection on nouns must therefore be optional.

_Different kinds of C+_

Ce is used for verbs that take clausal complements, also known as "embedded clauses": "tell", "assume", "think", etc.
Such verbs therefore have "Ce+" disjoined with their other complement connectors (TH+, TO+, O+, etc.). All verbs that can take "Ce+" can also take "TH+": "I assumed we would go", "I assumed that we would go". The reverse is not true, however: "*I asserted/whispered/retorted we should go".

Cs is used in several kinds of subordinate clauses. It is used with certain conjunctions, like "when" and "after": "The man I saw after I left your party is here." (Some other conjunctions do not take Cs; see "Wd".) Usually, conjunctions that take Cs+ can either precede or follow the clause they modify ("When I saw you, I left"; "I left after I saw you"). They thus take "Cs+ & (MVs- or CO+)". In many cases, such conjunctions may also take noun-phrases or participles as objects: "The man I saw ( after lunch / after running ) is here"; thus they have "Cs+ or J+ or Mv+ or Mg+", as appropriate.

Notice that conjunctions that take nouns as objects can in general modify nouns also. Some, like "after", can modify nouns, as well as taking nouns as objects ("The party after the lecture was good"); in this sense, they are essentially acting as prepositions, and take Mp-. This raises the question of whether they can take Mp- and Cs+ in conjunction: "?The party after Fred graduated was excellent". We allow this, but the expressions could easily be rewritten to prevent it.

Cs is also used for certain nouns that take clausal complements, like "way" and "time": "I remember the time I went to London". Such nouns therefore have "Cs+ or @M+....", conjoined with their main "S+ or O-..." complex.

Cs is also used in where/when/how indirect questions: "I wonder where they will live". Such question words therefore have "QI- & Cs+". (In direct questions of this kind, s-v inversion must take place; therefore no C connection is made. See "Wq".)

_Reasons for the C+ distinctions_

The reason for making the distinction between Ce and Cs relates to "bounded domains".
In relative clauses and questions (direct and indirect), a transitive verb can make a B connection to a preceding noun-phrase; however, there are constraints on how this may be done. A B link may be made to a word within an embedded clause ("I wonder who Joe thinks Bill hit"), but not to a word within a subordinate clause ("*I wonder who Joe cried when Bill hit."), nor to a word within a relative clause or indirect question. To enforce this, Ce connectors (found in embedded clauses) start 'e' domains, Cs connectors (found in conjunction-linked subordinate clauses, and some indirect questions) start 's' domains; we then dictate that B links can extend out of 'e' domains, but not 's' domains. See "B: B links involving dependent clauses" for further explanation.

A "B" link may not be made to a word within a subordinate clause, from outside that clause. However, it is perfectly fine to have a conjunction-connected subordinate clause within a relative clause, as long as the B link is not inside it:

    *The man I cried when John hit is here
    The man I hit when John cried is here

_Ca_ is used in indirect adverbial questions:

        +--R-+--EEh-+--Ca-+
        |    |      |     |
    I wonder how quickly John ran

Adverbs that can be used in this way have "EEh- & (Ca+ or Qe+ or MVa-...)". (Qe is used in direct adverbial questions.) Ca can only be used in indirect questions. This is enforced because the sentence must connect to the wall, and can only do so through "how". ("How" could make a direct "W" connection to the wall, but this would trigger post-processing constraints which would prevent "Ca" from being used.) Like Cs connectors (used in other indirect questions), Ca starts an 's' domain, thus putting the indirect question clause in its own group.

_Cr_ is used only in the obscure "Noun-modifying prepositional-object relative clause" construction: see B*j.

CC is used to connect clauses to coordinating conjunctions.

        +-------------CC-------------+
        +--S---+--MV--+-C-+-S--+     +-W-+-S-+
        |      |      |   |    |     |   |   |
    3. John screamed when I arrived but Sue left

CC is used with coordinating conjunctions only, and it links to the subject of the previous main clause. Subordinating conjunctions, by contrast (like "when" and "after"), link to the main verb of the previous clause, main or dependent. See "W" for more explanation.

CCq is used with verbs like "say", which can be used in a quotation or paraphrase:

            +-------------CC----------------+
            |             +-------Xd--------+
            |             |         +---S---+-Xc-+
            |             |         |       |    |
    The President is busy , the spokesman said \\\\\

Verbs that can take quotational complements of this kind have "CCq-", directly disjoined with their other complement connectors. The verb must connect to commas on either side (or to the right-hand wall); see "Xc".

    say.v: (S- or I-...) & (O+ or Ce+ or TH+ or (Xd- & Xc+ & CCq-));

Verbs that can be used this way can usually also be placed elsewhere in the sentence - either after the subject phrase (ex. 1), or after an opener (ex. 2). In the first case, they use Eq; in the second case, they use COq.

               +-------------S---------------+
               |                       +-Eq--+
               |                       |     |
    1. The President , the spokesman said , is busy

        +---------------------CO--------------------+
        |                              +-----COq----+
        |                              |            |
    2. At the moment , the spokesman said , the President is busy

Verbs such as "say" therefore have the following:

    (S- or I-) & (O+ ... or (Xd- & Xc+ & (CCq- or Eq+ or COq+)));

The category of verbs that can take quotations like this cuts across other verb categories. For that reason, it seemed simpler to designate new dictionary entries for quotation-taking verbs; these entries have only the complement connectors used in quotation (CCq, Eq, and COq). Such entries are subscripted with ".q" ("say.q", etc.).

Quotation complement connectors - Eq, CCq, and COq - are all domain-starting. The structures created in such sentences are rather unusual, however. With domains, the usual principle is to have embedded clauses nested inside the domains of main clauses.
In cases of quotation, it would seem (semantically anyway) that the quoting verb (e.g. "say") is the main clause, and the quotation itself is dependent on it. Therefore, where possible, we try to make the quoting expression start a domain which includes the quoted expression. This is easy in the case of Eq and COq; both are simply made domain-starting links. It is more difficult in the case of CCq, given the link structure; there, the quoting expression is to the right, and cannot easily start a domain which will spread back to the quoted expression. So, in the case of CCq, the CCq again starts a domain, but this causes the quoting expression to be nested in the quoted one rather than vice versa. +-------------CC------------------+ | +-------Xd(e)-----+ +---D--+--S---+-Pa--+ | +---S(e)+Xc(e)+ | | | | | | | | The President is busy , the spokesman said \\\\\ CO is used to connect "openers" to subjects of clauses: +----------CO------------+ | | 1. Apparently, they went to a movie 2. On Tuesday, they went to a movie 3. Although they were tired, they went to a movie 4. Leaving the kids at home, they went to a movie 5. Abandoned by their parents, they went to a movie 6. Still upset about Joe, they went to a movie Various kinds of words have CO+ connectors: adverbs (ex.1), prepositions (ex.2), conjunctions (ex.3), participle phrases (ex. 4-5), and adjective phrases (ex.6). Openers may take commas; almost all words with CO+ therefore have an optional "{{Xd-} & Xc+}". See "Xc". Nouns have optional @CO- connectors, conjoined with their C- and W- connectors, conjoined in turn with S+: dog: ...({({@CO-} & (W- or C-)) or C-} & S+) or O- or J-...) Thus a CO link is always made to the subject of a clause. There are constraints on the way clauses can take openers. Main clauses can always take openers, as can subordinate and embedded clauses (ex. 1-3 below); relative clauses can only do so if they involve a relative pronoun (ex. 4 and 5); indirect questions cannot (ex. 6): 1. 
Screaming furiously, Fred left the room 2. I was about to leave when, screaming furiously, Fred attacked me 3. I told her that, after the party, I would meet her 4. This is the man who, in some ways, I would like to hire 5. *This is a man, in some ways, I would like to hire 6. *I wonder who, on Tuesday, John had lunch with Recall that the clauses at issue here take different connectors. Main clauses make a W connection; embedded and subordinate clauses make a C. Indirect questions subjects take no left-branching connector at all. Relative clause subjects make C connections when a relative pronoun is present; otherwise they make R connections to the previous noun. Note that in the above expression for "dog", C- and W- are conjoined with CO+; C- is not. Thus, when a noun subject is making a C- connection backwards, or when it is making no connection at all, it may not take an opener. Therefore ex. 5 and 6 are rejected. A further distinction is necessary here. Participle phrases can modify main clauses; they are very rarely found, however, on dependent clauses (either embedded or subordinate). Shouting loudly, Fred ran out of the room *John told us how shouting loudly, Fred ran out of the room *John had just entered when shouting loudly, Fred ran out of the room For this reason we give participle phrases COp+; we then create two CO- connectors on nouns, one conjoined with W-, the other with C-. The one with C- is subscripted COd-: {({@CO-} & W-) or ({@COd-} & C-) or Cc-} & S+ _Participles as Openers_ A further point is needed about participle openers. They appear to take complements in the same manner as ordinary participles. There is a problem here, however. 
While participles always make a connection to the left to an auxiliary, they sometimes make a further-left connection to a fronted object, for example: +-------B------+ | +---Pg----+ | | | What book are you reading For this reason, the Pg or Pv on participles must be to the left of the complement expression on participles. However, openers may not be used in this way. Moreover, on openers, the CO connection (directly disjoined with the Pg or Pv) must be made further to the right than any complement connections: +----------CO---------+ +---+ | | | | Saying he was innocent, John left the room Angered by what he saw, John left the room +------------------------+ +---CO-+ | WRONG: | | | Saying, John left the room he was innocent Angered, John left the room by what he saw Here, then, the CO+ must be to the right of the complement connectors on the expression. For this reason, we must fully disjoin the CO and Pg/Pv connectors on passive and progressive participles. This is necessary in any case for progressive participles in the case of gerunds; for gerunds, also, the S+ must be to the right of the complement connectors on the expression. (See "Ss*g".) We also directly disjoin COp+ here with MVx- (used for comma participle phrases modifying verbs), and MX*p- (used for comma participle phrases modifying nouns). hitting: (Pg- & (O+...)) or (O+...) & (Ss*g+ or COp+ or MVx- or MX*p-); angered: (Pg- & {@MV+}) or ({@MV+} & (COp+ or MVx- or MX*p-); CQ is used to connect to auxiliaries in cases where the subject and auxiliary are inverted, and where the complement need of the auxiliary is satisfied by a word behind it. Such situations include comparatives, with "more...than" or "as...as", as well as other situations involving "as". +-CQ-+ | | He has more money than does Joe Normally, subject-verb inversion is tightly constrained; it may only occur in certain types of questions. 
To allow it in situations such as these, we must add "CQ" to a special list of connectors in post-processing which permit (and in fact require) s-v inversion. See "SI". See also "MV: Comparatives: Constrained Uses with Separate Clauses".

CX is used in comparative constructions like the following:

                       +--CX--+
                       |      |
    He has more money than I do

Normally an auxiliary like "do" or "have" must be followed by a participle. In cases like the above, however, this need can be satisfied by a preceding comparative word like "as" or "than". Auxiliaries thus carry

    do: S- & (I+ or CX-...)

See "MV: Comparatives: Constrained Uses with Separate Clauses".

D connects determiners to nouns.

     +-D-+
     |   |
    The dog died

The first two subscript places on D links relate to number agreement. Consider the following simplified entries.

    the: D+;
    a: Ds+;
    some: Dm+;
    many: Dmc+;
    much: Dmu+;
    dog: Ds- & ...;
    dogs: {Dmc-} & ...;
    water: {Dmu-} & ...;
    war: {D*u-} & ...;

Essentially there are three categories of noun and article: singular, mass, and plural. The first subscript place distinguishes between singular ("s") and everything else ("m"); the second place distinguishes between plural ("c") and mass ("u") (for "countable" and "uncountable"). Nouns and articles which are singular-only have Ds; those which are plural-only have Dmc; those which are mass-only have Dmu; nouns which may be singular or mass have D*u-; determiners which may be plural or mass have Dm+; and determiners which may be mass, plural or singular have D+. (A few nouns, such as "fish", may be plural or singular; for these we create multiple dictionary entries.)

The third subscript place on D connectors relates only to post-processing. D##w connectors are used for question-determiners: "which", "what" and "whose". The w triggers post-processing constraints relating to question-inversion; see "SI". D##w connectors may only be used in questions: "*I bought which eggs today".
This is enforced because the D##w on question-words is conjoined with (W- or QI-); it must make a link back to either the wall or an indirect-question verb like "wonder". _Determiners and Adjectives on Proper Nouns_ In general, proper nouns may not take determiners. However, there are a number of exceptions. A number of proper nouns may take the determiner "the": "The Emir of Kuwait died", "The Supreme Soviet met today". For these we use the DG connector; see "DG". Beyond this, however, one quite often sees proper names taking determiners, for example with brand-names or with people. ?The new David Letterman is a happy, secure David Letterman. ?I bought a Toyota to carry my Macintosh and several IBMs Thus we give proper nouns D- at stage 2. Note that in the first case the proper noun carries an adjective as well; this is also not uncommon in more colloquial writing. Thus proper nouns carry [[{@A-} & {D-}]] & (S+ or O- or J-...) DD is used to connect definite determiners ("the", "his", "John's") to number expressions and adjectives acting as nouns. DD connects determiners to number expressions: +-DD-+--D--+ | | | My three sisters are coming next week +---------S-------+ +-DD-+-M-+ | | | | | The two in the window are very attractive In the first case above, the number expression is really acting as the determiner (there must be number agreement with the subject: "*My three sister is coming"); the other determiner is superfluous. In the second case, the number acts as a noun-phrase; note that it may take prepositional phrases, relative clauses, etc., just like an ordinary noun. In either case, the number does not require a DD connection; it is optional. Numbers therefore have the following: three: {DD-} & (Dmc+ or ({@M+ or ...} & (S+ or O-...))); Only certain determiners may be used here: "*A three sisters are coming","*Many three students are coming". DD is also used to connect determiners to adjectives when they are being used as self-contained noun-phrases. 
Here again, the definite determiner is most often used; possessive determiners can perhaps be used (we allow them); but singular, plural-mass determiners are incorrect, as is use with no determiner. +----O-----+ +---O----+ | +-DD-+ | +-DD+ | | | | | | This law will benefit the rich and hurt the poor ?This law will benefit our rich and hurt our poor *This law will benefit a rich and hurt some poor *This law will benefit rich and hurt poor Adjectives may be used in this way as subjects ("The poor will suffer"), objects ("It will affect the poor"), or prepositional objects ("It applies to the poor"). When used as subjects, adjectives act as plural forms ("The poor are/*is going to suffer"). This construction is most often seen with a few adjectives such as "rich", "poor", "powerful", "meek", and "famous". We allow it with all adjectives; however, since it is quite rare, we make it a stage 2 construction. Adjectives therefore have: poor: A+ or (Pa- & ) or [[DD- & (Sp+ or SIp- or O- or J-)]]; DG connects the word "the" with proper nouns. In general, proper nouns may not take determiners; but we do allow proper nouns to take the determiner "the". This allows things like "The Emir of Kuwait died", "The Supreme Soviet met today", but not "*I have known many Kuwaits", "*We need a Soviet". Other uses of determiners with proper nouns are allowed at stage 2; see "D: Determiners with Proper Nouns". DP is used to connect possessive determiners to gerunds in cases where the gerund is taking its normal complement: +----------Ss*g----+ +----TOo--+ | +---DP--+-O---+ | | | | | | | Your telling John to leave was a mistake See "S**g: Gerunds: The Complement / No Determiner Case." DT connects determiners with nouns in certain idiomatic time expressions like "next week" and "last Tuesday". +-------MVp---------+ | +DTi+ | | | I'm going to London next week This might be a good place for a general discussion of time expressions. English contains many kinds of time expressions. 
The most common uses of these expressions are analogous to prepositional phrases (indeed, many time expressions are prepositional phrases). Time expressions therefore often connect to the rest of the sentence with the same connectors as prepositional phrases: MVp- (when following a verb), Mp- (when following a noun phrase), and CO+ (when preceding a clause). Many time expressions are single words or simple prepositional phrases ("yesterday", "on Tuesday"), but many others are idiomatic in construction, and use special-purpose connectors like JT, DT, and Yt: "It happened three weeks ago / last Tuesday morning / two days after the party". (Certain place expressions like "five miles away", which use Yd, are similar in this regard.) In such cases, decisions must be made, sometimes rather arbitrarily, about which word in the time expression will serve as the head, i.e., will make the MV connection to the rest of the sentence.

When time expressions contain a preposition or conjunction, that word is almost always the head of the expression. Often the head word is modified by a quantity expression, using a Yt connector (see "Y"). There are many time expressions, however, which do not use prepositions or conjunctions, such as this one:

    I'm going to London next week

With a phrase like "next week", we make "week" the head of the expression (we create a special "week.i" dictionary entry for such nouns); we use DTi to attach a word like "next" (or "last"). We then give "week.i" the usual prepositional phrase expressions: MVp-, Mp-, CO+.

    week.i: DTi- & (MVp- or Mp- or CO+);
    next: DTi+;

          +-------MVp---------+
          |               +DTi+
          |               |   |
    I'm going to London next week

A phrase like "next week" can also be used as a noun phrase.

    1. I'll be free after next week
    2. Next week would be good
    3. Are you ready for next week

With certain prepositions, time expressions are very common as objects: "after", "by", "until" (ex. 1). For these we use JT (see "JT").
Other noun-phrase uses of time expressions are less common, and rather colloquial (exx. 2-3). Therefore we give "week.i" the usual "S+ or O- or J-" complex for nouns, but we make it stage 2.

    week.i: DTi- & (MVp- or Mp- or CO+ or JT- or [[S+ or O- or J-]]);

With phrases like "this week" and "every week", we must do something similar. However, there is an important difference between "this week" and "next week". "This week" is a well-formed noun phrase on its own, while "next week" is not. Therefore the noun-phrase uses of "this week" (like exx. 1-3 above) will take care of themselves, simply using the ordinary noun entry "week.n".

     +-D-+--S-+
     |   |    |
    This week would be good

We need only worry about the prepositional phrase usages: "I can see you this week". Therefore, for determiners like "this" and "every", as well as phrases like "the_next" and "the_previous", we create a DTn connector, conjoined with the prepositional-phrase connectors of "week.i" but not the noun-phrase connectors:

    week.i: ((DTn- or DTi-) & (MVp- or Mp...)) or (DTi- & (JT- or [[S+ or O-...]]));
    this: DTn+;
    next: DTi+;

(There are a few other determiners like "one" and "all" which can be combined with some time-units and not others: "one day"/"*one week", "all day"/"*all month". For these we create one-word idiom entries. The same distinction must be drawn here, however, between phrases that are well-formed noun phrases - "one day" - and those that are not - "all day".)

Phrases can also be formed with all these determiners (both the DTn+ ones and the DTi+ ones) with month and day-of-the-week names. Again, such expressions occur most commonly as prepositional phrases or JT objects (exx. 4-6), but can also occur as noun phrases (exx. 7-8). Again, we make the noun-phrase usages stage 2.

    4. We should meet this Tuesday
    5. You should do it before next January
    6. Last Monday, I saw Fred
    7. Next Tuesday would be good
    8. Are you ready for next Monday

Unlike words like "week", day and month names can be used as noun phrases with no determiner at all (exx. 9-10). Day names are often objects of "on", and month names are often objects of "in"; these are so common that it seems sensible to make them stage 1 usages; therefore we create special "ON" and "IN" connectors for them (ex. 11).

    9. Monday would be good
    10. Are you ready for April

               +---MV---+-ON-+
               |        |    |
    11. I saw Fred on Monday

Finally, month names can be used with dates and years. For this we use TD and TY. For this purpose, special categories have been created for numbers which are common as years and days-of-the-month.

                         +--TY--+
              +--ON-+-TD-+  +-Xd+-Xc+
              |     |    |  |   |   |
    I saw him on January 21 , 1990 \\\\\

TD is used to connect day-of-the-week names to words like "evening": "I saw him Tuesday evening". TA is used to connect adjectives like "late" to month names: "We did it in late December".

There are some other subtle distinctions regarding time expressions which we will not go into here; the following sentences illustrate some of them.

    I saw him Monday
    *I saw him January
    I saw him January 21
    *January 21, I saw him
    We did it on Jan 21
    *We did it in Jan
    *I saw him on early January 21
    Monday's concert should be good
    This Monday's concert should be good
    This week's concert should be good
    The Monday concert should be excellent
    *The this week concert should be good
    I saw him in January 1990

E is used for verb-modifying adverbs which precede the verb:

           +---E---+
           |       |
    he apparently is not coming

This is perhaps a good place for a general discussion of adverbs.

_Types of adverb_

There are a number of types of adverb. Some kinds of adverb are quite specific in the way they may be used: adjective-modifiers ("EA"), adverb-modifiers ("EE"), comparative-modifiers ("EC"), number-modifiers ("EN"). These kinds of adverbs are highly constrained in their use; thus we assign a specific connector type to each.
(See specific connector entries for discussion.) Other kinds are less constrained in the way they can be used: adverbs of manner, clausal adverbs, and time adverbs. Adverbs of manner refer specifically to the manner in which an action is done: "He laughed loudly", "She ran quickly". Clausal adverbs refer in some way to the clause as a whole: "John is apparently coming", "John is also coming", "John is actually coming". Time adverbs give information about the time of the action: "John is soon coming". Each of these types of adverbs has a variety of usages, some of which overlap; therefore they have some connectors in common. However, the usage of each category is slightly different; there is also some variance within the categories. Several kinds of connectors are used with manner/clausal/time adverbs. E connects adverbs to following verbs. Thus every verb has an optional "@E-", conjoined with the rest of its expression. MVa connects adverbs to preceding verbs or adjectives (see MVa). (In a series of verbs, "He will want to have tried to do it", only the last one will have an MV+ available for use.) EB connects adverbs to forms of "be", when the "be" word is connecting to an object or prepositional phrase. And CO connects adverbs to a following subject-noun-phrase. (A somewhat fuzzy distinction is assumed here between adverbs and prepositions. Generally prepositions take objects, can modify nouns, and can be complements of "be", whereas adverbs differ in all three respects. However, there is some gray area between the two. See "MVp" for more discussion.) _Manner adverbs_ Manner adverbs may occur after the verb: "He ran quickly, she laughed loudly". They may also occur at the beginning of the sentence: "Quickly he ran", "Loudly, she laughed". Or, they may occur before the verb (or before any of the verbs, if there are several): "She had quickly opened the door"; "She quickly had opened the door". 
They may not usually occur after forms of "be": "*She was loudly in the kitchen". Therefore, most manner adverbs carry "MVa- or E+ or CO+". However, a few manner adverbs do not take E+ or CO+, like "properly" and "outright": "*You should properly do it", "*Properly you should do it".

_Clausal adverbs_

Clausal adverbs may almost always occur before the verb: "He almost/probably/fortunately closed the door". They can almost never be used following the main verb: "*He closed the door almost/probably/fortunately". They can be used before the subject in some cases, not in others: "Probably/fortunately/apparently he closed the door", "*Barely/simply/ever/almost he closed the door". A few can only occur before the subject: "Maybe he is coming", "*He is maybe coming", "*He is coming maybe". Generally they can occur after forms of "be": "They are apparently/probably/fortunately good programmers." In short, the norm for clausal adverbs is "E+ or CO+ or EB-", but there are many exceptions.

Note that there are also a few adverbs which can be used either as clausal or manner adverbs, like "clearly" and "sadly". These adverbs thus take "MVa- or E+ or CO+ or EB-", and when used with E+ or CO+, they are ambiguous: "Clearly, he read the speech". (Many clausal adverbs _can_ follow the verb with commas: thus they take "E+ or CO+ or EB- or (Xd- & Xc+ & MVa-)". See "MVa".)

A special category of clausal adverbs is words like "chemically" and "financially". These would appear to be clausal adverbs in that they modify the entire clause rather than the verb, and like clausal adverbs, they can occur before the clause, before the verb, or after forms of "be" (exs. 1, 2 and 3 below). However, unlike most clausal adverbs, they may follow the main verb (ex. 4), and they may also modify adjectives (ex. 5). Thus such adverbs take "MVa- or E+ or CO+ or EA+ or EB-".

    1. Biochemically, the experiment was well-designed
    2. We biochemically altered the materials
    3. It was biochemically a good experiment
    4. The experiment was well-designed biochemically
    5. We need to get some biochemically valid results

_Time adverbs_

Many time adverbs (sometimes, often, recently, soon) can be used either after the verb, before the verb, after "be", or before the subject: "Sometimes, we have chicken", "We sometimes have chicken", "We have chicken sometimes", "We are sometimes in the garden". Such adverbs thus take "MVa- or E+ or CO+ or EB-". However, a few ("always", "never", "rarely") can only take E+ or EB- ("*Always he is late", "*He is late always").

_Other kinds of adverbs_

Other more specialized uses of adverbs are explained elsewhere: adverbs modifying adjectives and other adverbs (see "EA", "EE"), those modifying number expressions ("EN"), those modifying comparatives ("EC"), and those modifying prepositional phrases (see "MVl"). The fact that a number of words belong to different combinations of these categories explains the large number of adverb categories in the dictionary.

A further complication is that some adverbs can be modified by adverbial adverbs like "very"; others cannot. Those that can have "{EE-}" conjoined with their other connectors; such adverbs can also be part of adverbial questions ("How quickly did you run"), and therefore must also take Ca+ and Qe+. See "EE", "Ca", "Qe".

_High-cost Uses of Adverbs_

Some uses of adverbs are more rare, and are therefore given a cost of 2, making them stage 2 usages. It was mentioned that certain adverbs, like "very", can modify adjectives: "He is very skillful". However, one also sometimes sees ordinary manner adverbs modifying adjectives:

    The cellist's delicately melodic style contrasted with the fiercely abrasive tone of the violin and the pianist's violently percussive chords.

To allow this, we give manner adverbs high-cost "EA+" connectors.
Note that adjectives only have one (non-multiple) "EA-" connector; therefore, if an adjective takes an adverb in this way, it cannot also take an ordinary "intensifying" adverb like "very". This seems correct: "*The delicately very melodic tone of the cello was beautiful." Adjectives may also sometimes be modified by clausal adverbs and time adverbs. 1. He was often friendly 2. The often underpaid administrators used to resent the invariably rude students and the understandably impatient professors. Recall that clausal adverbs take "EB-" connectors, and forms of "be" have "EB+" connectors; thus in a sentence like 1 above, "EB" is used (see "EB"). In ex. 2, however, no EB+ connector is available. Thus we must allow the adverb to connect directly to the adjective. We therefore give adjectives "@E-" connectors, conjoined _only_ with their A+ connectors, not with their Pa-. Since this usage is fairly rare, we again make it a stage 2 usage. This yields the following: rude: {EA-} & (({[[@Ec-]]} & A+) or (Pa- & ...)); This requires several further comments. First, why is it that clausal/time adverbs attach to adjectives with "E+", and manner adverbs attach with "EA"? For one thing, as mentioned above, manner adverbs seem to take the place of ordinary adjectival adverbs like "very" and "quite", whereas clausal and time adverbs do not: The often very underpaid administrators used to resent the invariably quite rude students and the understandably rather impatient professors. Furthermore, when time/clausal adverbs are used as well as an adjectival adverb (or a manner adverb), the time/clausal adverb must come first: The cellist's sometimes very melodic tone contrasted with the largely rather percussive chords of the piano *The cellist's very sometimes melodic tone contrasted with the rather largely percussive chords of the piano Both of these distinctions are enforced by the solution described above.
Finally, a problem arises here: manner adverbs also take E+ connectors, so given the above expression for adjectives, a sentence like "The fiercely percussive piano chords" will receive two parses. To prevent this, we give manner adverbs "Em+", and we give adjectives "Ec-", as shown. EA connects adverbs to adjectives. Certain adverbs can modify adjectives ("very", "quite", "relatively"); these have EA+ connectors. Adjectives have optional "EA-" connectors. +----Pa-----+ | +--EA---+ | | | He is pretty stupid EAh connects to adjectives in the same manner as "EA". EAh+ occurs only on the word "how", and is used in adjectival questions: +-Wq-+EAh-+-----AF-----+ | | | | || How stupid can you be (Also indirect questions: "I wonder how stupid he can be.") On adjectives, "EA-" is optionally conjoined with "Pa- or A+ or AF+". It would seem, then, that there is nothing to prevent EAh being used with Pa- (exs. 1 and 2 below) or A+ (ex. 3): +---Pa--+ +-S-+ +-EAh+ <-- ? | | | | 1. *He is how stupid. 2. *He is I wonder how stupid. 3. *The how stupid man is here. This problem is analogous to that of question-word determiners ("*He bought I wonder which book"), and is prevented in exactly the same way. Like "D**w+" on "which", EAh+ on "how" is conjoined with "QI- or Wq- or Ws-", and therefore must make a link back either to a verb taking indirect questions ("wonder", "know") or to the wall. Therefore, in ex. 1 and 3, no linkage is found. QI- also starts an 's' domain, which is bounded: i.e., it is prohibited from spreading back to the left of its root word. Thus ex. 2 is prohibited in post-processing. In practice, then, EAh only occurs with AF, never with A+ or Pa-. There is one problem, however. Usually question-word determiners can be used in noun-focused (subject- or object-type) questions: "Which dog did you chase", "Which dog chased you". Similar linkages might be imagined, involving EAh links: +-EAh+-A--+ | | | 1. *How big dogs did you chase 2. *How big dogs chased you.
However, such constructions are clearly incorrect. Thus these must be prevented somehow. Another point: Notice that the noun here is plural. With singular nouns, a determiner is required, so no such linkage can be formed (*"How a big dog did you chase" would involve crossing links). However, there is a way of expressing this thought: ex.3 below. With subject-type questions, though, even this construction seems wrong (ex.4): 3. How big a dog did you chase? 4. *How big a dog chased you? Other questions with "how" seem fine: subject- or object-type question like 5 and 6, involving "how many" or "how much", or adjectival questions like 7. 5. How many dogs did you chase? 6. How many dogs chased you? 7. How big was it? The rule seems to be that if a question is "adjective-focused" (that is, if the degree of the adjective is what is being asked about), it must be a "be" type question (using AF), rather than a subject-type (S) or object-type (B) question, unless it is an object-type question using "how (adj) a (noun)" (as in ex. 3). (We must also of course enforce s-v inversion in questions, and prevent it in indirect questions; this involves other connectors. See "SI: Questions requiring s-v inversion".) How do we enforce these constraints? First we must allow the rather odd construction in ex. 3. To do this we use the HA and AA connectors (see "AA" for a full explanation): how: (EAh+ & {HA+}) a: EAh- & AA+; This yields the following: +---HA---+ +EAh+-AA-+-Ds-+ | | | | How big a dog was it But this allows both 3 and 4; 4 must be prevented. Moreover, plural constructions, both subject and object- type (ex. 1 and 2), must be prevented too. (We cannot make the HA+ obligatory on "how"; EAh is often used on its own, as in "How big is it".) We prevent these constructions in post-processing. We simply state that EAh connectors can only be used when one of a list of connectors is present in the group: either AF (as in ordinary adjectival questions) or Bsm (as in ex. 3). 
Thus we weed out all constructions where the noun of the "how" phrase is either a subject or where it is a plural object. This applies in the same way to indirect questions. (Note that with indirect questions analogous to ex. 2 above - "I wonder how big dogs chase you" - another interpretation is possible in which "how" is analogous to "when"; this is correct and is accepted.) EB connects adverbs to forms of "be" before an object or prepositional phrase: +------------O-----------+ +-S-+--EB--+ | | | | | He is apparently a good programmer Forms of "be" therefore have an optional "EB+". Note that "EB+" is conjoined with MVp+ or O+, but not with Pg+ or Pv+; this is because present and passive participles have @E- connectors, so they can connect to adverbs using these. Certain adverbs can also be used in comma modifiers, following the first comma: +------MX-------+ | +----Xc-------+ | +--EB--+ +-----Xd----+ | | | | | A man, apparently in a bad mood, was there A man in a bad mood was there *A man apparently in a bad mood was there As the third example shows, if an adverb is to be used in such situations, the modifying phrase must be surrounded by commas. For this reason, it is simplest to make the adverb attach directly to the preceding comma. (This is the only case where words attach to each other _via_ a comma.) We then give commas "{@EB+} & Xc+". (See "X" for more on commas.) Now the only problem is that while many adverbs that can follow "be" can also follow commas, some cannot: "He is really a good player", "*John, really a good player, beat everyone". Thus we give such adverbs "EBm-", and we give commas "EBc+". Regarding the kinds of adverbs that take EB-, see "E: Types of Adverb". EC connects adverbs and comparative adjectives: +---Pa---+ | +-EC-+ | | | It is much bigger Comparative adjectives have optional EC- connectors conjoined with their "Pa- or A+".
The same class of adverbs that modify comparatives - "much", "somewhat", "a little" - can also modify comparatives like "more", "less", and "fewer": "I like him much more now", "We have much fewer students now", "I earn much less now". Notice that comparatives can be so modified whether they are acting as adverbs, noun phrases, or determiners; thus they have EC- optionally conjoined with the rest of their expression. ECn is used in noun-focused comparative questions, like "How much more money do you have". See "EEh". ECa is used in adjective-focused comparative questions like "How much more efficient are they". See "EEh". EEx links adverbs to the word "much". Many adverb-modifying adverbs, like "very", can also modify the word "much". "How" also has an EEh connector; but this connector has special significance for post-processing (see "EEh"). Therefore we prevent the EEh from linking to "much"; instead, we create a special "H" connector for this purpose. EE connects adverbs to other adverbs. Some adverbs can modify other adverbs ("very", "quite"); these carry EE+ connectors. Adverbs which can be modified in this way (some cannot) take EE-. +---MVa---+ | +-EE-+ | | | I ran very quickly EE can also be used with E ("He very quickly left"), CO ("Very quickly, he left") and EB ("He is very clearly a good programmer"). _EEh: Adverbial questions_ EEh is used to connect "how" to adverbs, in adverbial questions (direct or indirect): "How quickly did you run", "I wonder how quickly you ran". Adverbs that can be used in questions like this thus take "{EE-} & (Qe+ or Ca+ or MVa-...)". "Qe" is used in direct questions; "Ca" is used in indirect questions. +---I---+ +EEh-+--Qe--+-SI+ | | | | | | How quickly did you run EEh is therefore analogous to EAh (see "EAh"). The use of EEh is constrained by post-processing in a very similar way. The "EE-" on adverbs is conjoined with MVa-; thus an adverb might, in principle, make both an EEh and an MVa connection: "*I ran how quickly".
In practice, however, EEh is only usable with Qe+ (used in direct questions) and Ca+ (in indirect questions). EEh+ on "how" is conjoined with "QI- or Wq- or Ws-". When EEh is used with MVa, either "how" cannot make the connection it needs to the left, or else the bounded-domain constraint is violated. See "EAh". _How Much, How Much More_ "Much" can also be used with "how" as a determiner or noun-phrase (see "H"). It can also make an EC+ connection to "more"; "more" may then act as a noun-phrase (ex. 1) or as a determiner. Alternatively, "more" can act as a comparative adverb modifying a sentence or a following adjective. In this case, however, a different linkage is used (exs. 2 and 3): +--H-+ECn+ | | | 1. How much more did you earn +EEh-+ECa+ | | | 2. How much more can you run 3. How much more efficient is your program In either case, "how" makes a W connection to the wall (in direct questions) or a QI connection (in indirect questions). In that case, why do we distinguish between "H" and "EEh" at all? The reason is that there are complex constraints on the way this phrase is used. Consider the following: 1. How much more money do they have 2. How much more money will be coming in 3. How much more efficient is their program 4. *How much more efficient programmers do they have 5. How much more efficient a programmer do they have 6. *How much more efficient programmers work for them 7. *How much more efficient a programmer works for them A pattern emerges here, similar to the pattern that emerged with "EAh": 8. How many dogs did you chase 9. How many dogs chased you 10. How big is it 11. *How big dogs did you chase 12. How big a dog did you chase 13. *How big dogs chase you 14.
*How big a dog chased you When the focus of the question is an adjective, either directly or indirectly ("How efficient", "How much more efficient"), then it must be a "be"-type ("AF") question; it may not be a subject- ("S") or object-type ("B") question; the exception is object-type singular questions like ex. 12 ("How big a dog"). Exs. 8 and 9 are not adjective-focused; ex. 10 contains an "AF" link; thus these are fine. We already have a mechanism for ensuring that in adjective-focused questions like 10-14, an AF link must occur, except in cases like ex. 12 (see "EAh"). Thus the constraints are set up just the way we want: we just have to find a way of triggering them in the right cases, i.e., in "adjective-focused questions"; ex. 3-7 above. The problem is that in cases 8-14, adjective-focused questions are easily characterized; they all contain an "EAh". With exs. 1-7, they are not so easily characterized. We need to make sure that the adjective-focused questions, and these only, all contain a certain connector. Then we can have that connector trigger the same constraints that are triggered by EAh. This we do through connector logic; the connector we use is EEh. (Irrelevant connectors and subscripts are omitted here.) how: (QI- or W-) & (H+ or EEh+); much: (H- & (ECn+ & Dmu+)) or (EEh- & (ECa+)) more: (ECn- & (Dmu+ or S+ or B+)) or (ECa- & EA+) If "more" forms an EA+ link with an adjective - implying an adjective-focused question - an ECa link with "much" is required; an EEh link with "how" is then formed (if any link with "how" is formed); and the "adjective-focused question" constraints are applied. Thus 4, 6, and 7 above are prohibited; 3 and 5 are allowed. However, if "more" is acting as a determiner or noun-phrase (as in 1 and 2 above), its ECn- connector is used; the ECn+ on "much" is therefore used, an H link with "how" is formed, and the "adjective-focused question" constraints are not enforced.
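The post-processing constraint described above for EAh, and reused here for EEh, can be pictured as a simple check over the link names in one group. The following is only a sketch under stated assumptions: the function names, the representation of a group as a list of link-name strings, the prefix-matching rule, and the hand-written link sets are all invented for illustration and are not the parser's actual rule file or code. The licensing connectors themselves (AF or Bsm for EAh; Qe and Ca in addition for EEh, since it also serves adverbial questions) are the ones named in the text.

```python
# Hypothetical sketch of a "contains_one"-style post-processing check.
# Assumption: a group is just a list of link names (strings).

# Trigger connector -> connectors of which at least one must appear in the
# same group. AF and Bsm come from the discussion of EAh; Qe and Ca are
# included for EEh because it is also used in adverbial questions.
CONTAINS_ONE = {
    "EAh": ("AF", "Bsm"),
    "EEh": ("AF", "Bsm", "Qe", "Ca"),
}


def matches(link, pattern):
    """Crude stand-in for subscript matching: the link name begins with the pattern."""
    return link.startswith(pattern)


def group_is_allowed(links):
    """Return False if a trigger connector is present without any licensing connector."""
    for trigger, licensers in CONTAINS_ONE.items():
        if any(matches(link, trigger) for link in links):
            if not any(matches(link, lic) for link in links for lic in licensers):
                return False
    return True


# Hand-written, simplified link sets for some of the examples above.
examples = {
    "How big is it": ["Wq", "EAh", "AF", "SI"],
    "How big a dog did you chase": ["Wq", "EAh", "HA", "AA", "Ds", "Bsm", "SI", "I"],
    "*How big dogs chased you": ["Wq", "EAh", "A", "S", "O"],
    "How much more efficient is their program": ["Wq", "EEh", "ECa", "EA", "AF", "SI", "D"],
    "*How much more efficient programmers work for them": ["Wq", "EEh", "ECa", "EA", "A", "S", "MVp", "J"],
    "How much more money do they have": ["Wq", "H", "ECn", "Dmu", "Bsm", "SI", "I"],
}

for sentence, links in examples.items():
    print(sentence, "->", "allowed" if group_is_allowed(links) else "rejected")
```

Because "how much more money" goes through H rather than EEh, no trigger is present in that group and the check never fires, which is the effect the connector logic above is designed to produce.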
(Recall that EEh, as well as being used in adjective-focused comparative questions, is also used in adverbial questions: "How quickly did he do it?" Thus Ca and Qe - which occur in adverbial questions - are added to the list of "contains_one" connectors that EEh requires in its group. Notice also that "How much more" may be used in yet another way as an adverbial phrase: "How much more did he do it?" Thus "more", like other adverbs, must carry "Qe+ and Ca+" conjoined with its ECa-. Further, "more" may itself modify an adverb: "How much more quickly did he do it?" Thus "more" must also carry EE+, like other adverb-modifying adverbs.) EAx connects certain adverbs to the word "many", when it is _not_ being used in questions. The adverbs that can modify adjectives ("very", "relatively") can also modify "many". EA is also used by "how" to connect to adjectives; "how" has "EAh+" for this purpose. However, "EAh" has a special use in post-processing: it signifies that an adjective-focused question is present (see "EAh"). To prevent such connections from forming in the phrase "how many", we give "many" an EAx- connector and create an H connector for linking "how" and "many". EF is used to connect the word "enough" to adjectives and adverbs. +-Pa+ | +-EF-+ | | | He is good enough +-----A------+ +--EF-+ | | | | He is a good enough player In this case, the word "enough" expresses the degree of the adjective or adverb, similar to preceding adjectival adverbs like "very", which use EA+. Moreover, "enough" cannot be combined with such a preceding adverb: "*He is very good enough". Therefore EF+ on adjectives and adverbs must be disjoined with EA-: good: {EA- or EF+} & (A+ or Pa- or ...) quickly: {EE- or EF+} & (CO+ or MVa- or ...) EI connects a few adverbs to "after" and "before", such as "soon", "immediately", and "shortly". +-EI-+ | | I left soon after I saw you EN connects certain adverbs to expressions of quantity: "We have nearly 100 students", "I have about 50 dollars".
Number words therefore carry an optional EN-. +----EN----+ | | It will cost almost 400 million dollars They died almost 400 million years ago Note that the EN always connects to the head word of the number expression: i.e., the word that connects it to the rest of the sentence. ER is used in the construction "the X-er..., the Y-er...": +------ER----------+ +-DG-+ +-DG-+ | | | | The better it is, the more people will use it The more people use it, the better it is Such constructions always use comparative adjectives or adverbs. They consist essentially of two similar phrases, attached together by an ER connector. (Any phrase that can occur in the first half can occur in the second, as the above examples suggest.) The comparative adjective is always the "head" of each phrase. Each comparative adjective must have the capacity to serve as either the first or second half of the expression. The complement connectors (i.e., those that connect within the phrase) are the same, whether the adjective is serving as a "first-half" or a "second-half". Such adjectives therefore have: bigger: ( complement ) & DG- & ((Wd- & Xc+ & ER+) or ER-) ---------------- ^ for first-half for second-half phrase phrase The DG- connects to the definite article "the"; the Xc+ connects to a comma; the Wd- to the wall; and the ER from one phrase to the other. The connectors used within the phrase are mostly those discussed elsewhere: The bigger it is AF+ The more you run Cs+ The more you earn B+ The more money you earn Dm*w+ An exception is the following construction: +-TR--+--U---+ | | | The better the computer , the faster the program For such constructions, the word "the" has "TR- & U+". Nouns have U-; this special connector is disjoined with all other optional and mandatory connectors on nouns (except AN- and A-); see "U". FM connects the preposition "from" to various other prepositions.
+--MVp-+--FM--+----J----+ | | | | John screamed from inside the house under the bed behind the car *John screamed from with the dog "From" is unusual in that it can take many prepositional phrases as objects (rather than noun phrases). "From" therefore has "(FM+ or J+) & (MVp- or Mp- or CO+...)"; it can serve this function whether it is acting as a verb modifier, noun modifier, opener, etc. Prepositions that can serve as objects of "from" have "J+ & (MVp- or Mp- or CO+ ... or FM-)". G connects proper nouns together in series. +--G---+---G--+--G--+ | | | | George Herbert Walker Bush is here Any number of proper nouns may thus be strung together to make a proper noun phrase. The last noun in the sequence then serves as the head of the phrase. G is only used for the linking of proper nouns. When proper nouns (or common nouns) modify common nouns - "The Dole proposal" - AN is used. Any word, when capitalized, may be used as a proper noun: "I saw The today". "I had lunch with And Of From Smith." The exception is at the beginning of sentences. A word which is listed in the dictionary in an uncapitalized form, and which is used at the beginning of a sentence, will be treated only as an uncapitalized form (although it might in theory be intended as a proper noun). We had to do this; otherwise EVERY word beginning a sentence would be a potential proper noun - which would be ridiculous. ("*The died" would be accepted.) However: certain words are common uncapitalized words, but are also used as names, and thus should be recognized as names at the beginning of sentences: "Bill", "Pat", "Sue", etc. A special category was created for these. Numbers are rarely used as proper noun phrases, although they occasionally are ("301 is a great class", "I live in 509"). We do not allow this, since it would create a huge number of false positives. However, numbers are sometimes used as part of multi-word proper noun expressions: "Fahrenheit 451", "Die Hard 3", etc.
Therefore we give numbers "G- & (G+ or S+ or O-...)" as a stage 2 usage. GN is a stage 2 connector used in expressions where proper nouns are introduced by a common noun, with or without a determiner: +----D-----+----GN-----+ | +-A--+ +--G--+----S----+ | | | | | | The famous actor Eddie Murphy attended the event Actor Eddie Murphy attended the event The proper noun (or the last in a series of proper noun words - see G) is the head of the expression, making an S, O, or J connection to the rest of the sentence. On common nouns, GN+ is fully disjoined with the main S/O/J complex and all the connectors for post-noun modifiers; but it is conjoined with the @A- connector (for adjectives), as shown by the first example above, as well as the @AN-, for pre-noun noun modifiers ("The adventure movie actor Eddie Murphy"). As for determiners, nouns in this situation can take a definite or possessive determiner ("My friend John") but not an indefinite one ("*An actor Joe Smith"); therefore we use the DD connector. This yields: dog: {@AN-} & {@A-} & ((D- & {@M+}... & (S+ or O- or J-...)) or (DD- & GN+)) H connects "how" to "much" or "many". "Much" and "many" can serve as determiners ("How many books do you have"); they can also serve as independent noun-phrases ("How many do you have"). Unlike most determiners, they can also be preceded by the word "how" in either case, creating a question or indirect question: +--H-+-Dmc-+ | | | How many books do you have Therefore they have H- connectors, optionally conjoined with their D+ connectors and also with their main noun complexes. However: when the H- is being used on "many" and "much", a question situation is created (direct or indirect). This introduces constraints on word-order, mainly relating to subject-verb inversion. For this reason, H+ on "how" is conjoined with QI- (used in indirect questions) or Wq- or Ws- (used in direct questions). These connectors trigger post-processing rules; these are explained in "SI". 
"H" is also in a post-processing category along with "D##w", relating also to s-v inversion; see "SI: Questions without s-v inversion". I connects certain verbs with infinitives. +---I----+ +-S-+--I-+ +S-+-O--+ | | | | | | | | I must go to the store I made him go Modals have I+ connectors. Certain other verbs also have I+ connectors, sometimes conjoined with O+ connectors, like "make" and "see". The word "to" also has an "I+", conjoined with "TO-", used in infinitives. Infinitive verb forms have "I-", conjoined with their complement connectors. In every case except "be", the infinitive form is the same as the plural form; therefore the same expression can be used. "Ii" connectors are used by pp to enforce the correct use of "filler-it" and "there". See "SF: Filler-it". "Ia" is used in object-type infinitival indirect questions. Here, a "to"+infinitive construction occurs, but in this case - unlike other "to"+infinitive constructions - the "to" is unable to connect back to another word. +--B---+ | +Ia+ | | | I wonder what to buy In infinitival questions with "where/when/how" - "I wonder where to go" - the question word instead makes a TOn connection with "to". See "TOn". IN is used to connect the preposition "in" to certain idiomatic time expressions. +----IN----+ | +--TA--+ | | | We did it in early December See "DT" for more discussion of time expressions. J connects prepositions to their objects. +-Mp+----J---+ +-----MVp----+-J-+ | | | | | | The man with the hat chased the dog on Tuesday Proper and common nouns, accusative pronouns, and determiners that can act as noun- phrases have "J-" disjoined with their S+ and O- connectors. Prepositions have "J+ & (Mp- or MVp-)". "Mp" is used for prepositions modifying nouns; "MVp" is used for prepositions modifying verbs and adjectives. Prepositions may also have other connectors, disjoined with J+, such as Mg+, Mv+, and QI+; see "MV: Other Uses of MVp and MVs". 
Jw is used to connect prepositions to noun-phrase question-words in the construction "To whom were you speaking?" The construction formed here is very similar to that formed for determiner-question-words, as in "To which person were you speaking?" See "JQ" for an explanation. Jw is also used in prepositional relative clauses, like "The room IN WHICH I was working was cold". See "Mj". Jr is used only with "noun-modifying prepositional-object relative clauses". See "B*j". JG connects certain prepositions ("of" and "for") to proper-noun objects. +-MG--+--JG-+ | | | The National Academy of Science is meeting Proper nouns have an optional "MG+"; certain prepositions like "of" and "for" have "MG- & JG+". Since MG attaches a modifying preposition to a noun, it is analogous to M; and JG, which attaches a preposition to its object, is analogous to J. But note that MG- on prepositions is conjoined only with JG+, not with J+. Thus proper nouns can only be modified with prepositional phrases using other proper nouns: "The National Academy of Science is meeting", "*The National Academy of science is meeting". (Similarly, proper nouns taking JG can only modify other proper nouns; but since they also have J- connectors, they can use these to modify ordinary nouns: "My book on Germany is excellent".) JQ is used to connect prepositions to question-words in constructions like the following: +-------Qd----+ +-----J---+ | +-Wj-+-JQ+--D--+ +-SI-+ | | | | | | || In which room were you working This requires some explanation. Consider the following simplified expressions: in: {JQ+} & J+ & (MVp- or Mp- or (Qd+ & Wj-)...) which: (R- & (RS+ or C+)) or ((QI- or W-) & (B+ or S+)) or (JQ- & D+); Notice, first, that the preposition uses the same J+ to connect to "room" that it uses in an ordinary prepositional phrase (for example, with a MVp- or Mp-). In ordinary prepositional phrases, however, no JQ link is made; thus the JQ+ on "in" must be optional.
Notice also that the complex used on "which" is disjoined from the connectors it uses in other kinds of questions (direct and indirect) and relative clauses. But given this expression for "in", what prevents use of JQ in ordinary prepositional phrases: "I sat in which room"? This is prevented by post-processing. The Wj linking the preposition to the wall starts a domain, in which it is included; we then require that JQ connectors must have a Wj connector in the same group. We also have to ensure that in constructions like the above, a question-determiner _is_ used: "*In that room were you working". To do this, we also insist in post-processing that a group with a Wj must contain a JQ. Notice, also, that the preposition here makes a Qd connection to the auxiliary. This connector, which starts a domain, is one of the ones that permits (and in fact requires) subject-verb inversion in post-processing (see SI). But this serves another function too; it isolates the JQ and J connectors in the group. This ensures that the post-processing rules relating to Wj and JQ will not be satisfied by a JQ occurring later in the sentence ("*In the room were you working on which desk?"). (We should note that groups started by Wj, unlike others, normally do not contain whole clauses.) Thus the following domain structure is generated: +---Q(j(m))----+ +-----J(j)-+ | +Wj(j)+JQ(j)+D(j)+ +SI(j(m))+ | | | | | | || In which room were you working Such a construction might also be formed in indirect questions: "I wonder in which room he was sleeping". This usage would require extra paraphernalia, however, and seems almost non-existent, so we exclude it. A similar linkage is formed in questions where the question-word acts not as a determiner but as a noun-phrase: +---Qd----+ +-Wj-+-Jw+ +-SI-+ | | | | | || To whom were you speaking Question-words that can be used in this way - "which" and "whom" - have Jw- connectors disjoined with everything else.
In this case, then, the usual "J+" connector on the preposition is used, but it is given a 'w' subscript; no "JQ" connection is made. "Jw" is then added to "JQ" in the list of connectors which may satisfy the demand of "Wj"; and, in turn, "Jw" may only occur if a "Wj" is present. JT connects certain conjunctions to time-expressions that are not well-formed noun phrases, like "last week". and "this week". +-----CO-------+ +----JT----+ | | +--DT-+ | | | | | Until last week, I thought she liked me With time expressions that are well-formed noun phrases like "this week", no JT is necessary; the phrase can serve as a prepositional object using its ordinary J connector. See "MVp: Time Expressions". K connects certain verbs with particles like "in", "out", "up", and the like. +-K-+ | | The man came up *The man arrived up Particles that can be used in this way have "K-" disjoined with everything else. (Most are also prepositions; a few, like "away", are not.) We distinguish between verbs that can take particles and those that do not, but among particle-taking verbs, we do not distinguish between specific verb-particle pairs: we allow "We sorted them out/*up", "We put them in/*over". Verbs that take particles may be transitive ("pick"), intransitive ("come"), or trans/intrans ("move"). With transitive verbs, the particle may either precede or follow the direct object: "We picked the dog up"; "We picked up the dog". This yields the following expression. pick: (S- or ....) & ((K+ & O+) or ((O+ or B-) & {K+})); ^ particle ^ particle precedes object follows object Note that the particle is always optional. However: if we made the particle optional in both the "pre-object" case and the "post-object case", then "We picked the dog" would receive two parses. So it must be obligatory in one case, optional in the other. 
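The point about double parses can be checked mechanically by expanding each candidate expression into its disjuncts (the alternative connector sequences a word can use). The sketch below is a toy, not the parser's algorithm: expressions are written as nested Python tuples, the helper names are invented for illustration, the "(S- or ....)" part is reduced to a bare S-, and left- and right-pointing connectors are kept in one ordered sequence for simplicity.

```python
from collections import Counter
from itertools import product

# Toy expression language (an assumption made for illustration):
#   "K+"                  a single connector
#   ("and", e1, e2, ...)  conjunction, order preserved
#   ("or",  e1, e2, ...)  disjunction
#   ("opt", e)            optional, i.e. the {...} notation


def disjuncts(expr):
    """Return every connector sequence the expression expands to, repeats included."""
    if isinstance(expr, str):
        return [(expr,)]
    op, *args = expr
    if op == "opt":
        return [()] + disjuncts(args[0])
    if op == "or":
        return [d for a in args for d in disjuncts(a)]
    if op == "and":
        parts = [disjuncts(a) for a in args]
        return [sum(combo, ()) for combo in product(*parts)]
    raise ValueError(f"unknown operator: {op}")


def repeated(expr):
    """Connector sequences that can be reached by more than one route."""
    return [d for d, n in Counter(disjuncts(expr)).items() if n > 1]


# The expression as given: particle obligatory before the object, optional after.
GIVEN = ("and", "S-",
         ("or", ("and", "K+", "O+"),
                ("and", ("or", "O+", "B-"), ("opt", "K+"))))

# The rejected alternative: particle optional in both positions.
BOTH_OPTIONAL = ("and", "S-",
                 ("or", ("and", ("opt", "K+"), "O+"),
                        ("and", ("or", "O+", "B-"), ("opt", "K+"))))

print(repeated(GIVEN))          # no sequence is produced twice
print(repeated(BOTH_OPTIONAL))  # the particle-less "S- O+" sequence is produced twice
```

In the second expression the particle-less sequence arises once from each branch of the "or", which corresponds to the two parses of "We picked the dog" that the text warns about.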
A further complication: with transitive verbs, the particle may always follow the object, but it may precede it only if the object is a full noun-phrase, not a pronoun: "We picked it up," "*We picked up it". We enforce this using post-processing. In the expression for transitive verbs, the O+ in the pre-object case is subscripted "O*n+". Pronouns are then subscripted "Ox-". "Oxn" connectors are then prohibited in post-processing. L connects certain determiners to superlative adjectives. +------D----+ +--L--+ | | | | He has the biggest room In most cases, when a determiner-adjective-noun phrase occurs, both the determiner and the adjective attach to the noun. Superlative adjectives are different, however. Superlatives must be used with determiners, and only certain determiners: This is their biggest room *This is a biggest room *They have biggest rooms To enforce this, it seems easiest to simply make the superlative connect to the determiner. Determiners that can perform this function carry "{L+} & D+". Superlative adjectives carry "L-". As well as superlatives, other adjectives are in this category, such as "own", "next", and "same". Number words like "third" and "fifteenth" are also included: "This is the fifteenth book I've read", "*This is a fifteenth book I've read", "*This is fifteenth book I've read". Numbers can also be used here, on the left end of an L link: "The five biggest cities are in China". As in other cases with determiners followed by numbers ("The five cities we saw were..."), a DD connector is used: numbers thus carry "{{L+} & DD-} & (Dmc+ or S+...)". +------Dmc----+ +-DD-+--L--+ +-S--+ | | | | | The five biggest cities are Note that the L+ on numbers is optionally conjoined with the DD-; it may not be used unless the DD- is used. This prevents "*Five biggest cities are in China". On determiners that take L+, it is optionally conjoined not only with D+ but with DD+. This allows "The biggest five cities..."
LE is a special connector used in comparative constructions to connect an adjective to the second half of the comparative expression beyond a complement phrase: +------------LE---------+ | | It is more likely that Joe will go than that Fred will go See "MV: Comparatives". M connects nouns to various kinds of post-nominal modifiers without commas, such as prepositional phrases, participle modifiers, prepositional relatives, and possessive relatives. (Phrases of these kinds with commas use MX; see "MX".) Ordinary relative clauses do not use M+, but rather "R+ & B+" (see "B: relative clauses"). Nouns therefore have optional "@M+" connectors, conjoined with their main "S+ or O+ or J+..." expression and with their "{@A-} & D-". Certain determiners that can act as complete noun-phrases also carry @M+ ("everyone", "many"), as do numbers. Pronouns do not. This yields The man from London is here Some of the programmers are very good Five of the programmers are very good *He of the programmers is very good _"M" modifiers used with relative clauses_ Note that words that take @M+ connectors have the following expression: ...{@M+} & {R+ & B+ & {[@M+]}}... Nouns frequently take a relative clause, or one or more prepositional phrases; they rarely take both. Occasionally they do, however: "The picture that I showed you of John", "The picture of John that I showed you". Therefore we have to allow "@M+" and "R+ & B+" both conjoined and disjoined, in either order; this is what the above expression provides. There is still something rather arbitrary about this, however. We allow any number of M connections (prepositional phrases, participle modifiers, etc.) to be made, or one relative clause followed or preceded by any number of M connections, but not multiple relative clauses. This was necessitated by the fact that relative clauses require two conjoined connectors, and there is no way of allowing indefinitely many "R+ & B+" connections to a word. 
In practice, however, multiple relative clauses are extremely rare.

_Mp: Prepositional phrases modifying nouns_

"Mp" is used for prepositional phrases modifying nouns. Prepositions therefore have Mp- connectors directly disjoined with MVp- (used for prepositional phrases modifying verbs) and conjoined with J+ (used for prepositional objects). (Some prepositions can take other kinds of objects besides noun-phrases, and thus have other connectors conjoined with "MVp- or Mp-": see "MVp".) Almost all words that have Mp- also have MVp-; one exception is "of", which can modify nouns but not verbs ("*The dog ran of the yard"). (A few verbs can be modified by "of": see "OF".)

It was mentioned that many determiners and numbers also have M+ connectors, as well as nouns. Thus the sentences below are all linked in essentially the same way:

               +-Mp-+-----J-----+
               |    |           |
    1. The salaries of the programmers are excellent
    2. Some of the programmers are excellent
    3. Five of the programmers are excellent

This may seem counterintuitive. In ex. 1, the subject of "are" is "salaries"; in ex. 2 and ex. 3, it is really "programmers". We see no need to make this distinction, however. As well as the special use of "some" and "five" shown here, these words can act as independent noun-phrases taking no modifying phrase or quite different modifying phrases (ex. 4 and 5); and even constructions like ex. 2 and 3 above sometimes arise where it is quite clear that "some", not the prepositional object, is the real subject (e.g., ex. 7).

    4. Some are excellent
    5. Some with doctorates are excellent
    6. Some holding doctorates are excellent
    7. (Most of the pictures of the executives are terrible, but) some of the programmers are excellent

_Ma: adjectival modifiers_

Ma connects nouns with post-nominal adjectival modifiers.
                +--Ma--+
                |      |
    These are people unhappy about the economy
    This is a trial certain to attract attention

This is only correct when the adjective has some kind of complement or modifier attached to it: "*These are people unhappy", "*This is a trial certain". We enforce this in post-processing. Ma connectors start a domain; post-processing then requires that a group containing an Ma contain one of a list of connector-types, either complement connectors like "TO" or "TH", or the prepositional phrase connector MVp.

_Mv and Mg: Participle Modifiers_

Mv connects nouns with passive participles:

         +--Mv-+
         |     |
    The dog chased by the man died

Mg connects nouns with present participles:

         +--Mg-+
         |     |
    The dog chasing the man died

These are sometimes known as "participle modifiers". Every passive verb form has an Mv-, directly disjoined with its Pv- (used in normal passive constructions). Every present participle has an Mg-, directly disjoined with Pg- (used in normal present participle constructions). Thus in complex verbs, whatever complement is normally required or allowed by the participle (O+, TH+, TO+, etc.) will be required or allowed here. (Recall that passive verb-forms have different complement expressions from active forms.)

    The dog chased by the man was black
    *The dog chased the man was black
    *The dog chasing was black
    The dog chasing the man was black
    People expecting to see the President will be disappointed
    *People expecting will be disappointed

If Mv- is directly disjoined with Pv- in every case, and Mg- with Pg-, why have two different connector types? The reason relates to post-processing. Post-processing divides a sentence into groups of links, in which each group corresponds, roughly, to a clause: the set of links involved in a subject-verb expression (as well as any direct object, indirect object, or prepositional phrases). If the clause contains an embedded clause, this forms its own group.
In some cases where participles are used, they belong to the same subject-verb expression as the previous links (ex. 1 below). In cases of participle modifiers, however, the participle indicates the beginning of a new subject-verb expression (ex. 2).

                            +----J----+
     +--D-+-S--+--Pv--+-MV--+   +--D--+
     |    |    |      |     |   |     |
    The study was mentioned in the journal

                                  +--J(e)---+
                   +-Mv(e)-+MVp(e)+   +D(e)-+
                   |       |      |   |     |
    I've read the study mentioned in the journal

Thus Mg and Mv connectors must start new domains; Pv and Pg connectors must not. This distinction is particularly important in cases where we use post-processing to enforce subject-verb constraints, such as "filler" uses of "it" and "there". Post-processing must know that in cases of participle modifiers, the verb in the participle modifier does not apply to the subject of the main clause; thus it knows that "There seems to have been a study mentioned in the journal" is correct, while "There seems to have mentioned a study in the journal" is incorrect. See "SF: Filler-it".

_Mv+, Mg+ used on conjunctions_

Some conjunctions can also take present and past participles instead of noun-phrases and clauses: "We saw a movie ABOUT CATCHING dogs", "We entered BY CLIMBING in the window", "He will respond WHEN QUESTIONED". Mv and Mg are used here to connect the conjunction to the participle.

Again, the question arises here: why use Mg+ and Mv+ rather than Pv+ and Pg+? This is a difficult question, and again, relates only to post-processing. Recall that Mg and Mv begin new domains; they imply that a new subject is in force. In many cases this seems to hold true with participles modifying prepositions and conjunctions. This is particularly true when the preposition modifies a noun (ex. 1), but also sometimes when it modifies a verb (ex. 2).

                        +-Mp--+--Mg--+
                        |     |      |
    1. We saw a great movie about catching dogs

               +---MVp----+-Mg--+
               |          |     |
    2. She criticized us for chasing the dog
    3. ?There was a problem while catching the dog

In other cases, however, it seems that the previous subject is still implied:

    4. We talked about catching dogs
    5. We angered her by chasing the dog
    6. The man died while chasing the dog

At the moment, this has little effect on well-formedness. The only cases where it matters are those where post-processing checks are used to control the uses of "there". If we used Pg connectors rather than Mg, then sentences like ex. 3 above would be prohibited; perhaps they should be. In any case, for now, Mg and Mv are used in all such cases.

_Mr: Possessive Relatives_

Mr is used for relative clauses involving "whose":

         +-Mr-+Ds*w-+-S--+
         |    |     |    |
    the dog whose owner died was black

                    +---Bsm---+
         +-Mr-+Ds*w-+     +-S-+
         |    |     |     |   |
    the dog whose owner John hit was black

The two constructions shown here are exactly analogous to the uses of "whose" (and other question-determiners like "which") in indirect questions. A combination of link logic and post-processing rules enforces very tight constraints on the use of Ds#w and Bsm (and hence of Mr as well); see "B#n, B#m".

Like Mg and Mv connectors and relative clauses (R+ & B+), Mr connectors start a new domain. See Mg and Mv for an explanation.

_Mj: Prepositional-object relative clauses_

Mj is used for relative clauses in which the main noun is the object of a preposition:

                +--Cs--+
           +-Mj-+Jw-+  |
           |    |   |  |
    1. The man to whom I was speaking was tall

                +-----Cs-----+
                +---J----+   |
           +-Mj-+JQ-+-D--+   |
           |    |   |    |   |
    2. The man to whose wife I was speaking was tall

The "JQ" and "Jw" connectors used here are also used in prepositional questions ("To whom were you speaking"). See "JQ" and "Jw". Prepositions have "Mj- & Cs+" conjoined with J+ (which forms the "Jw" link with "whom"), and disjoined with "MVp- or Mp- or (Wj- & Qd+)..." (used in prepositional questions). This raises possible problems.
In relative constructions like the above, the prepositional object used must be a relative pronoun like "whom" ("*The man to the man I was speaking was tall", "*The man to John's wife I was speaking was tall"). Furthermore, we must prevent "whom" and "whose" being used in any old way: "*I was speaking to whom", "*I was speaking to whose wife". These problems have been dealt with already in connection with prepositional questions: e.g., "To whom were you speaking?". There, we have Wj (the connector connecting the preposition to the wall) start a domain whose group contains only the initial prepositional phrase of the sentence; we then require that every group containing a Wj contain a Jw or a JQ, and that Jw and JQ may occur only in groups containing a Wj. Here, then, all we have to do is add "Mj" to the list of connectors that require JQ or Jw and satisfy the requirement of JQ and Jw. The same constraints that apply in prepositional questions then automatically apply here. Note that this system very naturally permits more nested constructions like "The dog to the wife of whose owner you were speaking is here".

MG allows certain prepositions to modify proper nouns. We do not allow most prepositions to modify proper nouns ("*John with the big nose is here") but we do allow it with "of" and "for": "The Society for Creative Anachronisms is meeting", "The Ongle of Bongle Dongle died today". MG must be used with JG, forcing the object of the preposition also to be capitalized ("The Society for Creative Anachronisms is meeting", "*The Society for creative anachronisms is meeting").

MV connects verbs (and adjectives) to modifying phrases like adverbs, prepositional phrases, time expressions, certain conjunctions, "than"-phrases, and other things.

             +-------------MV-------------+
             +-----MV------+              |
             +-MV--+       |              |
             |     |       |              |
    The dog ran quickly through the park with a bone

Any number of prepositional phrases or adverbs may attach to a verb.
_@MV+ on Verbs_

The way @MV+ connectors are combined with complement connectors on verbs is quite complex. Simple intransitive, transitive and optionally-transitive verbs simply have an optional "@MV+"; with transitive and optionally-transitive verbs, this is conjoined with the O+ connector:

    destroy: (Sp- or I-) & (O+ or B-) & {@MV+};

(Normally, prepositional phrases and the like follow the object in this situation, but sometimes a prepositional phrase is inserted before the object: "We destroyed the garage with axes on Tuesday", "?We destroyed with axes on Tuesday the garage". See below for explanation.)

Some verbs connect to other verbs - modals (which take I+), auxiliaries (which take I+, PP+, or Pg+) or verbs that take infinitival complements (TO+). Such verbs may NEVER make an @MV+ connection beyond the verb they attach to. Thus in the sentence below, "in London" can connect only to "run", not to "would", "prefer", "be" or "appointed".

                                           +------MV------+
           +-PP--+-TO--+I-+--Pv--+-TO--+-I-+              |
           |     |     |  |      |     |   |              |
    1. He had expected to be appointed to run the project in London

Consider the following expressions:

    expect: ({@MV+} & TO+) or ((O+ or B-) & {@MV+});
    appointed: ...{@MV+} & {TO+};

"Expect" has an @MV+ conjoined with TO+, but it must connect _before_ the complement; this is explained below. Note that, since "expect" is also an ordinary transitive verb ("I expected the appointment"), it has an @MV+ conjoined with its O+ connector (which, again, connects _beyond_ the direct object), but this is disjoined with the TO+. With "appointed", the @MV+ conjoined with the TO+ is, again, before the TO+. Notice that the TO+ is optional here: thus "appointed" may be used with no complement at all, in which case it may take an MV connection ("He was appointed in London").

The same applies to complement connectors like TH, TS, and C (attaching to clauses, regular or subjunctive) and QI (attaching to indirect questions).
Again, such verbs cannot make MV connections beyond their complements. Thus the sentence below receives only the parse shown.

                                                  +-TOo--+
       +-S-+-TH-+-C-+--S--+--QI---+-C--+--S--+-I--+-O-+  +I-+MV+
       |   |    |   |     |       |    |     |    |   |  |  |  |
    2. He said that he wondered where John would ask him to go on Monday

Notice that phrasal complements of this kind, since they necessarily contain verbs, will themselves have available @MV+ connectors at the end. The point is that on any single verb expression, or in any clause involving verb-attached embedded clauses (not subordinate clauses and relative clauses), adverbial and prepositional modifiers placed at the end can only attach to the final verb in the expression. However, linkage logic ensures that every verb expression or subordinate clause will eventually end with a verb that does _not_ connect to a further verb or clause, and thus will have an available "@MV+". By doing it this way, we greatly reduce the number of linkages that would be found on long sentences; and in most cases the modifier applies to the final verb anyway. (Although not always. In ex. 2 above, for example, "on Monday" seems to apply to "go", but might also apply to "ask" or "wonder".)

_Other @MV+ connectors on Verbs_

In most cases, with complex verb expressions or verb-linked subordinate-clause expressions, prepositional phrases and the like occur at the end. However, they may also in some cases occur in the middle. Verbs that take complements such as TO+, Pg+ or I+ occasionally take prepositional phrases _before_ the complement. And verbs that take clausal complements (TH+, C+ or QI+) sometimes take prepositional phrases before their complements also. Some of these uses are rather questionable (TH+ and TO+ are best; Pg+, I+, QI+, and C+ are more doubtful), but for now we allow them all.
           +---------TO---------+
           +--MV--+             |
           |      |             |
    He attempted for many years to be a concert pianist
    ?We discussed at that time hiring a new secretary
    He announced on Monday that he was hiring Smith
    I asked him on Tuesday who had been hired

As mentioned, transitive verbs have "@MV+" connectors following their "O+" connectors. However, one occasionally sees a prepositional phrase inserted before the object:

    We destroyed with axes the garage that our grandfather had built

We therefore give transitive verbs an optional "@MV+" before the "O+", in addition to the one following. Since this construction is rare, we give it a cost of 2, making it a Stage 2 construction. Moreover, in such cases the object phrase is never a pronoun; it is always a full noun phrase ("*We destroyed with axes it"). We already have a special O+ subscript, O*n+, that can be used only with noun phrases, not with pronouns (see "K" for an explanation). Thus we use that here as well. This yields the following:

    destroy: ([[@MV+ & O*n+]] or O+ or B-) & {@MV+};

A final case to be considered is verbs that take an object plus some other complement. Some verbs take "O+ & TOo+" (for object+infinitive constructions), "O+ & TH+" (for object+clause constructions), and the like. In such cases, one _never_ sees a prepositional phrase inserted before the object. One occasionally sees it inserted after the object and before the TH+ or TOo+.
    *I told on Tuesday John to do it
    I told John on Tuesday to do it

Such verbs therefore have

    (O+ or B-) & {@MV+} & {TOo+ or TH+...};

The same applies to verbs that take two objects; a prepositional phrase may be inserted between the two objects, but never before the first object:

    I gave him on Tuesday an expensive present
    *I gave on Tuesday him an expensive present

In cases where the verb may take two objects, we must also prevent the second object from being a pronoun (see "O"); and in any case where the preposition precedes an object, we must prevent that object from being a pronoun (see above). In such cases we use O*n+, which may not connect with pronouns. This yields:

    gave: ... & ((O+ & O*n+) or (O+) or (O+ & [[@MV+]] & O*n+) or ([[@MV+]] & O*n+))

This can be somewhat condensed into the following:

    gave: ... & ((O+ & {{[[@MV+]]} & O*n+}) or [[@MV+ & O*n+]]);

This then yields the following judgments:

    I gave my brother an expensive present
    I gave him an expensive present
    I gave an expensive present
    I gave it
    *I gave my brother it
    I gave him for his birthday an expensive present (stage 2)
    I gave my brother for his birthday an expensive present (stage 2)
    *I gave him for his birthday it
    I gave for his birthday an expensive present (stage 2)
    *I gave for his birthday it
    *I gave for his birthday him an expensive present
    *I gave for his birthday my brother an expensive present
    *I gave for his birthday my brother it

_@MV+ on adjectives_

MV is also used to attach modifying phrases to adjectives. This is only allowed when the adjective is being used predicatively: "He is HAPPY ABOUT his job", "*He is a happy about his job man". As with verbs, "@MV+" is optionally conjoined with complement connectors like "TO+" and "TH+" on adjectives, and is to the left of the complement connectors. Thus when adjectives participate in long phrases like "He wants to be certain that John is eager to go", the "@MV+" on "certain" and "eager" cannot be used.
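The claim above that the four-way expression for "gave" condenses to the shorter one can be checked mechanically by expanding both into their sets of disjuncts. The following is a minimal sketch, not the parser's own code: the helper names (`conn`, `both`, `either`, `opt`, `cost`) are hypothetical, and treating cost as additive over nested brackets is an assumption made for illustration.

```python
from itertools import product

# A disjunct is (tuple_of_connectors, cost); an expression is a set of disjuncts.
# All helper names here are hypothetical, chosen only for this sketch.

def conn(name):
    """A single connector such as 'O+'."""
    return {((name,), 0)}

def both(*exprs):
    """'&': take one disjunct from each operand, keeping left-to-right order."""
    out = set()
    for combo in product(*exprs):
        names = tuple(n for d, _ in combo for n in d)
        out.add((names, sum(c for _, c in combo)))
    return out

def either(*exprs):
    """'or': any disjunct of any operand."""
    return set().union(*exprs)

def opt(expr):
    """'{...}': the contents may be omitted."""
    return expr | {((), 0)}

def cost(expr, n=1):
    """'[...]' nested n deep: add n to the cost (additive cost is an assumption)."""
    return {(d, c + n) for d, c in expr}

O, On, MV = conn("O+"), conn("O*n+"), conn("@MV+")

# (O+ & O*n+) or (O+) or (O+ & [[@MV+]] & O*n+) or ([[@MV+]] & O*n+)
long_form = either(
    both(O, On),
    O,
    both(O, cost(MV, 2), On),
    both(cost(MV, 2), On),
)

# (O+ & {{[[@MV+]]} & O*n+}) or [[@MV+ & O*n+]]
short_form = either(
    both(O, opt(both(opt(cost(MV, 2)), On))),
    cost(both(MV, On), 2),
)

assert long_form == short_form

for names, c in sorted(long_form, key=lambda d: (d[1], len(d[0]))):
    print(" & ".join(names), "(cost %d)" % c)
```

Both forms expand to the same four disjuncts, and the two that place "@MV+" before "O*n+" carry cost 2, which corresponds to the "(stage 2)" judgments listed above.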
_Words taking MV-: adverbs and prepositions_

If a word or expression has an "MV-", that means it can modify a verb that precedes it. A number of different subscripts are used to identify different kinds of modifiers. For the most part, these subscript distinctions are just provided for the sake of clarity. They are rarely used to constrain linkages, although a few subscripts have special properties either in post-processing or at the linkage level.

MVa connects adverbs to verbs; adverbs thus have "MVa-" connectors. Many adverbs can take optional commas; they thus carry "{Xc+ & Xd-} & MVa-" (see "Xc"). Some "clausal" adverbs can _only_ be used in this position with commas:

    *He is angry apparently.
    He is angry, apparently.

These adverbs thus carry "Xc+ & Xd- & MVa-".

MVp connects prepositions to verbs; prepositions have "(MVp- or Mp- or Pp-) & J+". (The J+ connects to the prepositional object; the Mp- is used when the preposition modifies a noun; the Pp- is used when the prepositional phrase is a complement of "be".)

For the most part, prepositions and adverbs form clearly distinct natural kinds. Prepositions take an object, can modify nouns, and can be complements of "be"; adverbs differ in all these respects. But there are many words and phrases which are like adverbs in some ways (e.g., they take no object), but like prepositions in others. "Alone", "everywhere" and "downtown" can be complements of "be" and can modify nouns; "backwards" and "somewhere" can be "be"-complements but cannot modify nouns. In any case, whether something is labeled MVp or MVa has few practical consequences.

It will be noted that we do not distinguish between the many kinds of prepositional phrases: time expressions ("on Monday"), place expressions ("in the plane"), manner expressions ("with skill and discretion"), special verb- or adjective-complement expressions ("She prepared for the meeting", "She is angry about the decision").
There seems to be no reason to make these distinctions for our purposes. One problem is that there are certain verbs which seem to require a certain prepositional complement: "He hoped for an agreement", "*He hoped on Tuesday", "*He hoped". We could solve this problem by creating special connectors for such prepositions; at present, we simply give such verbs optional "@MV+" connectors, thus allowing questionable sentences like the ones above.

MVs- is used for certain conjunctions (i.e. words that can take clauses), such as "while", "because", and "now_that". Note that some conjunctions, such as "after", can take either objects or clauses as complements; we usually treat these as prepositions, giving them "MVp". Note also that coordinating conjunctions like "although" or "but", which seem to connect two equal phrases, take "CC-", not "MVs-"; thus they connect to the subject of the preceding clause, not the main verb. This is because such conjunctions may not be used in relative clauses (unlike other conjunctions and prepositions, which may be so used):

    The woman you said you liked on Tuesday is here
    The woman we saw after we left the party is here
    *The woman you said you liked but she was too intelligent is here

See "W" for further explanation.

_Other uses of MVp and MVs_

As well as taking noun-phrases (using J+) and clauses (using C+), some prepositions and conjunctions may take present participles, passive participles or indirect questions:

        +-----MV-----+-Mg-+
        |            |    |
    I yelled at her for going to the party

          +-MV-+-Mv--+
          |    |     |
    She cried when asked about it

         +-MV--+-QI-+
         |     |    |
    I talked about how to use the program

Such words have "(J+ or Mg+ or Mv+ [as appropriate]) & MVp-". Some such uses are extremely unconstrained: almost any sentence can take a phrase like "by _ing" or "after _ing". Others are much more constrained, like "for _ing" ("*I went to the store for getting some milk"), and most indirect question uses ("*I went to the store about how to use the program").
At present we have no way of controlling this.

Note that when participles are used with conjunctions, we use Mv and Mg rather than Pv and Pg. See "M: Mv and Mg used with conjunctions".

_MVi_

MVi is used to connect infinitival phrases to verbs and adjectives when they mean "in order to": "He went to the store to get some bread". Thus "to" carries "(MVi- or TO-...) & I+". (TO is used when infinitival phrases act as verb or adjective complements: "He wanted/expected/was eager to go.") Any verb or adjective may take such a phrase; it has a cost of 1, however.

_MVl_

A few adverbs can modify prepositional phrases, conjunctions and adverbs: "partly", "even", "largely". Notice that these adverbs do not simply modify the previous clause: they require a following phrase.

        +--MVl---+-MVp-+
        |        |     |
    He did it largely in his spare time
    He did it largely voluntarily
    He did it largely because he wanted to do it
    *He did it largely

Such adverbs take "MVl- & (MVp+ or MVa+ or MVs+)". Somewhat counterintuitively, then, the prepositional phrase connects to the rest of the sentence _through_ the adverb; it no longer makes a direct connection. (Notice the subscripts here prevent such adverbs from linking in sequence: "*He did it largely partly in his spare time".)

_MVx_

MVx is used to connect verbs to certain modifying phrases surrounded by commas. These include some kinds of phrases, namely passive and progressive participle phrases, that _must_ be surrounded by commas when they modify verbs (exx. 1-4 below).

             +--MVx---+
             |   +-Xc-+-----Xd------+
             |   |    |             |
    1. John left , carrying a dog \\\\\
    2. John left , followed by Bill \\\\\
    3. *John left carrying the dog
    4. *John left followed by Bill
    5. John left , with the dog \\\\\

             +MVp-+
             |    |
    6. John left with the dog

MVx is also used with prepositional phrases, which need not be surrounded by commas (see exx. 5-6 above). When they are not, however, they use MVp to connect to the verb, not MVx.
Prepositions therefore have

    with: J+ & (MVp- or Mp- or Pp- or (Xc+ & Xd- & (MVx- or MX-)));

To connect to the commas, Xc and Xd are used; see "Xc".

The MVx- on passive and progressive participles is disjoined with their normal left-pointing connectors (Pg- & Mg- for progressives, Pv- & Mv- for passives). However, the MVx- on participles must be to the right of the complement connectors on the expression; the Pg/V/M connectors must be to the left. Therefore the MVx connectors on participles must be fully disjoined with the Pg/V/M connectors. In this respect MVx- is much like MX*p- and COp+; see "CO: Participles as Openers" for more explanation.

_Special Purpose MV- connectors_

A number of MV connector types are used to trigger post-processing constraints. MVt, MVm, MVy, and MVz are used in comparatives. The subscripts are used by post-processing to enforce a variety of constraints on the way comparatives are used. See "MV: Comparatives" below.

MVh is used to attach the word "that" to verbs, in "so...that" and "such...that" constructions: "He was so angry that he left", "She was such a good programmer that they had to keep her". These uses of "so" and "such" are specially subscripted with "k" (EAxk, EExk, Dm#k, Ds#k). Post-processing ensures that MVh is only used when a special "so" or "such" connector is present. In addition, MVh is a stage 2 connector; it is only considered in stage 2, thus preventing a huge number of spurious parses.

The only use of subscripts to actually constrain linkages (with the exception of the post-processing features described above) is as follows. There are certain uses where only a prepositional phrase may be used. Certain prepositions seem to take other prepositions as objects. They cannot take any kind of adverb as objects, though.
                +-MVp-+MVp-+MVp+
                |     |    |   |
    We were drinking over out by the lake
    *We were drinking over
    *We were drinking over happily

Such prepositions therefore carry MVp+ as a possible right-branching connector along with J+, Mg+, etc.

_Comparatives_

This is perhaps a good place to describe the system's handling of comparatives. Comparatives are constructions involving "more...than", or "as...as". Each comparative construction has two sides: a first half (involving "more" or "as") and a second half (involving "than" or "as"). (The two different "as"'s are given different entries in the dictionary, labeled "as.y" and "as.z", and we shall refer to them that way here.) We will begin by describing the "more...than" case; the "as...as" case is quite similar, though somewhat simpler.

A wide variety of constructions are possible on both halves of a comparative expression.

    "More" phrases:

     1. He is more intelligent
     2. He is bigger
     3. He runs more quickly
     4. I have more CDs
     5. He earns more money
     6. He plays with more skill
     7. More people attended the party
     8. He earns more
     9. He did it for more
    10. He plays more for pleasure
    11. He is more a scholar
    12. He plays football more

    "Than" phrases:                    Possible "more" phrases:

    THAN John                          all
         John is                       1,2
         John does                     all except 1,2,11
         is John                       1,2
         does Fred                     all except 1,2,11
         for money                     all
         he was last year              1,2
         last year                     all
         attractive                    1
         Fred earns                    4,5,6,7,8,10
         came to the concert           4,5,6,7,8,10
         I work                        3,12
         you have tapes                3,4,12
         elegantly                     3,9
         he said he earns              4,5,6,7,8,10
         he said he was                1,2
         I had expected                all
         had been expected             all
         expected                      all

It seems that there is tremendous freedom in the way "more" phrases and "than" phrases can be formed. However, there are strong constraints on the way they may be combined. By each "than" phrase above are listed the possible "more" phrases that they may be combined with. These constraints are not easy to enforce for our system through ordinary link logic.
Nor are they easy to enforce with simple post-processing logic; very often the "than" phrase is in an embedded clause (as in several of the examples above), and therefore not in the same group as the "more" phrase. Therefore, our enforcement of these rules relies on a rather elaborate combination of link logic and post-processing. First we will describe the construction of "more" phrases; then we will turn to the more complex construction of "than" phrases.

_The Construction of "More" Phrases_

In the "more" phrases above, the word "more" is usually serving a pretty clear function. In the first case above, "He is more intelligent...", for example, it is serving as an adjectival adverb. This is clear because it may not be combined with another such adverb: "*He is more very intelligent". Thus we give it a specially subscripted "EA" connector:

       +-----Pa-----+
    +S-+   +--EAm---+
    |  |   |        |
    He is more intelligent than...

The other uses of "more" are similar; in each case, it is simply acting like a determiner, noun phrase (object or prepositional object), or adverb, and can simply link to the rest of the sentence as if it were a normal member of these categories. In each case, however, the connector is specially subscripted (solely for post-processing purposes; this will be explained below).

    He is more intelligent              EAm+
    He runs more quickly                EEm+
    I have more CDs                     Dm*m+
    He earns more money                 Dm*m+
    He plays with more skill            Dm*m+
    More people attended the party      Dm*m+
    He earns more                       Om-
    He did it for more                  Jm-
    He does it more for pleasure        MVm-
    He is more a scholar                EB*m-

A few other words can also perform the function of "more", marking a phrase as the left half of a comparative. These include adjectives such as "bigger" and adverbs such as "better" and "further". Again, in other respects these words are just like others in the same category. Therefore we give them ordinary adjective or adverb expressions but with special subscripts.
    He is taller                        Pam
    He is a taller man                  Am
    He plays better                     MVm

Finally, the words "less" and "fewer" are very similar in function to "more", and have the same connectors. "Less" is identical to "more" in functioning as an adverb, mass noun phrase, and mass determiner; "fewer" duplicates the functions of "more" as a plural noun phrase and plural determiner.

_The Construction of Than Phrases_

The connection of "than" to the rest of the sentence is simple; it has an MVt- connector, and thus connects to the main verb of the sentence like a prepositional phrase. The MVt on "than" is restricted by post-processing; it cannot occur unless one of the "more" connectors listed above appears. This prevents "than" phrases from occurring in non-comparative contexts ("*He is intelligent than me"), and also in "as" contexts ("*He is as intelligent than me").

Now, how do we construct "than" phrases, and how do we enforce the constraints between "than" phrases and "more" phrases? It is useful here to distinguish between two types of "than" phrase. Some types are "unconstrained"; others are "constrained". We will discuss the unconstrained ones first.

With some "than" phrases, there seem to be almost no limits on the kind of "more" phrases they can be combined with. One such type is noun phrases.

    He (is more intelligent) (runs more quickly) (earns more) than John

It seems here that such a complement can be used with any kind of "more" phrase. Therefore we give "than" an O*c+ connector. No post-processing is needed here; we have already required that _some_ kind of "more" phrase must be used every time "MVt" is used. So far, then, we have

    than: MVt- & (O*c+...)

Another kind of "unconstrained" use of "than" is with a prepositional phrase. Again, it seems that any use of "more" is possible:

    He (is more intelligent) (runs more quickly) (earns more) than in the past

Thus we give "than" an Mpc+, allowing it to connect to any prepositional phrase.
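That "O*c+" on "than" can attach to an ordinary object connector, and "Mpc+" to the "Mp-" on any preposition, follows from the way subscripts match. The sketch below is an illustration only, with assumptions inferred from examples in this guide (such as "O*n+" and "Ox-" forming "Oxn"): upper-case letters form the link type, "*" matches any subscript character, a shorter subscript is padded with "*", and the link is named by the more specific character at each position. The function names are hypothetical, and multi-connector prefixes like "@" are not handled.

```python
def split(connector):
    """Split 'O*n+' into (type, subscript, direction) = ('O', '*n', '+')."""
    name, direction = connector[:-1], connector[-1]
    i = 0
    while i < len(name) and name[i].isupper():
        i += 1
    return name[:i], name[i:], direction

def link_label(left, right):
    """Name of the link formed by a '+' connector and a '-' connector, or None."""
    lbase, lsub, ldir = split(left)
    rbase, rsub, rdir = split(right)
    if (ldir, rdir) != ("+", "-") or lbase != rbase:
        return None
    # Assumption: the shorter subscript behaves as if padded with wildcards.
    width = max(len(lsub), len(rsub))
    lsub, rsub = lsub.ljust(width, "*"), rsub.ljust(width, "*")
    merged = []
    for a, b in zip(lsub, rsub):
        if a == "*":
            merged.append(b)
        elif b == "*" or a == b:
            merged.append(a)
        else:
            return None  # two different concrete subscript letters clash
    return lbase + "".join(merged).rstrip("*")

print(link_label("O*c+", "O-"))    # "than" followed by a noun
print(link_label("Mpc+", "Mp-"))   # "than" followed by a preposition
print(link_label("O*n+", "Ox-"))   # pre-particle object slot meeting a pronoun
print(link_label("Ss*c+", "Sp-"))  # singular versus plural: no link
```

Under these assumptions the third call yields the "Oxn" name that post-processing prohibits, which is why that check can be written against the link name alone.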
Finally, "than" may take a phrase like "was expected", or "I had expected", combined with any use of "more":

    He (is more intelligent) (runs more quickly) (earns more) than (I had expected) (was expected)

For this, "than" is given a Zc+ connector. Verbs like "expect" have Z-, disjoined with their other complement connectors. (Z is also used in subordinate "as"-phrases like "He left, as I expected". See "Z".)

    expect: (S-...) & (TH+ or TO+ or Z-)

This yields

                   +----Z---+
                   |   +-S--+
                   |   |    |
    He earns more than I expected

The second case, "He earns more than was expected", is more problematic, and requires a further SFsic connector on "than". See "Z" for more explanation.

_Constrained "Than" Phrases_

Other "than" phrases are more constrained. These divide into two groups: those where the "than" phrase contains a clause, and those where it does not. Regarding the latter case, a "than" phrase may contain an adjective, but only if the "more" phrase contains a predicative adjective as well as the adverb "more". "Than" is therefore given Pafc+.

       +-----Pa-----+
       +EB*m+       |       +--Pafc--+
       |    |       |       |        |
    He is more intelligent than attractive
    *He has more money than attractive
    *He is smarter than attractive
    *More intelligent people came than attractive

The "than" phrase may contain an adverb or prepositional phrase, but only if the "more" phrase also does, with "more" acting as an adverb. Thus we give "than" MV*c+.

        +--MVp/a---+
        +-MVm-+    |        +MVp/ac+
        |     |    |        |      |
    He plays more for money than for pleasure
    He plays more quickly than elegantly
    *He has more CDs than elegantly
    ?He has more CDs than for pleasure

(The last sentence is perhaps valid if "for pleasure" is being construed in an unconstrained sense: "He has more CDs than he does for pleasure". And we allow this.)

Finally, the "than" phrase can be a mass or plural noun phrase, but only when "more" is acting as a determiner.
In this case, the "than" seems to fulfill both the determiner and the main demand of the noun; therefore we use the "U-" connector on nouns, which overrides both these demands.

            +Dmum-+    +U*c-+
            |     |    |    |
    He has more money than time
    ?He has more money than a hobby

(Here again, an unconstrained interpretation is possible--and is allowed--with both sentences.)

How do we prevent these constrained "than" uses from being mixed and matched: "*He plays more quickly than intelligent", "*He is more intelligent than quickly"? In each case, the particular words that are needed in the "more" phrase, and the particular use of "more", can be identified by their link-names. Therefore, we can simply insist in post-processing that each "than" connector may only be used with the appropriate "more" connectors. Thus, "Pafc" demands "Pa" as well as "EB*m"; "MVpc" and "MVac" demand either "MVp" or "MVa"; and "U*c" demands "Dm*m".

A "than" phrase can also contain a subjectless clause:

    1. More people attended the party than came to the concert
    2. More money was committed than was available
    3. *More money was committed than were available

To handle this, we give "than" an S**c connector. No new domain is started here. For this usage, "more" must either be serving as a noun-phrase (as in 1 above), or as determiner of a noun (as in 2); and in the latter case, the verb of the "than" clause must agree in number with this noun (see 3 above). The number of the noun in the "more" phrase is indicated by the D connector; if it is a "Dmum", the noun is singular, whereas if it is "Dmcm", the noun is plural. If the "than" connects to a singular verb (as in 2), an Ss*c will be formed. In post-processing, Ss*c demands an Om, Jm, or Dmum; an Sp*c demands an Om, Jm, or Dmcm. In this way we enforce number agreement.
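The "demands" just described can be pictured as a table lookup over the link names in one group. The following is a rough sketch of that idea rather than the parser's post-processing code: the rule table is transcribed from the paragraphs above, while the matching function, the function names, and the link-name lists for the sample sentences are assumptions made for illustration.

```python
def matches(pattern, name):
    """True if `name` fits `pattern`; '*' in the pattern matches any one character.

    Requiring equal length is a simplification adopted for this sketch.
    """
    return len(pattern) == len(name) and all(
        p in ("*", n) for p, n in zip(pattern, name)
    )

# Trigger link -> list of requirements. Each requirement is a set of
# alternatives, at least one of which must appear in the same group.
DEMANDS = {
    "Pafc": [{"Pa"}, {"EB*m"}],          # needs Pa as well as EB*m
    "MVpc": [{"MVp", "MVa"}],
    "MVac": [{"MVp", "MVa"}],
    "U*c":  [{"Dm*m"}],
    "Ss*c": [{"Om", "Jm", "Dmum"}],      # singular verb after "than"
    "Sp*c": [{"Om", "Jm", "Dmcm"}],      # plural verb after "than"
}

def violations(group):
    """Return the link names in `group` whose demands are not satisfied."""
    bad = []
    for link in group:
        for trigger, requirements in DEMANDS.items():
            if not matches(trigger, link):
                continue
            for alternatives in requirements:
                if not any(matches(p, other) for p in alternatives for other in group):
                    bad.append(link)
                    break
    return bad

# Link-name lists below are hypothetical groups for the guide's examples.
ok_adjective = ["S", "Pa", "EB*m", "MVt", "Pafc"]       # He is more intelligent than attractive
bad_adjective = ["S", "O", "Dmum", "MVt", "Pafc"]       # *He has more money than attractive
bad_agreement = ["Dmum", "S", "Pv", "MVt", "Sp*c", "Pa"]  # *More money was committed than were available

print(violations(ok_adjective))
print(violations(bad_adjective))
print(violations(bad_agreement))
```

Because every rule is stated in terms of link names found in the same group, a check of this kind only works when the "more" phrase and the "than" phrase share a group.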
So far, then, we have: than: MVt- & (O*c+ or Mpc+ or ({SFsic+} & Zc+) or S**c+ or U*c+ or Pafc+) _Constrained Uses with Separate Clauses_ In all the constrained cases discussed so far, the "more" phrase and the "than" phrase are in the same post-processing group. This seems logical, since groups usually correspond to clauses, and in these cases both phrases seem to be in the same clause. Indeed, it is necessary for both phrases to be in the same group for the post-processing rules to work. In other cases, however, the "than" phrase seems to contain its own clause. 1. He is more intelligent than I am 2. He earns more than I earn 3. I run more quickly than he does 4. *He is more intelligent than I do 5. *He earns more than I am 6. *He runs more quickly than I earn In the first case, we can use the AF- connector on verbs like "be" and "seem" which take adjective complements (used also in adjectival questions: "how intelligent is he?"). In the second case, we can use the B- connector on verbs, used with fronted objects in questions and relatives ("What do you earn?"). In the third case, we need to create a special connector. Normally, auxiliary verbs and modals like "do" and "can" require a main verb: "*I do", "*I can". In this case, however, the comparative seems to satisfy this need. Therefore, we give auxiliaries "CX-" disjoined with their main verb (I+ or PP+) connectors. This yields the following linkages: +--AF-+ | | He is more intelligent than I am +--B--+ | | He earns more money than I earn +--CX-+ | | He runs more quickly than I do So far, then, "than" has the following: than: MVt- & (O*c+ or Mpc+ or ({SFsic+} & Zc+) or S**c+ or U*c+ or Pafc+ or AFc- or Bc- or CX-) These "than" phrases are clearly constrained, as shown by sentences 3-6 above. But in order to enforce these constraints as we do in the earlier cases, the "than" phrase and the "more" phrase have to be in the same group. 
Since normally groups correspond to clauses, it is not at all clear that the two phrases would be in the same group in these cases. In simple cases such as the ones above we could perhaps prevent a new domain from being started at the "than" phrase. However, in other cases, the "than" phrase may include an embedded clause: "He is smarter than I think he is". Normally, the C+ on "think" would start a new domain in this case. But if that happens, how do we enforce post-processing constraints between the "more" phrase and the "than" phrase? Consider the following structure: +-------MVto----+ +------O---+ +--Bc(e)-+ | +Dmum-+ | +S(e)+ | | | | | | He earns more money than I earn Notice first of all that the "than" is making an "MVto" connector to the left, rather than the "MVt" described earlier. This connector starts an 's' domain, which contains the whole "than" phrase. (The "Bc" is a restricted link, as are the AFd and CX links described above; this prevents the subordinate group from spreading back to the rest of the sentence.) What we want to do is enforce a constraint on the "B" connector. The "Bc" connector may only be used if a "Dm#m" is present in the "more" phrase. The "Bc" connector and the "Dm#m" are not in the same group, so this cannot be enforced directly. Note, however, that the "Bc" connects to "than", which is making the "MVto" connector, and this MVto _is_ in the same group as the "Dm*m". Therefore, we can use ordinary link logic to ensure that the Bc only occurs with MVto: than: (MVt- & (O*c+ or Mpc+ or...)) or (MVto- & Bc+); and we can then use post-processing to ensure that MVto only occurs when a Dm*m is present in the same group. In effect, then, we use link logic to communicate a post-processing constraint from one group to another. A similar process is used with the CX and AF cases. CX requires a determiner, noun-phrase, or adverb use of "more"; AF requires a adjectival-adverb use (or a comparative adjective like "bigger" or "better"). 
CX+ and AF+ on "than" are each conjoined with a special form of "MVt" (MVtp and MVta, respectively). In both cases a new group is started for the "than" phrase; but in both cases, the "MVt*" is in the outer group, thus its use (and indirectly the use of the CX and AF) can be constrained depending on the links present in the "more" phrase. By doing this, we can handle not only simple cases like those above, but also more complex cases where the "than" phrase is in an embedded clause, like "He earns more money than I thought he did". In cases where the than-phrase contains "be" or an auxiliary, subject-verb inversion may occur: He is more intelligent than am I He runs more quickly than do I To allow the first of these, we give "than" a "PF" connector. Forms of "be" have PF-; this is used in certain cases where the complement demand of "be" is fulfilled by something preceding. To handle the second, we give auxiliaries "CQ-" connectors, conjoined with their SI connectors (unlike CX-, which is conjoined with their S connectors). Auxiliaries thus have: do: (S- & (I+ or CX-...)) or (SI- & (I+ & CQ-)) (Since both CQ and PF occur with s/v-inversion, they must be added to the "compatible with inversion" and "requires inversion" lists in post-processing.) In terms of the constraints on their use, the PF case is identical to the AF case described above; the CQ case is identical to the CX case. We already have a system in place for enforcing the right constraints for AFd and CX; thus PFc and CQ can simply be disjoined with them. This yields: than: ... or (MVta- & (AFd+ or PFc+)) or (MVtp- & (CX+ or CQ+)) One more case of than-phrases in a separate clause is easy to deal with. A "than" phrase may contain a complete clause: +--EEc-+ +-Cc-+ | | | | I ran more quickly than he painted his house This usage is constrained: the "more" phrase must be acting either as an adverbial-adverb or a verb adverb ("I ran more").
In such a case, it is most logical to give "than" a C+ connector. C connectors are not (normally) in the domains they start; thus the Cc here is in the same group as the "more" phrase, and the constraints can be easily enforced. _Comparatives Involving Adjectival Clausal Complements_ A special kind of comparative involves a "more" phrase with an adjective plus its complement; the "than" phrase then includes a similar complement, a similar main clause, or both: I am more confident that Joe will come than I am that Fred will go that Fred will go I was It is more likely that Joe will come than it is that Fred will go that Fred will go it was * Fred is that Joe will go So far, we have made no provision for "than" phrases to take such complements at all. We have dealt with the case of more-phrases containing adjectives: "He is more intelligent than I am". As described above, an AFd+ connector on "than" is used for attaching to the verb of the "than" phrase; this is conjoined with a MVta-. We now add complement connectors like those on adjectives: THc+, TOic+, and TOc+. Either the "be" phrase may occur, or the complement phrase, or both. This then yields: than: ... or (MVta- & (AFd- or ({AFd-} & (THc+ or TOic+ or TOc+)))); We have already enforced that the MVta will only occur when the "more" phrase contains an adjective modified with "more". Various other constraints must be enforced here. If the subject of the first phrase is filler "it" (that is, if there is a adjective-complement phrase present that can only be used with "it"), then the subject of the second phrase must be as well. Therefore, we add the complex MVti- & AFdi+. This works the same way as the MVtx complex, except that MVti is only allowed with filler-it, while the MVta is only allowed with ordinary subjects. Thus we prevent the final incorrect sentence above. This yields: than: ... 
or (MVta- & AFd+) or (MVti- & AFdi+) or ((MVta- & {AFd+}) or (MVti- & {AFdi+}) & (THc+ or TOic+ or TOc+ or (TOt+ & B+))) However, there is a problem here. Normally, in order to enforce these constraints, the MVta must be in the same group as the "more" phrase. But when an adjective takes a complement such as TOi or TH, the complement usually starts a new group: +----Pa---+ ?--MV--+ +SF+ +--EAc+-TH-+--C-+S(e)+ | | | | | | | | | It is more likely that John left than it is that... and there is no way for anything to the right to make an MV connection back to the outer group. Thus, in the sentence above, there is no way for the MVta to be in the same group as the outer group of the "more" phrase. To fix this, we give adjectives "LE" connectors, disjoined with their complement connectors (if any). "Than" can then either make an LE link to the adjective of the "more" phrase (if it is taking a complement) or an MVta (if it is not). The sentence above therefore gets the following linkages: +----Pa---+----------LE-----------+--AF--+ +SF+ +--EAc+-TH-+--C-+S(e)+ | | | | | | | | | | | It is more likely that John left than it is that... Now the LE connector _is_ in the same group as the "more" phrase. So it takes the place of the MVta, enforcing the same constraints: the "more" phrase must contain an EAc, and if the "more" phrase contains a filler-it subject, the "than" phrase must also. A further constraint is that when the "than" phrase contains a complement, it must be of the same kind as that in the "more" phrase: *It is more important to go than it is that Fred goes. (TOi/TH) *It is more important that Fred goes than to stay. (TH/TO) *It is more pleasant to use than it is to go. (TOt/TOi) To enforce this, The complement connectors on "than" are given special subscripts; p.p. insists that these connectors are only used when the corresponding connector occurs in the same group. 
(For this to work, the complement connector of the "than" phrase must be in the same group as that of the "more" phrase. However, we want the "AFd" to be in a different group. Thus, rather than having the MVta or LE connectors start the group as we do with MVto and MVtp, we have the AFd connector here start the group.) This yields the following structure: +----THc----+ +----Pa---+---------LE--------+-AF(e)+ | +SF+ +--EAc+-TH-+--C-+S(e)+ | +SF+ | | | | | | | | | | | | It is more likely that John left than it is that... _Comparatives with "As"_ Comparatives with "as" are quite similar to those with "more... than", only somewhat simpler. The word "as" corresponds to "more"; a different word "as", as it were, corresponds to the word "than". These two words are called "as.y" and "as.z", respectively. It was mentioned that "than" connects to the rest of the sentence using a subscripted MV connector (or rather one of several: MVt, MVto, MVta, MVtp, MVti). Post-processing carries a list of all the subscript types that may occur with "more": EAm, EEm, Dm*m, etc. If none of these connectors is present, then no "more" phrase has occurred, so no "than" phrase may occur either: the MVt connectors may not be used. Note that a "than" phrase cannot be used with an "as" phrase, nor vice versa: *He earns as much money than she earns *He earns more money as she earns Thus "as.y" connectors must be subscripted differently from "more" connectors. Once this is done, we can make a separate list of "as.y" connectors; we then give "as.z" "MVz*" subscripts. We can then ensure that these subscripts are used only with the "as.y" connectors, and the "than" subscripts are used only with "more" connectors. Many of the same connectors occur with "as.y" as occur with "more". With adjectives and adverbs, "as.y" serves as an adverb, exactly as "more" does: He is as smart EAy He runs as quickly EEy In other cases, "as.y" must connect with "much" or "many" to serve the same function as "more".
When acting as a verb adverb, a noun phrase, or a mass determiner, it must attach to "much". When acting as a plural determiner, it must attach to "many". This uses the "AM" connector. "Much" and "many" have "AM-": much: AM- & (Dmuy+ or Oy+ or Jy+ or MVy-); many: (AM- & Dmcy+); +-----O------+ +---Oy---+ +---MVy-+ | +-AM-+Dmuy+ | +-AM+ | +AM-+ | | | | | | | | | | I have as much money I earn as much I walk as much Notice that it is then "much" (or "many") that makes the specially subscripted O, J, MV and D connections that will be used in post-processing. The expression for "as.z" is very similar to that for "than". It was mentioned that different subscripts are used for the MV- connecting "as" to the left. Beyond this, however, most of the connectors for "as.z" phrases are exactly the same as those for "than", and are used in the same way. For example, "as.z" can form unconstrained phrases using "O*c+" and "Mpc", just like "than". It can also make constrained, domain-starting phrases using "MVzo- & Bc+" (this is the same as the complex on "than", except that a "z" subscript is used instead of a "t"). With "than", p.p. ensured that MVto was used only when certain connectors were present in the "more" phrase: D**m, Om, Jm. For "as.z", we simply add the corresponding "as.y" connectors to this same list (Dmuy, Dmcy, Oy, Jy), and we likewise make MVzo require one of those connectors on the list. (There is no danger that an "as" connector will be used with a "more-than" connector; we have already stipulated that a "than" connector requires a "more" connector rather than an "as.y" one, and similarly with "as.z".)
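To see concretely what the expressions for "much" and "many" quoted above allow, it can help to expand them into their alternatives, each alternative being one set of connectors that must all be used together. The Python sketch below does this for expressions built from "&" and "or" only. The nested-tuple encoding and the function name expand are our own illustrative choices, not anything taken from the parser; optional {...} connectors, costs, and the left-to-right ordering of connectors are deliberately ignored.

```python
from itertools import product


def expand(expr):
    """Expand a nested ("and"/"or", ...) expression into its alternatives.

    A bare string is a single connector; the result is a list of tuples,
    each tuple listing the connectors of one alternative.
    """
    if isinstance(expr, str):
        return [(expr,)]
    op, *args = expr
    parts = [expand(arg) for arg in args]
    if op == "or":
        # Any alternative of any operand will do.
        return [alt for part in parts for alt in part]
    if op == "and":
        # Pick one alternative from every operand and join them.
        return [tuple(c for alt in combo for c in alt)
                for combo in product(*parts)]
    raise ValueError(f"unknown operator: {op!r}")


# much: AM- & (Dmuy+ or Oy+ or Jy+ or MVy-);   many: (AM- & Dmcy+);
MUCH = ("and", "AM-", ("or", "Dmuy+", "Oy+", "Jy+", "MVy-"))
MANY = ("and", "AM-", "Dmcy+")

for word, expr in (("much", MUCH), ("many", MANY)):
    for alternative in expand(expr):
        print(word + ":", " & ".join(alternative))
```

Every alternative produced from these two expressions contains AM-, which reflects the requirement that "much" and "many", in these uses, attach back to "as.y".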
The main difference with "as...as" is that its uses are somewhat more limited than "more...than": He is more intelligent than attractive *He is as intelligent as attractive He did it more quickly than carefully ?He did it as quickly as carefully "As" is also a preposition and conjunction, and thus can be used in this capacity in many of the kinds of situations where it is used as a comparative. (For this purpose it has yet a third dictionary entry, "as.p".) Joe is coming, as (I have to leave / I expected / a surprise) For this reason, many comparative sentences with "as" get multiple parses, and many strange-sounding comparative constructions will be accepted with "as" as a preposition. MX connects nouns to post-nominal noun modifiers surrounded by commas. +----MX--+ | +-Xd--+-----Xc------+ | | | | The dog , a poodle , The dog , who was black , barked loudly The dog , with a big nose , barked loudly The dog , chasing the cat , barked loudly The dog , annoyed by Fred , barked loudly The dog , angry at Fred , barked loudly The MX connectors used here have different subscripts depending on the kind of modifying phrase. Most kinds of comma modifier phrases correspond to kinds of no-comma modifier phrases, which use M rather than MX. However, every word that takes MX- requires a comma on either side of its phrase; such words therefore take (Xd- & Xc+). See "Xc". Nouns may take any number of comma modifiers: +--------------MX------------------+ +--------MX-------+ | +----MX---+ | | | +--Xd--+-Xc+-Xd+-----Xc------+Xd+-------------+ | | | | | | | | The dog , a poodle , with black hair , who was pretty , ... Nouns can take without-comma modifiers as well as with-comma modifiers.
However, any comma modifiers must follow any no-comma modifiers: The dog with black hair, a poodle, was pretty *The dog, a poodle, with black hair was pretty The dog that I bought, a poodle, with black hair, was pretty *The dog, a poodle, with black hair, that I bought was pretty Nouns therefore have dog: {@M+} & {R+ & Bs+ & {[[@M+]]}} & {@MX+}... _Noun phrases as modifiers_ Nouns can take noun phrases as modifiers with commas. (In this case the commas are obligatory; there is no corresponding without-comma phrase.) Nouns therefore have MX-, disjoined with their main S/O/J complex: dog: Ss- or SIs+ or Os- or Js- or (Xc+ & Xd- & MX-); Proper nouns can act as noun modifiers too ("My professor, Ms. Smith, is very good"); thus they also have MX-. Stand-alone determiners are given MX- too, although this is rarely used (?"John and Bill, some of my friends, are here"). Prepositions and progressive and passive participles also carry MX-. These usages correspond exactly to Mp, Mv, and Mg, used for prepositions and participle phrases without commas. As with Mv and Mg, the MX- on participles must start a domain; thus it is subscripted MX*p-, and this connector is domain-starting. Relative clauses may also act as comma-modifiers. These differ slightly from ordinary no-comma relative clauses, however. First of all, the relative pronoun is obligatory, unlike in no-comma phrases. Secondly, the relative pronoun must be "who" or "which"; "that" is not allowed. My friend, who you met yesterday, is here *My friend, you met yesterday, is here *My friend, that you met yesterday, is here For this reason, we treat comma relative clauses quite differently from no-comma ones. The relative pronoun acts as the head of the phrase. In subject relatives, it makes an ordinary subject (S) connection to the verb (unlike in no-comma relatives, where it makes a RS connection). 
In object relatives, it makes a B connection to the verb (whereas in no-comma relatives it makes a C connection to the relative subject). Therefore it behaves just like a question phrase. Indeed, the MX- on "who" is directly disjoined with the W- and QI- connectors used in indirect questions. who: (B*w+ or S**w+) & (Ws- or Wq- or QI- or (Xd- & Xc+ & MX*r-)); +--MX--+-S-+ | | | My friend, who was drunk, left the party +--MX--+----B----+ | | | My friend, who John hated, left the party (The subscripts on B+ and S+ are irrelevant to MX-.) One complication here is that in subject relative clauses, the verb of the relative clause must agree with the main noun: "*My friend, who were drunk, left the party." To handle this, we use post-processing. MX+ connectors on nouns are subscripted MXs+ or MXp+ for singular and plural. This creates an MXsr or MXpr link with the relative pronoun; these links are domain-starting. The verb that the relative pronoun connects to will then form an "Ss*w" or "Sp*w" link. We then dictate that a group containing an MXpr must contain an Sp*w; one containing an MXsr must contain an Ss*w. Prepositional relatives can also act as comma modifiers. For this, prepositions have MX*j-. These connectors act exactly like Mj-; see "Mj". Finally, adjective phrases can serve as comma modifiers; adjectives therefore carry MX*a-, which is similar in function to Ma-. N connects the word "not" to preceding auxiliaries. +------+ +-N-+ | | | | He had not gone He was not speaking to Fred "Not" is similar to an ordinary adverb, in that it can be inserted between an auxiliary and a participle ("He had quickly gone", "He was quietly speaking to Fred"). With adverbs, we use E in these situations to connect the adverb to the following participle.
However, adverbs may modify such participles in any of their uses; for example, in uses of progressive participles as openers or noun modifiers ("A man quietly speaking to Fred looked up"); "not" may not be used in this way ("*The man not speaking to Fred looked up"). Therefore we have "not" connect not to the following participle, but to the previous auxiliary, using N. Auxiliaries therefore have N+ conjoined with their participle connectors (I, Pp, Pg, Pv). The word "not" may also follow the verb "be" with adjectives and prepositional phrases; for this we can use EB. See "EB". ND connects numbers with certain expressions which require numerical determiners: +-ND--+-Yt-+ | | | I saw him three weeks ago Also "The store is FIVE MILES away", "FIFTY PERCENT of them were women", "THREE OTHER people are coming", "These are the TWO BIGGEST buildings in the city". In each of these cases, the word on the right of the link - not the number - is used to link the word to the rest of the sentence. Numbers can also be used to modify the word "more", when it is acting as a plural determiner or noun: "Three more are coming", "Three more people are coming", "*Three more money is needed". Thus "more" has ND- optionally conjoined with its plural-determiner and plural-noun connectors. Notice that ND is not used for ordinary noun-phrase number expressions: "Fifty people came to the party", "Fifty came to the party". For these purposes, numbers have ordinary plural determiner (Dmc) and noun (S/O/J) connectors. 
ND is only used when a numerical determiner is required: I saw him three weeks ago *I saw him the weeks ago *I saw him John's weeks ago These are the two biggest buildings *These are the some biggest buildings ND on numbers is directly disjoined with the plural determiner and noun-phrase connectors, and conjoined with other connectors like NN+ and EN-, used in building larger number expressions: +-----EN-----+ | +--NN--+--ND--+ | | | | I saw him about fifty million years ago NF is used with NJ in idiomatic number expressions involving "of": +-NJ-+ +-NW-+---NF---+ +NS+ | | | | | He lives two thirds of a mile from here He lives one third of a mile from here (uses NS instead of NW) He died three quarters of a century ago He lives 3/4 of a mile from here Fractional words like "thirds" and "third" have NF+, as do fractional numbers like "3/4". "Of" has the complex "NF- & NJ+" disjoined with everything else. Singular words like "mile" (not plural forms: "*He lives two thirds of 5 miles from here") have an optional NJ-. NF and NJ are similar to ND in that they are only used in idiomatic expressions such as time and place expressions. Fraction words like "third" can also act as noun phrases (with the right determiners), and thus they carry "S/O/J"; they can also take modifying phrases such as "of" phrases using "M-". For this, no special apparatus is needed: +-------S--------------+ +NS-+-M--+----J---+ | | | | | | A third of the population is illiterate Two thirds of the population is illiterate (NW instead of NS) NI is used in a few special idiomatic number phrases: +----NIc----+ +--NId--+ | +-NIa+ | +--+ | | | | | I have between 5 and 20 dogs *I have between dogs and cats The first word of the expression basically connects all the words together; but the last word of the expression (a number) connects the phrase to the outside world. NIa is also used in a few idiomatic number expressions: "He is aged 70", "He is on flight 714". NN connects number words together in series.
The last word in the series then connects to the rest of the sentence - either making a D connection to a noun, an S/O/J connection (in which case the number expression acts as a noun phrase), an ND connection, etc. +--NN-+---NN---+--NN---+----D--+ | | | | | Four hundred thousand million people live here Numerals and single-word numbers under 100 ("3", "1000", "three", "eighty") have NN+, disjoined with "D+ or ND+ or S+...". Thus they must either connect forward to another number word, or serve as the head of the number expression; they cannot connect backward to another number word. The words "hundred" and "thousand" have "(NN- & NN+) or ({NN-} & (ND+ or Dmc+ or S+...)". They must connect backwards to another number; they may either connect forwards to another number or serve as the head of the expression. This yields: Four people live here *Four three six people live here *Million four people live here *Million four hundred people live here *Million people live here Four hundred million people live here Whichever number is serving as the head of the expression may also make a DD connection back to a definite or possessive determiner: "THE four hundred MILLION people living in New York will suffer." See "DD". NR connects fraction words with superlatives: +------F----+ | +--NR--+ | | | It is the third biggest city in China Superlative adjectives connect back to the previous determiner, rather than connecting forward to the noun like most adjectives. See "L". NS serves a similar function to ND, only for singular expressions. (See "ND".) The words "one", "1", and "a (an)" therefore carry "NS+"; singular words like "week", used in idiomatic number expressions, carry NS-. +-NS+ | | I saw him a week ago NW is used in idiomatic number expressions. It is exactly like ND, except that whereas both word-numbers ("five") and numerals ("5") have ND+, only word-numbers have NW+.
The only use of it right now is in building fraction expressions: +-NW-+ | | Two thirds of the students are women O connects transitive verbs to direct or indirect objects: +----O----+ | | The dog chased the cat Some verbs have optional O connectors; they may or may not be transitive ("We moved"; "We moved it"). Some verbs can take an object followed by some other complement; such verbs have other connectors like Pa, TOo, I, Pg, or TH+ optionally conjoined with O+: +---?---+ +-O-+ | | | | We made him do it (I) We saw him running (Pg) We find him stupid (Pa) We told him to leave (TOo) We told him he was in trouble (TH) Other verbs have two O+ connectors, one or both of which may be optional ("I gave him five dollars", "I gave five dollars"). In this case, the first object may either be a pronoun or a noun; however, if it is a noun, the second may not be a pronoun: "I gave him the money", "I gave John the money", "*I gave John it", "*I gave him it". This is parallel to the case of particles; in transitive verbs which take particles like "up" or "out", the particle may not precede a pronoun ("*We sorted out them"). The O*n+/Ox- subscripts, developed for that purpose, are used here as well. The second O connector on two-object verbs has O*n+; pronouns have Ox-; "Oxn" is prohibited in post-processing. (See "K".) Os and Op connectors mark nouns as being singular or plural. The main reason for this is to enforce the correct use of "there is"/"there are". See "SF: 'There' as subject". Osi and Opi are used in the construction: +------Bs----+ +----R---+ | +SF+-Osi+ +-RS+ | | | | | 1. It is John who wants to do it Forms of "be" are thus able to take a noun (proper or common) plus a relative clause. Forms of "be" thus carry is: (Ss-) & (Pg+ or Pv+ or O*t+ ... or (Osi+ & R+ & Bs+) or (Opi+ & R+ & Bp+) Note that the O connectors used in this expression are different from the ordinary O connector used in "They are professors".
Note also that there must be number agreement between the noun and the verb of the relative: "*It was my father who wanted to do it", "*It was my parents who wants to do it". This is enforced by the linkage expressions above. There are other constraints on this construction as well: 2.*The man was John who did it 3.*I saw John who did it 4.*It wanted to be John who did it In this construction, the subject must be "it" (see ex.2); the verb must be "be" (ex.3); and there are constraints on the other verbs that may be used in the expression (ex.4). These are the same constraints that are already in place for the use of "filler-it". Thus we must simply add "Osi" and "Opi" to the list of connectors that require "filler-it" as subject, and everything else follows automatically. (See SF.) OD is used for a few verbs like "rise" and "fall" which can take expressions of distance as complements. +----OD----+ | +-ND-+ | | | It fell five feet Words like feet therefore have "OD-". (The special ".i" entry for distance nouns, used in other distance expressions, is used here.) In financial writing, one often sees such verbs used with "points", or just ordinary numbers; therefore we allow this also: GM stock fell five points GM stock fell 2 1/2 OF connects certain verbs and adjectives to the word "of". In most cases "of" can not modify verbs: "*I ran of the house". With a few verbs it can however: "I accused him of the crime", "I just thought of something". It can also modify certain adjectives: "I'm proud of you". Such verbs or adjectives have OF+ disjoined with their other complement connectors. "Of" has "J+ & (OF- or Mp-)"; to connect to its object, the usual "J+" is used. ON is used to connect the preposition "on" to certain time expressions. +--TY-+ +--ON-+-TD-+ +-Xd+-Xc+ | | | | | | I saw him on January 21 , 1990 \\\\\ See "DT" for more discussion of time expressions. 
OT is used for a few verbs like "last" which can take time expressions as objects:

         +-----OT----+
         |     +--ND-+
         |     |     |
    It lasted five years
    ?I've been working on this five years
    *It lasted five books

(Constructions like the second above are sometimes seen; we disallow them.) In questions, the object of such verbs may be fronted; in such cases, BT is used, and the phrase "how many" must precede the noun. This is analogous to an ordinary "how many" object-type question, like "how many dogs did you chase". BT is analogous to the usual B; TQ is analogous to Dmc.

               +-----BT----+
     +-H--+-TQ-+           |
     |    |    |           |
    How many years did it last

Thus we give "many" "H- & TQ+"; we give "years.i" "TQ- & BT+".

P is used to link forms of the verb "be" to various words that can be its complements: prepositions, adjectives, and passive and progressive participles.

    +S-+Pp-+
    |  |   |
    He is in the yard   (Pp)
    He is running       (Pg)
    He was chosen       (Pv)
    He was angry        (Pa)

Some of these connectors, particularly Pg and Pa, are used also with other verbs that take complements of these kinds.

_Pp_

Pp is used to attach forms of "be" to prepositions. Prepositions thus have "Pp-" directly disjoined with the other connectors used for attaching prepositional phrases to things: Mp- (used for phrases modifying nouns), MVp- (used for phrases modifying verbs), CO+ (used for openers), and so on.

_Pg_

Pg connects verbs that take present participles with present participles.

        +--Pg--+
        |      |
    I enjoy running

A number of verbs - "be", "enjoy", "like", "hate", "remember" - take present participles as possible complements; such verbs have "Pg+" disjoined with other complement connectors (like O+, TO+, etc.). A few verbs take both objects and present participles: "I saw him leaving". Such verbs take "O+ & {Pg+}". Present participles can also be used with no preceding verb in so-called participle modifiers: "The dog chasing John was black". Mg is used here, not Pg; this distinction relates to post-processing.
Present participles can also be used as subjects ("Playing the piano is fun"); such "gerund" usages use "Ss*g" connectors. See "Ss*g". Pgf is used by P.P. to control the use of "it" and "there". See "SF". _Pv_ Pv is used to connect forms of "be" to passive participles: +-Pv+ | | John was hit Forms of "be" have "Pv+" disjoined with their other complement connectors (O+, Pg+, etc.). Since the passive form of a verb is always the same as the past participle form, the same expression can be used for both: the "Pv-" connector is thus disjoined with the "PP-". However, the connectors conjoined with Pv- are quite different from those conjoined with PP-. First of all, only transitive verbs have Pv connectors ("*He was arrived"). Moreover, the Pv connector must be disjoined with the O connector on such verbs, to prevent "*He was hit the dog". When verbs take complement connectors such as "TH+", "TO+", and "QI+", the Pv- must usually be disjoined: +--Pv-+ | | I had known of the problem I had known that it was a problem I had known what was happening * John was known of the problem * John was known that it was a problem * John was known what was happening The complication here is that, frequently, such constructions are permissible when the subject is "it". It was known that it was a problem It was known what was happening We already have a mechanism in post-processing for ensuring that certain complement connectors ("THi", "QIi") are only used with "it" as the subject (see "SF"); so these can be used here. This produces: known: (T- & (O+ or QI+ or TH+ or C+ ....)) or (Pv- & (QIi+ or THi+...)); A further complication is that sometimes certain complements are permitted _only_ with the passive, for example: "He was known to be clever": "*I knew him to be clever".
This yields: known: (T- & (O+ or QI+ or TH+ or C+ ....)) or (Pv- & (QIi+ or THi+ or TO+...)); If a verb can take an object plus another complement, such as an infinitive (O+ & TOo+) or clause (O+ & TH+), the Pv- must be disjoined with the O+, conjoined with the other complement connector: +--TOo-+ +-O-+ | | | | I told him to go +-Pv+-TO-+ | | | He was told to go *He was told him to go this yields told: (T- & ((O+ or B-) & {TH+ or QI+ or TOo+...})) or (Pv- & {TH+ or QI+ or TO+}); (Note that for the passive, "TO" is used rather than "TOo". The function of "TOo" is to indicate to post-processing that a new subject is in force, by starting a new domain; but with the passive form, a new subject is _not_ in force. In "He was told to go", "he" is the implied subject of "go".) Sometimes one encounters what might be called a "prepositional passive". In most cases, a passive cannot be constructed out of a verb+preposition phrase: "I went to the house", "*The house was gone to"; "I threw a stick at the dog", "*The dog was thrown a stick at"; "We ate in the park", "*the park was eaten in". There are a few cases of common verb+preposition expressions, however, where such passives can be constructed: "I've been yelled at, gossiped about, lied to, and trifled with". We simply treat these as idiomatic, non-separable expressions, similar to passive forms of transitive verbs: yelled_at lied_to: Pv- & {@MV+}; _Participle Modifiers_ In participle modifiers - that is to say, in cases where the passive participle modifies a noun directly, like "The dog chased by Fred was black" - Mv is used, not Pv. See "Mv". _Pa_ Pa connects certain verbs to predicative adjectives: +-S--+-Pa-+ | | | The dog was black Only certain verbs carry Pa+ connectors ("be", "seem", "look", "taste"). 
A few carry Pa+ conjoined with O+, such as "make" and "keep": +---Pa---+ +-S-+-O-+ | | | | | I made him happy A few adjectives can act only as prenominals, not predicatives ("former", "other"); these have only A+ connectors, no Pa-. Many adjectives can take phrasal complements when used in predicative position: "She is eager to go", "It is not clear who will be hired", "I am certain Joe did it", "He is fond of cookies". On such adjectives, Pa+ is conjoined with TO+, TOi+, TH+, Ce+, QI+, or OF+ connectors, as appropriate. Pa+ is also conjoined with @MV+, allowing prepositional or adverbial modifiers ("She is happy with her job"). In all these cases, the modifying phrase is optional ("fond" is an exception: "*He is fond"). Paf connectors are used for post-processing, to control the use of "filler" subjects like "it" and "there". See "SF: Constraints on "Filler"-Only Phrases." Pa*j is used for verbs like "make" (mentioned above), which take object+adjective ("I made him happy"). In such cases, the adjective applies to the direct object, not the previous subject; thus a new domain must be started which includes the O and Pa links but not the S. Pa*j links therefore start "urfl domains". Pa*j is exactly analogous to TOo and I*j; see "TOo". PF is used in certain direct and indirect questions with "be". Normally forms of "be" require a complement to the right; but they can also be satisfied by a question-word to the left like "where", "when" or "how". +-Wq+-PF-+-SI+ | | | | || Where are you +--PF----+ +--QI-+ +-S-+ | | | | I wonder where you are Forms of "be" thus have PF- directly disjoined with "O+ or Pg+ or Pv+...". The question-words "where" and "when" have PF- conjoined with Wq- or QI-. In questions, then, a Wq- connection to the wall is made, enforcing s-v inversion (see "SI"); in indirect questions, a QI connection is made, preventing s-v inversion. Note that other verbs cannot be used in this way: "*Where ran you", "*What think you", "*What have you?" 
The first two cases are prevented because ordinary verbs don't have SI+ connectors; the third is prevented because, although "have" has an SI+, it is conjoined only with PP+ (used in past-participles) and not with O+ and B- (used with direct objects). Only on forms of "be" is the SI+ conjoined with the entire complement expression. PF is also used in cases of fronted prepositional and participle phrases. +---------PF--------+-SI-+ | | | Among the candidates was John Smith, a professor Carrying the box was a small child ?Described by John was a new program Such phrases vary widely in their acceptability, depending on the participle or preposition; we allow all participles and most prepositions to be used this way, but only at stage 2. (These phrases are not to be confused with openers. Openers attach to the subject of the main clause, which does not have s-v inversion and can have any verb: "Carrying the box, John left". Fronted prepositional or participle constructions involve s-v inversion; the phrase must attach to a form of "be".) In such cases, the fronted phrase must attach to the wall. S-v inversion is necessary here ("*Among the candidates John Smith was"); thus the participle makes a Wq connection to the wall, enforcing s-v inversion in post-processing. among: J+ & (Mp- or MVp- or ... [[Wq- & PF+]]) carrying: (Pg- & (O+...)) or ((O+...) & (Ss*g+ or COp+ or [[Wq- & PF+]])); For participles, note that the Wq- & PF+ complex is directly disjoined with the COp+ and Ss*g+, and is to the right of the complement expression; it is fully disjoined with the Pg, which is to the left of its complement expression. See "CO: Participles as Openers" for an explanation. PP connects forms of "have" with past participles: +-PP-+ | | He has gone Forms of "have" have "PP+" connectors; past participles have "PP-" connectors, conjoined with their complement connectors (O, TH, TO, etc.). 
Since the past participle form of the verb is usually the same as the simple past form (which uses an "S" connector), we can usually use the same expression for both. died arrived moved purchased: (S- or PP-) & [complement]; In cases where the simple past and the past participle are distinct, however, we must use separate expressions. began went forsook: S- & [complement]; begun gone forsaken: PP- & [complement]; The past participle is, in every case, the same as the passive form (where there is one); but the passive complement expression is usually quite different from the past-participle one, so the past-participle connector PP and the passive connector Pv may not be directly disjoined. (See "Pv".) PPi is used by post-processing to control the use of "it" and "there"; see "SF: Filler-it". Q is used in questions, in several different ways. It is used to connect to auxiliaries in simple s-v inversion questions, when there is no preceding object. In some cases the auxiliary connects to the wall (ex. 1 below). In other cases it connects to a question-word like "when", "where", or "why" (ex. 2). +-Qd+ | | || Are you going to the movie +-W+-Q-+ | | | || Why did you go In noun-focused questions, either object-type or subject-type, the question-word connects to the verb ("What did you buy", "Who bought that"). In the former case, there is s-v inversion, but no "Q" connection is made. Thus the "Q-" is optional on auxiliaries. Post-processing ensures that when a question-word is used in object-type or when-where-why questions (not subject-type questions), the outer group of the sentence contains some kind of SI; this is because in all such cases, a Wq is used to connect to the wall, and this requires an SI in the same group. P.P. further ensures that SI connectors can only be used when a Wq is present (see "SI"). However, in simple yes-no questions, question inversion must occur, yet no question word is present. 
Thus the Q+ on the wall is subscripted "Qd", and this is added to the list of connectors that permit s-v inversion. (Note that the Q- on auxiliaries is conjoined with SI+, but disjoined with S-; by link logic it can only be used with SI+. Post-processing tightly constrains the use of SI, preventing it from occurring in indirect questions, relative clauses, etc.; the use of Q is thus automatically constrained as well.) Q is also used for questions with prepositions: "IN which room WERE you working", "TO whom WERE you speaking". See "JQ", "Jw". _Adverbial questions: Qe_ Qe is used to connect adverbs to following auxiliaries in adverbial questions: +---I---+ +Eeh-+--Qe--+-SI+ | | | | | | How quickly did you run Qe can only be used in questions ("*Very quickly did you run"). This is enforced by the fact that the sentence must connect to the wall, and can only do so through "how". Again, the use of Qe in indirect questions is prevented because Q- on auxiliaries is conjoined with SI, whose use is constrained by post-processing. One false positive must be addressed here, however: +------------B-------+ | +----I---+ | +--Qe--+-SI-+ | | | | | | * Who quickly did John hit We prohibit this in post-processing by requiring that a group containing a Qe must contain an EEh. QI connects certain verbs and adjectives to question-words, forming indirect questions: +--QI--+ | | 1. I wonder what book he will read 2. I wonder what will happen 3. I wonder what he will do 4. I wonder when he will come 5. I wonder how big it is +--QI--+ | | I am not certain where he is Question-words thus have "(QI- or W-)". (The W- is used in direct questions, to attach to the wall). This is conjoined with various connectors. Determiner question-words, like 1 above, use D**w+; subject-type questions (ex.2) use Ss*w+; object-type questions (ex.3) use B*w+; where/when/how questions (ex.4) use Cs+, PF+ or TOn+; adjective questions (ex.5) use EAh+. See entries for these individual connector-types. 
Verbs ("wonder", "know", "ask") and adjectives ("certain", "clear") that take indirect questions have QI+, disjoined with other complement connectors. QIi is used by P.P. to enforce the correct use of "filler" "it". Certain phrases can only be used with "it" as the subject ("It is not clear what will happen"; *"I am not clear what will happen"); this is enforced in post-processing. "QIi" is therefore directly analogous to "THi" (see "THi"). QI#d is, similarly, used by post-processing. The issue here is whether or not the QI connector should begin a domain. In all indirect questions, it is clear that the indirect question should be in a different group from what precedes. With "where/when/how" constructions ("I wonder when he will come"), the question-word makes a Cs connection to the new subject, beginning an 's' domain; thus the QI connector need not start a domain itself. In all other cases, however, the links extending to the right of the question-word (B, S, D, etc.) are not normally domain-starting. In these cases, then, the QI link has to start the domain. Thus we make "QI#d+" a domain-starting link, and assign it to question-words like "what" and "who"; "where", "when" and "how" have the non-domain-starting "QI+". R connects nouns to relative clauses. In subject-type relatives, it connects to the relative pronoun. In object-type relatives, it connects either to the relative pronoun or the subject of the relative clause if the relative pronoun is omitted. +-----B------+ +-R--+C-+-S--+ | | | | 1. The dog that I chased was black +----B----+ +-R-+RS---+ | | | 2. The dog who chased me was black +--B------+ +-R-+-S---+ | | | 3. The dog I chased was black Consider the following simplified expressions: dog: ...(R+ & B+)... 
& (({C- or R-} & S+) or O+ or...); who: R- & (C+ or RS+); When a relative pronoun makes an R connection back to a main noun, it must make either a C connection to the subject of a new clause (in an object-type relative - ex.1 above), or an RS connection to a finite verb (in a subject-type relative - ex. 2 above). Note that, conjoined with their S+ connectors, nouns have not only C- but R-. Thus, in object-type relatives, the main noun can connect directly to the subject of the relative clause; no relative pronoun is necessary (ex. 3 above). The "R+ & B+" complex on nouns is disjoined with the "@M+". Nouns frequently take a relative clause, or one or more prepositional phrases; they rarely take both. (See "M".) We also do not allow multiple relative clauses. These sound all right in some cases ("The movie I saw that I told you about"), but in practice seem to be extremely rare. _Other Words with "(R+ & B+)"_ Some pronouns and determiners which act as noun phrases have an "R+ & B+" complex also: "ALL who apply for the job will be considered", "We need SOMEONE who can program well". Relative clauses may also occur within commas ("Dave, who you met yesterday, is here"). In this case, however, a very different structure is formed. See "MX*r". _Relative Clauses and Post-Processing_ When a relative clause is created, an 'r' domain is begun, by the R connector on the main noun. This 'r' domain spreads through the relative clause, and then back through the B connector hooking back to the main noun. +-----------S-------+ +-----B(r)-+ | +R(r)+RS(r)+-O(r)+ | | | | | | The dog who chased me was black The point of the 'r' domain is to include all the connectors involved in the relative clause. By ordinary domain logic, however, after spreading back through the B connector, the domain would then continue to spread to the S connector in the main clause and everything else to the right. This is clearly undesirable. 
We therefore create a special list of "restricted links" in post-processing. These are links through which domains are traced no further, when they extend back to the left of the root word (the word on the left end of the domain-starting link: the main noun, in this case). _Cycles in Relative Clauses_ It will be noted that the linkages of relative clauses involve "cycles"; more links are present than are necessary simply to connect all the words. Why is the cycle needed? In object-type relatives, it seems natural for the main noun to connect to the verb of which it is the implied object. But by requiring a connection to the subject of the relative clause as well, we can prevent question inversion at the linkage stage ("*The dog did John chase was black"); hence the C+ on relative pronouns. Once we require the relative pronoun to connect to the right in object-type relatives, we must let it connect to the right in subject-type relatives as well; hence the "RS" connector. (There is another motivation for the RS connector, involving embedded clauses within relatives: see "RS".) As always with cycles, however, there is a danger here of strange ill-formed sentences arising, such as "The dog I died John chased was black". To prevent this, we include Bp and Bs in a list of "must_be_connected_without" connectors. Post-processing then insists that the sentence must be fully connected even when these connectors are removed: in other words, it insists that they must be part of a cycle. It was mentioned that R is used when the noun connects to the subject of the relative clause rather than to the relative pronoun. R is also used for infinitival phrases following nouns: "The TEAM TO beat is Miami". This construction is very rare and is given a cost of 2. RS is used in subject-type relative clauses. 
+----B----+ +-R-+RS---+ | | | The dog who chased me was black Finite verbs have a B- connector disjoined with their S- connectors which connects to the main noun of a relative clause. This B- is conjoined with an RS-; when verbs make a B connection back to a noun, they must form an RS connection as well. This is supplied by the relative pronoun. died arrived: (S- or (RS- & B-)); who: R- & (RS+ or C+); See "R" for a full discussion of relative clauses. A complicated situation arises when the verb of a subject-type relative is itself contained in an embedded clause. +------B------------+ +-R-+--C-+-S--+-RS--+ | | | | | The dog who John said chased me was black The "who" makes a C connection to the outer subject of the relative constituent, just as if it was an object-type relative. But in this case, the "B" on the main noun is connecting to the relative verb as a subject, not an object. The verb therefore requires an RS connection. Moreover, the outer verb of the relative constituent, "said", requires a complement (normally it makes a "TH" or "C" connection to the subject of an embedded clause: "John SAID HE was coming"; "*John said"). Thus we give such verbs an "RSb" connector, conjoined with their other complement connectors. (In this situation, the RS connector must start a domain; this is the reason for the "b" subscript.) Notice that in the above case - unlike other subject-type relatives - the relative pronoun may be omitted: "The dog John said chased me was black". This follows naturally from the current system. _Other "RS" Constructions_ Because the relative pronoun may be omitted here, and because (either via the relative pronoun or directly) the main noun is making a clause connection to the subject of the relative clause, this situation has a lot in common with object-type relative clauses; yet the main noun is the implied subject of the embedded clause, not the object. 
In this way, other "B+" connectors can be used that are normally used in object-type constructions, for example, those on question-words, "whatever"-type words, and "transitive" adjectives (the latter case is a little weird, but we allow it): Who do you think hit John Whatever you think will work is fine John is easy to think hit Joe RW connects the right-hand wall to the left-hand wall in cases where the right-hand wall is not needed for punctuation purposes. See "X: Comma Phrases at the Beginning and End of Sentences". S connects subject-nouns to finite verbs: +-S---+ | | The dog chased the cat Ss connects singular nouns to singular verb forms ("The dog chases the cat"); Sp connects plural nouns to plural verb forms ("The dogs chase the cat"). Simple-past forms do not distinguish between singular and plural ("The dog chased the cat", "The dogs chased the cat"); thus an unsubscripted S- is used. S##t is used by PP to control sentences like "The problem is that John is coming"/"*The dog is that John is coming". See "THb". S#i, S#x, and S##i are used to control the use of the pronoun "I". "I" normally acts like a plural noun ("I run" / "*I runs"), with two exceptions: "I was" / "*I were", and "I am" / "*I are". To control this, we simply give "I" an Sp*i+ connector; we give "are" and "were" an Spx- connector; and, in post-processing, we outlaw Spxi connectors. We also give "am" an Spi- connector, and outlaw Spi connectors; "We am" is thus prevented, but "I am" forms an Spii link and is thus allowed. Ss#w is used for question-words like "who" that can act as noun-phrases in subject-type questions: "Who is coming?" This subscript serves a rather arcane purpose in post-processing. See D##w. _Ss*g: Gerunds_ Ss*g is used for gerunds: "-ing" forms of verbs that act like nouns or heads of noun phrases. +------Ss*g-----+ | | Playing the piano is fun This is perhaps a good place for a general discussion of gerunds. The use of gerunds is extremely problematic. 
It involves a huge twilight zone of strings which vary rather gradually in their grammaticality. There are two basic questions to be answered. 1. What are the rules governing the construction of the actual gerund phrase? 2. What are the rules governing its larger context? _How can gerund phrases be constructed?_ Gerund phrases can be constructed in two basic ways: A) with the normal complement of the verb, and no determiner; or B) with a determiner (and other common modifiers often used with nouns), but without the normal complement. Let us call these the c/nd case and the nc/d case, respectively. _The Complement / No Determiner Case_ A. Gerunds can be used with their normal complement and no determiner: Sleeping is fun Chasing dogs is fun Telling John to leave won't help the situation Telling John that you hate him will destroy your relationship Graduating from college first will make you more marketable Whatever requirements apply to the complement as the verb is normally used apply to the gerund also (but only in the c/nd case: the nc/d case is quite different, as described below): *I talked John *Talking John is fun *I told to do it *Telling to do it won't help the situation *I like to chase ?Chasing is fun (A possible exception is cases like the last one, where the complement is simply omitted from a verb which otherwise requires one. This case will be discussed later.) (With strain, gerunds may also be used in this way with a possessive determiner: "Your telling John to leave was a mistake". For this we use DP; see "DP".) To handle cases such as these, we could directly disjoin Ss*g+ with Pg- on "-ing" forms of verbs (the Pg- is used in present participles). There is a problem here, however. On present participles, the complement expression must occur to the right of the main "Pg- or Mg-" expression. 
This is because when "B" connectors are used, they may need to hook back beyond the Pg link: +------B------+ | +---Pg---+ | | | What are you doing With gerunds, however, the reverse is true: the Ss*g link must link to the right beyond whatever complement links are made (O, TH, etc.): +----Ss*g----------+ +-TO-+ | | | | Trying to kiss Susan was stupid Therefore the entire "Pg" expression must be disjoined with the entire "Ss*g+" expression. Because of this, it seems clearer simply to treat the present participle and the gerund as two different words: the present participles are listed under "trying.v", "chasing.v"; the gerunds are listed under "trying.g", "chasing.g". The Ss*g+ expression contains the entire complement of the verb. So far, then, the usual expression for the ".g" entry for gerunds is simply: [normal complement] & Ss*g+; _The Determiner/No Complement Usage_ Gerunds may also be used with a determiner. In this case, they may not take their normal complements. The situation here is complex. Take a simple transitive verb like "hit". When a determiner is used, taking a simple direct object is clearly wrong (ex. 1). Taking no complement sounds questionable (ex. 2). The usual thing is to take an "of" phrase (ex. 3). 1.*The hitting dogs is fun 2.?The hitting is fun 3.The hitting of dogs is fun With intransitive verbs, having no complement sounds funny also; here again, it's normal for an "of" phrase to be used. (Notice that with intransitive gerunds, "of X" names the subject of the verb; with transitive gerunds, "of X" names the object.) 4.?The graduating changes the situation 5.?The sleeping can ruin a lecture 6.The graduating of Fred changes the situation 7.The sleeping of students can ruin a lecture How about complex verbs? Here again, straightforward use of the complement is definitely wrong (ex. 8-11). Use with no complement is iffy (ex. 12-13), as is use with an "of" phrase (ex. 14-18). 
8.*The telling John to leave was stupid 9.*The showing how to use the program seemed to interest people 10.*The attempting to go to the party really angered Joe 11.*The demonstrating that our program could handle complex sentences impressed people 12.?The telling was unfortunate 13.?The demonstrating seemed to impress people 14.?The telling of John was stupid 15.?The showing of the program seemed to impress people 16.*The attempting of John to go angered John 17.*The telling of John to go was stupid 18.?The demonstrating of our program's abilities impressed people Sentences like the last five above are perhaps not so much incorrect as unnecessary; in most cases, we have nouns such as "demonstration" and "attempt" which we use instead. So, when a determiner is present, using the normal complement is wrong with transitives and complex verbs. Having no complement sounds doubtful with transitives and complex verbs, and also with intransitives (where it corresponds to the normal use of the verb). Using "of" phrases is fine with transitives, okay with intransitives, doubtful with complex verbs. (This means that with some complex verbs that cannot be transitive, like "wish" and "hope", there is no really good way of using the gerund with a determiner - which I think is true.) For the moment, we ignore most of these subtle distinctions. We allow any gerund to be used with a determiner; we allow use with "of" at stage 1 (using OF+); we allow use without "of" at stage 2; and we disallow any use of the normal complement. This yields (D- & (OF+ or [[()]])) & Ss*g+; When gerunds are modified by determiners, they may also be modified by adjectives, relative clauses, and participle modifiers. Other determiners may be used, such as possessives. Singular-only determiners like "a" cannot be used; gerunds seem to act like mass nouns in this sense. Even some mass-noun determiners like "some" and "most" sound doubtful, but for now we allow them. 
The sleeping of students described by Fred is a big problem The sleeping of students I told you about is a big problem The insensitive/frequent/habitual sleeping of students is a big problem His hitting of the dog didn't help matters ?Some hitting of dogs will solve the problem ?Most hitting of dogs is unnecessary This yields: {@A-} & Dmu- & (OF+ or [[()]]) & {R+ & Bs+...} & Ss*g+; _Gerund Phrases with Neither Determiner nor Complement_ What about gerund phrases with neither a determiner nor a complement? Running is fun ?Chasing should be disallowed ?Telling was a bad idea With intransitive verbs like "run", such a usage will be allowed anyway; an "empty" complement is one possible use of the verb. Even with verbs like "chasing" and "telling", however, one occasionally sees use of the verb with neither a determiner nor a complement. We therefore allow this at stage 2; we incorporate it into the nc/d expression. The expression then becomes {@AN-} & {@A-} & (Dmu- or [[()]]) & (OF+ or [[()]]) & {@M+} & {R+ & Bs+...} & Ss*g+; _Noun-modifiers on Gerunds_ Adjectival-noun modifiers of gerunds pose a problem. One sometimes sees these used either with a determiner or without. Drug running has fueled the economy here for many years The drug running here has fueled the economy here for many years ?Dog hitting is a big problem ?The dog hitting is a big problem However, one never sees both a noun-modifier on a gerund and a complement: *Dog chasing cats is a big problem *Student complaining that the rules are unfair is very common Therefore, we treat noun-modifiers as part of the nc/d usage (which, as described above, does allow stage 2 usage without a determiner). {@AN-} & {@A-} & (Dmu- or [[()]]) & (OF+ or [[()]]) & {@M+} & {R+ & Bs+...} & Ss*g+; _The Use of Gerunds_ So much for the way gerund phrases are constructed. Once a valid gerund phrase has been constructed, how may it be used? 
It may be noted that, in the above sentences, I have used gerund phrases as subjects connecting to a wide variety of verbs. There are certainly a large number of verbs that gerunds can be subjects for, ranging from absolutely clear-cut cases, to somewhat metaphorical cases, to highly metaphorical cases. Many verbs which normally refer to actions, which one would think could not be performed by other actions, are often used with gerunds. The only kind that are never used are those that imply physical actions which are never used figuratively, and those that imply a state of mind: e.g. a propositional attitude, emotion, etc.. Inviting John will cause problems Inviting John may well destroy the party Inviting John says to Cathy that you don't like her *Inviting John kicked Fred in the pants *Inviting John knows that Fred won't come *Inviting John hopes that he won't come There seems to be no good way of limiting these uses, especially as we do not do so for ordinary nouns (i.e., we allow "The invitation of John knows/hopes that Fred won't come"). So, we allow gerunds - both in the nc/d case and the c/nd case - to make S links to any verb. (Clearly, both types act as singular nouns, not plurals: "*Inviting John cause problems.") As prepositional objects, gerunds can be used quite freely also. Some prepositions take gerunds extremely freely and commonly. The nc/d usage sounds acceptable with almost any preposition; the c/nd usage is extremely common with some prepositions like "by" and "about", less common with others. I caused a problem by inviting John ?I caused a problem by the inviting of John I should have talked to you before inviting John ?I should have talked to you before the inviting of John We had a discussion about inviting John We had a discussion about the inviting of John ?This led to inviting John This led to the inviting of John However, the c/nd usage has already been handled here in another way, using Mg (see "M: Mg and Mv used with Conjunctions"). 
We need only address the nc/d case here. (This is another reason for disjoining the nc/d case and the c/nd case.) Thus we add J- to the nc/d expression: (([normal complement] or @AN-) & Ss*g+) or ({@AN-} & {A-} & (Dmu- or [[()]])... & (Ss+ or J-)); How about ordinary verb objects? Here, the c/nd case usually sounds quite wrong. There are a few exceptional complex verbs where it occurs commonly - more with the c/nd usage than the nc/d usage: I hate/enjoy/recommend inviting John ?I hate/enjoy/recommend the inviting of John to parties For c/nd cases of this type we use Pg. With most other verbs, gerunds are used very rarely as objects. It seems that gerunds are used figuratively less often as objects than they are as subjects. When they are used, it is more often with the nc/d usage. We completed/defended/fought the inviting of John *We completed/defended/fought inviting John We made the inviting of John impossible We expected the inviting of John to cause problems ?We expected inviting John to cause problems We gave the inviting of John a thorough discussion *We gave inviting John a thorough discussion *We kicked/believed/shouted/promised/sold the inviting of John *We kicked/believed/shouted/promised/sold inviting John For the moment, we simply allow all nc/d usages as objects, and forbid all c/nd usages (except for a few verbs that use Pg). This yields the following: (([normal complement] or @AN-) & Ss*g+) or ({@AN-} & {A-} & (Dmu- or [[()]])... & (Ss+ or J- or O-)); There are a few gerunds that are used as objects with a variety of different verbs. Many of these are sports, bodily functions discussed in medical terms, and activities which are proscribed or illegal. 
That portion of the brain controls speaking, reading, and writing *The law controls selling cigarettes The disease can cause swelling and itching *The recession caused losing jobs The law would prohibit/allow/restrict/affect smoking/fishing in public places *The law would prohibit/allow/restrict/affect selling cigarettes / eating fish The usual usage here is without either a complement or a determiner. Recall that this usage is covered under the nc/d usage. Therefore, since we allow gerunds to be used as ordinary objects under the nc/d usage, these cases will be covered as well. _"Urfl-only" domains_ One final point about gerunds. In the c/nd usage of gerunds, the gerund phrase appears to constitute a verb expression which implies a different subject from the rest of the sentence. By the logic of post-processing, then, it seems sensible to include the gerund phrase in its own domain. When the gerund is acting as a subject, the gerund phrase includes everything starting from the gerund word and tracing to the right, _underneath_ (and not tracing through) the Ss*g link. This domain structure does not correspond to any existing domain structure - either normal domains or "urfl" domains (see "TOo"). Thus we create a special domain structure especially for this purpose: "urfl-only". Ss*g links then start 'd' domains, which are "urfl-only". +--------------Ss*g--------+-----O---+ +---MVp(d)--+---J(d)--+ | +---D--+ +-O(d)-+ | +D(d)+ | | +-A+ | | | | | | | | | Telling John about the party was a bad idea ("Urfl" domains include everything "under the root link from the left", as well as including everything traced from the word on the right end of the root link, like ordinary domains. "Urfl-only" domains include _only_ the links that are under the root link and reachable from the left.) In the nc/d usage, gerunds seem to act much more like simple nouns. Therefore there is no need for them to create new domains. Thus ordinary S connectors are used here. 
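The "urfl-only" domain described above can be illustrated with a small sketch. This is written in Python purely for illustration and is not the parser's own code; the tuple representation of links, the function name "urfl_only_domain", and the decision never to trace onward from the word at the right end of the root link are all assumptions made for this example.

```python
from collections import deque

def urfl_only_domain(links, root):
    """Return the set of links in the "urfl-only" domain started by `root`.

    links: list of (left_word, right_word, label) tuples, words indexed from 0.
    root:  the domain-starting link (here, the Ss*g link).
    The data layout is hypothetical; it is not the parser's representation.
    """
    left, right, _ = root
    # Candidate links lie underneath the root link: both ends fall within
    # its span, and the root link itself is excluded.
    under = [l for l in links if l != root and left <= l[0] and l[1] <= right]

    domain = set()
    seen = {left}
    queue = deque([left])          # trace outward from the left-hand word only
    while queue:
        word = queue.popleft()
        for link in under:
            a, b, _ = link
            if word not in (a, b):
                continue
            domain.add(link)
            other = b if word == a else a
            # Assumption: never trace onward from the right end of the root.
            if other != right and other not in seen:
                seen.add(other)
                queue.append(other)
    return domain

# "Telling John about the party was a bad idea"
#   0       1    2     3   4     5   6 7   8
links = [
    (0, 5, "Ss*g"), (0, 1, "O"), (0, 2, "MVp"), (2, 4, "J"),
    (3, 4, "D"), (5, 8, "O"), (6, 8, "D"), (7, 8, "A"),
]
domain = urfl_only_domain(links, links[0])
for link in sorted(domain):
    print(link)
```

For the sentence above, the sketch collects the O, MVp, J and D links inside the gerund phrase, which are the links marked (d) in the diagram, and leaves out the Ss*g link and the links of the main predicate.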
_Other Special S Connectors_

Ss*t is used for a few nouns that can take "be+that" as predicates:

    The idea was that we would go to London
    *The vacation was that we could go to London

We enforce this distinction in post-processing. See "THb".

Ss*b is used for a few subjects - "it", "this", and "that" - that can take the predicate "(be) because (clause)": "This is because he is stupid". See "BI".

Ss*q is used for subjects like "question" that can take the predicate "(be) (indirect question)": "The question is why he did it". See "BI".

SF is a special connector used for certain "filler" subjects like "it" and "there". It interacts heavily with post-processing. Post-processing is used to enforce both that certain predicates may not be used with "filler" subjects, and that certain predicates may only be used with such subjects. SF is also used with a few special phrases like "to" and "that" phrases when used as subjects.

_"Filler-it"_

Many verbs and adjectives take complements like "to+infinitive" or "that+clause" (see "TO" and "TH"):

    1. He wants to go
    2. He is eager to go
    3. I expect that he will go
    4. I am glad that he is going

However, there are certain adjective-complement and verb-complement phrases that may only be used with the subject "it".

    5. It is likely that John will go
    6. *John is likely that John will go
    7. It seems that John should go
    8. *John seems that John should go
    9. It was suggested that John would go
    10. *John was suggested that John would go
    11. It is important to go
    12. *John is important to go

It can be seen in the dictionary that the verbs and adjectives on the left of "that" in exs. 5-10 - "seems", "likely" - have specially subscripted THi+ connectors, unlike "expect" and "glad", which have unsubscripted TH+ connectors. Moreover, "wants" and "eager" have TO+ connectors; "important" has, instead, a "TOi+" connector. "It" also has a special "SFsi" connector; verbs which "it" may connect to directly are given "SFs-" connectors, disjoined with their "S-". ("It" also has an ordinary "Ss" connector; see below.) Thus the connectors that may only be used with "it", as well as "it" itself, are marked with special connectors.

We use this information in post-processing to make the distinctions noted above. Recall that post-processing divides the links of a sentence into groups, corresponding roughly to subject-verb expressions. In post-processing, we insist that any group which contains a THi connector or a TOi connector must contain an SFsi connector. In this way, we are able to reject ex. 6 above while accepting ex. 5.

     +SFsi+Paf+--THi-+-Ce-+-S(e)+I(e)+
     |    |   |      |    |     |    |
     It   is  likely that John  will go

     +-Ss-+Paf+--THi-+-Ce-+-S(e)+I(e)+
     |    |   |      |    |     |    |
    *John is  likely that John  will go

Of course, "it" may also be used referentially, as an ordinary pronoun: "It is black with a long tail". Thus "it" has the following:

    it: SFsi+ or Ss+ or O- or J-;

A sentence like ex. 5 will thus get two parses, one with SFsi and one with Ss; the one with Ss will be rejected. This is not merely vanity, but has an important function, as described below.

_Constraints on "Filler"-Only Phrases_

Predicate phrases that may only be used with "it" as the subject, such as "likely that" and "important to", might be described as "filler-only". There are other constraints on the way such phrases are used, besides the fact that "it" must be the subject. Only certain verbs and adjectives may be combined with them:

    It seems to be likely that John will go
    *It is glad to be likely that John will go

Thus, as well as enforcing that "it" is the subject of such phrases, we must ensure that they are not used in combination with predicates like "glad" and "wants to".
Another way of thinking about this is that there is a "filler-it" which may not be used with "referential-only" predicates and a referential "it" which may not be used with "filler-only" predicates. This is in fact the approach we use. The "S" connector on "it" corresponds to referential uses, the "SF" to "filler" uses. We have already insisted that "filler-only" predicates (like "likely that") are not used with "S"; we now must require that "referential-only" phrases are not used with "SF". This is again enforced with post-processing rules.

The verb and adjective connectors (I, PP, Pg, Pa, Pv, TO) on the verbs and adjectives that may be used with "filler-it" ("be", "seem") are subscripted with "i" (yielding Ii, PPi, Pgi, Pai, Pvi, TOi); all others, like those on "want" and "glad", are left unsubscripted. (Note that "it" can only make a direct SF connection to certain verbs anyway; thus the use of non-referential "it" is partially controlled at the linkage level.) Post-processing rules then dictate that when an SFsi is present in a group, the only forms of these connectors (I, PP, Pg, Pa, Pv, and TO) which are allowed are those subscripted with an "i". The following linkage is thus rejected, because the group containing an SFsi also contains unsubscripted "Pa" and "TO" links.

     +SFsi+Pa+-TO-+Ii+Paf+--THi-+-Ce-+S(e)+I(e)+
     |    |  |    |  |   |      |    |    |    |
    *It   is glad to be  likely that Joe  will go

Many sentences with "it" receive two parses, one with "S" and one with "SF". The above sentence, for example, also receives this linkage:

     +-Ss-+Pa+-TO-+Ii+Paf+--THi-+-Ce-+S(e)+I(e)+
     |    |  |    |  |   |      |    |    |    |
    *It   is glad to be  likely that Joe  will go

This linkage is also rejected, however, because the "THi" requires an "SFsi". If a sentence contains "filler"-only links (in a group with "it"), the "S" parse will be invalid; if it contains referential-only links, the "SF" parse will be invalid.
In this case, then, one linkage fails to meet the "SF" requirements, the other fails to meet the "S" requirements, and the sentence is rejected. Note that the division of the sentence into domains is essential here. Consider the sentence +SFsi+-Paf+THi-+--Ce-+S(e)+Ii(e)+Pa(e)+TO(e)+I(e)+ | | | | | | | | | | It is likely that John will be glad to go If the sentence were not divided into domains, the post-processing rules would see that there are unsubscripted "Pa", "TO" and "I" connectors, which are incompatible with "SFsi" (as well as a THi connector, incompatible with the "S" usage of "it"); thus the sentence would be rejected. With domains, however, the parser knows that "THi" goes with "It" and "glad to go" goes with "John", and everything is okay. The same thing applies also, of course, with relative clauses, subordinate clauses, and the like: "the weather was terrible, but we thought it was likely that John, who was an excellent sailor, would get the boat safely back to the shore". Such constructions would obviously wreak havoc with post-processing rules unless the clauses were clearly demarcated. The same applies with sentences with "to": +SFsi+--Paf--+-TOi-+I(e)-+ | | | | | it is important to go In this case, the "TOi" on "important" must begin a new domain, so that the post-processor knows that "go" does not relate to "filler-it" (with which it is incompatible). The tricky thing here is that "adjective+to" constructions do not always start new domains: in sentences like "He is ready to go", the infinitive clearly relates to the subject preceding the adjective. For such adjectives, we use unsubscripted TO. (Note that there are a few adjective-infinitive constructions, such as "certain" and "likely", that do not start new domains with "to", but which are compatible with "filler-it": "It is certain to be important to go". Such adjectives take "TOf".) 
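The two "filler-it" rules, and the reason domains matter, can be sketched in a few lines. This is an illustration only, not the post-processor itself: a linkage is assumed to arrive already divided into groups of link labels, the function names are hypothetical, and "PP" in the set of unsubscripted forms is inferred from the subscripted form "PPi".

```python
# Links that may appear only in a group that also has SFsi ("filler-only").
FILLER_ONLY = {"THi", "TOi"}
# Unsubscripted forms that may not share a group with SFsi.
# "PP" is an assumption, inferred from the subscripted form "PPi".
REFERENTIAL_ONLY = {"I", "PP", "Pg", "Pa", "Pv", "TO"}

def group_ok(group):
    """Apply both filler-it rules to one group of link labels."""
    labels = set(group)
    if labels & FILLER_ONLY and "SFsi" not in labels:
        return False  # e.g. "*John is likely that John will go"
    if "SFsi" in labels and labels & REFERENTIAL_ONLY:
        return False  # e.g. "*It is glad to be likely that ..."
    return True

def linkage_ok(groups):
    """A linkage passes only if every one of its groups passes."""
    return all(group_ok(g) for g in groups)

# "It is likely that John will be glad to go", divided into domains:
outer = ["SFsi", "Paf", "THi", "Ce"]
inner = ["S", "Ii", "Pa", "TO", "I"]
print(linkage_ok([outer, inner]))   # True
# The same links treated as one undivided group:
print(linkage_ok([outer + inner]))  # False
```

The second call fails only because the unsubscripted "Pa", "TO" and "I" of the embedded clause are seen alongside the SFsi, which is the situation the domain division is designed to avoid.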
_"There" as a subject: SFst and SFp_

SFst and SFp are used when "there" is acting as a subject:

    +-SFst+
    |     |
    There seems to be a problem.

    +-SFp-+
    |     |
    There seem to be problems.

"There" therefore has "SFst+ or SFp+"; verbs which can connect directly to "there" have "SFst-" or "SFp-", as appropriate. It may seem odd to distinguish between singular and plural "there", since the forms are the same; this relates to post-processing.

There are constraints on the way "there" can be used as a subject, which are enforced mainly through post-processing. This is similar to the case of "filler-it", only simpler. Like "filler-it", "there" is compatible only with certain verbs and adjectives:

    There seems to have been a problem
    There might appear to be a problem
    *There is eager to be a dog
    *There wants to try to be a dog

These are precisely the verbs and adjectives that are compatible with "filler-it". These verbs have already been specially subscripted; post-processing rules enforce that groups which contain unsubscripted connectors of these types may not also contain an "SFsi" (used with "filler-it"). Thus we simply make such connectors incompatible with "SFst" and "SFp" as well. In this way, we prohibit incorrect sentences like those above. See "filler-it". Note, however, that, unlike "it", "there" is always a "filler" subject; it therefore has no ordinary "S" connector.

There are other constraints on "there". The "filler-only" phrases used with "it" (likely that, important to) may not be used with "there": "*There is likely that John is coming". Moreover, when "there" is acting as a subject, there must be an object in the clause ("*There seems to be likely"), and there must be number agreement between the object and the verb ("*There seem to be a dog here"). This is again enforced through post-processing. Forms of the verb "be" have O*t+ connectors. "O-" connectors on singular nouns (used in ordinary direct-object links: see "O") are given "s" subscripts; those on plural nouns have "p" subscripts. When a form of "be" connects to an Os- or Op- connector on a noun, an Ost or Opt link is thus created:

     +-SFp-+            +--Ost-+
     |     |            |      |
    *There seem to have been a dog here

     +-SFst+             +--Ost-+
     |     |             |      |
     There seems to have been a dog here

Post-processing rules then dictate that every SFst must have an Ost in its group and every SFp must have an Opt in its group. In the process, we also ensure that phrases used with "filler-it", like "likely that", do not occur with "there"; it turns out that there is no way for a group to contain a direct object and a predicative adjectival phrase like "likely that" at the same time.

As discussed with "it", the use of post-processing domains is important here. Some nouns take "to" phrases as complements: "There was an effort to revive the bill". A verb like "revive" in this case is fine, because it does not relate to the subject "there" ("*There seems to have revived the bill"). But in order for the post-processing rules to know this, the TOn link from the noun ("an EFFORT TO revive") must begin a new group, letting the parser know that a new subject is in force.

_"Special Subjects": SFsx_

SFsx is used for a few phrases that can act as subjects under rather constrained circumstances: "that"-clause phrases, "to"-infinitive phrases, and "where-when-why" phrases.

    +-------SFsx------+
    +-Ce-+-S-+        |
    |    |   |        |
    That Joe is angry is not surprising

    +-----SFsx------+
    +-I-+           |
    |   |           |
    To  invite Bill would be a mistake

    +-----SFsx------+
    +--Cs-+         |
    |     |         |
    Where they went is a mystery

In each case, the subject phrase is a kind of phrase that occurs frequently as a verb complement. Thus we simply disjoin "SFsx" on the head word with whatever connectors are used for verb complements, and conjoin it with the connectors that build the rest of the phrase.
With "that", for example:

    that: Ce+ & (TH- or SFsx+)

As with gerunds, constructions of this kind seem to vary quite gradually in their grammaticality (see "Ss*g"). However, they are much rarer than gerunds, and seem much more constrained.

    Finishing college would make you more marketable
    ?To finish college would make you more marketable
    The graduating of Fred changes the situation
    ?That Fred graduated changes the situation

Most often, such special subject phrases are used with the verb "be" and a few other verbs and adjectives ("seem", "appear", "likely"). These are the same predicates that may be taken by "filler-it" and "there". We already have a system in place in post-processing for restricting the predicates used with "filler-it" and "there" (which use "SFsi" and "SFst", respectively). (See "SF: Filler-it".) For now, then, we simply apply those same constraints to "SFsx". (One difference is that we allow forms of "be" to take direct objects under these circumstances: "That John graduated is a problem".) This solution could probably be improved, however.

There are further constraints on the use of "special subjects". They may not occur in relative clauses (ex. 1), and they may not invert with their auxiliary (exx. 2-3).

    1. *The problem that John graduated is is very large
    2. *Is that John graduated a problem?
    3. *Is to graduate a good idea?

To solve the first problem, we give special subjects a restricted "clause" expression conjoined with their "SF+" connectors, omitting the normal "C-" connector. To solve the second problem, we simply do not give special subjects "SI-" connectors.

SFI connects "filler" subjects "it" and "there" to invertible verbs in questions with s-v inversion:

    +SFI+
    |   |
    Is  it    likely that John will go
    Is  there going to be a problem

Both the use of "it" and "there" and the use of subject-verb inversion are highly constrained by post-processing; see "SF" and "SI". However, there is nothing problematic about SFI; SFI is to SF precisely as SI is to S.
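The number-agreement rule for "there" described under "SFst and SFp" reduces to a lookup from the subject link to the object link its group must contain. The sketch below is only an illustration with hypothetical names; the plural object link is written "Opt", the label formed when O*t+ on "be" meets Op- on a plural noun.

```python
# Subject link on "there" -> object link required in the same group.
# "Opt" for the plural case is an assumption based on O*t+ meeting Op-.
REQUIRED_OBJECT = {"SFst": "Ost", "SFp": "Opt"}

def there_group_ok(group):
    """True if every SFst/SFp in the group has its matching object link."""
    labels = set(group)
    return all(obj in labels
               for subj, obj in REQUIRED_OBJECT.items()
               if subj in labels)

print(there_group_ok(["SFst", "Ost"]))  # True:  "There seems to have been a dog here"
print(there_group_ok(["SFp", "Ost"]))   # False: "*There seem to have been a dog here"
print(there_group_ok(["SFst", "Paf"]))  # False: "*There seems to be likely"
```

The last case shows how the same rule also enforces that some object is present at all.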
SI is used in subject-verb inversion:

    +---Pg----+
    +-SI-+    |
    |    |    |
    Is   John coming

        +----I----+
        +-SI-+    |
        |    |    |
    Who did  John see

Only verbs which may be inverted (i.e. modals and auxiliaries) have SI+ connectors. On such verbs, the SI+ is disjoined with the S-, and is conjoined with whatever complement connectors the verb may be used with when it is inverted. For example, forms of "have" may be inverted when they are taking a past participle, but not when they are taking a direct object. Thus they have

    have: (SI+ & PP+) or (S- & (O+ or TO+ or PP+));

This yields:

    They have finished it      Have they finished it
    They have dogs             *Have they dogs
    They have to go            *Have they to go

The use of SI and S connectors is highly constrained. In many situations, subject-verb inversion may not occur; in some situations, it must occur. In many cases, the enforcement of this involves post-processing.

_Questions requiring subject-verb inversion_

In some cases, s-v inversion _must_ occur: in object-type questions ("*Which dog you hit", "Which dog did you hit"), or with question words like "where" ("*Where you will go", "Where will you go"). This is enforced by post-processing. When a question word begins a sentence, it must make a Wq connection to the wall (or a Ws connection: see below). (There is no other way for the wall to connect to the sentence.) The Wq connection starts an 'm' domain, and is included in the domain. We then require that a group with a "Wq" contain some kind of SI connector. This prevents "*Which dog I hit".

This solution works equally well when the question contains an embedded clause: "What do you think he did?", "*What you think he did?" In both cases, "What" begins a group; in both cases, the link between "you" and its finite verb is in the outer group of this domain. Thus this group is required to contain an "SI" connector of some kind.
           +-----------Bsw(m(e))-----------+
           |    +--I(m)---+                |
    +-Wq(m)+    +SI(m)+   +Ce(m(e))+S(m(e))+
    |      |    |     |   |        |       |
    ||     What do    you think    he      did

A similar situation arises with adjectival questions:

            +-----AF------+
            |    +---I----+
    +-Wq+EAh+    +-SI-+   |
    |   |   |    |    |   |
    ||  How big  will it  be

Here again, s-v inversion must be enforced ("*How big it is"). We therefore give "how" a Wq- connector; this is then in the same group as whatever S or SI connector is in the outer group, and post-processing insists that it must be an SI.

S-V inversion must also be enforced with the question words "where", "when" and "how" (when used in this way): "Where(/when/how) will you go", "*Where you will go". This is done with simple connector logic. Unlike with object-type constructions, the question word here is not making a B connection to the rest of the sentence; it must find some other way to connect. For this purpose, the question words "where", "when", "why" and "how" have Q+ conjoined with Wq-; and Q- on verbs is conjoined with SI+, disjoined with S-.

    where: (Wq- & Q+)...
    have: ({Q-} & SI+ & ...) or ({C-} & S- & ...)

Thus we allow "Where have you gone"; we prevent "*Where you have gone".

_Cases where s-v inversion may not occur_

In many cases, s-v inversion is prohibited: in relative clauses ("*The dog who did you buy was black") and subordinate clauses ("*I left the party after did you see Fred"). No linkages are found for these; again, there is no way for the illegally-inverted segment to connect to the rest of the sentence. (The one case where s-v inversion may occur unwanted is in indirect questions; this problem is discussed below.)

_Simple s-v inversion ("yes-no") questions_

Questions may also be formed by simply inverting the subject and the auxiliary ("Are you coming", "Did you go"). In this case the question must make some kind of connection to the wall; we use the Q connector for this, giving the wall Q+.
The problem here is that no question word is used; thus there is no Wq present, indicating that s-v inversion may occur. Thus the Q+ on the wall is subscripted "Qd", and this is added to the list of connectors that permit s-v inversion. (See "Q".)

_Questions without s-v inversion_

Post-processing cannot insist on subject-verb inversion in all question-word questions. Subject-type questions do not contain s-v inversion:

    +-S-+
    |   |
    Who hit John

To allow this, question words like "who" and "which" may also make a Ws connection to the wall. Ws is exempted from the post-processing constraints applied to Wq; a group containing a Ws need not contain an SI connector (indeed, it may not; see below).

    who: (S+ or B+) or (Ws- or Wq-);
    which: (D**w+ or S+ or B+) or (Ws- or Wq-);

(The special subscript on the D+ of "which" will be explained below.) But in that case what will prevent illegal sentences being formed using the Ws: "*Who John hit", and "*Where John goes"? As described above, the second case is prevented by link logic; the sentence cannot form. For the first case, we require post-processing: we dictate that a group with a Ws may not contain a B#m. (We must also prevent the redundant parses resulting from Ws being used in "where/when/why" questions with s-v inversion. We do this by simply not giving these words Ws.)

In short: if there is a question word present in the main clause, there must either be a Wq, in which case s-v inversion is enforced, or there must be a Ws, in which case B#m connections are prohibited in post-processing and "where-when-why" questions do not form. The result is that with Ws all s-v inversion questions are prevented either at the linkage stage or in post-processing; Ws is therefore used with all and only non-s-v questions, Wq with all and only s-v questions.

There is one final problem: enforcing s-v inversion in object-type questions with embedded clauses.
                        +------Bsm(m(e))-------+
      +-Ws(m)+D**w(m(e))+   +S(m)+Ce(m)+S(m(e))+
      |      |          |   |    |     |       |
    * ||     Which      dog you  think you     hit

Here, "which" can use its Ws connection. The parser then "thinks" it is a subject-type question; it finds no s-v inversion, and no B connector in the outer group, therefore it accepts the sentence. To prevent this, we assign question-word determiners ("which" and "what") a D**w+ subscript; we then stipulate that a group with a Ws must contain a "D**w" connector. We further make the B#m link _not_ a restricted link; the D**w is restricted, however. (See "B: Questions: B#w, B#m".) In this case, then, the 'e' group spreads back to include the "D**w" connector; the group containing the Ws therefore no longer contains a D**w; and the sentence is rejected. (In effect, we want post-processing to "know" when something is a well-formed subject-type question. It only knows this is the case if a) there is an S in the outer group of the sentence, and b) the D connector of that subject is also in that group.)

(One more little annoyance: We've said that groups with Ws correspond to subject-type questions. We've insisted that they may not contain SI links; and, to avoid the false positive just discussed, we've insisted that they must contain a D**w. Now the only problem is, they sometimes don't contain a D**w; sometimes the question word is itself the subject: "Who is coming?". So we give question words specially subscripted "S**w" connectors; and then we say, a group with a Ws must contain _either_ a D**w or an S**w.)

_Indirect Questions_

It was mentioned that in most cases where s-v inversion is prohibited (relative clauses and the like), it is prevented at the linkage level. There is one case where it is not, however, namely indirect questions.

             +----B(s)-----+
             |   +--I(s)---+
        +-R--+   +SI(s)+   |
        |    |   |     |   |
    * I know who did   you hit

We prevent this by saying that an SI connector may only occur when Wq or Qd is present.
Wq connections can only form between a question word and the wall; Qd can only form between an auxiliary verb and the wall. Thus the above construction will be rejected. (With "where/when/how" questions, this is accomplished at the linkage level. QI- on these words is disjoined with Q+; thus there is no way for the question word to connect to an s-v-inverted clause.) The same applies to "how [adjective]" questions: "*I wonder how big is it" is therefore rejected.

_Other Uses of SI Besides Questions: SI*j_

There are a few other situations besides questions where SI is used. In subjunctive clauses, a nominative noun phrase is used (e.g., "he" rather than "him"). In such cases, however, it is most convenient to connect it to the preceding "that", rather than the following verb. Therefore we use SI here.

                +---I---+
      +---TS----+-SI-+  |
      |         |    |  |
    I suggested that he go

We also use SI in quotation constructions with certain verbs like "say":

        +-------CCq--------+---SI---+
        |                  |        |
    The President is busy, said the spokesman

As described above, the use of SI is usually tightly constrained by post-processing. Rather than try to adjust the post-processing constraints to handle these situations, we avoid the problem. We simply give the SI+ a "*j" subscript here; thus the post-processing constraints do not apply, and SI*j can be used anywhere that a linkage can be found. Verbs (and the word "that") that take subjunctive clauses thus carry "SI*j+".

TA is used to connect adjectives like "late" to month names:

              +---IN----+
              |  +--TA--+
              |  |      |
    We did it in early  December

See "DT" for more discussion of time expressions.

TD connects day-of-the-week words to time expressions like "morning", "afternoon", and "evening". The day-of-the-week word serves as the head of the expression, and makes an MV, CO or other connection to the rest of the sentence. See "DT" for more discussion of time expressions.

TH connects words that take "that [clause]" complements with the word "that".
These include verbs ("I assured him that"), nouns ("The idea that...") and adjectives ("We are certain that...").

      +----TH-----+
      |           |
    I assured him that I would finish the project
    The idea that we would hire John is preposterous
    I'm certain that he could do a good job

The word "that" therefore has "TH- & Ce+". The Ce+ connects to the subject of a following clause (see "C"). Verbs that can take "that"+clause have TH+ disjoined with their other complement connectors (O+, TO+, etc.). With many such verbs, the "that" is optional; they thus carry "TH+ or Ce+", allowing a direct connection to the subject of the subordinate clause. A few verbs can take object+"that" ("I told him that I was angry"); such verbs have "O+ & {TH+}". Adjectives like "certain" have "TH+" disjoined with other complement connectors. Nouns like "idea", "opinion", and "argument" have TH+ conjoined with {@M+} (used in prepositional phrases) but disjoined with {C+ & B+} (used in relative clauses); this is somewhat arbitrary.

THb connectors are used to connect forms of "be" to "that". The reason for this is that only certain nouns can serve as the subject in such cases: "The problem is that John is coming", "*The dog is that John is coming". To enforce this, we give such nouns "Ss*t+" connectors instead of the usual "Ss+". We then insist in post-processing that a group containing a THb must contain an Ss*t.

THi connectors are used with certain adjectives ("important") which may only take "that+clause" when "filler-it" is the subject. This distinction involves post-processing. See "SF: Filler-it" for further explanation.

TI is used for titles like "president" and "chairman", which can be used in certain circumstances without a determiner: after the preposition "as" or after verbs like "name" or "elect".
+--------------CO-------------+ +--TI--+-Mp---+ | | | | | As president of the company, it is my decision +----TI---+ | | He was named president of the company There are special dictionary entries for the no-determiner usage of these nouns: president.i chairman.i: {@M+} & (BI- or (Xd- & Xc+ & MX-) or TI-); As this expression shows, such nouns can also be complements of "be" ("He is president of the company"), using BI-; and they can also form comma expressions modifying a noun ("John Smith, president of the company, ..."), using MX-. TM is used to connect month names to day numbers: +-ON-+-TM-+ | | | It happened on January 21 See "DT" for more discussion of time expressions. TO connects verbs and adjectives which take infinitival complements to the word "to". +---TO--+ | | I tried to start the car We intend to be firm We are eager to do it The word "to" then makes an I link to an infinitive verb form. "To" therefore carries "TO- & I+". Verbs which take "to"+infinitive have "TO+" connectors. Many such verbs can also take other kinds of complements; simple objects (O+), clauses with or without "that" (TH+/ Ce+), indirect questions (QI+), and so on. Such connectors are disjoined with "TO+". In some cases, the verb may take no complement at all "We hesitated"; in others, some kind of complement is obligatory (*"We intend"). Some verbs take a direct object plus an infinitive ("We told him to go"); these verbs do _not_ use TO+, but rather TOo+. See "TOo". Some adjectives also take "TO+". This is only the case for usages where the same subject is implied before and after the adjective: "We were ready to go", but not "It is important to go". See "TOi". There are other situations involving "to"+infinitive where specially subscripted TO connectors are used: transitive adjectives ("He is easy to hit"; see "TOt"), indirect questions ("I wonder where to go"; see "TOn") and nouns that take to+infinitive ("We made an effort to go"; see "TOn"). 
The word "to" (and only this word) carries unsubscripted TO-; it is directly conjoined with I+.

The reason for the distinctions between TO, TOt, TOn, TOi, and TOo relates to post-processing. Recall that post-processing divides the links of a sentence into groups, corresponding roughly to subject-verb expressions. In some uses of infinitives - specifically, those described above which use unsubscripted TO - the infinitive simply continues the subject-verb expression that precedes it. Indeed, one can use a number of infinitives, all relating to the same subject: "I hope to be ready to consent to try to do it." In other cases (those using specially subscripted TO connectors), the infinitive implies a different subject from what precedes ("I told him to go", "He is easy to hit"). In these cases the link connecting "to" with the infinitive must start a new domain. This is particularly important in sentences like the following, where post-processing is used to enforce the use of certain verbs.

    There is certain to be a problem
    *There is eager to be a problem
    There might be an opportunity to refuse to do it
    *There might refuse to be an opportunity to do it

In addition, certain uses of infinitives are only permitted with "filler-it" as the subject: "*John is important to go". See "SF" for explanation of all this.

It is important to note that we allow any kind of sentence to take an infinitival phrase, meaning "in order to": "We bought some eggs to make cookies". See "MVi". We give this connector a cost of 1. Thus many sentences which involve incorrect uses of "filler"-only phrases ("John is important to do it") will receive valid linkages using "MVi".

_Infinitival complements of transitive verbs: TOo_

Some verbs can take an infinitival complement as well as a direct object. In such cases, TOo is used.

      +----TOo----+
      +---O---+   |
      |       |   |
    I advised him to go

It will be seen that every "TOo" on verbs is conjoined with a preceding "O+". (Usually the TOo+ is optional: "I advised him".)
Note that in such situations, the infinitival verb relates not to the main subject ("I" in this case), but to the direct object of the verb ("him"). This is unlike other infinitival complements of verbs ("I hesitated/wanted/tried to go") where the preceding subject remains in force; for these situations, we use "TO".

Like TO, TOo connects a verb to the word "to". The reason for distinguishing between them relates entirely to post-processing. TOo begins a new domain, thus telling post-processing that the infinitive verb relates not to the preceding subject but to a new subject. In addition, TOo links start a special kind of domain, "urfl domains". Ordinary domains contain everything that can be reached from the right end of the domain-starting link. "Urfl domains", however, also include whatever a) can be reached from the left end of the domain-starting link and b) is underneath that link. (Hence the name: "urfl" stands for "under root from left".) In the case below, then, an ordinary domain started by the "C" link would contain only the "D" link. An "urfl" domain started by the C link would contain the C, the D, and the B.

        +---C---+
    +-A-+-B-+   +-D-+
    |   |   |   |   |
    bla bla bla bla bla

This is useful in the case of verbs which take objects and infinitives. Recall that the logic of domains is that links relating to a single subject-verb expression should be contained in a group. In this case, the infinitive relates to the object of the preceding verb; thus we want the O link to be in the same group as the infinitive. TOo starts "x" domains, which are "urfl":

         +-TOo(x)-+
    +-S--+O(x)+   +I(x)+
    |    |    |   |    |
    I    told him to   go

We use this to control uses of "filler-it" and "there". "It" and "there" have OX connectors, which are used only in cases like this. The use of verbs and adjectives with "filler-it" and "there" is highly constrained; we have a complex apparatus for enforcing these rules in post-processing (see "SF").
The same rules apply when "it" and "there" are used with Obj+infinitive constructions. +-TOo(x)-+ +OX(x)+ +Ii(x)+Paf(x)+ | | | | | I expected it to be easy to use the program *I expected John to be easy to use the program Once we have the adjective/verb connectors in the same group as their implied subject, as in this case, we can simply apply the same constraints to "OXt" (for "there") that we apply for "SFst", and the same constraints to "OXi" that we apply to "SFsi", and everything else follows naturally. _Other Kinds of TO Connectors: TOn, TOi, TOt_ TOn is used with nouns that take infinitival complements: "The EFFORT TO finish the program was successful". (Only certain nouns can take such complements: "*The computer to finish the program was fast".) With such nouns, the TOn is conjoined with the @M+ (used in prepositional phrases and some other kinds of modifiers), disjoined with the "(C+ & B+)"; this is perhaps rather arbitrary. TOn is also used in indirect questions, to connect question words to "to": "I wonder WHERE TO go". (In object-type indirect questions, like "I wonder what to do next", no TOn connection is made; see "Ia".) Question words such as "where" therefore have "R- & (TOn+ or Cs+...)". TOi is used with adjectives that take infinitival complements but which take "filler-it" as the main subject. +-TOi-+ | | It is fun to try to beat the program In such cases, there are constraints on the verbs that may be taken by "it" ("*It tries to be fun to beat the program"), but not on the verbs in the infinitival phrase. This is enforced with post-processing; see "SF: Filler-it". TOn and TOi both start new domains. They are therefore like TOo, but unlike TO, which is used with adjectives and verbs that take infinitive complements but which does not start a new domain. The reason is that the infinitive and what follows usually imply a new subject, rather than relating to the subject that precedes. This is important, for uses of "filler-it" and "there". 
See "SF".

TOt is used for certain adjectives which take transitive infinitival constructions.

            +---B----+
            +-TOt+-I-+
            |    |   |
    John is easy to  hit

Such adjectives take "TOt+ & B+", disjoined with their other complement connectors (TO+, THi+, etc.).

TOf connectors are used to enforce the correct use of non-referential "it". "TOf" connectors are used with adjectives which do not start new domains, but which may also be used with "filler-it" and "there". See "SF: Filler-it".

TQ is used as the determiner connector for time expressions acting as fronted objects, as in "how many years did it last". See "OT".

TS is used in subjunctive constructions. It connects certain verbs that can take subjunctive clauses as complements - "suggest", "require" - to the word "that". Such verbs have TS+, disjoined with other complement connectors (TH+, TO+, etc.). The word "that" has "TS- & SI*j+ & I*j+". Thus "that" connects to a subject (all nouns and nominative pronouns have SI- connectors) and to an infinitive verb; the subject and verb do not connect to each other.

                +---I---+
      +---TS----+-SI-+  |
      |         |    |  |
    I suggested that he go

SI connectors are mainly used in questions where there is question inversion ("Did he go", "Where did he go", "Who did he see"). The use of SI is highly constrained in post-processing, and is usually only permitted in questions; to avoid this, we give the SI connectors here a special subscript. See "SI*j".

TS is also used for certain adjectives that take subjunctive: "It is IMPORTANT THAT he go". TSi is used here, since only "filler-it" may be used as a subject.

Notice that the "I+" connector on "that" is also subscripted: "I*j+". This connector starts an "urfl domain" (see "TOo").
The domain structure below is thus formed: +-I*j(x)----+ +--TS--+SIs*j(x)+ | | | | | I suggested that he go We thereby capture the intuition that the SI link and the I link in a subjunctive clause form part of a single subject-verb expression, distinct from the previous subject-verb expression. TY is used for certain idiomatic usages of year numbers. +--TY-+ +--ON-+-TD-+ +-Xd+-Xc+ | | | | | | I saw him on January 21 , 1990 \\\\\ See "DT" for more discussion of time expressions. U is a special connector used with nouns. In most cases, nouns have both a determiner requirement ("D-") and a main subject-object requirement ("S+ or O- or J-"). In a few cases, however, a single word appears to satisfy both requirements: +--U---+ | | What kind_of dog did you buy "Dog" would seem to be acting like the prepositional object of "of" here; but, unusually, it requires no determiner. Therefore we create a "U-" connector, disjoined with both its "D-" and its "S+ or O- or J-..." complex. Other uses include the following: +--U--+ | | We spend four dollars per student We spend four dollars a student For the latter case, we give the word "a" the expression "[[Mp- or Us+]]". Since this usage is rare compared to the other uses of "a", we make it stage 2. U*t is used in comparatives; see "MV: Comparatives". UN connects the words "until" and "since" to certain time phrases like "after (clause)" and "before (clause)". "Until" and "since" are the only words that can take such phrases as complements ("*We waited before/by/in after the movie"); hence this special connector. V is used for attaching various verbs to idiomatic expressions that may be non-adjacent. Each verb has its own subscript. +---V----+ | | I took him for granted (Vt) I held him responsible (Vh) He did nothing but complain (Vd) W is used to attach main clauses to the wall.
Almost all kinds of main clauses - declaratives, most questions (object-type, subject-type, where/when/why, and prepositional), and imperatives - use a W of some kind to attach to the wall. The only exception is "yes-no" questions, which attach to the wall with Q. See "Q". +---W------+ | | ///// The dog ran (Wd) ///// Who did you hit (Wq) ///// Who is coming (Ws) ///// To whom did you speak (Wj) ///// Go away (Wi) Note that the wall is automatically inserted at the beginning of every sentence, and is then treated like a normal word; by the connectivity rule, therefore, it _must_ make some kind of connection to the sentence. The wall thus has "W+ or Q+". W is also used to attach clauses back to coordinating conjunctions in declarative sentences; coordinating conjunctions thus have "CC- & (Wd+ or Wq+ or Ws+)". CC is used to make a link back to the subject of the previous main clause. _Declarative Sentences: Wd_ Wd is used in ordinary declarative sentences, to connect the main clause back to the wall (or to a previous coordinating conjunction). Nouns carry Wd-, optionally conjoined with their S+ connectors. Wd- on nouns is directly disjoined with C- (used in dependent clauses) and R- (used in some relative clauses); see "C-". dog: (({@CO-} & Wd-) or ({@CO-} & C-) or R-) & S+ _Questions: Wq, Ws, Wj_ Wq, Ws and Wj are used to connect many types of questions to the wall: subject questions (Ws), object questions (Wq), where/when/why questions (Wq), adjectival questions (Wq), and prepositional questions (Wj). Each of these connector types interacts heavily with post-processing. See "SI" for an explanation of Wq and Ws; see "JQ" for an explanation of Wj. _Imperatives: Wi_ Wi is used to connect imperatives to the wall. +-W-+ | | ///// Go away Imperative verb forms have "Wi-", conjoined with their complement connectors. Since the imperative verb form is always the same as the infinitive form (and the plural, in every case except "be"), the same expression can be used.
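The choice of wall connector described in this entry can be summarised as a small lookup table. The following Python sketch is only an illustration, not part of the parser: the clause-type labels and the names WALL_CONNECTOR and wall_connector are hypothetical, while the connector names on the right are the ones given in the text.

```python
# Hypothetical lookup table. The keys are illustrative labels invented for
# this sketch; the values are the connector names described in the text.
WALL_CONNECTOR = {
    "declarative": "Wd",
    "object_question": "Wq",
    "where_when_why_question": "Wq",
    "adjectival_question": "Wq",
    "subject_question": "Ws",
    "prepositional_question": "Wj",
    "imperative": "Wi",
    "yes_no_question": "Q",  # the one clause type that does not use W
}


def wall_connector(clause_type):
    """Return the connector that attaches this clause type to the wall."""
    return WALL_CONNECTOR[clause_type]


# Example sentences taken from the diagrams in this entry.
examples = [
    ("The dog ran", "declarative"),
    ("Who did you hit", "object_question"),
    ("Who is coming", "subject_question"),
    ("To whom did you speak", "prepositional_question"),
    ("Go away", "imperative"),
]

for sentence, clause_type in examples:
    print(sentence, "->", wall_connector(clause_type))
```

The table only records which connector is used; it does not attempt to classify a sentence, which in the parser falls out of the linkage itself.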
_Coordinating Conjunctions_ There are a number of words that serve to link clauses together: coordinating conjunctions like "and" and "but", and subordinating conjunctions like "after" and "because". +--CC----+-Wd+ | | | John left but he returned later +-MVs+-Cs+ | | | John left after I saw you Note that subordinating and coordinating conjunctions use very different linking structures. First of all, both the left-pointing and right-pointing connectors on the conjunctions are different; "but" has "CC- & Wd+", "after" has "MVs- & Cs+". Secondly, coordinating conjunctions connect back to the subject of the previous clause, subordinating conjunctions to the verb. There are several reasons for making these distinctions. First of all, coordinating conjunctions may not be used in relative clauses: *The man I tried to hit but John stopped me is here *The man I tried to stop John but he hit is here *The man I hit but John comforted is here (There are other constraints on relative clauses: the main noun of a relative clause may not link to something inside an embedded clause. We handle this using Ce and Cs; see "Ce".) So, we need to prevent these constructions. Coordinating conjunctions have another related property. They may be used to connect clauses in sequence, like subordinating conjunctions. But whereas subordinating conjunctions seem to link in a nested way, with each modifying the last, coordinating conjunctions seem to "leap" over any preceding subordinating conjunctions: +------------+-C-+-S-+------+-C---+--S--+ | | | | | | | 1. John screamed when I arrived after Sue left (seems right) +---- ? ---+ +------------+-C-+-S-+ +-W---+--S--+ | | | | | | | 2. John screamed when I arrived but Sue left (seems wrong) +-------------CC------------+ +------------+-C-+-S-+ +-C---+--S--+ | | | | | | | 3. John screamed when I arrived but Sue left (seems right) We handle this in the following way. 
In the first place, coordinating conjunctions link to the left not with MVs-, like other conjunctions, but with CC-. and but: CC- & W+; dog: {R- or C- or (W- & {CC+})} & S+...; Note that subject nouns may make a CC connection to the right, but only if a W is being made to the left (i.e., if the noun is a subject of a main clause), not if a C is being made. In other words, while subordinating conjunctions connect to the main verb of the nearest clause to the left, coordinating conjunctions connect to the _subject_ of the nearest _main_ clause to the left. Thus ex. 3 above is allowed, but ex. 2 is prevented. The problem with relative clauses is solved also. In relative clauses, the main subject of the relative always makes either a C- or an R- to the left, and neither one is conjoined with CC+; so no coordinating conjunctions can appear. Note that the above expressions also allow coordinating conjunctions to link clauses in sequence: +-------CC---+--W-+---CC--+-W-+ | | | | | John screamed and Fred ran but Dave cried Coordinating conjunctions may also connect directly to the wall: "And John screamed". Thus they carry a "Wc" connector. Furthermore, a coordinating conjunction may link to a following question, rather than to a declarative clause. They may not, however, link from a question to a declarative clause: I know you don't like Joe, but why did you send him that nasty note *Why did you send Joe that nasty note, but I know you don't like him Thus we give such conjunctions the following: (CC- or Wc-) & (Wd+ or Wq+ or Ws+ or Qd+); Another reason for distinguishing between W and C is that certain openers like participle openers may be used in main clauses but not dependent ones; see "COp". WN is used to connect "when" phrases to time-nouns like "year" and "day". In such cases, WN is acting like a noun-modifier; such modifiers would normally use M-. However, "when" cannot normally be used in this way, as the second sentence shows; only with time nouns.
Thus time-nouns have an option WN+ (along with an optional @M+); "when" has WN- conjoined with Cs+, connecting to a subordinate clause. +--WN-+ | | The year when I lived in England was wonderful *The school when I lived in England was wonderful WR is used to connect the word "where" to a few verbs like "put". "Put" usually requires an object plus a prepositional phrase as a complement; but this prepositional phrase requirement can be satisfied by the preceding word "where" in questions. *I put it +----WR------+-O-+ | | | Where did you put it X is used to connect punctuation symbols to words. Xc connects a word to a comma to the right; Xd connects a word to a comma to the left. Xx is used with colons and semi-colons. _Comma Phrases_ A wide variety of words and phrases can be used with commas on either side. Some are noun-modifiers, using MX to connect to a previous noun. Some are verb modifiers, using MV to connect to a previous verb or E to connect to a following one. Some are "openers", using CO to connect to a following clause subject. And some are quoting expressions ("he said"), which can be inserted at various places in the sentence. In each of these cases, the head word of the expression must link to commas on either side. +-MX---+ | +Xd-+-----Xc-------+ | | | | 1. John , a doctor , is here 2. John , with his mother , is here 3. John , carrying the dog , left the room 4. John , who you know , is here 5. John , to whom I spoke , is here +--MV--+ | +Xd-+-----Xc-------+ | | | | 6. We left , carrying the dog , and Fred followed 7. We left , quietly , and Fred followed 8. We left , with John , and Fred followed 9. We left , when we saw him , and Fred followed +-------CO-------+ +Xd-+-----Xc-------+ | | | | | 10. He said that , after the party , he had gone home 11. He said that , eventually , he had gone home 12.
He said that , after he saw us , he had gone home +---Xd-+--Xc-------+ | | | He left , he said , and Joe followed (CCq-) John , he said , left the party (Eq+) 14. After the party , he said , he left (COq+) In each case, the head word must connect most closely to the words within the comma expression; then to the commas themselves; then to any words outside. So, for example, in ex. 2 above, prepositions have with: J+ & (.... or (Xc+ & Xd- & MX-)) ^ ^ ^ ^ link to prep. links to commas link to previous object noun In some cases, the commas are obligatory (such as noun modifiers of nouns, like ex. 1: "*John a doctor is here"). In other cases the commas are essentially optional, but for various reasons, we decided to use different connectors for with-comma and without-comma expressions (for example, prepositional phrases modifying nouns take M without commas, MX with commas). In other cases, the commas are optional and the same connector is used with or without; this is the case, for example, with manner adverbs modifying verbs. Such adverbs therefore take "{Xc+ & Xd-} & MVa-". One special case must be discussed. On opener phrases, the following comma is optional; the preceding comma is also optional, and is only permitted when there is a following comma. He said that after the party he went home He said that after the party, he went home He said that, after the party, he went home *He said that, after the party he went home Openers therefore take "{{Xd-} & Xc+} & CO+". _Comma Phrases at the Beginning and End of Sentences_ It was said that, with the exception of openers, expressions generally require either two commas or none; it is not permitted to have only a comma at the beginning or only one at the end. One very important exception to all this must be mentioned, namely, comma expressions at the beginnings and endings of sentences. The case of phrases at the beginning has been dealt with, since these are always openers; there the beginning comma may be omitted.
With phrases at the end, however, the linkage expressions proposed above will require a comma at the end of the sentence, and this is clearly wrong. A comma is not only not required; it is incorrect. (The same applies to openers; the expressions provided will allow a comma at the very beginning of the sentence.) As described above, then, the system will accept the incorrect sentences 1 and 2 below, and reject the correct sentence 3. +-Xd+-----Xc-------+ | | | 1. *He left , carrying the dog , 2. * , After the party , he left 3. He left , carrying the dog Regarding the false positives 1 and 2, we simply allow these. Rules of the form "word X is not permitted at the beginning or end of a sentence" do not seem to arise in syntax generally, only in punctuation; therefore we have not considered it important to accommodate them. (Clearly, a system for weeding out sentences beginning or ending with commas could easily be devised.) What is important is to allow sentence 3 above. This we do by installing a "right-hand wall". The right-hand wall has an "Xc-" connector; this may satisfy the demand of a comma-phrase head-word for a following comma, when the phrase occurs at the end of the sentence. Ex. 3 above will therefore be accepted: +-Xd+-----Xc-------+ | | | He left , carrying the dog \\\\\ (This punctuation usage is the only function of the right-hand wall. In cases where the punctuation function is not needed, we give the right-hand wall an "RW-", and the left-hand wall an "RW+"; thus the two walls connect to each other, preserving connectivity.) In many cases, comma phrases can be used in sequence. These can either be cases of one comma phrase nested inside another (ex. 1 below), or one comma phrase following another, both modifying the same previous word (ex.2) or the same following word (ex.3); or one comma phrase modifying a previous word, and another modifying a following word (ex.4). 
+--------------------Xc----+ +------MV-+ +---MX-----+ | | +-Xd--+ | +---Xd--+-Xc+ | | | | | | | | | 1.He left , carrying the dog , a poodle , , and Fred followed +---------MX--------+ +---MX---+ | | +-Xd--+-Xc-+ +Xd+---Xc-----+ | | | | | | | 2. John , a doctor , , who you met , is here 3. After he saw John, Fred said, he left the party 4. Although I liked my doctor, Mr. Smith, later, I chose a different one The structures shown for exx. 1 and 2 are those that would be required given the expressions above. However, this is obviously wrong. Each comma requires a phrase before and afterwards; but multiple commas in a row are not required, indeed (again) they are not permitted. What seems to happen is that a single comma fulfills the demands of more than one phrase. In ex. 1, the final comma fulfills the Xc demand of both phrases; in ex. 2, the comma at the end fulfills the Xc demand of the previous phrase and the Xd demand of the following one. We handle this by giving the comma the following expression: COMMA: {@Xc-} & (Xc- or Xd+) Now, as well as making an obligatory Xc or Xd connection, it can make any number of additional Xc connections as well. For exx. 1 and 2 above, then, this yields: +----------------Xc-------+ +------MV-+ +---MX---+ | | +-Xd--+ | +-Xd--+--Xc--+ | | | | | | | He left , carrying the dog , a poodle , and Fred followed +---------MX-----+ +---MX---+ | | +-Xd--+-Xc-+Xd+---Xc-----+ | | | | | | John , a doctor , who you met , is here There are three problems here. First, when a comma is serving the demands for more than one phrase, it is creating a cycle; it is linking two phrases together that are indirectly linked in some other way. Cycles are dangerous; they may allow non-cyclic expressions to form where they are not wanted (for example, "John, a doctor I saw Fred, a doctor, is here").
To prevent these, we modify the comma expression slightly: COMMA: ({@Xca-} & (Xd+ or Xc-)) In post-processing, we then add "Xca" to the "must be connected without" list. This means, in effect, that any time an Xca connection is made (i.e., any time a single comma is being used in more than one phrase), the sentence must be connected even without that Xca; this is another way of saying that Xca may only be used in cycles. It is clear that the function of fulfilling several Xc+ demands may also be performed by the right-hand wall: "We left, carrying the dog, a poodle". Thus the right-hand wall is given not just Xc-, but "{@Xca-} & Xc-". A second problem involves post-processing. The groups of links formed in post-processing are supposed to correspond to clauses. In some cases it seems appropriate for comma phrases to form groups; for example, in relative clauses ("John , who you met , is here"). Again, however, a problem arises with multiple comma phrases. Consider the following sentence. Here, if left unchecked, the domain started by the first MX connector will spread via the second comma to the second comma phrase ("a doctor"), which is clearly not part of the relative clause, but rather a direct modifier of the main noun. What's worse, it will then spread back through the second MX connector and then through the rest of the sentence. +-----------------S(r)------------+ +-----------MX(r)--------+ | | +---Xca(r)--+ | | +MX(r)+--B(r)--+ | | | | +Xd+ +S(r)-+ +Xd(r)-+Xc(r)+ | | | | | | | | | | John , who you met , a doctor , is here We solve this problem by creating an "ignore_these_links" list in post-processing. Links in this list are simply ignored by post-processing; that is, no domains are traced through them. We add "Xca" to this list. Then, in the above sentence, the "r" domain started by the first MX will not spread through the Xca to the second comma phrase; it will be confined to the relative clause. A third problem involves false positives.
Return to the cases discussed above, involving consecutive commas: "John , who you met , , a doctor , is here". We have allowed for the correct version of this sentence, in which the functions of the consecutive commas are all served by a single comma. But we have not prohibited the consecutive-comma sentence, which is of course incorrect. Here again, we have not worried much about this false positive, since the kind of rule involved ("Do not allow more than one X in a row") seems to arise only in punctuation; presumably it could easily be dealt with if it were important to do so. _Xx: Colons and Semi-Colons_ Xx is used with colons and semi-colons. Unlike comma phrases - where a word is usually the head of the phrase, with the comma(s) acting as appendages - with colons and semi-colons, the two phrases are attached together _through_ the punctuation symbol. More specifically, the "Xx" attaches the colon or semi-colon back to the wall. +-----Xx-----+ +-Wd--+ +Wd+ | | | | ///// I left ; Joe stayed +----------Xx--------+ +-Wd--+ +Wd+ | | | | ///// I have an idea : we should go that John should go plastics With semi-colons, the following phrase must be a clause; semi-colons thus carry "Xx- & W+". With colons, the following phrase can be a clause, "to-" or "that-" phrase, or noun phrase; thus colons have "Xx- & (W+ or J+ or TH+ or TO+)". Y is used in certain idiomatic time and place expressions. It connects quantity expressions to the head word of the expression (i.e., the word that connects to the rest of the sentence): +------MVp-------+ | +--Y--+ | | | We swam three miles away We swam three miles from the shore We swam three weeks ago We swam three hours after we saw you Such expressions can usually be used as openers, verb-modifying prepositions, or noun-modifying prepositions: "We swam three weeks ago", "Three weeks ago we swam", "The party three weeks ago was great". The word carrying Y- is thus generally conjoined with (MVp- or CO+ or Mp-).
Words that carry Y+ usually also carry {ND-}; this connects back to a number expression. In some cases the ND- is optional ("We swam after we saw you"); in other cases it is obligatory (*"We swam ago"). In some cases, the word carrying Y+ must take a prepositional object or clause as well: "*We swam three hours after"; such words therefore have "(J+ or Cs+) & Y+". In other cases no connection may be made: "*We swam three miles away the shore." Yt is used in time expressions. Words like "ago" and "after" have "Yt-"; "days", "weeks", etc, have Yt+, as well as idiomatic expressions like "a_long_time". Yd is used in distance expressions: "miles" and "feet" carry Yd+, prepositions like "from" and "behind" carry Yd-. Ya is used for a few adjectives that can take spatial expressions as modifiers ("he is three feet tall"). Ye is used in the expression "We swim every three weeks". Here "every" is treated as the head; but in this case the number expression follows the head instead of preceding it. Yx is used in expressions like "I did it three times a week". Here the "a week" expression is treated as the head. YP is used in possessive constructions to connect plural noun forms ending in s to "'". See also YS. +-YP-+--D-+ | | | The students ' wishes have been neglected. YS connects nouns to the possessive suffix "'s". "'s" then acts as a determiner, making a D connection to a noun. +-YS-+-D-+ | | | John 's dog is black Plural nouns take only an "'"; "The students' dogs are black". This uses a YP connector. The dictionary thus includes the following: 's: YS- & D+; ': YP- & D+; A possessive pronoun ("John's") can also act as a complete noun phrase: "John's is black". Thus the expressions "'s" and "'" also carry the main expression carried by nouns "(S+ or O- etc.)". This usage is rare, however; we therefore make it a stage 2 usage. All proper and common singular nouns have YS+; all plural nouns have YP+. 
(YS is also used for plural forms that don't end in s: "Men's legs are longer than women's".) Nouns which form possessive determiners in this way - whether singular or plural - take their own determiners and adjectives just like ordinary nouns (ex. 1-6 below). However, they may not take post-modifiers like relative clauses (ex. 7-8); nor, of course, may they act as subjects or objects. Thus YS+ and YP+ on nouns must be conjoined with "@A- & D+", disjoined with everything else. 1.The rich student was tall 2.The rich student's car was black 3.*Student was tall 4.*Student's car was black 5.Students are tall 6.Students' cars are black 7.The man I met was tall 8.*The man I met's car was black In a few weird cases, a noun can take both a possessive determiner and another determiner; these cannot be handled ("A girls' school" is rejected). Z connects the preposition "as" to certain verbs. +-----CO--+ +--Z--+ | | | | As I said, I like broccoli Some verbs which take clausal complements - "threaten", "say", "mention" - can also be used in constructions like the above; in such cases, the complement requirement of the verb - which may be mandatory ("*I said") - is satisfied. Such verbs therefore have "Z-" disjoined with their other complement connectors (TH+, Ce+, etc.). The word "as" has "Z+" disjoined with "Cs+" (used in conjunction phrases) and J+ (used for prepositional objects). Z+ is conjoined with "CO+ or MVs-"; this allows such phrases as closers as well: "I like broccoli, as I said". as.p: (Z+ or J+ or Cs+) & (CO+ or MVs-); The Z connector can also be used in comparative expressions: He was not as late as I expected ?He was not as late as I said "As.z" (the second "as" in a comparative expression) therefore has Z+ as well. However, some verbs can be used in this construction only as openers and closers, not as comparatives (such as "said" above). Therefore we subscript the connector on "as.z" Zc+; verbs which cannot take Z in comparatives have Zx-.
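The way an expression such as the one given above for "as.p" breaks down into individual connector combinations can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the parser's own code: the helper names (conn, either, both, optional) are hypothetical, and the sketch ignores connector ordering, costs and stage-2 bracketing.

```python
from itertools import product

# Each expression is represented as a list of alternatives; each alternative
# is a tuple of connector names that must all be satisfied together.


def conn(name):
    """A single connector, e.g. conn("Z+")."""
    return [(name,)]


def either(*alternatives):
    """Disjunction ("or"): any one of the alternatives may be chosen."""
    combined = []
    for alternative in alternatives:
        combined.extend(alternative)
    return combined


def both(*parts):
    """Conjunction ("&"): one choice from every part must be satisfied."""
    return [sum(choice, ()) for choice in product(*parts)]


def optional(part):
    """Curly braces ("{...}"): the part may be used or left out entirely."""
    return [()] + part


# as.p: (Z+ or J+ or Cs+) & (CO+ or MVs-);
as_p = both(
    either(conn("Z+"), conn("J+"), conn("Cs+")),
    either(conn("CO+"), conn("MVs-")),
)

for combination in as_p:
    print(" & ".join(combination))
```

Each printed line is one way the word could be used in a sentence; for "as.p" there are six, covering the opener and closer uses of each of the three right-hand alternatives.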
"As" phrases of this kind can also be used without a subject: He likes broccoli, as was expected He earns as much as was expected Here, the phrase "was expected" has two demands: "was" demands a subject, and "expected" demands a complement. "As" must fulfill both of these demands. So "as" has a subject connector, optionally conjoined with the Z+. as.z: ({SFsic+} & Zc+ +----Z-----+ +SF-+--Pv--+ | | | he earns as much as was expected "Than" may also be used in this way. The use of comparatives is highly constrained by post-processing: see "MVx". The "Zc" link here starts a domain, which includes only the "than" or "as" phrase. Note also that the subject connector here is an "SF", not an "S", and that it is subscripted. In post-processing, we then enforce the same restrictions on the verbs that may occur with "SFsic" that we enforce with the filler subjects "it" and "there". Thus we allow "...than seemed to be expected", but prohibit "...than wanted to be expected", etc. See "SF: Filler-it" for more explanation.