An Explanation of the Link Parser Output
If you type in:

    When your back is is against the whiteboard, I'll be back to back you up
You will see the following output from the parser:

    No complete linkages found.
    ++++Time 2.13 seconds (21.62 total)
    Found 14 linkages (14 with no P.P. violations) at null count 1
    Linkage 1, cost vector = (UNUSED=1 DIS=1 AND=0 LEN=46)

      +--------------------------CO*s-------------------------+
      +-------------------------Xc-------------------------+  |
      +----Cs----+                +-------Jp------+        |  |        +----MVi--
      |    +--Ds-+----Ss----+--Pp-+     +---D*u---+        |  +Sp*+-Ix-+--K--+
      |    |     |          |     |     |         |        |  |   |    |     |
    when your back.n [is] is.v against the whiteboard[?].n , I.p 'll be.v back.k

    -+    +----K---+
     +--I-+-Ox-+   |
     |    |    |   |
    to back.v you up

    Constituent tree:

    (S (SBAR (WHADVP When) (S (NP your back) is (VP is (PP against (NP the whiteboard))))) , (S (NP I) (VP 'll (VP be (PRT back) (S (VP to (VP back (NP you) (PRT up))))))))
The linkage and the word annotations
For a moment, just consider everything after the line beginning "Linkage 1". The information about the sentence structure is conveyed in this linkage diagram. The horizontal dashed lines represent links between words. These are labeled with a link type. For example, in the above diagram there is an Ss link between back.n and is.v. You can click on a link label for more information about that link type. (The reader is referred to a more detailed explanation of link grammars.)
The words of the sentence are shown below the linkage diagram. They are annotated by the parser in several ways. First of all, if the word you typed was not in the dictionary (like "whiteboard" above), it is followed by [?], followed by one of .n, .v, .a, or .e, depending on whether the word is being interpreted as a noun, verb, adjective, or adverb. The same kinds of suffixes are used for words that actually are in the dictionary. For example, back.n indicates that this word is being used as a noun. In this sentence, the parser successfully identifies three different tokens of the same word as having different roles, depending on the context: back.n (noun), back.k (a particle, labeled PRT in the constituent tree), and back.v (verb).
If a word is enclosed in square brackets (like [is] above), this indicates that the parser has been forced to effectively delete this word (by using null links) in order to find a grammatical interpretation of the sentence.
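The annotation conventions above can be read mechanically. The following is a minimal sketch, not part of the parser itself: the function name describe_token and the returned dictionary layout are invented for illustration, and only the four suffixes named in the text are given meanings. Other suffixes seen in the sample output (such as .k or .p) are passed through unchanged.

```python
import re

# Suffix meanings as listed in the text; any other suffix is reported as-is
# rather than guessed at.
SUFFIXES = {"n": "noun", "v": "verb", "a": "adjective", "e": "adverb"}

# word, then an optional "[?]" marker, then an optional one-letter suffix
TOKEN_RE = re.compile(r"^(?P<word>.+?)(?P<unknown>\[\?\])?(?:\.(?P<tag>[a-z]))?$")

def describe_token(token):
    """Describe one annotated word from the parser's word line (hypothetical helper)."""
    if token.startswith("[") and token.endswith("]") and len(token) > 2:
        # Square brackets around the whole word: the parser skipped this word.
        return {"word": token[1:-1], "skipped": True, "unknown": False, "tag": None}
    m = TOKEN_RE.match(token)
    tag = m.group("tag")
    return {
        "word": m.group("word"),
        "skipped": False,
        "unknown": m.group("unknown") is not None,  # "[?]" means not in the dictionary
        "tag": SUFFIXES.get(tag, tag),
    }

for tok in "when your back.n [is] is.v against the whiteboard[?].n".split():
    print(tok, describe_token(tok))
```

Applied to the sample word line, this reports [is] as skipped and whiteboard[?].n as an unknown word interpreted as a noun.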
The symbol ///// is known as the wall. This is an artificial word inserted at the beginning and end of every sentence before it is parsed. You can click on it for more explanation.
The constituent tree
The parser also shows a constituent representation of the sentence, labeling noun phrases, verb phrases, clauses ("S"), etc. The constituent representation is derived from the linkage. The constituents are indented to show their level of embedding in the tree. For example, the NP "your back" is at the same level as the VP "is against the whiteboard".
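To make the notion of embedding level concrete, here is a small sketch that reads the bracketed notation into nested lists and reports the depth of each constituent. It is an illustration only: parse_constituents and labels_with_depth are hypothetical helper names, and the code assumes the brackets are balanced, as they are in the parser's output.

```python
def parse_constituents(text):
    """Parse a bracketed tree such as '(NP your back)' into nested lists.

    Each constituent becomes [label, child, ...], where a child is either
    a word (str) or another constituent (list).
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        node = [tokens[pos + 1]]      # the label follows the opening bracket
        pos += 2
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(read())   # nested constituent
            else:
                node.append(tokens[pos])
                pos += 1
        pos += 1                      # consume the closing bracket
        return node

    return read()

def labels_with_depth(node, depth=0):
    """Yield (depth, label) for every constituent, outermost first."""
    yield depth, node[0]
    for child in node[1:]:
        if isinstance(child, list):
            yield from labels_with_depth(child, depth + 1)

clause = "(S (NP your back) is (VP is (PP against (NP the whiteboard))))"
for depth, label in labels_with_depth(parse_constituents(clause)):
    print("  " * depth + label)
```

In the clause taken from the sample output, the NP and the VP both come out at depth 1, directly under S.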
Phases of parsing
The process of parsing a sentence proceeds in several phases. In the first phase, the parser attempts to find a "complete" linkage for the sentence, in which all the words are linked together. If the parser cannot interpret the sentence, it begins to relax this constraint. The message "null count 1" indicates that the parser is allowing one word to be ignored, or more generally, that it is allowing the sentence to be partitioned into two disconnected components. Higher null counts allow more words to be ignored and more disconnected components. The parser successively tries higher and higher null counts. Once the parser finds a valid linkage with some null count, the process terminates. The Allow null links button controls this feature.
If the parser still hasn't found a linkage after approximately 20 seconds, it enters "panic mode". In this mode, the parser tries again to find a linkage for the sentence under more limited circumstances. It can only consider links of a certain length; it can also allow "islands" of disconnected words. With panic mode, the parser can usually find a linkage for even quite long sentences within a reasonable time.
The parser does not consider a sentence to be "grammatical" just because it finds a valid linkage for that sentence. The linkage must also pass a post-processing phase. The parser indicates the post-processing status with messages like "Found 2 linkages (1 with no P.P. violations)". If all of the linkages at one stage have post-processing violations, the parser continues looking for a satisfactory linkage in the next phase.
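The control flow described in this section can be sketched as a simple loop over null counts. This is only an outline under stated assumptions: find_linkages and passes_post_processing are hypothetical callbacks standing in for the real parser, the upper bound on the null count is an arbitrary choice, and the time limit and panic mode are left out entirely.

```python
def find_acceptable_linkages(sentence, find_linkages, passes_post_processing):
    """Return (null_count, linkages) for the first stage with an acceptable linkage.

    Both callbacks are hypothetical stand-ins for the real parser:
      find_linkages(sentence, null_count) -> list of linkages
      passes_post_processing(linkage)     -> bool
    """
    max_null_count = len(sentence.split())   # assumed bound: one null per word
    for null_count in range(max_null_count + 1):
        linkages = find_linkages(sentence, null_count)
        good = [lk for lk in linkages if passes_post_processing(lk)]
        if good:
            # At least one linkage survives post-processing: stop here.
            return null_count, good
        # Otherwise every linkage at this stage failed; relax further.
    return None, []

# Toy stand-ins: nothing at null count 0, only a violating linkage at 1.
fake_results = {0: [], 1: ["violates"], 2: ["violates", "clean"]}
result = find_acceptable_linkages(
    "a b c",
    lambda sentence, n: fake_results.get(n, []),
    lambda linkage: linkage == "clean",
)
print(result)
```

With the toy stand-ins, the loop skips null counts 0 and 1 and stops at null count 2, the first stage where some linkage passes post-processing.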
If there is more than one satisfactory linkage, the parser orders them according to certain simple heuristics. It either shows the first linkage in its ordering, or all of them, depending on the value of the show all linkages button. The cost vector determines the ordering used. Three of its components are described here; they appear to correspond to the entries labeled DIS, AND, and LEN in the output above. The first of these (most significant in the ordering) is the total cost of all the usages of words in the linkage. The dictionary assigns different costs to usages of a word; while most usages have cost 0, some have non-zero cost. The second has to do with the relative size of components combined by conjunctions. The third is the total length of all links in the linkage. The output above also shows a leading UNUSED entry, which appears to count the words left unlinked (1 in the example, matching the null count).
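Ordering by a cost vector in which earlier components dominate is a lexicographic comparison. The sketch below assumes the "cost vector = (...)" line format shown in the sample output and compares the components in the order they are printed; the helper names and the example values in the list are invented for illustration and are not real parser output.

```python
import re

def parse_cost_vector(line):
    """Extract the NAME=value pairs from a 'cost vector = (...)' line, in order."""
    inside = re.search(r"cost vector = \(([^)]*)\)", line).group(1)
    return [(name, int(value)) for name, value in re.findall(r"(\w+)=(-?\d+)", inside)]

def sort_by_cost(cost_vectors):
    """Sort cost vectors lexicographically: earlier components dominate."""
    return sorted(cost_vectors, key=lambda cv: tuple(value for _, value in cv))

sample = parse_cost_vector("Linkage 1, cost vector = (UNUSED=1 DIS=1 AND=0 LEN=46)")
print(sample)

# Made-up vectors: the second wins on DIS even though its LEN is larger.
candidates = [
    [("DIS", 1), ("AND", 0), ("LEN", 20)],
    [("DIS", 0), ("AND", 0), ("LEN", 35)],
    [("DIS", 1), ("AND", 0), ("LEN", 18)],
]
ordered = sort_by_cost(candidates)
print(ordered[0])
```

Python compares tuples element by element, so a lower value in an earlier component always outranks any difference in a later one.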
When a sentence makes use of a conjunction like "and", more than one linkage diagram is displayed. (Actually, this doesn't always happen. Sometimes a conjunction's behavior is captured by the ordinary link logic.) These linkages correspond to the different "sub-sentences" represented by the conjunction. For example, if you say "the dog and cat died", this is partitioned into two sub-sentences, "the dog died" and "the cat died". Both sublinkages must pass post-processing. Two linkages are shown in this case.
Why didn't my sentence parse?
There are a number of possible reasons for this, the most common of which we mention briefly here.
Conversational style: Sentences like "Hi, my name is John" or "John laughed himself silly" are conversational in style. The system is tuned to handle newspaper-style language, not conversational style.
Case matters: "I saw bob" is rejected, because "bob" is in the dictionary as a verb. "I saw Bob" works fine, as does "I saw bab". The latter works because "bab" is not in the dictionary and the parser will apply the generic unknown word definition.
Missing Idioms: You may have used an idiom that is not listed. "Now is the time to come" is rejected, because "now is the time" is not listed as an idiom in the dictionary. "This is the time ..." is fine.
Conjunctions: The parser imposes some limitations on the use of conjunctions. These are rather rare in natural sentences, but they do arise occasionally. Details appear in "Parsing English with a Link Grammar" on our bibliography page.
Errors: Yes, we admit that it is just barely possible that there might be some teeny weeny little errors in the grammar. You can usually very quickly locate such an error by "binary search". Try simpler and simpler sentences until you find a very simple one on which the parser behaves incorrectly. Please send us email if you find such an error.
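The simplification process can be partly automated. The sketch below is not a true binary search; it uses a plainer technique, greedily dropping one word at a time for as long as the problem persists. The predicate misbehaves is a hypothetical callback that you would supply yourself, for example by running the parser on the candidate words and checking for the unexpected behavior.

```python
def shrink_failing_sentence(words, misbehaves):
    """Greedily drop words while the parser still misbehaves on the remainder.

    misbehaves(list_of_words) -> bool is a hypothetical, user-supplied check.
    """
    words = list(words)
    changed = True
    while changed:
        changed = False
        for i in range(len(words)):
            candidate = words[:i] + words[i + 1:]
            if candidate and misbehaves(candidate):
                words = candidate      # keep the shorter failing sentence
                changed = True
                break                  # restart the scan on the shorter list
    return words

# Toy predicate: pretend the problem is triggered by the word "bob".
smallest = shrink_failing_sentence(
    "I saw bob yesterday".split(),
    lambda ws: "bob" in ws,
)
print(smallest)
```

A reduced sentence found this way may no longer be natural English, so it is worth checking by hand before reporting it.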
Daniel Sleator Last modified: Mon Apr 20 18:59:28 EDT 1998