In this section, we consider the execution of ALE phrase structure grammars compiled for parsing. The examples shown in this section have been produced while running with the mini-interpreter off. The mini-interpreter will be discussed in the next section.
The primary predicate for parsing is illustrated as follows:
[Code]
| ?- rec [john,hits,every,toy].
STRING:
0 john 1 hits 2 every 3 toy 4
CATEGORY:
cat
QSTORE e_list
SYNSEM basic
       SEM every
           RESTR toy
                 ARG1 [0] individual
           SCOPE hit
                 HITTEE [0]
                 HITTER j
           VAR [0]
       SYN s
ANOTHER? y.
CATEGORY:
cat
QSTORE ne_list_quant
       HD every
          RESTR toy
                ARG1 [0] individual
          SCOPE proposition
          VAR [0]
       TL e_list
SYNSEM basic
       SEM hit
           HITTEE [0]
           HITTER j
       SYN s
ANOTHER? y.
no
The first thing to note here is that the input string must be entered
as a Prolog list of atoms. In particular, it must have an opening and
closing bracket, with words separated by commas. No variables should
occur in the query, nor anything other than atoms. The first part of
the output repeats the input string, separated by numbers (nodes
in the chart) which indicate positions in the string for later use in
inspecting the chart directly. ALE asserts one lexical item for
every unit interval, with empty categories being stored as loops from
every single node to itself. The second part of the output is a
category which is derived for the input string. If there are multiple
solutions, these can be iterated through by answering y to the
ANOTHER? prompt. The final no response above indicates that no
further categories were found beyond the two displayed. If there are no
parses for a string, an answer of no is returned, as with:
| ?- rec([runs,john]).
STRING:
0 runs 1 john 2
no
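The input requirement stated above (a closed list containing nothing but atoms) can be checked before a string is handed to rec/1. The following is a minimal sketch in plain Prolog; words_ok/1 and its auxiliary words_ok_/1 are hypothetical helper names, not part of ALE, and ALE itself may or may not perform a similar check.

```prolog
% words_ok(+Words): succeeds if Words is a closed list containing only atoms.
% Hypothetical helper for illustration; not an ALE predicate.
words_ok(Words) :-
    nonvar(Words),
    words_ok_(Words).

words_ok_([]).
words_ok_([W|Ws]) :-
    atom(W),          % rejects variables, numbers and compound terms
    nonvar(Ws),       % rejects partial lists such as [john|_]
    words_ok_(Ws).
```

With these clauses loaded, words_ok([john,hits,every,toy]) succeeds, while words_ok([john,X]) and words_ok([john,3]) both fail.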
Notice that there is no notion of "distinguished start symbol" in parsing. Rather, the recognizer generates all categories that it can find for the input string. This allows sentence fragments and phrases to be analyzed, as in:
| ?- rec [big,kid].
STRING:
0 big 1 kid 2
CATEGORY:
cat
QSTORE ne_list_quant
       HD some
          RESTR and
                CONJ1 kid
                      ARG1 [0] individual
                CONJ2 big
                      ARG1 [0]
          SCOPE proposition
          VAR [0]
       TL e_list
SYNSEM basic
       SEM [0]
       SYN np
ANOTHER? n.
There is also a two-place version of rec that
displays only those parses that satisfy a given description:
[Code]
| ?- rec([big,kid],s).
STRING:
0 big 1 kid 2
no
This call to rec/2 failed because there were no parses of big kid of type s. Internally, the parser still generates all of the edges that it normally does; the extra description is applied only at the end as a filter.
Once parsing has taken place for a sentence using rec/1, it is
possible to look at categories that were generated internally. In
general, the parser will find every possible analysis of every
substring of the input string, and these will be available for later
inspection. For instance, suppose the last call to rec/1
executed was rec [john,hits,every,toy], the results of which are
given above. Then the following query can be made:
[Code]
| ?- edge(2,4).
COMPLETED CATEGORIES SPANNING: every toy
cat
QSTORE ne_list_quant
       HD every
          RESTR toy
                ARG1 [0] individual
          SCOPE proposition
          VAR [0]
       TL e_list
SYNSEM basic
       SEM [0]
       SYN np
Edge created for category above:
index: 20
from: 2 to: 4
string: every toy
rule: np_det_nbar
# of dtrs: 2
Action(retract,dtr-#,continue,abort)?
|: continue.
no
| ?-
The possible replies in the action-line will be discussed in the next
section. The output tells us that the indicated category was found
between positions 2 and 4, a span that covers the string every toy in
the input. Even though an active chart parser is used, it is not
possible to inspect active edges. This is because ALE
represents active edges as dynamic structures that are not available
after they have been evaluated.
Using edge/2 it is possible to debug grammars by seeing how far analyses progressed and by inspecting analyses of substrings.
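To decide which arguments to give edge/2, it helps to relate chart nodes to the words they enclose: node 0 precedes the first word, and the span from node I to node J covers words I+1 through J. The sketch below computes the words covered by a span, mirroring the "string:" line printed by edge/2. The predicates span_words/4, drop/3 and take/3 are hypothetical helpers written for this illustration and are not part of ALE.

```prolog
% span_words(+Words, +From, +To, -Sub): Sub is the sublist of Words lying
% between chart nodes From and To (node 0 precedes the first word).
% Hypothetical helper for illustration; not an ALE predicate.
span_words(Words, From, To, Sub) :-
    From >= 0,
    To >= From,
    Len is To - From,
    drop(From, Words, Rest),
    take(Len, Rest, Sub).

% drop(+N, +List, -Rest): Rest is List without its first N elements.
drop(0, List, List) :- !.
drop(N, [_|T], Rest) :-
    N > 0,
    N1 is N - 1,
    drop(N1, T, Rest).

% take(+N, +List, -Prefix): Prefix is the first N elements of List.
take(0, _, []) :- !.
take(N, [H|T], [H|Prefix]) :-
    N > 0,
    N1 is N - 1,
    take(N1, T, Prefix).
```

For the example sentence, span_words([john,hits,every,toy],2,4,Sub) binds Sub to [every,toy], which is the substring reported for edge(2,4).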
There is also a predicate, rec/4, that binds the
answer to variables instead of displaying it:
[Code]
| ?- rec([kim,walks],Ref,SVs,Iqs).
Iqs = [ineq(_A,index(_B-third,_C-plur,_D-masc),_E,index(_B-third,_C-plur,
      _D-masc),done)],
SVs = phrase(_F-synsem(_G-loc(_H-cat(_I-verb(_J-minus,_K-minus,_L-none,
      _M-bool,_N-fin),_O-unmarked,_P-e_list),...)))?
Ref is an unbound variable that represents the root node of the resulting feature structure. SVs is a term whose functor is the type of the feature structure, whose arity is the number of appropriate features for that type, and whose arguments are the values at those appropriate features in alphabetical order. Each value, in turn, is of the form Ref-SVs, another pair of root variable and type-functored term. Iqs is a list of the inequations that apply to the feature structure and its substructures. The list itself is interpreted conjunctively, i.e., every member must be satisfied. Each member represents a disjunction of inequations, i.e., at least one of them must be satisfied. Each member is represented by a chain of ineq/5 structures:
ineq(Ref1,SVs1,Ref2,SVs2,ineq(......,done)...)
The first four arguments represent the Ref-SVs pairs of the two inequated feature structures of the first disjunct. The fifth argument contains another ineq/5 structure, or done/0. The three structures Ref, SVs and Iqs are suitable for passing to gen/3 or gen/4. These representations are not ground, so if you want to assert them into a database, be sure to assert them together as arguments of a single fact to preserve variable sharing. For more details on ALE's internal representation, the reader is referred to Carpenter and Penn (1996).
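The chain encoding and the advice about asserting can be illustrated with a short sketch in plain Prolog. It assumes only the shapes described above (ineq/5 chains terminated by done, collected in a list). The names ineq_disjuncts/2, iqs_disjuncts/2, neq/2, save_parse/3 and saved_parse/3 are hypothetical and are not part of ALE.

```prolog
% ineq_disjuncts(+Chain, -Disjuncts): flatten one ineq/5 chain into a list
% of neq(Ref1-SVs1, Ref2-SVs2) terms, one per disjunct.
% Assumes Chain is bound to an ineq/5 chain ending in done.
ineq_disjuncts(done, []).
ineq_disjuncts(ineq(Ref1,SVs1,Ref2,SVs2,Rest),
               [neq(Ref1-SVs1,Ref2-SVs2)|Disjuncts]) :-
    ineq_disjuncts(Rest, Disjuncts).

% iqs_disjuncts(+Iqs, -Lists): one list of disjuncts per member of Iqs.
% The outer list is conjunctive, each inner list is disjunctive.
iqs_disjuncts([], []).
iqs_disjuncts([Chain|Chains], [Disjuncts|Lists]) :-
    ineq_disjuncts(Chain, Disjuncts),
    iqs_disjuncts(Chains, Lists).

% save_parse(+Ref, +SVs, +Iqs): store all three structures in ONE fact so
% that variables shared among them stay shared when the fact is retrieved.
:- dynamic saved_parse/3.
save_parse(Ref, SVs, Iqs) :-
    assertz(saved_parse(Ref, SVs, Iqs)).
```

Asserting Ref, SVs and Iqs as three separate facts would rename their variables independently, so the inequations would no longer refer to the substructures of SVs. Storing them in a single fact, as in save_parse/3, avoids this.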
There is also a rec/5, which works just like rec/4, but filters its output through the description in its last argument, just like rec/2:
| ?- rec([kim,walks],Ref,SVs,Iqs,phrase).
Iqs = [ineq(_A,index(_B-third,_C-plur,_D-masc),_E,index(_B-third,_C-plur,
      _D-masc),done)],
SVs = phrase(_F-synsem(_G-loc(_H-cat(_I-verb(_J-minus,_K-minus,_L-none,
      _M-bool,_N-fin),_O-unmarked,_P-e_list),...)))?
The call succeeds here because the given answer is of type phrase.
rec_list/2 takes a list of lists of words and, for each
string in turn, iteratively displays all of the solutions that satisfy a
description:
[Code]
| ?- rec_list([[kim,sees,sandy],[sandy,sees,kim]],phrase).
STRING:
0 kim 1 sees 2 sandy 3
CATEGORY:
phrase
QRETR e_list
QSTORE e_set
SYNSEM synsem...
ANOTHER? y.
STRING:
0 sandy 1 sees 2 kim 3
CATEGORY:
phrase
QRETR e_list
QSTORE e_set
SYNSEM synsem
ANOTHER? y.
no
If no filtering through a description is desired, the description bot, which is trivially satisfied, can be used. When rec_list/2 runs out of solutions for a string, it moves on to the next string.
rec_best/2 takes a list of lists of words and iteratively
displays all of the solutions satisfying a given description for the
first string in the list that has a solution satisfying that
description.
[Code]
| ?- rec_best([[kim,sees,sandy],[sandy,sees,kim]],phrase).
STRING:
0 kim 1 sees 2 sandy 3
CATEGORY:
phrase
QRETR e_list
QSTORE e_set
SYNSEM synsem...
ANOTHER? y.
no
When rec_best/2 runs out of solutions for a string that had at least one solution, it fails. It tries the strings in its first argument only until it finds one that has solutions.
There is also a three-place rec_list/3 that collects the solutions to rec_list/2 in a list of terms of the form fs(Tag-SVs,Iqs).
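Because each collected solution has the form fs(Tag-SVs,Iqs), and the functor of SVs is the type of the feature structure, the result list can be post-processed with ordinary Prolog. The sketch below extracts the root type of every solution. It works only on the list itself, since the argument order of rec_list/3 is not given here; fs_root_types/2 is a hypothetical helper and not part of ALE.

```prolog
% fs_root_types(+FSs, -Types): Types lists the root type of each
% fs(Tag-SVs,Iqs) term in FSs, in the same order.
% Assumes every SVs is bound to a type-functored term.
fs_root_types([], []).
fs_root_types([fs(_Tag-SVs,_Iqs)|FSs], [Type|Types]) :-
    functor(SVs, Type, _Arity),   % the functor name of SVs is the type
    fs_root_types(FSs, Types).
```

A solution list in which both feature structures are of type phrase would thus yield [phrase,phrase].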