Lexical entries in ALE are specified as rewriting
rules, as given by the following BNF syntax:
  <lex_entry> ::= <word> ---> <desc>.

For instance, in the categorial grammar lexicon in the appendix, the following lexical entry is provided, along with the relevant macros:
  john ---> @ pn(j).

  pn(Name) macro
    synsem: @ np(Name),
    @ quantifier_free.

  np(Ind) macro
    syn:np,
    sem:Ind.

  quantifier_free macro
    qstore:[].

Read declaratively, this rule says that the word john has as its lexical category the most general satisfier of the description @ pn(j), which is:
  cat
  SYNSEM basic
         SYN np
         SEM j
  QSTORE e_list

Note that this lexical entry is equivalent to that given without macros by:
  john --->
    synsem:(syn:np,
            sem:j),
    qstore:e_list.

Macros are useful as a method of organizing lexical information to keep it consistent across lexical entries. The lexical entry for the word runs is:
  runs ---> @ iv((run,runner:Ind),Ind).

  iv(Sem,Arg) macro
    synsem:(backward,
            arg: @ np(Arg),
            res:(syn:s,
                 sem:Sem)),
    @ quantifier_free.

This entry uses nested macros along with structure sharing, and expands to the category:
  cat
  SYNSEM backward
         ARG synsem
             SYN np
             SEM [0] sem_obj
         RES SYN s
             SEM run
                 RUNNER [0]
  QSTORE e_list

It also illustrates how macro parameters are in fact treated as variables.
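The macro expansions above can be mimicked outside ALE. The following Python sketch is only an illustration, not ALE's implementation: the list-and-tuple encoding of descriptions, the `Var` class, and the helper `var_paths` are all assumptions introduced here. Each macro becomes a function, and a macro parameter is simply whatever object is passed in, so passing the same variable twice yields structure sharing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A description variable; one Var used in two places means structure sharing."""
    name: str


# Hypothetical encoding: a description is a type name (str), a Var,
# a (feature, description) pair, or a list read as a conjunction.

def np(ind):
    # np(Ind) macro syn:np, sem:Ind.
    return [("syn", "np"), ("sem", ind)]


def quantifier_free():
    # quantifier_free macro qstore:[].  ([] is written here as its type, e_list.)
    return [("qstore", "e_list")]


def pn(name):
    # pn(Name) macro synsem: @ np(Name), @ quantifier_free.
    return [("synsem", np(name))] + quantifier_free()


def iv(sem, arg):
    # iv(Sem,Arg) macro
    #   synsem:(backward, arg: @ np(Arg), res:(syn:s, sem:Sem)), @ quantifier_free.
    synsem = ["backward", ("arg", np(arg)), ("res", [("syn", "s"), ("sem", sem)])]
    return [("synsem", synsem)] + quantifier_free()


def var_paths(desc, var, prefix=()):
    """List the feature paths at which `var` occurs inside `desc`."""
    if isinstance(desc, Var):
        return [prefix] if desc == var else []
    if isinstance(desc, tuple):
        feature, value = desc
        return var_paths(value, var, prefix + (feature,))
    if isinstance(desc, list):
        return [path for part in desc for path in var_paths(part, var, prefix)]
    return []  # a plain type name contains no variables


john = pn("j")                            # john ---> @ pn(j).
ind = Var("Ind")
runs = iv(["run", ("runner", ind)], ind)  # runs ---> @ iv((run,runner:Ind),Ind).
shared = var_paths(runs, ind)             # the two paths that share Ind
```

Here `john` comes out identical to the macro-free description of john, and `shared` holds the two paths that carry the tag [0] in the expanded category for runs.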
Multiple lexical entries may be provided for each word. Disjunctions may also be used in lexical entries, but are expanded out at compile-time. Thus the first three lexical entries, taken together, compile to the same result as the fourth:
  bank ---> syn:noun, sem:river_bank.
  bank ---> syn:noun, sem:money_bank.
  bank ---> syn:verb, sem:roll_plane.

  bank --->
    ( syn:noun,
      sem:( river_bank
          ; money_bank
          )
    ; syn:verb,
      sem:roll_plane
    ).

Note that this last entry uses the standard Prolog layout conventions of placing each conjunct and disjunct on its own line, with commas at the end of lines, and disjunctions set off with vertically aligned parentheses at the beginning of lines.
The compiler finds all the most general satisfiers for lexical entries at compile time, and reports those lexical entries whose descriptions are unsatisfiable. In the above case of bank, the second method, the single combined entry, is marginally faster at compile time than the three separate entries, but run-time performance is identical. The reason for this is that both formulations have the same set of most general satisfiers.
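The claim that the separate and combined entries for bank yield the same alternatives can be checked with a small model. The following Python sketch is an assumption-laden illustration rather than ALE's compiler: it only distributes conjunction over disjunction syntactically, whereas ALE computes most general satisfiers using the type signature and unification. The tuple encoding with the tags `"and"`, `"or"`, and `"feat"`, and the function `expand`, are hypothetical.

```python
from itertools import product


def expand(desc):
    """Return the disjunction-free alternatives of `desc`, each a tuple of constraints."""
    if isinstance(desc, str):              # a bare type name
        return [(desc,)]
    op, *args = desc
    if op == "feat":                       # ("feat", feature, description)
        feature, sub = args
        return [tuple(f"{feature}:{c}" for c in alt) for alt in expand(sub)]
    if op == "or":                         # alternatives of any disjunct
        return [alt for arg in args for alt in expand(arg)]
    if op == "and":                        # one alternative from every conjunct
        choices = product(*(expand(arg) for arg in args))
        return [sum(combo, ()) for combo in choices]
    raise ValueError(f"unknown description operator: {op!r}")


def entry(syn, sem):
    """Build the description syn:<syn>, sem:<sem>."""
    return ("and", ("feat", "syn", syn), ("feat", "sem", sem))


# The three separate entries for bank.
separate = [
    entry("noun", "river_bank"),
    entry("noun", "money_bank"),
    entry("verb", "roll_plane"),
]

# The single combined entry with nested disjunctions.
combined = (
    "or",
    entry("noun", ("or", "river_bank", "money_bank")),
    entry("verb", "roll_plane"),
)

separate_alts = [alt for e in separate for alt in expand(e)]
combined_alts = expand(combined)
```

Under this encoding both `separate_alts` and `combined_alts` contain the same three alternatives, which mirrors why the two formulations behave identically at run time.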
ALE supports the construction of large lexica, because it relies on Prolog's hashing mechanism to look up the lexical entries for a word during bottom-up parsing. For generation, ALE indexes lexical entries for faster unification, as described in Penn and Popescu (1997). Constraints on types can also be used to enforce conditions on lexical representations, allowing for further factorization of information.
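The benefit of hashed lookup can be pictured with a Python dictionary standing in for Prolog's indexing. This is an analogy under stated assumptions, not a description of ALE's internals: the names `build_lexicon` and `lookup` are hypothetical, and categories are shown as plain strings rather than feature structures.

```python
from collections import defaultdict


def build_lexicon(entries):
    """Group (word, category) pairs by word so lookup does not scan the whole lexicon."""
    lexicon = defaultdict(list)
    for word, category in entries:
        lexicon[word].append(category)
    return dict(lexicon)


def lookup(lexicon, word):
    """Return every category recorded for `word`, or an empty list if there is none."""
    return lexicon.get(word, [])


# Categories are abbreviated as strings purely for illustration.
lexicon = build_lexicon([
    ("john", "synsem:(syn:np, sem:j)"),
    ("bank", "syn:noun, sem:river_bank"),
    ("bank", "syn:noun, sem:money_bank"),
    ("bank", "syn:verb, sem:roll_plane"),
])

bank_categories = lookup(lexicon, "bank")
```

A word with several entries, such as bank, returns all of them in one lookup, and the cost of finding them does not grow with the number of other words in the lexicon.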