[Reference Manual]
[Code]
The primary representational device in ALE is the typed
feature structure. In phrase structure grammars, feature structures
model categories, while in the definite clause programs, they serve
the same role as first-order terms in Prolog, that of a universal data
structure. Feature structures are much like the frames of AI systems,
the records of imperative programming languages like C or Pascal, and
the feature descriptions used in standard linguistic theories of
phonology, and more recently, of syntax.
Rather than presenting a formal definition of feature structures, which can be found in Carpenter (1992:Chapter 2), we present an informal description here. In fact, we begin by discussing feature structures which are not necessarily well-typed. In the next section, the type system is presented.
A feature structure consists of two pieces of information. The first is a type. Every feature structure must have a type drawn from the inheritance hierarchy. The other kind of information specified by a feature structure is a finite, possibly empty, collection of feature/value pairs. A feature value pair consists of a feature and a value, where the value is itself a feature structure. The difference between feature structures and the representations used in phonology and in GPSG, for instance, is that it is possible for two different substructures (values of features at some level of nesting) to be token identical in a feature structure. Consider the following feature structure drawn from the lexical entry for John in the categorial grammar in the appendix, displayed in the output notation of ALE:
cat QSTORE e_list SYNSEM basic SEM j SYN npThe type of this feature structure is cat, which is interpreted to mean it is a category. It is defined for two features, QSTORE and SYNSEM. As can be seen from this example, we follow the HPSG notational convention of displaying features in all caps, while types are displayed in lower case. Also note that features and their values are printed in alphabetic order of the feature names. In this case, the value of the QSTORE feature is the simple feature structure of type e_list,4.1which has no feature values. On the other hand, the feature SYNSEM has a complex feature as its value, which is of type basic, and has two feature values SEM and SYN, both of which have simple values.
This last feature structure doesn't involve any structure sharing. But consider the lexical entry for runs:
cat QSTORE e_list SYNSEM backward ARG basic SEM [0] individual SYN np RES basic SEM run RUNNER [0] SYN sHere there is structure sharing between the path SYNSEM ARG SEM and the path SYNSEM RES SEM RUNNER, where a path is simply a sequence of features. This structure sharing is indicated by the tag [0]. In this case, the sharing indicates that the semantics of the argument of runs fills the runner role in the semantics of the result. Also note that a shared structure is only displayed once; later occurrences simply list the tag. Of course, this example only involves structure sharing of a very simple feature structure, in this case one consisting of only a type with no features. In general, structures of arbitrary complexity may be shared, as we will see in the next example.
ALE, like Prolog II and HPSG, but unlike most other systems, allows cyclic structures to be processed and even printed. For instance, consider the following representation we might use for the liar sentence This sentence is false:
[0] false ARG1 [0]In this case, the empty path and the feature ARG1 share a value. Similarly, the path ARG1 ARG1 ARG1 and the path ARG1 ARG1, both of which are defined, are also identical. But consider a representation for the negation of the liar sentence, It is false that this sentence is false:
false ARG1 [0] false ARG1 [0]Unlike Prolog II, ALE does not necessarily treat these two feature structures as being identical, as it does not conflate a cyclic structure with its infinite unfolding. We take up the notion of token identical structures in the section below on extensionality.
It is interesting to note that with typed feature structures, there is a choice between representing information using a type and representing the same information using feature values. This is a familiar situation found in most inheritance-based representation schemes. Thus the relation specified in the value of the path SYNSEM RES SEM is represented using a type, in:
SEM run RUNNER [0]An alternative encoding, which is not without merit, is:
SEM unary_rel REL run ARG1 [0]In general, type information is processed much more efficiently than feature value information, so as much information as possible should be placed in the types. The drawback is that type information must be computed at compile-time and remain accessible at run-time. More types simply require more memory.4.2