[Reference Manual]
[Code]
The primary representational device in ALE is the typed
feature structure. In phrase structure grammars, feature structures
model categories, while in the definite clause programs, they serve
the same role as first-order terms in Prolog, that of a universal data
structure. Feature structures are much like the frames of AI systems,
the records of imperative programming languages like C or Pascal, and
the feature descriptions used in standard linguistic theories of
phonology, and more recently, of syntax.
Rather than presenting a formal definition of feature structures, which can be found in Carpenter (1992:Chapter 2), we present an informal description here. In fact, we begin by discussing feature structures which are not necessarily well-typed. In the next section, the type system is presented.
A feature structure consists of two pieces of information. The first is a type. Every feature structure must have a type drawn from the inheritance hierarchy. The other kind of information specified by a feature structure is a finite, possibly empty, collection of feature/value pairs. A feature value pair consists of a feature and a value, where the value is itself a feature structure. The difference between feature structures and the representations used in phonology and in GPSG, for instance, is that it is possible for two different substructures (values of features at some level of nesting) to be token identical in a feature structure. Consider the following feature structure drawn from the lexical entry for John in the categorial grammar in the appendix, displayed in the output notation of ALE:
cat
QSTORE e_list
SYNSEM basic
SEM j
SYN np
The type of this feature structure is cat, which is interpreted
to mean it is a category. It is defined for two features, QSTORE and SYNSEM. As can be seen from this example, we follow
the HPSG notational convention of displaying features in all caps,
while types are displayed in lower case. Also note that features and
their values are printed in alphabetic order of the feature names.
In this case, the value of the QSTORE feature is
the simple feature structure of type e_list,4.1which has no feature
values. On the other hand, the feature SYNSEM has a complex
feature as its value, which is of type basic, and has two feature
values SEM and SYN, both of which have simple values.
This last feature structure doesn't involve any structure sharing. But consider the lexical entry for runs:
cat
QSTORE e_list
SYNSEM backward
ARG basic
SEM [0] individual
SYN np
RES basic
SEM run
RUNNER [0]
SYN s
Here there is structure sharing between the path SYNSEM ARG SEM
and the path SYNSEM RES SEM RUNNER, where a path is simply
a sequence of features. This structure sharing is indicated by the
tag [0]. In this case, the sharing indicates that the
semantics of the argument of runs fills the runner role in the
semantics of the result. Also note that a shared structure is only
displayed once; later occurrences simply list the tag. Of course,
this example only involves structure sharing of a very simple feature
structure, in this case one consisting of only a type with no
features. In general, structures of arbitrary complexity may be
shared, as we will see in the next example.
ALE, like Prolog II and HPSG, but unlike most other systems, allows cyclic structures to be processed and even printed. For instance, consider the following representation we might use for the liar sentence This sentence is false:
[0] false
ARG1 [0]
In this case, the empty path and the feature ARG1 share a value.
Similarly, the path ARG1 ARG1 ARG1 and the path ARG1 ARG1,
both of which are defined, are also identical. But consider a
representation for the negation of the liar sentence, It is false
that this sentence is false:
false
ARG1 [0] false
ARG1 [0]
Unlike Prolog II, ALE does not necessarily treat these two feature
structures as being identical, as it does not conflate a cyclic
structure with its infinite unfolding. We take up the notion of
token identical structures in the section below on extensionality.
It is interesting to note that with typed feature structures, there is a choice between representing information using a type and representing the same information using feature values. This is a familiar situation found in most inheritance-based representation schemes. Thus the relation specified in the value of the path SYNSEM RES SEM is represented using a type, in:
SEM run
RUNNER [0]
An alternative encoding, which is not without merit, is:
SEM unary_rel
REL run
ARG1 [0]
In general, type information is processed much more efficiently than
feature value information, so as much information as possible should
be placed in the types. The drawback is that type information must be
computed at compile-time and remain accessible at run-time. More
types simply require more memory.4.2