Next: ILEX and Opportunistic
Up: The ILEX architecture
Previous: Content Determination
Although content determination has worked through a number of moves that
may be made in the generated text, its result is not the kind of tree
structure needed for realisation, and it has been influenced only by local
considerations of coherence. Text planning therefore requires the following
two steps:
- Extend the subgraph to a complete subgraph that includes all
the relations linking the selected fact nodes.
- Produce from this an "optimal" selection of relations, so as
to give rise to an RST tree including all the selected facts.
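The first step can be sketched as follows. This is a minimal illustration, not ILEX's own code: it assumes that relations are stored as simple (name, nucleus, satellite) records over fact identifiers, and the names Relation and complete_subgraph are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Relation(NamedTuple):
    # Hypothetical record type; ILEX's actual representation may differ.
    name: str       # relation label, e.g. "elaboration" (illustrative)
    nucleus: str    # identifier of the nucleus fact
    satellite: str  # identifier of the satellite fact


def complete_subgraph(selected: Iterable[str],
                      relations: Iterable[Relation]) -> List[Relation]:
    """Return every relation whose two endpoints are both selected facts."""
    chosen = set(selected)
    return [r for r in relations
            if r.nucleus in chosen and r.satellite in chosen]


# Usage: only relations linking two selected facts survive.
all_relations = [
    Relation("elaboration", "f1", "f2"),
    Relation("contrast", "f2", "f3"),
    Relation("elaboration", "f3", "f9"),  # f9 was not selected
]
linked = complete_subgraph(["f1", "f2", "f3"], all_relations)
```

The second step, choosing among these relations so that they form a single tree, is the harder search problem discussed in the rest of this section.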
Once the complete subgraph has been obtained, the text planning problem is
exactly the one described by [Marcu 97]. The idea of combining a set of facts
into an "optimal" text is also compatible with [Hovy 90] and
the earlier work of [Mann and Moore 81].
Again, this involves exploiting opportunities.
For instance, to avoid an awkward focus shift at some point, one might
attempt to include a selected fact about a new entity immediately after
another fact that mentions the same entity. Other text planning operations
that are opportunistic in nature include aggregation [Dalianis and Hovy 96]
and redundancy suppression [McDonald 92], though we will not consider these
here.
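The focus-shift example above can be illustrated with a small sketch. It assumes, purely for illustration, that each fact records the entity it is about and the set of entities it mentions; the Fact type and the place_after_mention function are hypothetical names, not part of ILEX.

```python
from typing import FrozenSet, List, NamedTuple


class Fact(NamedTuple):
    # Hypothetical fact record used only for this illustration.
    ident: str
    about: str                 # the entity the fact is about (its focus)
    mentions: FrozenSet[str]   # every entity the fact mentions


def place_after_mention(plan: List[Fact], fact: Fact) -> List[Fact]:
    """Insert `fact` right after the first planned fact that mentions its
    focus entity; if no planned fact mentions it, append at the end."""
    for i, earlier in enumerate(plan):
        if fact.about in earlier.mentions:
            return plan[:i + 1] + [fact] + plan[i + 1:]
    return plan + [fact]


# Usage: a fact about the designer is slotted in directly after the
# fact that first mentions the designer.
f1 = Fact("f1", "jewel", frozenset({"jewel", "designer"}))
f2 = Fact("f2", "jewel", frozenset({"jewel"}))
f3 = Fact("f3", "designer", frozenset({"designer"}))
ordered = place_after_mention([f1, f2], f3)
```

A fuller planner would weigh this preference against others rather than apply it greedily.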
ILEX has been designed in a modular way, such that each module
(processing stage) can easily be pulled out and a replacement module
inserted. This is true of text planning, where we are currently
experimenting with a range of planning algorithms. Again, these are all
opportunistic in nature, rather than strongly goal-directed or
schema-based.
We could use Marcu's methods directly, but are exploring more widely because:
- We would like to take into account a wider range of preference criteria,
some of which involve global properties of trees (e.g. preferences based on
focus and on sizes of substructures). (Marcu in fact uses a global
evaluation based on "right skew" in his work on rhetorical
parsing.) We would like to develop global criteria further.
- We argue elsewhere that entity-based elaborations are rather
different from other rhetorical relations and that the algorithms
and representations should reflect this directly. (See [Knott et al 1998].)
- Marcu's approach involves finding an optimal solution to a constraint
satisfaction problem and enumerating all RST trees compatible with a given
sequence of facts. We believe that the combinatorics of this will be
unattractive for large examples (since constraint satisfaction is
intractable in the general case) and
wish to investigate heuristic approaches.
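To give a rough sense of the combinatorics, the sketch below counts only the binary bracketings of a fixed sequence of facts, which is the Catalan number C(n-1) for n facts. This is a simplification made for illustration: it ignores relation labels, nuclearity and the constraints that rule many trees out, so it is not the number of trees Marcu's algorithm actually enumerates. The function name binary_tree_shapes is hypothetical.

```python
from math import comb


def binary_tree_shapes(n_facts: int) -> int:
    """Number of binary tree shapes over a fixed sequence of n_facts leaves,
    i.e. the Catalan number C(n_facts - 1)."""
    if n_facts < 1:
        raise ValueError("need at least one fact")
    n = n_facts - 1
    return comb(2 * n, n) // (n + 1)


# Growth for a few sequence lengths.
for n_facts in (4, 10, 20):
    print(n_facts, binary_tree_shapes(n_facts))
# prints:
# 4 5
# 10 4862
# 20 1767263190
```

Even before the choice of relation at each node is considered, the number of candidate shapes grows exponentially with the number of facts.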
One of our current text planning algorithms uses a deterministic procedure to
map a sequence of facts onto a single RST tree and a genetic algorithm to
search for a sequence whose tree is as highly valued as possible
[Mellish et al 1998]. This is not
yet integrated with the main ILEX system, but when run on content selected
by ILEX it produces text plans which could be realised as texts such as that
shown in Figure 6. Note that for this text the realisation
(including aggregation and referring expression generation) has been done by
hand, though the ordering and the choice of rhetorical relations are performed
by the system. Although there are perhaps some places where limited schemas
would have helped this text, the system has been quite successful
in interleaving more "important" facts about the designer and the style
with facts about the topic of the text.
Figure 6: Example output text (realised by hand)
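The general shape of such a search can be sketched as below. This is a toy illustration under stated assumptions, not the algorithm of [Mellish et al 1998]: the tree builder simply attaches each fact to the tree built so far, the score just counts nodes labelled with a relation other than "joint", and the crossover and mutation operators are standard permutation operators chosen for brevity. All function names and parameter values are hypothetical.

```python
import random


def build_tree(sequence, relations):
    """Deterministically fold a fact sequence into a (label, left, right) tree.
    Toy stand-in: each fact is attached using the relation that links it to
    the preceding fact, or "joint" if there is none."""
    tree = sequence[0]
    for prev, fact in zip(sequence, sequence[1:]):
        tree = (relations.get((prev, fact), "joint"), tree, fact)
    return tree


def score(tree):
    """Toy evaluation: number of nodes carrying a real relation."""
    if isinstance(tree, str):  # a leaf is a fact identifier
        return 0
    label, left, right = tree
    return (label != "joint") + score(left) + score(right)


def order_crossover(a, b, rng):
    """Keep a slice of parent a in place; fill the rest in parent b's order."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    middle = a[i:j]
    rest = [f for f in b if f not in middle]
    return rest[:i] + middle + rest[i:]


def mutate(seq, rng):
    """Swap two positions in a copy of the sequence."""
    seq = list(seq)
    i, j = rng.sample(range(len(seq)), 2)
    seq[i], seq[j] = seq[j], seq[i]
    return seq


def plan(facts, relations, generations=50, pop_size=20, seed=0):
    """Search over orderings of `facts` for one whose tree scores highly."""
    rng = random.Random(seed)

    def fitness(seq):
        return score(build_tree(seq, relations))

    population = [rng.sample(facts, len(facts)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = order_crossover(a, b, rng)
            if rng.random() < 0.2:  # illustrative mutation rate
                child = mutate(child, rng)
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return best, build_tree(best, relations)


facts = ["f1", "f2", "f3", "f4"]
relations = {("f1", "f2"): "elaboration", ("f2", "f3"): "contrast",
             ("f3", "f4"): "elaboration"}
best_sequence, best_tree = plan(facts, relations)
```

Because the mapping from sequence to tree is deterministic, the genetic algorithm only has to search the space of orderings, and any global preference over trees can be placed in the scoring function without changing the search.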
Mick O'Donnell
Mon Feb 9 14:09:51 GMT 1998