Text Planning



Although content determination has worked through a number of moves that may be made in the generated text, its result is not the kind of tree structure needed for realisation, and it has been shaped only by local considerations of coherence. Text planning therefore requires the following two steps:

Extend the subgraph to a complete subgraph that includes all the relations linking the selected fact nodes.
Produce from this an "optimal" selection of relations, so as to give rise to an RST tree including all the selected facts.
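
The first of these steps can be illustrated with a minimal sketch. It assumes, purely for illustration, that relations are stored as labelled binary edges between fact nodes; the `Relation` type, the fact identifiers and the relation names are hypothetical and are not taken from ILEX itself.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    name: str    # hypothetical relation label, e.g. "elaboration"
    source: str  # identifier of one fact node
    target: str  # identifier of the other fact node


def complete_subgraph(selected_facts, all_relations):
    """Return every relation whose two endpoints are both selected facts."""
    selected = set(selected_facts)
    return [r for r in all_relations
            if r.source in selected and r.target in selected]


# Hypothetical relation graph over four facts.
relations = [
    Relation("elaboration", "f1", "f2"),
    Relation("contrast", "f2", "f3"),
    Relation("example", "f3", "f4"),
    Relation("elaboration", "f1", "f3"),
]

# Facts chosen by content determination (f4 was not selected).
chosen = ["f1", "f2", "f3"]
subgraph = complete_subgraph(chosen, relations)
```

Here the relation involving `f4` is dropped because `f4` was not selected, while all relations among the chosen facts are retained as candidates for the second step.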

Once the complete subgraph has been obtained, the text planning problem is exactly that described by [Marcu 97]. The idea of combining a set of facts together into an "optimal" text is also compatible with [Hovy 90] and the earlier work of [Mann and Moore 81]. Again, this involves exploiting opportunities. For instance, in order to avoid an awkward focus shift at some point, one might attempt to place a selected fact about a new entity immediately after another fact that mentions the same entity. Other text planning operations that are opportunistic in nature include aggregation [Dalianis and Hovy 96] and redundancy suppression [McDonald 92], though we will not consider these here.

ILEX has been designed in a modular way, so that each module (processing stage) can easily be pulled out and a replacement inserted. This is true of text planning, where we are currently experimenting with a range of planning algorithms. Again, these are all opportunistic in nature, rather than being strongly goal-directed or schema-based. We could use Marcu's methods directly, but are exploring more widely because:

We would like to take into account a wider range of preference criteria, some of which involve global properties of trees (e.g. preferences based on focus and on sizes of substructures). (Marcu in fact uses a global evaluation based on "right skew" in his work on rhetorical parsing.) We would like to develop global criteria further.
We argue elsewhere that entity-based elaborations are rather different from other rhetorical relations and that the algorithms and representations should reflect this directly. (See [Knott et al 1998].)
Marcu's approach involves finding an optimal solution to a constraint satisfaction problem and enumerating all RST trees compatible with a given sequence of facts. We believe that the combinatorics of this will be unattractive for large examples (since constraint satisfaction is intractable in the general case) and wish to investigate heuristic approaches.
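
To give a rough feel for the combinatorics, the sketch below counts only the binary tree shapes that can be built over a fixed sequence of facts (the Catalan numbers), and then multiplies by the number of possible orderings. This is a simplification under stated assumptions: it ignores relation labels and nuclearity, and it ignores the constraints that would rule many trees out, so it indicates the size of the unconstrained space rather than the number of valid RST trees. The function names are our own.

```python
from math import comb, factorial


def binary_tree_shapes(n_facts):
    """Binary tree shapes over n_facts leaves in a fixed left-to-right order.

    This is the Catalan number C(n_facts - 1).
    """
    if n_facts < 1:
        raise ValueError("need at least one fact")
    k = n_facts - 1
    return comb(2 * k, k) // (k + 1)


def orderings_times_shapes(n_facts):
    """Tree shapes summed over every possible ordering of the facts."""
    return factorial(n_facts) * binary_tree_shapes(n_facts)


for n in (3, 5, 10):
    print(n, binary_tree_shapes(n), orderings_times_shapes(n))
```

Even before relation labels are considered, the count grows far too quickly for exhaustive enumeration to remain practical as the number of facts increases.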

One of our current text planning algorithms uses a deterministic procedure to map a sequence of facts onto a single RST tree and a genetic algorithm to search for a sequence whose tree is as highly valued as possible [Mellish et al 1998]. This is not yet integrated with the main ILEX system, but, when run on content selected by ILEX, it produces text plans which could be realised as texts such as that shown in Figure 6. Note that for this text the realisation (including aggregation and referring expression generation) has been done by hand, though the ordering and choice of rhetorical relations are performed by the system. Although there are perhaps some places where limited schemas would have helped this text, the system has been quite successful in interleaving more "important" facts about the designer and the style with facts about the topic of the text.
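
The general shape of such a search can be sketched as follows. This is a toy illustration only, not the algorithm of [Mellish et al 1998]: the facts and their entities are invented, the deterministic tree builder simply produces a right-branching tree, and the evaluation function merely counts adjacent facts that share an entity (a crude stand-in for avoiding focus shifts). All names and parameter values are assumptions.

```python
import random

# Hypothetical facts: identifier -> entities mentioned by the fact.
FACTS = {
    "f1": {"jewel", "designer"},
    "f2": {"designer", "style"},
    "f3": {"style"},
    "f4": {"jewel", "material"},
    "f5": {"material"},
}


def build_tree(sequence):
    """Deterministically map a fact sequence onto a right-branching tree."""
    if len(sequence) == 1:
        return sequence[0]
    return (sequence[0], build_tree(sequence[1:]))


def leaves(tree):
    """Return the facts of a tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def score(tree):
    """Toy evaluation: count adjacent facts that share an entity."""
    order = leaves(tree)
    return sum(1 for a, b in zip(order, order[1:]) if FACTS[a] & FACTS[b])


def mutate(sequence, rng):
    """Swap two randomly chosen positions."""
    i, j = rng.sample(range(len(sequence)), 2)
    child = list(sequence)
    child[i], child[j] = child[j], child[i]
    return child


def crossover(a, b, rng):
    """Keep a prefix of a, then add the remaining facts in b's order."""
    cut = rng.randrange(1, len(a))
    head = a[:cut]
    return head + [f for f in b if f not in head]


def search(facts, generations=50, pop_size=20, seed=0):
    """Evolve fact orderings, keeping the better half each generation."""
    rng = random.Random(seed)
    population = [rng.sample(facts, len(facts)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda s: score(build_tree(s)), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=lambda s: score(build_tree(s)))


best = search(list(FACTS))
print(best, score(build_tree(best)))
```

Because the tree is a deterministic function of the sequence, the genetic algorithm only has to search over orderings; a richer tree builder and evaluation function could be substituted without changing the search loop.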

Figure 6: Example output text (realised by hand)


Mick O'Donnell
Mon Feb 9 14:09:51 GMT 1998