This paper presents some initial experiments using stochastic search methods for aspects of text planning. The work was motivated by the needs of the ILEX system for generating descriptions of musum artefacts (in particular, 20th Century jewellery) [Mellish et al 98]. We present results on examples semi-automatically generated from datastructures that exist within ILEX.
Forming a set of facts about a piece of jewellery into a structure that yields a coherent text is a non-trivial problem. Rhetorical Structure Theory [Mann and Thompson 87] claims that a text is coherent just in case it can be analysed hierarchically in terms of relations between text spans. Much work in NLG makes the assumption that constructing something like an RS tree is a necessary step in the planning of a text. This work takes as its starting point Marcu's [Marcu 97] excellent formalisation of RST and the problem of building legal RST trees, and for the purposes of this paper the phrase ``text planning'' will generally denote the task characterised by him. In this task, one is given a set of facts all of which should be included in a text and a set of relations between facts, some of which can be included in the text. The task is to produce a legal RS tree using the facts and some relations (or the ``best'' such tree).
Following the original work on RST and assumptions that have been commonly made in subsequent work, we will assume that there is a fixed set of possible relations (we include ``joint'' as a second-class relation which can be applied to any two facts, but whose use is not preferred). Each relation has a nucleus and a satellite (we don't consider multiple nuclei or satellites here, apart from the case of ``joint'', which is essentially multinuclear). Each relation may be indicated by a distinctive ``cue phrase'', with the nucleus and satellite being realised in some fashion around it. Each relation has applicability conditions which can be tested between two atomic facts. For two complex text spans, a relation holds exactly when that relation holds between the nuclei of those spans. Relations can thus hold between text spans of arbitrary size.
Figure 1 shows an example of the form of the input that is used for the experiments reported here.
Each primitive ``fact'' is represented in terms of a subject, verb and complement (as well as a unique identifier). The ``subject'' is assumed to be the entity that the fact is ``about''. The approaches reported here have not yet been linked to a realisation component, and so the entities are represented simply by canned phrases for readability (it is assumed that each entity in the domain has a fixed distinctive phrase that is always used for it). Relations are represented in terms of the relation name, the nucleus and satellite facts and a list (in this example, empty) of precondition facts which need to have been assimilated before the relation can be used (this represents an extension to Marcu's chcracterisation). This example uses the definition of (object-attribute) ``elaboration'' that we will be using consistently, namely that one fact can elaborate another if they have an entity in common (of course, there are other kinds of elaborations, but we would want to model them differently).
There seem to be three main approaches to controlling the search for a good RS tree (or something similar). One is to restrict what relations can appear in the nucleus and satellite of others (for instance, using Hovy's [Hovy 90] idea of ``growth points''). This is a step towards creating ``schemas'' for larger pieces of text. It can therefore be expected that it will produce very good results in restricted domains where limited text patterns are used, but that it will be hard to extend it to freer text types. The second idea is to use information about goals to limit possibilities. This is an element of Hovy's work but is more apparent in the planning work of Moore and Paris [Moore and Paris 93]. This second approach will work well if there are strong goals in the domain which really can influence textual decisions. This is not always the case. For instance, in our ILEX domain [Mellish et al 98] the system's goal is something very general like ``say interesting things about item X, subject to length and coherence constraints''.
The third approach, most obviously exemplified by [Marcu 97], is to use some form of explicit search through possible trees, guided by heuristics about tree quality. Marcu first of all attempts to find the best ordering of the facts. For every relation that could be indicated, constraints are generated saying what the order of the two facts involved should be and that the facts should be adjacent. The constraints are weighted according to attributes of rhetorical relations that have been determined empirically. A standard constraint satisfaction algorithm is used to find the linear sequence such that the total weight of the satisfied constraints is maximal. Once the sequence of facts is known, a general algorithm [Marcu 96] is used to construct all possible RS trees based on those facts. It is not clear how the best such tree is selected, though clearly the adjacency and order constraints could in principle be reapplied in some way (possibly with other heuristics that Marcu has used in rhetorical parsing) to select a tree.
We are interested in further developing the ideas of Marcu, but seek to address the following problems: