Any document marked up for RST can be used for variable-length document presentation. This section describes the process by which the RST-structure is pruned to produce a document of suitable length.
As described in the introduction, the basic mechanism involves assigning each structural relation a relevance score between 0.0 and 1.0. For instance, ELABORATION may have a score of 0.40 (low relevance), while PURPOSE might be scored more highly.
By an RST-tree, I mean a tree with the top-nucleus as its root, satellites hanging off the root, and their satellites hanging off them in turn. Our task is then to prune branches from this tree. The top-nucleus has a relevance value of 1.0 (maximum relevance).
Through a process of recursive descent, we assign each node in the tree the relevance level of its parent, multiplied by the relevance score of the relation which connects it to its parent. For instance, an ELABORATION of the top-nucleus would have relevance 0.4 (1.0 * 0.4), while an ELABORATION of that node would have relevance 0.16 (0.4 * 0.4). Nodes lower in the RST-tree (less nuclear) will thus have lower relevance than higher nodes (more nuclear), and will be the first to be pruned.
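The relevance assignment can be illustrated with a minimal Python sketch. It is not the system's actual code: the `Node` class, the `assign_relevance` function and the PURPOSE score of 0.80 are assumptions made for illustration; only the ELABORATION score of 0.40 and the top-nucleus value of 1.0 come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical score table: only ELABORATION = 0.40 is taken from the text.
RELATION_SCORES: Dict[str, float] = {"ELABORATION": 0.40, "PURPOSE": 0.80}


@dataclass
class Node:
    """Hypothetical RST-tree node: a text span plus its satellites."""
    text: str
    relation: Optional[str] = None  # relation to the parent; None for the top-nucleus
    satellites: List["Node"] = field(default_factory=list)
    relevance: float = 0.0


def assign_relevance(node: Node, parent_relevance: float = 1.0,
                     scores: Dict[str, float] = RELATION_SCORES) -> None:
    """Recursive descent: relevance = parent's relevance * relation score."""
    if node.relation is None:
        node.relevance = parent_relevance  # the top-nucleus keeps 1.0
    else:
        node.relevance = parent_relevance * scores[node.relation]
    for satellite in node.satellites:
        assign_relevance(satellite, node.relevance, scores)


# An ELABORATION of an ELABORATION of the top-nucleus.
inner = Node("detail of the detail", "ELABORATION")
outer = Node("detail", "ELABORATION", [inner])
root = Node("main claim", None, [outer])
assign_relevance(root)
```

After the call, `root.relevance` is 1.0, `outer.relevance` is 0.4 and `inner.relevance` is approximately 0.16, matching the worked figures above.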
This is a simple mechanism, but it has shown good results in producing reasonable texts at whatever degree of verbosity. It is easy to see that an elaboration of an elaboration will in most cases be less essential to a text than the elaboration itself.
However, there are some cases where this method breaks down -- nuclearity does not always reflect centrality of information. Sometimes an author introduces information in a rhetorically unimportant place, yet that information may be needed later to understand the argument. One example of this in the summary shown earlier is where the original text had said: ``he was faced with constant pressure from Edward to sign. He refused to do so.'' In the summary, ``to sign'' was pruned, but it was actually a central concept, and the anaphoric ``so'' failed because of its pruning.
The text-nodes are then placed in a queue, ordered by their relevance score.
When a request is received to display the text at a particular length, the system needs to determine which text-nodes to display. Taking each node in turn from the relevance queue (starting with the most relevant), the program checks whether including this text-node would push the word-count over the limit. If not, it adds the node to the nodes-to-be-expressed list and increments the words-so-far count. Once the word-limit would be exceeded, the procedure turns to expressing the selected nodes. The nodes are expressed in the order in which they appeared in the original full text.
Note that the satellites of a node always have relevance lower than or equal to that of the node itself, so a satellite is never included in the nodes-to-be-expressed list unless its nucleus is also included; expressing a satellite without its nucleus could produce incoherence.
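The queue-and-select procedure can be sketched in Python as follows. This is an illustrative reading of the description, not the system's implementation: the `select_nodes` function, the tuple layout and the sample sentences are invented, and two points the text leaves open are settled by assumption, namely that selection stops at the first node that does not fit, and that ties in relevance are broken by position in the original text.

```python
def select_nodes(nodes, word_limit):
    """Pick text-nodes for a given word limit.

    nodes: list of (relevance, position, text) tuples, where position is the
    node's index in the original full text. Returns the chosen texts in their
    original order.
    """
    # Relevance queue: most relevant first. Tie-breaking by position is an
    # assumption; the description does not say how ties are ordered.
    queue = sorted(nodes, key=lambda n: (-n[0], n[1]))

    chosen, words_so_far = [], 0
    for relevance, position, text in queue:
        n_words = len(text.split())
        if words_so_far + n_words > word_limit:
            break  # assumption: stop at the first node that would not fit
        chosen.append((position, text))
        words_so_far += n_words

    # Express the selected nodes in original text order.
    return [text for position, text in sorted(chosen)]


# Invented sample: a satellite precedes its nucleus in the original text.
nodes = [
    (0.40, 0, "A supporting detail follows."),
    (1.00, 1, "The main claim."),
    (0.16, 2, "A detail about that detail."),
]
summary = select_nodes(nodes, word_limit=8)
```

With a limit of 8 words the two most relevant nodes are kept and returned in text order, so the supporting detail comes before the main claim; with a limit of 3 only the main claim survives. Stopping at the first node that does not fit means the selection is always a prefix of the relevance queue.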
The RST Markup Tool, and consequently document presentation, allows markup of more than simple nuclear-satellite relations. This includes:
All of these structures are handled in terms of the relation (role) linking the constituent to the whole, and this relation is handled identically to simple RST relations in text pruning.
The actual values associated with each relation are not fixed, but can be varied by the user. The user can select values which reflect their interests, highlighting some types of rhetorical relations, and ignoring others.
The system comes with three inbuilt `user-models', representing different ranges of interest: standard (average values); how&why, preferring cause, reason, purpose, conditionals, etc.; and when&where, preferring spatial and temporal locations and extents. Figure 3 demonstrates the slight difference in the information (bold font) included in the text when switching between the when&where set and the how&why set. We might also add such sets as naive, preferring definitions, clarifications, restatements, and elaborations, and expert, valuing these less but preferring generalisations, etc. Apart from these built-in values, the user can also assign values to each relation independently.
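One way to picture user-models is as named tables of relation scores, with per-relation overrides layered on top. The Python sketch below is purely illustrative: the `USER_MODELS` table, the `scores_for` function, the relation name TEMPORAL-LOCATION and every numeric value except ELABORATION = 0.40 are assumptions, not the values shipped with the system.

```python
# Hypothetical score tables for the three inbuilt user-models.
USER_MODELS = {
    "standard":   {"ELABORATION": 0.40, "PURPOSE": 0.60, "TEMPORAL-LOCATION": 0.60},
    "how&why":    {"ELABORATION": 0.40, "PURPOSE": 0.90, "TEMPORAL-LOCATION": 0.30},
    "when&where": {"ELABORATION": 0.40, "PURPOSE": 0.30, "TEMPORAL-LOCATION": 0.90},
}


def scores_for(model, overrides=None):
    """Return a copy of a user-model's scores with user overrides applied."""
    scores = dict(USER_MODELS[model])  # copy, so the inbuilt model is untouched
    scores.update(overrides or {})
    for relation, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score for {relation} must lie in [0.0, 1.0]")
    return scores


# Start from how&why, then lower the value of elaborations individually.
custom = scores_for("how&why", {"ELABORATION": 0.10})
```

Here `custom` keeps the how&why preference for PURPOSE while ELABORATION drops to 0.10, and the inbuilt how&why table itself is left unchanged.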