Next: Summary Up: Variable Length On-Line Document Previous: Preserving Coherence in

Document Preparation

Before the text can be used for variable-length presentation, it needs to be marked up with its RST structure. To facilitate this step, we have developed an RST Markup Tool, which allows a user to:

  1. Segment the text.

  2. Graphically link these segments together into an RST-tree.

Text Segmentation

Each of these tasks has a separate interface within the tool. The segmentation interface is shown in figure 4. The "Sentences" and "Paragraphs" buttons automatically mark sentence and paragraph boundaries. If further segmentation is required, the user can switch into segmentation mode, in which a click at each desired boundary inserts a segmentation marker. To edit the text itself (rewording, correcting spelling errors, etc.), the user switches to Edit mode.

 

 


Figure 4: Text Segmentation Tool
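The segmentation step can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the function names (split_paragraphs, split_sentences, add_boundary) and the boundary rules (blank lines for paragraphs, end punctuation followed by a capital letter for sentences) are assumptions made for illustration only.

```python
import re

def split_paragraphs(text):
    """Assumed rule: paragraphs are separated by one or more blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(paragraph):
    """Naive assumed rule: a sentence ends at . ! or ? followed by
    whitespace and then a capital letter or an opening quote."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z\"'])", paragraph)
    return [s.strip() for s in parts if s.strip()]

def add_boundary(segments, index, offset):
    """Simulate a user click: split segments[index] at a character offset.
    Returns a new list; the input list is left unchanged."""
    seg = segments[index]
    left, right = seg[:offset].strip(), seg[offset:].strip()
    if not left or not right:
        return list(segments)  # a click at either edge adds no boundary
    return segments[:index] + [left, right] + segments[index + 1:]

text = "The tool segments text. It then links segments.\n\nA second paragraph follows."
# Automatic pass: paragraphs first, then sentences within each paragraph.
segments = [s for p in split_paragraphs(text) for s in split_sentences(p)]
# Manual pass: one extra boundary inside the second segment.
refined = add_boundary(segments, 1, 7)
```

A rule-based splitter like this will mis-handle abbreviations and similar cases, which is one reason a manual segmentation mode is useful alongside the automatic buttons.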

A problem occurs with embedded elements -- cases where a rhetorically dependent stretch of text occurs within another node. For instance, we might wish to treat the embedded clause in the following as dependent on the main clause: "John -- I think you know him -- is here for two weeks." At present, the interface does not handle such cases. A simple workaround is for the user to move the embedded text outside the enclosing text.
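The rearrangement involved in this workaround can be sketched as follows. In the tool the user does this by hand in Edit mode; the helper lift_embedded is hypothetical and assumes the embedded text is delimited by a pair of "--" markers.

```python
import re

def lift_embedded(sentence):
    """Move text enclosed in a pair of '--' markers out of its host.
    Returns [host, embedded], or [sentence] if no such pair is found."""
    m = re.search(r"\s*--\s*(.+?)\s*--\s*", sentence)
    if m is None:
        return [sentence]
    # Rejoin the two halves of the host around the removed material.
    host = sentence[:m.start()] + " " + sentence[m.end():]
    return [host.strip(), m.group(1).strip()]

parts = lift_embedded("John -- I think you know him -- is here for two weeks.")
```

The two resulting stretches of text can then be treated as separate segments, with the lifted clause later attached to the main clause as a satellite.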

Text Structuring

The second step of document preparation involves structuring the text. Another interface of the RST Markup Tool allows the user to connect the segments into a rhetorical structure tree, as shown in figure 5. We have followed the graphical style presented in Mann & Thompson (1987).

 

 


Figure 5: Text Structuring Tool

Initially, all segments are unconnected and laid out in order at the top of the window. The user can then drag the mouse from one segment (the nucleus) to another (the satellite). Upon releasing the mouse button, the system offers a menu of relations to choose from (the user can use the relation sets provided with the system or supply their own).

The system allows both plain RST relations and multi-nuclear relations (e.g., joint, sequence, etc.). Scoping is also possible, whereby the user indicates that the nucleus of a relation is not a segment itself, but rather a segment together with its satellites. See the figure "Scoping and multi-nuclear relations" below for an example of both a multi-nuclear structure and scoping. In addition, McKeown-style schemas (sometimes called story-grammars) can be used to represent constituency-type structures. See the figure "Constituent Structure" below.

  
Figure: Scoping and multi-nuclear relations

  
Figure: Constituent Structure
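One way to picture the structures described above is as a small set of node types. The following Python sketch is an assumption about a possible representation, not the tool's internal format; the class names (Segment, Relation, Span, MultiNuclear) and the relation labels in the example are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Segment:
    id: int
    text: str

@dataclass
class Relation:
    """A named link attaching one satellite to a nucleus."""
    name: str
    satellite: "Node"

@dataclass
class Span:
    """A nucleus together with its satellites. Using a Span as the
    nucleus of a further relation models scoping."""
    nucleus: "Node"
    satellites: List[Relation] = field(default_factory=list)

@dataclass
class MultiNuclear:
    """A relation such as joint or sequence, with several nuclei."""
    name: str
    nuclei: List["Node"]

Node = Union[Segment, Span, MultiNuclear]

def leaf_ids(node):
    """Collect segment ids, nucleus first, then satellites."""
    if isinstance(node, Segment):
        return [node.id]
    if isinstance(node, MultiNuclear):
        return [i for n in node.nuclei for i in leaf_ids(n)]
    ids = leaf_ids(node.nucleus)
    for rel in node.satellites:
        ids += leaf_ids(rel.satellite)
    return ids

s1, s2, s3, s4 = (Segment(i, f"segment {i}") for i in range(1, 5))
seq = MultiNuclear("sequence", [s1, s2])               # multi-nuclear
inner = Span(s3, [Relation("elaboration", s4)])        # segment plus satellite
outer = Span(inner, [Relation("background", seq)])     # scoped nucleus
```

Here the nucleus of the "background" relation is not segment 3 alone but segment 3 together with its "elaboration" satellite, which is the scoping case.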

The user can switch freely between text segmentation and text structuring modes, for example to edit text or to change segment boundaries. The system keeps track of the structure assigned so far. If the user deletes a segment while editing the text, the system discards the structuring information concerning that segment.
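The bookkeeping on deletion can be sketched as below, assuming (for illustration only) that segments are held in a dictionary keyed by id and that links are stored as (nucleus id, relation name, satellite id) triples. How the tool treats larger structures built over a deleted segment is not specified here; this sketch only drops links that touch the segment directly.

```python
def delete_segment(segments, links, seg_id):
    """Remove a segment and every link that has it as nucleus or satellite.
    Returns new containers; the inputs are left unchanged."""
    remaining = {i: t for i, t in segments.items() if i != seg_id}
    kept = [link for link in links if seg_id not in (link[0], link[2])]
    return remaining, kept

segments = {1: "first", 2: "second", 3: "third"}
links = [(1, "elaboration", 2), (1, "background", 3)]
segments_after, links_after = delete_segment(segments, links, 2)
```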

Because RST structures can become very elaborate, the RST Tool allows the user to collapse sub-trees, hiding the substructure under a node. This makes it easier, for instance, to connect two nodes that would not normally appear on the same page of the editor.
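Collapsing can be thought of as a per-node flag that the display routine consults. The sketch below assumes a hypothetical TreeNode class with a collapsed attribute; it illustrates the idea rather than the tool's own code.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TreeNode:
    label: str
    children: List["TreeNode"] = field(default_factory=list)
    collapsed: bool = False

def visible_labels(node):
    """List the nodes that would be drawn: a collapsed node is shown,
    but everything beneath it is hidden."""
    labels = [node.label]
    if not node.collapsed:
        for child in node.children:
            labels.extend(visible_labels(child))
    return labels

root = TreeNode("root", [
    TreeNode("A", [TreeNode("a1"), TreeNode("a2")]),
    TreeNode("B"),
])
before = visible_labels(root)
root.children[0].collapsed = True   # hide the substructure under A
after = visible_labels(root)
```

With the sub-tree under A hidden, A and B sit next to each other in the display, which is the situation that makes linking distant nodes easier.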

The user can save the present state of the screen as PostScript, for inclusion in LaTeX documents. Alternatively, a snapshot utility can be used to save selected parts of the structure in other formats. The structured text can be saved to a file, for later re-editing or for use in variable-length document presentation.






Mick O'Donnell
Mon Nov 18 18:41:07 GMT 1996