The genetic algorithm we used takes the following form:
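The figure giving the algorithm is not reproduced here. As a rough sketch only, a steady-state genetic algorithm of the kind described might look like the following, with the representation-specific mutation, crossover and scoring functions as pluggable parameters; the population size, operator probabilities and the toy sequence-sorting objective below are illustrative assumptions, not the values used in the experiments.

```python
import random

def genetic_algorithm(initial, mutate, crossover, score,
                      pop_size=10, iterations=300):
    """Generic steady-state GA: each iteration produces new candidates by
    mutation or crossover, then keeps the best pop_size individuals."""
    population = [initial() for _ in range(pop_size)]
    for _ in range(iterations):
        if random.random() < 0.5:
            children = [mutate(random.choice(population))]
        else:
            children = crossover(random.choice(population),
                                 random.choice(population))
        population.extend(children)
        population.sort(key=score, reverse=True)
        population = population[:pop_size]    # elitist truncation
    return population[0]

# Toy instantiation: evolve a permutation of 0..7 towards sorted order.
items = list(range(8))

def initial():
    s = items[:]
    random.shuffle(s)
    return s

def mutate(seq):
    s = seq[:]
    i, j = random.sample(range(len(s)), 2)
    s[i], s[j] = s[j], s[i]          # swap two random positions
    return s

def crossover(a, b):
    # one-point "cut and repair" that keeps permutations valid
    cut = random.randrange(1, len(a))
    child = a[:cut] + [x for x in b if x not in a[:cut]]
    return [child]

def score(seq):
    # number of adjacent ascending pairs (maximal when fully sorted)
    return sum(1 for x, y in zip(seq, seq[1:]) if x < y)

best = genetic_algorithm(initial, mutate, crossover, score)
```

The elitist truncation means the best score in the population never decreases, which mirrors the monotone improvement visible in the iteration-limited runs reported below.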
The ordinal representation [Michalewicz 92] assumes an initial canonical sequence of facts (in the figure, this is 1,2,3,4). A given sequence is represented by a sequence of numbers, where the ith element gives the position of the ith element of the sequence in the canonical sequence with all previous elements deleted. The ith element is therefore always a number between 1 and n+1-i, where n is the length of the sequence. Mutation changes a random element to a random legal value. Crossover is implemented here as two-point crossover: the material between two random points of the sequences (the same points for both) is swapped over, yielding two new sequences. Since the legality constraint on the ith element depends only on i, such a swap always yields legal sequences. The ordinal representation has been used extensively for tasks such as the travelling salesman problem, and it has the advantage that the crossover operation is particularly simple.
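The encoding just described can be sketched as follows; the function names are illustrative, and the code simply follows the description above (1-based positions into a shrinking copy of the canonical sequence, with two-point crossover swapping the material between the same cut points in both codes).

```python
def ordinal_encode(seq, canonical):
    """Encode seq as 1-based positions into a shrinking copy of canonical."""
    remaining = list(canonical)
    code = []
    for x in seq:
        i = remaining.index(x)
        code.append(i + 1)       # position with all previous elements deleted
        remaining.pop(i)
    return code

def ordinal_decode(code, canonical):
    """Invert the encoding: repeatedly pick and delete the indicated element."""
    remaining = list(canonical)
    return [remaining.pop(c - 1) for c in code]

def two_point_crossover(a, b, i, j):
    """Swap the material between cut points i and j (same points for both)."""
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]
```

For example, with canonical sequence 1,2,3,4 the sequence 3,1,4,2 encodes as 3,1,2,1, and crossing any two legal codes at the same cut points again yields legal codes, since the constraint on each position depends only on its index.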
In many ways this is a more obvious encoding: a sequence of facts is represented simply as that sequence. The operations are chosen to reflect the intuition that order and adjacency information should generally be carried over from the old solution(s) to the new ones they give rise to. Mutation selects a random element, removes it from the sequence and reinserts it at a random position. Crossover inserts a random subsequence of one solution into another, deleting any duplicates that occur outside the inserted subsequence.
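These two operations can be sketched directly from the description; the code below is a minimal illustration (the `rng` parameter is an assumption added only to make the behaviour reproducible), and both operations preserve the set of facts in the sequence.

```python
import random

def mutate(seq, rng=random):
    """Remove a random element and reinsert it at a random position."""
    s = list(seq)
    x = s.pop(rng.randrange(len(s)))
    s.insert(rng.randrange(len(s) + 1), x)
    return s

def crossover(donor, recipient, rng=random):
    """Insert a random subsequence of donor into recipient, deleting
    duplicates that occur outside the inserted subsequence."""
    i = rng.randrange(len(donor))
    j = rng.randrange(i, len(donor)) + 1     # non-empty subsequence donor[i:j]
    sub = donor[i:j]
    rest = [x for x in recipient if x not in sub]
    k = rng.randrange(len(rest) + 1)         # random insertion point
    return rest[:k] + sub + rest[k:]
```

Note that, unlike two-point crossover on ordinal codes, this crossover keeps a contiguous run of the donor intact, which is what preserves adjacency information.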
Figure 4 shows an example text produced using the path encoding operations (for j-342540, after 2000 iterations, just under 2 minutes, score 180). The same remarks about hand editing apply as before.
Figure 5 summarises the results for subtree swapping and the two genetic algorithms on a set of examples, giving the mean and standard deviation of the scores over 10 runs of the system. The system was run with limits of 2000 and 4000 iterations around the main loop of the algorithm, which took about 2 and 4 minutes respectively. For each example problem we specify the number of facts, the number of elaboration relations and the number of non-elaboration relations. Note that there is no very clear basis for comparison between the algorithms, since each performs different operations during an ``iteration''. Nevertheless, since iterations take roughly the same amount of time, one can get a rough idea of the relative performance.
The example text is now in a single paragraph, with a clear link from each sentence to the previous ones. The numerical results show much less variability than before, mainly because the rigid tree-building constraints prevent really bad trees from being built, and so the worst results are less bad. The results are also significantly better than those for subtree swapping, with the edge-sensitive representation clearly winning.