Class GuanoAlgorithm
- All Implemented Interfaces:
DiffXAlgorithm
Implementation note: this algorithm effectively detects the correct changes in the sequences, but will not necessarily return events that can be serialised as well-formed XML as they stand.
Known problem in this implementation: elements that contain themselves tend to generate events that are harder to serialise as XML.
This class is said 'fit' because it will adapt the matrix to the sequences that it is being given in order to improve performance.
Note: The name of this class comes from a contracted version of the features of this algorithm, as explained below:
- Weighted, each token is has a given weight;
- Symmetrical, when possible, the algorithm will try to choose a path that is symmetrical in regards to the arrangement of the tokens;
- Matrix, this class uses a matrix for its internal representation;
- Version:
- 11 May 2010
-
Field Summary
FieldsModifier and TypeFieldDescriptionprivate static final booleanSet totrueto show debug info.private ElementStateThe state of the elements.private intThe length of the LCS.private final intLength of the first sequence to compare.private final intLength of the second sequence to compare.private MatrixMatrix storing the paths.private final EventSequenceThe first sequence of events to test.private final EventSequenceThe second sequence of events to test. -
Constructor Summary
ConstructorsConstructorDescriptionGuanoAlgorithm(EventSequence seq0, EventSequence seq1) Creates a new DiffXAlgorithmBase. -
Method Summary
Modifier and TypeMethodDescriptionfinal EventSequenceReturns the first sequence used for the diff-x comparison.final EventSequenceReturns the second sequence used for the diff-x comparison.intlength()Returns the length of the longest common sequence.private voidprintLost(int i, int j) Print information when the algorithm gets lost in the matrix, ie when it does not know which direction to follow.voidprocess(DiffXFormatter formatter) Writes the diff sequence using the specified formatter.private voidprocessEmpty(DiffXFormatter formatter) Writes the diff sequence using the specified formatter when one of the sequences is empty.private static MatrixsetupMatrix(EventSequence s1, EventSequence s2) Determines the most appropriate matrix to use.
-
Field Details
-
DEBUG
private static final boolean DEBUGSet totrueto show debug info.- See Also:
-
sequence1
The first sequence of events to test. -
sequence2
The second sequence of events to test. -
length1
private final int length1Length of the first sequence to compare. -
length2
private final int length2Length of the second sequence to compare. -
matrix
Matrix storing the paths. -
estate
The state of the elements. -
length
private transient int lengthThe length of the LCS.
-
-
Constructor Details
-
GuanoAlgorithm
Creates a new DiffXAlgorithmBase.- Parameters:
seq0- The first sequence to compare.seq1- The second sequence to compare.
-
-
Method Details
-
length
public int length()Returns the length of the longest common sequence.- Specified by:
lengthin interfaceDiffXAlgorithm- Returns:
- the length of the longest common sequence.
-
process
Writes the diff sequence using the specified formatter.- Specified by:
processin interfaceDiffXAlgorithm- Parameters:
formatter- The formatter that will handle the output.- Throws:
IOException- If thrown by the formatter.
-
getFirstSequence
Description copied from interface:DiffXAlgorithmReturns the first sequence used for the diff-x comparison.- Specified by:
getFirstSequencein interfaceDiffXAlgorithm- Returns:
- the first sequence used for the diff-x comparison.
- See Also:
-
getSecondSequence
Description copied from interface:DiffXAlgorithmReturns the second sequence used for the diff-x comparison.- Specified by:
getSecondSequencein interfaceDiffXAlgorithm- Returns:
- the second sequence used for the diff-x comparison.
- See Also:
-
processEmpty
Writes the diff sequence using the specified formatter when one of the sequences is empty.The result becomes either only insertions (when the second sequence is empty) or deletions (when the first sequence is empty).
- Parameters:
formatter- The formatter that will handle the output.- Throws:
IOException- If thrown by the formatter.
-
setupMatrix
Determines the most appropriate matrix to use.Calculates the maximum length of the shortest weighted path if both sequences are totally different, which corresponds to the sum of all the events.
- Parameters:
s1- The first sequence.s2- The second sequence.- Returns:
- The most appropriate matrix.
-
printLost
private void printLost(int i, int j) Print information when the algorithm gets lost in the matrix, ie when it does not know which direction to follow.- Parameters:
i- The X position.j- The Y position.
-