Package org.docx4j.diff
Class Differencer
java.lang.Object
org.docx4j.diff.Differencer
Capable of comparing a pair of:
- w:body (only lightly tested)
- w:sdtContent (used extensively)
- w:p (includes an algorithm aimed at producing a better diff)
See org.docx4j.samples.CompareDocuments for an example of how to use.
-
Field Summary
Fields
private Map<Relationship,Part> composedRels
protected static org.slf4j.Logger log
static Integer nextId
private String relsDiffIdentifier
    Because the resulting document might be built out of the results of a number of diffs, we need to be sure that the ids are unique across these diffs.
private static final SimpleDateFormat RFC3339_FORMAT
(package private) static ObjectFactory wmlFactory
(package private) static Templates xsltDiffx2Wml
(package private) static Templates xsltMarkupDelete
(package private) static Templates xsltMarkupInsert
-
Constructor Summary
Constructors
Differencer()
-
Method Summary
private static void addWord
    Add a word to a w:r's existing w:t
private static String combineAdjacent(XMLStreamReader reader)
private static R createRunStructure(String textVal, P existingP, int rIndex)
void diff(Body newer, Body older, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder)
void diff(P pl, P pr, Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft, RelationshipsPart docPartRelsRight)
    Compare two w:p objects, returning a result containing w:ins and w:del elements
void diff(P pl, P pr, Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft, RelationshipsPart docPartRelsRight, boolean preProcess)
    Compare two w:p objects, returning a result containing w:ins and w:del elements
void diff(SdtContentBlock cbNewer, SdtContentBlock cbOlder, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder)
private void diffWorker(Node newer, Node older, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder)
    This is private, in order to control what objects the user can invoke diff on.
private static String getDiffxOutput(String xml1, String xml2)
static final Integer getId()
    The id to be allocated to the ins/del
static int[] getParagraphRunTextWordCounts
static String getRunString(P p, int i)
private static int getWordCount(String sentence)
protected static P loadParagraph(String filename)
static void log
static void main
void markupAsDeletion(SdtContentBlock cbLeft, Result result, String author, Calendar date, RelationshipsPart docPartRelsRight)
void markupAsInsertion(SdtContentBlock cbLeft, Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft)
static void registerRelationship(Differencer pd, RelationshipsPart docPartRels, String relId, String newRelId)
    This is a Xalan extension function, invoked from diffx2wml.xslt. Any rel which is present in the results of the comparison must point to a valid target of the correct type, or the resulting document will be broken.
void setRelsDiffIdentifier(String relsDiffIdentifier)
static void setXsltDiffx2Wml(Templates xsltDiffx2Wml)
    org/docx4j/diff/diffx2wml.xslt will be used by default to transform the diff output into a Word docx with tracked changes.
private static int sum(int[] array, int idx1, int idx2)
private static Node toNode
    Converts the reader to a node.
private static String toRangeString(StringComparator sc, int start, int length, boolean space)
void toWML(String in, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder)
private void transformDiffxOutputToWml(Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft, RelationshipsPart docPartRelsRight, StreamSource src)
-
Field Details
-
log
protected static org.slf4j.Logger log -
wmlFactory
-
composedRels
-
RFC3339_FORMAT
-
xsltDiffx2Wml
-
xsltMarkupInsert
-
xsltMarkupDelete
-
nextId
-
relsDiffIdentifier
Because the resulting document might be built out of the results of a number of diffs, we need to be sure that the ids are unique across these diffs. This identifier is passed into the XSLT, where it is used as part of the generated rel id.
-
-
Constructor Details
-
Differencer
public Differencer()
-
-
Method Details
-
log
-
getComposedRels
-
setXsltDiffx2Wml
org/docx4j/diff/diffx2wml.xslt will be used by default to transform the diff output into a Word docx with tracked changes. This method allows you to use your own XSLT instead.
- Parameters:
xsltDiffx2Wml - the compiled stylesheet to use instead of the default
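A replacement stylesheet has to be supplied as a compiled javax.xml.transform.Templates object. The following is a minimal sketch of compiling one with the standard JAXP API. The class name CustomStylesheetSketch and the identity stylesheet are illustrative assumptions, not part of docx4j, and a real replacement would presumably need to produce the same kind of output as diffx2wml.xslt.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class CustomStylesheetSketch {

    // Hypothetical stand-in for your own stylesheet: a plain identity transform.
    static final String IDENTITY_XSLT =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Compiles stylesheet source into a reusable Templates object. */
    static Templates compile(String xslt) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            return factory.newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Stylesheet did not compile", e);
        }
    }

    /** Runs the compiled stylesheet once, to check it before handing it over. */
    static String apply(Templates templates, String xml) {
        try {
            StringWriter out = new StringWriter();
            templates.newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        Templates custom = compile(IDENTITY_XSLT);
        System.out.println(apply(custom, "<a><b>text</b></a>"));
        // With docx4j on the classpath, the compiled object would then be installed with:
        // Differencer.setXsltDiffx2Wml(custom);
    }
}
```

Because the default stylesheet calls a Xalan extension function (registerRelationship), a custom stylesheet that does the same would likely need to be compiled with a Xalan TransformerFactory rather than whichever factory newInstance() happens to return.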
-
getId
The id to be allocated to the w:ins or w:del element.
- Returns:
the id to allocate
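Each tracked change needs its own id, so the allocator has to hand out a fresh value on every call. A minimal sketch of such an allocator follows. The class RevisionIdSketch, the starting value, and the use of AtomicInteger are assumptions made for illustration and may differ from how Differencer manages its nextId field.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RevisionIdSketch {

    // Shared counter; starting at 0 is an assumption, not taken from docx4j.
    private static final AtomicInteger NEXT_ID = new AtomicInteger(0);

    /** Returns an id not handed out before in this JVM, then advances the counter. */
    static Integer nextRevisionId() {
        return NEXT_ID.getAndIncrement();
    }

    public static void main(String[] args) {
        Integer insId = nextRevisionId();
        Integer delId = nextRevisionId();
        System.out.println("w:ins id=" + insId + ", w:del id=" + delId);
    }
}
```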
-
setRelsDiffIdentifier
- Parameters:
relsDiffIdentifier - the relsDiffIdentifier to set
-
registerRelationship
public static void registerRelationship(Differencer pd, RelationshipsPart docPartRels, String relId, String newRelId)
This is a Xalan extension function, invoked from diffx2wml.xslt.
Any rel which is present in the results of the comparison must point to a valid target of the correct type, or the resulting document will be broken. So we pass the old and new rels objects, and progressively build up a List of relationships which will need to be in the resulting document.
Because the resulting document might be built out of the results of a number of diffs, we need to be sure that the ids are unique across these diffs.
-
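The bookkeeping described for registerRelationship can be sketched without docx4j as follows. Everything here is hypothetical: the class RelRegistrySketch, the ComposedRel holder, and in particular the naming scheme that appends the diff identifier to the original rel id are assumptions chosen to show the idea, not the scheme diffx2wml.xslt actually uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class RelRegistrySketch {

    /** Hypothetical record of one relationship that must exist in the result. */
    static final class ComposedRel {
        final String newRelId;
        final String type;
        final String target;

        ComposedRel(String newRelId, String type, String target) {
            this.newRelId = newRelId;
            this.type = type;
            this.target = target;
        }
    }

    private final String relsDiffIdentifier;
    private final List<ComposedRel> composedRels = new ArrayList<>();

    RelRegistrySketch(String relsDiffIdentifier) {
        this.relsDiffIdentifier = relsDiffIdentifier;
    }

    /** Assumed naming scheme: original id plus the per-diff identifier. */
    String newRelIdFor(String relId) {
        return relId + "_" + relsDiffIdentifier;
    }

    /** Records a relationship under its new id and returns that id. */
    String register(String relId, String type, String target) {
        String newRelId = newRelIdFor(relId);
        composedRels.add(new ComposedRel(newRelId, type, target));
        return newRelId;
    }

    List<ComposedRel> getComposedRels() {
        return Collections.unmodifiableList(composedRels);
    }

    public static void main(String[] args) {
        // Two separate diffs that both refer to "rId5" end up with distinct ids.
        RelRegistrySketch firstDiff = new RelRegistrySketch("1");
        RelRegistrySketch secondDiff = new RelRegistrySketch("2");
        System.out.println(firstDiff.register("rId5", "image", "media/image1.png"));
        System.out.println(secondDiff.register("rId5", "image", "media/image7.png"));
    }
}
```

Keeping the identifier per Differencer instance means each diff can be run independently, and the caller only has to make sure the identifiers themselves differ.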
diff
public void diff(P pl, P pr, Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft, RelationshipsPart docPartRelsRight)
Compare two w:p objects, returning a result containing w:ins and w:del elements.
- Parameters:
pl - the left paragraph
pr - the right paragraph
result - the Result to which the comparison output is written
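The author and date arguments describe who made the change and when. The class also declares an RFC3339_FORMAT field, which suggests the date is serialised as an RFC 3339 timestamp. The sketch below shows one way to format a Calendar that way; the pattern string, the choice of UTC, and the class name TrackedChangeDateSketch are assumptions, not values read from docx4j.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

final class TrackedChangeDateSketch {

    /** Formats the instant held by the calendar as an RFC 3339 UTC timestamp. */
    static String formatRfc3339(Calendar date) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat format =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date.getTime());
    }

    public static void main(String[] args) {
        Calendar date = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        date.clear();
        date.set(2020, Calendar.JANUARY, 2, 3, 4, 5);
        System.out.println(formatRfc3339(date));
    }
}
```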
-
diff
public void diff(SdtContentBlock cbNewer, SdtContentBlock cbOlder, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder) -
diff
public void diff(Body newer, Body older, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder) -
diffWorker
private void diffWorker(Node newer, Node older, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder)
This is private, in order to control what objects the user can invoke diff on. At present there are public methods for pairs of w:body, w:sdtContent, and w:p.
TODO: consider/test w:table!
-
toWML
public void toWML(String in, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder) -
transformDiffxOutputToWml
private void transformDiffxOutputToWml(Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft, RelationshipsPart docPartRelsRight, StreamSource src) throws Exception
- Parameters:
result -
author -
date -
docPartRelsLeft -
docPartRelsRight -
src -
- Throws:
Exception
-
markupAsInsertion
public void markupAsInsertion(SdtContentBlock cbLeft, Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft) -
markupAsDeletion
public void markupAsDeletion(SdtContentBlock cbLeft, Result result, String author, Calendar date, RelationshipsPart docPartRelsRight) -
diff
public void diff(P pl, P pr, Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft, RelationshipsPart docPartRelsRight, boolean preProcess)
Compare two w:p objects, returning a result containing w:ins and w:del elements.
- Parameters:
pl - the left paragraph
pr - the right paragraph
result - the Result to which the comparison output is written
-
sum
private static int sum(int[] array, int idx1, int idx2) -
addWord
Add a word to a w:r's existing w:t -
createRunStructure
-
toRangeString
-
loadParagraph
- Throws:
Exception
-
getParagraphRunTextWordCounts
-
getWordCount
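The page gives no description for getWordCount, getParagraphRunTextWordCounts or sum, so the following is only a guess at the kind of helper their signatures suggest: counting words per run and adding up a slice of those counts. The whitespace-based word definition and the inclusive index range in sum are both assumptions, and the class name WordCountSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class WordCountSketch {

    /** Counts whitespace-separated words; blank input counts as zero words. */
    static int wordCount(String sentence) {
        String trimmed = sentence.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** One count per run text, in run order. */
    static int[] wordCounts(List<String> runTexts) {
        int[] counts = new int[runTexts.size()];
        for (int i = 0; i < counts.length; i++) {
            counts[i] = wordCount(runTexts.get(i));
        }
        return counts;
    }

    /** Sums array[idx1..idx2]; treating both ends as inclusive is an assumption. */
    static int sum(int[] array, int idx1, int idx2) {
        int total = 0;
        for (int i = idx1; i <= idx2; i++) {
            total += array[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] counts = wordCounts(Arrays.asList("The quick ", "brown fox", " jumps"));
        System.out.println(Arrays.toString(counts));
        System.out.println(sum(counts, 0, 1));
    }
}
```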
-
getRunString
-
getDiffxOutput
-
toNode
Converts the reader to a node.
- Parameters:
xml - The reader on the XML.
isNSAware - Whether the factory should be namespace aware.
- Returns:
The corresponding node.
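A conversion of this shape can be sketched with the JDK's DOM parser. The sketch assumes the xml parameter is a java.io.Reader and that the returned node is the document element; neither is confirmed by this page, and ToNodeSketch is a hypothetical class name.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class ToNodeSketch {

    /** Parses the XML from the reader and returns its root element as a Node. */
    static Node toNode(Reader xml, boolean isNSAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Without namespace awareness, prefixed names are not resolved to URIs.
            factory.setNamespaceAware(isNSAware);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(xml)).getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<w:p xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\"/>";
        Node node = toNode(new StringReader(xml), true);
        System.out.println(node.getNamespaceURI() + " " + node.getLocalName());
    }
}
```

Namespace awareness matters for WordprocessingML, since elements such as w:p are matched by namespace URI rather than by prefix.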
-
combineAdjacent
- Throws:
XMLStreamException
-
main
- Throws:
Exception
-