Class Differencer

java.lang.Object
org.docx4j.diff.Differencer

public class Differencer extends Object
Capable of comparing a pair of:
- w:body (only lightly tested)
- w:sdtContent (used extensively)
- w:p (includes an algorithm aimed at producing a better diff)
See org.docx4j.samples.CompareDocuments for a usage example.
  • Field Details

    • log

      protected static org.slf4j.Logger log
    • wmlFactory

      static ObjectFactory wmlFactory
    • composedRels

      private Map<Relationship,Part> composedRels
    • RFC3339_FORMAT

      private static final SimpleDateFormat RFC3339_FORMAT
    • xsltDiffx2Wml

      static Templates xsltDiffx2Wml
    • xsltMarkupInsert

      static Templates xsltMarkupInsert
    • xsltMarkupDelete

      static Templates xsltMarkupDelete
    • nextId

      public static Integer nextId
    • relsDiffIdentifier

      private String relsDiffIdentifier
      Because the resulting document might be built from the results of several diffs, the ids must be unique across those diffs. This identifier is passed into the XSLT, where it is used as part of the generated rel id.
  • Constructor Details

    • Differencer

      public Differencer()
  • Method Details

    • log

      public static void log(String message)
    • getComposedRels

      public Map<Relationship,Part> getComposedRels()
    • setXsltDiffx2Wml

      public static void setXsltDiffx2Wml(Templates xsltDiffx2Wml)
      By default, org/docx4j/diff/diffx2wml.xslt is used to transform the diff output into a Word docx with tracked changes. This method lets you supply your own XSLT instead.
      Parameters:
      xsltDiffx2Wml - the compiled stylesheet to use in place of the default
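      A minimal sketch of producing a Templates object that could be passed to this method, using only the JDK's javax.xml.transform API. The class name CustomXsltSketch and the identity stylesheet are placeholders for illustration; a real replacement would need to emit WordprocessingML, and since the default stylesheet calls a Xalan extension function, it may need to be compiled with a compatible processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class CustomXsltSketch {

    // Placeholder identity stylesheet (copies its input unchanged).
    static final String IDENTITY_XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Compiles stylesheet source into a reusable Templates object. */
    static Templates compile(String xslt) {
        try {
            return TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Stylesheet did not compile", e);
        }
    }

    /** Runs the compiled stylesheet over an XML string and returns the output. */
    static String apply(Templates templates, String xml) {
        try {
            Transformer transformer = templates.newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        Templates custom = compile(IDENTITY_XSLT);
        // With docx4j on the classpath, the stylesheet would be installed with:
        // Differencer.setXsltDiffx2Wml(custom);
        System.out.println(apply(custom, "<a><b>x</b></a>"));
    }
}
```

      Because the setter is static, the replacement stylesheet would apply to every Differencer instance in the JVM.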
    • getId

      public static final Integer getId()
      The id to be allocated to a w:ins or w:del element.
      Returns:
      the id
    • setRelsDiffIdentifier

      public void setRelsDiffIdentifier(String relsDiffIdentifier)
      Parameters:
      relsDiffIdentifier - the relsDiffIdentifier to set
    • registerRelationship

      public static void registerRelationship(Differencer pd, RelationshipsPart docPartRels, String relId, String newRelId)
      This is a Xalan extension function, invoked from diffx2wml.xslt. Any rel present in the results of the comparison must point to a valid target of the correct type, or the resulting document will be broken. The old and new rels objects are therefore passed in, and the relationships that will need to be in the resulting document are built up progressively. Because the resulting document might be built from the results of several diffs, the ids must be unique across those diffs.
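      The bookkeeping described above can be illustrated with a simplified, JDK-only sketch. Everything in it is hypothetical: relationships are reduced to an id-to-target map, and the new id is formed by appending the diff identifier, whereas in Differencer the new rel id is generated in the XSLT and its exact format is not documented here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ComposedRelsSketch {

    // New rel id -> target, accumulated across any number of diffs.
    private final Map<String, String> composed = new LinkedHashMap<>();

    /**
     * Looks up relId in the source part's relationships and records its target
     * under an id that is unique to the given diff.
     * The "relId_diffIdentifier" format is an assumption made for this sketch.
     */
    String register(String diffIdentifier, Map<String, String> sourceRels, String relId) {
        String target = sourceRels.get(relId);
        if (target == null) {
            // A dangling rel would leave the resulting document broken.
            throw new IllegalArgumentException("No such relationship: " + relId);
        }
        String newRelId = relId + "_" + diffIdentifier;
        composed.put(newRelId, target);
        return newRelId;
    }

    Map<String, String> getComposed() {
        return composed;
    }

    public static void main(String[] args) {
        ComposedRelsSketch registry = new ComposedRelsSketch();
        Map<String, String> left = Map.of("rId1", "media/a.png");
        Map<String, String> right = Map.of("rId1", "media/b.png");
        // The same source id appears in two diffs but does not collide.
        System.out.println(registry.register("d1", left, "rId1"));
        System.out.println(registry.register("d2", right, "rId1"));
        System.out.println(registry.getComposed());
    }
}
```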
    • diff

      public void diff(P pl, P pr, Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft, RelationshipsPart docPartRelsRight)
      Compares two w:p objects, writing w:ins and w:del elements to the result.
      Parameters:
      pl - the left paragraph
      pr - the right paragraph
      result -
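      To show the kind of markup such a comparison yields, here is a simplified, JDK-only sketch that diffs two strings word by word with a longest-common-subsequence table and wraps changes in w:ins and w:del. It is not how Differencer works internally (that goes through a diff step and an XSLT transform); the class and method names are hypothetical, the left text is assumed to be the older one, and XML escaping and whitespace preservation are left out.

```java
import java.util.ArrayList;
import java.util.List;

class TrackedChangeSketch {

    /** Wraps one word in w:ins or w:del, carrying the id, author and date. */
    static String mark(boolean inserted, String word, int id, String author, String date) {
        String tag = inserted ? "w:ins" : "w:del";
        String text = inserted ? "w:t" : "w:delText";
        return "<" + tag + " w:id=\"" + id + "\" w:author=\"" + author
             + "\" w:date=\"" + date + "\">"
             + "<w:r><" + text + ">" + word + "</" + text + "></w:r></" + tag + ">";
    }

    /** Word-level LCS diff; words are assumed to contain no XML special characters. */
    static List<String> diffWords(String left, String right, String author, String date) {
        String[] a = left.split(" ");
        String[] b = right.split(" ");

        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[a.length + 1][b.length + 1];
        for (int i = a.length - 1; i >= 0; i--) {
            for (int j = b.length - 1; j >= 0; j--) {
                lcs[i][j] = a[i].equals(b[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        List<String> out = new ArrayList<>();
        int i = 0, j = 0, id = 0;
        while (i < a.length && j < b.length) {
            if (a[i].equals(b[j])) {
                out.add("<w:r><w:t>" + a[i] + "</w:t></w:r>");   // unchanged word
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add(mark(false, a[i++], id++, author, date)); // only in left
            } else {
                out.add(mark(true, b[j++], id++, author, date));  // only in right
            }
        }
        while (i < a.length) out.add(mark(false, a[i++], id++, author, date));
        while (j < b.length) out.add(mark(true, b[j++], id++, author, date));
        return out;
    }

    public static void main(String[] args) {
        for (String run : diffWords("the quick fox", "the slow fox",
                                    "editor", "2024-01-01T00:00:00Z")) {
            System.out.println(run);
        }
    }
}
```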
    • diff

      public void diff(SdtContentBlock cbNewer, SdtContentBlock cbOlder, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder)
    • diff

      public void diff(Body newer, Body older, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder)
    • diffWorker

      private void diffWorker(Node newer, Node older, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder)
      This is private, in order to control what objects the user can invoke diff on. At present there are public methods for pairs of w:body, w:sdtContent, and w:p. TODO: consider/test w:table!
    • toWML

      public void toWML(String in, Result result, String author, Calendar date, RelationshipsPart docPartRelsNewer, RelationshipsPart docPartRelsOlder)
    • transformDiffxOutputToWml

      private void transformDiffxOutputToWml(Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft, RelationshipsPart docPartRelsRight, StreamSource src) throws Exception
      Parameters:
      result -
      author -
      date -
      docPartRelsLeft -
      docPartRelsRight -
      src -
      Throws:
      Exception
    • markupAsInsertion

      public void markupAsInsertion(SdtContentBlock cbLeft, Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft)
    • markupAsDeletion

      public void markupAsDeletion(SdtContentBlock cbLeft, Result result, String author, Calendar date, RelationshipsPart docPartRelsRight)
    • diff

      public void diff(P pl, P pr, Result result, String author, Calendar date, RelationshipsPart docPartRelsLeft, RelationshipsPart docPartRelsRight, boolean preProcess)
      Compares two w:p objects, writing w:ins and w:del elements to the result.
      Parameters:
      pl - the left paragraph
      pr - the right paragraph
      result -
    • sum

      private static int sum(int[] array, int idx1, int idx2)
    • addWord

      private static void addWord(R r, String word)
      Add a word to a w:r's existing w:t
    • createRunStructure

      private static R createRunStructure(String textVal, P existingP, int rIndex)
    • toRangeString

      private static String toRangeString(StringComparator sc, int start, int length, boolean space)
    • loadParagraph

      protected static P loadParagraph(String filename) throws Exception
      Throws:
      Exception
    • getParagraphRunTextWordCounts

      public static int[] getParagraphRunTextWordCounts(P p)
    • getWordCount

      private static int getWordCount(String sentence)
    • getRunString

      public static String getRunString(P p, int i)
    • getDiffxOutput

      private static String getDiffxOutput(String xml1, String xml2)
    • toNode

      private static Node toNode(Reader xml, boolean isNSAware)
      Converts the reader to a node.
      Parameters:
      xml - The reader on the XML.
      isNSAware - Whether the factory should be namespace aware.
      Returns:
      The corresponding node.
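      A conversion of this kind can be sketched with the JDK's DOM parser. This is an illustration under assumptions, not the class's actual code: the name ToNodeSketch is hypothetical, the sketch returns the parsed Document as the node, and parse failures are rethrown unchecked.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ToNodeSketch {

    /** Parses the XML supplied by the reader and returns the resulting DOM node. */
    static Node toNode(Reader xml, boolean isNSAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness must be set before the builder is created.
            factory.setNamespaceAware(isNSAware);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(xml));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:p xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\"/>";
        Document doc = (Document) toNode(new StringReader(xml), true);
        System.out.println(doc.getDocumentElement().getNamespaceURI());
    }
}
```

      For WordprocessingML input, namespace awareness matters because elements such as w:p are identified by the namespace bound to the w prefix rather than by the prefix itself.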
    • combineAdjacent

      private static String combineAdjacent(XMLStreamReader reader) throws XMLStreamException
      Throws:
      XMLStreamException
    • main

      public static void main(String[] args) throws Exception
      Throws:
      Exception