Package org.docx4j.model.fields.merge
Class MailMerger
java.lang.Object
org.docx4j.model.fields.merge.MailMerger
- Direct Known Subclasses:
MailMergerWithNext
Perform a mail merge.
Instance values are merged into a docx containing
MERGEFIELD fields, producing output made up of
one copy of the input docx for each map of
input values.
The output can be a single docx or multiple docx files.
To produce a single docx, there are two approaches:
One is to use MergeDocx, which ensures each
constituent "document" doesn't affect its
neighbours (eg numbering will restart).
The other is the "poor man's" approach, which
concatenates the copies and just hopes for the best.
Images and hyperlinks should be ok, but numbering
will continue across copies, as will footnotes/endnotes.
From 3.0, there is some support for formatting switches
(date/time, numeric, and general), and basic
support for MERGEFORMAT.
LIMITATIONS:
- no support for text before (\b) and text after (\f) switches
- no support for \m and \v switches
- no support for multiple MERGEFIELD in a single instruction (eg MERGEFIELD CourtesyTitle \f " " MERGEFIELD FirstName \f " " MERGEFIELD LastName )
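Given these limitations, it can help to scan a template's field instructions before merging. The following is a minimal sketch using only the JDK. The class `MergeFieldPreflight` and its method are hypothetical helpers, not part of docx4j, and the tokenisation is deliberately naive (it splits on whitespace and ignores quoting).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MergeFieldPreflight {

    // Switches listed above as unsupported (Java strings for \b, \f, \m, \v).
    private static final String[] UNSUPPORTED_SWITCHES = {"\\b", "\\f", "\\m", "\\v"};

    private static final Pattern MERGEFIELD =
            Pattern.compile("\\bMERGEFIELD\\b", Pattern.CASE_INSENSITIVE);

    /** Returns a list of problems found in one field instruction; empty if none. */
    public static List<String> findProblems(String instr) {
        List<String> problems = new ArrayList<>();

        // Count MERGEFIELD keywords: more than one per instruction is unsupported.
        int count = 0;
        Matcher m = MERGEFIELD.matcher(instr);
        while (m.find()) {
            count++;
        }
        if (count > 1) {
            problems.add("multiple MERGEFIELD in one instruction");
        }

        // Report each unsupported switch once, however often it occurs.
        for (String token : instr.trim().split("\\s+")) {
            for (String sw : UNSUPPORTED_SWITCHES) {
                String message = "unsupported switch " + sw;
                if (token.equals(sw) && !problems.contains(message)) {
                    problems.add(message);
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(findProblems(" MERGEFIELD Kundenstrasse \\* MERGEFORMAT "));
        System.out.println(findProblems("MERGEFIELD A \\f \" \" MERGEFIELD B"));
    }
}
```

An empty result does not guarantee a clean merge; it only means none of the constructs listed above were spotted.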
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
protected static class MailMerger.FormTextFieldNames
    If we're converting MERGEFIELD to FORMTEXT, it is desirable to make the w:fldChar/w:ffData/w:name unique within the docx (though Word 2010 can still open the docx if they aren't), and to remove spaces
static enum MailMerger.OutputField
-
Field Summary
Fields
Modifier and Type, Field, Description
protected static MailMerger.OutputField fieldFate
private static org.slf4j.Logger log
-
Constructor Summary
Constructors
MailMerger()
-
Method Summary
Modifier and Type, Method, Description
protected static void canonicaliseStarts(ComplexFieldLocator fl, List<FieldRef> fieldRefs)
protected static String extractInstr(List<Object> instructions)
private static String extractLang(R resultsSlot)
    Extract language information from run parameters to be able to format month, day, week, etc.
static WordprocessingMLPackage getConsolidatedResultCrude(WordprocessingMLPackage input, List<Map<DataFieldName, String>> data)
    A "poor man's" approach, which generates the mail merge results as a single docx, and just hopes for the best.
static WordprocessingMLPackage getConsolidatedResultCrude(WordprocessingMLPackage input, List<Map<DataFieldName, String>> data, boolean processHeadersAndFooters)
    A "poor man's" approach, which generates the mail merge results as a single docx, and just hopes for the best.
protected static String getDatafieldNameFromInstr(String instr)
    Get the datafield name from, for example, <w:instrText xml:space="preserve"> MERGEFIELD Kundenstrasse \* MERGEFORMAT </w:instrText> or <w:instrText xml:space="preserve"> MERGEFIELD Kundenstrasse</w:instrText>
private static SectPr getDocumentSeparator(WordprocessingMLPackage template)
    Word uses the existing sectPr element, but adds a page numbering restart to it.
protected static String getTextInsideContent(ContentAccessor paragraph)
    Parse through all content inside the paragraph to concatenate all values inside a text
static void performMerge(WordprocessingMLPackage input, Map<DataFieldName, String> data, boolean processHeadersAndFooters)
    Perform merge on a single instance.
private static List<Object> performOnInstance(WordprocessingMLPackage input, List<Object> contentList, Map<DataFieldName, String> datamap, MailMerger.FormTextFieldNames formTextFieldNames)
private static List<List<Object>> performOverList(WordprocessingMLPackage input, List<Object> contentList, List<Map<DataFieldName, String>> data, MailMerger.FormTextFieldNames formTextFieldNames)
    The idea is to be able to perform a mail merge on content from main document part, or a header/footer etc.
protected static void recursiveRemove(ContentAccessor content, Object needToBeRemoved)
    To remove an object from the docx template
protected static void removeSimpleField
    Remove the field but preserve the paragraph and content around it
protected static void setFormFieldProperties(FieldRef fr, String ffName, String ffTextInputFormat)
static void setMERGEFIELDInOutput(MailMerger.OutputField fieldFate)
    What to do with the MERGEFIELD in the output docx.
-
Field Details
-
log
private static org.slf4j.Logger log
-
fieldFate
protected static MailMerger.OutputField fieldFate
-
-
Constructor Details
-
MailMerger
public MailMerger()
-
-
Method Details
-
getConsolidatedResultCrude
public static WordprocessingMLPackage getConsolidatedResultCrude(WordprocessingMLPackage input, List<Map<DataFieldName, String>> data) throws Docx4JException
A "poor man's" approach, which generates the mail merge results as a single docx, and just hopes for the best. Images and hyperlinks should be ok. But numbering will continue, as will footnotes/endnotes.
- Parameters:
input - the docx containing the MERGEFIELDs
data - one map of field values for each copy of the input
- Returns:
the mail merge results as a single docx
- Throws:
Docx4JException
-
getConsolidatedResultCrude
public static WordprocessingMLPackage getConsolidatedResultCrude(WordprocessingMLPackage input, List<Map<DataFieldName, String>> data, boolean processHeadersAndFooters) throws Docx4JException
A "poor man's" approach, which generates the mail merge results as a single docx, and just hopes for the best. Images and hyperlinks should be ok. But numbering will continue, as will footnotes/endnotes. [Advert:] If this isn't working for you, the commercial Enterprise Edition of docx4j (MergeDocx component) will solve your problems.
- Parameters:
input - the docx containing the MERGEFIELDs
data - one map of field values for each copy of the input
processHeadersAndFooters - process headers and footers in the FIRST section only. If you have multiple sections in your input docx, performMerge is a better approach
- Returns:
the mail merge results as a single docx
- Throws:
Docx4JException
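A minimal usage sketch of the three-argument overload follows. The file names and field values are invented for illustration, the field name Kundenstrasse is borrowed from the instruction example elsewhere on this page, and docx4j is assumed to be on the classpath.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.docx4j.model.fields.merge.DataFieldName;
import org.docx4j.model.fields.merge.MailMerger;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

public class ConsolidatedMergeSketch {

    /** Builds one map of field values per output copy (sample values are invented). */
    static List<Map<DataFieldName, String>> sampleData() {
        List<Map<DataFieldName, String>> data = new ArrayList<>();

        Map<DataFieldName, String> first = new HashMap<>();
        first.put(new DataFieldName("Kundenstrasse"), "Hauptstrasse 1");
        data.add(first);

        Map<DataFieldName, String> second = new HashMap<>();
        second.put(new DataFieldName("Kundenstrasse"), "Bahnhofstrasse 2");
        data.add(second);

        return data;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input: a docx containing a MERGEFIELD named Kundenstrasse.
        WordprocessingMLPackage input =
                WordprocessingMLPackage.load(new File("template.docx"));

        // One copy of the template per map, joined into a single package.
        // true = also process headers and footers (first section only).
        WordprocessingMLPackage output =
                MailMerger.getConsolidatedResultCrude(input, sampleData(), true);

        output.save(new File("merged.docx"));
    }
}
```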
-
getDocumentSeparator
private static SectPr getDocumentSeparator(WordprocessingMLPackage template)
Word uses the existing sectPr element, but adds a page numbering restart to it. TODO: investigate what it does with headers/footers.
- Parameters:
template -
- Returns:
-
performMerge
public static void performMerge(WordprocessingMLPackage input, Map<DataFieldName, String> data, boolean processHeadersAndFooters) throws Docx4JException
Perform merge on a single instance. This is the best approach if your input has headers/footers in multiple sections. If you are using MergeDocx, you can use that to join the instances into a single docx. WARNING: The input docx will be modified, so pass in a copy if that is a problem. Copying is left to the caller, since that can potentially be more efficient than doing it here.
- Parameters:
input -
data -
processHeadersAndFooters -
- Throws:
Docx4JException
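Because performMerge modifies its input, one simple way to produce a docx per record is to load a fresh package for each one. The sketch below does that. The file names, sample values and the `outputName` helper are hypothetical, and reloading from disk is only one way of obtaining a copy (not necessarily the most efficient).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.docx4j.model.fields.merge.DataFieldName;
import org.docx4j.model.fields.merge.MailMerger;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

public class PerRecordMergeSketch {

    /** Hypothetical naming scheme for the per-record output files. */
    static String outputName(int index) {
        return "merged-" + index + ".docx";
    }

    public static void main(String[] args) throws Exception {
        File template = new File("template.docx"); // hypothetical input docx

        // One map of field values per record (sample values are invented).
        List<Map<DataFieldName, String>> data = new ArrayList<>();
        for (String street : new String[] {"Hauptstrasse 1", "Bahnhofstrasse 2"}) {
            Map<DataFieldName, String> record = new HashMap<>();
            record.put(new DataFieldName("Kundenstrasse"), street);
            data.add(record);
        }

        int index = 1;
        for (Map<DataFieldName, String> record : data) {
            // Load a fresh package each time, since the merge changes it in place.
            WordprocessingMLPackage instance = WordprocessingMLPackage.load(template);
            MailMerger.performMerge(instance, record, true);
            instance.save(new File(outputName(index)));
            index++;
        }
    }
}
```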
-
performOverList
private static List<List<Object>> performOverList(WordprocessingMLPackage input, List<Object> contentList, List<Map<DataFieldName, String>> data, MailMerger.FormTextFieldNames formTextFieldNames) throws Docx4JException
The idea is to be able to perform a mail merge on content from the main document part, or a header/footer etc. We return a list of content lists, so that the consumer can choose whether to produce a single docx (via MergeDocx or otherwise), or a docx for each item in the list.
- Parameters:
input -
data -
- Returns:
- Throws:
Docx4JException
-
performOnInstance
private static List<Object> performOnInstance(WordprocessingMLPackage input, List<Object> contentList, Map<DataFieldName, String> datamap, MailMerger.FormTextFieldNames formTextFieldNames) throws Docx4JException
- Throws:
Docx4JException
-
extractLang
private static String extractLang(R resultsSlot)
Extract language information from run parameters, so that month, day, week, etc. can be formatted in abbreviated form according to the language specified by the lang element on the run containing the field instructions. It is also used to select language-specific DecimalFormatSymbols for number formatting.
- Parameters:
resultsSlot - the run (R)
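To illustrate why the language matters for number formatting, here is a minimal JDK-only sketch. It assumes the extracted language is a tag such as "de-DE"; the class and method names are hypothetical, and this is not docx4j's implementation.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class LangFormattingSketch {

    /** Formats a number using the decimal symbols of the given language tag. */
    public static String formatNumber(double value, String pattern, String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(locale);
        return new DecimalFormat(pattern, symbols).format(value);
    }

    public static void main(String[] args) {
        // Same value and pattern, different separators depending on the language.
        System.out.println(formatNumber(1234.5, "#,##0.00", "de-DE")); // 1.234,50
        System.out.println(formatNumber(1234.5, "#,##0.00", "en-US")); // 1,234.50
    }
}
```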
-
canonicaliseStarts
protected static void canonicaliseStarts(ComplexFieldLocator fl, List<FieldRef> fieldRefs) throws Docx4JException
- Parameters:
fl -
fieldRefs -
- Throws:
Docx4JException
-
getDatafieldNameFromInstr
protected static String getDatafieldNameFromInstr(String instr)
Get the datafield name from, for example, <w:instrText xml:space="preserve"> MERGEFIELD Kundenstrasse \* MERGEFORMAT </w:instrText> or <w:instrText xml:space="preserve"> MERGEFIELD Kundenstrasse</w:instrText>
-
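For getDatafieldNameFromInstr, the kind of extraction described can be sketched with a regular expression. This is an illustrative JDK-only approximation, not docx4j's actual code. The class name is hypothetical, and the handling of quoted names (such as "First Name") is an assumption about how instructions may be written.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DatafieldNameSketch {

    // MERGEFIELD followed by either a quoted name or a single unquoted token.
    private static final Pattern MERGEFIELD_NAME = Pattern.compile(
            "^\\s*MERGEFIELD\\s+(?:\"([^\"]*)\"|(\\S+))",
            Pattern.CASE_INSENSITIVE);

    /** Returns the datafield name, or null if the text is not a MERGEFIELD instruction. */
    public static String datafieldName(String instr) {
        Matcher m = MERGEFIELD_NAME.matcher(instr);
        if (!m.find()) {
            return null;
        }
        return m.group(1) != null ? m.group(1) : m.group(2);
    }

    public static void main(String[] args) {
        System.out.println(datafieldName(" MERGEFIELD Kundenstrasse \\* MERGEFORMAT "));
        System.out.println(datafieldName(" MERGEFIELD Kundenstrasse"));
    }
}
```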
extractInstr
protected static String extractInstr(List<Object> instructions)
-
removeSimpleField
Remove the field but preserve the paragraph and content around it.
- Parameters:
fr -
-
getTextInsideContent
protected static String getTextInsideContent(ContentAccessor paragraph)
Parse through all content inside the paragraph and concatenate all of its text values.
- Parameters:
paragraph - The paragraph, which may or may not contain data
- Returns:
All text inside the paragraph
-
recursiveRemove
protected static void recursiveRemove(ContentAccessor content, Object needToBeRemoved)
Remove an object from the docx template.
- Parameters:
content - Body (or other part) of the template
needToBeRemoved - The object that will be removed from the content
-
setMERGEFIELDInOutput
public static void setMERGEFIELDInOutput(MailMerger.OutputField fieldFate)
What to do with the MERGEFIELD in the output docx. Default is REMOVED. KEEP_MERGEFIELD will allow you to perform another merge on the output document. The AS_FORMTEXT options convert the MERGEFIELD to a FORMTEXT field. This is convenient if you want users to be able to edit the field, where editing is restricted to forms.
- Parameters:
fieldFate -
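Since this setter is static, the chosen value applies to every later merge in the same JVM. The sketch below shows one way to scope it: set KEEP_MERGEFIELD for a single merge, then restore the documented default of REMOVED. The class and the `keepingFields` helper are hypothetical and not part of docx4j.

```java
import java.util.concurrent.Callable;

import org.docx4j.model.fields.merge.MailMerger;
import org.docx4j.model.fields.merge.MailMerger.OutputField;

public class FieldFateSketch {

    /** Runs the given merge with MERGEFIELDs kept, then restores REMOVED. */
    static <T> T keepingFields(Callable<T> merge) throws Exception {
        MailMerger.setMERGEFIELDInOutput(OutputField.KEEP_MERGEFIELD);
        try {
            return merge.call();
        } finally {
            // Restore the default stated in the description above.
            MailMerger.setMERGEFIELDInOutput(OutputField.REMOVED);
        }
    }
}
```

A caller might wrap a merge such as `keepingFields(() -> MailMerger.getConsolidatedResultCrude(input, data, true))`, where `input` and `data` are prepared as for any other merge.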
-
setFormFieldProperties
protected static void setFormFieldProperties(FieldRef fr, String ffName, String ffTextInputFormat)
-