Altova MapForce 2024 Enterprise Edition

Template objects are the major building blocks of your design that enable you to define PDF extraction rules. The PDF Extractor supports the following objects:

 

Root/Document

Group/Filter

Split

Text Capture

Merge Source and Merge Target

Collage

Assignments

Ordered Choice and When

 

For step-by-step instructions on how to create a template and define its structure using objects, see Tutorial.

 

Insert an object

Except for the document root, which cannot be created or deleted, all the other objects can be added as follows:

 

If you have just created a template, the root element will appear at the top of the Schema pane, with the add children here option underneath the root (screenshot below). To add an object under the root element, right-click the root node or the add children here option and select an object from the context menu.

[Screenshot: PDFEX_RootElement]

If the Schema pane already contains a tree of objects and you would like to add another one, right-click the object relative to which the new object should be positioned, select Add, Insert Before, or Insert After, and then select an object from the context menu.

 

The Add and Insert After options behave differently depending on the object type. For Text Capture, Assignments, and Merge Source objects, both options have the same result: the new object is placed after the selected Text Capture, Assignments, or Merge Source object. For Group/Filter, Split, Merge Target, Collage, Ordered Choice, and When objects, however, the results of the two options differ.

For example, if your model tree has a Group/Filter object and you want to add a Text Capture as a child node of this Group/Filter object, right-click the Group/Filter object and select Add | Text Capture from the context menu. This places the new Text Capture as a child node directly under the Group/Filter node (screenshot below). If needed, you can then move the new object to a different location in the tree.

[Screenshot: PDFEX_AddObject]

However, if you want to insert this Text Capture after the Group/Filter tree, right-click the Group/Filter node and select Insert After | Text Capture from the context menu. The result is displayed in the screenshot below.

[Screenshot: PDFEX_InsertObject]
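The difference between Add and Insert After can be illustrated with a small tree model. The Python sketch below is not MapForce code and does not use any Altova API; the names Node, add, insert_after, and outline are hypothetical and exist only for illustration. It also assumes that Add appends the new object as the last child of a container, which the description above does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Object types that can hold children (the root plus the containers named above).
CONTAINER_KINDS = {
    "Root", "Group/Filter", "Split", "Merge Target",
    "Collage", "Ordered Choice", "When",
}


@dataclass(eq=False)  # eq=False: nodes are compared by identity, not by content
class Node:
    """Hypothetical stand-in for one object in the Schema pane tree."""
    kind: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def insert_after(selected: Node, kind: str) -> Node:
    """Insert After: the new object becomes the next sibling of `selected`."""
    parent = selected.parent
    if parent is None:
        raise ValueError("nothing can be inserted after the document root")
    new = Node(kind, parent=parent)
    parent.children.insert(parent.children.index(selected) + 1, new)
    return new


def add(selected: Node, kind: str) -> Node:
    """Add: a child for container objects, otherwise the same as Insert After."""
    if selected.kind in CONTAINER_KINDS:
        new = Node(kind, parent=selected)
        selected.children.append(new)  # assumption: appended as the last child
        return new
    return insert_after(selected, kind)


def outline(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, one line per object."""
    lines = ["  " * depth + node.kind]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


root = Node("Root")
group = add(root, "Group/Filter")
inner = add(group, "Text Capture")           # becomes a child of Group/Filter
outer = insert_after(group, "Text Capture")  # lands after the Group/Filter subtree
assignment = add(inner, "Assignments")       # non-container selected: placed after `inner`
print("\n".join(outline(root)))
```

In this model, the first Text Capture ends up inside the Group/Filter object, while the second one becomes a sibling that follows the whole Group/Filter subtree.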

In both cases, you can also add objects using the toolbar commands.

 

Create Text Capture, Split, and Merge Source from PDF View pane

In addition to the methods described above, Text Capture, Split, and Merge Source objects can also be added to the model tree from the PDF View pane: select an area of interest on a page of your PDF document, right-click the selected area, and select the type of object you would like to create (screenshot below).

[Screenshot: PDFEX_CreateObject]

 

Wrap/unwrap children

Besides creating individual objects, you can also place objects into the following containers: Group/Filter, Split, Merge Target, and Collage. Placing objects into these containers means that the objects will become child elements of the selected object (e.g., Collage). Wrapping objects into a parent object may be beneficial when you want to apply the same processing logic to a set of objects.

 

To wrap an object into a parent object, right-click the object in the Schema pane, select Wrap Into, and then select the relevant option. If you already have a group of child objects (e.g., Text Captures) wrapped into a parent object (e.g., Group/Filter), you can unwrap all the child objects at once by right-clicking their parent node and selecting Unwrap Children from the context menu. This action removes the parent object but leaves all the child objects intact. If there is another object at a higher level (e.g., Split) that was the parent of the removed object (i.e., Group/Filter), the child objects automatically become children of that higher-level object.
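The effect of Wrap Into and Unwrap Children on the tree can be sketched as follows. This is a hypothetical Python model, not MapForce code: Node, wrap_into, and unwrap_children are illustrative names, and the sketch assumes that a wrapper takes the position of the wrapped object and that unwrapped children take the position of the removed parent.

```python
class Node:
    """Hypothetical stand-in for one object in the Schema pane tree."""

    def __init__(self, kind, children=None):
        self.kind = kind
        self.parent = None
        self.children = []
        for child in children or []:
            self.attach(child)

    def attach(self, child, index=None):
        """Make `child` a child of this node, optionally at a given position."""
        child.parent = self
        if index is None:
            self.children.append(child)
        else:
            self.children.insert(index, child)


def wrap_into(node, container_kind):
    """Wrap Into: a new container replaces `node`, and `node` becomes its child."""
    parent = node.parent
    position = parent.children.index(node)  # identity-based: Node defines no __eq__
    del parent.children[position]
    wrapper = Node(container_kind)
    parent.attach(wrapper, position)  # assumption: wrapper keeps the old position
    wrapper.attach(node)
    return wrapper


def unwrap_children(container):
    """Unwrap Children: remove the container; its children move up one level."""
    parent = container.parent
    if parent is None:
        raise ValueError("the document root cannot be deleted")
    position = parent.children.index(container)
    # Replace the container with its children in the former parent's child list.
    parent.children[position:position + 1] = container.children
    for child in container.children:
        child.parent = parent
    container.children = []
    container.parent = None


# Root > Split > Group/Filter > two Text Captures
group = Node("Group/Filter", [Node("Text Capture"), Node("Text Capture")])
split = Node("Split", [group])
root = Node("Root", [split])

unwrap_children(group)                          # Text Captures now belong to Split
collage = wrap_into(split.children[0], "Collage")  # first Text Capture goes into a Collage
```

After these two calls, the Split object holds a Collage (containing one Text Capture) followed by the second Text Capture, and the Group/Filter object no longer appears in the tree.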

 

If you have a parent object with child objects under it, you can also choose to wrap all the child objects into another container. For example, if you have a Group/Filter object that has text captures as child objects, right-click the Group/Filter object, select Wrap All Children Into from the context menu, and then select the object into which you would like to wrap all the text captures.
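Wrap All Children Into can be pictured as inserting one new level between a parent and its existing children. The Python sketch below is a hypothetical model with made-up names (Node, wrap_all_children_into), not an Altova API, and it assumes the children keep their original order inside the new container.

```python
class Node:
    """Hypothetical stand-in for one object in the Schema pane tree."""

    def __init__(self, kind, children=None):
        self.kind = kind
        self.parent = None
        self.children = list(children or [])
        for child in self.children:
            child.parent = self


def wrap_all_children_into(parent, container_kind):
    """Move every child of `parent` into a new container placed under `parent`."""
    wrapper = Node(container_kind, parent.children)  # children are re-parented here
    wrapper.parent = parent
    parent.children = [wrapper]
    return wrapper


# Group/Filter with two Text Captures, as in the example above
group = Node("Group/Filter", [Node("Text Capture"), Node("Text Capture")])
collage = wrap_all_children_into(group, "Collage")
```

In this model, the Group/Filter object ends up with a single child, the new Collage, which in turn holds both Text Captures.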

 

Expression syntax of object properties

Most object properties accept expressions in a domain-specific expression language. For more information about expression syntax, see Expression Syntax.

 

© 2018-2024 Altova GmbH