Altova MapForce 2024 Enterprise Edition

The Text Capture object enables you to extract text from a page of a PDF document. When you create a text capture, it appears in the model tree in the Schema pane and in the Output pane. You can optionally wrap a text capture in an XML tag by giving the capture a name, which helps you organize the tree in the Output pane into a meaningful structure (see the code listing below). The default name of a text capture is Capture. For information about how to add objects to the model tree, see Insert an Object.

 

<Invoice>
  <Header>GARDENING SERVICES INVOICE</Header>
  <BillTo>Oswald Grim
  Darkwood St. 17
  Boston, MA 02128
  +1-617-8767675</BillTo>
  <InvoiceNo>4560123</InvoiceNo>
  <Date>2023-09-05</Date>
  <...>
</Invoice>
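As a rough illustration of why named captures are useful, the sketch below reads an output document shaped like the listing above and looks up each capture by its tag name. It is not part of the product: the helper name `read_captures` and the `SAMPLE_OUTPUT` string are hypothetical, the sample leaves out the elided part of the listing so that it parses as well-formed XML, and it assumes that each named capture is a direct child of the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the listing above; the elided part is left out.
SAMPLE_OUTPUT = """<Invoice>
  <Header>GARDENING SERVICES INVOICE</Header>
  <BillTo>Oswald Grim
Darkwood St. 17
Boston, MA 02128
+1-617-8767675</BillTo>
  <InvoiceNo>4560123</InvoiceNo>
  <Date>2023-09-05</Date>
</Invoice>"""


def read_captures(xml_text):
    """Map each named capture (a child of the root) to its stripped text."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


captures = read_captures(SAMPLE_OUTPUT)
print(captures["InvoiceNo"])
print(captures["BillTo"].splitlines()[0])
```

Because every capture carries its own tag, downstream code can address values by name instead of by position. Captures that keep the default name Capture would collide in a dictionary like this one, which is one practical reason to name them.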

 

When you click a text capture in the model tree of the Schema pane, the capture is immediately highlighted in the PDF View pane (screenshot below), which makes it easy to locate the capture on the page. The highlighted area has a text label that matches the capture's name shown in the model tree and in the Output pane. You can also click elements or their values in the Output pane to see which objects they refer to on the page of your PDF document. For details, see Step 2 of the tutorial.

[Screenshot: a text capture highlighted in the PDF View pane]

Properties in the Properties pane

You can configure the following properties of the Text Capture object:

 

 

© 2018-2024 Altova GmbH