Transforming and Converting Protobuf


MapForce supports protocol buffers (Protobuf) as a mapping source or target, so Protobuf data can be mapped to and from other structured data formats. In the constant quest for more efficient ways to transfer, manipulate, and manage large structured data sets, Google created Protobuf as a language- and platform-neutral data format that is similar to XML but smaller, faster, and simpler, even compared with JSON. Tools are available to generate and work with Protobuf using Java, Python, C++, C#, Ruby, and other programming languages.

The structure of any Protobuf message is defined in a .proto file that specifies each field's name and value type. Altova MapForce lets users drop these .proto files into a data mapping as a source or target along with any other data, including XML, JSON, relational databases, Excel, flat files, REST and SOAP web services, and others. Both version 2 and version 3 .proto files are supported.

A MapForce data mapping creates compatibility between existing XML, JSON, database or legacy data formats and new applications leveraging the efficiency of Protobuf.

To get started converting and transforming Protobuf, simply use the Insert menu or the quick access toolbar button to insert a .proto file into the mapping.

Data mapping protocol buffers in Altova MapForce

MapForce includes an example Protobuf data mapping, shown here:

Data mapping protocol buffers example in MapForce

The .proto file used as the output target matches the example described in the online documentation for the Persons contact list. The source data is an XML file with many additional elements not needed for this Protobuf stream. When the data mapping is executed, the necessary elements are extracted from the XML file to create the output stream.

Note that the file type for the Protobuf output is BLOB (Binary Large Object). MapForce allows developers to create protocol buffer data streams, or read protocol buffer input data, without the usual cycle of generating source code in Java, C++, or another language and then compiling and executing that code for each Protobuf binary based on a new .proto file.

View the Converted Data 

Clicking the Output button at the bottom of the main MapForce data mapping window executes the mapping, supplying the file Altova_Hierarchical.xml as the data source. The resulting data stream is opened in the output preview window in a JSON-like representation:

Output preview of data mapping protocol buffers
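
To make the relationship between the binary stream and a JSON-like view concrete, here is a minimal Python sketch that decodes a small Protobuf message by hand. It does not reflect how MapForce works internally. The message layout (a hypothetical Person with name = 1, id = 2, email = 3), the sample bytes, and the function names are all assumptions made for illustration, and only the varint and length-delimited wire types are handled.

```python
import json

# Hypothetical schema for illustration only: field number -> field name.
# It corresponds to a message such as:
#   message Person { string name = 1; int32 id = 2; string email = 3; }
FIELD_NAMES = {1: "name", 2: "id", 3: "email"}

# Hand-built sample message (an assumption, not MapForce output).
SAMPLE = b"\x0a\x08John Doe\x10\xd2\x09\x1a\x10jdoe@example.com"


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Read a base-128 varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def decode_message(buf: bytes) -> dict:
    """Decode varint (type 0) and length-delimited (type 2) fields."""
    decoded = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            # Assumes every length-delimited field in this schema is a string.
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        decoded[FIELD_NAMES.get(field_number, str(field_number))] = value
    return decoded


person = decode_message(SAMPLE)
print(json.dumps(person, indent=2))
```

Notice that the field names come from the schema, not from the stream: the binary carries only field numbers, which is why a .proto file is needed to produce a readable view.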

For one-off requirements, MapForce users can save the binary file via an option in the Output menu:

Save binary output created by data mapping protocol buffers

Here is a partial representation of the actual generated binary data as displayed in a common hexadecimal viewer tool:

View of binary output created from data mapping protocol buffers

The efficiency of the Protobuf stream is evident in the binary data. All the overhead of XML or JSON element names is removed, along with spaces, tabs, brackets, and other characters normally included to enable human readability.
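
The size difference can be illustrated with a short Python sketch that encodes one record by hand using the Protobuf wire format and compares it with the same record serialized as JSON. The Person layout, field numbers, and sample values are hypothetical, and the encoder is deliberately minimal: it covers only non-negative integers and strings, so treat it as an illustration of the idea, not a replacement for a real Protobuf library.

```python
import json

# Hypothetical message layout, for illustration only:
#   message Person { string name = 1; int32 id = 2; string email = 3; }


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Key (wire type 2), then byte length, then UTF-8 payload."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


def encode_int_field(field_number: int, value: int) -> bytes:
    """Key (wire type 0), then the value as a varint."""
    return encode_varint(field_number << 3) + encode_varint(value)


def encode_person(name: str, person_id: int, email: str) -> bytes:
    return (
        encode_string_field(1, name)
        + encode_int_field(2, person_id)
        + encode_string_field(3, email)
    )


record = {"name": "John Doe", "id": 1234, "email": "jdoe@example.com"}
binary = encode_person(record["name"], record["id"], record["email"])
as_json = json.dumps(record).encode("utf-8")

print(f"Protobuf bytes: {len(binary)}, JSON bytes: {len(as_json)}")
```

Each field costs a one-byte key plus its payload here, whereas the JSON version repeats every field name along with quotes, colons, and braces.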

Map and Transform Protobuf 

If you receive a Protobuf stream, you can map it to your internal enterprise data format. The image below shows a mapping to a database:

A protocol buffer data mapping where the protobuf is the source and the target is a database table.

The mapping uses several data processing and conversion functions to manipulate the incoming binary data to match the structure of the existing database table. MapForce supports data mapping to or from all popular relational and NoSQL databases.

The output of this mapping is a SQL script that inserts the data from the binary stream into the database:

A SQL script as the output of a protobuf to database mapping.

After the script is executed, we can verify the database contents using DatabaseSpy, Altova’s SQL editor:

Table contents after execution of the protobuf to database mapping.
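
For readers who want to picture the insert step in code, the following Python sketch loads a few already-decoded records into a table and reads them back. It uses an in-memory SQLite database purely for convenience; the table name, columns, and sample records are hypothetical and are not taken from the mapping shown above or from the SQL that MapForce generates.

```python
import sqlite3

# Hypothetical records, as if already decoded from a Protobuf stream.
people = [
    {"id": 1234, "name": "John Doe", "email": "jdoe@example.com"},
    {"id": 5678, "name": "Jane Roe", "email": "jroe@example.com"},
]


def load_people(connection: sqlite3.Connection, records: list[dict]) -> None:
    """Create the (hypothetical) target table and insert one row per record."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS person ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
    )
    # Named placeholders keep values out of the SQL text itself.
    connection.executemany(
        "INSERT INTO person (id, name, email) VALUES (:id, :name, :email)",
        records,
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
load_people(connection, people)

# Verify the table contents, much as one would in a SQL editor.
rows = connection.execute(
    "SELECT id, name, email FROM person ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```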

Automated Execution

Production workflows may require repeated execution of data mappings to generate new data streams based on the same .proto definition, but using different source data. In the first example above, a different XML instance document could be supplied. A data mapping from a database or from a REST web service to Protobuf may need to be executed on a regular schedule to pick up updated source data.

In those or other cases where repeated execution is required, MapForce users can save the data mapping as a MapForce Server execution file via a simple menu selection:

MapForce Server execution file for data mapping protocol buffers

The execution file defines the inputs, outputs, and any intermediary processing steps that must be applied to the data (including sorting, filtering, custom functions, or others) in a form optimized for execution in a server environment. MapForce Server automates execution of these compiled data mappings via the command line or an API.
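
As a rough idea of what scripted command-line execution might look like, here is a small Python sketch that builds and launches a server command for an execution file. It assumes the executable is named mapforceserver, is available on the PATH, and accepts a run command followed by the execution file; the file name XmlToProtobuf.mfx is hypothetical. Check the MapForce Server documentation for the exact command syntax and options on your platform.

```python
import subprocess
from pathlib import Path


def build_run_command(executable: str, mfx_file: Path) -> list[str]:
    """Build the argument list; the 'run' command is an assumption to verify."""
    return [executable, "run", str(mfx_file)]


def run_mapping(executable: str, mfx_file: Path) -> int:
    """Launch the server process and return its exit code."""
    completed = subprocess.run(build_run_command(executable, mfx_file), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Only prints the command; call run_mapping() where the server is installed.
    command = build_run_command("mapforceserver", Path("XmlToProtobuf.mfx"))
    print(" ".join(command))
```

Passing the arguments as a list, without a shell, avoids quoting problems when the execution file path contains spaces.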

MapForce Server may also be configured with FlowForce Server, RaptorXML Server, or StyleVision Server, depending on the needs of the enterprise. When MapForce Server operates under the management of FlowForce Server, data mappings are executed as FlowForce Server job steps that can be triggered at a specific time or time interval, or based on an event such as the arrival of a new file in a watched folder.

For Protobuf data mappings where the source is a database query or REST request, the query is executed as part of the mapping. When the input is a file such as a JSON or XML document, the new file is specified as a FlowForce job parameter at runtime.

You can try all of this now with a fully functional MapForce free trial.
