Working with Avro Big Data in Your Favorite XML Editor


Big Data trends have developers working with XML alongside other data formats such as JSON and Apache Avro, and XMLSpy supports both of these with dedicated editing views and functionality.

Let’s see how specialized Avro support in XMLSpy makes it easy to visualize and search Avro files and to edit Avro schemas. We’ll also look at some of the advantages of using RaptorXML Server for high-performance Avro processing.


What is Avro?

Apache Avro™ is a system for compact, fast, binary serialization of big data that is most often used within the Apache Hadoop framework. In addition to the advantages of its compact binary format, Avro is platform-independent and can be used to exchange data between programs written in different languages. The schema used to write the data always travels with it (an Avro data file stores the schema in its header), making it possible for any application to deserialize the data.
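To get a feel for why the binary format is so compact, here is a minimal Python sketch of how the Avro specification encodes long integers and strings: integers are zigzag-encoded and written as variable-length bytes, and a string is its UTF-8 byte length followed by the bytes themselves. The function names encode_long and encode_string are our own, chosen for illustration; they are not part of XMLSpy or of any Avro library.

```python
def encode_long(n: int) -> bytes:
    """Encode a signed 64-bit integer as Avro's binary encoding does (zigzag + varint)."""
    # Zigzag interleaves negative and positive values: 0, -1, 1, -2 -> 0, 1, 2, 3
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    # Emit 7 bits per byte; the high bit signals that more bytes follow
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its UTF-8 byte length (a long) followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


print(encode_long(-1).hex())        # 01
print(encode_long(64).hex())        # 8001
print(encode_string("foo").hex())   # 06666f6f
```

Small values fit in a single byte and field names are never repeated in the data, which is a large part of what keeps Avro files compact.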

Avro Logo (TM)

Logo trademark of the Apache Software Foundation

View and Edit Avro Schema

Avro schemas are written in JSON, and as such can be readily viewed and edited in the XMLSpy JSON editor, which lets you switch between Text View for text-based editing and Grid View for a graphical representation of the document’s structure.

The screenshot below shows an Avro schema in Text View, which provides line numbering, source folding, bracket matching, intelligent entry helpers, and other helpful functionality for editing the JSON, as well as built-in validation against the Avro spec.

[Screenshot: Avro Schema Editor]
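For readers who haven’t seen one, here is what a small Avro schema can look like. The record name, namespace, and fields are made up for illustration. The Python snippet only parses the text with the standard json module to show that a schema is ordinary JSON; it does not check the schema against the Avro specification the way an Avro-aware editor would.

```python
import json

# A hypothetical Avro record schema; every name in it is invented for this example.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Reading",
  "namespace": "example.sensors",
  "fields": [
    {"name": "sensor_id", "type": "string"},
    {"name": "timestamp", "type": "long"},
    {"name": "value", "type": ["null", "double"], "default": null}
  ]
}
"""

schema = json.loads(SCHEMA_JSON)

# Because the schema is plain JSON, its structure can be walked like any other dict.
field_names = [field["name"] for field in schema["fields"]]
print(schema["type"], schema["name"], field_names)
```

The union type ["null", "double"] with a null default is the usual Avro way to declare an optional field.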

Viewing and Searching Avro Files

Binary Avro files are often huge, and because the data is binary it can’t be read productively in a general-purpose text editor. To make this easier, developers can take advantage of the specialized Avro View in XMLSpy.

Below is a screenshot of the user-friendly Avro viewer, which uses a grid to display the Avro data structures in an easy-to-read tabular format.

[Screenshot: Avro Binary Viewer]

The Blocks Pane on the left-hand side lets you select any of the blocks of data, which are displayed by their index number, to view in the Data Pane.

You can also quickly search the entire file at once, and each instance of the search string will be highlighted both in the Data Pane and in any block that contains the string. Searching by regular expressions is also supported.

Since the Avro file includes the corresponding schema, that is also displayed at the top of the Blocks Pane. Click the arrow button to extract the Avro schema and view it in Text View, where you can also edit and save it as required.
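If you’re curious where the embedded schema and the blocks come from, the layout is defined by the Avro specification for object container files: a 4-byte magic marker, a metadata map that holds the schema under the key avro.schema, a 16-byte sync marker, and then a sequence of data blocks. The Python sketch below reads that structure using only the standard library. It is an illustration of the file format, not of how XMLSpy is implemented; the function names are our own, and numbering the blocks from zero is an arbitrary choice.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream):
    """Decode one zigzag, variable-length long from the stream."""
    shift = result = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("unexpected end of data while reading a long")
        result |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)  # undo the zigzag mapping


def read_bytes(stream):
    """Read a length-prefixed byte string."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of data while reading bytes")
    return data


def read_header(stream):
    """Return (schema, codec, sync_marker) from the start of a container file."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:          # a zero count ends the metadata map
            break
        if count < 0:           # negative count: a byte size follows, unused here
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    sync = stream.read(16)
    schema = json.loads(meta["avro.schema"])
    codec = meta.get("avro.codec", b"null").decode("utf-8")
    return schema, codec, sync


def iter_blocks(stream, sync):
    """Yield (index, object_count, byte_size) for each data block after the header."""
    index = 0
    while stream.read(1):                 # an empty read means a clean end of file
        stream.seek(-1, io.SEEK_CUR)
        count = read_long(stream)
        size = read_long(stream)
        stream.seek(size, io.SEEK_CUR)    # skip the (possibly compressed) payload
        if stream.read(16) != sync:
            raise ValueError(f"sync marker mismatch after block {index}")
        yield index, count, size
        index += 1
```

To try it on a file of your own (the file name here is a placeholder), open the file in binary mode, call read_header(f) to get the schema, and then loop over iter_blocks(f, sync) to list each block’s object count and size. Decoding the records inside a block additionally requires interpreting the schema and honoring the codec, which this sketch leaves out.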

The Avro Viewer also supports validation of the Avro binary against its schema.

These Avro tools are a great addition to XMLSpy for developers working with big data in any format. Now let’s ratchet up the processing power a notch for when you’re faced with a large volume of Avro files.

Avro Processing on RaptorXML Server

RaptorXML Server, Altova’s third-generation validation and processing engine, is perfectly suited to wrangle the huge amounts of data in Avro files. Built from the ground up to be optimized for parallel computing, RaptorXML includes a bevy of features that deliver hyper-performance, increased throughput, and efficient memory utilization to validate and process big data.

RaptorXML supports Avro in addition to XML, JSON, and XBRL. Commands are available to extract an Avro schema from an instance, validate Avro schemas, and validate Avro instances against their associated schemas.

Check out Avro support in XMLSpy now. You can also try RaptorXML Server free for 30 days.

 
