ETL (Extract-Transform-Load) tools provide a mechanism for extracting data from external sources, transforming it to a normalized data format, and then loading it into an end target or repository.
With support for prevalent enterprise data formats (XML, JSON, databases, flat files, EDI, Protocol Buffers, etc.), MapForce is an extremely effective, lightweight, and scalable ETL tool. MapForce offers a straightforward, visual user interface that lets you easily load any supported mapping structures and then use drag and drop functionality to connect nodes and add data processing functions and filters, or use the visual function builder for more complex ETL projects.
In addition to its visual interface, MapForce is also accessible through a flexible Java or COM-based API, providing an integrated automation layer to your ETL implementations.
MapForce allows you to easily associate your ETL data structures using drag and drop functionality.
Advanced data processing filters and functions can be added via a built-in function library, and you can use the visual function builder to combine multiple inline and/or recursive operations in more complex ETL or data integration projects, and even save functions for use in other mapping projects.
MapForce also supports advanced ETL scenarios involving multiple input and output schemas, multiple source and/or target files, or advanced multi-pass data transformations.
Support for data streaming gives your ETL projects a huge performance boost with the ability to stream input from arbitrarily large XML, CSV, and FLF files and relational databases, and stream output to equally large XML, CSV, and FLF files or insert it into a database.
This built-in functionality means that MapForce can easily process massive data sets and ETL projects, limited only by the amount of disk space available on your local machine or accessible on a network.
In order to activate this feature, simply select the BUILTIN icon from the toolbar in the MapForce design pane.
With support for bulk database insert as well as direct data streaming, MapForce Server is also ideally suited for execution of ETL data mappings.
The MapForce interface facilitates data integration typically required in ETL projects with a graphical interface that includes many options for managing, visualizing, manipulating, and executing individual mappings and complex ETL mapping projects. Use the design pane to graphically define mapping components, add functions and filters for data manipulation, and drag connectors to transform between source and target formats
The design pane includes user-friendly features to help you to easily work with, identify, and redefine even the most complex mappings. For example, clicking an item name automatically selects it for connecting to another item. Pop-up prompts appear when you position your mouse over parts of the mapping connections where you can view additional information such as mapping target item(s) or datatype.
Connections are easily moved by clicking and dragging to the desired target, and duplicate connectors (from the same source to another target) can be created by dragging a connection while simply holding down the CTRL key. Activating the autoconnect child items icon automatically connects all child items of the same name under the parent item.
ETL Data mappings can have multiple inputs and multiple outputs and combine completely disparate data types.
MapForce includes a rich library of data processing functions to perform virtually any necessary data conversion required by the integration project, and a handy overview window lets you visualize an entire data mapping project and zoom in on specific areas as needed and indicates position within the map when you scroll through the design pane. Navigate even the largest data mapping project with ease!
In ETL applications, new information intended for import often arrives in a data format incompatible with the existing repository. MapForce offers unparalleled power and flexibility for advanced conversion and transformation, making it an efficient ETL tool for global and intra-enterprise data integration projects. You can easily integrate data from multiple files in different formats.
With support for automating data integration, MapForce is an ideal middleware product for connecting distributed applications in any local enterprise, Web-based workflow, or even Cloud architecture.
Three data formats are included in the ETL mapping illustrated below. The main source file is an Excel spreadsheet and the final output repository is a database. However, records in the SharesAndLeaves database column are stored as a JSON object. The MapForce ETL tool constructs the JSON object on the fly and includes it in the generated database insert statement, as highlighted in the output preview at the lower right of the screenshot.
Whether it is an XML or database schema, EDI configuration file, or XBRL taxonomy and beyond, MapForce integrates data based on data structures, regardless of underlying particular instance of content. This means that you can re-use data mappings as business data changes.
MapForce can generate MapForce execution files for execution by MapForce Server, or royalty-free data integration code in Java, C#, or C++ for automated reprocessing of data mappings. This lets you implement scheduled or event-triggered data integration/migration operations for inclusion in any reporting, e-commerce, or SOA-based applications. MapForce data integration operations can also be automated via a data integration API, or ActiveX control.
Data integration projects often require data manipulation to convert input data before it is consumed by the target system. MapForce provides an intuitive visual function builder, fully scalable data processing functions with built-in libraries, filters and conditions, and more, to empower you to easily manipulate data and integrate disparate formats.
MapForce does not depend on assembling a collection of adaptors or other extra-cost add-ons. Users have complete control over design and implementation of data integration and transformations of all datatypes, including any combination of XML, database, flat file, EDI, Excel, XBRL, and/or Web service data.
In any ETL scenario, incoming data needs to be harmonized with the established standards of the repository. Depending on the relevant data characteristics, items like name styles, date and time formats, decimal precision, currency, temperature and measurement scales, and other factors may need to be converted. MapForce provides an extensible library of data processing and conversion functions for filtering and manipulating data according to the needs of your ETL mapping project. To save time and leverage work you’ve already completed and tested, you can even import existing data conversion code or an XSLT 1.0 or 2.0 file for use as a function library. MapForce also includes a unique visual function builder to define and reuse custom functions that combine multiple operations.
MapForce can handle the most advanced conversion scenarios, allowing you to define rules based on conditions, Boolean logic, string operations, mathematical computations, SQL and SQL/XML statements, or any user-defined function. You can even use an existing Web service to look up or process data in any mapping.
MapForce Data Conversion:
Many of the built-in functions, such as concat, add, multiply, etc., support an unlimited number of parameters, making it easy to perform mathematical manipulations and combine multiple parameters. Aggregate functions allow you to perform computations on groups of data, including count, sum, min, average, join-string, and others. Conversion functions are provided to conveniently parse complex data types.
Functions in the core library are generalized and not specific to any type of output. Using these core functions, you can create XSLT 1.0/2.0, XQuery, Java, C++, or C# data conversion code by simply selecting the language(s) you require.
Intermediate variables are a special type of component that store an intermediate mapping result for further processing and can be used to solve various advanced mapping problems. An intermediate variable is equivalent to a regular (non-inline) user-defined function, and is a structural component without an instance file.
Inserting filters and conditions into a mapping allows you to select data from the source based on Boolean conditions.
The if-else condition in MapForce is equivalent to a switch statement in many programming languages, enabling you to easily control the flow of data in your mapping projects by matching a value to a selected criterion.
MapForce supports transformation input parameters, allowing outside parameters to affect mapping transformations. The transformation input parameters can be passed to the main mapping function created by the MapForce code generator in Java, C#, or C++.
MapForce includes powerful support for database ETL mapping, including mapping between any of database data and XML, flat files, EDI, Excel (OOXML), XBRL, Web services, and even other database formats.
When you load a database structure in the design window, MapForce automatically interprets the database schema, lets you pick available database tables and views, and recognizes table relationships.
Once you have loaded all of the content models required for your database ETL mapping, complete the mapping by simply dragging connecting lines between the source and target structures.
MapForce supports all major relational databases, empowering you to create graphical database mapping designs between database source data, data processing functions and filters, and other data structures of various types.
Database Mapping Features:
Many database ETL mappings require manipulation of data between the source and target based on Boolean conditions or SQL and SQL/XML statements. You may need to perform logical comparisons, mathematical computations, or string operations, check for database data of a particular value, and make other modifications to the data. In screenshot above, data processing functions appear as the boxes between the lines joining the source and the target data model.
Data processing functions enable you perform advanced database mappings on-the-fly for a multitude of real-world transformation requirements. You can, for example, construct database mappings that use XML or EDI messages to extract database rows based on filter criteria from the XML or EDI elements.
Once your mapping is defined, the built-in MapForce Engine allows you to view and save the results with one click.
Mappings to a database produce output in the form of SQL scripts (e.g., SELECT, INSERT, UPDATE, and DELETE statements) that are run against your target database directly from within MapForce.
EDI standards are a dominant format for e-commerce data exchange, giving organizations a fast and accurate method for exchanging transaction data. EDI preceded other integrated business technologies such as ERP, CRM, and many other supply chain enabling technologies, making data mapping and transformation important considerations when EDI messages are the data source for an ETL application.
MapForce is a graphical EDI mapper with native support for all major business data formats in use today, including XML, databases, flat files, Excel, Web services, as well as the EDIFACT, X12, HL7, NCPDP SCRIPT, IDoc, and PADIS EDI transaction sets.
MapForce simplifies EDI data integration by allowing you to visually define mappings between UN/EDIFACT, ANSI X12, HIPAA X12, Health Level 7 (HL7), NCPDP SCRIPT, SAP IDoc, IATA PADIS, or TRADACOMS and XML, databases, flat files, Excel, and other EDI systems, as well as Web services operations. This allows your organization to reap the benefits of exchanging information electronically – without increasing costs or the complexity of your application infrastructure.
MapForce allows you to define a single mapping to accommodate EDI files that contain multiple message types. Support is provided for EDI files containing multiple message types either in the source or target of your data mapping project, providing optimal flexibility for EDI mapping.
In the EDI X12 standard, the 997 acknowledgment message relays the status of an inbound interchange - confirming receipt of a transaction, transaction errors, etc.
MapForce can be easily configured to automatically create a mapping from your input X12 messages to 997 acknowledgments, helping you to define seamless trade links within and beyond your partner network.
MapForce lets you confirm the accuracy of EDI output from your mapping through validation of all EDI source and target components, and the mapping output. This helps you ensure that only valid EDI messages are processed.
Legacy data contained in existing text files can be extracted and updated to be consistent with an existing ETL repository. MapForce includes the unique FlexText utility for parsing and converting text files such as mainframe text reports, text-based log files, and other legacy text file types in mapping designs. With its visual interface, FlexText lets you insert an existing text file and extract the portions you want to convert in the MapForce mapping interface.
FlexText produces a template that is then loaded into MapForce, where individual text nodes can be converted to any combination of XML, database, EDI, XBRL, flat file, Excel, JSON, Protocol Buffers, and Web services data. By saving the configuration you create in FlexText, you can reuse the same template to convert multiple text files in multiple mappings.
FlexText allows you to create rules for text file conversion templates for ETL mapping. When you open a text file in the FlexText interface, the file is displayed in two blocks. The root block represents the original file, while the operation block (to its right) displays the data of the file in real-time as you extract the data you need.
The result of every operation you make is visible in real-time, so you can immediately see if you’ve achieved the desired result.
Legacy text files may contain useful data in CSV (comma-separated values) or FLF (fixed length field) formats inside a more complex flat file. FlexText allows you to directly extract such data using the CSV and FLF operations. After applying Split and other operations, you can store remaining CSV- or FLF-formatted fields by configuring the field names, lengths, etc.
FlexText allows you to isolate the data you need to access by removing non-relevant text, characters, and whitespace using split commands. Each split presents your data in two new blocks: one that contains the data you have split out, and another displays the modified view of your converted file. You can immediately see the result of each operation you perform.
FlexText supports Node and Ignore operations for further flexibility in constructing the information tree. An Ignore operation marks a block of text as irrelevant for conversion purposes and instructs MapForce to ignore it. The Node operation creates a new node in the information tree in MapForce so that you can properly represent the hierarchical nature of your text data when needed.
The Switch operation allows you to define multiple conditions for a single block of text. Data in the text file is passed to the associated container for use in your MapForce conversion only if it meets a defined condition.
FlexText also supports for regular expressions. For instance, an input file could be a system-generated report with numbers and letter codes in the left margin that indicate record types where a sequence of any five digits followed “O” indicates the beginning of a new section for one office location.
Altova MapForce Server includes the built-in data transformation engine developed for MapForce and is greatly enhanced to operate in server environments. MapForce Server performs data transformations for any combination of XML, JSON, database, EDI, XBRL, flat file, Excel, and/or Web service using preprocessed and optimized data mappings stored in execution files based on data mappings defined in MapForce. MapForce Server takes data transformation to the next level with server capabilities including parallel processing and multi-threading, bulk SQL merge, cross-platform support, and more.
Preprocessing enables faster performance and reduced memory footprint for most data mappings. MapForce Server operates under the management of FlowForce Server, in a standalone configuration executed from a command line, or programmatically via an API.
After a MapForce ETL mapping is designed and tested, it can be executed by MapForce Server to automate business processes that require repetitive data transformations.
MapForce pre-processes and optimizes ETL data mappings, stores them in MapForce Server Execution files for command-line execution by MapForce Server, and uploads them for use in FlowForce Server jobs.
When MapForce Server operates under the management of FlowForce Server, data mappings are executed as FlowForce Server job steps, based on triggers defined as part of the FlowForce Server job. For example, a new XML instance document lands in a directory, which triggers a multi-step FlowForce Server job to first validate the file using RaptorXML+XBRL Server, then execute MapForce Server to extract certain data from the XML and insert it into an ETL repository database.