Transitioning Data Mapping Projects from Development through Testing and Production


Data mapping projects often mirror software development efforts, with distinct phases for design, testing, and deployment. This is especially true for ETL (Extract, Transform, Load) projects, where the mapping must be executed repeatedly as new data becomes available, and the stakes rise with large data sets. The Altova MissionKit and Server Software products provide Global Resources to define configurations for each project phase and transition smoothly between them.

Let’s take a look at an example based on a MapForce data mapping from a source file to a database.

The data mapping project shown below takes a CSV file containing one or more simple sales orders as input. Each order contains a product number and a quantity. The mapping calculates the total sale amount based on the current product price, generates a unique order number, and inserts the order into an existing database.

CSV to database data mapping project
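
To make the transformation concrete, the logic the mapping performs can be sketched outside MapForce in a few lines of Python. This is only an illustration under stated assumptions, not what MapForce generates: the CSV column names, the in-memory price table, the `orders` table layout, and the use of a UUID as the order number are all hypothetical, since the article does not specify them.

```python
import csv
import io
import sqlite3
import uuid
from decimal import Decimal

# Hypothetical sample data: the article does not show the CSV layout or prices.
SAMPLE_CSV = "product_no,quantity\nP-100,3\nP-200,1\n"
PRICES = {"P-100": Decimal("19.99"), "P-200": Decimal("5.00")}


def load_orders(csv_text, prices, conn):
    """Insert one row per CSV order and return the rows that were written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_no TEXT PRIMARY KEY, product_no TEXT, quantity INTEGER, total TEXT)"
    )
    written = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        quantity = int(row["quantity"])
        # Total sale amount = current product price x quantity.
        total = prices[row["product_no"]] * quantity
        # Assumption: a random UUID stands in for the unique order number.
        order_no = uuid.uuid4().hex
        record = (order_no, row["product_no"], quantity, str(total))
        conn.execute("INSERT INTO orders VALUES (?, ?, ?, ?)", record)
        written.append(record)
    conn.commit()
    return written


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the existing database
    for record in load_orders(SAMPLE_CSV, PRICES, conn):
        print(record)
    conn.close()
```

Note that the input text and the database connection are passed in as arguments rather than hard-coded. That separation is what makes it possible to swap the source file and target database per project phase.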

Developing a data mapping like this and incorporating it into an enterprise workflow requires three distinct steps: a developer or data professional designs the mapping, a quality assurance tester validates it, and finally the mapping is deployed to the production environment. Global Resources let the project stakeholders switch the project's source data file and target database for each phase without modifying the mapping itself.

Global Resources are portable references to files, folders, or databases that work as aliases. When stored as Global Resources, paths and database connection details become reusable and available across multiple Altova applications. The image below shows Global Resources that reference an input data file and a database.

Manage global resources configurations

Global Resources can also be organized into configurations. For instance, some data mapping projects require separate configurations for mapping design, testing, and production. Switching between configurations changes both the source data file and the target database.
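
Conceptually, an alias with configurations behaves like a two-level lookup: the alias name selects the resource, and the active configuration selects the concrete target. The Python sketch below models only that idea. It is not Altova's implementation or file format, and every alias, configuration name, path, and connection string in it is hypothetical.

```python
# All aliases, configuration names, paths, and connection strings below are
# hypothetical; they only illustrate the alias-per-configuration idea.
GLOBAL_RESOURCES = {
    "OrdersInput": {
        "Development": "C:/work/sample-orders.csv",
        "Testing": "//qa-share/orders/test-orders.csv",
        "Production": "//etl-share/orders/incoming.csv",
    },
    "OrdersDatabase": {
        "Development": "sqlite:///C:/work/orders-dev.db",
        "Testing": "sqlite:////qa-share/orders/orders-test.db",
        "Production": "sqlite:////etl-share/orders/orders.db",
    },
}


def resolve(alias, configuration):
    """Return the concrete target an alias points to under one configuration."""
    try:
        return GLOBAL_RESOURCES[alias][configuration]
    except KeyError:
        raise KeyError(
            f"no target for alias {alias!r} in configuration {configuration!r}"
        ) from None


def resolve_all(configuration):
    """Resolve every alias at once, as switching the active configuration does."""
    return {alias: resolve(alias, configuration) for alias in GLOBAL_RESOURCES}


if __name__ == "__main__":
    for name in ("Development", "Testing", "Production"):
        print(name, resolve_all(name))
```

The mapping refers only to the alias names, so changing the active configuration redirects both the input file and the database in one step.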

The image below shows a portion of the MapForce toolbar with the Global Resource configuration drop-down menu. This is where the user selects the active configuration.

Selecting a global resources configuration

In the data mapping itself, both the input file and the target database are defined to point to Global Resources. Shown below is the Component settings dialog for the source data file as a Global Resource:

Selecting a global resource as the input file for the data mapping project

The target database is also defined for each Global Resource configuration:

Defining a database in a global resource configuration for a data mapping project

The data mapping project designer initially works with a small sample data set and a copy of the database structure. When the mapping is complete, the developer executes it directly in MapForce, which generates and runs a SQL script to insert the data. The MapForce Output window reports the results:

SQL script execution result example for data mapping projects

For the testing phase, we want to execute the data mapping in MapForce Server, in a test environment that uses a different input file and a different copy of the database from those the developer originally worked with.

The developer compiles the mapping to a MapForce Server execution file from the MapForce File menu:

Data mapping projects file menu

The MapForce Server execution file contains the mapping and the Global Resources file and database references associated with the mapping but does not resolve any particular Global Resources configuration. This allows the Global Resources configuration to be selected at runtime. The Global Resources definitions are stored in an XML file named GlobalResources.xml on the mapping designer’s workstation. The designer would provide both the MapForce Server execution file and the Global Resources file for the testing phase.

The testing team uses the MapForce Server command line interface to execute the mapping in the desired configuration. Shown here is the generic form of the MapForce Server command line, with the mapping name and parameters for the Global Resources file and the Global Resources configuration:

A generic data mapping project execution command

And here is the actual command as it might appear in a command window:

A command to execute the data mapping project in the testing configuration
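
For teams that script their test runs, the command described above can be assembled programmatically. The sketch below rests on assumptions: the `run` subcommand and the `--globalresourcefile` and `--globalresourceconfig` option names are taken from the MapForce Server command-line documentation as best recalled and should be checked against your installed version, and the executable name, the file name `OrdersToDatabase.mfx`, and the configuration name `Testing` are hypothetical.

```python
import shlex


def build_run_command(mfx_file, resource_file, configuration,
                      executable="mapforceserver"):
    """Build the argument list for one MapForce Server execution.

    Assumption: the 'run' subcommand and both option names follow the
    MapForce Server CLI documentation; verify them for your version.
    """
    return [
        executable,
        "run",
        f"--globalresourcefile={resource_file}",
        f"--globalresourceconfig={configuration}",
        mfx_file,
    ]


if __name__ == "__main__":
    # Hypothetical file and configuration names, for illustration only.
    cmd = build_run_command("OrdersToDatabase.mfx", "GlobalResources.xml", "Testing")
    print(shlex.join(cmd))
    # To actually launch it: subprocess.run(cmd, check=True)
```

Keeping the configuration name as a parameter means the same script can serve the testing and production environments by changing a single argument.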

If the test results are satisfactory, the mapping and Global Resources can be deployed to FlowForce Server and executed in a FlowForce Server job, perhaps as part of a scheduled enterprise data import operation. The mapping is deployed from the main File menu shown above, via the Deploy to FlowForce Server option:

Deploying the data mapping project to FlowForce Server for automated execution

Global Resources are deployed from the Manage Global Resources dialog:

Deploying the data mapping project production configuration to FlowForce Server for automated execution

Each configuration is stored as a separate FlowForce Server object and referenced in a FlowForce Server job definition.

Download a free trial to smoothly transition data mapping projects through your own enterprise workflow stages!
