AI-powered Data Mapping in MapForce
Current AI models have the potential to enhance data integration tasks in numerous ways. Some of the most significant advances relevant to data mapping and ETL center around AI-powered classification abilities.
Whether classifying natural human language inputs or other unstructured data such as images and audio, AI-powered systems excel at the types of categorization tasks that have historically been extremely challenging, time consuming, and error-prone. AI interfaces based on LLMs (large language models) are able to analyze the vast amounts of training data required to learn the intricate patterns, contexts, and nuances of language required to efficiently “understand” human generated speech and content.
In turn, the ability of AI systems to classify inputs across various domains can help organizations add value to their data in meaningful ways. This is especially applicable for enhancing data written to a database or other datastore during data integration or ETL processes, where the AI-provided data offers additional signals to inform business decision making.
Since many AI systems, such as OpenAI’s GPT-4, are available via API, it is immediately possible to integrate their functionality in data transformation projects in MapForce.
Using built-in, no-code tools to define web services requests in MapForce, it’s easy to set up calls to an API, including the OpenAI API, Azure OpenAI API, AWS AI Services, and so on, to enable AI-powered data processing in any data mapping project.
The broad steps for configuring AI functionality in MapForce include:
The following applications describe real-world implementations using AI to classify data during ETL or data integration processes.
Automating sentiment analysis of natural language has forever been a thorn in the side of data analysts, since machines lacked the required understanding of the less concrete aspects of human speech like context, sarcasm, ambiguity, and slang.
Since AI has largely overcome these limitations, it can analyze text data, such as customer reviews or social media posts, to determine the sentiment expressed in the text, whether it's positive, negative, or neutral. This classification can help companies understand customer feedback, gauge public opinion, and make data-driven decisions based on sentiment analysis results.
The data mapping project below uses AI to analyze incoming records in a support database and automatically determines whether an entry is positive, negative, constitutes a bug report, or should be considered as a feature request. The results are then written to the Customer Feedback database.
This article describes the steps required to build this type of AI-powered ETL functionality in MapForce.
Similar to text classification, image classification powered by AI is light years ahead of older technologies. For instance, in e-commerce, AI can automatically categorize product images into different classes or identify specific objects within images. This classification can aid in inventory management, search optimization, content organization, and so on.
In the data mapping example below, a product catalog database is enhanced with AI-powered image classification to add descriptive tags to product listings. This is particularly useful when the product name is ambiguous (e.g., “Mongoose,” which is a bicycle) or the description is missing altogether.
The mapping calls the Microsoft Azure Cognitive Services Computer Vision API to analyze product images and return a list of tags that will be added to the database. For instance, when analyzing an image of the product called Yellow Watermelon – which turns out to be a fishing lure – the tags returned are “bait, metalware”.
Learn how to create this data integration project using AI to classify images in this article.
The possibilities for using AI-based classification to add value to mapped data are many. In addition to image and sentiment classification described above, developers can use no-code, AI-powered tools in MapForce to automate: