Posts

Internationalization with the Altova MissionKit


The following post is written by Peter Reynolds, CEO and translation management consultant at TM-Global and Executive Director of Kilgray Translation Technologies. An Irish national based in Warsaw, he holds a BSc and an MBA degree from Open University and is a localization and translation industry veteran. Peter previously worked at Idiom Technologies Inc. — now SDL PLC. As director of the LSP Partner Program at Idiom, Peter was responsible for making its global LSP partners program a successful and innovative venture. Before Idiom, he worked on language technology development for several global localization companies: Lionbridge, Bowne Global Solutions and Berlitz GlobalNET. He managed the Dublin development team responsible for BerlitzIT, Elcano, Freeway 2.0 technology solutions, and internal project and vendor management tools. Peter has been actively involved in the development and promotion of standards (notably XLIFF) for more than ten years, mostly at OASIS. Until 2008 when XLIFF was published, he was secretary of the XLIFF Technical Committee at OASIS and chaired the Translation Web Services TC. He is currently involved in OASIS, TILP as well as being the Irish expert to ISO SC2 and SC4 and training auditors for the EN 15038 standard.

Introduction

Every developer wants his or her applications to be used and hopes they will be very popular. A web application developed in rural Maine USA could easily be used by someone living in the next township or in Malaysia, New Zealand, Germany or Poland. Even if the application is not translated (localized), there are some important differences between how data is represented from one locale to another. The W3C definition of internationalization is “the design and development of a product that is enabled for target audiences that vary in culture, region, or language”. This does not mean that the product has to be translated into the language of the target audience but that it is designed in such a way that the target audience can use the application and understands the way data is presented. The reason for internationalization is to ensure the widest possible audience for your application and to make its translation easier and less costly. This article will introduce you to internationalization and demonstrate how applications can be internationalized using the Altova MissionKit, an integrated suite of XML, database, and UML tools including XMLSpy, StyleVision, MapForce, and others. If you are using tools such as XMLSpy and StyleVision it is very likely that you are already creating internationalized XML applications. The strategy which I suggest is that you try and figure out what target audience your applications are intended for beforehand and implement internationalization accordingly. In this article I will first discuss a strategy for internationalizing XML. I will then introduce the Internationalization Tag Set and examine issues relating to XML internationalization.

Strategy for Internationalizing XML

The first step in planning internationalization is to make an informed decision as to the level of internationalization you require. There may be people in your organization who can help you make this decision, and it would be particularly useful to obtain input from people who live in different countries. The three-level approach presented below should help you decide on the level of internationalization you are going to implement. Remember that you may encounter problems if your documents or applications are not internationalized, and that you will avoid those problems if you ensure that they are fully internationalized. The three levels of internationalization are:

  • Level 1 – Your applications are likely to have a relatively small audience, which could grow, but the applications are unlikely to be translated or used internationally. In that case you should just follow the suggestions in this article and ensure that you use the functionality in Altova MissionKit to support internationalization.
  • Level 2 – Your applications will have a wide audience and could be translated and used internationally. As well as using the Altova MissionKit functionality you should also use the Internationalization Tag Set. This is a schema released by the W3C for the purpose of internationalization.
  • Level 3 – Your applications are most likely to be used internationally and translated into a number of different languages. You should consider how to improve the localization process by separating content from code and ensuring the translators can see the document or application as the end user would see it. This is beyond the scope of this article but you will find some relevant information on the subject in the references below.

The software tools in the Altova MissionKit have a lot of functionality which supports internationalization. If you are using these tools you have a very strong basis for creating internationalized XML documents. Unicode is the default encoding for applications created in the XMLSpy XML editor, and I would strongly recommend using this character set.

Internationalization Tag Set

The Internationalization Tag Set (ITS) is recommended by W3C and designed to create XML which is internationalized and can easily be localized. If you are working with XML documents which might be localized, I would recommend using ITS. With this technology you are able to specify which text requires translation, provide instructions for translators and specify the direction of the text. The seven data categories included in the ITS are:

  • Translate: Defines which parts of a document are translatable.
  • Localization Note: Provides notes and helpful information for translators.
  • Terminology: Identifies terms in the documents.
  • Directionality: Indicates the direction which the document or part of the document is written and should be read.
  • Ruby: Indicates which parts of the document should be displayed as ruby text. (Ruby is a short run of text alongside a base text, typically used in East Asian documents to indicate pronunciation or to provide a brief annotation.)
  • Language Information: Identifies language used for the different parts of the document.
  • Elements Within Text: Indicates how elements should be treated with regard to linguistic segmentation.

W3C has published a best practices guide for internationalizing XML documents which details how to use ITS. It can be found on their web site at: http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070427/ The ITS specification itself can be found at: http://www.w3.org/TR/2007/REC-its-20070403/ I would strongly recommend you read these documents before proceeding with internationalization.

Internationalization Issues

The following table describes some of the internationalization issues you may come across. This will be followed by a more detailed explanation of these issues and suggestions for how they can be resolved using the Altova MissionKit.

ISSUE | DESCRIPTION
Encoding | Characters need to be supported by the code page being used. Unicode is an encoding which supports characters from all common languages.
Date & Time | How dates and times are represented varies between countries.
Numbers | How decimal points and thousands are represented varies between countries.
Currency | As well as differences in how the number is represented, in some countries the currency symbol or word is written after the number while in most it is written before.
Salutation & Names | There are many differences in salutations between countries, and in some countries, such as Hungary, a person’s name is written with the family name first. Middle names are not used in Japan.
Address | There are a number of differences relating to addresses, such as the house number appearing before the street name in some countries and after it in others. Also, some countries use a ZIP code and others a postal code.
RTL | Text in many languages is read from left to right, but in some, such as Hebrew and Arabic, the text is read from right to left (bi-directional).
Sorting & Collation | There are differences in how alphabets are sorted. Some Scandinavian languages have an ‘aa’ character which is usually, but not always, sorted at the end of the alphabet.
Exclamation & Question Marks | In English, question marks and exclamation marks are always at the end of the sentence, while in Spanish there is a question mark at the beginning and end of a question.


Encoding

All electronic text uses a character coding system in which each character is represented by a number. Before the widespread use of Unicode this was one of the most significant internationalization issues. When an application tries to show a character that is not represented in a code page, it will appear as garbage text. There were problems not only between different languages but also with characters appearing incorrectly on computers running different operating systems. Unicode has solved most of these problems by creating a single code page regardless of platform, program or language. XML uses Unicode as its default code page. Any XML documents you create in XMLSpy will by default have the declaration encoding="UTF-8". If the file has not been created in XMLSpy, you need to ensure that the file is saved as UTF-8. UTF is an acronym for Unicode Transformation Format, and UTF-8 is a flavor of Unicode that uses 1 to 4 bytes to store each character. It is the most commonly used flavor and is very widely used for XML and the Web. The other versions of Unicode which XMLSpy supports are:

  • UTF-7. This is a 7-bit version of Unicode. It should only be used in the context of 7-bit transports, such as email.
  • ISO 10646 UCS-2 and UTF-16. UCS is an acronym for Universal Character Set, and UCS-2 uses two bytes for each character. UTF-16 is an extension of UCS-2 which uses 2 or 4 bytes to represent a character. UTF-16 is often used by Windows and Java. You should use UTF-16 rather than UCS-2 for new documents.
  • ISO 10646 UCS-4. Uses 4 bytes for each character and is the same as UTF-32. UTF-32 is often used by Unix.
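To get a feel for how these encodings differ, it can help to encode a few characters and count the bytes each one needs. The following Python sketch is for illustration only: the sample characters are my own choice and the code has nothing to do with how XMLSpy works internally.

```python
# Sample characters chosen for illustration (an assumption, not from XMLSpy).
samples = ["A", "é", "€", "😀"]

def byte_lengths(ch):
    """Return the number of bytes a single character needs in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The "-le" variants avoid counting a byte order mark.
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }

for ch in samples:
    print(repr(ch), byte_lengths(ch))
```

The letter A needs a single byte in UTF-8, while the euro sign needs three and the emoji four. In UTF-32 every character takes four bytes.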

There may be reasons for using a default encoding other than UTF-8. To set the default encoding in XMLSpy, go to Tools | Options and select the Encoding tab. If you want to change the encoding for an individual XML document, open the document in XMLSpy and select File | Encoding.

Language

The XML namespace defines xml:lang to identify the language of an XML document. The value for xml:lang must be a language tag as defined by IETF BCP 47, which is usually built from an ISO 639 language code, optionally followed by a region subtag (for example, en or en-US). If you have an XML document which is written in one language but has a segment in another language, you can use xml:lang at the root element to identify the main language of the document and use it again at the element containing the text in the other language.
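As a rough illustration of how xml:lang is inherited, the Python sketch below parses a small made-up document and reports the language in effect for each element. The document, element names and helper function are hypothetical and only serve to show the idea.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: English by default, with one French phrase.
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<article xml:lang="en">
  <p>The waiter said <q xml:lang="fr">bon appétit</q> and left.</p>
  <p>Nobody replied.</p>
</article>"""

def effective_languages(xml_text):
    """Return (tag, language) pairs, inheriting xml:lang from ancestors."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    result = []

    def walk(element, inherited):
        # An element without its own xml:lang takes its parent's value.
        lang = element.get(XML_LANG, inherited)
        result.append((element.tag, lang))
        for child in element:
            walk(child, lang)

    walk(root, None)
    return result

for tag, lang in effective_languages(DOC):
    print(tag, lang)
```

Only the q element reports fr; every other element falls back to the en declared on the root.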

Dates

In different countries dates and time are represented in very different ways. Let’s take as an example the date 10/09/08:

In most European countries this means the 10th of September 2008.
In the United States this means the 9th of October 2008.
In Japan this means the 8th of September 2010.

The way to deal with this is to use ISO 8601 for specifying date and time within your application. This is a standard way for representing date and time in the format YYYY-MM-DDTHH:MM:SS[±HH:MM] where

YYYY – represents year
MM – represents month
DD – represents day
T – signifies that time follows
HH – represents hours
MM – represents minutes
SS – represents seconds
±HH:MM – represents the optional offset from UTC
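The ambiguity, and the way ISO 8601 removes it, can be sketched in a few lines of Python. The convention names and format strings below are my own labels for illustration and are not part of StyleVision.

```python
from datetime import datetime

AMBIGUOUS = "10/09/08"

# Three conventions for reading the same digits (labels are illustrative).
CONVENTIONS = {
    "day/month/year": "%d/%m/%y",
    "month/day/year": "%m/%d/%y",
    "year/month/day": "%y/%m/%d",
}

def interpretations(text):
    """Parse the same string under each convention and return ISO 8601 dates."""
    return {name: datetime.strptime(text, fmt).date().isoformat()
            for name, fmt in CONVENTIONS.items()}

print(interpretations(AMBIGUOUS))

# An ISO 8601 timestamp with a UTC offset has only one possible reading.
stamp = datetime.fromisoformat("2008-09-10T14:30:00+02:00")
print(stamp.year, stamp.month, stamp.day, stamp.utcoffset())
```

The same eight characters yield three different dates, whereas the ISO 8601 string can be stored once and formatted for each audience at display time.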

You can then use StyleVision to create a style sheet which formats the date in a way suitable to your target audience. StyleVision is a graphical stylesheet design tool that allows drag-and-drop design of XSLT and XSL-FO stylesheets to render XML data in HTML, Microsoft Word, PDF, and other formats. To use the date formatting functionality within StyleVision:

  • Select the contents placeholder or input field of the node.
  • In the Properties sidebar, select the content item, and then the Content group of properties.
  • Click the Edit button of the Input Formatting property.
  • The Input Formatting dialog will appear.
  • Select the Formatted radio button. This will allow you to choose which data type you would like to use, and if you have selected a date, you can then choose the format for the date.

You can also select other date and time formats here. I would strongly recommend using the date picker. In order to insert the date picker, the cursor must be inside an xs:date or xs:dateTime node. You then go to Insert on the main menu and select Insert Date Picker. If the cursor is not inside an xs:date or xs:dateTime node, the Insert Date Picker menu item will be greyed out.

Numbers

Decimals can be preceded by either a point or a comma depending on the locale. There are also differences for how thousands are represented. StyleVision provides functionality where you can format a number for your intended audience:

  • Select the contents placeholder or input field of the node.
  • In the Properties sidebar, select the content item, and then the Content group of properties.
  • Click the Edit button of the Input Formatting property.
  • The Input Formatting dialog will appear.

  • Select the Formatted radio button. This will allow you to choose the number format.
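Outside StyleVision, the same idea can be sketched in Python. The separator table below is hand-written for illustration (it is an assumption, not data taken from StyleVision or from a locale database), and the function name is hypothetical.

```python
# Separator conventions for a few locales; a hand-written table for
# illustration, not data taken from StyleVision or any locale database.
SEPARATORS = {
    "en-US": {"group": ",", "decimal": "."},
    "de-DE": {"group": ".", "decimal": ","},
    "fr-FR": {"group": "\u00a0", "decimal": ","},  # non-breaking space
}

def format_number(value, locale_tag, places=2):
    """Format a number with the grouping and decimal separators of a locale."""
    seps = SEPARATORS[locale_tag]
    # Start from the fixed comma/point style produced by Python's format spec.
    text = f"{value:,.{places}f}"
    # Swap separators via a placeholder so the two replacements don't collide.
    text = text.replace(",", "\x00").replace(".", seps["decimal"])
    return text.replace("\x00", seps["group"])

for tag in SEPARATORS:
    print(tag, format_number(1234567.891, tag))
```

The stored value stays the same in every case; only the presentation changes, which is the same separation of data and formatting that the Input Formatting dialog provides.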

Money

The issues involving numbers also apply to money, but in addition to this there are different conventions for representing the currency symbol. Some currencies share the same name and symbol, such as the dollar, but the Australian, Canadian and Singaporean dollar are not the same currency, and this should be identifiable. You can deal with the numbers as shown above, but the issue of whether the currency name or symbol should go before or after the number is likely to be dealt with as part of the translation process.

Address

One of the problems faced by customers making an online purchase from a foreign company is that the system does not allow them to enter their address properly. There are many differences, such as the house number being before or after the street name, the order in which the components of the address are placed, and the format of the zip/postal code. CEN (the European Committee for Standardization) has developed a standard which lists the components of an address, and the UPU (Universal Postal Union) is further developing this to produce a comprehensive list of name and address elements. I would recommend that you ensure that you are getting the data you need for your main target markets but make sure that someone from another country can also enter their address. A drop-down list of countries could be used so that error checking is applied where you know that certain components of an address are required, without producing errors for countries whose address structure you do not know.
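The country-dependent checking suggested above might look something like the following Python sketch. The field names and the per-country rules are illustrative assumptions and are not taken from the CEN or UPU work.

```python
# Required address fields per country. The entries are illustrative
# assumptions, not taken from the CEN or UPU standards.
REQUIRED_FIELDS = {
    "US": ["street", "city", "state", "zip_code"],
    "IE": ["street", "city"],
}

def missing_fields(country_code, address):
    """Return the required fields that are empty or absent for this country.

    Countries without an entry in REQUIRED_FIELDS are never rejected,
    because their address structure is not known.
    """
    required = REQUIRED_FIELDS.get(country_code, [])
    return [name for name in required if not address.get(name, "").strip()]

# A customer in a country we have no rules for can still submit an address.
warsaw = {"street": "ul. Przykładowa 1", "city": "Warszawa", "postal_code": "00-001"}
print(missing_fields("PL", warsaw))

# A US address without a state or ZIP code is flagged.
print(missing_fields("US", {"street": "1 Main St", "city": "Bangor"}))
```

The design choice here is to fail open: strict validation only for the markets you understand, and free entry for everyone else.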

Credit Cards

Some US-based web sites will not accept credit cards from outside the US. As a security check they insist on a valid US address. If you want to accept credit card payments and do business with people outside your country, you should check that foreign credit cards will be accepted.

RTL (bidi)

In many languages text is read from left to right, but this is by no means universal. Arabic and Hebrew are written from right to left. In XML documents this causes further confusion as the XML elements are read from left to right but the text itself should be read from right to left. ITS has a directionality attribute which can be used to identify the direction in which text should be read, for example: <its:span dir="rtl">متعة الأسماك!</its:span>
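If you need to decide which direction to declare for a piece of text, one common heuristic is to look at its first strongly directional character. The Python sketch below is a simplification of that idea using the standard unicodedata module; the function name is hypothetical and the heuristic is not a full implementation of the Unicode Bidirectional Algorithm.

```python
import unicodedata

# Bidirectional classes that mark strongly right-to-left characters:
# "R" (for example Hebrew letters) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}

def base_direction(text):
    """Guess the direction from the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            return "rtl"
        if bidi == "L":
            return "ltr"
    return "ltr"  # no strong character found; assume left-to-right

print(base_direction("متعة الأسماك!"))
print(base_direction("Hello, world"))
```

The result could then be written into a directionality attribute such as the dir value in the its:span example.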

Sorting

There are differences in how alphabets are sorted. Some Scandinavian languages have an ‘aa’ character which is usually, but not always, sorted at the end of the alphabet. If you have set the language in your XML document and use xsl:sort for your XSL document then the sorting should work according to the sorting rules for that language. However, you should check that your processor does this as that is not always the case. The example files which come with StyleVision contain examples for sorting. Select StyleVision examples, then the tutorial folder, then sorting and open the file SortingOnTwoTextKeys.sps. To see how the sorting works go to the design view and right click on the member element. Then select the ‘sort by’ option on the context menu. Here you can control how the sorting works for this particular list.
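To see why language-aware sorting matters, the Python sketch below compares plain code-point order with a simplified Danish-style order in which ‘aa’ is treated like ‘å’ and sorted at the end. The alphabet string and key function are assumptions made for illustration; real collation rules, including those used by an XSLT processor, are considerably more involved.

```python
# Alphabet order for a Danish-style collation; a simplified assumption for
# illustration (real collation rules handle case, accents and more).
ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {letter: index for index, letter in enumerate(ALPHABET)}

def collation_key(word):
    """Build a sort key that treats 'aa' as 'å' and uses ALPHABET order."""
    normalized = word.lower().replace("aa", "å")
    # Characters outside the alphabet sort after all known letters.
    return [RANK.get(ch, len(ALPHABET)) for ch in normalized]

towns = ["Aalborg", "Odense", "Aarhus", "Billund", "Århus"]

print(sorted(towns))                     # plain code-point order
print(sorted(towns, key=collation_key))  # 'aa' and 'å' sort last
```

With the plain sort, Aalborg and Aarhus come first and Århus last; with the collation key, all three move to the end of the list together.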

Exclamation and Question Marks

In English, question marks and exclamation marks are always at the end of the sentence, while in Spanish an inverted mark also appears at the beginning of the sentence. This is something which will usually be corrected during the translation process.

Conclusions

Internationalization is an important step in ensuring the widest target audience for your application, and in making translation as cost-effective and easy as possible. Your approach to this should be very pragmatic. Time spent up-front sorting out internationalization will result in huge benefits throughout the process and significantly increase marketing potential for your product. The purpose of this article was to present an overview and introduce you to internationalization. There is a lot more useful information available in the references listed below.

Tools such as XMLSpy and StyleVision, both of which are included in the Altova MissionKit software suite, go a long way in making the internationalization process for XML documents much easier by providing a lot of in-built support for internationalization. The Internationalization Tag Set from W3C is a very significant innovation which is a great addition to the toolkit available to a developer who wants to build internationalized XML applications. XML is a technology which has had internationalization and translation in mind since its inception. The use of Unicode as the default encoding for XML is very significant and greatly facilitates dealing with any internationalization problems you may come across. The functionality available within the Altova MissionKit, ITS and Unicode are the basis for creating good internationalized applications.

Reference

The following is a list of useful web sites and other resources providing further information on internationalization:

  • Leading XML tools provider – Altova https://www.altova.com/ They also offer a free trial of the MissionKit: https://www.altova.com/download
  • Unicode web site http://www.unicode.org/
  • Internationalization Tag Set http://www.w3.org/TR/2007/REC-its-20070403/
  • W3C Best Practices for internationalization http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070427/
  • Open Tag (Yves Savourel’s) http://www.opentag.com/
  • Yves Savourel, ‘XML Internationalization and Localization’, a book which is an excellent source of information. More information can be found at: http://www.opentag.com/xmli18nbook.htm
  • The TM-Global research and resource web site publishes a lot of useful articles, opinions and surveys on translation, localization and industry standards http://www.tm-global.com/
  • Web sites of internationalization guru Tex Texin http://www.xencraft.com/ and http://www.i18nguy.com/
  • Localization Flow – web site of internationalization experts http://www.locflowtech.com/
  • Value for money XML-based TEnTs and translation tools are available from companies such as Kilgray Translation Technologies http://www.kilgray.com/


Exploring Large XML/XBRL Documents with XMLSpy


Last week, while giving a demo of the new XBRL capabilities in the Altova MissionKit, we stumbled across an interesting question: What is the best way for a semi-technical SME (in this case a CPA) to navigate a large XML/XBRL document for data entry?

XMLSpy, which is included in the MissionKit tool suite, has a lot of cool features and different views for XML data, including the ever-popular grid view for visualizing the hierarchical structure of an instance document in a graphical manner. The ability to easily expand and collapse containers and drag and drop to change position makes XMLSpy’s grid view a pretty good choice for the task.

Of course let’s not forget that the XMLSpy XML editor also has a Find feature that would enable users to simply press Ctrl+F or use the Find in Files window to find any element that they are looking for… but alas, in the case of XBRL, where element names are mindbogglingly verbose, this may be a challenge. Consider, for example, the US-GAAP’s aptly named <us-gaap:IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestments>. Not so much fun to type into a Find dialog…

Our solution, therefore, and the winner for the easiest and most comprehensive way for even a non-technical user to find XML elements in a large document, combines a longstanding XMLSpy feature (the XPath Analyzer window) with a new feature in XMLSpy v2009, XPath auto-completion. Simply begin typing the element name in the XPath Analyzer window, and XMLSpy will show you all of the possibilities. Next, choose the one you are looking for, and XMLSpy will navigate directly to that node in the XML document.

Now that was easy! And better yet, you get to tell your friends that you know XPath. 😉

Of course, for developers, intelligent XPath auto-completion provides a lot more than the ability to find a node quickly. As you type, it provides you with valid XPath functions, as well as element and attribute names from the associated schema and XML instance(s). XMLSpy accounts for namespaces when listing options and even provides deep path suggestions when the required node is not in close proximity to the current context.

XMLSpy is available standalone or as part of the award-winning MissionKit tool suite.
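For readers curious about what name completion followed by an XPath lookup involves, here is a rough Python sketch using the standard library. It is not how XMLSpy implements the feature: the sample document, the namespace URI and the helper functions are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny stand-in for an XBRL instance; the namespace URI and values are
# made up for illustration and are not a real US-GAAP taxonomy.
DOC = """<xbrl xmlns:us-gaap="http://example.com/us-gaap">
  <us-gaap:Assets>1000</us-gaap:Assets>
  <us-gaap:IncomeTaxesPaid>120</us-gaap:IncomeTaxesPaid>
  <us-gaap:IncomeLossFromEquityMethodInvestments>45</us-gaap:IncomeLossFromEquityMethodInvestments>
</xbrl>"""

NAMESPACES = {"us-gaap": "http://example.com/us-gaap"}

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]

def complete(root, typed):
    """Return element names in the document that start with the typed text."""
    names = {local_name(el.tag) for el in root.iter()}
    return sorted(name for name in names if name.startswith(typed))

root = ET.fromstring(DOC)
print(complete(root, "Income"))

# Once a name is chosen, a namespace-aware path locates the node.
node = root.find(".//us-gaap:IncomeTaxesPaid", NAMESPACES)
print(node.text)
```

Typing just the first few letters narrows the candidates to a short list, and the chosen name is then enough to jump to the node.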


What's New in XMLSpy 2009?


In addition to being tremendously useful, some of the new features in XMLSpy 2009 are just plain cool. The complete list of new functionality includes:

  • Support for XBRL 2.1 and XBRL Dimensions 1.0  
  • XBRL Taxonomy Editor
  • XPath auto-completion 
  • Native support for additional databases 
  • Support for XML fields in SQL Server
  • Extensions for identity constraints editing in Schema View 
  • Expanded source control system support
  • Support for the XSLT extension altova:evaluate  
  • Support for Apache FOP 0.95  

We’ve already blogged quite a bit about the first two items on the list: support for XBRL validation and XBRL taxonomy editing. Some more details on the other new features are below.

Intelligent XPath Auto-Completion

We’ve been delighted to receive feedback from customers who are really excited about this new feature. If you’re developing XSLT or XQuery, writing XPath expressions just got a lot easier. As you’re composing an XPath expression in Text View, Grid View, or the XPath Analyzer, XMLSpy now provides you with valid XPath functions, as well as element and attribute names from the associated schema and XML instance(s). XMLSpy’s intelligent XPath auto-completion accounts for namespaces when listing options and even provides deep path suggestions when the required node is not in close proximity to the current context.

Native Support for Additional Databases

XMLSpy 2009 adds new native support for the latest versions of SQL Server and Oracle, and brand new support for PostgreSQL. Support for DBs in XMLSpy allows you to generate an XML Schema based on a database, import and export data based on database structures, and generate relational database structures from XML Schemas, and so on. The built-in Database Query window lets you perform queries against the database and edit the data. Here’s the complete list of databases with native support in XMLSpy:

  • Microsoft® SQL Server® 2000, 2005, 2008
  • IBM DB2® 8, 9
  • IBM DB2 for iSeries® v5.4
  • IBM DB2 for zSeries® 8, 9
  • Oracle® 9i, 10g, 11g
  • Sybase® 12
  • MySQL® 4, 5
  • PostgreSQL 8
  • Microsoft Access™ 2003, 2007

SQL Server support has also been enhanced to allow viewing and editing of XML fields that are stored in the database.

Extensions for Identity Constraint Editing in Schema View

Configuring identity constraints (i.e., key/keyref/unique values) is an important aspect of XML Schema development, especially for database users. Adding to existing support for editing these identity constraints, there are now enhanced visual cues and editing options in XMLSpy 2009. A new Identity Constraints tab in the Components entry helper window displays all existing constraints in a tree view and allows you to easily modify or create new relationships. Furthermore, identity constraints are now indicated by green lines, informative icons, and mouse-over messages in the Content Model View. A right-click menu allows you to easily add new relationships and specify field and selector values by typing them manually, using drop-down entry helpers, or by simply dragging and dropping the desired nodes.

Expanded Source Control System Support

Based on customer feedback, we’ve completely reworked the source control system interface in XMLSpy and also added the same level of source control support to UModel, our UML modeling tool, allowing both products to intelligently integrate with all major SCM tools. Once a project is bound to a version control system, XMLSpy automatically monitors the status of all files and prompts you to check out a file whenever you start to modify the document. In addition, the actual state of each file is shown through checkmarks or locks in the upper right corner of each file icon.

What do you think of these new features? What would you like to see added to the next version of XMLSpy? Let us know by commenting below.


XBRL Support Added to Altova MissionKit 2009


Earlier this week we blogged about the release of Version 2009 of Altova MissionKit, which includes the complete Altova product line. One of the major themes in this latest release is comprehensive support for XBRL across multiple Altova MissionKit tools, so let’s look into XBRL itself and the new functionality in more detail.

What is XBRL?

The eXtensible Business Reporting Language (XBRL) is an XML-based vocabulary for electronic transmission of business and financial data. Currently in version 2.1, XBRL is an open standard that is maintained by XBRL International, a global non-profit consortium of over 550 major companies, organizations, and government agencies. XBRL was developed to facilitate business intelligence (BI) automation by enabling machine-to-machine communication and data processing for financial information, with an eye towards cost reduction through the elimination of time-consuming and error-prone human interaction. Official support from the European Parliament and a mandate from the United States Securities and Exchange Commission (SEC) have all but secured XBRL’s future as the official standard for financial reporting. You can learn a lot more about the nuts and bolts of XBRL in the XBRL white paper written by Altova’s Technical Marketing Manager, Liz Andrews.

XBRL Tools

All of the power and functionality that XBRL brings to financial data is useless without XBRL-conformant tools to interpret and process this data. In fact, understanding the importance of tools for XBRL, the XBRL recommendation writes software vendors into its abstract:

XBRL allows software vendors, programmers, intermediaries in the preparation and distribution process and end users who adopt it as a specification to enhance the creation, exchange, and comparison of business reporting information.

As such, XBRL development and data integration can be supported in a variety of different ways. The Altova MissionKit 2009 provides comprehensive support for working with XBRL, from validation and editing, to transformation and rendering, in multiple tools:

XMLSpy 2009 – XBRL validator and XBRL taxonomy editor. When you need to ensure that your XBRL filing documents are valid and compliant, XMLSpy can be used to validate an XBRL instance file against its corresponding taxonomy. If you need to extend a standard taxonomy to meet your organization’s filing needs, the XBRL taxonomy editor in XMLSpy gives you a graphical model with tabs that organize different types of taxonomy elements, wizards, and entry helpers, all of which guide data input based on the structural information provided in the XBRL instance and base taxonomies.

MapForce 2009 – XBRL exchange and data integration tool. When you need to extract financial data from back end systems and transform it into compliant XBRL filings, MapForce makes it easy with drag-and-drop, graphical data mapping and instant conversion. MapForce allows bi-directional data mapping between XBRL and databases, flat files, Excel 2007, XML, and Web services. Once your mapping project is defined, you can utilize it to automate quarterly and annual XBRL filing generation by transforming data with the MapForce UI, command line, or auto-generated, royalty-free application code. This makes public financial data submission a repeatable and highly manageable process, allowing organizations to produce valid XBRL reports as required based on the variable data stored in accounting system fields, without having to define new mappings every quarter.

If you need to aggregate XBRL filings from different time periods or different organizations in your back end systems for analysis, MapForce lets you map the XBRL data to databases, flat files, XML – even Web services. Mapping data directly from its source format removes the need for re-keying and potential errors. Together, MapForce and XBRL enable the automation of the multi-dimensional financial analysis that organizations and stakeholders use to evaluate market, company, and industry performance on a regular basis.

StyleVision 2009 – XBRL rendering tool. Once data is in XBRL format, and it’s time to render it for consumption on the Web or in printed materials, StyleVision lets you create a straightforward XBRL report to output the data in multiple formats. Simply drag and drop a taxonomy financial statement onto the design pane as an XBRL table, and then use StyleVision’s graphical interface to format stylesheets for simultaneous output in HTML, RTF, PDF, and Word 2007 (OOXML). The XBRL Table Wizard makes it easy to customize the table structure and specify the concepts to include in the report. In the case of the US-GAAP taxonomy, which provides, in addition to the hierarchical organization in its presentation linkbase, some best practices information on how to structure XBRL instances, you can simply select US-GAAP mode to have StyleVision automatically output the data according to this information.

To try this powerful XBRL functionality for yourself, download a free, fully functional 30-day trial of the Altova MissionKit 2009, and please let us know what you think by commenting on this blog or in the Altova Discussion Forum.


Altova customer Recordare builds MusicXML-based solution


Case Studies

Recordare® is a technology company focused on providing software and services to the musical community. Their flagship products, the Dolet® plugin family, are platform-independent plugins for popular music notation programs, facilitating the seamless exchange and interaction of sheet music data files by leveraging MusicXML. Dolet acts as a high quality translator between the MusicXML data format and other applications, enabling users to work with these files on any conceivable system, including industry leading notation and musical composition applications Finale® and Sibelius®. The list of MusicXML adopters also includes optical scanning utilities like SharpEye or capella-scan, music sequencers like Cubase, and beyond. Dolet increases the MusicXML support in all of these programs and promotes interoperability and the sharing of musical scores. In creating the Dolet plugins, Recordare used Altova’s XML editor, XMLSpy, for editing and testing the necessary MusicXML XML Schemas and DTDs, and the diff/merge tool, DiffDog, for regression testing.

The Challenge

Music interchange between applications had traditionally been executed using the MIDI (Musical Instrument Digital Interface) file format, a message transfer protocol that has its roots in electronic music. MIDI is not an ideal transfer format for printed music, because it does not take into account the multitude of notations (e.g., rests, repeats, dynamics, lyrics, slurs, tempo marks, etc.) that convey much of the meaning. MusicXML is an open, XML-based file format specifically created to encapsulate musical notation or digital sheet music data that was built on top of previous formats, MuseData and Humdrum. XML lends MusicXML the power and flexibility to be easily accessed, parsed, rendered, and otherwise manipulated by a wide variety of automated tools, and its general acceptance as a standard makes it an ideal format for scoring using computer technology. Since its original release by Recordare in January of 2004 (version 2.0 was released in June 2007), MusicXML has gained acceptance in the music notation industry with support in over 100 leading products, and is recognized as the de facto XML standard for music notation interchange. These products would not have adopted MusicXML unless it could be used to exchange data with industry-leading applications like Finale and Sibelius. By developing advanced plugins for popular music notation suites, Recordare would be able to deliver to their customers all of the advantages that XML can bring for data exchange and standardization.
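To make the contrast with MIDI concrete, here is a minimal sketch of how notation details can be read from MusicXML with an ordinary XML parser. The fragment is hand-written for illustration and is not taken from any real score or from Recordare's files; it uses standard MusicXML element names (`score-partwise`, `part`, `measure`, `note`, `pitch`, `rest`, `lyric`), and the function name `describe_notes` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative MusicXML fragment (not from a real score).
SCORE = """<score-partwise version="2.0">
  <part-list>
    <score-part id="P1"><part-name>Voice</part-name></score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <note>
        <pitch><step>E</step><alter>-1</alter><octave>4</octave></pitch>
        <duration>2</duration>
        <type>half</type>
        <lyric><text>la</text></lyric>
      </note>
      <note>
        <rest/>
        <duration>2</duration>
        <type>half</type>
      </note>
    </measure>
  </part>
</score-partwise>"""


def describe_notes(musicxml):
    """Return one readable line per <note>, including notation details
    (rests, written note type, lyrics) that the article says MIDI omits."""
    root = ET.fromstring(musicxml)
    lines = []
    for measure in root.iter("measure"):
        number = measure.get("number")
        for note in measure.findall("note"):
            kind = note.findtext("type", default="?")
            if note.find("rest") is not None:
                lines.append(f"m{number}: {kind} rest")
                continue
            step = note.findtext("pitch/step")
            alter = note.findtext("pitch/alter", default="0")
            accidental = {"-1": "b", "0": "", "1": "#"}.get(alter, "")
            octave = note.findtext("pitch/octave")
            lyric = note.findtext("lyric/text")
            suffix = f' "{lyric}"' if lyric else ""
            lines.append(f"m{number}: {kind} {step}{accidental}{octave}{suffix}")
    return lines


notes = describe_notes(SCORE)
```

Because rests, written note values, and lyrics are explicit elements, a consuming tool does not have to guess them from timing data.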

The Solution

Below is an example showing the score of the first few measures of Beethoven’s An die ferne Geliebte, Op. 98 as it is written in sheet music, and a small snippet of the same piece translated to MusicXML.

The MusicXML-based Dolet 4 plugins for Finale and Sibelius provide a more accurate and usable representation of sheet music than Standard MIDI translation. For example, the images below show the same piece of music. On the left is a Finale 2009 rendering of a MIDI file exported from Sibelius, and on the right is the same application’s interpretation of a MusicXML 2.0 file exported from the same version of Sibelius.
In the MIDI rendition, vital information like chord symbols, lyrics, slurs, articulations, and even title and composer are omitted from the translation. In addition to providing native support for MusicXML, the recently released Dolet 4 for Finale and Dolet 4 for Sibelius plugins enhance the capabilities of these programs by adding advanced features like:

  • Batch translation
  • More accurate and reliable data exchange
  • More formatting control
  • Support for the MusicXML XML Schema (in addition to the DTD)

In developing the plugins, Recordare was subject to specific requirements dictated by the Sibelius and Finale applications. The Sibelius plugin was programmed in ManuScript, and is one of the largest plugins ever written in that language. Finale, on the other hand, requires plugins to have a C++ core, and Recordare implemented this, adding MusicXML logic in Java and a JNI layer to provide the two-way Java/C++ communication.

Recordare’s Dolet plugins are now critical aspects of the music preparation process for many television and film scores as well as new music publications. Errors in translation need to be fixed in maintenance updates, while ensuring that no new errors are introduced into these complex translation plugins. Regression testing of the MusicXML files produced by the Dolet plugins is thus an essential part of Recordare’s quality assurance process.

Recordare used Altova’s XMLSpy and DiffDog in the development of the Dolet plugins: XMLSpy to test and edit their DTDs and XML Schemas, and DiffDog to regression test the MusicXML files produced by the software. Recordare has several regression test suites covering a wide range of musical repertoire, from baroque to hip-hop. DiffDog allows easy differencing of multiple runs of these test suites, including the ability to ignore differences in XML metadata elements such as software version and XML creation date that always change across test cases.

Recordare has used Altova’s XMLSpy XML editor to edit the MusicXML DTDs and XML Schemas, starting with the use of XMLSpy 3.5 (released in 2001) to create the earliest alpha and beta versions of the MusicXML DTD. Version 2.0 of MusicXML added a compressed zip version of the format, similar to what is used in other XML applications like Open Office and Open XML. XMLSpy 2008 Enterprise Edition’s comprehensive support for zipped XML files made it easy to test this new feature together with the Dolet for Finale plugin.
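The idea of comparing regression runs while ignoring metadata that changes every time can be sketched in a few lines. This is not how DiffDog works internally and not Recordare's test harness; it is a minimal illustration that assumes the volatile values live in the MusicXML `software` and `encoding-date` elements, and the sample documents and the exporter name in them are invented for the example.

```python
import xml.etree.ElementTree as ET

# Elements assumed to change on every run (MusicXML places them under
# <identification><encoding>). Adjust this set for other formats.
VOLATILE_TAGS = {"software", "encoding-date"}


def normalized(xml_text):
    """Return a canonical string for xml_text with volatile elements removed."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in VOLATILE_TAGS:
                parent.remove(child)
    # Canonical XML sorts attributes; strip_text drops surrounding whitespace,
    # so indentation differences do not count as changes.
    return ET.canonicalize(ET.tostring(root, encoding="unicode"), strip_text=True)


def same_output(baseline_xml, candidate_xml):
    """True when two runs differ only in the volatile metadata elements."""
    return normalized(baseline_xml) == normalized(candidate_xml)


# Invented sample documents; the exporter name is a placeholder.
TEMPLATE = """<score-partwise>
  <identification><encoding>
    <software>Example Exporter {version}</software>
    <encoding-date>{date}</encoding-date>
  </encoding></identification>
  <part id="P1"><measure number="1">
    <note><pitch><step>{step}</step><octave>4</octave></pitch></note>
  </measure></part>
</score-partwise>"""

RUN_A = TEMPLATE.format(version="1.0", date="2008-01-01", step="C")
RUN_B = TEMPLATE.format(version="1.1", date="2008-02-01", step="C")
RUN_C = TEMPLATE.format(version="1.1", date="2008-02-01", step="D")
```

Here `same_output(RUN_A, RUN_B)` is true because only the metadata differs, while `same_output(RUN_A, RUN_C)` is false because a note changed.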

A small portion of the extensive MusicXML schema shown in XMLSpy’s graphical XML schema editor

XMLSpy’s support for XQuery has also contributed to Recordare’s regression testing efforts. In response to a customer request, Recordare now exports XML processing instructions from the Dolet for Sibelius plugin when it encounters a musical feature that it is unable to translate correctly. A simple XQuery execution to search for all the processing instructions in the XML files in a given folder lets Recordare check for the presence of these restrictions within each test suite, and then compare the resulting XML files using DiffDog between runs of the test suite.

Recently, customer demand led Recordare to develop an XSD version of the MusicXML format. XMLSpy Enterprise Edition was used to develop and test the schemas. Schema validation, schema restriction and extension, and automatically generated schema documentation were all tested using XMLSpy’s features.
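The processing-instruction search described above was done with XQuery in XMLSpy. As a rough equivalent for readers without an XQuery processor at hand, the following sketch walks each document with Python's standard `xml.dom.minidom` module. The processing-instruction target `example-untranslated` and its data are hypothetical; the article does not say what the Dolet plugin actually emits.

```python
from pathlib import Path
from xml.dom import minidom, Node


def find_processing_instructions(xml_data):
    """Return (target, data) for every processing instruction in the document.

    xml_data may be str or bytes. The XML declaration is not a processing
    instruction and is not reported.
    """
    found = []

    def walk(node):
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            found.append((node.target, node.data))
        for child in node.childNodes:
            walk(child)

    walk(minidom.parseString(xml_data))
    return found


def scan_folder(folder):
    """Map each .xml file name in folder to its processing instructions."""
    return {
        path.name: find_processing_instructions(path.read_bytes())
        for path in sorted(Path(folder).glob("*.xml"))
    }


# Hypothetical sample: the PI target and data are invented for illustration.
SAMPLE = """<score-partwise>
  <part id="P1">
    <measure number="1">
      <?example-untranslated feature="hypothetical-ornament"?>
      <note><rest/><duration>4</duration></note>
    </measure>
  </part>
</score-partwise>"""

hits = find_processing_instructions(SAMPLE)
```

Writing the result of `scan_folder` to a file after each test run would give a per-suite report that can itself be compared between runs.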

The Results

The Dolet plugins are extensions for common industry software that harness the built-in capabilities of the MusicXML format to make musical scores truly interchangeable across disparate systems and toolsets. These plugins have the capacity to render accurate and meaningful musical notation based on the powerful MusicXML specification. The leading XML Schema editing capabilities in XMLSpy and the strong XML and directory differencing support in DiffDog enabled Recordare to write and polish the MusicXML schemas and perform regression testing on the Dolet plugins. The resulting high quality of the schemas and software has made MusicXML and the Dolet plugins a key element of the toolkit for composers, arrangers, publishers, copyists, and engravers throughout the industry wherever printed music is used. Try XMLSpy, DiffDog, and the other Altova MissionKit tools for yourself with a free 30-day trial.


Case Study: Equifax


Check out the case study below to learn how leading US credit reporting entity Equifax® built an advanced SOAP interface for their identity verification and authentication Web service.

Overview

Equifax is a leading credit reporting entity and provider of analytical and decision support tools. Their real-time authentication system, eIDverifier, offers government and businesses personalized online security measures that help protect them against fraud and comply with federal legislation. The eIDverifier process is used within e-commerce and other online applications to authenticate users’ identities based on their answers to personalized questions drawn from Equifax’s extensive data stores. The authentication process consists of five steps:

  1. Integrity Check – eIDverifier standardizes and screens applicant-provided information to test for data inconsistencies and irregularities.
  2. Pattern Recognition – A pattern recognition algorithm is run on each transaction. For example, a velocity parameter determines the number of times an applicant has applied for authentication in a specific time frame.
  3. Identity Validation – To confirm an identity’s legitimacy, eIDverifier uses a “waterfall” approach in gathering validation information from multiple data sources. This means that if the identity cannot be validated with the first data source, eIDverifier will proceed to the next data source until the identity is validated.
  4. Interactive Query – eIDverifier presents multiple-choice questions to the applicant based upon “shared secret” information that should only be known to the applicant and Equifax. The question sets are customizable to meet individual risk thresholds.
  5. Decision Logic / Output Assessment – There are two output components to eIDverifier – an assessment score and reason codes. The assessment score indicates the likelihood of an applicant presenting fraudulent information, while reason codes provide important details on questionable information and highlight any discrepancies between the consumer’s application information and Equifax data sources.
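Two of the steps above lend themselves to a short illustration: the velocity check from step 2 and the “waterfall” lookup from step 3. The sketch below is not Equifax's implementation; the function names, the shape of the data sources, and the sample applicant are all assumptions made for the example.

```python
def exceeds_velocity(attempt_times, now, window_seconds, max_attempts):
    """True when more than max_attempts fall inside the trailing time window.

    attempt_times and now are plain numbers of seconds (hypothetical format).
    """
    recent = [t for t in attempt_times if 0 <= now - t <= window_seconds]
    return len(recent) > max_attempts


def validate_identity(applicant, sources):
    """Waterfall validation: consult each source in order, stop at the first hit.

    sources is an ordered list of (name, check) pairs, where check(applicant)
    returns True when that source confirms the identity.
    Returns (validating_source_or_None, names_of_sources_consulted).
    """
    consulted = []
    for name, check in sources:
        consulted.append(name)
        if check(applicant):
            return name, consulted
    return None, consulted


# Hypothetical applicant and data sources, for illustration only.
applicant = {"name": "A. Example", "postcode": "00000"}
sources = [
    ("primary", lambda a: False),                       # no match here
    ("secondary", lambda a: a["postcode"] == "00000"),  # validates
    ("tertiary", lambda a: True),                       # never reached
]

validated_by, consulted = validate_identity(applicant, sources)
too_fast = exceeds_velocity([0, 50, 90, 100], now=100, window_seconds=60, max_attempts=2)
```

The third source is never consulted because the second one already validated the identity, which is the point of the waterfall ordering.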

eIDverifier relies on the SOAP protocol to send messages defining these interactions back and forth between the client interface and the Equifax servers. Third party institutions license the eIDverifier SOAP interface for use within their online application processes, enabling them to integrate its functionality and access information contained in Equifax’s databases.

Equifax uses the XMLSpy XML Schema editor to graphically design the XSDs that serve as the foundation for their SOAP interface.
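For readers unfamiliar with the shape of a SOAP message, here is a minimal sketch of a request envelope whose body carries a schema-defined payload. It is not the eIDverifier message format: the payload namespace, the `VerifyRequest` element, and its children are invented, and the SOAP 1.1 envelope namespace is used only as an assumption, since the article does not state which SOAP version the service uses.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:identity-check"                  # hypothetical payload namespace


def build_request(first_name, last_name):
    """Build a SOAP envelope with a hypothetical VerifyRequest payload."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}VerifyRequest")
    ET.SubElement(request, f"{{{APP_NS}}}FirstName").text = first_name
    ET.SubElement(request, f"{{{APP_NS}}}LastName").text = last_name
    return ET.tostring(envelope, encoding="unicode")


message = build_request("Ada", "Example")
parsed = ET.fromstring(message)
```

In a real service the element names and types inside the body come from the XML Schema, which is why the schema design is described as the foundation of the interface.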

The Challenge

Equifax needed a sophisticated tool for designing the XML Schemas that would define the data types for their Web service, as well as a mechanism for creating the WSDL documents that would describe the interface as a whole. As a Java shop, Equifax needed a solution that would be compatible with their other development tools, and that would work seamlessly with the Eclipse IDE. Though there are plenty of Java tools available that have the capacity for XML Schema development, XMLSpy presented the most attractive option for schema design because of its comprehensive graphical design and editing options.

The Equifax development team took a further step to simplify their Web services creation, using XML Beans and the Codehaus XFire/CXF Java SOAP framework to auto-generate WSDL from their XML Schemas.

The Solution

eIDverifier relies on a variety of different technologies to bring identity verification and authentication to its clients. XMLSpy provides the following benefits:

XML Schema

XML Schema is used to express the structure of the data, as well as the individual elements and attributes that it is comprised of. Because a large portion of the data relies on end-user input in the form of address, phone number, driver’s license number, etc., it is vital that this information is in a format that can be digested by the system.

Using XMLSpy’s graphical XML Schema editor, the Equifax development team was able to easily visualize and maintain the structure of their XML Schema. A portion of the schema that was created appears below:


This data type definition provides the syntax, and dictates the structure, for the data that is transmitted by the eIDverifier Web service.

XMLSpy’s unique graphical XML Schema editor allowed the Equifax development team to create and maintain a complex schema definition without writing any code manually. They were also able to automatically generate human-readable documentation that can be used to present the architecture for review at any time in the development process, and that describes each element and attribute in detail.

WSDL

The processes executed by eIDverifier are described by a WSDL document that incorporates the XML Schema to provide information about data types, functions, and other interface details to the client – defining and dictating the actions taken by the client application to send and retrieve information between the end-user and the Equifax servers. The Equifax team chose to autogenerate a WSDL document using the Codehaus XFire/CXF framework. The XML Schema was used as the basis for an XMLBeans implementation, which was then compiled as a Java service class. Once the eIDverifier service was exposed, XFire automatically generated a WSDL – the WSDL is shown below in the XMLSpy graphical WSDL editor.

This WSDL serves as the basis for the eIDverifier application, defining the ports and messages that make up the communication infrastructure of the Web service.

The Results

The eIDverifier SOAP interface allows external applications to access Equifax’s backend data stores, exposing the data as a Web service and enabling those applications to retrieve secure information without jeopardizing the integrity of the Equifax mainframe. Utilizing WSDL and SOAP, and surrounded by Java architecture, eIDverifier is able to confirm user identity by returning a set of multiple choice questions based on the secure data maintained by Equifax.

XMLSpy enabled the Equifax team to quickly and easily create a graphical schema representation and the matching documentation to serve as the basis for the Web service. It also allowed the development team to focus on their Java code, rather than the intricacies of XML Schema and WSDL design.

The Altova MissionKit provides numerous tools for advanced Web services development, from the graphical XML Schema and WSDL editing discussed here, to SOAP debugging, and even graphical Web services generation and data mapping. Download a free trial to check it out for yourself.


Case Study: Wrycan, Fitz & Floyd, MarketLive


Fitz and Floyd is a leader in the design and manufacture of hand-painted ceramic giftware. In 2007, they approached Wrycan, an Altova partner focused on content-centric XML expertise and related software development, for help creating a solution that would allow Fitz and Floyd to interface their existing CRM system to their new Web-based storefront application from MarketLive, the leader in e-commerce software solutions. Fitz and Floyd had already purchased a license for the Altova MissionKit software suite, so Wrycan was able to jump right in and start mapping data from Fitz and Floyd’s Oracle database to MarketLive’s proprietary schema using Altova MapForce. Wrycan assigned the project to a Principal Consultant, who had plenty of previous experience with XML technologies (including XSLT and XML Schema) as well as with large-scale databases, but who had never before used MapForce, Altova’s data conversion, transformation, and integration tool.

The Challenge

Fitz and Floyd required a solution that would automatically synchronize data from their Oracle database to MarketLive’s storefront application. It needed to perform the following functions: inventory updates, product updates, and order status updates. This way, when a customer ordered a Fitz and Floyd product via the MarketLive interface, they would be getting real-time information about the company’s inventory. The solution needed to be simple to use, easy to maintain, cost effective, and completed on time, so they could put their new storefront into production promptly. Fitz and Floyd’s existing data was housed in an Oracle 8.0.5 database and was organized according to internal requirements. In order to transform their data into a format that would work with MarketLive’s storefront application, Fitz and Floyd’s data needed to be mapped to MarketLive’s XML Schema. In addition, there needed to be a system in place to track and log any transaction errors that occurred.

The Solution

Because of MapForce’s ease-of-use, the Principal Consultant was able to get started using its intuitive features right away. Wrycan used MapForce to map the transformation from Fitz and Floyd’s Oracle database to the XML Schema definition (XSD) instance provided by MarketLive, using the database as the source component and the XSD as the target.

In order to map to some XML Schema entities that were not explicitly defined in the original MarketLive schema, Wrycan used Altova XMLSpy’s graphical XML Schema editor to fill in the gaps, adding attributes to the schema that had not previously existed and thus ensuring that all necessary Fitz and Floyd data would be mapped to the MarketLive Web interface.

Wrycan used MapForce’s unique code generation capabilities to automatically produce a Java applet that was used to update Fitz and Floyd’s product, inventory, shipping, and order status information programmatically. This specialized applet was then packaged along with Wrycan’s proprietary Transaction Manager. MapForce made it very easy to update and redeploy the data mapping requirements as they changed throughout the project. Because of MapForce’s ease of use and built-in code-generation capabilities, less technical users can also update the data mapping when there are changes.
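The kind of database-to-XML mapping described here can be illustrated with a small hand-written sketch. It does not reproduce MapForce's generated code, the Fitz and Floyd tables, or the MarketLive schema, none of which appear in the article. An in-memory SQLite database stands in for Oracle, and the table, column, and element names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def inventory_to_xml(conn):
    """Map rows of a hypothetical inventory table to an XML document."""
    root = ET.Element("inventoryUpdate")
    rows = conn.execute("SELECT sku, quantity FROM inventory ORDER BY sku")
    for sku, quantity in rows:
        # One <item> per row: the key becomes an attribute, the value a child.
        item = ET.SubElement(root, "item", sku=sku)
        ET.SubElement(item, "quantity").text = str(quantity)
    return ET.tostring(root, encoding="unicode")


# SQLite stands in for the source database; the data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (sku TEXT PRIMARY KEY, quantity INTEGER)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [("FF-100", 12), ("FF-200", 0)],
)
xml_out = inventory_to_xml(conn)
```

A graphical mapping tool expresses the same row-to-element correspondence as connections between source columns and target schema nodes, so changing the mapping does not require editing code like this by hand.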

Simple Web-based Transaction Manager

Utilizing open source Java technologies such as Apache Tomcat and Quartz Enterprise Job Scheduler, Wrycan was able to create a simple transaction manager that allowed the transactions handled by the MapForce-generated, Java-based data integration applet to be scheduled, processed, and logged. The Transaction Manager is a custom software application made specifically for Fitz and Floyd by Wrycan, but built in such a way that it can be reused for future clients. It consists of several components:

  • User interface – allows the integration of MapForce-generated Java code
  • FTP interface – adds the ability for files to be downloaded for transformation from Oracle database format to the eCommerce platform XML format or vice versa
  • Scheduler – allows the automation of the data migration
  • Reporter – stores transaction results in XML files accessible in the user interface and also has the ability to send emails in case of exceptions

The Transaction Manager’s user interface is the point of contact for Fitz and Floyd to control and schedule any data transformations. Because Wrycan wanted to be able to reuse the Transaction Manager, they chose to generate the MapForce code in Java, a platform-independent programming language. (MapForce can also generate application source code in C# and C++.) This code is an integral part of the Transaction Manager, as it dictates the data mapping process, allowing Fitz and Floyd’s internal information to be accessed via the MarketLive interface.

The FTP interface is a simple way to manage the transfer and delivery of files from within the Transaction Manager once the MapForce-generated Java applet has transformed the data according to the MarketLive schema. A built-in batch scheduler allows Fitz and Floyd to automate the data migration operations by content type (e.g., order, inventory, product).

The reporting component allows the result of each transaction to be logged in XML. Because of this, if any transaction errors occurred, Wrycan was able to use Altova XMLSpy to analyze and debug the issues.
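The reporting behavior described above (log every transaction result as XML, and notify someone when a job fails) can be sketched as follows. This is not Wrycan's Transaction Manager, which the article says is built in Java on Tomcat and Quartz; the function names, element names, and the `notify` callback that stands in for email are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def run_job(name, job, notify):
    """Run job(), return an XML log entry, and call notify(name, error) on failure."""
    entry = ET.Element("transaction", name=name)
    entry.set("started", datetime.now(timezone.utc).isoformat())
    try:
        count = job()
        entry.set("status", "ok")
        ET.SubElement(entry, "recordsProcessed").text = str(count)
    except Exception as exc:  # record the failure instead of aborting the batch
        entry.set("status", "error")
        ET.SubElement(entry, "message").text = str(exc)
        notify(name, str(exc))
    return entry


def run_batch(jobs, notify):
    """Run each (name, job) pair in order and collect the entries in one log."""
    log = ET.Element("transactionLog")
    for name, job in jobs:
        log.append(run_job(name, job, notify))
    return log


def failing_job():
    raise RuntimeError("feed unavailable")


alerts = []  # stands in for outgoing email
log = run_batch(
    [("inventory", lambda: 3), ("orders", failing_job)],
    notify=lambda name, error: alerts.append((name, error)),
)
statuses = [entry.get("status") for entry in log]
```

Because each result is an XML element, the finished log can be written to a file and inspected in any XML editor when a transaction needs debugging.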

The Results

Fitz and Floyd now has an easy-to-use data integration layer that is extensible by adding new MapForce transformations, and they can easily adjust their current transactions. Any updates made to the Fitz and Floyd Oracle database are automatically transferred to the MarketLive application in a format that it can readily understand.

Because the Transaction Manager application is based on platform-independent Java code (generated by MapForce), Wrycan also has a reusable application that can be used as an asset by any online retail company. Wrycan is now able to approach potential clients with a proven data integration layer product that provides job scheduling, email notification, and FTP integration and can utilize any database or schema output via a custom Altova MapForce transformation.

When speaking about this project, Dan Ochs, the principal consultant at Wrycan involved with the Fitz and Floyd application, stated, “MapForce has proven to be an easy-to-use, effective tool for making the data integration and mapping process much easier and faster to implement.” This and many other customer case studies involving Altova solutions are available in the Altova library.
