Altova Mailing List Archives: comp.text.xml

Questions about character entities in XML and PCI security compliance
To: NULL
Date: 8/7/2008 3:53:00 PM

Hi all. This is a rather long posting, but I have some questions concerning the usage of character entities in XML documents and PCI security compliance.

The company I work for uses a third-party ecommerce service to host its online store. A few months ago this third-party commerce site began using PGP file encryption on the XML files (e.g. web orders) transferred to us, as part of its ongoing PCI security compliance. Basically we only needed to add a PGP decryption step before parsing the incoming XML files, so there should not have been any technical issue. However, we noticed that the XML files they have created since PGP encryption was implemented contain some unusual character entities. For example, if an XML file has elements containing characters such as <, >, &, -, /, ' and so on, the XML file uses the following character entities to represent them:

Character    Unusual Character Entity
<            &amp;lt;
>            &amp;gt;
&            &amp;amp;
-            &amp;#45;
/            &amp;#47;
'            &amp;#39;

No matter how you look at them, they are NOT the proper character entities for the original characters shown. The problem with these bad character entities is that when we use .Net Framework components such as XmlReader to load the XML file, the character entities are not expanded back to the original characters they represent. Instead I get the following result:

Unusual Character Entity    Expanded Result
&amp;lt;                    &lt;
&amp;gt;                    &gt;
&amp;#38;                   &#38;
&amp;#45;                   &#45;
&amp;#47;                   &#47;
&amp;#39;                   &#39;

If you take a close look at the expanded results, you will see that they are the normal character entities you would expect to see. It seems to me that the XML export process used by the ecommerce site has applied character entity "encoding" twice.
For example, the proper character entity for / is &#47;. However, if you treat &#47; as a data string rather than as a character entity and apply another round of "encoding", you get &amp;#47;.

This means that whenever an online customer enters characters such as & or / in their name or shipping address, the XML file we parse will not give us the correct text. For example, if a customer entered "Christian & Cruz" in their shipping address, the XML file we download will show it as "Christian &amp;#38; Cruz". And when the XML file is parsed, the resulting string we get is "Christian &#38; Cruz".

Another example: if a customer entered "c/o R. Fenton, M.D." in their shipping address, the XML file will show this string as "c&amp;#47;o R. Fenton, M.D.", and the string we get after parsing is "c&#47;o R. Fenton, M.D.".

When we reported this problem to the ecommerce hosting company, their response was that these character entities were "encoded" per PCI security policy and thus they have no plan to "fix" them. Their reply sounds strange, because the weird character entities they use in the XML files are NOT data encryption, nor do they provide any security benefit.

Can anyone tell me whether there is in fact some kind of special character entity used in XML files for PCI security compliance? Or is our ecommerce hosting company wrong?

Any information would be appreciated. Thank you.
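
The behaviour described in the post can be reproduced with a short sketch. It is written in Python rather than .NET purely for illustration, and the element name `name` and the helper `parse_text` are hypothetical. It assumes the files contain references whose leading ampersand has itself been escaped (for instance `&amp;#38;` where `&#38;` was intended); a conforming XML parser then removes exactly one level of escaping, and a second, application-level pass is needed to get back the original character.

```python
import xml.etree.ElementTree as ET
from html import unescape  # decodes named (&amp;) and numeric (&#47;) references


def parse_text(xml_fragment: str) -> str:
    """Return the element text after the parser's single, standard decoding pass."""
    return ET.fromstring(xml_fragment).text


# Assumed shape of the received data: the reference's own ampersand is escaped.
double_encoded = "<name>Christian &amp;#38; Cruz</name>"
# What a correctly encoded file would contain.
single_encoded = "<name>Christian &#38; Cruz</name>"

once = parse_text(double_encoded)  # the parser removes one level only
twice = unescape(once)             # hypothetical extra pass done by the receiver

print(once)
print(twice)
print(parse_text(single_encoded))
```

The extra `unescape` pass is only a possible stopgap on the receiving side, not a fix for the export process. It would also alter any text in which a customer had literally typed something like `&#47;`, so it is only reasonable if every field is known to be encoded twice.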
