Altova Mailing List Archives


Re: Parsing a generic data file

From: dnovatchev@-----.---
To: NULL
Date: 12/15/2007 1:33:00 PM
The FXSL library has a json-document() function, written entirely in
XSLT 2.0 and using FXSL's LR parsing framework (also written entirely
in XSLT 2.0).

When this transformation:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:f="http://fxsl.sf.net/"
 exclude-result-prefixes="f xs"
 >
 <xsl:import href="../f/func-json-document.xsl"/>

 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:variable name="vstrParam" as="xs:string">
{

  "teacher":{
    "name":
      "Mr Borat",
    "age":
      "35",
    "Nationality":
      "Kazakhstan"
             },


  "Class":{
    "Semester":
      "Summer",
    "Room":
      "null",
    "Subject":
      "Politics",
    "Notes":
      "We're happy, you happy?"
           },

  "Students":
  [
    {
      "Smith":
        [{"First_Name":"Mary","sex":"Female"}],
      "Brown":
        [{"First_Name":"John","sex":"Male"}],
      "Jackson":
        [{"First_Name":"Jackie","sex":"Female"}]
    }
  ],


  "Grades":
  [
    {
      "Test":
        [{"grade":"A","points":68},{"grade":"B","points":25},{"grade":"C","points":15}],
      "Test":
        [{"grade":"C","points":2},{"grade":"B","points":29},{"grade":"A","points":55}],
      "Test":
        [{"grade":"C","points":2},{"grade":"A","points":72},{"grade":"A","points":65}]
    }
  ]

}
 </xsl:variable>

 <xsl:template match="/">
    <xsl:sequence select="f:json-document($vstrParam)"/>
 </xsl:template>
</xsl:stylesheet>

is applied (it contains essentially your original data, with
"First Name" changed to "First_Name" and null changed to "null"),
the following result is produced:

<teacher>
   <name>Mr Borat</name>
   <age>35</age>
   <Nationality>Kazakhstan</Nationality>
</teacher>
<Class>
   <Semester>Summer</Semester>
   <Room>null</Room>
   <Subject>Politics</Subject>
   <Notes>We're happy, you happy?</Notes>
</Class>
<Students>
   <Smith>
      <First_Name>Mary</First_Name>
      <sex>Female</sex>
   </Smith>
   <Brown>
      <First_Name>John</First_Name>
      <sex>Male</sex>
   </Brown>
   <Jackson>
      <First_Name>Jackie</First_Name>
      <sex>Female</sex>
   </Jackson>
</Students>
<Grades>
   <Test>
      <grade>A</grade>
      <points>68</points>
   </Test>
   <Test>
      <grade>B</grade>
      <points>25</points>
   </Test>
   <Test>
      <grade>C</grade>
      <points>15</points>
   </Test>
   <Test>
      <grade>C</grade>
      <points>2</points>
   </Test>
   <Test>
      <grade>B</grade>
      <points>29</points>
   </Test>
   <Test>
      <grade>A</grade>
      <points>55</points>
   </Test>
   <Test>
      <grade>C</grade>
      <points>2</points>
   </Test>
   <Test>
      <grade>A</grade>
      <points>72</points>
   </Test>
   <Test>
      <grade>A</grade>
      <points>65</points>
   </Test>
</Grades>

One can use json-document() in any XPath expression. For example,
getting all female students is as easy as:

    f:json-document($vstrParam)/Students/*[sex = 'Female']

and produces:

<Smith>
   <First_Name>Mary</First_Name>
   <sex>Female</sex>
</Smith>
<Jackson>
   <First_Name>Jackie</First_Name>
   <sex>Female</sex>
</Jackson>
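
For completeness, here is a minimal sketch of the same query embedded
in a full stylesheet. It assumes the import path used in the
transformation above and a shortened copy of the data; the variable
names vstrStudents and vDoc are made up for this illustration. It
parses the string once, keeps the result in a variable, and writes
the first name and the element name of each female student as text.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:f="http://fxsl.sf.net/"
 exclude-result-prefixes="f xs"
 >
 <!-- Same import path as above; adjust it to where FXSL is installed -->
 <xsl:import href="../f/func-json-document.xsl"/>

 <xsl:output method="text"/>

 <!-- Shortened copy of the "Students" part of the data (illustration only) -->
 <xsl:variable name="vstrStudents" as="xs:string">
{
  "Students":
  [
    {
      "Smith":
        [{"First_Name":"Mary","sex":"Female"}],
      "Brown":
        [{"First_Name":"John","sex":"Male"}]
    }
  ]
}
 </xsl:variable>

 <xsl:template match="/">
    <!-- Parse once and reuse the resulting tree -->
    <xsl:variable name="vDoc" select="f:json-document($vstrStudents)"/>

    <!-- One line per female student: first name, then the element name -->
    <xsl:for-each select="$vDoc/Students/*[sex = 'Female']">
      <xsl:value-of select="concat(First_Name, ' ', name(), '&#xA;')"/>
    </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>
```

If the parser builds the same element structure for this shortened
input as for the full data, this should print one line per matching
student.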


I will fix the implementation of json-document() to replace whitespace
in element names with underscores and to process the unquoted string
null.
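
The whitespace part of that fix can be pictured as a small
name-normalizing function. What follows is only a sketch of the idea,
not the actual FXSL change: the function my:element-name and its
namespace are hypothetical, and it deals only with whitespace, not
with other characters that are not allowed in XML names.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:my="http://example.org/my"
 exclude-result-prefixes="my xs"
 >
 <xsl:output omit-xml-declaration="yes"/>

 <!-- Hypothetical helper: turn a JSON key into a usable element name.
      normalize-space() trims the key and collapses whitespace runs to a
      single space; replace() then turns each space into an underscore. -->
 <xsl:function name="my:element-name" as="xs:string">
   <xsl:param name="pKey" as="xs:string"/>
   <xsl:sequence select="replace(normalize-space($pKey), ' ', '_')"/>
 </xsl:function>

 <xsl:template match="/">
   <!-- Build an element whose name comes from the key "First Name" -->
   <xsl:element name="{my:element-name('First Name')}">Mary</xsl:element>
 </xsl:template>
</xsl:stylesheet>
```

Applied to any source document, this produces
<First_Name>Mary</First_Name>.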


Cheers,
Dimitre Novatchev



On Dec 13, 7:52 pm, "Jasper" <notaro...@dontmail.com> wrote:
> Hi, Maybe this is off-topic, but perhaps you can help. I'm looking for ideas
> on how to parse a data file.
>
> I don't know XML but I know it parses data in text format.
>
> I have a structured data file of the general form shown below. I don't have
> any definition of the data. Basically it looks like it is hierarchical,
> token/data pairs defined by brackets and square brackets.
>
> I would like to parse this out into some sort of data object(s) in C++  so
> that I can gain programmatic access to the variables.
>
> My app is C++ so the solution must be the same. Also it must be very
> lightweight and *very* fast as I must decode multiple pages in realtime.
>
> Would adapting an XML parser to do this be a possible solution?
>
> Any pointers/ideas/references/code snippets/observations appreciated.
>
> TIA
>
> Basic example showing data structure (whitespaces and carriage returns added
> by me for clarity).
>
> {
>
> "teacher":{
>   "name":
>     "Mr Borat",
>   "age":
>     "35",
>   "Nationality":
>     "Kazakhstan"},
>
> "Class":{
>   "Semester":
>     "Summer",
>   "Room":
>     null,
>   "Subject":
>     "Politics",
>   "Notes":
>     "We're happy, you happy?"},
>
> "Students":
> [
> {
>   "Smith":
>     [{"First Name":"Mary","sex":"Female"}],
>   "Brown":
>     [{"First Name":"John","sex":"Male"}],
>   "Jackson":
>     [{"First Name":"Jackie","sex":"Female"}]}
>
> ],
>
> "Grades":
> [
> {
>   "Test":
>     [{"grade":"A","points":68},{"grade":"B","points":25},{"grade":"C","points":15}],
>   "Test":
>     [{"grade":"C","points":2},{"grade":"B","points":29},{"grade":"A","points":55}],
>   "Test":
>     [{"grade":"C","points":2},{"grade":"A","points":72},{"grade":"A","points":65}]}
>
> ]
>
>
>
> }


