Altova Mailing List Archives


new here, xml, ebml, iff, and binary xml

From: "cr88192" <cr88192@------.-------.--->
To: NULL
Date: 11/1/2004 9:09:00 AM
I am new here, so I apologize if I am trolling.
I get to a point eventually, I think.

I guess I am requesting opinions on the general idea more than anything 
else.


XML:
for quite a while, I had become complacent with use of textual xml in my 
projects, after all:
it is easy to type;
it is easy to read;
it is a rather capable encoding;
...

xml, however, has some limitations:
it is a little bulky in some cases;
it is not well suited to storing large chunks of binary data;
it is not possible to access it randomly (or at least not without 
parsing it first, unless I am missing something significant);
...

(eg: I wouldn't want to use xml as a container for, say, several GB 
of video).


IFF:
and for other things, there are other formats, eg: iff and riff, which:
are fairly simple (and really differ only in the endianness of numbers);
work well at dealing with chunks of binary data (eg: riff is used as the 
base of both the avi and wav formats).

http://www.szonye.com/bradd/iff.html
riff is similar, just the endianness is different, 'FORM' is 'RIFF', '    ' is 
'JUNK', ...
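
A minimal sketch of the chunk layout described above, assuming the usual IFF/RIFF conventions: a 4-byte tag, a 32-bit size (big-endian in iff, little-endian in riff), then the payload padded to an even length. The function names `write_chunk` and `read_chunks` are made up for illustration, and nested containers such as 'FORM'/'RIFF' are not handled.

```python
import struct


def write_chunk(chunk_id, payload, little_endian=False):
    """Serialize one chunk: 4-byte tag, 32-bit size, payload, pad byte."""
    if len(chunk_id) != 4:
        raise ValueError("chunk tags are exactly 4 bytes")
    size_fmt = "<I" if little_endian else ">I"
    out = chunk_id + struct.pack(size_fmt, len(payload)) + payload
    if len(payload) & 1:
        out += b"\x00"  # pad odd-sized payloads to an even boundary
    return out


def read_chunks(data, little_endian=False):
    """Yield (tag, payload) pairs from a flat run of chunks."""
    size_fmt = "<I" if little_endian else ">I"
    pos = 0
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack_from(size_fmt, data, pos + 4)
        payload = data[pos + 8:pos + 8 + size]
        if len(payload) != size:
            raise ValueError("truncated chunk")
        yield chunk_id, payload
        pos += 8 + size + (size & 1)  # skip header, payload, and pad byte


iff_bytes = write_chunk(b"NAME", b"abc")
riff_bytes = write_chunk(b"NAME", b"abc", little_endian=True)
```

The only difference between the two outputs is the byte order of the size field. The 32-bit size field is also where the 4GB ceiling comes from, and the fixed 8-byte header is the overhead paid on every small item.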

for some things however, riff and iff were showing limitations:
they waste space with small items;
there is an inherent 4GB filesize limit;
they are not very expressive;
tags have a fixed size of 4 chars, which is lame imo;
...

I had before designed a kludged over variant of riff, which didn't offer 
that much new and was kind of ugly.


EBML:
http://www.matroska.org/technical/specs/index.html

ebml is used as the basis of the matroska format (mka, mkv). on the site 
describing it, it compares itself to xml (but is in most ways similar to 
riff).
its tags and sizes are variable length, sort of fixing some issues with 
riff.
it is, however, not that much like xml.
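
A rough sketch of how ebml-style variable-length numbers work, assuming the scheme in the ebml spec for element sizes: the count of leading zero bits in the first byte gives the number of extra bytes, and the marker bit is dropped from the value. The names `encode_vint` and `decode_vint` are made up here, and the reserved "unknown size" patterns are skipped on encode rather than handled on decode.

```python
def encode_vint(value):
    """Encode a non-negative integer as an ebml-style variable-length number."""
    if value < 0:
        raise ValueError("value must be non-negative")
    for length in range(1, 9):
        # The all-ones payload of each length is reserved, so skip it.
        if value < (1 << (7 * length)) - 1:
            marker = 1 << (7 * length)
            return (marker | value).to_bytes(length, "big")
    raise ValueError("value too large for an 8-byte vint")


def decode_vint(data, pos=0):
    """Decode one vint at data[pos]; return (value, next_pos)."""
    first = data[pos]
    if first == 0:
        raise ValueError("vints longer than 8 bytes are not supported")
    length = 1
    mask = 0x80
    while not (first & mask):  # count leading zero bits
        mask >>= 1
        length += 1
    if pos + length > len(data):
        raise ValueError("truncated vint")
    value = first & (mask - 1)  # strip the marker bit
    for byte in data[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length
```

Small sizes cost one byte instead of four, and the same scheme stretches to 56 bits of payload, which is how it sidesteps both the small-item overhead and the 4GB limit of riff.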


Binary XML:
this has manifested itself in a few forms, one of the most popular is wbxml:
http://www.w3.org/TR/wbxml/
which demonstrates the possibility of binary xml as a hacked together mess 
and has been used in both arguments for and against binary xml.
imo, wbxml is an example of a bad approach.

other ideas I have heard stated involve use of ASN.1 and schemas as a basis 
for binary xml. however, I will argue that this line of approach likely has 
a limited application domain.

there are possible good points, however, to binary xml encodings:
large datasets in a single file;
random access;
possible uses of binary xml in domains where textual xml is currently not 
very suitable;
...


ok, so I have gathered some ideas, and come up with something that sort of 
borrows pieces from iff, ebml (rough structure and variable length numbers), 
xml (namespaces, attributes, ...), and wbxml (use of dictionaries, albeit 
mine may be dynamically constructed and don't have an arbitrary size limit, 
...).
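
To make the "dynamically constructed dictionary" idea concrete, here is a toy sketch. It is not the format from the draft spec (which is not shown here); the opcodes 0x00 (define a new name) and 0x01 (refer to an earlier name by index) and the function names are invented for illustration. It uses single-byte lengths and indices to stay short, so unlike the design described above it does have a size limit; a real encoding would presumably use variable-length numbers instead.

```python
def encode_names(names):
    """Encode tag names, sending each distinct name in full only once."""
    table = {}  # name -> index, built up as names are first seen
    out = bytearray()
    for name in names:
        if name in table:
            out += bytes([0x01, table[name]])  # back-reference by index
        else:
            raw = name.encode("utf-8")
            if len(raw) > 255 or len(table) > 255:
                raise ValueError("toy format: one-byte lengths and indices")
            table[name] = len(table)
            out += bytes([0x00, len(raw)]) + raw  # literal definition
    return bytes(out)


def decode_names(data):
    """Rebuild the name sequence, reconstructing the same table on the fly."""
    table = []
    names = []
    pos = 0
    while pos < len(data):
        op = data[pos]
        if op == 0x00:
            n = data[pos + 1]
            name = data[pos + 2:pos + 2 + n].decode("utf-8")
            table.append(name)
            names.append(name)
            pos += 2 + n
        elif op == 0x01:
            names.append(table[data[pos + 1]])
            pos += 2
        else:
            raise ValueError("unknown opcode")
    return names


tags = ["item", "item", "ns:tag", "item"]
encoded = encode_names(tags)
```

Because the decoder rebuilds the table in the same order the encoder did, no dictionary has to be agreed on in advance, which is the contrast with the fixed code pages of wbxml.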

it is being designed such that it can be used both in the way formats like 
riff/ebml are used, and to represent xml (a subset, eg, the basic syntax and 
namespaces). namespaces are an important feature in dealing with binary data 
types and mixing xml and data, or mixing different kinds of data.


at present I don't have a version of the spec online, or any working 
code for that matter.
I can follow up with the draft spec if anyone cares. for now I am regarding 
it more as a "proof of concept" (if that).

or such...




