Subject: new here, xml, ebml, iff, and binary xml
Date: 11/1/2004 9:09:00 AM

I am new here, so I apologize if I am trolling. I get to a point eventually, I think.
I guess I am requesting opinions on the general idea more than anything else.

XML:
For quite a while I had become complacent with the use of textual XML in my projects. After all:
it is easy to type;
it is easy to read;
it is a rather capable encoding;
...

XML, however, has some limitations:
it is a little bulky in some cases;
it is not well suited to storing large chunks of binary data;
it can't be accessed randomly (at least not without parsing it first, unless I am missing something significant);
...
(e.g. I wouldn't want to use XML as a container for, say, several GB of video.)

IFF:
For other things there are other formats, e.g. IFF and RIFF, which:
are fairly simple (and really differ only in number endianness);
work well for dealing with chunks of binary data (e.g. RIFF is used as the base of both the AVI and WAV formats).

http://www.szonye.com/bradd/iff.html

RIFF is similar, just the endianness is different, 'FORM' is 'RIFF', '    ' (four spaces) is 'JUNK', ...

For some things, however, RIFF and IFF were showing limitations:
they waste space with small items;
there is an inherent 4GB file size limit (chunk sizes are 32-bit);
they are not very expressive;
tags have a fixed size of 4 chars, which is lame imo;
...

I had before designed a kludged-over variant of RIFF, which didn't offer much that was new and was kind of ugly.

EBML:
http://www.matroska.org/technical/specs/index.html

EBML is used as the basis of the Matroska format (mka, mkv).
The site describing it compares it to XML, but in most ways it is closer to RIFF.
Its tags and sizes are variable length, which sort of fixes some of the issues with RIFF.
It is, however, not that much like XML.
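To make the RIFF vs. EBML comparison concrete, here is a minimal sketch of the two header styles. The function names are made up for illustration, and the variable-length size follows my reading of the EBML data-size encoding (a length marker bit in the first byte, 7 value bits per byte, all-ones reserved), so treat it as an approximation rather than a reference implementation.

```python
import struct


def riff_header(tag: bytes, size: int) -> bytes:
    """RIFF-style chunk header: 4-byte tag, then a 32-bit little-endian size."""
    if len(tag) != 4:
        raise ValueError("tag must be exactly 4 bytes")
    if not 0 <= size <= 0xFFFFFFFF:
        raise ValueError("size does not fit in 32 bits")  # the 4GB limit
    return tag + struct.pack("<I", size)


def encode_vint(value: int) -> bytes:
    """EBML-style variable-length size: 1 to 8 bytes, 7 value bits per byte."""
    if value < 0:
        raise ValueError("value must be non-negative")
    for length in range(1, 9):
        # The all-ones value of each length is reserved, hence the "- 1".
        if value < (1 << (7 * length)) - 1:
            marker = 1 << (7 * length)  # single 1 bit that encodes the length
            return (marker | value).to_bytes(length, "big")
    raise ValueError("value too large for an 8-byte size")


def decode_vint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, next_position) for a size starting at data[pos]."""
    first = data[pos]
    if first == 0:
        raise ValueError("sizes longer than 8 bytes are not handled here")
    length = 9 - first.bit_length()  # leading zero bits + 1
    if pos + length > len(data):
        raise ValueError("truncated size field")
    raw = int.from_bytes(data[pos:pos + length], "big")
    value = raw & ((1 << (7 * length)) - 1)  # strip the marker bit
    return value, pos + length
```

With this layout a 5-byte payload costs 8 bytes of header in the RIFF style (`riff_header(b"data", 5)`), while its size alone takes a single byte as `encode_vint(5)`. At the other end, an 8-byte size can describe payloads well past 4GB.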
Binary XML:
This has manifested itself in a few forms. One of the most popular is WBXML:
http://www.w3.org/TR/wbxml/

It demonstrates the possibility of binary XML as a hacked-together mess, and has been used in arguments both for and against binary XML. imo, WBXML is an example of a bad approach.

Other ideas I have heard stated involve using ASN.1 and schemas as a basis for binary XML. However, I will argue that this line of approach likely has a limited application domain.

There are possible good points, however, to binary XML encodings:
large datasets in a single file;
random access;
possible uses of binary XML in domains where textual XML is currently not very suitable;
...

OK, so I have gathered some ideas and come up with something that sort of borrows pieces from IFF and EBML (rough structure and variable-length numbers), XML (namespaces, attributes, ...), and WBXML (use of dictionaries, albeit mine may be dynamically constructed and don't have an arbitrary size limit, ...).

It is being designed so that it can be used like formats such as RIFF/EBML, and can also represent XML (a subset, e.g. the basic syntax and namespaces).
Namespaces are an important feature in dealing with binary data types and in mixing XML and data, or mixing different kinds of data.

At present I don't have a version of the spec online, or any working code for that matter.
I can follow up with the draft spec if anyone cares.

For now I am regarding it more as a "proof of concept" (if that).

or such...
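Since the spec isn't posted, the following is not the format described above. It is only a rough sketch of what a "dynamically constructed" tag dictionary could look like: the first time a name appears it is written out in full and assigned the next number, and after that only the number is written. The class name, the use of 0 as a "definition follows" marker, and the base-128 integer encoding are all assumptions made for illustration.

```python
def write_uint(n: int) -> bytes:
    """Base-128 integer: low 7 bits first, high bit set if more bytes follow."""
    if n < 0:
        raise ValueError("n must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


class TagWriter:
    """Hypothetical encoder that builds its tag dictionary while writing."""

    def __init__(self) -> None:
        self.index: dict[str, int] = {}  # tag name -> assigned number

    def encode_tag(self, name: str) -> bytes:
        number = self.index.get(name)
        if number is not None:
            # Seen before: write number + 1 (0 is reserved for definitions).
            return write_uint(number + 1)
        # First use: assign the next number and write the name in full.
        self.index[name] = len(self.index)
        raw = name.encode("utf-8")
        return write_uint(0) + write_uint(len(raw)) + raw
```

Because numbers are handed out on first use and written as variable-length integers, the dictionary has no fixed size limit and names are not restricted to 4 chars. A reader would rebuild the same table by assigning numbers in the order the definitions arrive.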
