Altova Mailing List Archives
Re: Attribute vs. Element
Date: 5/23/2006 5:11:00 AM
Philipp wrote:

> Hello. OK I know this is the most asked question in XML (it says in some
> tutorial), but still. Please give me your insight on this (as I'm a newbie).

You've actually asked a rarer variant on it.

> I want to store parameters for a programm in an XML file. I can see 3
> intelligent ways to this.
>
>
> 1)
> <?xml version="1.0" ?>
> <PARAMETERS>
>   <LATTICE>
>     <LPAR name="Coverage" unit="ML">0.1</LPAR>
>     <LPAR name="Frequency" unit="Hz">10^3</LPAR>
>   </LATTICE>
>   ... (other params in other elements)
> </PARAMETERS>
>
> 2)
> <?xml version="1.0" ?>
> <PARAMETERS>
>   <LATTICE>
>     <Coverage unit="ML">0.1</Coverage>
>     <Frequency unit="Hz">10^3</Frequency>
>   </LATTICE>
>   ... (other params in other elements)
> </PARAMETERS>
>
> 3)
> <?xml version="1.0" ?>
> <PARAMETERS>
>   <LATTICE>
>     <LPAR name="Coverage" unit="ML" value="0.1"\>
>     <LPAR name="Frequency" unit="Hz" value="10^3"\>
>   </LATTICE>
>   ... (other params in other elements)
> </PARAMETERS>

There is no real difference between #1 and #3. Personally I'd go with #3
if it's strictly a config file, because it's consistent between the
handling of name and value (an entirely trivial human-friendliness issue).
If there's a risk of the file being "viewed" though, then I'd favour #1,
although this is also a very minor issue. There's a vague de facto
standard (from the way HTML gets processed) that "unknown" elements in the
most anonymous default contexts are given a default human rendering by
showing their text content and hiding their attributes.

#2 is interesting though, and quite different. If we assume for the moment
that <LATTICE> and <LPAR> have some generic meaning as "config files" then
with #2 you've also introduced the concepts of "Coverage" and "Frequency"
into the XML DTD, and they're obviously very application-specific. Neither
of these is either good or bad, but they are different -- a DTD that
contains "Frequency" is now application specific, not just a generic one
for doing config files.
That has significant implications about project design - in general XML
doesn't work well unless the entire DTD is mapped out before implementing
the data / code that uses it. Getting this aspect wrong is a regular source
of problems, especially for big projects. There are project techniques to
work round it, or you may even find yourself avoiding XML in favour of a
more up-to-date technique. In the extreme case this becomes the "Nominals"
problem, which is a classic "hard problem" from the AI world.

So swap between elements and attributes without too much thought -- in the
simple case they're both simple atomic structures that are visible in the
XML Infoset and they really are interchangeable (cardinality or internal
structure might force one into becoming an element). When you start moving
application concepts from XML values to XML names though, that's when it
gets interesting.

I'm a little more concerned about the "10^3" markup for exponents within
the value itself. Although this is certainly a reasonable way of
representing such values, it's not mainstream. I'd use a more common
floating point notation such as "1E3" or "1.0E3" instead.

PS - UPPER CASE tagnames get tiring to read after a while. I'd suggest you
use lower case (mixed case is a pain).

> As far as I see, all three are valid
> - no multiple attributes with same name
> - the "value" is atomic, so can be stored in an attribute

Good basic rules to follow.

> - I cannot think of any parameter (Coverage/Freq) which should need to
> be extensible later on.

That's where your <Coverage> element would bite you later, if it would!

> - one attribute (unit) modifies another (value) in version (3), which
> seems to be bad practice

That's fine. It's a reasonable and relevant qualification of the value
(giving it dimensions).

> - I would like to address the parameters directly (using tinyXML), so
> it's easier (is-it?) in version 2 where the Element name,

No. Any "useful" query language makes this almost transparent to you.
If it's hard, get another XML query platform.

The last statement isn't strictly accurate in complex cases involving
Reasoners -- but it's actually <LPAR name="Coverage" ...> that's the
easier case to process!
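
To make the "almost transparent" point concrete, here is a minimal sketch of reading one parameter from both layouts. It is only an illustration under some assumptions: it uses Python's xml.etree.ElementTree and its limited XPath support rather than TinyXML, the helper names (read_generic, read_named, parses_as_float) are made up for this example, the attribute layout is written with a well-formed "/>", and the frequency is written as "1E3" as suggested above.

```python
import xml.etree.ElementTree as ET

# Generic layout: the parameter name is an attribute value.
GENERIC = """<PARAMETERS>
  <LATTICE>
    <LPAR name="Coverage" unit="ML" value="0.1"/>
    <LPAR name="Frequency" unit="Hz" value="1E3"/>
  </LATTICE>
</PARAMETERS>"""

# Application-specific layout: the parameter name is the element name.
NAMED = """<PARAMETERS>
  <LATTICE>
    <Coverage unit="ML">0.1</Coverage>
    <Frequency unit="Hz">1E3</Frequency>
  </LATTICE>
</PARAMETERS>"""


def read_generic(root, name):
    """Hypothetical helper: (value, unit) from <LPAR name="..."/>."""
    node = root.find(f"./LATTICE/LPAR[@name='{name}']")
    return float(node.get("value")), node.get("unit")


def read_named(root, name):
    """Hypothetical helper: (value, unit) from an element called `name`."""
    node = root.find(f"./LATTICE/{name}")
    return float(node.text), node.get("unit")


def parses_as_float(text):
    """Report whether Python's float() accepts the given notation."""
    try:
        float(text)
        return True
    except ValueError:
        return False


generic_root = ET.fromstring(GENERIC)
named_root = ET.fromstring(NAMED)

print(read_generic(generic_root, "Frequency"))  # (1000.0, 'Hz')
print(read_named(named_root, "Frequency"))      # (1000.0, 'Hz')
print(parses_as_float("1E3"), parses_as_float("10^3"))  # True False
```

Either lookup is a single path expression, so ease of addressing is not much of a reason to pick one layout over the other. The last line shows the practical side of the notation remark: a stock float parser accepts "1E3" but would need custom handling for "10^3".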