Newsgroup: microsoft.public.xml
RE: SAX parse and ISAXContentHandlerImpl Members
To: NULL
Date: 9/3/2005 9:23:00 PM

"RCGray" wrote:
> I have an application which takes an XML document and uses a SAX reader to
> parse the document. The xml document could have several "empty elements"
> like <Value /> or <Value></Value>.
>
> I am currently using the startElement, characters, and endElement methods to
> help me determine what to do with the data. My question is, how can I
> differentiate between an empty element and a starting element in a
> "structure"?
>
> <something>
> <somethingelse>data</somethingelse>
> <anemptyelement />
> </something>
>
> I am building a vector list of those data elements through the parser and
> these methods. When endElement is called I push the data to the vector if
> the data is not an empty string. In the example above <something> does not
> have any "data" and the string is empty so I bypass pushing the data to the
> vector. The same problem occurs with <anemptyelement /> although I want to
> know in the main application that <anemptyelement> was simply blank this time.
>
> Is there a method that would easily distinguish between an empty element and
> the other element?
>
> Should I just keep track of the "levels" that I am in? Devise a hack where
> <something> would be level 0 and <somethingelse> and <anemptyelement> would
> be in level 1. I could add this concept of levels to startElement. I am
> hoping for something more elegant.
>
> Thanks for your time...
The term "structure" above should have been "element group". Anyway, I can
tell that endElement() is closing an empty element if the last handler that
ran was startElement(). So each handler method records a level value just
before it returns, i.e. startElement() sets level 1, characters() sets
level 2, and endElement() sets level 3. Therefore in endElement(), if my
value string is empty and the last level was 1, I know I was processing
an empty element, and I push an empty string to the vector.
The code says it better:
HRESULT STDMETHODCALLTYPE MyContent::endElement(
    /* [in] */ wchar_t __RPC_FAR *pwchNamespaceUri,
    /* [in] */ int cchNamespaceUri,
    /* [in] */ wchar_t __RPC_FAR *pwchLocalName,
    /* [in] */ int cchLocalName,
    /* [in] */ wchar_t __RPC_FAR *pwchRawName,
    /* [in] */ int cchRawName)
{
    if (m_sValue.size() != 0)
    {
        // The element had character data: store it.
        masDocument.push_back(m_sValue);
    }
    else if (m_iLastLevel == 1)
    {
        // startElement() ran immediately before this call, so the
        // element was empty: store an empty string as a placeholder.
        m_sValue = _T("");
        masDocument.push_back(m_sValue);
    }
    m_sValue.clear();
    m_iLastLevel = 3; // 1 = startElement(), 2 = characters(), 3 = endElement()
    return S_OK;
}
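
For anyone who wants to try the idea without MSXML, here is a minimal standalone sketch of the same "last handler" bookkeeping. It is not the real ISAXContentHandler interface: the class EmptyElementTracker, the enum LastEvent, and the function replaySample() are hypothetical names made up for illustration, and the handlers are called by hand in the order a SAX reader would be expected to report the sample document (with no whitespace between the tags). Clearing the buffer in startElement() is an assumption, since the original startElement() is not shown. Note that a SAX reader reports <Value /> and <Value></Value> the same way, as startElement() immediately followed by endElement(), so this sketch treats both as empty.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Which handler ran most recently (replaces the magic numbers 1, 2, 3).
enum class LastEvent { None, StartElement, Characters, EndElement };

// Hypothetical stand-in for the content handler; not the MSXML interface.
class EmptyElementTracker {
public:
    void startElement(const std::wstring& /*localName*/) {
        m_value.clear();                  // assumption: start each element with a clean buffer
        m_last = LastEvent::StartElement;
    }

    void characters(const std::wstring& text) {
        m_value += text;                  // SAX may deliver text in several chunks
        m_last = LastEvent::Characters;
    }

    void endElement(const std::wstring& /*localName*/) {
        if (!m_value.empty()) {
            m_document.push_back(m_value);         // element carried data
        } else if (m_last == LastEvent::StartElement) {
            m_document.push_back(std::wstring());  // empty element: keep a placeholder
        }
        // Otherwise this closes an element group such as <something>: store nothing.
        m_value.clear();
        m_last = LastEvent::EndElement;
    }

    const std::vector<std::wstring>& document() const { return m_document; }

private:
    std::wstring m_value;
    std::vector<std::wstring> m_document;
    LastEvent m_last = LastEvent::None;
};

// Replays the events for:
// <something><somethingelse>data</somethingelse><anemptyelement /></something>
std::vector<std::wstring> replaySample() {
    EmptyElementTracker handler;
    handler.startElement(L"something");
    handler.startElement(L"somethingelse");
    handler.characters(L"data");
    handler.endElement(L"somethingelse");
    handler.startElement(L"anemptyelement");
    handler.endElement(L"anemptyelement");
    handler.endElement(L"something");
    assert(handler.document().size() == 2);  // "data" plus one empty placeholder
    return handler.document();
}
```

Calling replaySample() should give back two entries: the string "data" for <somethingelse> and an empty string for <anemptyelement />. Nothing is stored for <something>, because the last handler to run before its endElement() was another endElement(). If the real document has line breaks between the tags, the reader would be expected to report them through characters() as well, so whitespace-only text may need trimming before the empty check.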
This should suffice for now. I couldn't find a good reference in the MSDN
documentation or a good web page; otherwise I'd cite them.
Happy Labor Day.
