Re: [xml-dev] XHTML 5 and validation

From: Mike Sokolov <sokolov@--------.--->
To: Jesper Tverskov <jesper.tverskov@-----.--->
Date: 5/20/2011 10:21:00 PM
A BOM in UTF-8 seems to cause problems with some XML parsers (including
Xerces 2.9.1).  They seem to treat it as white space in the prolog.  To
deal with this, we have had to insert a processing step ahead of our
parser that checks for a BOM and strips it out.
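
A minimal sketch of that kind of pre-parse step, written in Python purely
for illustration (the message does not say how the actual processor was
implemented, and the Xerces setup described above is Java). The function
name strip_utf8_bom and the sample document are hypothetical. The idea is
only to drop the three BOM bytes EF BB BF, if present, before the parser
sees the input:

```python
import codecs
import xml.etree.ElementTree as ET


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical polyglot-style document that starts with a BOM.
sample = codecs.BOM_UTF8 + (
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b'<head><meta charset="UTF-8"/><title>t</title></head>'
    b'<body/></html>'
)

# Strip the BOM first, then hand the clean bytes to the XML parser.
cleaned = strip_utf8_bom(sample)
root = ET.fromstring(cleaned)
```

Input without a BOM passes through unchanged, so the step should be safe
to apply to every document in front of the parser.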

-Mike

On 05/20/2011 12:05 PM, Jesper Tverskov wrote:
> Good news!
>
> All the issues I have so aggressively raised have been solved.
>
> The solution is to use the wonderful polyglot version of XHTML5,
> http://www.w3.org/TR/html-polyglot/, that is an XML document that
> validates as HTML5 if served with mimetype "text/html" and as XHTML5
> if served with mimetype "application/xhtml+xml".
>
> I have made a test document and both W3C Markup Validator and
> Validator.nu work right away. The document validates without the need
> for settings as HTML5 and as XHTML5 depending only on the mimetype
> used!
>
> The W3C Markup Validator could be better. It doesn't say what mimetype
> was detected, and reports "valid HTML5" in both cases.
>
> *** There is another problem with the W3C Validator that I would like
> to ask the list about.
>
> The W3C guidelines, "Polyglot Markup: HTML-Compatible XHTML Documents"
> (see link above), recommend using one of three methods, separately or
> in combination, to get the encoding right:
>
> Within the document
> 1.	Byte Order Mark (BOM) character (preferred).
> 2.	<meta charset="UTF-8"/>.
> Outside the document
> 3.	When setting the mimetype.
>
> I use all three methods.
>
> The W3C Markup Validator validates the document but gives the following warning:
>
> "Byte-Order Mark found in UTF-8 File. The Unicode Byte-Order Mark
> (BOM) in UTF-8 encoded files is known to cause problems for some text
> editors and older browsers. You may want to consider avoiding its use
> until it is better supported."
>
> Is this still relevant? Should I drop the BOM (the polyglot guidelines
> call it the preferred method) and only use the two other methods?
>
> By the way, some other good news.
>
> *** It is easy to create polyglot XHTML5 with XSLT 2.0 (I used
> Saxon). The following serialization attributes should be used:
>
> method="xhtml"
> omit-xml-declaration="yes"
> include-content-type="no"
> byte-order-mark="yes" (optional)
> encoding="UTF-8" (optional)
> indent="yes" (optional)
>
> And you should remember to place <meta charset="UTF-8"/> in the head
> section of the XHTML document.
> That is it. Thanks for the help.
>
> Cheers,
> Jesper Tverskov
> http://www.xmlplease.com
>
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@l...
> subscribe: xml-dev-subscribe@l...
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>
>    
