Altova Mailing List Archives


Re: [xml-dev] XML: why there is no escape (was Re: [xml-dev] What to escape when serializing XML)

From: Rick Jelliffe <rjelliffe@-------.---.-->
To: xml-dev@-----.---.---
Date: 1/3/2007 11:33:00 PM
Rick Marshall wrote:
> I don't know why, but I'm guessing Dennis Ritchie chose the "\" 
> instead of ESC
Oh, ESC would not be suitable at all, it is a character coding thing not 
a language-level code.
> Personally I think it would have been better for XML (SGML?) to stick 
> to an existing programming practice (and {} instead of <>) - but the 
> document world has evolved differently to the programming world and I 
> guess we just have to live with the clash of cultures.
Well, SGML allowed you to change < for { but almost no-one did it. This 
is because in programs "<" is very common and "{" is rare, while in 
legal/technical/quality documents "{" is more common than "<".  (But 
sometimes for documents with many "<" people did remap to use "{".)  
Usually when people did use "{", they used it as a shortcut for an open 
tag, not just as a delimiter:
   {p}This is a {list}{item}short{/item}{item}example{/item}{/list}{/p}
is not a great advance in markup, while short-referencing it down to the 
wiki-like
   This is a {*short *example}
is much more useful.
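
As a rough illustration of the idea (not of SGML's actual SHORTREF machinery), the sketch below expands the wiki-like form back into the fully tagged form. The rule that "*" starts an item inside "{...}" and the names SHORT_LIST, expand_short_list and expand are assumptions made up for this example.

```python
import re

# Hypothetical shorthand: "{*a *b}" stands for a list whose items are each
# introduced by "*". This mapping is an assumption for illustration only.
SHORT_LIST = re.compile(r"\{((?:\*[^*{}]*)+)\}")

def expand_short_list(match):
    # Split the body on "*" and drop the empty piece before the first item.
    items = [item.strip() for item in match.group(1).split("*") if item.strip()]
    body = "".join("{item}" + item + "{/item}" for item in items)
    return "{list}" + body + "{/list}"

def expand(line):
    # Wrap the whole line in a paragraph and expand any list shorthand in it.
    return "{p}" + SHORT_LIST.sub(expand_short_list, line) + "{/p}"

expanded = expand("This is a {*short *example}")
print(expanded)
```

Under those assumptions the short form carries the same structure as the long one, with far fewer delimiters for the author to type.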
> Michael Kay wrote:
>>> To escape a character means to do something (typically, to prefix it 
>>> with \ in C-family languages) to allow the character to be used 
>>> literally but without its normal parser treatment.  So \ before a 
>>> newline in a shell script is an escaped character.     
>>
>> Kernighan and Ritchie don't use "escape" as a verb, but they do refer to
>> constructs such as "\n" and "\b" as "escape sequences". So it seems fairly
>> natural that people should use the verb "escape [a character]" to mean
>> "represent [a character] by means of an escape sequence". Representing tab
>> by "\t" doesn't seem very different from representing tab by "&#x9;", so
>> it's natural that the same verb should be used for that too.
I think K&R might have called them escape sequences because the codes 
would be converted by particular implementations into device-dependent 
escape sequences (e.g. the appropriate ANSI escape sequence or whatever, 
using termcap or printcap or whatever.) Not because the "r" was being 
escaped by the "\".

But a question like "How do we escape a non-printing character" should 
have only one answer in XML: "You cannot escape non-printing characters 
(in the sense of adding some prefix that makes them OK); you can only 
represent them using numeric character references (which some people may 
call an escape sequence), and some of them, notably NULL, you simply 
cannot represent unless you have HEX or BIN64 encoded embedded fragments."
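
A small sketch of that point, assuming Python's standard xml.etree.ElementTree (an XML 1.0 parser built on expat) as the test bed. The element names "a" and "blob" and the helper parses are invented for the example. A tab written as a numeric character reference parses fine, a reference to character 0 is rejected, and the only way to carry the byte is to encode it, here with Base64.

```python
import base64
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# A tab can be represented by a numeric character reference.
tab_text = ET.fromstring("<a>&#x9;</a>").text

# Character 0 cannot: the reference itself is not well-formed.
nul_ok = parses("<a>&#x0;</a>")

# Workaround: embed the bytes as Base64 text and decode after parsing.
payload = b"before\x00after"
encoded = base64.b64encode(payload).decode("ascii")
doc = ET.fromstring("<blob>" + encoded + "</blob>")
decoded = base64.b64decode(doc.text)

print(repr(tab_text), nul_ok, decoded == payload)
```

The encoded fragment is opaque to the XML parser, so the receiving application has to know which elements to decode.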

Cheers
Rick

