Path: csiph.com!usenet.pasdenom.info!gegeweb.org!de-l.enfer-du-nord.net!feeder2.enfer-du-nord.net!fu-berlin.de!uni-berlin.de!news.dfncis.de!not-for-mail
From: =?ISO-8859-1?Q?Sven_K=F6hler?= <remove-sven.koehler@gmail.com>
Newsgroups: comp.lang.java.programmer
Subject: Re: A proposal to handle file encodings
Date: Sun, 25 Nov 2012 15:09:40 +0100
Lines: 18
Message-ID: <ahen43F9fcbU2@mid.dfncis.de>
References: <lb6ta81u9imfdtlpuesoc8slncju0ehsnm@4ax.com> <k8o50f$1q6$1@news.albasani.net> <9kava8lk1ignppq7rso7gmcb541gnerf8q@4ax.com> <k8oers$p98$1@news.albasani.net> <slrnkb00l8.jbt.hjp-usenet2@hrunkner.hjp.at>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Trace: news.dfncis.de yxD+8QSW51qx1uXQ8A9a1AKkHH/4wsc2Tvzf6RMovt+fOFLRNLT3u+BUdRVj9RUgWqDIF7+Drx
Cancel-Lock: sha1:cqWAMJGp7qWGk+yLRDedjxs7p4o=
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:16.0) Gecko/20121028 Thunderbird/16.0.1
In-Reply-To: <slrnkb00l8.jbt.hjp-usenet2@hrunkner.hjp.at>
Xref: csiph.com comp.lang.java.programmer:19940

On 24.11.2012 00:11, Peter J. Holzer wrote:
> On 2012-11-23 18:21, Jan Burse <janburse@fastmail.fm> wrote:
>> Roedy Green wrote:
>>> The HTML encoding is incompetent. You can't read it without knowing
>>> the encoding.
> 
> Not true in practice. Almost all encodings used in the real world are
> some superset of ASCII, and you only need to recognize ASCII characters
> to find the relevant meta tag.

UTF-16LE/BE is an exception, for example: it is not a superset of ASCII.
Or is a BOM mandatory for UTF-16? The downside of BOMs is that they
break features like includes. Many include mechanisms just copy the
byte stream, so BOMs appear in the middle of the page.
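
To illustrate the include problem, here is a minimal sketch of an include
step that drops a leading BOM from the fragment before copying it. The
class name BomSketch and the methods bomLength and include are made up
for this example and are not from any real include mechanism. It only
looks for the UTF-8 and UTF-16 BOMs, ignores UTF-32, and assumes the page
and the fragment use the same encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class BomSketch {

    /** Returns the length of a leading UTF-8 or UTF-16 BOM, or 0 if there is none. */
    static int bomLength(byte[] b) {
        // UTF-8 BOM: EF BB BF
        if (b.length >= 3
                && (b[0] & 0xFF) == 0xEF
                && (b[1] & 0xFF) == 0xBB
                && (b[2] & 0xFF) == 0xBF) {
            return 3;
        }
        if (b.length >= 2) {
            int b0 = b[0] & 0xFF;
            int b1 = b[1] & 0xFF;
            // UTF-16BE BOM: FE FF, UTF-16LE BOM: FF FE
            if ((b0 == 0xFE && b1 == 0xFF) || (b0 == 0xFF && b1 == 0xFE)) {
                return 2;
            }
        }
        return 0;
    }

    /** Appends the fragment to the page, skipping a BOM at the start of the fragment. */
    static byte[] include(byte[] page, byte[] fragment) {
        int skip = bomLength(fragment);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(page, 0, page.length);
        out.write(fragment, skip, fragment.length - skip);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] page = "<p>page</p>".getBytes(StandardCharsets.UTF_8);
        // A fragment saved by an editor that writes a UTF-8 BOM.
        byte[] fragment = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'b', '>'};
        byte[] merged = include(page, fragment);
        System.out.println(new String(merged, StandardCharsets.UTF_8));
    }
}
```

A mechanism that just copies the byte stream corresponds to skip = 0
here, which is how the BOM ends up in the middle of the page.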


Regards,
  Sven