Re: Readlines returns non ASCII character

Subject Re: Readlines returns non ASCII character
References <CAFHq_S4dQxkQoHhP0hQfNvZ0KVtz2F-PFqbePOyYUGFXCLqipg@mail.gmail.com> <CAJ4+4aoMn+93oQsVXnDF2wmnpDfUbg9zna==v2RVh2DP_o4otg@mail.gmail.com> <CAFHq_S6wVanCc6GKRRRWdYwFcqzq2Cv-Cdz3KPz=hX0w9jZdPw@mail.gmail.com> <CAJ4+4aqoYLeeShM+PSR5hxdHa=v_d=XGCY_MifxUPQUwVKuuMQ@mail.gmail.com>
From MRAB <python@mrabarnett.plus.com>
Date 2015-09-24 01:09 +0100
Newsgroups comp.lang.python
Message-ID <mailman.111.1443053408.28679.python-list@python.org>


On 2015-09-24 00:51, paul.hermeneutic@gmail.com wrote:
>   If this starts at the beginning of the file, then it indicates that
> the file is UTF-16 (LE).
>
> UTF-8          EF BB BF       239 187 191
> UTF-16 (BE)    FE FF          254 255
> UTF-16 (LE)    FF FE          255 254
> UTF-32 (BE)    00 00 FE FF    0 0 254 255
> UTF-32 (LE)    FF FE 00 00    255 254 0 0
>
The "signature" EF BB BF indicates the encoding that Python calls
"utf-8-sig", i.e. UTF-8 with a leading byte order mark (BOM). It is
mostly written by Windows programs (Notepad, for example).
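
A small illustration of the difference, using made-up sample bytes rather
than the original poster's file: plain "utf-8" keeps the signature as a
U+FEFF character at the start of the text, while "utf-8-sig" strips it.

```python
# Hypothetical sample data: the UTF-8 signature followed by ASCII text.
raw = b"\xef\xbb\xbfhello\n"

# Plain UTF-8 decodes the signature as the character U+FEFF.
as_utf8 = raw.decode("utf-8")

# utf-8-sig removes the signature if it is present.
as_utf8_sig = raw.decode("utf-8-sig")

print(repr(as_utf8))      # '\ufeffhello\n'
print(repr(as_utf8_sig))  # 'hello\n'
```

"utf-8-sig" also decodes files that have no signature, so it is a
reasonable choice when reading UTF-8 files of unknown origin.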

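If the file doesn't start with any of these, then it could be using
almost any encoding. Only the UTF-16 and UTF-32 variants that write a BOM
are ruled out; the BOM-less ones ("utf-16-le", "utf-16-be", etc.) are
still possible.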
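
The table above could be checked with something like the sketch below. It
is only an illustration: sniff_bom and _BOMS are names I made up, and it
only ever looks for a BOM, so a None result says nothing about the real
encoding. The BOM constants come from the standard codecs module.

```python
import codecs

# Longer signatures go first: the UTF-32 (LE) BOM (FF FE 00 00) begins
# with the UTF-16 (LE) BOM (FF FE), so the order of the checks matters.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_BE, "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16"),
]


def sniff_bom(data):
    """Return a codec name if data starts with a known BOM, else None.

    The returned codecs ("utf-8-sig", "utf-16", "utf-32") all consume
    the BOM when decoding, and the latter two use it to pick byte order.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


# Hypothetical sample: "hi" as UTF-16 (LE) with a BOM.
sample = b"\xff\xfeh\x00i\x00"
encoding = sniff_bom(sample)
text = sample.decode(encoding)
```

In practice you would read the first four bytes of the file in binary
mode, pass them to sniff_bom, and then reopen the file in text mode with
the encoding it returns (or with a fallback of your choosing if it
returns None).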