| Subject | Re: Readlines returns non ASCII character |
|---|---|
| References | (1 earlier) <CAJ4+4aoMn+93oQsVXnDF2wmnpDfUbg9zna==v2RVh2DP_o4otg@mail.gmail.com> <CAFHq_S6wVanCc6GKRRRWdYwFcqzq2Cv-Cdz3KPz=hX0w9jZdPw@mail.gmail.com> <CAJ4+4aqoYLeeShM+PSR5hxdHa=v_d=XGCY_MifxUPQUwVKuuMQ@mail.gmail.com> <56033F56.1020308@mrabarnett.plus.com> <CALwzidnuUP8bL7cTP05VTCtC_1ok8i4VMXEg-exGc-_egu=beA@mail.gmail.com> |
| From | MRAB <python@mrabarnett.plus.com> |
| Date | 2015-09-24 03:02 +0100 |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.114.1443060146.28679.python-list@python.org> |

On 2015-09-24 02:37, Ian Kelly wrote:
> On Wed, Sep 23, 2015 at 6:09 PM, MRAB <python@mrabarnett.plus.com> wrote:
>> On 2015-09-24 00:51, paul.hermeneutic@gmail.com wrote:
>>>
>>> If this starts at the beginning of the file, then it indicates that
>>> the file is UTF-16 (LE).
>>>
>>> UTF-8[t 1]    EF BB BF      239 187 191
>>> UTF-16 (BE)   FE FF         254 255
>>> UTF-16 (LE)   FF FE         255 254
>>> UTF-32 (BE)   00 00 FE FF   0 0 254 255
>>> UTF-32 (LE)   FF FE 00 00   255 254 0 0
>>>
>> The "signature" EF BB BF indicates the encoding called "utf-8-sig" by
>> Python. It occurs on Windows.
>>
>> If the file doesn't start with any of these, then it could be using any
>> encoding (except UTF-16 or UTF-32).
>
> Yes, but what does it mean when the signature is 00 FF 00 FE 00 FF and
> occurs not at the beginning but repeatedly throughout the file, as
> appears in the OP's case?
>
> At least, I'm assuming that the high-order bytes are 00 based on what
> the OP posted. I wouldn't be surprised though if they're just being
> mangled by the terminal, if it happens to be a certain one that will
> not be named but uses CP 1252.
>
Yes, a byte-string literal or a hex dump of, say, the first 256 bytes
would've been better.
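
For anyone wanting to produce that kind of dump, here is a minimal sketch, assuming Python 3. It reads the file in binary mode so that no terminal or codec gets a chance to mangle the bytes, prints the first 256 bytes both as a byte-string literal and as a hex dump, and compares the start of the data with the signatures in the quoted table. The names `sniff_bom`, `hex_dump`, `show_head` and the `path` argument are made up for illustration and are not taken from the thread. A BOM match is only a hint: FF FE 00 00 could also be UTF-16 (LE) text that begins with a NUL character.

```python
import codecs

# Longest signatures first: the UTF-32 (LE) BOM (FF FE 00 00) begins with
# the UTF-16 (LE) BOM (FF FE), so the order of the checks matters.
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def sniff_bom(data):
    """Return the encoding suggested by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def hex_dump(data, width=16):
    """Format bytes as lines of '<offset>  <hex bytes>', width bytes per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("{:02X}".format(b) for b in chunk)
        lines.append("{:08X}  {}".format(offset, hex_part))
    return "\n".join(lines)


def show_head(path, count=256):
    """Print the first `count` bytes of the file at `path` (hypothetical helper)."""
    with open(path, "rb") as f:  # binary mode: no decoding takes place
        head = f.read(count)
    print(repr(head))            # byte-string literal
    print(hex_dump(head))        # hex dump
    print("BOM suggests:", sniff_bom(head))
```

Calling `show_head` on the problem file and pasting the output into a post would show whether the FF FE bytes really sit at the start of the file or recur throughout it.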