| Path | csiph.com!usenet.pasdenom.info!gegeweb.org!de-l.enfer-du-nord.net!feeder1.enfer-du-nord.net!feeds.phibee-telecom.net!newsfeed.xs4all.nl!newsfeed6.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <buck@yelp.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| Newsgroups | comp.lang.python |
| Date | Fri, 16 Nov 2012 15:27:54 -0800 (PST) |
| In-Reply-To | <mailman.3762.1353105272.27098.python-list@python.org> |
| References | <f063ebaf-89ee-4558-a762-0241efa39dcc@googlegroups.com> <mailman.3762.1353105272.27098.python-list@python.org> |
| Subject | Re: latin1 and cp1252 inconsistent? |
| From | buck@yelp.com |
| To | comp.lang.python@googlegroups.com |
| Content-Type | text/plain; charset=ISO-8859-1 |
| Content-Transfer-Encoding | quoted-printable |
| Cc | Python <python-list@python.org> |
| Message-ID | <mailman.3764.1353108483.27098.python-list@python.org> |
On Friday, November 16, 2012 2:34:32 PM UTC-8, Ian wrote:
> On Fri, Nov 16, 2012 at 2:44 PM, <buck> wrote:
>
> > Latin1 has a block of 32 undefined characters.
>
>
> These characters are not undefined. 0x80-0x9f are the C1 control
> codes in Latin-1, much as 0x00-0x1f are the C0 control codes, and
> their Unicode mappings are well defined.
They are indeed undefined: ftp://std.dkuug.dk/JTC1/sc2/wg3/docs/n411.pdf
""" The shaded positions in the code table correspond
to bit combinations that do not represent graphic
characters. Their use is outside the scope of
ISO/IEC 8859; it is specified in other International
Standards, for example ISO/IEC 6429. """
However, it's reasonable for 0x81 to decode to U+0081, because the Unicode Standard says: http://www.unicode.org/versions/Unicode6.2.0/ch16.pdf
""" The semantics of the control codes are generally determined by the application with which they are used. However, in the absence of specific application uses, they may be interpreted according to the control function semantics specified in ISO/IEC 6429:1992. """
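
For anyone who wants to see the difference concretely, here is a minimal sketch (Python 3, standard library only; the variable names are just for illustration). It decodes the same bytes with both codecs: latin-1 maps 0x81 straight to the control code U+0081, while cp1252 has no mapping for that byte and raises.

```python
import unicodedata

raw = b'hello \x81 world'

# latin-1 maps every byte value to the code point with the same number,
# so 0x81 becomes the C1 control character U+0081.
latin1_text = raw.decode('latin-1')
c1_char = latin1_text[6]
c1_category = unicodedata.category(c1_char)  # general category of U+0081

# cp1252 leaves a handful of byte values unassigned; 0x81 is one of them,
# so a strict decode fails.
try:
    raw.decode('cp1252')
    cp1252_error = None
except UnicodeDecodeError as exc:
    cp1252_error = exc

# Collect every byte value that cp1252 refuses to decode in strict mode.
cp1252_unassigned = []
for value in range(256):
    try:
        bytes([value]).decode('cp1252')
    except UnicodeDecodeError:
        cp1252_unassigned.append(value)

print(repr(c1_char), c1_category)
print(cp1252_error)
print([hex(v) for v in cp1252_unassigned])
```

Because latin-1 assigns something to all 256 byte values, a latin-1 decode followed by a latin-1 encode always returns the original bytes, which is not true of a strict cp1252 decode.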
> You can use a non-strict error handling scheme to prevent the error.
> >>> b'hello \x81 world'.decode('cp1252', 'replace')
> 'hello \ufffd world'
This makes the decoding non-reversible and loses data, which isn't acceptable for my application.
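
To illustrate the data loss, and one possible way around it, here is a hedged sketch. The 'surrogateescape' error handler was not proposed anywhere above; it is my assumption that it would fit this use case, and it needs Python 3.1 or later. It smuggles each undecodable byte through as a lone surrogate so the original bytes can be recovered on encode.

```python
raw = b'hello \x81 world'

# 'replace' substitutes U+FFFD for the undecodable byte, so the original
# value 0x81 cannot be recovered afterwards.
lossy = raw.decode('cp1252', 'replace')
lossy_roundtrip = lossy.encode('cp1252', 'replace')

# 'surrogateescape' maps the undecodable byte 0x81 to the lone surrogate
# U+DC81, and maps it back to 0x81 when encoding with the same handler.
escaped = raw.decode('cp1252', 'surrogateescape')
restored = escaped.encode('cp1252', 'surrogateescape')

print(repr(lossy), lossy_roundtrip)
print(repr(escaped), restored)
```

The catch is that the resulting str contains a lone surrogate, so it cannot be written out as strict UTF-8 or handed to code that expects well-formed text without encoding it back first.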