| Header | Value |
|---|---|
| Subject | Re: 'Straße' ('Strasse') and Python 2 |
| From | Travis Griggs <travisgriggs@gmail.com> |
| In-Reply-To | <52D68402.6030407@chamonix.reportlab.co.uk> |
| Date | Wed, 15 Jan 2014 08:28:49 -0800 |
| To | python-list@python.org |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.5520.1389803336.18130.python-list@python.org> |
On Jan 15, 2014, at 4:50 AM, Robin Becker <robin@reportlab.com> wrote:
> On 15/01/2014 12:13, Ned Batchelder wrote:
> ........
>>> On my utf8 based system
>>>
>>>
>>>> robin@everest ~:
>>>> $ cat ooo.py
>>>> if __name__=='__main__':
>>>>     import sys
>>>>     s='A̅B'
>>>>     print('version_info=%s\nlen(%s)=%d' % (sys.version_info,s,len(s)))
>>>> robin@everest ~:
>>>> $ python ooo.py
>>>> version_info=sys.version_info(major=3, minor=3, micro=3,
>>>> releaselevel='final', serial=0)
>>>> len(A̅B)=3
>>>> robin@everest ~:
>>>> $
>>>
>>>
> ........
>> You are right that more than one codepoint makes up a grapheme, and that you'll
>> need code to deal with the correspondence between them. But let's not muddy
>> these already confusing waters by referring to that mapping as an encoding.
>>
>> In Unicode terms, an encoding is a mapping between codepoints and bytes. Python
>> 3's str is a sequence of codepoints.
>>
> Semantics is everything. For me graphemes are the endpoint (or should be); to get a proper rendering of a sequence of graphemes I can use either a sequence of bytes or a sequence of codepoints. They are both encodings of the graphemes; what unicode says is an encoding doesn't define what encodings are ie mappings from some source alphabet to a target alphabet.
But you’re talking about two levels of encoding. One runs on top of the other. Insisting on calling them all encodings makes the term pointless, because it becomes ambiguous which one you’re referring to. Are you referring to encoding in the sense of representing code points with bytes? Or are you referring to what the Unicode guys call “forms”?
For example, the NFC form of ‘ñ’ is ‘\u00F1’. The NFD form represents the exact same grapheme, but is ‘\u006e\u0303’. You can call them encodings if you want, but I echo Ned’s sentiment that you keep that to yourself. Conventionally, they’re different forms, not different encodings. You can encode either form with an encoding, e.g.:
'\u00F1'.encode('utf8')
'\u00F1'.encode('utf16')
'\u006e\u0303'.encode('utf8')
'\u006e\u0303'.encode('utf16')
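To make the two levels concrete, here is a minimal sketch, assuming Python 3. It uses the standard library's `unicodedata.normalize` to move between the two forms, which the post itself does not mention, and it picks `utf-16-le` rather than plain `utf16` only so that the bytes contain no byte-order mark and do not depend on the platform.

```python
import unicodedata

# Level 1: normalization forms. Both are sequences of code points.
nfc = unicodedata.normalize("NFC", "\u006e\u0303")  # composed: U+00F1
nfd = unicodedata.normalize("NFD", "\u00f1")        # decomposed: U+006E U+0303

print(len(nfc), len(nfd))  # 1 2 (same grapheme, different code point counts)

# Level 2: encodings. Each form can be turned into bytes by any encoding.
for form_name, text in (("NFC", nfc), ("NFD", nfd)):
    for encoding in ("utf-8", "utf-16-le"):
        print(form_name, encoding, text.encode(encoding).hex())
```

Changing the form changes the code points; changing the encoding changes only the bytes that represent those code points. The two choices are independent of each other.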