From: Ben Finney
Subject: Re: Language design
Date: Thu, 12 Sep 2013 10:57:04 +1000
To: python-list@python.org
Newsgroups: comp.lang.python

Mark Janssen writes:

> > Unicode is not 16-bit any more than ASCII is 8-bit. And you used the
> > word "encod[e]", which is the standard way to turn Unicode into bytes
> > anyway. No, a Unicode string is a series of codepoints - it's most
> > similar to a list of ints than to a stream of bytes.
>
> Okay, now you're in blah, blah land.

Text is (in the third millennium) Unicode. Unicode text is not binary
data and never will be. Unicode text can be *encoded* to binary data,
and that data can be *decoded* back to Unicode text. The two are never
the same thing.

You're demonstrating my point: the pernicious “text is binary data”
falsehood needs to be eradicated from everything today's programmers
learn. We need the simple facts about the basic difference between text
and bytes to be learned by every programmer as early as feasible.

-- 
 \       德不孤、必有鄰。 (The virtuous are not abandoned, |
  `\                 they shall surely have neighbours.) |
_o__)            —孔夫子 Confucius, 551 BCE – 479 BCE |
Ben Finney
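
To make the encode/decode distinction in the post concrete, here is a minimal Python 3 sketch. The variable names and the sample string are illustrative choices, not something from the thread. It only shows that a str (code points) and bytes (encoded data) are separate types, and that the bytes you get depend on the encoding you pick.

```python
# Text: a str is a sequence of Unicode code points.
text = "德不孤"

# Encoding turns text into binary data; the result depends on the codec.
utf8_data = text.encode("utf-8")
utf16_data = text.encode("utf-16-le")

print(type(text).__name__, len(text))            # str, counted in code points
print(type(utf8_data).__name__, len(utf8_data))  # bytes, counted in bytes
print(len(utf16_data))                           # a different byte count

# The code points themselves are just integers, independent of any encoding.
code_points = [ord(ch) for ch in text]
print([hex(cp) for cp in code_points])

# Decoding with the matching codec recovers the original text.
roundtrip = utf8_data.decode("utf-8")
print(roundtrip == text)
```

The same three code points come out as different byte sequences under UTF-8 and UTF-16, which is one way to see that the text and its encoded form are not the same thing.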