From: Ethan Furman
Date: Fri, 06 Jun 2014 06:24:35 -0700
Newsgroups: comp.lang.python
Subject: Re: Python 3.2 has some deadly infection

On 06/05/2014 11:30 AM, Marko Rauhamaa wrote:
>
> How text is represented is very different from whether text is a
> fundamental data type. A fundamental text file is such that ordinary
> operating system facilities can't see inside the black box (that is,
> they are *not* encoded as far as the applications go).

Of course they are. It may be an ASCII-encoding of some flavor or
other, or something really (to me) strange -- but an encoding is most
assuredly in effect.

ASCII is *not* the state of "this string has no encoding" -- that would
be Unicode; a Unicode string, as a data type, has no encoding. To
transport it, store it, etc., it must (usually?)
be encoded into something -- utf-8, ASCII, Turkish, or whatever subset
is agreed upon and will hopefully contain all the Unicode characters
needed for the string to be properly represented.

The realization that ASCII was, in fact, an encoding was a big paradigm
shift for me, but a necessary one.

--
~Ethan~
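
To make the distinction concrete, here is a small Python 3 sketch. The
sample strings and variable names are assumptions chosen for
illustration, and iso8859_9 stands in for "turkish". A str holds Unicode
code points and carries no encoding; encoding it produces bytes, and a
codec that lacks a needed character (ASCII here) simply fails.

```python
# Illustrative sketch only; the sample text is a made-up example.
text = "naïve İstanbul"  # str: Unicode code points, no encoding attached

# Encoding turns the abstract text into concrete bytes for storage/transport.
utf8_bytes = text.encode("utf-8")
turkish_bytes = text.encode("iso8859_9")  # Latin-5, a Turkish 8-bit codec

# The same text gives different byte sequences under different encodings.
assert utf8_bytes != turkish_bytes

# Decoding with the matching codec recovers the original string.
assert utf8_bytes.decode("utf-8") == text
assert turkish_bytes.decode("iso8859_9") == text

# ASCII is an encoding too: a subset that cannot represent every character.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# For pure-ASCII text, the ASCII and UTF-8 encodings yield identical bytes,
# which is one reason ASCII is easy to mistake for "no encoding".
plain = "hello"
same_for_ascii = plain.encode("ascii") == plain.encode("utf-8")
```

The last comparison is why the paradigm shift is easy to miss: as long as
the text stays within ASCII, the encoding step is invisible.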