From: Ned Batchelder
To: python-list@python.org
Subject: Re: Managing Google Groups headaches
Date: Fri, 06 Dec 2013 21:24:50 -0500
Newsgroups: comp.lang.python

On 12/6/13 8:03 AM, rusi wrote:
>> I think you're off on the wrong track here. This has nothing to do with
>> plain text (ascii or otherwise). It has to do with divorcing how you
>> store and transport messages (be they plain text, HTML, or whatever)
>> from how a user interacts with them.
>
> Evidently (and completely inadvertently) this exchange has just
> illustrated one of the inadmissable assumptions:
>
> "unicode as a medium is universal in the same way that ASCII used to be"
>
> I wrote a number of ellipsis characters ie codepoint 2026 as in:
>
> - human communication…
> (is not very different from)
> - machine communication…
>
> Somewhere between my sending and your quoting those ellipses became
> the replacement character FFFD
>
>>> > > - human communication�
>>> > >(is not very different from)
>>> > > - machine communication�
>
> Leaving aside whose fault this is (very likely buggy google groups),
> this mojibaking cannot happen if the assumption "All text is ASCII"
> were to uniformly hold.
>
> Of course with unicode also this can be made to not happen, but that
> is fragile and error-prone. And that is because ASCII (not extended)
> is ONE thing in a way that unicode is hopelessly a motley inconsistent
> variety.

You seem to be suggesting that we should stick to ASCII. There are of
course languages that need more than just the Latin alphabet. How would
you suggest we support them? Or maybe I don't understand?

--Ned.
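
For reference, here is a minimal sketch of one way an ellipsis (U+2026) can end up as a single replacement character (U+FFFD) in transit. It assumes a charset mismatch: the text is encoded as Windows-1252 somewhere along the way and then decoded as UTF-8. That is only a guess at the failure mode; the actual cause in this thread is not known, and the function name below is made up for illustration.

```python
def simulate_charset_mismatch(text, sent_as, read_as):
    """Encode text with one codec and decode it with another.

    Hypothetical helper for illustration only; bytes that are invalid
    in the reader's codec are replaced with U+FFFD.
    """
    wire_bytes = text.encode(sent_as)
    return wire_bytes.decode(read_as, errors="replace")


original = "human communication\u2026"  # ends with U+2026 HORIZONTAL ELLIPSIS

# Assumed failure mode: sender emits cp1252 (ellipsis is the single byte
# 0x85), reader trusts a UTF-8 label. 0x85 alone is not valid UTF-8.
garbled = simulate_charset_mismatch(original, "cp1252", "utf-8")

# When both ends agree on the encoding, the text round-trips unchanged.
intact = simulate_charset_mismatch(original, "utf-8", "utf-8")

print(ascii(garbled))
print(intact == original)
```

The point of the sketch is that the damage comes from the two ends disagreeing about the encoding, not from the ellipsis character itself.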