From: Mark Lawrence
To: python-list@python.org
Newsgroups: comp.lang.python
Subject: Re: trying to strip out non ascii.. or rather convert non ascii
Date: Tue, 29 Oct 2013 15:56:45 +0000

On 29/10/2013 15:38, wxjmfauth@gmail.com wrote:

It's okay, folks, I'll snip all the double-spaced Google crap, as the
poster is clearly too bone idle to follow the instructions that have been
repeatedly posted here asking people not to post double-spaced Google
crap.

> On Tuesday, 29 October 2013 06:22:27 UTC+1, Steven D'Aprano wrote:
>> On Mon, 28 Oct 2013 07:01:16 -0700, wxjmfauth wrote:
>>> And of course, logically, they are very, very badly handled with the
>>> Flexible String Representation.
>>
>> I'm reminded of Cato the Elder, the Roman senator who would end every
>> speech, no matter the topic, with "Ceterum censeo Carthaginem esse
>> delendam" ("Furthermore, I consider that Carthage must be destroyed").
>>
>> But at least he had the good grace to present that as an opinion, instead
>> of repeating a falsehood as if it were a fact.
>>
>> --
>>
>> Steven
>
> ------
>
> >>> import timeit
> >>> timeit.timeit("a = 'hundred'; 'x' in a")
> 0.12621293837694095
> >>> timeit.timeit("a = 'hundreij'; 'x' in a")
> 0.26411553466961735
>
> If you understand the coding of characters, Unicode,
> and what this FSR does, it is child's play to produce a gazillion
> examples like this.
>
> (Notice the usage of a Dutch character instead of a boring €).
>
> jmf
>

You've stated above that logically Unicode is badly handled by the FSR.
You then provide a trivial timing example.  WTF???

--
Python is the second best programming language in the world.
But the best has yet to be invented.  Christian Tismer

Mark Lawrence
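
For anyone who wants to rerun the quoted comparison, here is a minimal sketch rather than a definitive benchmark. It assumes CPython 3.5 or later (PEP 393 strings, and timeit's globals argument), and it assumes the second string in the quoted session ended in the Dutch ligature U+0133, which is a guess from the "Dutch character" remark. The helper names per_char_bytes and time_membership are made up for illustration. It keeps string creation out of the timed statement and reports how many bytes per code point the interpreter stores for each string.

```python
import sys
import timeit

# Assumption: the quoted second string ended in the Dutch ligature U+0133,
# which does not fit in Latin-1 and so needs a wider storage kind.
ascii_word = "hundred"
wide_word = "hundre\u0133"


def per_char_bytes(s):
    """Hypothetical helper: bytes CPython stores per code point of s."""
    # Both operands are freshly built strings of the same storage kind,
    # so the fixed object header cancels out in the subtraction.
    return (sys.getsizeof(s * 3) - sys.getsizeof(s * 2)) // len(s)


def time_membership(s, number=200000):
    """Hypothetical helper: time "'x' in s" with s created outside the loop."""
    return timeit.timeit("'x' in s", globals={"s": s}, number=number)


if __name__ == "__main__":
    for word in (ascii_word, wide_word):
        # Timings vary by machine and build; no particular ratio is implied.
        print(ascii(word), per_char_bytes(word), time_membership(word))
```

The storage width is what the Flexible String Representation varies from string to string. Whether that shows up as a timing gap like the one quoted depends on the interpreter version and the machine, so treat any single number with caution.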