From: David Hutto
To: Chris Angelico
Cc: python-list
Newsgroups: comp.lang.python
Subject: Re: Share Code Tips
Date: Fri, 19 Jul 2013 23:42:00 -0400

Just use an explanatory user tip stating that input is case sensitive, just like most sites and apps do.


On Fri, Jul 19, 2013 at 9:13 PM, Chris Angelico <rosuav@gmail.com> wrote:
On Sat, Jul 20, 2013 at 11:04 AM, Devyn Collier Johnson
<devyncjohnson@gmail.com> wrote:
>
> On 07/19/2013 07:09 PM, Dave Angel wrote:
>>
>> On 07/19/2013 06:08 PM, Devyn Collier Johnson wrote:
>>>
>>>
>>> On 07/19/2013 01:59 PM, Steven D'Aprano wrote:
>>
>>
>>      <snip>
>>>
>>>
>>> As for the case-insensitive if-statements, most code uses Latin letters.
>>> Making a case-insensitive-international if-statement would be
>>> interesting. I can tackle that later. For now, I only wanted to take
>>> care of Latin letters. I hope to figure something out for all characters.
>>>
>>
>> Once Steven gave you the answer, what's to figure out? You simply use
>> casefold() instead of lower(). The only constraint is it's 3.3 and later,
>> so you can't use it for anything earlier.
>>
>> http://docs.python.org/3.3/library/stdtypes.html#str.casefold
>>
>> """
>> str.casefold()
>> Return a casefolded copy of the string. Casefolded strings may be used for
>> caseless matching.
>>
>> Casefolding is similar to lowercasing but more aggressive because it is
>> intended to remove all case distinctions in a string. For example, the
>> German lowercase letter 'ß' is equivalent to "ss". Since it is already
>> lowercase, lower() would do nothing to 'ß'; casefold() converts it to "ss".
>>
>> The casefolding algorithm is described in section 3.13 of the Unicode
>> Standard.
>>
>> New in version 3.3.
>> """
>>
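
A minimal sketch of the difference the quoted documentation describes, assuming Python 3.3 or later. The helper name caseless_equal is made up for illustration and is not part of the standard library:

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after casefolding both sides (hypothetical helper)."""
    return a.casefold() == b.casefold()


# lower() leaves the German sharp s untouched; casefold() expands it.
print("ß".lower())      # ß
print("ß".casefold())   # ss

# Lowercasing alone does not make these two spellings compare equal.
print("Straße".lower() == "STRASSE".lower())   # False
print(caseless_equal("Straße", "STRASSE"))     # True
```

On anything earlier than 3.3 the casefold() call raises AttributeError, which is the constraint mentioned above.
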
> Chris Angelico said that casefold is not perfect. In the future, I want to
> make the perfect international-case-insensitive if-statement. For now, my
> code only supports a limited range of characters. Even with casefold, I will
> have some issues as Chris Angelico mentioned. Also, "ß" is not really the
> same as "ss".

Well, casefold is about as good as it's ever going to be, but that's
because "the perfect international-case-insensitive comparison" is a
fundamentally impossible goal. Your last sentence hints as to why;
there is no simple way to compare strings containing those characters,
because the correct treatment varies according to context.
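
One way to see the context problem, as a small sketch rather than a full treatment: casefolding treats "ß" and "ss" as the same, yet German has word pairs that differ only in that spelling. The variable names below are illustrative only:

```python
# Two different German words that casefold() cannot tell apart.
masse = "Masse"   # "mass"
masze = "Maße"    # "measurements"
print(masse.casefold() == masze.casefold())   # True

# Case mappings are not reversible either: the round trip loses the sharp s.
print("ß".upper())           # SS
print("ß".upper().lower())   # ss
```

Whether that equality is acceptable depends on what the comparison is for, which is the sense in which the correct treatment varies with context.
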

Your two best options are: Be case sensitive (and then you need only
worry about composition and combining characters and all those
nightmares - the ones you have to worry about either way), or use
casefold(). Of those, I prefer the first, because it's safer; the
second is also a good option.
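
The two options might be sketched as follows, assuming Python 3.3 or later. The function names are hypothetical, and the normalization step is one common way to handle composed versus combining characters, not something spelled out in this thread:

```python
import unicodedata


def exact_equal(a: str, b: str) -> bool:
    """Option 1: case sensitive, but blind to composed vs. decomposed forms."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def normalized_caseless_equal(a: str, b: str) -> bool:
    """Option 2: casefold(), wrapped in normalization on both sides."""
    def key(s: str) -> str:
        # Decompose, casefold, then decompose again, since casefolding
        # can produce text that is no longer normalized.
        return unicodedata.normalize("NFD", unicodedata.normalize("NFD", s).casefold())
    return key(a) == key(b)


composed = "caf\u00e9"      # e with acute as a single code point
decomposed = "cafe\u0301"   # plain e followed by a combining acute accent

print(composed == decomposed)                           # False
print(exact_equal(composed, decomposed))                # True
print(exact_equal("Caf\u00e9", composed))               # False
print(normalized_caseless_equal("CAFE\u0301", composed))  # True
```

Either way the normalization is needed, which matches the point that composition and combining characters have to be dealt with regardless of which option is chosen.
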

ChrisA



--
Best Regards,
David Hutto
CEO: http://www.hitwebdevelopment.com