Re: Case-insensitive sorting of strings (Python newbie)

Date Sat, 24 Jan 2015 04:58:54 +1100
Subject Re: Case-insensitive sorting of strings (Python newbie)
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python


On Sat, Jan 24, 2015 at 4:53 AM, Peter Otten <__peter__@web.de> wrote:
> Now the same with unicode. To read text with a specific encoding use either
> codecs.open() or io.open() instead of the built-in (replace utf-8 with your
> actual encoding):
>
> >>> import io
> >>> for line in io.open("tmp.txt", encoding="utf-8"):
> ...     line = line.strip()
> ...     print line, line.lower()

In Python 3, the built-in open() function sports a fancy encoding=
parameter like that.

for line in open("tmp.txt", encoding="utf-8"):
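For completeness, here is a rough Python 3 sketch of the whole loop, with a case-insensitive sort tacked on since that is what the thread is about. The sample words, the temporary file, and the helper name read_lines are all made up for illustration; using str.casefold as the sort key is one reasonable choice, not the only one.

```python
import os
import tempfile


def read_lines(path, encoding="utf-8"):
    """Return the stripped lines of a text file decoded with `encoding`."""
    # In Python 3 the built-in open() accepts encoding= directly.
    with open(path, encoding=encoding) as f:
        return [line.strip() for line in f]


# Hypothetical sample data, written to a temporary file just for the demo.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "tmp.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("banana\nApple\nÉclair\ncherry\n")
    lines = read_lines(path)

# Same output as the quoted Python 2 loop, using the print() function.
for line in lines:
    print(line, line.lower())

# Case-insensitive sort: compare the casefolded form, keep the original text.
sorted_lines = sorted(lines, key=str.casefold)
print(sorted_lines)
```

Note that this orders by code point after casefolding, so accented letters sort after plain ASCII ones; locale-aware collation is a separate problem.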

If you can, I would recommend using Python 3 for this kind of
thing. The difference may not be huge, but there are all sorts of
little differences here and there that add up to generally better
Unicode support. Most of it stems from the fact that the default
quoted string literal is a Unicode string rather than a byte string,
which means that basically every function written for Py3 has
been written to be Unicode-compatible.
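A quick way to see the literal difference for yourself in Python 3 (a small sketch; the sample word is just an example I picked because it contains a non-ASCII character):

```python
text = "Straße"              # plain quoted literal: a Unicode str in Python 3
raw = text.encode("utf-8")   # bytes only appear when you encode explicitly

print(type(text).__name__, type(raw).__name__)

# The str is measured in characters, the bytes object in encoded bytes.
print(len(text), len(raw))

# Decoding with the same encoding gives back the original text.
print(raw.decode("utf-8") == text)
```

In Python 2 the same unprefixed literal would be a byte string, and you would need a u"..." prefix to get Unicode text.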

ChrisA
