From: Ben Finney
Newsgroups: comp.lang.python
Subject: Re: unicode by default
Date: Thu, 12 May 2011 14:07:08 +1000
Message-ID: <874o50k1eb.fsf@benfinney.id.au>

MRAB writes:

> You need to understand the difference between characters and bytes.

Yep. Those who don't need to join us in the third millennium, and the
resources pointed out in this thread are good to help that.

> A string contains characters, a file contains bytes.

That's not true for Python 2.

I'd phrase that as:

* Text is a sequence of characters. Most inputs to the program,
  including files, sockets, etc., contain a sequence of bytes.

* Always know whether you're dealing with text or with bytes. No object
  can be both.

* In Python 2, ‘str’ is the type for a sequence of bytes. ‘unicode’ is
  the type for text.

* In Python 3, ‘str’ is the type for text. ‘bytes’ is the type for a
  sequence of bytes.

-- 
 \       “I went to a garage sale. ‘How much for the garage?’ ‘It's not |
  `\                                      for sale.’” —Steven Wright |
_o__)                                                                  |
Ben Finney
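
The text/bytes distinction described in the post can be tried out in Python 3 with a minimal sketch like the one below. The sample string, the variable names, and the choice of UTF-8 are illustrative assumptions, not something taken from the thread.

```python
# Python 3 sketch: 'str' holds text (characters), 'bytes' holds raw bytes.
# The sample string and the UTF-8 encoding are illustrative choices.

text = "na\u00efve caf\u00e9"      # text: a sequence of characters

# Encoding converts text to bytes; the encoding is named explicitly.
data = text.encode("utf-8")

print(type(text).__name__, len(text))   # length counted in characters
print(type(data).__name__, len(data))   # length counted in bytes

# Decoding converts bytes back to text, using the same encoding.
round_trip = data.decode("utf-8")

# No object is both: Python 3 refuses to combine the two types implicitly.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

print("implicit mixing allowed:", mixed_ok)
```

The two non-ASCII characters each take two bytes in UTF-8, so the byte length is larger than the character length. In Python 2 the equivalent experiment would use ‘unicode’ for the text and ‘str’ for the encoded bytes, and mixing them would trigger an implicit ASCII conversion instead of a TypeError.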