From: Nobody
Subject: Re: "convert" string to bytes without changing data (encoding)
Date: Thu, 30 Aug 2012 06:51:11 +0100
Newsgroups: comp.lang.python

On Wed, 29 Aug 2012 19:39:15 -0400, Piet van Oostrum wrote:

>> Reading from stdin/a file gets you bytes, and not a string, because
>> Python cannot automagically guess what format the input is in.
>>
> Huh?

Oh, it can certainly guess (in the absence of any other information, it
uses the current locale). Whether or not that guess is correct is a
different matter.

Realistically, if you want sensible behaviour from Python 3.x, you need to
use an ISO-8859-1 locale. That ensures that conversion between str and
bytes will never fail, and an str-bytes-str or bytes-str-bytes round-trip
will pass data through unmangled.
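
To make the round-trip point concrete, here is a minimal sketch, assuming Python 3 and using the codec name "iso-8859-1" explicitly rather than relying on the locale. The variable names are made up for illustration. It shows that every possible byte value decodes and re-encodes unchanged under ISO-8859-1, while the same data is rejected by UTF-8. One caveat worth hedging: the str-to-bytes direction is only safe for text whose characters all lie in U+0000..U+00FF (for example, text that itself came from decoding bytes this way); anything above that range still raises an error.

```python
import locale

# The encoding Python guesses from the current locale when none is given.
guessed = locale.getpreferredencoding(False)

# Every byte value 0..255; ISO-8859-1 maps byte N to code point N.
data = bytes(range(256))

# bytes -> str cannot fail under ISO-8859-1, and the trip back is lossless.
text = data.decode("iso-8859-1")
round_trip_ok = text.encode("iso-8859-1") == data

# The same bytes are not valid UTF-8, so a UTF-8 guess would fail here.
try:
    data.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Caveat: a str containing a character above U+00FF cannot be encoded.
try:
    "\u20ac".encode("iso-8859-1")  # EURO SIGN is not in ISO-8859-1
    euro_ok = True
except UnicodeEncodeError:
    euro_ok = False
```

If only some streams need this treatment, passing encoding="iso-8859-1" to open() gives the same behaviour for that file without changing the locale for the whole process.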