From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: Wait... WHAT?
Date: Thu, 13 Feb 2014 14:47:07 +1100

On Thu, Feb 13, 2014 at 2:29 PM, Tim Chase wrote:
> My original point (though
> perhaps not conveyed as well as I'd intended) was that only bytes get
> written to the disk, and that some encoding must take place. It can
> be done implicitly using some defaults which may break (as demoed),
> whereas one would be better off doing it explicitly such as Chris
> shows

And since the default encoding varies based on matters outside your
script (most notably platform - I tried this on Windows and Linux, and
got a default of UTF-8 on Linux and CP-1252 on Windows; but
environment variables and such can interfere too), I would say that
omitting the encoding= parameter should be done ONLY when you actually
have no idea what the encoding is, only that it's "probably something
from the rest of the system".
And, well, if that's what you're looking at, you definitely
can't trust it for reading or writing non-ASCII (you can probably
trust ASCII).

When you create a file that you'll read back yourself, specify an encoding.

ChrisA
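
A minimal sketch of the advice above, assuming Python 3. The sample text, the temporary file name, and the variable names are made up for illustration. It prints the locale's preferred encoding (which is what open() has typically fallen back on when encoding= is omitted, though UTF-8 mode and environment settings can change that), then writes and reads a file with an explicit encoding, and finally reads the same bytes back as CP-1252 to show what a mismatched default would do.

```python
import locale
import os
import tempfile

text = "café"  # one non-ASCII character is enough to expose a mismatch

# Platform- and environment-dependent; this is the value that differs
# between, say, a typical Linux box and a typical Windows box.
print("locale preferred encoding:", locale.getpreferredencoding(False))

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Explicit encoding on write: the bytes on disk no longer depend on the platform.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Explicit (matching) encoding on read: the text round-trips.
with open(path, "r", encoding="utf-8") as f:
    round_tripped = f.read()

# Only bytes are on the disk; the "é" became two bytes under UTF-8.
with open(path, "rb") as f:
    raw = f.read()

# Simulate a reader whose default happens to be CP-1252: no error is raised
# for this input, but the non-ASCII character comes back garbled.
with open(path, "r", encoding="cp1252") as f:
    misread = f.read()

print(repr(raw))
print(repr(round_tripped))
print(repr(misread))
```

The ASCII part ("caf") survives the mismatched read, which is the "you can probably trust ASCII" case; only the non-ASCII character is damaged.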