From: Tim Delaney
To: "Steven D'Aprano"
Cc: Python-List
Newsgroups: comp.lang.python
Subject: Re: Python 3.2 has some deadly infection
Date: Mon, 2 Jun 2014 08:54:33 +1000

On 1 June 2014 12:26, Steven D'Aprano wrote:
> "with cross-platform behavior preferred over system-dependent one" --
> It's not clear how cross-platform behaviour has anything to do with the
> Internet age. Python has preferred cross-platform behaviour forever,
> except for those features and modules which are explicitly intended to be
> interfaces to system-dependent features. (E.g. a lot of functions in the
> os module are thin wrappers around OS features. Hence the name of the
> module.)

There is the behaviour of defaulting input and output to the system encoding. I personally think we would all be better off if Python (and Java, and many other languages) defaulted to UTF-8. This hopefully would eventually have the effect of producers changing to output UTF-8 by default, and consumers learning to manually specify an encoding when it's not UTF-8 (due to invalid codepoints).

I'm currently working on a product that interacts with lots of other products. These other products can be using any encoding - but most of the functions that interact with I/O assume the system default encoding of the machine that is collecting the data. The product has been in production for nearly a decade, so there's a lot of pushback against changes deep in the code for fear that it will break working systems. The fact that they are working largely by accident appears to escape them ...

FWIW, changing to use iso-latin-1 by default would be the most sensible option (effectively treating everything as bytes), with the option for another encoding if/when more information is known (e.g. there's often a call to return the encoding, and the output of that call is guaranteed to be ASCII).
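
To make the latin-1 idea concrete, here is a minimal Python 3 sketch. It is not the product's code: the helper names `read_text` and `redecode` are made up for illustration. It relies only on the fact that the ISO-8859-1 codec maps every byte value 0x00-0xFF to the code point of the same value, so decoding never fails and the original bytes can always be recovered.

```python
from typing import Optional

# Latin-1 maps each byte 0x00-0xFF to the code point with the same value,
# so decoding with it never raises and never loses information.
FALLBACK = "iso-8859-1"


def read_text(raw: bytes, encoding: Optional[str] = None) -> str:
    """Decode with the declared encoding if one is known, else fall back to Latin-1.

    Hypothetical helper, for illustration only.
    """
    return raw.decode(encoding or FALLBACK)


def redecode(text: str, encoding: str) -> str:
    """Recover the bytes behind a Latin-1 fallback decode and decode them properly.

    Hypothetical helper, for illustration only.
    """
    return text.encode(FALLBACK).decode(encoding)


# Every possible byte value survives the round trip unchanged.
all_bytes = bytes(range(256))
assert read_text(all_bytes).encode(FALLBACK) == all_bytes

# Data arrives with no known encoding: the provisional text may look wrong,
# but nothing is lost and no exception is raised.
payload = "naïve café".encode("utf-8")
provisional = read_text(payload)

# Later the peer reports its encoding (an ASCII name), and we can fix it up.
assert redecode(provisional, "utf-8") == "naïve café"
```

The point of the fallback is that the decision can be deferred: a wrong guess with a stricter codec either raises or silently replaces data, whereas a latin-1 decode can always be undone once the real encoding is known.
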
Tim Delaney