From: Fábio Santos
To: jmfauth
Cc: python-list@python.org
Newsgroups: comp.lang.python
Date: Tue, 7 May 2013 14:35:05 +0100
Subject: Re: Why do Perl programmers make more money than Python programmers

>
> -----
>
> 1) The memory gain for many of us (usually non ascii users)
> just becomes irrelevant.
>
> >>> import sys
> >>> sys.getsizeof('maçã')
> 41
> >>> sys.getsizeof('abcd')
> 29
>
> 2) More critical, Py 3.3, just becomes non unicode compliant,
> (eg European languages or "ascii" typographers !)
>
> >>> import timeit
> >>> timeit.timeit("'abcd'*1000 + 'a'")
> 2.186670111428325
> >>> timeit.timeit("'abcd'*1000 + '€'")
> 2.9951699820528432
> >>> timeit.timeit("'abcd'*1000 + 'œ'")
> 3.0036780444886233
> >>> timeit.timeit("'abcd'*1000 + 'ẞ'")
> 3.004992278824048
> >>> timeit.timeit("'maçã'*1000 + 'œ'")
> 3.231025618708202
> >>> timeit.timeit("'maçã'*1000 + '€'")
> 3.215894398100758
> >>> timeit.timeit("'maçã'*1000 + 'œ'")
> 3.224407974255655
> >>> timeit.timeit("'maçã'*1000 + '’'")
> 3.2206342273566406
> >>> timeit.timeit("'abcd'*1000 + '’'")
> 2.9914403449067777
>
> 3) Python is "proud" to cover the whole unicode range,
> unfortunately it "breaks" the BMP range.
> Small GvR example (ascii) from the bug list,
> but with non ascii characters.
>
> # Py 3.2, all chars
>
> >>> timeit.repeat("a = 'hundred'; 'x' in a")
> [0.09087790617297742, 0.07456871885972305, 0.07449940353376405]
> >>> timeit.repeat("a = 'maçãé€ẞ'; 'x' in a")
> [0.10088136800095526, 0.07488497003487282, 0.07497594640028638]
>
>
> # Py 3.3 ascii and non ascii chars
> >>> timeit.repeat("a = 'hundred'; 'x' in a")
> [0.11426985953005442, 0.10040049292649655, 0.09920834808588097]
> >>> timeit.repeat("a = 'maçãé€ẞ'; 'é' in a")
> [0.2345595188256766, 0.21637172864154763, 0.2179096624382737]
>
>
> There are plenty of good reasons to use Python. There are
> also plenty of good reasons to not use (or now to drop)
> Python and to realize that if you wish to process text
> seriously, you are better served by using "corporate
> products" or tools using Unicode properly.
>
> jmf

This is so off-topic that, after reading this, I feel I have just returned from the Moon.

OTOH, it would seem like you know the Portuguese word for apple, so I also feel at home.

I am so confused.
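
For anyone who would rather re-run the quoted measurements than argue over someone else's numbers, here is a rough sketch. It assumes CPython 3.6 or later. The helper names (storage_width, report) and the sample strings are made up for illustration and do not come from the quoted post. The 1/2/4 bytes-per-character rule is my reading of PEP 393, so treat it as an assumption rather than a guarantee.

```python
import sys
import timeit

# Hypothetical sample strings, picked so that the widest character falls in a
# different range each time: ASCII, Latin-1, the rest of the BMP, and astral.
SAMPLES = {
    "ascii": "abcd",
    "latin-1": "maçã",
    "bmp": "abc€",
    "astral": "abc\U0001F600",
}


def storage_width(text):
    """Bytes per character CPython 3.3+ is assumed to use (per PEP 393)."""
    top = max(map(ord, text)) if text else 0
    if top < 0x100:
        return 1
    if top < 0x10000:
        return 2
    return 4


def report(samples=SAMPLES, number=100_000):
    """Return (label, length, width, getsizeof, seconds) for each sample."""
    rows = []
    for label, text in samples.items():
        size = sys.getsizeof(text)
        # Same kind of membership test as in the quoted timeit.repeat calls.
        seconds = timeit.timeit("'x' in s", globals={"s": text}, number=number)
        rows.append((label, len(text), storage_width(text), size, seconds))
    return rows


if __name__ == "__main__":
    for label, length, width, size, seconds in report():
        print(f"{label:8} len={length} width={width} "
              f"getsizeof={size} time={seconds:.4f}s")
```

Absolute sizes and timings depend on the interpreter build and the machine, so only compare rows produced on the same setup.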