From: Steve Simmons
Date: Tue, 07 May 2013 15:17:52 +0100
Subject: Re: Why do Perl programmers make more money than Python programmers
To: Fábio Santos
Cc: python-list@python.org
Newsgroups: comp.lang.python

"Fábio Santos" <fabiosantosart@gmail.com> wrote:


>
>
> -----
>
>
> 1) The memory gain for many of us (usually non-ASCII users)
> just becomes irrelevant.
>
> >>> sys.getsizeof('maçã')
> 41
> >>> sys.getsizeof('abcd')
> 29
>
> 2) More critically, Py 3.3 just becomes non-Unicode-compliant
> (e.g. European languages or "ascii" typographers!)
>
> >>> import timeit
> >>> timeit.timeit("'abcd'*1000 + 'a'")
> 2.186670111428325
> >>> timeit.timeit("'abcd'*1000 + '€'")
> 2.9951699820528432
> >>> timeit.timeit("'abcd'*1000 + 'œ'")
> 3.0036780444886233
> >>> timeit.timeit("'abcd'*1000 + 'ẞ'")
> 3.004992278824048
> >>> timeit.timeit("'maçã'*1000 + 'œ'")
> 3.231025618708202
> >>> timeit.timeit("'maçã'*1000 + '€'")
> 3.215894398100758
> >>> timeit.timeit("'maçã'*1000 + 'œ'")
> 3.224407974255655
> >>> timeit.timeit("'maçã'*1000 + '’'")
> 3.2206342273566406
> >>> timeit.timeit("'abcd'*1000 + '’'")
> 2.9914403449067777
>
> 3) Python is "proud" to cover the whole Unicode range,
> unfortunately it "breaks" the BMP range.
> Small GvR example (ASCII) from the bug list,
> but with non-ASCII characters.
>
> # Py 3.2, all chars
>
> >>> timeit.repeat("a = 'hundred'; 'x' in a")
> [0.09087790617297742, 0.07456871885972305, 0.07449940353376405]
> >>> timeit.repeat("a = 'maçãé€ẞ'; 'x' in a")
> [0.10088136800095526, 0.07488497003487282, 0.07497594640028638]
>
>
> # Py 3.3 ascii and non ascii chars
> >>> timeit.repeat("a = 'hundred'; 'x' in a")
> [0.11426985953005442, 0.10040049292649655, 0.09920834808588097]
> >>> timeit.repeat("a = 'maçãé€ẞ'; 'é' in a")
> [0.2345595188256766, 0.21637172864154763, 0.2179096624382737]
>
>
> There are plenty of good reasons to use Python. There are
> also plenty of good reasons to not use (or now to drop)
> Python and to realize that if you wish to process text
> seriously, you are better served by using "corporate
> products" or tools using Unicode properly.
>
> jmf

This is so off-topic that, after reading this, I feel I have just returned from the Moon.

OTOH, it would seem like you know the Portuguese word for apple, so I also feel at home.

I am so confused.

-- 
http://mail.python.org/mailman/listinfo/python-list

Good to see jmf finally comparing apples with apples :-)
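
For anyone who wants to check the size figures quoted above for themselves, here is a minimal sketch. It assumes CPython 3.3 or later (the flexible string representation of PEP 393); the helper name bytes_per_char is my own invention, not anything from the thread. It compares two strings of different lengths so that the fixed object header cancels out and only the per-character cost is left.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate per-character storage for a str made only of `ch`.

    Hypothetical helper for illustration. Subtracting the sizes of two
    strings of different lengths cancels the fixed object header, so the
    result reflects only the per-character width. CPython-specific.
    """
    short = ch * n
    longer = ch * (2 * n)
    return (sys.getsizeof(longer) - sys.getsizeof(short)) / n


# One sample character from each width class, plus an actual apple.
samples = [
    ("ASCII", "a"),
    ("Latin-1", "\u00e7"),      # ç
    ("BMP", "\u20ac"),          # €
    ("astral", "\U0001F34E"),   # RED APPLE
]

for label, ch in samples:
    print(label, bytes_per_char(ch))
```

On CPython 3.3+ I would expect this to report 1, 1, 2 and 4 bytes per character respectively. If so, the 41 versus 29 quoted for two four-character strings looks like a difference in fixed header size rather than in per-character cost, though that is my reading and not something this sketch measures directly.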

Sent from a Galaxy far far away