
RE: PEP 378: Format Specifier for Thousands Separator

Started by	Carlos Nepomuceno <carlosnepomuceno@outlook.com>
First post	2013-05-21 23:22 +0300
Last post	2013-05-22 14:52 +0300
Articles	8 — 4 participants


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.

  RE: PEP 378: Format Specifier for Thousands Separator Carlos Nepomuceno <carlosnepomuceno@outlook.com> - 2013-05-21 23:22 +0300
    Re: PEP 378: Format Specifier for Thousands Separator Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-05-22 02:42 +0000
      RE: PEP 378: Format Specifier for Thousands Separator Carlos Nepomuceno <carlosnepomuceno@outlook.com> - 2013-05-22 05:56 +0300
        Re: PEP 378: Format Specifier for Thousands Separator Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-05-22 03:08 +0000
          RE: PEP 378: Format Specifier for Thousands Separator Carlos Nepomuceno <carlosnepomuceno@outlook.com> - 2013-05-22 06:38 +0300
            Re: PEP 378: Format Specifier for Thousands Separator 88888 Dihedral <dihedral88888@gmail.com> - 2013-05-22 00:14 -0700
          Re: PEP 378: Format Specifier for Thousands Separator Ned Batchelder <ned@nedbatchelder.com> - 2013-05-22 07:25 -0400
          RE: PEP 378: Format Specifier for Thousands Separator Carlos Nepomuceno <carlosnepomuceno@outlook.com> - 2013-05-22 14:52 +0300

#45689 — RE: PEP 378: Format Specifier for Thousands Separator

From	Carlos Nepomuceno <carlosnepomuceno@outlook.com>
Date	2013-05-21 23:22 +0300
Subject	RE: PEP 378: Format Specifier for Thousands Separator
Message-ID	<mailman.1939.1369167746.3114.python-list@python.org>

----------------------------------------
> Date: Tue, 21 May 2013 14:53:54 -0500
> From: bahamutzero8825@gmail.com
> To: python-list@python.org
[...]
>>
> What myth? People should indeed be using .format(), but no one said % formatting was going away soon. Also, the suggested change to the docs
> wasn't made and the issue is closed. The current docs do not say that % formatting isn't going to be deprecated, but it does mention its
> caveats and suggests .format(). If you are trying to say that % formatting will never ever go away, then you are wrong. It is highly
> unlikely to go away in a 3.x release, but /may/ get phased out in Python 4.0.

I vote for keeping str.__mod__()!

Anyway, is it possible to overload str.__mod__() without deriving a class? I mean to have something like:

import sys

old_mod = str.__mod__
def new_mod(self, x):
    try:
        return old_mod(self, x)
    except ValueError as ex:
        # catches ValueError: unsupported format character ',' (0x2c) at index 1
        # process the new '%,d' format here
        return '{:,}'.format(x)  # just to illustrate the behaviour; it would have its own faster code

str.__mod__ = new_mod  # this raises TypeError: can't set attributes of built-in/extension type 'str'
sys.stdout.write('num=%,d\n' % 1234567)


> --
> CPython 3.3.2 | Windows NT 6.2.9200 / FreeBSD 9.1
> --
> http://mail.python.org/mailman/listinfo/python-list


#45698

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-05-22 02:42 +0000
Message-ID	<519c30b0$0$6599$c3e8da3$5496439d@news.astraweb.com>
In reply to	#45689

On Tue, 21 May 2013 23:22:24 +0300, Carlos Nepomuceno wrote:

> Anyway, is it possible to overload str.__mod__() without deriving a
> class? I mean to have something like:

No, not in Python. If you want to monkey-patch built-in classes on the 
fly, with all the troubles that causes, use Ruby.
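
A minimal sketch of what this answer implies, assuming CPython 3: assigning to str.__mod__ is rejected with a TypeError, while a subclass is free to override the operator. CommaStr is a hypothetical name introduced only for illustration, and its fallback handles nothing beyond a template containing '%,d' and a single integer argument.

```python
# Assigning to an attribute of the built-in str type is rejected.
try:
    str.__mod__ = lambda self, other: ''
    patched = True
except TypeError:
    patched = False


class CommaStr(str):
    """Hypothetical str subclass that tolerates a ',' flag in '%,d'."""

    def __mod__(self, value):
        try:
            # Normal %-formatting handles every template it already accepts.
            return str.__mod__(self, value)
        except ValueError:
            # Illustration only: handles '%,d' with one integer argument,
            # and assumes the template contains no literal braces.
            return self.replace('%,d', '{:,d}').format(value)


print(patched)
print(CommaStr('num=%,d') % 1234567)
```

Only templates explicitly wrapped in CommaStr get the new behaviour; a plain string literal such as 'num=%,d' still raises ValueError, which is the limitation the original question was trying to avoid.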

-- 
Steven


#45699

From	Carlos Nepomuceno <carlosnepomuceno@outlook.com>
Date	2013-05-22 05:56 +0300
Message-ID	<mailman.1946.1369191420.3114.python-list@python.org>
In reply to	#45698

----------------------------------------
> From: steve+comp.lang.python@pearwood.info
> Subject: Re: PEP 378: Format Specifier for Thousands Separator
> Date: Wed, 22 May 2013 02:42:56 +0000
> To: python-list@python.org
>
> On Tue, 21 May 2013 23:22:24 +0300, Carlos Nepomuceno wrote:
>
>> Anyway, is it possible to overload str.__mod__() without deriving a
>> class? I mean to have something like:
>
> No, not in Python. If you want to monkey-patch built-in classes on the
> fly, with all the troubles that causes, use Ruby.
>

So, the only alternative to have "'%,d' % x" render the thousands separator would be a C source code modification?



#45702

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-05-22 03:08 +0000
Message-ID	<519c36c6$0$6599$c3e8da3$5496439d@news.astraweb.com>
In reply to	#45699

On Wed, 22 May 2013 05:56:53 +0300, Carlos Nepomuceno wrote:

> ----------------------------------------
>> From: steve+comp.lang.python@pearwood.info Subject: Re: PEP 378: Format
>> Specifier for Thousands Separator Date: Wed, 22 May 2013 02:42:56 +0000
>> To: python-list@python.org
>>
>> On Tue, 21 May 2013 23:22:24 +0300, Carlos Nepomuceno wrote:
>>
>>> Anyway, is it possible to overload str.__mod__() without deriving a
>>> class? I mean to have something like:
>>
>> No, not in Python. If you want to monkey-patch built-in classes on the
>> fly, with all the troubles that causes, use Ruby.
>>
>>
> So, the only alternative to have "'%,d' % x" rendering the thousands
> separator output would a C source code modification?

That's one alternative. But the language you would then be running would 
no longer be Python.

Another alternative would be to write a pre-processor that parses your 
Python source code, extracts any reference to the above, and replaces it 
with a call to the appropriate format call. But not only is that a lot of 
work for very little gain, but it's also more or less impossible to do in 
full generality. And again, what you are running will be something 
different than Python, it will be Python plus a pre-processor.
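
To make the pre-processor idea concrete, here is a rough sketch assuming Python 3.8 or later and the standard ast module. CommaRewriter is a hypothetical name, and it only recognises the exact literal template '%,d' on the left of the % operator, which is a small illustration of why full generality is out of reach.

```python
import ast


class CommaRewriter(ast.NodeTransformer):
    """Hypothetical pass that rewrites  '%,d' % x  into  '{:,d}'.format(x)."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if (isinstance(node.op, ast.Mod)
                and isinstance(node.left, ast.Constant)
                and node.left.value == '%,d'):
            call = ast.Call(
                func=ast.Attribute(value=ast.Constant(value='{:,d}'),
                                   attr='format', ctx=ast.Load()),
                args=[node.right],
                keywords=[])
            return ast.copy_location(call, node)
        return node


source = "result = '%,d' % n"
tree = CommaRewriter().visit(ast.parse(source))
ast.fix_missing_locations(tree)

namespace = {'n': 1234567}
exec(compile(tree, '<rewritten>', 'exec'), namespace)
print(namespace['result'])
```

A template stored in a variable, built at run time, or read from a file never appears as a literal in the syntax tree, so a pass like this cannot see it.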

Don't fight the language. You will lose.

-- 
Steven


#45706

From	Carlos Nepomuceno <carlosnepomuceno@outlook.com>
Date	2013-05-22 06:38 +0300
Message-ID	<mailman.1951.1369193993.3114.python-list@python.org>
In reply to	#45702

----------------------------------------
> From: steve+comp.lang.python@pearwood.info
> Subject: Re: PEP 378: Format Specifier for Thousands Separator
> Date: Wed, 22 May 2013 03:08:54 +0000
> To: python-list@python.org
[...]
>> So, the only alternative to have "'%,d' % x" rendering the thousands
>> separator output would a C source code modification?
>
> That's one alternative. But the language you would be then running will
> no longer be Python.
>
> Another alternative would be to write a pre-processor that parses your
> Python source code, extracts any reference to the above, and replaces it
> with a call to the appropriate format call. But not only is that a lot of
> work for very little gain, but it's also more or less impossible to do in
> full generality. And again, what you are running will be something
> different than Python, it will be Python plus a pre-processor.
>
>
> Don't fight the language. You will lose.

Not fighting the language. In fact it's not even a language issue.
All I need is a standard library[1] improvement: "%,d"! That's all!

Just to put the performance difference between str.__mod__() and str.format() in perspective:

C:\Python27>python -m timeit -cv -n10000000 "'%d'%12345"
raw times: 0.386 0.38 0.373
10000000 loops, best of 3: 0.0373 usec per loop

C:\Python27>python -m timeit -cv -n10000000 "'{:d}'.format(12345)"
raw times: 7.91 7.89 7.98
10000000 loops, best of 3: 0.789 usec per loop

C:\Python27>python -m timeit -cv -n10000000 "'{:,d}'.format(12345)"
raw times: 8.7 8.67 8.78
10000000 loops, best of 3: 0.867 usec per loop

That shows str.format() is about 20 times slower than str.__mod__() when formatting a simple decimal integer literal (0.789 vs 0.0373 usec per loop).
And it's roughly 10% slower again if the thousands separator format specifier (',') is used (0.867 vs 0.789 usec per loop).

[1] I think that translates to Python source code in 'Objects/stringobject.c' and maybe 'Objects/unicodeobject.c'



#45715

From	88888 Dihedral <dihedral88888@gmail.com>
Date	2013-05-22 00:14 -0700
Message-ID	<087e4cab-92bd-4a14-aab4-a986e426302b@googlegroups.com>
In reply to	#45706

On Wednesday, May 22, 2013 at 11:38:45 AM UTC+8, Carlos Nepomuceno wrote:
> ----------------------------------------
> > From: steve+comp.lang.python@pearwood.info
> > Subject: Re: PEP 378: Format Specifier for Thousands Separator
> > Date: Wed, 22 May 2013 03:08:54 +0000
> > To: python-list@python.org
> [...]
> >> So, the only alternative to have "'%,d' % x" rendering the thousands
> >> separator output would a C source code modification?
> >
> > That's one alternative. But the language you would be then running will
> > no longer be Python.
> >
> > Another alternative would be to write a pre-processor that parses your
> > Python source code, extracts any reference to the above, and replaces it
> > with a call to the appropriate format call. But not only is that a lot of
> > work for very little gain, but it's also more or less impossible to do in
> > full generality. And again, what you are running will be something
> > different than Python, it will be Python plus a pre-processor.
> >
> >
> > Don't fight the language. You will lose.
> 
> Not fighting the language. In fact it's not even a language issue.
> All I need is a standard library[1] improvement: "%,d"! That's all!
> 
> Just to put in perspective the performance difference of str.__mod__() and str.format():
> 
> C:\Python27>python -m timeit -cv -n10000000 "'%d'%12345"
> raw times: 0.386 0.38 0.373
> 10000000 loops, best of 3: 0.0373 usec per loop
> 
> C:\Python27>python -m timeit -cv -n10000000 "'{:d}'.format(12345)"
> raw times: 7.91 7.89 7.98
> 10000000 loops, best of 3: 0.789 usec per loop
> 
> C:\Python27>python -m timeit -cv -n10000000 "'{:,d}'.format(12345)"
> raw times: 8.7 8.67 8.78
> 10000000 loops, best of 3: 0.867 usec per loop
> 
> That shows str.format() is 20 times slower than str.__mod__() for a simple decimal integer literal formatting.
> And it's additionally 10% slower if the thousands separator format specifier (',') is used.
> 
> [1] I think that translates to Python source code in 'Objects/stringobject.c' and maybe 'Objects/unicodeobject.c'
> 
> >
> >
> > --
> > Steven
> > --
> > http://mail.python.org/mailman/listinfo/python-list

Converting 32-bit integers and 64-bit floats to strings of
base-10 digits normally requires efficient div and mod
operations at the low level.
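
As a rough illustration of the div/mod point, here is a pure-Python sketch that builds the grouped decimal string for an integer by repeatedly dividing by 1000. group_thousands is a hypothetical helper, not how CPython implements the ',' specifier.

```python
def group_thousands(n, sep=','):
    """Return the base-10 text of integer n with sep between 3-digit groups."""
    sign = '-' if n < 0 else ''
    n = abs(n)
    groups = []
    while True:
        n, rem = divmod(n, 1000)  # peel off the lowest three digits
        if n == 0:
            groups.append(str(rem))      # leading group: no zero padding
            break
        groups.append('%03d' % rem)      # inner groups keep leading zeros
    return sign + sep.join(reversed(groups))


print(group_thousands(1234567))
```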


#45723

From	Ned Batchelder <ned@nedbatchelder.com>
Date	2013-05-22 07:25 -0400
Message-ID	<mailman.1959.1369221931.3114.python-list@python.org>
In reply to	#45702

On 5/21/2013 11:38 PM, Carlos Nepomuceno wrote:
> > From: steve+comp.lang.python@pearwood.info
> > Subject: Re: PEP 378: Format Specifier for Thousands Separator
> > Date: Wed, 22 May 2013 03:08:54 +0000
> > To: python-list@python.org
> [...]
> >> So, the only alternative to have "'%,d' % x" rendering the thousands
> >> separator output would a C source code modification?
> >
> > That's one alternative. But the language you would be then running will
> > no longer be Python.
> >
> > Another alternative would be to write a pre-processor that parses your
> > Python source code, extracts any reference to the above, and replaces it
> > with a call to the appropriate format call. But not only is that a lot of
> > work for very little gain, but it's also more or less impossible to do in
> > full generality. And again, what you are running will be something
> > different than Python, it will be Python plus a pre-processor.
> >
> >
> > Don't fight the language. You will lose.
>
> Not fighting the language. In fact it's not even a language issue.
> All I need is a standard library[1] improvement: "%,d"! That's all!

You have to keep in mind that 2.7 is not getting any new features, no 
matter how small they seem.  If you create a patch that implements the 
comma flag in %-formatting, it *might* go into 3.x, but it will not go 
into 2.7.

--Ned.


#45725

From	Carlos Nepomuceno <carlosnepomuceno@outlook.com>
Date	2013-05-22 14:52 +0300
Message-ID	<mailman.1961.1369223544.3114.python-list@python.org>
In reply to	#45702

----------------------------------------
> Date: Wed, 22 May 2013 07:25:13 -0400
> From: ned@nedbatchelder.com
[...]
> You have to keep in mind that 2.7 is not getting any new features, no
> matter how small they seem. If you create a patch that implements the
> comma flag in %-formatting, it *might* go into 3.x, but it will not go
> into 2.7.
>
> --Ned.

No problem. I have just discovered I was measuring the wrong thing.

My test case is being optimized at compile time by CPython, which treats "'%d' % 12345" as a constant.
My use case is different because I have almost no literals being used with the % operator.
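
One way to check for this kind of compile-time folding is to compile the expression and look at the constants stored in the code object. The sketch below uses a hypothetical helper, is_folded; whether a literal %-expression is actually folded depends on the interpreter version, so the first result is printed rather than assumed.

```python
def is_folded(expr, expected):
    """Return True if compiling expr already produced the constant expected."""
    code = compile(expr, '<expr>', 'eval')
    return expected in code.co_consts


# May be True or False depending on the interpreter version.
print(is_folded("'%d' % 12345", '12345'))

# With a variable on the right-hand side nothing can be precomputed.
print(is_folded("'%d' % v", '12345'))
```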

So my gain isn't that great. In fact it's faster with str.format() than %, and it's even faster if I use the default format specifier.

C:\Python27>python -m timeit -cv -n10000000 -s"v=12345" "'%d'%v"
raw times: 10.5 10.7 10.7
10000000 loops, best of 3: 1.05 usec per loop

C:\Python27>python -m timeit -cv -n10000000 -s"v=12345" "'{:d}'.format(v)"
raw times: 8.11 8.09 8.02
10000000 loops, best of 3: 0.802 usec per loop

C:\Users\Josue\Documents\Python>python -m timeit -cv -n10000000 -s"v=12345" "'{}'.format(v)"
raw times: 5.3 5.5 5.62
10000000 loops, best of 3: 0.53 usec per loop

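With variables (100% of my cases), '{}'.format(v) takes about half the time of '%d' % v (0.53 vs 1.05 usec per loop), and '{:d}'.format(v) takes about 24% less time (0.802 vs 1.05 usec per loop).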
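
The same comparison can be run from a script with the standard timeit module. This is a sketch only: per_loop_usec is a hypothetical helper, the loop count is much smaller than in the command lines above, and the numbers will vary by machine and interpreter version.

```python
import timeit


def per_loop_usec(stmt, setup='v = 12345', number=200000, repeat=3):
    """Best-of-repeat time for one execution of stmt, in microseconds."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e6


# The value is bound in setup, so the formatting cannot be precomputed.
for stmt in ("'%d' % v", "'{:d}'.format(v)", "'{}'.format(v)"):
    print('%-20s %.3f usec per loop' % (stmt, per_loop_usec(stmt)))
```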

