Re: Guido rethinking removal of cmp from sort method

Started by	Antoon Pardon <Antoon.Pardon@rece.vub.ac.be>
First post	2011-03-30 11:06 +0200
Last post	2011-04-02 11:27 +0000
Articles	15 — 6 participants

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.

  Re: Guido rethinking removal of cmp from sort method Antoon Pardon <Antoon.Pardon@rece.vub.ac.be> - 2011-03-30 11:06 +0200
    Re: Guido rethinking removal of cmp from sort method Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-03-31 02:13 +0000
      Re: Guido rethinking removal of cmp from sort method Antoon Pardon <Antoon.Pardon@rece.vub.ac.be> - 2011-03-31 11:41 +0200
        Re: Guido rethinking removal of cmp from sort method Paul Rubin <no.email@nospam.invalid> - 2011-03-31 04:59 -0700
      Re: Guido rethinking removal of cmp from sort method geremy condra <debatem1@gmail.com> - 2011-04-01 14:31 -0700
        Re: Guido rethinking removal of cmp from sort method Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-04-02 00:41 +0000
          Re: Guido rethinking removal of cmp from sort method geremy condra <debatem1@gmail.com> - 2011-04-01 18:22 -0700
            Re: Guido rethinking removal of cmp from sort method Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-04-02 11:01 +0000
              Re: Guido rethinking removal of cmp from sort method geremy condra <debatem1@gmail.com> - 2011-04-02 23:22 -0700
              Re: Guido rethinking removal of cmp from sort method Brian Quinlan <brian@sweetapp.com> - 2011-04-03 16:34 +1000
                Re: Guido rethinking removal of cmp from sort method Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-04-03 10:21 +0000
                  Re: Guido rethinking removal of cmp from sort method geremy condra <debatem1@gmail.com> - 2011-04-03 12:58 -0700
          Re: Guido rethinking removal of cmp from sort method Paul Rubin <no.email@nospam.invalid> - 2011-04-01 19:29 -0700
            Re: Guido rethinking removal of cmp from sort method Chris Angelico <rosuav@gmail.com> - 2011-04-02 13:43 +1100
            Re: Guido rethinking removal of cmp from sort method Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-04-02 11:27 +0000

#2217 — Re: Guido rethinking removal of cmp from sort method

From	Antoon Pardon <Antoon.Pardon@rece.vub.ac.be>
Date	2011-03-30 11:06 +0200
Subject	Re: Guido rethinking removal of cmp from sort method
Message-ID	<mailman.9.1301475780.2990.python-list@python.org>

On Tue, Mar 29, 2011 at 03:35:40PM -0400, Terry Reedy wrote:
> For anyone interested, the tracker discussion on removing cmp is at
> http://bugs.python.org/issue1771
> There may have been more on the old py3k list and pydev list.
> 
> One point made there is that removing cmp= made list.sort consistent
> with all the other comparison functions,
> min/max/nsmallest/nlargest/groupby that only have a key arg. How
> many would really want cmp= added everywhere?

I wouldn't have a problem with it.

I would also like to react to the following.

Guido van Rossum in msg95975 on http://bugs.python.org/issue1771 wrote:
| Also, for all of you asking for cmp back, I hope you realize that 
| sorting N values using a custom cmp function makes about N log N
| calls to cmp, whereas using a custom key calls the key function only N 
| times.  This means that even if your cmp function is faster than the 
| best key function you can write, the advantage is lost as N increases 
| (which is just where you'd like it to matter most :-).

This is a play on semantics. If you need python code to compare
two items, then this code will be called N log N times, regardless
of how this code is presented, as a cmp function or as rich
comparison methods. So forcing people to write a key function in cases
where this will only result in the cmp code being translated to __lt__
code accomplishes nothing.

As far as I can see, key will only produce significant speedups, if
comparing items can then be completely done internally in the python
engine without referencing user python code.
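
A minimal sketch of this point, written for Python 3, where cmp= no longer
exists and functools.cmp_to_key stands in for it. The names by_length,
LengthKey and the calls counter are made up for the illustration. The key
function itself runs once per item, but when the key objects are compared
through a user-level __lt__, that method is still invoked once per
comparison, just as a cmp function would be.

```python
from functools import cmp_to_key
import random

# Hypothetical counters, only here to make the call counts visible.
calls = {"cmp": 0, "key": 0, "lt": 0}

def by_length(a, b):
    """cmp-style function: negative, zero or positive."""
    calls["cmp"] += 1
    return (len(a) > len(b)) - (len(a) < len(b))

class LengthKey:
    """Key object whose ordering is decided by user-level Python code."""
    def __init__(self, s):
        calls["key"] += 1          # constructed once per item
        self.n = len(s)
    def __lt__(self, other):
        calls["lt"] += 1           # invoked once per comparison
        return self.n < other.n

random.seed(0)
data = ["a" * n for n in range(1000)]
random.shuffle(data)

via_cmp = sorted(data, key=cmp_to_key(by_length))
via_key = sorted(data, key=LengthKey)
via_int = sorted(data, key=len)    # comparisons happen on plain ints

print(calls)
```

Only the last variant, where the key yields built-in ints, keeps the
comparisons out of user-level Python code entirely.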

> A minor problem with cmp is that the mapping between return
> values and input comparisons is somewhat arbitrary. Does -1 mean a<b
> or b<a? (That can be learned and memorized, of course, though I tend
> to forget without constant use).

My rule of thumb is that a < b is equivalent to cmp(a, b) < 0
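
As a small sketch of that convention: Python 3 has no built-in cmp(), so the
function below is a stand-in defined here for illustration, using the
usual (a > b) - (a < b) idiom.

```python
def cmp(a, b):
    """Stand-in for the Python 2 built-in: negative if a < b,
    zero if a == b, positive if a > b."""
    return (a > b) - (a < b)

# The sign of cmp(a, b) mirrors the comparison a ? b.
for a, b in [(1, 2), (2, 2), (3, 2)]:
    print(a, b, cmp(a, b), (a < b) == (cmp(a, b) < 0))
```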

> A bigger problem is that it conflicts with key=. What is the result of
> l=[1,3,2]
> l.sort(cmp=lambda x,y:y-x, key=lambda x: x)
> print l
> ? (for answer, see http://bugs.python.org/issue11712 )
> 
> While that can also be learned, I consider conflicting parameters
> undesirable and better avoided when reasonably possible. So I see
> this thread as a discussion of the meaning of 'reasonably' in this
> particular case.

But what does this have to do with use cases? Does what is reasonable
depend on the current use cases without regard to possible future use
cases? Is the conflict between key and cmp a lesser problem in the
case of someone having a huge data set to sort on a computer that lacks
the resources to decorate as opposed to currently no one having such
a data set? Are we going to decide which functions/methods get a cmp
argument depending on which use cases we currently have that would need it?

This thread started with a request for use cases. But if you take this
kind of thing into consideration, I don't see how use cases can then
make a big difference in the final decision.

#2263

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2011-03-31 02:13 +0000
Message-ID	<4d93e360$0$29996$c3e8da3$5496439d@news.astraweb.com>
In reply to	#2217

On Wed, 30 Mar 2011 11:06:20 +0200, Antoon Pardon wrote:

> As far as I can see, key will only produce significant speedups, if
> comparing items can then be completely done internally in the python
> engine without referencing user python code.

Incorrect. You don't even need megabytes of data to see significant 
differences. How about a mere 1000 short strings?

[steve@wow-wow ~]$ python2.6
Python 2.6.6 (r266:84292, Dec 21 2010, 18:12:50)
[GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from random import shuffle
>>> data = ['a'*n for n in range(1000)]
>>> shuffle(data)
>>> from timeit import Timer
>>>
>>> t_key = Timer('sorted(data, key=lambda a: len(a))',
... 'from __main__ import data')
>>> t_cmp = Timer('sorted(data, cmp=lambda a,b: cmp(len(a), len(b)))',
... 'from __main__ import data')
>>>
>>> min(t_key.repeat(number=1000, repeat=5))
0.89357517051696777
>>> min(t_cmp.repeat(number=1000, repeat=5))
7.6032949066162109

That's almost ten times slower.

Of course, the right way to do that specific sort is:

>>> t_len = Timer('sorted(data, key=len)', 'from __main__ import data')
>>> min(t_len.repeat(number=1000, repeat=5))
0.64559602737426758

which is even easier and faster. But even comparing a pure Python key 
function to the cmp function, it's obvious that cmp is nearly always 
slower.
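
For anyone wanting to repeat the measurement on Python 3, here is a rough
sketch under the assumption that functools.cmp_to_key is an acceptable
stand-in for the removed cmp= argument. The helper name cmp_len is made up
for the example, and the repeat counts are smaller than in the session
above. No timings are claimed; they depend on the machine and the
interpreter version.

```python
from functools import cmp_to_key
from random import shuffle
from timeit import Timer

data = ['a' * n for n in range(1000)]
shuffle(data)

def cmp_len(a, b):
    # cmp-style comparison of lengths; Python 3 has no built-in cmp().
    return (len(a) > len(b)) - (len(a) < len(b))

# Timer accepts a zero-argument callable as the statement to time.
t_key = Timer(lambda: sorted(data, key=lambda a: len(a)))
t_cmp = Timer(lambda: sorted(data, key=cmp_to_key(cmp_len)))

key_time = min(t_key.repeat(repeat=3, number=100))
cmp_time = min(t_cmp.repeat(repeat=3, number=100))
print("key: %.3fs  cmp_to_key: %.3fs" % (key_time, cmp_time))
```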

Frankly, trying to argue that cmp is faster, or nearly as fast, is a 
losing proposition. In my opinion, the only strategy that has even a 
faint glimmer of hope is to find a convincing use-case where speed does 
not matter.

Or, an alternative approach would be for one of the cmp-supporters to 
take the code for Python's sort routine, and implement your own sort-with-
cmp (in C, of course, a pure Python solution will likely be unusable) and 
offer it as a download. For anyone who knows how to do C extensions, this 
shouldn't be hard: just grab the code in Python 2.7 and make it a stand-
alone function that can be imported. 

If you get lots of community interest in this, that is a good sign that 
the solution is useful and practical, and then you can push to have it 
included in the standard library or even as a built-in.

And if not, well, at least you will be able to continue using cmp in your 
own code.

-- 
Steven

#2278

From	Antoon Pardon <Antoon.Pardon@rece.vub.ac.be>
Date	2011-03-31 11:41 +0200
Message-ID	<mailman.28.1301564292.2990.python-list@python.org>
In reply to	#2263

On Thu, Mar 31, 2011 at 02:13:53AM +0000, Steven D'Aprano wrote:
> On Wed, 30 Mar 2011 11:06:20 +0200, Antoon Pardon wrote:
> 
> > As far as I can see, key will only produce significant speedups, if
> > comparing items can then be completely done internally in the python
> > engine without referencing user python code.
> 
> Incorrect. You don't even need megabytes of data to see significant 
> differences. How about a mere 1000 short strings?
> 
> 
> [steve@wow-wow ~]$ python2.6
> Python 2.6.6 (r266:84292, Dec 21 2010, 18:12:50)
> [GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from random import shuffle
> >>> data = ['a'*n for n in range(1000)]
> >>> shuffle(data)
> >>> from timeit import Timer
> >>>
> >>> t_key = Timer('sorted(data, key=lambda a: len(a))',
> ... 'from __main__ import data')
> >>> t_cmp = Timer('sorted(data, cmp=lambda a,b: cmp(len(a), len(b)))',
> ... 'from __main__ import data')
> >>>
> >>> min(t_key.repeat(number=1000, repeat=5))
> 0.89357517051696777
> >>> min(t_cmp.repeat(number=1000, repeat=5))
> 7.6032949066162109
> 
> That's almost ten times slower.

But how does it contradict what I wrote above? Maybe I didn't make
myself clear but in your example, the key function produces ints.
With ints the comparison is completely done within the python engine
without the need to defer to user python code (through the rich
comparison functions). So this is the kind of key I described that
would produce a significant speedup.

But once your key produces user objects that need to be compared
through user python code with __lt__ methods, getting a speedup
through the use of key instead of cmp will not be obvious and
in a number of cases you will lose speed.

> Of course, the right way to do that specific sort is:
> >>> t_len = Timer('sorted(data, key=len)', 'from __main__ import data')
> >>> min(t_len.repeat(number=1000, repeat=5))
> 0.64559602737426758
> 
> which is even easier and faster. But even comparing a pure Python key 
> function to the cmp function, it's obvious that cmp is nearly always 
> slower.

I don't find that obvious at all. 

> Frankly, trying to argue that cmp is faster, or nearly as fast, is a 
> losing proposition. In my opinion, the only strategy that has even a 
> faint glimmer of hope is to find a convincing use-case where speed does 
> not matter.

I don't argue such a general statement. I just argue there are cases
where it is and I supported that statement with numbers. The only
way these numbers could be disputed was by focussing on specific details
that didn't generalize.

Now maybe providing a cmp_to_key function written in C, can reduce
the overhead for such cases as Raymond Hettinger suggested. If it
turns out that that works out, then I would consider this thread
useful even if that will be the only result of this thread.

Something else the dev team can consider, is a Negation class.
This class would wrap itself around a class or object but reverse the ordering.
So that we would get Negation('a') > Negation('b'). That would make it
easy to create sort keys in which partial keys had to be sorted differently
from each other. This would have to be done by the dev team since I
guess that writing such a thing in Python would lose all the speed
of using a key-function.
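
A pure-Python sketch of what such a class could look like, only to show the
intended semantics. It is an illustration under my own assumptions, not a
proposal for the actual implementation, and it pays exactly the Python-level
comparison cost described above. The records list is made-up sample data.

```python
from functools import total_ordering

@total_ordering
class Negation:
    """Wrap a value so that it sorts in the opposite order."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        # Reversed on purpose: the wrapper is "smaller" when the
        # wrapped value is larger.
        return self.value > other.value

records = [("bob", 3), ("alice", 3), ("carol", 1), ("alice", 5)]

# Name descending (strings cannot simply be negated), score ascending.
ordered = sorted(records, key=lambda r: (Negation(r[0]), r[1]))
print(ordered)
# [('carol', 1), ('bob', 3), ('alice', 3), ('alice', 5)]
```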

> Or, an alternative approach would be for one of the cmp-supporters to 
> take the code for Python's sort routine, and implement your own sort-with-
> cmp (in C, of course, a pure Python solution will likely be unusable) and 
> offer it as a download. For anyone who knows how to do C extensions, this 
> shouldn't be hard: just grab the code in Python 2.7 and make it a stand-
> alone function that can be imported. 
> 
> If you get lots of community interest in this, that is a good sign that 
> the solution is useful and practical, and then you can push to have it 
> included in the standard library or even as a built-in.
> 
> And if not, well, at least you will be able to continue using cmp in your 
> own code.

I'll first let Raymond Hettinger rewrite cmp_to_key in C, as I understand he
suggested, and reevaluate after that.

-- 
Antoon Pardon

#2279

From	Paul Rubin <no.email@nospam.invalid>
Date	2011-03-31 04:59 -0700
Message-ID	<7x1v1n7b6m.fsf@ruckus.brouhaha.com>
In reply to	#2278

Antoon Pardon <Antoon.Pardon@rece.vub.ac.be> writes:
> Something else the dev team can consider, is a Negation class....
> This would have to be done by the dev team since I
> guess that writing such a thing in Python would lose all the speed
> of using a key-function.

That is a good idea.  SQL has something like it, I think, and Haskell
also has it.  I thought about writing it in Python but it would slow
things down a lot.  I hadn't thought of writing it in C for some reason.

#2408

From	geremy condra <debatem1@gmail.com>
Date	2011-04-01 14:31 -0700
Message-ID	<mailman.105.1301693472.2990.python-list@python.org>
In reply to	#2263

On Wed, Mar 30, 2011 at 7:13 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:

<snip>

> Or, an alternative approach would be for one of the cmp-supporters to
> take the code for Python's sort routine, and implement your own sort-with-
> cmp (in C, of course, a pure Python solution will likely be unusable) and
> offer it as a download. For anyone who knows how to do C extensions, this
> shouldn't be hard: just grab the code in Python 2.7 and make it a stand-
> alone function that can be imported.
>
> If you get lots of community interest in this, that is a good sign that
> the solution is useful and practical, and then you can push to have it
> included in the standard library or even as a built-in.
>
> And if not, well, at least you will be able to continue using cmp in your
> own code.

I don't have a horse in this race, but I do wonder how much of Python
could actually survive this test. My first (uneducated) guess is "not
very much"- we would almost certainly lose large pieces of the string
API and other builtins, and I have no doubt at all that a really
significant chunk of the standard library would vanish as well. In
fact, looking at the data I took from PyPI a while back, it's pretty
clear that Python's feature set would look very different overall if
we applied this test to everything.

Geremy Condra

#2419

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2011-04-02 00:41 +0000
Message-ID	<4d9670a9$0$29992$c3e8da3$5496439d@news.astraweb.com>
In reply to	#2408

On Fri, 01 Apr 2011 14:31:09 -0700, geremy condra wrote:

> On Wed, Mar 30, 2011 at 7:13 PM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
> 
> <snip>
> 
>> Or, an alternative approach would be for one of the cmp-supporters to
>> take the code for Python's sort routine, and implement your own
>> sort-with-cmp (in C, of course, a pure Python solution will likely be
>> unusable) and offer it as a download. For anyone who knows how to do C
>> extensions, this shouldn't be hard: just grab the code in Python 2.7
>> and make it a stand-alone function that can be imported.
>>
>> If you get lots of community interest in this, that is a good sign that
>> the solution is useful and practical, and then you can push to have it
>> included in the standard library or even as a built-in.
>>
>> And if not, well, at least you will be able to continue using cmp in
>> your own code.
> 
> I don't have a horse in this race, but I do wonder how much of Python
> could actually survive this test. My first (uneducated) guess is "not
> very much"- we would almost certainly lose large pieces of the string
> API and other builtins, and I have no doubt at all that a really
> significant chunk of the standard library would vanish as well. In fact,
> looking at the data I took from PyPI a while back, it's pretty clear
> that Python's feature set would look very different overall if we
> applied this test to everything.

I don't understand what you mean by "this test".

I'm certainly not suggesting that we strip every built-in of all methods 
and make everything a third-party C extension. That would be insane.

Nor do I mean that every feature in the standard library should be forced 
to prove itself or be removed. The features removed from Python 3 were 
deliberately few and conservative, and it was a one-off change (at least 
until Python 4000 in the indefinite future). If something is in Python 3 
*now*, you can assume that it won't be removed any time soon.

What I'm saying is this: cmp is already removed from sorting, and we 
can't change the past. Regardless of whether this was a mistake or not, 
the fact is that it is gone, and therefore re-adding it is a new feature 
request. Those who want cmp functionality in Python 3 have three broad 
choices:

(1) suck it up and give up the fight; the battle is lost, move on;

(2) keep arguing until they either wear down the Python developers or get 
kill-filed; never give up, never surrender;

(3) port the feature that they want into a third-party module, so that 
they can actually use it in code, and then when they have evidence that 
the community needs and/or wants this feature, then try to have it re-
added to the language.

I'm suggesting that #3 is a more practical, useful approach than writing 
another hundred thousand words complaining about what a terrible mistake 
it was. Having to do:

from sorting import csort

as a prerequisite for using a comparison function is not an onerous 
requirement for developers. If fans of functional programming can live 
with "from functools import reduce", fans of cmp can live with that.
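
To make the shape of such a function concrete, here is a sketch of what the
interface might look like. Both the module name sorting and the function
csort are hypothetical, and this version is plain Python built on
functools.cmp_to_key (available since Python 3.2) rather than the C
extension suggested earlier, so it shows the API, not the performance.

```python
from functools import cmp_to_key

def csort(iterable, cmp, reverse=False):
    """Hypothetical csort: return a new list sorted with a
    two-argument comparison function, as cmp= did in Python 2."""
    return sorted(iterable, key=cmp_to_key(cmp), reverse=reverse)

print(csort([1, 3, 2], lambda x, y: y - x))
# [3, 2, 1]
```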

-- 
Steven

#2422

From	geremy condra <debatem1@gmail.com>
Date	2011-04-01 18:22 -0700
Message-ID	<mailman.113.1301707325.2990.python-list@python.org>
In reply to	#2419

On Fri, Apr 1, 2011 at 5:41 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Fri, 01 Apr 2011 14:31:09 -0700, geremy condra wrote:
>
>> On Wed, Mar 30, 2011 at 7:13 PM, Steven D'Aprano
>> <steve+comp.lang.python@pearwood.info> wrote:
>>
>> <snip>
>>
>>> Or, an alternative approach would be for one of the cmp-supporters to
>>> take the code for Python's sort routine, and implement your own
>>> sort-with-cmp (in C, of course, a pure Python solution will likely be
>>> unusable) and offer it as a download. For anyone who knows how to do C
>>> extensions, this shouldn't be hard: just grab the code in Python 2.7
>>> and make it a stand-alone function that can be imported.
>>>
>>> If you get lots of community interest in this, that is a good sign that
>>> the solution is useful and practical, and then you can push to have it
>>> included in the standard library or even as a built-in.
>>>
>>> And if not, well, at least you will be able to continue using cmp in
>>> your own code.
>>
>> I don't have a horse in this race, but I do wonder how much of Python
>> could actually survive this test. My first (uneducated) guess is "not
>> very much"- we would almost certainly lose large pieces of the string
>> API and other builtins, and I have no doubt at all that a really
>> significant chunk of the standard library would vanish as well. In fact,
>> looking at the data I took from PyPI a while back, it's pretty clear
>> that Python's feature set would look very different overall if we
>> applied this test to everything.
>
>
> I don't understand what you mean by "this test".

I mean testing whether a feature should be in Python based on whether
it can meet some undefined standard of popularity if implemented as a
third-party module or extension.

> I'm certainly not suggesting that we strip every built-in of all methods
> and make everything a third-party C extension. That would be insane.

Granted, but I think the implication is clear: that only those
features which could be successful if implemented and distributed by a
third party should be in Python. My argument is that there are many
features currently in Python that I doubt would pass that test, but
which should probably be in anyway. The conclusion I draw from that is
that this isn't a particularly good way to determine whether something
should be in standard Python.

> Nor do I mean that every feature in the standard library should be forced
> to prove itself or be removed. The features removed from Python 3 were
> deliberately few and conservative, and it was a one-off change (at least
> until Python 4000 in the indefinite future). If something is in Python 3
> *now*, you can assume that it won't be removed any time soon.

I may have been unclear, so let me reiterate: I'm not under the
impression that you're advocating this as a course of action. I'm just
pointing out that the standard for inclusion you're advocating is
probably not a particularly good one, especially in this case, and
engaging in a bit of a thought experiment about what would happen if
other parts of Python were similarly scrutinized.

> What I'm saying is this: cmp is already removed from sorting, and we
> can't change the past. Regardless of whether this was a mistake or not,
> the fact is that it is gone, and therefore re-adding it is a new feature
> request. Those who want cmp functionality in Python 3 have three broad
> choices:

I might quibble over whether re-adding is the same as a new feature
request, but as I said- I don't care about cmp.

> (1) suck it up and give up the fight; the battle is lost, move on;
>
> (2) keep arguing until they either wear down the Python developers or get
> kill-filed; never give up, never surrender;
>
> (3) port the feature that they want into a third-party module, so that
> they can actually use it in code, and then when they have evidence that
> the community needs and/or wants this feature, then try to have it re-
> added to the language.
>
> I'm suggesting that #3 is a more practical, useful approach than writing
> another hundred thousand words complaining about what a terrible mistake
> it was. Having to do:
>
> from sorting import csort
>
> as a prerequisite for using a comparison function is not an onerous
> requirement for developers. If fans of functional programming can live
> with "from functools import reduce", fans of cmp can live with that.

And that's fine, as I said I don't have a horse in this race. My point
is just that I don't think the standard you're using is a good one-
ISTM that if it *had* been applied evenly we would have wound up with
a much less complete (and much less awesome) Python than we have
today. That indicates that there are a reasonable number of real-world
cases where it hasn't and shouldn't apply.

Geremy Condra

#2449

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2011-04-02 11:01 +0000
Message-ID	<4d97020f$0$29992$c3e8da3$5496439d@news.astraweb.com>
In reply to	#2422

On Fri, 01 Apr 2011 18:22:01 -0700, geremy condra wrote:
[...]
>>> I don't have a horse in this race, but I do wonder how much of Python
>>> could actually survive this test. My first (uneducated) guess is "not
>>> very much"- we would almost certainly lose large pieces of the string
>>> API and other builtins, and I have no doubt at all that a really
>>> significant chunk of the standard library would vanish as well. In
>>> fact, looking at the data I took from PyPI a while back, it's pretty
>>> clear that Python's feature set would look very different overall if
>>> we applied this test to everything.
>>
>>
>> I don't understand what you mean by "this test".
> 
> I mean testing whether a feature should be in Python based on whether it
> can meet some undefined standard of popularity if implemented as a
> third-party module or extension.
[...]
> Granted, but I think the implication is clear: that only those features
> which could be successful if implemented and distributed by a third
> party should be in Python.

Ah, gotcha.

I think you're reading too much into what I said -- I wasn't implying 
that community support is the only acceptable reason for the existence of 
features in Python.

Development of Python is not a democracy, it is a meritocracy. It is 
designed by a small team of language developers, starting with Guido van 
Rossum. Those who do the work decide what goes in, based on whatever 
combination of factors they choose:

* some features are such obvious no-brainers that only a complete idiot 
would leave them out ("what do you mean, there's no way to add two 
numbers?");
* what other languages do;
* personal preference;
* tools that they personally find useful, or that they expect will be 
useful to many;

etc. And *every one of these* is subject to the requirement of a rough 
consensus, or a BDFL pronouncement. The rest of us can only hope to 
persuade the Python developers: if you want somebody to scratch your itch 
instead of their own, you need to convince them to do so.

My point was that good community support is a fairly good method of 
persuasion. The broader community does not get a vote, but that does not 
mean their voices are unheard.

-- 
Steven

#2504

From	geremy condra <debatem1@gmail.com>
Date	2011-04-02 23:22 -0700
Message-ID	<mailman.155.1301811737.2990.python-list@python.org>
In reply to	#2449

On Sat, Apr 2, 2011 at 4:01 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Fri, 01 Apr 2011 18:22:01 -0700, geremy condra wrote:
> [...]
>>>> I don't have a horse in this race, but I do wonder how much of Python
>>>> could actually survive this test. My first (uneducated) guess is "not
>>>> very much"- we would almost certainly lose large pieces of the string
>>>> API and other builtins, and I have no doubt at all that a really
>>>> significant chunk of the standard library would vanish as well. In
>>>> fact, looking at the data I took from PyPI a while back, it's pretty
>>>> clear that Python's feature set would look very different overall if
>>>> we applied this test to everything.
>>>
>>>
>>> I don't understand what you mean by "this test".
>>
>> I mean testing whether a feature should be in Python based on whether it
>> can meet some undefined standard of popularity if implemented as a
>> third-party module or extension.
> [...]
>> Granted, but I think the implication is clear: that only those features
>> which could be successful if implemented and distributed by a third
>> party should be in Python.
>
> Ah, gotcha.
>
> I think you're reading too much into what I said -- I wasn't implying
> that community support is the only acceptable reason for the existence of
> features in Python.
>
> Development of Python is not a democracy, it is a meritocracy. It is
> designed by a small team of language developers, starting with Guido van
> Rossum. Those who do the work decide what goes in, based on whatever
> combination of factors they choose:

I think we're talking at cross purposes. The point I'm making is that
there are lots of issues where popularity as a third party module
isn't really a viable test for whether a feature is sufficiently
awesome to be in core python. As part of determining whether I thought
it was appropriate in this case I essentially just asked myself
whether any of the really good and necessary parts of Python would
fail to be readmitted under similar circumstances, and I think the
answer is that very few would come back in. To me, that indicates that
this isn't the right way to address this issue, although I admit that
I lack any solid proof to base that conclusion on.

Geremy Condra

#2505

From	Brian Quinlan <brian@sweetapp.com>
Date	2011-04-03 16:34 +1000
Message-ID	<mailman.156.1301812482.2990.python-list@python.org>
In reply to	#2449

On 3 Apr 2011, at 16:22, geremy condra wrote:
> I think we're talking at cross purposes. The point I'm making is that
> there are lots of issues where popularity as a third party module
> isn't really a viable test for whether a feature is sufficiently
> awesome to be in core python. As part of determining whether I thought
> it was appropriate in this case I essentially just asked myself
> whether any of the really good and necessary parts of Python would
> fail to be readmitted under similar circumstances, and I think the
> answer is that very few would come back in. To me, that indicates that
> this isn't the right way to address this issue, although I admit that
> I lack any solid proof to base that conclusion on.

This has been discussed a few times on python-dev. I think that most  
developers acknowledge that small-but-high-utility modules would not  
survive outside of the core because people would simply recreate them  
rather than investing the time to find, learn and use them.

Cheers,
Brian

#2520

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2011-04-03 10:21 +0000
Message-ID	<4d984a21$0$29992$c3e8da3$5496439d@news.astraweb.com>
In reply to	#2505

On Sun, 03 Apr 2011 16:34:34 +1000, Brian Quinlan wrote:

> On 3 Apr 2011, at 16:22, geremy condra wrote:
>> I think we're talking at cross purposes. The point I'm making is that
>> there are lots of issues where popularity as a third party module isn't
>> really a viable test for whether a feature is sufficiently awesome to
>> be in core python. As part of determining whether I thought it was
>> appropriate in this case I essentially just asked myself whether any of
>> the really good and necessary parts of Python would fail to be
>> readmitted under similar circumstances, and I think the answer is that
>> very few would come back in. To me, that indicates that this isn't the
>> right way to address this issue, although I admit that I lack any solid
>> proof to base that conclusion on.
> 
> This has been discussed a few times on python-dev. I think that most
> developers acknowledge that small-but-high-utility modules would not
> survive outside of the core because people would simply recreate them
> rather than investing the time to find, learn and use them.

That's certainly true for pure Python code, but for a C extension, the 
barrier to Do It Yourself will be much higher for most Python coders.

On the other hand, for a pure Python function or class, you could stick 
it on ActiveState's Python cookbook and get some imperfect measure of 
popularity and/or usefulness from the comments and votes there.



-- 
Steven

#2534

From	geremy condra <debatem1@gmail.com>
Date	2011-04-03 12:58 -0700
Message-ID	<mailman.173.1301860688.2990.python-list@python.org>
In reply to	#2520

On Sun, Apr 3, 2011 at 3:21 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Sun, 03 Apr 2011 16:34:34 +1000, Brian Quinlan wrote:
>
>> On 3 Apr 2011, at 16:22, geremy condra wrote:
>>> I think we're talking at cross purposes. The point I'm making is that
>>> there are lots of issues where popularity as a third party module isn't
>>> really a viable test for whether a feature is sufficiently awesome to
>>> be in core python. As part of determining whether I thought it was
>>> appropriate in this case I essentially just asked myself whether any of
>>> the really good and necessary parts of Python would fail to be
>>> readmitted under similar circumstances, and I think the answer is that
>>> very few would come back in. To me, that indicates that this isn't the
>>> right way to address this issue, although I admit that I lack any solid
>>> proof to base that conclusion on.
>>
>> This has been discussed a few times on python-dev. I think that most
>> developers acknowledge that small-but-high-utility modules would not
>> survive outside of the core because people would simply recreate them
>> rather than investing the time to find, learn and use them.
>
> That's certainly true for pure Python code, but for a C extension, the
> barrier to Do It Yourself will be much higher for most Python coders.

I don't think people will work around it in C. I think they'll
grudgingly accept a slow and kludgy python workaround, and more to the
point I think they would do that with a vast majority of features at
this scale. That's why I say this isn't a good test here: because you
could apply it to a great feature or a terrible feature and with
overwhelming probability have them fail in both cases.

> On the other hand, for a pure Python function or class, you could stick
> it on ActiveState's Python cookbook and get some imperfect measure of
> popularity and/or usefulness from the comments and votes there.

Frankly, I have little trust in this as a measure of popularity. Even
PyPI isn't a great indicator, and the numbers you get off of
ActiveState are almost certain to be way, way noisier.

Geremy Condra


#2426

From	Paul Rubin <no.email@nospam.invalid>
Date	2011-04-01 19:29 -0700
Message-ID	<7xtyeh4c7s.fsf@ruckus.brouhaha.com>
In reply to	#2419

Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:
> What I'm saying is this: cmp is already removed from sorting, and we 
> can't change the past. Regardless of whether this was a mistake or
> not, 

No it's not already removed, I just tried it (in Python 2.6, which is
called "Python" for short) and it still works.  It's not "removed" from
Python until basically all Python users have migrated and "Python"
essentially always means "Python 3".  Until that happens, for Python 2
users, Python 3 is just a fork of Python with some stuff added and some
stuff broken, that might get its act together someday.  I see in the
subject of this thread, "Guido rethinking removal of cmp from sort
method" which gives hope that one particular bit of breakage might get
fixed.

> the fact is that it is gone, and therefore re-adding it is a new feature 
> request. Those who want cmp functionality in Python 3 have three broad 
> choices: ...
> (3) port the feature that they want into a third-party module, ...
> I'm suggesting that #3 is a more practical, useful approach ...
> Having to do:
>    from sorting import csort ...
> If fans of functional programming can live 
> with "from functools import reduce", fans of cmp can live with that.

If "sorting" is in the stdlib like functools is, then the similarity
makes sense and the suggestion isn't so bad.  But you're proposing a 3rd
party module, which is not the same thing at all.  "Batteries included"
actually means something, namely that you don't have to write your
critical applications using a library base written with a Wikipedia-like
development model where anybody can ship anything, where you're expected
to examine every module yourself before you can trust it.  Stuff in the
stdlib occasionally has bugs or gaps, but it has a generally consistent
quality level, is documented, and has been reviewed and reasonably
sanity checked by a central development group that knows what it's
doing.  Stuff in 3rd party libraries has none of the above.  There are
too many places for it to go wrong and I've generally found it best to
stick with stdlib modules instead of occasionally superior modules that
have the disadvantage of coming from a third party.


#2428

From	Chris Angelico <rosuav@gmail.com>
Date	2011-04-02 13:43 +1100
Message-ID	<mailman.116.1301712230.2990.python-list@python.org>
In reply to	#2426

On Sat, Apr 2, 2011 at 1:29 PM, Paul Rubin <no.email@nospam.invalid> wrote:
> If "sorting" is in the stdlib like functools is, then the similarity
> makes sense and the suggestion isn't so bad.  But you're proposing a 3rd
> party module, which is not the same thing at all.  "Batteries included"
> actually means something...

To me, "batteries included" means that I can:
1) Write a Python script on any Ubuntu laptop that I put my hands on,
and expect it to work.
2) Put a shebang on it, chmod it plus exx, and give it to someone, and
expect it to work on his system.
3) Post it to my web site along with the comment "You will need a
Python interpreter to run this", and expect it to work.

Every third-party library I need weakens that. Sure, situation 1 isn't
too hard; but the other two end up becoming a bit awkward. The
Yosemite Project requires a support module on Windows, making it that
bit harder to share with people; but I accept that, because it's doing
some rather unusual things (simulating keypresses on another window).
Sorting a list is not unusual enough to justify a third-party module.

ChrisA


#2451

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2011-04-02 11:27 +0000
Message-ID	<4d970824$0$29992$c3e8da3$5496439d@news.astraweb.com>
In reply to	#2426

On Fri, 01 Apr 2011 19:29:59 -0700, Paul Rubin wrote:

> Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:
>> What I'm saying is this: cmp is already removed from sorting, and we
>> can't change the past. Regardless of whether this was a mistake or not,
> 
> No it's not already removed, I just tried it (in Python 2.6, which is
> called "Python" for short) and it still works.  It's not "removed" from
> Python until basically all Python users have migrated and "Python"
> essentially always means "Python 3".

You know full well that I'm talking about Python 3, not Python 2 or 
Python 1. In Python 2, sort still takes a cmp argument, so what's the 
problem?
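
For reference, a minimal sketch of how an old-style comparison function 
can still drive a sort on Python 3. It assumes Python 3.2 or later (or 
2.7), where functools.cmp_to_key is available. The function name 
by_length_then_alpha and the sample data are invented for illustration.

```python
from functools import cmp_to_key

def by_length_then_alpha(a, b):
    # Old-style comparison: return a negative number, zero, or a
    # positive number, exactly as a Python 2 cmp function would.
    if len(a) != len(b):
        return len(a) - len(b)
    # Equivalent of the removed builtin cmp(a, b) for the tie-break.
    return (a > b) - (a < b)

words = ["pear", "fig", "apple", "kiwi"]

# cmp_to_key wraps the comparison function so it can be passed as key=.
result = sorted(words, key=cmp_to_key(by_length_then_alpha))
print(result)
```

The same wrapped function can be handed to list.sort(key=...), so code 
written around a comparison function needs one extra call rather than a 
rewrite.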

> Until that happens, for Python 2
> users, Python 3 is just a fork of Python with some stuff added and some
> stuff broken, that might get its act together someday.  I see in the
> subject of this thread, "Guido rethinking removal of cmp from sort
> method" which gives hope that one particular bit of breakage might get
> fixed.

You call it a breakage, but many others disagree. The point of this 
thread was supposed to be to encourage people like you to come up with 
good, reasoned, reasonable arguments for adding cmp, not to engage in FUD 
about Python 3 being a "fork" of Python, or that there is never under any 
circumstances any good reason for removing features.

I expected a certain amount of bitterness to come through, but if I had 
realised just what a bunch of whining I was going to unleash, I would 
have kept quiet.

[...]
> If "sorting" is in the stdlib like functools is, then the similarity
> makes sense and the suggestion isn't so bad.  But you're proposing a 3rd
> party module, which is not the same thing at all.

In the face of opposition from senior developers who don't want cmp in 
the language, and who will presumably oppose adding a "needless" module 
to the standard library, you will need good solid evidence that this 
functionality is wanted and needed, and not just a bunch of crappy 
rationalizations like "I can't be bothered thinking up a key function", 
which was actually suggested by someone in this thread. (Not in those 
exact words, but that's the gist of it.) A good way to gather such 
evidence is to make it a third party module first: if people want the 
feature, they will install it, just like they install numpy or pyparsing 
or any other wanted module that is not in the standard library.

And if not, in the absolute worst case, at least *you* can continue using 
cmp in your own code.
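
As a rough sketch of how small such a third-party module could be: the 
module layout, the name csort (borrowed from the "from sorting import 
csort" example earlier in the thread) and the helper class _CmpKey are 
all hypothetical, and this is a pure-Python illustration rather than 
anyone's actual implementation.

```python
class _CmpKey(object):
    """Hypothetical helper: wrap a value so that ordering is delegated
    to a cmp-style function."""
    __slots__ = ("obj", "cmp")

    def __init__(self, obj, cmp):
        self.obj = obj
        self.cmp = cmp

    def __lt__(self, other):
        # sorted() only needs "less than" to order its items.
        return self.cmp(self.obj, other.obj) < 0


def csort(iterable, cmp, reverse=False):
    """Hypothetical csort: return a new list sorted with a cmp function."""
    return sorted(iterable, key=lambda obj: _CmpKey(obj, cmp),
                  reverse=reverse)


def descending(a, b):
    # cmp-style function that orders larger values first.
    return (b > a) - (b < a)


numbers = csort([3, 1, 2], descending)
print(numbers)
```

Every comparison goes through a Python-level method call here, so a 
wrapper like this will be slower than a plain key function.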

I realise that this strategy will only imperfectly capture community 
desire for cmp sorting, but do you have a better strategy?

You don't have to follow my suggestion. If you think you have a better 
strategy for convincing the people doing the actual work to scratch 
*your* itch instead of their own, then go right ahead and use it.

-- 
Steven
