Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #106635 > unrolled thread

Re: how to convert code that uses cmp to python3

Started byPaul Rubin <no.email@nospam.invalid>
First post2016-04-07 12:26 -0700
Last post2016-04-08 02:12 -0400
Articles 13 — 4 participants

Back to article view | Back to comp.lang.python

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: how to convert code that uses cmp to python3 Paul Rubin <no.email@nospam.invalid> - 2016-04-07 12:26 -0700
    Re: how to convert code that uses cmp to python3 Marko Rauhamaa <marko@pacujo.net> - 2016-04-07 22:32 +0300
      Re: how to convert code that uses cmp to python3 Paul Rubin <no.email@nospam.invalid> - 2016-04-07 14:15 -0700
        Re: how to convert code that uses cmp to python3 Marko Rauhamaa <marko@pacujo.net> - 2016-04-08 07:22 +0300
          Re: how to convert code that uses cmp to python3 Terry Reedy <tjreedy@udel.edu> - 2016-04-08 02:20 -0400
            Re: how to convert code that uses cmp to python3 Marko Rauhamaa <marko@pacujo.net> - 2016-04-08 09:53 +0300
              Re: how to convert code that uses cmp to python3 Paul Rubin <no.email@nospam.invalid> - 2016-04-08 00:03 -0700
                Re: how to convert code that uses cmp to python3 Marko Rauhamaa <marko@pacujo.net> - 2016-04-08 10:40 +0300
                  Re: how to convert code that uses cmp to python3 Paul Rubin <no.email@nospam.invalid> - 2016-04-08 01:43 -0700
      Re: how to convert code that uses cmp to python3 Marko Rauhamaa <marko@pacujo.net> - 2016-04-08 07:17 +0300
        Re: how to convert code that uses cmp to python3 Ian Kelly <ian.g.kelly@gmail.com> - 2016-04-07 22:37 -0600
          Re: how to convert code that uses cmp to python3 Marko Rauhamaa <marko@pacujo.net> - 2016-04-08 08:00 +0300
      Re: how to convert code that uses cmp to python3 Terry Reedy <tjreedy@udel.edu> - 2016-04-08 02:12 -0400

#106635 — Re: how to convert code that uses cmp to python3

FromPaul Rubin <no.email@nospam.invalid>
Date2016-04-07 12:26 -0700
SubjectRe: how to convert code that uses cmp to python3
Message-ID<87a8l5p6ek.fsf@nightsong.com>
Chris Angelico <rosuav@gmail.com> writes:
> First off, what does it actually *mean* to have a tree with numbers
> and keys as strings? Are they ever equal? Are all integers deemed
> lower than all strings? Something else?

If the AVL tree's purpose is to be an alternative lookup structure to
Python's hash-based dictionaries, then it doesn't really matter what the
ordering between values is, as long as it's deterministic.

[toc] | [next] | [standalone]


#106636

FromMarko Rauhamaa <marko@pacujo.net>
Date2016-04-07 22:32 +0300
Message-ID<874mbdi5ai.fsf@elektro.pacujo.net>
In reply to#106635
Paul Rubin <no.email@nospam.invalid>:

> Chris Angelico <rosuav@gmail.com> writes:
>> First off, what does it actually *mean* to have a tree with numbers
>> and keys as strings? Are they ever equal? Are all integers deemed
>> lower than all strings? Something else?
>
> If the AVL tree's purpose is to be an alternative lookup structure to
> Python's hash-based dictionaries, then it doesn't really matter what
> the ordering between values is, as long as it's deterministic.

I use AVL trees to implement timers. You need to be able to insert
elements in a sorted order and remove them quickly.

Guido chose a different method to implement timers for asyncio. He
decided to never remove canceled timers.


Marko

[toc] | [prev] | [next] | [standalone]


#106637

FromPaul Rubin <no.email@nospam.invalid>
Date2016-04-07 14:15 -0700
Message-ID<8760vtp1d3.fsf@nightsong.com>
In reply to#106636
Marko Rauhamaa <marko@pacujo.net> writes:
> Guido chose a different method to implement timers for asyncio. He
> decided to never remove canceled timers.

Oh my, that might not end well.  There are other approaches that don't
need AVL trees and can remove cancelled timers, e.g. "timer wheels" as
used in Erlang and formerly (don't know about now) in the Linux kernel.

[toc] | [prev] | [next] | [standalone]


#106639

FromMarko Rauhamaa <marko@pacujo.net>
Date2016-04-08 07:22 +0300
Message-ID<87pou0hgqe.fsf@elektro.pacujo.net>
In reply to#106637
Paul Rubin <no.email@nospam.invalid>:

> Marko Rauhamaa <marko@pacujo.net> writes:
>> Guido chose a different method to implement timers for asyncio. He
>> decided to never remove canceled timers.
>
> Oh my, that might not end well. There are other approaches that don't
> need AVL trees and can remove cancelled timers, e.g. "timer wheels" as
> used in Erlang and formerly (don't know about now) in the Linux
> kernel.

The issue is known. It has been tackled with a kind of a "garbage
collection" scheme:

   <URL: https://bugs.python.org/issue22448>


Marko

[toc] | [prev] | [next] | [standalone]


#106652

FromTerry Reedy <tjreedy@udel.edu>
Date2016-04-08 02:20 -0400
Message-ID<mailman.66.1460096435.2253.python-list@python.org>
In reply to#106639
On 4/8/2016 12:22 AM, Marko Rauhamaa wrote:
> Paul Rubin <no.email@nospam.invalid>:
>
>> Marko Rauhamaa <marko@pacujo.net> writes:
>>> Guido chose a different method to implement timers for asyncio. He
>>> decided to never remove canceled timers.

Only initially.  He approved a change immediately when presented with a 
concrete problem.

>> Oh my, that might not end well. There are other approaches that don't
>> need AVL trees and can remove cancelled timers, e.g. "timer wheels" as
>> used in Erlang and formerly (don't know about now) in the Linux
>> kernel.
>
> The issue is known. It has been tackled with a kind of a "garbage
> collection" scheme:
>
>     <URL: https://bugs.python.org/issue22448>

and fixed 1 1/2 years ago.


-- 
Terry Jan Reedy

[toc] | [prev] | [next] | [standalone]


#106653

FromMarko Rauhamaa <marko@pacujo.net>
Date2016-04-08 09:53 +0300
Message-ID<87zit4fv6r.fsf@elektro.pacujo.net>
In reply to#106652
Terry Reedy <tjreedy@udel.edu>:

> On 4/8/2016 12:22 AM, Marko Rauhamaa wrote:
>> The issue is known. It has been tackled with a kind of a "garbage
>> collection" scheme:
>>
>>     <URL: https://bugs.python.org/issue22448>
>
> and fixed 1 1/2 years ago.

On the surface, the garbage collection scheme looks dubious, but maybe
it works perfect in practice.


Marko

[toc] | [prev] | [next] | [standalone]


#106655

FromPaul Rubin <no.email@nospam.invalid>
Date2016-04-08 00:03 -0700
Message-ID<87y48ooa3q.fsf@nightsong.com>
In reply to#106653
Marko Rauhamaa <marko@pacujo.net> writes:
> On the surface, the garbage collection scheme looks dubious, but maybe
> it works perfect in practice.

It looked suspicious at first glance but I think it is ok.  Basically on
at most every timeout event (scheduling, expiration, or cancellation),
it does an O(n) operation (scanning and re-heapifying the timeout list)
with probability O(1/n) where n is the queue size, which itself changes
(by 0, +1 or -1) when a timeout event happens.  That is, its overhead is
a constant factor unless I'm missing something.  There are some
efficiency gains possible but it seems par for the course for Python
code.

[toc] | [prev] | [next] | [standalone]


#106660

FromMarko Rauhamaa <marko@pacujo.net>
Date2016-04-08 10:40 +0300
Message-ID<87vb3sft0f.fsf@elektro.pacujo.net>
In reply to#106655
Paul Rubin <no.email@nospam.invalid>:

> Marko Rauhamaa <marko@pacujo.net> writes:
>> On the surface, the garbage collection scheme looks dubious, but
>> maybe it works perfect in practice.
>
> It looked suspicious at first glance but I think it is ok. Basically
> on at most every timeout event (scheduling, expiration, or
> cancellation), it does an O(n) operation (scanning and re-heapifying
> the timeout list) with probability O(1/n) where n is the queue size,
> which itself changes (by 0, +1 or -1) when a timeout event happens.
> That is, its overhead is a constant factor unless I'm missing
> something. There are some efficiency gains possible but it seems par
> for the course for Python code.

I compared the performance of AVL trees, the older heapq technique as
well as the "GC scheme" in 2014 with a challenging but realistic
test scenario. AVL trees and the GC scheme had pretty much the same
throughput (and the older, simple heapq was slower).

With AVL trees, it's easier to be convinced about worst-case
performance. It is more difficult to see the potential pathological
cases with the GC scheme.


Marko

[toc] | [prev] | [next] | [standalone]


#106666

FromPaul Rubin <no.email@nospam.invalid>
Date2016-04-08 01:43 -0700
Message-ID<87twjco5ih.fsf@nightsong.com>
In reply to#106660
Marko Rauhamaa <marko@pacujo.net> writes:
> With AVL trees, it's easier to be convinced about worst-case
> performance.

I'd have thought the main reason to use AVL trees was persistence, so
you could have multiple slightly different trees sharing most of their
structures.

> It is more difficult to see the potential pathological cases with the
> GC scheme.

How bad can the GC scheme be, if the constants are picked properly?
I'll think about this tomorrow, it's late here now.

[toc] | [prev] | [next] | [standalone]


#106638

FromMarko Rauhamaa <marko@pacujo.net>
Date2016-04-08 07:17 +0300
Message-ID<87twjchgyd.fsf@elektro.pacujo.net>
In reply to#106636
Ian Kelly <ian.g.kelly@gmail.com>:

> On Thu, Apr 7, 2016 at 1:32 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
>> I use AVL trees to implement timers. You need to be able to insert
>> elements in a sorted order and remove them quickly.
>
> Why would AVL trees implementing timers ever need non-numeric keys
> though?
>
> It seems to me that if you're mixing types like this then the ordering
> is likely not actually important.

The keys are expiry times. You could use numbers or you could use
datetime objects.

Ordering is crucial when it comes to timers.


Marko

[toc] | [prev] | [next] | [standalone]


#106640

FromIan Kelly <ian.g.kelly@gmail.com>
Date2016-04-07 22:37 -0600
Message-ID<mailman.62.1460090229.2253.python-list@python.org>
In reply to#106638
On Apr 7, 2016 10:22 PM, "Marko Rauhamaa" <marko@pacujo.net> wrote:
>
> Ian Kelly <ian.g.kelly@gmail.com>:
>
> > On Thu, Apr 7, 2016 at 1:32 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> >> I use AVL trees to implement timers. You need to be able to insert
> >> elements in a sorted order and remove them quickly.
> >
> > Why would AVL trees implementing timers ever need non-numeric keys
> > though?
> >
> > It seems to me that if you're mixing types like this then the ordering
> > is likely not actually important.
>
> The keys are expiry times. You could use numbers or you could use
> datetime objects.

Yes, but why would you want to use both?

> Ordering is crucial when it comes to timers.

I'm not disputing that.

[toc] | [prev] | [next] | [standalone]


#106644

FromMarko Rauhamaa <marko@pacujo.net>
Date2016-04-08 08:00 +0300
Message-ID<87k2k8heyv.fsf@elektro.pacujo.net>
In reply to#106640
Ian Kelly <ian.g.kelly@gmail.com>:

> On Apr 7, 2016 10:22 PM, "Marko Rauhamaa" <marko@pacujo.net> wrote:
>> The keys are expiry times. You could use numbers or you could use
>> datetime objects.
>
> Yes, but why would you want to use both?

I was never talking about mixing key types. I was simply reacting (out
of context) to a suggestion that AVL trees are simply dictionaries.


Marko

[toc] | [prev] | [next] | [standalone]


#106650

FromTerry Reedy <tjreedy@udel.edu>
Date2016-04-08 02:12 -0400
Message-ID<mailman.64.1460095944.2253.python-list@python.org>
In reply to#106636
On 4/7/2016 3:32 PM, Marko Rauhamaa wrote:

> I use AVL trees to implement timers. You need to be able to insert
> elements in a sorted order and remove them quickly.
>
> Guido chose a different method to implement timers for asyncio. He
> decided to never remove canceled timers.

In 3.5.1, asyncio.base_events.BaseEventLoop._run_once
has this code to remove cancelled timers when they become too numerous.

         if (sched_count > _MIN_SCHEDULED_TIMER_HANDLES and
             self._timer_cancelled_count / sched_count >
                 _MIN_CANCELLED_TIMER_HANDLES_FRACTION):
             # Remove delayed calls that were cancelled if their number
             # is too high
             new_scheduled = []
             for handle in self._scheduled:
                 if handle._cancelled:
                     handle._scheduled = False
                 else:
                     new_scheduled.append(handle)

             heapq.heapify(new_scheduled)
             self._scheduled = new_scheduled
             self._timer_cancelled_count = 0


-- 
Terry Jan Reedy

[toc] | [prev] | [standalone]


Back to top | Article view | comp.lang.python


csiph-web