
Negative array indicies and slice()

Started by	andrewr3mail@gmail.com
First post	2012-10-28 20:12 -0700
Last post	2012-11-01 18:08 -0700
Articles	20 on this page of 73 — 16 participants


  Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 20:12 -0700
    Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-28 21:42 -0600
      Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 21:00 -0700
        Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-28 22:25 -0600
          Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-29 00:54 -0700
            Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:18 -0700
            Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 11:19 +0000
              Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-29 22:32 +1100
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-28 21:52 -0700
              Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-29 23:40 +1100
                Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 22:02 +0000
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-28 23:01 -0700
                Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 09:52 -0400
                  Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 08:20 -0700
                  Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:01 -0600
                  Re: Negative array indicies and slice() Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-10-30 00:04 +0000
                  Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 16:54 -0700
                  Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 02:15 -0600
              Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-30 00:53 +1100
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 11:09 -0600
              Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 22:14 +0000
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 08:42 -0700
                Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-30 00:02 +0000
                  Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 12:34 -0700
                    Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-30 08:17 +0000
                      Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-30 08:47 -0700
                      Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 23:48 +0000
                      Re: Negative array indicies and slice() Michael Torrie <torriem@gmail.com> - 2012-10-30 23:29 -0600
                  Re: Negative array indicies and slice() Michael Torrie <torriem@gmail.com> - 2012-10-29 23:53 -0600
                  Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 17:04 -0700
              Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-30 09:55 +1100
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:07 -0600
                Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 19:24 -0400
                  Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:43 -0600
                    Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 20:17 -0400
                  Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 18:05 -0600
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 11:00 -0700
              Re: Negative array indicies and slice() Chris Kaynor <ckaynor@zindagigames.com> - 2012-10-29 18:49 -0700
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 15:39 -0700
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 23:55 -0600
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 00:51 -0600
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 17:17 -0700
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 01:21 -0600
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 01:32 -0600
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 02:46 -0600
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 12:02 -0600
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-30 07:21 -0700
              Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 21:33 +0000
                Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-31 10:07 +0000
                  Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-31 16:01 +0000
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 15:47 -0600
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 15:55 -0600
              Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-31 09:00 +1100
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 16:02 -0600
              Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 23:30 +0000
            Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 11:27 -0600
            Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 16:33 -0700
          Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-29 00:54 -0700
      Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 21:00 -0700
      Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-28 21:09 -0700
        Re: Negative array indicies and slice() alex23 <wuwei23@gmail.com> - 2012-10-28 21:44 -0700
          Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:24 -0700
            Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:37 -0700
              Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:59 -0700
                Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-29 09:36 +0000
                Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 10:34 +0000
              Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:59 -0700
        Re: Negative array indicies and slice() Paul Rubin <no.email@nospam.invalid> - 2012-10-28 22:14 -0700
          Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:08 -0700
            Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:26 -0700
      Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-28 21:09 -0700
    Re: Negative array indicies and slice() MRAB <python@mrabarnett.plus.com> - 2012-10-29 03:45 +0000
    Re: Negative array indicies and slice() 88888 Dihedral <dihedral88888@googlemail.com> - 2012-11-01 18:08 -0700


#32469

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-30 00:51 -0600
Message-ID	<mailman.3076.1351579918.27098.python-list@python.org>
In reply to	#32359

On Mon, Oct 29, 2012 at 4:39 PM, Andrew Robinson
<andrew3@r3dsolutions.com> wrote:
> In addition to those items you mention, of which the reference count is not
> even *inside* the struct -- there is additional debugging information not
> mentioned.  Built in objects contain a "line number", a "column number", and
> a "context" pointer.  These each require a full word of storage.
>
> Also, built in types appear to have a "kind" field which indicates the
> object "type" but is not a pointer.  That suggests two "object" type
> indicators, a generic pointer (probably pointing to "builtin"? somewhere
> outside the struct) and a specific one (an enum) inside the "C" struct.
>
> Inside the tuple struct, I count 4 undocumented words of information.
> Over all, there is a length, the list of pointers, a "kind", "line", "col"
> and "context"; making 6 pieces in total.
>
> Although your comment says the head pointer is not required; I found in
> 3.3.0 that it is a true head pointer; The Tuple() function on line 2069 of
> Python-ast.c, (3.3 version) -- is passed in a pointer called *elts.  That
> pointer is copied into the Tuple struct.

As above, you're looking at the compiler code, which is why you're
finding things like "line" and "column".  The tuple struct is defined
in tupleobject.h and stores tuple elements in a tail array.

> How ironic,  slices don't have debugging info, that's the main reason they
> are smaller.
> When I do slice(3,0,2), surprisingly "Slice()" is NOT called.
> But when I do a[1:2:3] it *IS* called.

Because compiling the latter involves parsing slicing syntax, and
compiling the former does not. :-)
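
The distinction can be seen from Python itself. The sketch below assumes that the standard `ast` module mirrors the node kinds the compiler builds; `top_node` is a hypothetical helper introduced only for this illustration.

```python
import ast


def top_node(source):
    """Parse a single expression and return its top-level AST node."""
    return ast.parse(source, mode="eval").body


subscript = top_node("a[1:2:3]")      # slicing syntax
call = top_node("slice(3, 0, 2)")     # ordinary function call

# The slicing syntax produces a Slice node inside a Subscript node.
print(type(subscript).__name__, type(subscript.slice).__name__)

# The call to slice() is just a Call node whose target is the name "slice".
print(type(call).__name__, call.func.id)
```

Only the first expression involves a Slice node at compile time; the second becomes a slice object only when the call runs.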


#32472

From	Andrew Robinson <andrew3@r3dsolutions.com>
Date	2012-10-29 17:17 -0700
Message-ID	<mailman.3079.1351581543.27098.python-list@python.org>
In reply to	#32359

On 10/29/2012 11:51 PM, Ian Kelly wrote:
> On Mon, Oct 29, 2012 at 4:39 PM, Andrew Robinson
>
> As above, you're looking at the compiler code, which is why you're
> finding things like "line" and "column".  The tuple struct is defined
> in tupleobject.h and stores tuple elements in a tail array.
>

If you re-check my post to Chris, I listed the struct you mention.
The C code is what is actually run (by GDB breakpoint test) when a tuple 
is instantiated.
If the tuple were stripped of the extra data -- then it ought to be as 
small as slice().
But it's not as small -- so either the sys.getsizeof() is lying -- or 
the struct you mention is not complete.

Which?

--Andrew.


#32473

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-30 01:21 -0600
Message-ID	<mailman.3080.1351581739.27098.python-list@python.org>
In reply to	#32359

On Mon, Oct 29, 2012 at 7:49 PM, Chris Kaynor <ckaynor@zindagigames.com> wrote:
> NOTE: The above is taken from reading the source code for Python 2.6.
> For some odd reason, I am getting that an empty tuple consists of 6
> pointer-sized objects (48 bytes on x64), rather than the expected 3
> pointer-sized (24 bytes on x64). Slices are showing up as the expected
> 5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1
> pointer (8 bytes on x64) per item. I imagine I am missing something,
> but cannot figure out what that would be.

I'm likewise seeing 4 extra words in tuples in 32-bit Python 3.3.
What I've found is that for tuples and other collection objects, the
garbage collector tacks on an extra header before the object in
memory.  That header looks like this:

typedef union _gc_head {
    struct {
        union _gc_head *gc_next;
        union _gc_head *gc_prev;
        Py_ssize_t gc_refs;
    } gc;
    long double dummy;  /* force worst-case alignment */
} PyGC_Head;

gc_next and gc_prev implement a doubly-linked list that the garbage
collector uses to explicitly track this object.  gc_refs is used for
counting references during a garbage collection and stores the GC
state of the object otherwise.

I'm not entirely certain why collection objects get this special
treatment, but there you have it.
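
The sizes under discussion can be inspected directly. This is a rough sketch that assumes CPython, where `sys.getsizeof()` includes the GC header for types that carry one; the exact numbers, and whether a slice is reported as tracked, depend on the interpreter version and platform. `report` is a hypothetical helper.

```python
import gc
import struct
import sys

pointer_size = struct.calcsize("P")  # bytes per pointer on this build


def report(obj):
    """Return (type name, size in bytes, tracked by the cyclic GC?)."""
    return (type(obj).__name__, sys.getsizeof(obj), gc.is_tracked(obj))


rows = [report(obj) for obj in (0, (), (1, 2), [], slice(1, 2, 3))]
for name, size, tracked in rows:
    print(f"{name:6} size={size:3} bytes  gc-tracked={tracked}")

# Tuples grow by one pointer per element, as observed in the quoted post.
per_item = sys.getsizeof((1, 2)) - sys.getsizeof((1,))
print("bytes per extra tuple element:", per_item)
```

Comparing the empty tuple's size with three pointer-sized words shows how much the header adds on a given build.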


#32474

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-30 01:32 -0600
Message-ID	<mailman.3081.1351582397.27098.python-list@python.org>
In reply to	#32359

On Mon, Oct 29, 2012 at 6:17 PM, Andrew Robinson
<andrew3@r3dsolutions.com> wrote:
> If you re-check my post to Chris, I listed the struct you mention.
> The C code is what is actually run (by GDB breakpoint test) when a tuple is
> instantiated.

When you were running GDB, were you debugging the interactive
interpreter or a precompiled script?  The interactive interpreter does
a compilation step for every line entered.

> If the tuple were stripped of the extra data -- then it ought to be as small
> as slice().
> But it's not as small -- so either the sys.getsizeof() is lying -- or the
> struct you mention is not complete.

As just explained, the extra 16 bytes are added by the garbage collector.


#32477

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-30 02:46 -0600
Message-ID	<mailman.3083.1351586837.27098.python-list@python.org>
In reply to	#32359

On Tue, Oct 30, 2012 at 1:21 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> I'm not entirely certain why collection objects get this special
> treatment, but there you have it.

Thinking about it some more, this makes sense.  The GC header is there
to support garbage collection for the object.  Atomic types like ints
do not need this header because they do not reference other objects
and so cannot be involved in reference cycles.  For those types,
reference counting is sufficient.  For types like collections that do
reference other objects, garbage collection is needed.

Expanding on this, I suspect it is actually a bug that slice objects
are not tracked by the garbage collector.  The following code appears
to result in a memory leak:

import gc
gc.disable()
while True:
    for i in range(100):
        l = []
        s = slice(l)
        l.append(s)
        del s, l
    _ = gc.collect()

Try running that and watch your Python memory usage climb and climb.
For contrast, replace the slice with a list and observe that memory
usage does *not* climb.  On each iteration, the code constructs a
reference cycle between a slice and a list.  It seems that because
slices are not tracked by the garbage collector, it is unable to break
these cycles.
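
A bounded variant of the same experiment, for anyone who would rather not watch memory climb. This is only a sketch: `Holder` and `cycle_through` are hypothetical names, and a weak reference is used as a probe because neither lists nor slices can be weakly referenced themselves. Whether the slice case reports True depends on the interpreter version being tested.

```python
import gc
import weakref


class Holder:
    """Hypothetical helper: a plain object that supports weak references."""


def cycle_through(wrap):
    """Build holder -> wrapper -> holder, drop it, and report if it was freed."""
    holder = Holder()
    holder.link = wrap(holder)      # the wrapper refers back to the holder
    probe = weakref.ref(holder)
    del holder                      # only the cycle keeps the object alive now
    gc.collect()
    return probe() is None


list_cycle_freed = cycle_through(lambda obj: [obj])
slice_cycle_freed = cycle_through(slice)   # slice(obj) stores obj as stop

print("cycle through a list freed: ", list_cycle_freed)
print("cycle through a slice freed:", slice_cycle_freed)
```

If the second line prints False, the collector could not see through the slice, which is the behaviour described above.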


#32500

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-30 12:02 -0600
Message-ID	<mailman.3096.1351620171.27098.python-list@python.org>
In reply to	#32359

On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
> File a bug report?

Looks like it's already been wontfixed back in 2006:

http://bugs.python.org/issue1501180


#32503

From	Andrew Robinson <andrew3@r3dsolutions.com>
Date	2012-10-30 07:21 -0700
Message-ID	<mailman.3102.1351632022.27098.python-list@python.org>
In reply to	#32359

On 10/30/2012 11:02 AM, Ian Kelly wrote:
> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman<ethan@stoneleaf.us>  wrote:
>> File a bug report?
> Looks like it's already been wontfixed back in 2006:
>
> http://bugs.python.org/issue1501180
Thanks, IAN, you've answered the first of my questions and have been a 
great help.
(And yes, I was debugging interactive mode... I took a nap after writing 
that post, as I realized I had reached my 1 really bad post for the day... )

At least I finally know why Python chooses to implement slice() as a 
separate object from tuple, even if I don't like the implications.

I think there are three main consequences of the present implementation 
of slice():

1) The interpreter code size is made larger with no substantial 
improvement in functionality, which increases debugging effort.
2) No protection against perverted and surprising (are you surprised?! I 
am) memory operation exists.
3) There are memory savings associated with not having garbage collection 
overhead.

D'Aprano mentioned the named values, start, stop, step in a slice() 
which are an API and legacy issue;  These three names must also be 
stored in the interpreter someplace.  Since slice is defined at the "C" 
level as a struct, have you already found these names in the source code 
(hard-coded), or are they part of a .py file associated with the 
interface to the "C" code?


#32506

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2012-10-30 21:33 +0000
Message-ID	<mailman.3105.1351632899.27098.python-list@python.org>
In reply to	#32359

On 30/10/2012 18:02, Ian Kelly wrote:
> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
>> File a bug report?
>
> Looks like it's already been wontfixed back in 2006:
>
> http://bugs.python.org/issue1501180
>

Absolutely bloody typical, turned down because of an idiot.  Who the 
hell is Tim Peters anyway? :)

-- 
Cheers.

Mark Lawrence.


#32522

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2012-10-31 10:07 +0000
Message-ID	<5090f867$0$29967$c3e8da3$5496439d@news.astraweb.com>
In reply to	#32506

On Tue, 30 Oct 2012 21:33:32 +0000, Mark Lawrence wrote:

> On 30/10/2012 18:02, Ian Kelly wrote:
>> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us>
>> wrote:
>>> File a bug report?
>>
>> Looks like it's already been wontfixed back in 2006:
>>
>> http://bugs.python.org/issue1501180
>>
>>
> Absolutely bloody typical, turned down because of an idiot.  Who the
> hell is Tim Peters anyway? :)

I see your smiley, but for the benefit of those who actually don't know 
who Tim Peters, a.k.a. the Timbot, is, he is one of the gurus of Python 
history. He invented Python's astonishingly excellent sort routine, 
Timsort, and popularised the famous adverbial phrase signoffs you will 
see in a lot of older posts.

Basically, he is in the pantheon of early Python demigods.

stop-me-before-i-start-gushing-over-the-timbot-ly y'rs,

-- 
Steven


#32528

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2012-10-31 16:01 +0000
Message-ID	<mailman.3123.1351699337.27098.python-list@python.org>
In reply to	#32522

On 31/10/2012 10:07, Steven D'Aprano wrote:
> On Tue, 30 Oct 2012 21:33:32 +0000, Mark Lawrence wrote:
>
>> Absolutely bloody typical, turned down because of an idiot.  Who the
>> hell is Tim Peters anyway? :)
>
> I see your smiley, but for the benefit of those who actually don't know
> who Tim Peters, a.k.a. the Timbot, is, he is one of the gurus of Python
> history. He invented Python's astonishingly excellent sort routine,
> Timsort, and popularised the famous adverbial phrase signoffs you will
> see in a lot of older posts.
>
> Basically, he is in the pantheon of early Python demigods.
>
> stop-me-before-i-start-gushing-over-the-timbot-ly y'rs,
>

4 / 10, must try harder, the omission of the Zen of Python is considered 
a very serious matter :)

-- 
Cheers.

Mark Lawrence.


#32507

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-30 15:47 -0600
Message-ID	<mailman.3106.1351633661.27098.python-list@python.org>
In reply to	#32359

On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
> On 30/10/2012 18:02, Ian Kelly wrote:
>>
>> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
>>>
>>> File a bug report?
>>
>>
>> Looks like it's already been wontfixed back in 2006:
>>
>> http://bugs.python.org/issue1501180
>>
>
> Absolutely bloody typical, turned down because of an idiot.  Who the hell is
> Tim Peters anyway? :)

I don't really disagree with him, anyway.  It is a rather obscure bug
-- is it worth increasing the memory footprint of slice objects by 80%
in order to fix it?


#32508

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-30 15:55 -0600
Message-ID	<mailman.3107.1351634173.27098.python-list@python.org>
In reply to	#32359

On Tue, Oct 30, 2012 at 8:21 AM, Andrew Robinson
<andrew3@r3dsolutions.com> wrote:
> D'Aprano mentioned the named values, start, stop, step in a slice() which
> are an API and legacy issue;  These three names must also be stored in the
> interpreter someplace.  Since slice is defined at the "C" level as a struct,
> have you already found these names in the source code (hard-coded), or are
> they part of a .py file associated with the interface to the "C" code?

You mean the mapping of Python attribute names to C struct members?
That's in sliceobject.c:

static PyMemberDef slice_members[] = {
    {"start", T_OBJECT, offsetof(PySliceObject, start), READONLY},
    {"stop", T_OBJECT, offsetof(PySliceObject, stop), READONLY},
    {"step", T_OBJECT, offsetof(PySliceObject, step), READONLY},
    {0}
};
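
Seen from the Python side, the READONLY flag in that table means the three names can be read but not assigned. A small sketch (the variable names are only illustrative):

```python
s = slice(1, 10, 2)

# The three struct members are exposed as ordinary attributes.
fields = (s.start, s.stop, s.step)
print(fields)

# Assigning to any of them is rejected because the members are read-only.
try:
    s.start = 0
except AttributeError as exc:
    error = exc
else:
    error = None

print("assignment rejected:", error is not None)
```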


#32509

From	Chris Angelico <rosuav@gmail.com>
Date	2012-10-31 09:00 +1100
Message-ID	<mailman.3108.1351634459.27098.python-list@python.org>
In reply to	#32359

On Wed, Oct 31, 2012 at 8:47 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> On 30/10/2012 18:02, Ian Kelly wrote:
>>>
>>> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
>>>>
>>>> File a bug report?
>>>
>>>
>>> Looks like it's already been wontfixed back in 2006:
>>>
>>> http://bugs.python.org/issue1501180
>>>
>>
>> Absolutely bloody typical, turned down because of an idiot.  Who the hell is
>> Tim Peters anyway? :)
>
> I don't really disagree with him, anyway.  It is a rather obscure bug
> -- is it worth increasing the memory footprint of slice objects by 80%
> in order to fix it?

Bug report: If I take this gun, aim it at my foot, and pull the
trigger, sometimes a hole appears in my foot.

This is hardly normal use of slice objects. And the penalty isn't a
serious one unless you're creating cycles repeatedly.

ChrisA


#32510

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-30 16:02 -0600
Message-ID	<mailman.3109.1351634558.27098.python-list@python.org>
In reply to	#32359

On Tue, Oct 30, 2012 at 3:55 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Tue, Oct 30, 2012 at 8:21 AM, Andrew Robinson
> <andrew3@r3dsolutions.com> wrote:
>> D'Aprano mentioned the named values, start, stop, step in a slice() which
>> are an API and legacy issue;  These three names must also be stored in the
>> interpreter someplace.  Since slice is defined at the "C" level as a struct,
>> have you already found these names in the source code (hard-coded), or are
>> they part of a .py file associated with the interface to the "C" code?
>
> You mean the mapping of Python attribute names to C struct members?
> That's in sliceobject.c:
>
> static PyMemberDef slice_members[] = {
>     {"start", T_OBJECT, offsetof(PySliceObject, start), READONLY},
>     {"stop", T_OBJECT, offsetof(PySliceObject, stop), READONLY},
>     {"step", T_OBJECT, offsetof(PySliceObject, step), READONLY},
>     {0}
> };

Note that the slice API also includes the slice.indices method.

They also implement rich comparisons, but this appears to be done by
copying the data to tuples and comparing the tuples, which is actually
a bit ironic considering this discussion. :-)
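
For reference, a short sketch of both features as they appear from Python. The values are chosen only for illustration.

```python
s = slice(1, 10, 2)

# indices(length) clips the slice to a sequence of the given length and
# returns a (start, stop, step) tuple suitable for range().
clipped = s.indices(5)
reversed_all = slice(None, None, -1).indices(5)
print(clipped, reversed_all)

# range(*indices) visits the same positions that slicing would.
positions = list(range(*clipped))
print(positions, list(range(5))[s])

# Slices compare by value, as tuples of (start, stop, step) would.
same = slice(1, 10, 2) == s
different = slice(1, 10, 3) == s
print(same, different)
```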


#32512

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2012-10-30 23:30 +0000
Message-ID	<mailman.3111.1351639865.27098.python-list@python.org>
In reply to	#32359

On 30/10/2012 21:47, Ian Kelly wrote:
> On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> On 30/10/2012 18:02, Ian Kelly wrote:
>>>
>>> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
>>>>
>>>> File a bug report?
>>>
>>>
>>> Looks like it's already been wontfixed back in 2006:
>>>
>>> http://bugs.python.org/issue1501180
>>>
>>
>> Absolutely bloody typical, turned down because of an idiot.  Who the hell is
>> Tim Peters anyway? :)
>
> I don't really disagree with him, anyway.  It is a rather obscure bug
> -- is it worth increasing the memory footprint of slice objects by 80%
> in order to fix it?
>

Thinking about it I entirely agree.  An 80% increase in memory footprint 
where the slice objects are being used with Python 3.3.0 Unicode would 
have disastrous consequences given the dire state of said Unicode, which 
is why some regular contributors here are giving up with Python and 
using Go.

Oh gosh look at the time, I'm just going for a walk so I can talk with 
the Pixies at the bottom of my garden before they go night nights.

-- 
Cheers.

Mark Lawrence.


#32408

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-29 11:27 -0600
Message-ID	<mailman.3036.1351531699.27098.python-list@python.org>
In reply to	#32345

On Mon, Oct 29, 2012 at 1:54 AM, Andrew <andrewr3mail@gmail.com> wrote:
> My intended inference about the iterator vs. slice question was perhaps not obvious to you. Notice: an iterator is not *allowed* in __getitem__().

Yes, I misconstrued your question.  I thought you wanted to change the
behavior of slicing to wrap around the end when start > stop instead
of returning an empty sequence.  What you actually want is a new
sequence built from indexes supplied by an iterable.  Chris has
already given you a list comprehension solution to solve that.  You
could also use map for this:

new_seq = list(map(old_seq.__getitem__, iterable))

Since you seem to be concerned about performance, I'm not sure in this
case whether the map or the list comprehension will be faster.  I'll
leave you to test that on your intended hardware.
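
A minimal sketch of both approaches side by side, using the range(-4, 3) indexes from earlier in the thread. The sample data and the timing loop count are assumptions made for illustration, and timings will vary by machine.

```python
import timeit

old_seq = list("abcdefg")
indexes = range(-4, 3)   # -4, -3, -2, -1, 0, 1, 2

# Build a new sequence from arbitrary indexes, wrapping past the end.
via_map = list(map(old_seq.__getitem__, indexes))
via_comprehension = [old_seq[i] for i in indexes]
print(via_map)

# Rough timing comparison; treat the numbers as indicative only.
map_time = timeit.timeit(
    lambda: list(map(old_seq.__getitem__, indexes)), number=1000)
comp_time = timeit.timeit(
    lambda: [old_seq[i] for i in indexes], number=1000)
print(f"map: {map_time:.6f}s  comprehension: {comp_time:.6f}s")
```

Both forms produce the same list; which one is faster is best measured on the target interpreter.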

> In 'C', where Python is written, circularly linked lists -- and arrays are both very efficient ways of accessing data.  Arrays can, in fact, have negative indexes -- perhaps contrary to what you thought.  One merely defines a variable to act as the base pointer to the array and initialize it to the *end* of the array. Nor is the size of the data elements an issue, since in Python all classes are accessed by pointers which are of uniform size. I routinely do this in C.

I'm aware of what is possible in C with pointer arithmetic.  This is
Python, though, and Python by design has neither pointers nor pointer
arithmetic.  In any case, initializing the pointer to the end of the
array would still not do what you want, since the positive indices
would then extend past the end of the array.


#32468

From	Andrew Robinson <andrew3@r3dsolutions.com>
Date	2012-10-29 16:33 -0700
Message-ID	<mailman.3075.1351578914.27098.python-list@python.org>
In reply to	#32345

Hi Ian,

There are several interesting/thoughtful things you have written.
I like the way you consider a problem before knee jerk answering.

The copying you mention (or realloc) doesn't re-copy the objects on the 
list.
It merely re-copies the pointer list to those objects. So lets see what 
it would do...

I have seen doubling as the supposed re-alloc method, but I'll assume 
1.25 --
so, 1.25**x = 20million, is 76 copies (max).

The final memory copy would leave about a 30MB hole.
And my version of Python operates initially with a 7MB virtual footprint.

Sooo.... If the garbage collection didn't operate at all, the copying 
would waste around:

 >>> z,w = 30e6,0
 >>> while (z>1): w,z = w+z, z/1.25
...
 >>> print(w)
149999995.8589521

i.e. about 150MB cumulative.
The doubles would amount to 320Megs max.
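
The two figures above can be checked with a closed form. This sketch takes the 1.25 growth factor, the 20 million item target and the 30MB final block from this post as given; they are assumptions of the estimate, not CPython's actual allocation policy.

```python
import math

target_items = 20_000_000
growth = 1.25

# Number of growth steps needed to reach the target: solve growth**x = target.
reallocations = math.ceil(math.log(target_items) / math.log(growth))
print("reallocations:", reallocations)

# Each earlier block is 1/growth the size of the next, so the total copied is
# a geometric series: final * (1 + 1/g + 1/g**2 + ...) = final / (1 - 1/g).
final_block = 30e6
total_copied = final_block / (1 - 1 / growth)
print(f"total copied: about {total_copied / 1e6:.0f} MB")
```

The loop in the post stops once a block falls below one byte, which is why its total comes out slightly under the closed-form limit.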

Not enough to fill virtual memory up; nor even cause a swap on a 2GB 
memory machine.
It can hold everything in memory at once.

So, I don't think Python's memory management is the heart of the problem,
although memory wise-- it does require copying around 50% of the data.

As an implementation issue, though, the large linear array may cause 
wasteful caching/swapping loops, esp, on smaller machines.

On 10/29/2012 10:27 AM, Ian Kelly wrote:
> Yes, I misconstrued your question.  I thought you wanted to change the
> behavior of slicing to wrap around the end when start>  stop instead
> of returning an empty sequence. ...  Chris has
> already given ...  You
> could also use map for this:
>
> new_seq = list(map(old_seq.__getitem__, iterable))
MMM... interesting.

I am not against changing the behavior, but I do want solutions like you 
are offering.
As I am going to implement a python interpreter, in C,  being able to do 
things differently could significantly reduce the interpreter's size.

However, I want to break existing scripts very seldom...

> I'm aware of what is possible in C with pointer arithmetic. This is 
> Python, though, and Python by design has neither pointers nor pointer 
> arithmetic. In any case, initializing the pointer to the end of the 
> array would still not do what you want, since the positive indices 
> would then extend past the end of the array. 

Yes, *and* if you have done assembly language programming -- you know 
that testing for sign is a trivial operation.  It doesn't even require a 
subtraction.  Hence, at the most basic machine level -- changing the 
base pointer *once* during a slice operation is going to be far more 
efficient than performing multiple subtractions from the end of an 
array, as the Python API defines.
I'll leave out further gory details... but it is a Python interpreter 
built in "C" issue.


#32346

From	Andrew <andrewr3mail@gmail.com>
Date	2012-10-29 00:54 -0700
Message-ID	<mailman.2996.1351497278.27098.python-list@python.org>
In reply to	#32337

On Sunday, October 28, 2012 9:26:01 PM UTC-7, Ian wrote:
> On Sun, Oct 28, 2012 at 10:00 PM,  Andrew wrote:
> > Hi Ian,
> > Well, no it really isn't equivalent.
> > Consider a programmer who writes:
> > xrange(-4,3) *wants* [-4,-3,-2,-1,0,1,2]
> >
> > That is the "idea" of a range; for what reason would anyone *EVER* want -4 to +3 to be 6:3???
>
> That is what ranges do, but your question was about slices, not ranges.

Actually, I said in the OP:

"I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't."

=========================

Thank you for the code snippet; I don't think it likely that existing programs depend on nor use a negative index and a positive index expecting to take a small chunk in the center... hence, I would return the whole array; Or if someone said [-len(listX) : len(listX)+1 ] I would return the whole array twice.
That's the maximum that is possible.
If someone could show me a normal/reasonable script which *would* expect the other behavior, I'd like to know; compatibility is important.

=========================

My intended inference about the iterator vs. slice question was perhaps not obvious to you. Notice: an iterator is not *allowed* in __getitem__().

The slice class when passed to __getitem__()  was created to merely pass two numbers and a stride to __getitem__;  As far as I know slice() itself does *nothing* in the actual processing of the elements.  So, it's *redundant* functionality, and far worse, it's restrictive.
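
What __getitem__() actually receives can be checked with a small probe. This is a sketch: `Probe` is a hypothetical class that simply returns its key. It suggests the restriction lives in the built-in sequence types and not in the __getitem__ protocol itself.

```python
class Probe:
    """Hypothetical class whose __getitem__ hands back whatever it was given."""

    def __getitem__(self, key):
        return key


probe = Probe()

# Slicing syntax arrives as a single slice object carrying three values.
print(probe[1:2:3])
print(probe[-4:3])

# A user-defined __getitem__ accepts any object, including a range.
print(probe[range(-4, 3)])

# The built-in list is what rejects a range as an index.
letters = list("abcdefg")
try:
    letters[range(-4, 3)]
except TypeError as exc:
    list_error = exc
else:
    list_error = None
print("list rejected range:", list_error is not None)

# A negative-to-positive slice on a list is empty here, the behaviour that
# started this thread.
negative_to_positive = letters[-4:3]
print(negative_to_positive)
```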

The philosophy of Python is to have exactly one way to do something when possible; so, why create a stand alone class that does nothing an existing class could already do, and do it better ?

A simple list of three values would be just as efficient as slice()!
xrange is more flexible, and can be just as efficient.

So, have I misunderstood the operation of slice()?  I think I might have... but I don't know.
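
A minimal sketch of what __getitem__() actually receives for a[-4:3], and of how the container (not the slice) resolves the negative index once the length is known. It is written in Python 3 syntax (print as a function, range in place of xrange), and the class name Probe is hypothetical, used only for illustration:

```python
class Probe:
    """Hypothetical container that just hands back whatever key it is given."""

    def __getitem__(self, key):
        return key


p = Probe()
key = p[-4:3]
print(key)                   # slice(-4, 3, None)

# The slice object carries no length; the caller supplies one to indices().
print(key.indices(10))       # (6, 3, 1)

a = list(range(1, 11))
start, stop, step = key.indices(len(a))
print(a[start:stop:step])    # []
```

The slice object is only a carrier for start, stop and step; the conversion of -4 to 6 happens when a length is supplied, which is why a[-4:3] comes out empty for a list of ten items.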

In 'C', in which Python is written, circularly linked lists and arrays are both very efficient ways of accessing data.  Arrays can, in fact, have negative indexes -- perhaps contrary to what you thought.  One merely defines a variable to act as the base pointer to the array and initializes it to the *end* of the array. Nor is the size of the data elements an issue, since in Python all classes are accessed by pointers which are of uniform size. I routinely do this in C.

Consider, also, that xrange() does not actually create a list -- but merely an iterator generating integers which is exactly what __getitem__ works on.
So, xrange() does not need to incur a memory or noticeable time penalty.

From micro-python, it's clear that their implementation of xrange() is at the 'C' level, which is extremely fast.


#32334

From	andrewr3mail@gmail.com
Date	2012-10-28 21:00 -0700
Message-ID	<mailman.2988.1351483258.27098.python-list@python.org>
In reply to	#32331

On Sunday, October 28, 2012 8:43:30 PM UTC-7, Ian wrote:
> On Sun, Oct 28, 2012 at 9:12 PM, andrew wrote:
> > The slice operator does not give any way (I can find!) to take slices from negative to positive indexes, although the range is not empty, nor the expected indexes out of range that I am supplying.
> >
> > Many programs that I write would require introducing variables and logical statements to correct the problem which is very lengthy and error prone unless there is a simple work around.
> >
> > I *hate* replicating code every time I need to do this!
> >
> > I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't.
> >
> > Here's an example for Linux shell, otherwise remove /bin/env...
> > {{{#!/bin/env python
> > a=[1,2,3,4,5,6,7,8,9,10]
> > print a[-4:3]  # I am interested in getting [7,8,9,10,1,2] but I get [].
> > }}}
>
> For a sequence of length 10, "a[-4:3]" is equivalent to "a[6:3]",
> which is an empty slice since index 6 is after index 3.
>
> If you want it to wrap around, then take two slices and concatenate
> them with "a[-4:] + a[:3]".

Hi Ian,
Well, no it really isn't equivalent.
Consider a programmer who writes:
xrange(-4,3) *wants* [-4,-3,-2,-1,0,1,2]

That is the "idea" of a range; for what reason would anyone *EVER* want -4 to +3 to be 6:3???

I do agree that the data held in -4 is equivalent to the data in 6, but the index is not the same.

So: Why does Python choose to convert them to positive indexes, and have slice operate differently than xrange -- for the slice() object can't possibly know the size of the array when it is passed in to __getitem__; they are totally separate classes.

I realize I can concatenate two slice ranges, BUT, the ranges do not always span from negative to positive.

eg: a line in my program reads:
a[x-5:x]

if x is 7, then this is a positive index to a positive index.
So, there is no logic to using two slices concatenated!

I use this arbitrary range code *often* so I need a general purpose solution.
I looked up slice() but the help is of no use; I don't even know how I might overload it to embed some logic to concatenate ranges of data, or even whether it is possible.
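
One general-purpose way to get the wrap-around behaviour asked for here, sketched under the assumption that every index in range(start, stop) should simply be taken modulo the length. The helper wrap_slice is hypothetical (it is not part of Python or of slice()), and the code uses Python 3 syntax:

```python
def wrap_slice(seq, start, stop):
    """Return the items at indexes start..stop-1, wrapping around the ends.

    Hypothetical helper: each index is reduced modulo len(seq), so a span
    that crosses zero needs no special casing by the caller.
    """
    n = len(seq)
    if n == 0:
        return []
    return [seq[i % n] for i in range(start, stop)]


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print(wrap_slice(a, -4, 3))        # [7, 8, 9, 10, 1, 2, 3]

# The a[x-5:x] case from the post, with and without crossing zero.
for x in (7, 2):
    print(x, wrap_slice(a, x - 5, x))
# 7 [3, 4, 5, 6, 7]
# 2 [8, 9, 10, 1, 2]
```

For the ten-item list this gives the same seven items as a[-4:] + a[:3], and the same call works whether or not the span crosses index 0. It builds a new list element by element, so it will be slower than a native slice on large sequences.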


#32335

From	Andrew <andrewr3mail@gmail.com>
Date	2012-10-28 21:09 -0700
Message-ID	<bb922c4b-08b5-4e93-9452-5306b2d953a9@googlegroups.com>
In reply to	#32331

On Sunday, October 28, 2012 8:43:30 PM UTC-7, Ian wrote:
> On Sun, Oct 28, 2012 at 9:12 PM,  <Andrew> wrote:
> > The slice operator does not give any way (I can find!) to take slices from negative to positive indexes, although the range is not empty, nor the expected indexes out of range that I am supplying.
> >
> > Many programs that I write would require introducing variables and logical statements to correct the problem which is very lengthy and error prone unless there is a simple work around.
> >
> > I *hate* replicating code every time I need to do this!
> >
> > I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't.
> >
> > Here's an example for Linux shell, otherwise remove /bin/env...
> > {{{#!/bin/env python
> > a=[1,2,3,4,5,6,7,8,9,10]
> > print a[-4:3]  # I am interested in getting [7,8,9,10,1,2] but I get [].
> > }}}
>
> For a sequence of length 10, "a[-4:3]" is equivalent to "a[6:3]",
> which is an empty slice since index 6 is after index 3.
>
> If you want it to wrap around, then take two slices and concatenate
> them with "a[-4:] + a[:3]".

Hi Ian,
Well, no it really isn't equivalent; although Python implements it as equivalent.

Consider a programmer who writes:
xrange(-4,3) 

They clearly *want* [-4,-3,-2,-1,0,1,2]

That is the "idea" of a range; So, for what reason would anyone want -4 to +3 to be 6:3???  Can you show me some code where this is desirable??

I do agree that the data held in -4 is equivalent to the data in 6, but the index is not the same.

So: Why does Python choose to convert them to positive indexes, and have slice operate differently than xrange -- for the slice() object can't possibly know the size of the array when it is passed in to __getitem__; they are totally separate classes.

I realize I can concatenate two slice ranges, BUT, the ranges do not always span from negative to positive.

eg: a line in my program reads:
a[x-5:x]

if x is 7, then this is a positive index to a positive index.
So, there is no logic to using two slices concatenated!

I use this arbitrary range code *often* so I need a general purpose solution.
I looked up slice() but the help is of no use; I don't even know how I might overload it to embed some logic to concatenate ranges of data, or even whether it is possible.
