| Started by | andrewr3mail@gmail.com |
|---|---|
| First post | 2012-10-28 20:12 -0700 |
| Last post | 2012-11-01 18:08 -0700 |
| Articles | 20 on this page of 73 — 16 participants |
Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 20:12 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-28 21:42 -0600
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 21:00 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-28 22:25 -0600
Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-29 00:54 -0700
Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:18 -0700
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 11:19 +0000
Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-29 22:32 +1100
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-28 21:52 -0700
Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-29 23:40 +1100
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 22:02 +0000
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-28 23:01 -0700
Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 09:52 -0400
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 08:20 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:01 -0600
Re: Negative array indicies and slice() Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-10-30 00:04 +0000
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 16:54 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 02:15 -0600
Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-30 00:53 +1100
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 11:09 -0600
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 22:14 +0000
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 08:42 -0700
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-30 00:02 +0000
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 12:34 -0700
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-30 08:17 +0000
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-30 08:47 -0700
Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 23:48 +0000
Re: Negative array indicies and slice() Michael Torrie <torriem@gmail.com> - 2012-10-30 23:29 -0600
Re: Negative array indicies and slice() Michael Torrie <torriem@gmail.com> - 2012-10-29 23:53 -0600
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 17:04 -0700
Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-30 09:55 +1100
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:07 -0600
Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 19:24 -0400
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:43 -0600
Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 20:17 -0400
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 18:05 -0600
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 11:00 -0700
Re: Negative array indicies and slice() Chris Kaynor <ckaynor@zindagigames.com> - 2012-10-29 18:49 -0700
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 15:39 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 23:55 -0600
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 00:51 -0600
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 17:17 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 01:21 -0600
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 01:32 -0600
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 02:46 -0600
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 12:02 -0600
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-30 07:21 -0700
Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 21:33 +0000
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-31 10:07 +0000
Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-31 16:01 +0000
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 15:47 -0600
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 15:55 -0600
Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-31 09:00 +1100
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 16:02 -0600
Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 23:30 +0000
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 11:27 -0600
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 16:33 -0700
Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-29 00:54 -0700
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 21:00 -0700
Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-28 21:09 -0700
Re: Negative array indicies and slice() alex23 <wuwei23@gmail.com> - 2012-10-28 21:44 -0700
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:24 -0700
Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:37 -0700
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:59 -0700
Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-29 09:36 +0000
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 10:34 +0000
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:59 -0700
Re: Negative array indicies and slice() Paul Rubin <no.email@nospam.invalid> - 2012-10-28 22:14 -0700
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:08 -0700
Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:26 -0700
Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-28 21:09 -0700
Re: Negative array indicies and slice() MRAB <python@mrabarnett.plus.com> - 2012-10-29 03:45 +0000
Re: Negative array indicies and slice() 88888 Dihedral <dihedral88888@googlemail.com> - 2012-11-01 18:08 -0700
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-30 00:51 -0600 |
| Message-ID | <mailman.3076.1351579918.27098.python-list@python.org> |
| In reply to | #32359 |
On Mon, Oct 29, 2012 at 4:39 PM, Andrew Robinson <andrew3@r3dsolutions.com> wrote:
> In addition to those items you mention, of which the reference count is not
> even *inside* the struct -- there is additional debugging information not
> mentioned. Built in objects contain a "line number", a "column number", and
> a "context" pointer. These each require a full word of storage.
>
> Also, built in types appear to have a "kind" field which indicates the
> object "type" but is not a pointer. That suggests two "object" type
> indicators, a generic pointer (probably pointing to "builtin"? somewhere
> outside the struct) and a specific one (an enum) inside the "C" struct.
>
> Inside the tuple struct, I count 4 undocumented words of information.
> Over all, there is a length, the list of pointers, a "kind", "line", "col"
> and "context"; making 6 pieces in total.
>
> Although your comment says the head pointer is not required; I found in
> 3.3.0 that it is a true head pointer; The Tuple() function on line 2069 of
> Python-ast.c, (3.3 version) -- is passed in a pointer called *elts. That
> pointer is copied into the Tuple struct.
As above, you're looking at the compiler code, which is why you're finding things like "line" and "column". The tuple struct is defined in tupleobject.h and stores tuple elements in a tail array.
> How ironic, slices don't have debugging info, that's the main reason they
> are smaller.
> When I do slice(3,0,2), surprisingly "Slice()" is NOT called.
> But when I do a[1:2:3] it *IS* called.
Because compiling the latter involves parsing slicing syntax, and compiling the former does not. :-)
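The distinction between slicing syntax and a call to the slice builtin can be seen from Python itself. The following is a minimal sketch, assuming a CPython 3 interpreter with the standard `ast` module; `node_names` is a hypothetical helper introduced only for this illustration. It parses both spellings and lists the syntax-tree node types each one produces.

```python
import ast

def node_names(source):
    """Return the set of AST node class names produced by parsing `source`."""
    return {type(node).__name__ for node in ast.walk(ast.parse(source))}

subscript_nodes = node_names("a[1:2:3]")
call_nodes = node_names("slice(3, 0, 2)")

# Slicing syntax yields a Slice node; the builtin call is an ordinary Call.
print("Slice" in subscript_nodes, "Slice" in call_nodes)
print("Call" in call_nodes)
```

Only the subscript form contains a Slice node, which is consistent with a breakpoint on the compiler's Slice() constructor firing for a[1:2:3] and not for slice(3,0,2).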
| From | Andrew Robinson <andrew3@r3dsolutions.com> |
|---|---|
| Date | 2012-10-29 17:17 -0700 |
| Message-ID | <mailman.3079.1351581543.27098.python-list@python.org> |
| In reply to | #32359 |
On 10/29/2012 11:51 PM, Ian Kelly wrote:
> On Mon, Oct 29, 2012 at 4:39 PM, Andrew Robinson
>
> As above, you're looking at the compiler code, which is why you're
> finding things like "line" and "column". The tuple struct is defined
> in tupleobject.h and stores tuple elements in a tail array.
>
If you re-check my post to Chris, I listed the struct you mention.
The C code is what is actually run (by GDB breakpoint test) when a tuple is instantiated.
If the tuple were stripped of the extra data -- then it ought to be as small as slice().
But it's not as small -- so either the sys.getsizeof() is lying -- or the struct you mention is not complete.
Which?
--Andrew.
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-30 01:21 -0600 |
| Message-ID | <mailman.3080.1351581739.27098.python-list@python.org> |
| In reply to | #32359 |
On Mon, Oct 29, 2012 at 7:49 PM, Chris Kaynor <ckaynor@zindagigames.com> wrote:
> NOTE: The above is taken from reading the source code for Python 2.6.
> For some odd reason, I am getting that an empty tuple consists of 6
> pointer-sized objects (48 bytes on x64), rather than the expected 3
> pointer-sized (24 bytes on x64). Slices are showing up as the expected
> 5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1
> pointer (8 bytes on x64) per item. I imagine I am missing something,
> but cannot figure out what that would be.
I'm likewise seeing 4 extra words in tuples in 32-bit Python 3.3.
What I've found is that for tuples and other collection objects, the
garbage collector tacks on an extra header before the object in
memory. That header looks like this:
typedef union _gc_head {
    struct {
        union _gc_head *gc_next;
        union _gc_head *gc_prev;
        Py_ssize_t gc_refs;
    } gc;
    long double dummy; /* force worst-case alignment */
} PyGC_Head;
gc_next and gc_prev implement a doubly-linked list that the garbage
collector uses to explicitly track this object. gc_refs is used for
counting references during a garbage collection and stores the GC
state of the object otherwise.
I'm not entirely certain why collection objects get this special
treatment, but there you have it.
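The effect of the GC header can be observed from Python. This is a rough sketch, assuming CPython, where sys.getsizeof() adds the garbage collector overhead for objects the collector manages. The absolute sizes depend on the version and platform, so none are stated here; only the per-element growth of a tuple and the tracking status of a list and an int are expected to be stable.

```python
import gc
import struct
import sys

POINTER_SIZE = struct.calcsize("P")  # bytes per pointer on this build

# Size of tuples of increasing length; each element should add one pointer.
tuple_sizes = [sys.getsizeof(tuple(range(n))) for n in range(4)]
per_item_growth = [b - a for a, b in zip(tuple_sizes, tuple_sizes[1:])]

print("pointer size:", POINTER_SIZE)
print("tuple sizes :", tuple_sizes)
print("slice size  :", sys.getsizeof(slice(1, 2, 3)))

# Containers that can hold arbitrary objects are tracked; ints are not.
print("list tracked :", gc.is_tracked([]))
print("int tracked  :", gc.is_tracked(0))
print("slice tracked:", gc.is_tracked(slice(None)))  # may vary by version
```

Whether a slice reports as tracked may differ between interpreter versions, so that line is printed without any assumption about the answer.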
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-30 01:32 -0600 |
| Message-ID | <mailman.3081.1351582397.27098.python-list@python.org> |
| In reply to | #32359 |
On Mon, Oct 29, 2012 at 6:17 PM, Andrew Robinson <andrew3@r3dsolutions.com> wrote:
> If you re-check my post to chris, I listed the struct you mention.
> The C code is what is actually run (by GDB breakpoint test) when a tuple is
> instantiated.
When you were running GDB, were you debugging the interactive interpreter or a precompiled script? The interactive interpreter does a compilation step for every line entered.
> If the tuple were stripped of the extra data -- then it ought to be as small
> as slice().
> But it's not as small -- so either the sys.getsizeof() is lying -- or the
> struct you mention is not complete.
As just explained, the extra 16 bytes are added by the garbage collector.
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-30 02:46 -0600 |
| Message-ID | <mailman.3083.1351586837.27098.python-list@python.org> |
| In reply to | #32359 |
On Tue, Oct 30, 2012 at 1:21 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> I'm not entirely certain why collection objects get this special
> treatment, but there you have it.
Thinking about it some more, this makes sense. The GC header is there
to support garbage collection for the object. Atomic types like ints
do not need this header because they do not reference other objects
and so cannot be involved in reference cycles. For those types,
reference counting is sufficient. For types like collections that do
reference other objects, garbage collection is needed.
Expanding on this, I suspect it is actually a bug that slice objects
are not tracked by the garbage collector. The following code appears
to result in a memory leak:
import gc
gc.disable()
while True:
    for i in range(100):
        l = []
        s = slice(l)
        l.append(s)
        del s, l
    _ = gc.collect()
Try running that and watch your Python memory usage climb and climb.
For contrast, replace the slice with a list and observe that memory
usage does *not* climb. On each iteration, the code constructs a
reference cycle between a slice and a list. It seems that because
slices are not tracked by the garbage collector, it is unable to break
these cycles.
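A bounded variant of the same experiment avoids the infinite loop. This is a sketch under assumptions: `Node` and `cycle_is_collected` are hypothetical helpers, and a weak reference is used as a probe because lists and slices cannot be weakly referenced directly. The list case is expected to be collected; the slice case is printed without a stated result because it depends on whether the running interpreter tracks slice objects.

```python
import gc
import weakref

class Node:
    """Hypothetical helper: a weak-referenceable object to anchor a cycle."""

def cycle_is_collected(make_container):
    """Build node -> container -> node and report whether gc frees it."""
    node = Node()
    node.container = make_container(node)  # closes the reference cycle
    probe = weakref.ref(node)
    del node
    gc.collect()
    return probe() is None

print("list cycle collected :", cycle_is_collected(lambda obj: [obj]))
print("slice cycle collected:", cycle_is_collected(slice))
```

If the second line prints False, the interpreter behaves as described in this post: the cycle through the slice is invisible to the collector.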
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-30 12:02 -0600 |
| Message-ID | <mailman.3096.1351620171.27098.python-list@python.org> |
| In reply to | #32359 |
On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
> File a bug report?
Looks like it's already been wontfixed back in 2006:
http://bugs.python.org/issue1501180
| From | Andrew Robinson <andrew3@r3dsolutions.com> |
|---|---|
| Date | 2012-10-30 07:21 -0700 |
| Message-ID | <mailman.3102.1351632022.27098.python-list@python.org> |
| In reply to | #32359 |
On 10/30/2012 11:02 AM, Ian Kelly wrote:
> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman<ethan@stoneleaf.us> wrote:
>> File a bug report?
> Looks like it's already been wontfixed back in 2006:
>
> http://bugs.python.org/issue1501180
Thanks, IAN, you've answered the first of my questions and have been a great help.
(And yes, I was debugging interactive mode... I took a nap after writing that post, as I realized I had reached my 1 really bad post for the day... )
At least I finally know why Python chooses to implement slice() as a separate object from tuple; even if I don't like the implications.
I think there are three main consequences of the present implementation of slice():
1) The interpreter code size is made larger with no substantial improvement in functionality, which increases debugging effort.
2) No protection against perverted and surprising (are you surprised?! I am) memory operation exists.
3) There is a memory saving associated with not having garbage collection overhead.
D'Aprano mentioned the named values, start, stop, step in a slice() which are an API and legacy issue; These three names must also be stored in the interpreter someplace. Since slice is defined at the "C" level as a struct, have you already found these names in the source code (hard-coded), or are they part of a .py file associated with the interface to the "C" code?
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2012-10-30 21:33 +0000 |
| Message-ID | <mailman.3105.1351632899.27098.python-list@python.org> |
| In reply to | #32359 |
On 30/10/2012 18:02, Ian Kelly wrote:
> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
>> File a bug report?
>
> Looks like it's already been wontfixed back in 2006:
>
> http://bugs.python.org/issue1501180
>
Absolutely bloody typical, turned down because of an idiot. Who the hell is Tim Peters anyway? :)
--
Cheers.
Mark Lawrence.
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-10-31 10:07 +0000 |
| Message-ID | <5090f867$0$29967$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #32506 |
On Tue, 30 Oct 2012 21:33:32 +0000, Mark Lawrence wrote:
> On 30/10/2012 18:02, Ian Kelly wrote:
>> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us>
>> wrote:
>>> File a bug report?
>>
>> Looks like it's already been wontfixed back in 2006:
>>
>> http://bugs.python.org/issue1501180
>>
> Absolutely bloody typical, turned down because of an idiot. Who the
> hell is Tim Peters anyway? :)
I see your smiley, but for the benefit of those who actually don't know who Tim Peters, a.k.a. the Timbot, is, he is one of the gurus of Python history. He invented Python's astonishingly excellent sort routine, Timsort, and popularised the famous adverbial phrase signoffs you will see in a lot of older posts.
Basically, he is in the pantheon of early Python demigods.
stop-me-before-i-start-gushing-over-the-timbot-ly y'rs,
--
Steven
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2012-10-31 16:01 +0000 |
| Message-ID | <mailman.3123.1351699337.27098.python-list@python.org> |
| In reply to | #32522 |
On 31/10/2012 10:07, Steven D'Aprano wrote:
> On Tue, 30 Oct 2012 21:33:32 +0000, Mark Lawrence wrote:
>
>> Absolutely bloody typical, turned down because of an idiot. Who the
>> hell is Tim Peters anyway? :)
>
> I see your smiley, but for the benefit of those who actually don't know
> who Tim Peters, a.k.a. the Timbot, is, he is one of the gurus of Python
> history. He invented Python's astonishingly excellent sort routine,
> Timsort, and popularised the famous adverbial phrase signoffs you will
> see in a lot of older posts.
>
> Basically, he is in the pantheon of early Python demigods.
>
> stop-me-before-i-start-gushing-over-the-timbot-ly y'rs,
>
4 / 10, must try harder, the omission of the Zen of Python is considered a very serious matter :)
--
Cheers.
Mark Lawrence.
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-30 15:47 -0600 |
| Message-ID | <mailman.3106.1351633661.27098.python-list@python.org> |
| In reply to | #32359 |
On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
> On 30/10/2012 18:02, Ian Kelly wrote:
>>
>> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
>>>
>>> File a bug report?
>>
>> Looks like it's already been wontfixed back in 2006:
>>
>> http://bugs.python.org/issue1501180
>>
>
> Absolutely bloody typical, turned down because of an idiot. Who the hell is
> Tim Peters anyway? :)
I don't really disagree with him, anyway. It is a rather obscure bug -- is it worth increasing the memory footprint of slice objects by 80% in order to fix it?
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-30 15:55 -0600 |
| Message-ID | <mailman.3107.1351634173.27098.python-list@python.org> |
| In reply to | #32359 |
On Tue, Oct 30, 2012 at 8:21 AM, Andrew Robinson
<andrew3@r3dsolutions.com> wrote:
> D'Aprano mentioned the named values, start, stop, step in a slice() which
> are an API and legacy issue; These three names must also be stored in the
> interpreter someplace. Since slice is defined at the "C" level as a struct,
> have you already found these names in the source code (hard-coded), or are
> they part of a .py file associated with the interface to the "C" code?
You mean the mapping of Python attribute names to C struct members?
That's in sliceobject.c:
static PyMemberDef slice_members[] = {
    {"start", T_OBJECT, offsetof(PySliceObject, start), READONLY},
    {"stop", T_OBJECT, offsetof(PySliceObject, stop), READONLY},
    {"step", T_OBJECT, offsetof(PySliceObject, step), READONLY},
    {0}
};
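Seen from Python, that member table corresponds to three read-only attributes. A minimal sketch, assuming CPython 3, where assigning to a READONLY member raises AttributeError:

```python
s = slice(1, 10, 2)
print(s.start, s.stop, s.step)

try:
    s.start = 5  # the member table marks these READONLY
except AttributeError as exc:
    print("cannot assign:", exc)
```

So the names live in the C source as string literals in the member table, not in a separate .py file.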
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-10-31 09:00 +1100 |
| Message-ID | <mailman.3108.1351634459.27098.python-list@python.org> |
| In reply to | #32359 |
On Wed, Oct 31, 2012 at 8:47 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> On 30/10/2012 18:02, Ian Kelly wrote:
>>>
>>> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
>>>>
>>>> File a bug report?
>>>
>>> Looks like it's already been wontfixed back in 2006:
>>>
>>> http://bugs.python.org/issue1501180
>>>
>>
>> Absolutely bloody typical, turned down because of an idiot. Who the hell is
>> Tim Peters anyway? :)
>
> I don't really disagree with him, anyway. It is a rather obscure bug
> -- is it worth increasing the memory footprint of slice objects by 80%
> in order to fix it?
Bug report: If I take this gun, aim it at my foot, and pull the trigger, sometimes a hole appears in my foot.
This is hardly normal use of slice objects. And the penalty isn't a serious one unless you're creating cycles repeatedly.
ChrisA
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-30 16:02 -0600 |
| Message-ID | <mailman.3109.1351634558.27098.python-list@python.org> |
| In reply to | #32359 |
On Tue, Oct 30, 2012 at 3:55 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Tue, Oct 30, 2012 at 8:21 AM, Andrew Robinson
> <andrew3@r3dsolutions.com> wrote:
>> D'Aprano mentioned the named values, start, stop, step in a slice() which
>> are an API and legacy issue; These three names must also be stored in the
>> interpreter someplace. Since slice is defined at the "C" level as a struct,
>> have you already found these names in the source code (hard-coded), or are
>> they part of a .py file associated with the interface to the "C" code?
>
> You mean the mapping of Python attribute names to C struct members?
> That's in sliceobject.c:
>
> static PyMemberDef slice_members[] = {
>     {"start", T_OBJECT, offsetof(PySliceObject, start), READONLY},
>     {"stop", T_OBJECT, offsetof(PySliceObject, stop), READONLY},
>     {"step", T_OBJECT, offsetof(PySliceObject, step), READONLY},
>     {0}
> };
Note that the slice API also includes the slice.indices method.
They also implement rich comparisons, but this appears to be done by
copying the data to tuples and comparing the tuples, which is actually
a bit ironic considering this discussion. :-)
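Both parts of the API mentioned here can be tried directly. A short sketch, assuming CPython 3: slice.indices() resolves negative or missing bounds against a given length, and comparisons give the same answers as comparing (start, stop, step) tuples would.

```python
s = slice(-4, None)
# indices(length) resolves negative and missing bounds for a given length.
print(s.indices(10))

# Comparisons behave like comparing (start, stop, step) tuples.
print(slice(1, 2, 3) == slice(1, 2, 3))
print(slice(1, 2) < slice(1, 3))
```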
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2012-10-30 23:30 +0000 |
| Message-ID | <mailman.3111.1351639865.27098.python-list@python.org> |
| In reply to | #32359 |
On 30/10/2012 21:47, Ian Kelly wrote:
> On Tue, Oct 30, 2012 at 3:33 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> On 30/10/2012 18:02, Ian Kelly wrote:
>>>
>>> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
>>>>
>>>> File a bug report?
>>>
>>> Looks like it's already been wontfixed back in 2006:
>>>
>>> http://bugs.python.org/issue1501180
>>>
>>
>> Absolutely bloody typical, turned down because of an idiot. Who the hell is
>> Tim Peters anyway? :)
>
> I don't really disagree with him, anyway. It is a rather obscure bug
> -- is it worth increasing the memory footprint of slice objects by 80%
> in order to fix it?
>
Thinking about it I entirely agree. An 80% increase in memory footprint where the slice objects are being used with Python 3.3.0 Unicode would have disastrous consequences given the dire state of said Unicode, which is why some regular contributors here are giving up with Python and using Go.
Oh gosh look at the time, I'm just going for a walk so I can talk with the Pixies at the bottom of my garden before they go night nights.
--
Cheers.
Mark Lawrence.
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-29 11:27 -0600 |
| Message-ID | <mailman.3036.1351531699.27098.python-list@python.org> |
| In reply to | #32345 |
On Mon, Oct 29, 2012 at 1:54 AM, Andrew <andrewr3mail@gmail.com> wrote:
> My intended inferences about the iterator vs. slice question was perhaps not obvious to you; Notice: an iterator is not *allowed* in __getitem__().
Yes, I misconstrued your question. I thought you wanted to change the behavior of slicing to wrap around the end when start > stop instead of returning an empty sequence. What you actually want is a new sequence built from indexes supplied by an iterable. Chris has already given you a list comprehension solution to solve that. You could also use map for this:
new_seq = list(map(old_seq.__getitem__, iterable))
Since you seem to be concerned about performance, I'm not sure in this case whether the map or the list comprehension will be faster. I'll leave you to test that on your intended hardware.
> In 'C', where Python is written, circularly linked lists -- and arrays are both very efficient ways of accessing data. Arrays can, in fact, have negative indexes -- perhaps contrary to what you thought. One merely defines a variable to act as the base pointer to the array and initialize it to the *end* of the array. Nor is the size of the data elements an issue, since in Python all classes are accessed by pointers which are of uniform size. I routinely do this in C.
I'm aware of what is possible in C with pointer arithmetic. This is Python, though, and Python by design has neither pointers nor pointer arithmetic. In any case, initializing the pointer to the end of the array would still not do what you want, since the positive indices would then extend past the end of the array.
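The two approaches mentioned in this reply can be compared side by side. A minimal sketch, using the ten-element list from the original question; the variable names `indexes`, `via_map` and `via_comprehension` are assumptions chosen for illustration.

```python
old_seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
indexes = range(-4, 3)  # any iterable of valid indexes works

via_map = list(map(old_seq.__getitem__, indexes))
via_comprehension = [old_seq[i] for i in indexes]

print(via_map)
```

Both forms produce the same list; which one is faster is best measured on the target machine, for example with the standard timeit module.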
| From | Andrew Robinson <andrew3@r3dsolutions.com> |
|---|---|
| Date | 2012-10-29 16:33 -0700 |
| Message-ID | <mailman.3075.1351578914.27098.python-list@python.org> |
| In reply to | #32345 |
Hi Ian,
There are several interesting/thoughtful things you have written.
I like the way you consider a problem before knee jerk answering.
The copying you mention (or realloc) doesn't re-copy the objects on the list.
It merely re-copies the pointer list to those objects. So let's see what it would do...
I have seen doubling as the supposed re-alloc method, but I'll assume 1.25 --
so, 1.25**x = 20million, is 76 copies (max).
The final memory copy would leave about a 30MB hole.
And my version of Python operates initially with a 7MB virtual footprint.
Sooo.... If the garbage collection didn't operate at all, the copying would waste around:
>>> z,w = 30e6,0
>>> while (z>1): w,z = w+z, z/1.25
...
>>> print(w)
149999995.8589521
e.g. 150MB cumulative.
The doubles would amount to 320Megs max.
Not enough to fill virtual memory up; nor even cause a swap on a 2GB memory machine.
It can hold everything in memory at once.
So, I don't think Python's memory management is the heart of the problem, although memory-wise it does require copying around 50% of the data.
As an implementation issue, though, the large linear array may cause wasteful caching/swapping loops, esp. on smaller machines.
On 10/29/2012 10:27 AM, Ian Kelly wrote:
> Yes, I misconstrued your question. I thought you wanted to change the
> behavior of slicing to wrap around the end when start> stop instead
> of returning an empty sequence.
... Chris has
> already given ... You
> could also use map for this:
>
> new_seq = list(map(old_seq.__getitem__, iterable))
MMM... interesting. I am not against changing the behavior, but I do want solutions like you are offering.
As I am going to implement a python interpreter, in C, being able to do things differently could significantly reduce the interpreter's size.
However, I want to break existing scripts very seldom...
> I'm aware of what is possible in C with pointer arithmetic. This is
> Python, though, and Python by design has neither pointers nor pointer
> arithmetic. In any case, initializing the pointer to the end of the
> array would still not do what you want, since the positive indices
> would then extend past the end of the array.
Yes, *and* if you have done assembly language programming -- you know that testing for sign is a trivial operation. It doesn't even require a subtraction. Hence, at the most basic machine level -- changing the base pointer *once* during a slice operation is going to be far more efficient than performing multiple subtractions from the end of an array, as the Python API defines. I'll leave out further gory details... but it is a Python interpreter built in "C" issue.
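The arithmetic in this post can be checked with a short script. This is only a sketch of the post's own estimate, not a measurement of how CPython actually resizes lists: the growth factor 1.25, the 20 million items and the 30e6-byte final copy are the figures assumed in the post, and the constant names are introduced here for readability.

```python
import math

GROWTH = 1.25          # growth factor assumed in the post
TARGET_ITEMS = 20e6    # list length assumed in the post
FINAL_COPY = 30e6      # bytes, the "30MB hole" from the post

# Number of growth steps needed: smallest x with GROWTH**x >= TARGET_ITEMS.
copies = math.ceil(math.log(TARGET_ITEMS) / math.log(GROWTH))

# Same loop as the interactive session in the post.
z, w = FINAL_COPY, 0.0
while z > 1:
    w, z = w + z, z / GROWTH

# Closed form of the geometric series z * (1 + 1/g + 1/g**2 + ...).
closed_form = FINAL_COPY / (1 - 1 / GROWTH)

print(copies, w, closed_form)
```

The loop total falls just short of the closed-form value of about 150e6 because the loop stops once the remaining term drops to 1 or below.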
| From | Andrew <andrewr3mail@gmail.com> |
|---|---|
| Date | 2012-10-29 00:54 -0700 |
| Message-ID | <mailman.2996.1351497278.27098.python-list@python.org> |
| In reply to | #32337 |
On Sunday, October 28, 2012 9:26:01 PM UTC-7, Ian wrote:
> On Sun, Oct 28, 2012 at 10:00 PM, Andrew wrote:
> > Hi Ian,
> > Well, no it really isn't equivalent.
> > Consider a programmer who writes:
> > xrange(-4,3) *wants* [-4,-3,-2,-1,0,1,2]
> >
> > That is the "idea" of a range; for what reason would anyone *EVER* want -4 to +3 to be 6:3???
>
> That is what ranges do, but your question was about slices, not ranges.
Actually, I said in the OP:
"I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't."
=========================
Thank you for the code snippet; I don't think it likely that existing programs depend on nor use a negative index and a positive index expecting to take a small chunk in the center... hence, I would return the whole array; Or if someone said [-len(listX) : len(listX)+1 ] I would return the whole array twice.
That's the maximum that is possible.
If someone could show me a normal/reasonable script which *would* expect the other behavior, I'd like to know; compatibility is important.
=========================
My intended inferences about the iterator vs. slice question was perhaps not obvious to you; Notice: an iterator is not *allowed* in __getitem__().
The slice class when passed to __getitem__() was created to merely pass two numbers and a stride to __getitem__; As far as I know slice() itself does *nothing* in the actual processing of the elements. So, it's *redundant* functionality, and far worse, it's restrictive.
The philosophy of Python is to have exactly one way to do something when possible; so, why create a stand alone class that does nothing an existing class could already do, and do it better ?
A simple list of three values would be just as efficient as slice()!
xrange is more flexible, and can be just as efficient.
So, Have I misunderstood the operation of slice()? I think I might have... but I don't know.
In 'C', where Python is written, circularly linked lists -- and arrays are both very efficient ways of accessing data. Arrays can, in fact, have negative indexes -- perhaps contrary to what you thought. One merely defines a variable to act as the base pointer to the array and initialize it to the *end* of the array. Nor is the size of the data elements an issue, since in Python all classes are accessed by pointers which are of uniform size. I routinely do this in C.
Consider, also, that xrange() does not actually create a list -- but merely an iterator generating integers which is exactly what __getitem__ works on. So, xrange() does not need to incur a memory or noticeable time penalty.
From micro-python, it's clear that their implementation of xrange() is at the 'C' level; which is extremely fast.
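What __getitem__ actually receives can be inspected with a small class. This is a sketch, assuming Python 3; `Echo` is a hypothetical class that simply returns whatever key it is given. The built-in list type accepts only integers and slices as keys, but a user-defined __getitem__ is free to accept any object, including a range.

```python
class Echo:
    """Hypothetical class whose __getitem__ returns the key it receives."""
    def __getitem__(self, key):
        return key

e = Echo()
print(e[1:2:3])      # the subscript syntax builds a slice object
print(e[range(3)])   # any object can be passed as a key
print(e[1:2, ::3])   # several slices arrive as a tuple
```

The slice object is therefore mostly a carrier for start, stop and step; interpreting them is left to the container's own __getitem__.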
| From | andrewr3mail@gmail.com |
|---|---|
| Date | 2012-10-28 21:00 -0700 |
| Message-ID | <mailman.2988.1351483258.27098.python-list@python.org> |
| In reply to | #32331 |
On Sunday, October 28, 2012 8:43:30 PM UTC-7, Ian wrote:
> On Sun, Oct 28, 2012 at 9:12 PM, andrew wrote:
>
> > The slice operator does not give any way (I can find!) to take slices from negative to positive indexes, although the range is not empty, nor the expected indexes out of range that I am supplying.
> >
> > Many programs that I write would require introducing variables and logical statements to correct the problem which is very lengthy and error prone unless there is a simple work around.
> >
> > I *hate* replicating code every time I need to do this!
> >
> > I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't.
> >
> > Here's an example for Linux shell, otherwise remove /bin/env...
> > {{{#!/bin/env python
> > a=[1,2,3,4,5,6,7,8,9,10]
> > print a[-4:3] # I am interested in getting [7,8,9,10,1,2] but I get [].
> > }}}
>
> For a sequence of length 10, "a[-4:3]" is equivalent to "a[6:3]",
> which is an empty slice since index 6 is after index 3.
>
> If you want it to wrap around, then take two slices and concatenate
> them with "a[-4:] + a[:3]".
Hi Ian,
Well, no it really isn't equivalent.
Consider a programmer who writes:
xrange(-4,3) *wants* [-4,-3,-2,-1,0,1,2]
That is the "idea" of a range; for what reason would anyone *EVER* want -4 to +3 to be 6:3???
I do agree that the data held in -4 is equivalent to the data in 6, but the index is not the same.
So: Why does python choose to convert them to positive indexes, and have slice operate differently than xrange -- for the slice() object can't possibly know the size of the array when it is passed in to __getitem__; They are totally separate classes.
I realize I can concatenate two slices, BUT the ranges do not always span from negative to positive.
e.g., a line in my program reads:
a[x-5:x]
if x is 7, then this is a positive index to a positive index.
So, there is no logic to using two concatenated slices!
I use this arbitrary range code *often* so I need a general purpose solution.
I looked up slice() but the help is of no use, I don't even know how I might overload it to embed some logic to concatenate ranges of data; nor even if it is possible.
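One possible general-purpose approach to the `a[x-5:x]` case is sketched below. It is only an illustration: `wrap_slice` is a hypothetical name, and the choice to treat the list as circular by reducing every index modulo the length is an assumption about the behaviour being asked for, not anything built into Python. It uses only `range()`, `len()` and the `%` operator.

```python
def wrap_slice(seq, start, stop):
    """Hypothetical helper: return the items at indices start..stop-1,
    treating seq as circular (every index is reduced modulo len(seq))."""
    n = len(seq)
    if n == 0:
        return []
    # In Python, i % n is always in 0..n-1 for positive n, even when i is negative.
    return [seq[i % n] for i in range(start, stop)]


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(wrap_slice(a, -4, 3))      # [7, 8, 9, 10, 1, 2, 3]

x = 7
print(wrap_slice(a, x - 5, x))   # [3, 4, 5, 6, 7]
x = 3
print(wrap_slice(a, x - 5, x))   # [9, 10, 1, 2, 3]
```

The same call covers both the all-positive case and the case that crosses from negative to positive, so no branching is needed at the call site. Note that `range(-4, 3)` has seven elements, so the result matches `a[-4:] + a[:3]` and ends in 3.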
| From | Andrew <andrewr3mail@gmail.com> |
|---|---|
| Date | 2012-10-28 21:09 -0700 |
| Message-ID | <bb922c4b-08b5-4e93-9452-5306b2d953a9@googlegroups.com> |
| In reply to | #32331 |
On Sunday, October 28, 2012 8:43:30 PM UTC-7, Ian wrote:
> On Sun, Oct 28, 2012 at 9:12 PM, <Andrew> wrote:
>
> > The slice operator does not give any way (I can find!) to take slices from negative to positive indexes, although the range is not empty, nor the expected indexes out of range that I am supplying.
> >
> > Many programs that I write would require introducing variables and logical statements to correct the problem which is very lengthy and error prone unless there is a simple work around.
> >
> > I *hate* replicating code every time I need to do this!
> >
> > I also don't understand why slice() is not equivalent to an iterator, but can replace an integer in __getitem__() whereas xrange() can't.
> >
> > Here's an example for Linux shell, otherwise remove /bin/env...
> > {{{#!/bin/env python
> > a=[1,2,3,4,5,6,7,8,9,10]
> > print a[-4:3] # I am interested in getting [7,8,9,10,1,2] but I get [].
> > }}}
>
> For a sequence of length 10, "a[-4:3]" is equivalent to "a[6:3]",
> which is an empty slice since index 6 is after index 3.
>
> If you want it to wrap around, then take two slices and concatenate
> them with "a[-4:] + a[:3]".
Hi Ian,
Well, no it really isn't equivalent; although Python implements it as equivalent.
Consider a programmer who writes:
xrange(-4,3)
They clearly *want* [-4,-3,-2,-1,0,1,2]
That is the "idea" of a range; So, for what reason would anyone want -4 to +3 to be 6:3??? Can you show me some code where this is desirable??
I do agree that the data held in -4 is equivalent to the data in 6, but the index is not the same.
So: Why does python choose to convert them to positive indexes, and have slice operate differently than xrange -- for the slice() object can't possibly know the size of the array when it is passed in to __getitem__; They are totally separate classes.
I realize I can concatenate two slices, BUT the ranges do not always span from negative to positive.
e.g., a line in my program reads:
a[x-5:x]
if x is 7, then this is a positive index to a positive index.
So, there is no logic to using two concatenated slices!
I use this arbitrary range code *often* so I need a general purpose solution.
I looked up slice() but the help is of no use, I don't even know how I might overload it to embed some logic to concatenate ranges of data; nor even if it is possible.
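As for embedding the logic in the container rather than in slice() itself, one option is a list subclass that overrides `__getitem__`. The sketch below is an assumption-laden illustration: `RingList` is a hypothetical class, and it is written for Python 3, where simple slices on a list subclass reach `__getitem__` as a slice object. Under Python 2, simple slices on a list subclass go through `__getslice__` instead, so it would need adapting there.

```python
class RingList(list):
    """Hypothetical list subclass whose start:stop slices wrap around."""

    def __getitem__(self, key):
        wraps = (
            isinstance(key, slice)
            and key.step in (None, 1)
            and key.start is not None
            and key.stop is not None
        )
        if wraps:
            n = len(self)
            if n == 0:
                return RingList()
            # Walk the raw, unadjusted indices and reduce each modulo the length.
            return RingList(
                list.__getitem__(self, i % n)
                for i in range(key.start, key.stop)
            )
        # Plain integers, open-ended slices and strided slices keep list behaviour.
        return list.__getitem__(self, key)


r = RingList(range(1, 11))
print(r[-4:3])   # [7, 8, 9, 10, 1, 2, 3]
print(r[2:7])    # [3, 4, 5, 6, 7]
print(r[-1])     # 10
```

The trade-off is compatibility: with this sketch `r[0:-1]` is empty, whereas an ordinary list returns everything but the last item, so code that relies on the standard meaning of a negative stop would behave differently.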