Groups > comp.lang.python > #32329 > unrolled thread

Negative array indicies and slice()

Started by	andrewr3mail@gmail.com
First post	2012-10-28 20:12 -0700
Last post	2012-11-01 18:08 -0700
Articles	20 on this page of 73 — 16 participants

Back to article view | Back to comp.lang.python

  Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 20:12 -0700
    Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-28 21:42 -0600
      Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 21:00 -0700
        Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-28 22:25 -0600
          Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-29 00:54 -0700
            Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:18 -0700
            Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 11:19 +0000
              Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-29 22:32 +1100
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-28 21:52 -0700
              Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-29 23:40 +1100
                Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 22:02 +0000
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-28 23:01 -0700
                Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 09:52 -0400
                  Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 08:20 -0700
                  Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:01 -0600
                  Re: Negative array indicies and slice() Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-10-30 00:04 +0000
                  Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 16:54 -0700
                  Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 02:15 -0600
              Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-30 00:53 +1100
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 11:09 -0600
              Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 22:14 +0000
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 08:42 -0700
                Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-30 00:02 +0000
                  Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 12:34 -0700
                    Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-30 08:17 +0000
                      Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-30 08:47 -0700
                      Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 23:48 +0000
                      Re: Negative array indicies and slice() Michael Torrie <torriem@gmail.com> - 2012-10-30 23:29 -0600
                  Re: Negative array indicies and slice() Michael Torrie <torriem@gmail.com> - 2012-10-29 23:53 -0600
                  Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 17:04 -0700
              Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-30 09:55 +1100
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:07 -0600
                Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 19:24 -0400
                  Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:43 -0600
                    Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 20:17 -0400
                  Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 18:05 -0600
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 11:00 -0700
              Re: Negative array indicies and slice() Chris Kaynor <ckaynor@zindagigames.com> - 2012-10-29 18:49 -0700
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 15:39 -0700
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 23:55 -0600
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 00:51 -0600
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 17:17 -0700
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 01:21 -0600
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 01:32 -0600
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 02:46 -0600
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 12:02 -0600
              Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-30 07:21 -0700
              Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 21:33 +0000
                Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-31 10:07 +0000
                  Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-31 16:01 +0000
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 15:47 -0600
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 15:55 -0600
              Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-31 09:00 +1100
              Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 16:02 -0600
              Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 23:30 +0000
            Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 11:27 -0600
            Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 16:33 -0700
          Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-29 00:54 -0700
      Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 21:00 -0700
      Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-28 21:09 -0700
        Re: Negative array indicies and slice() alex23 <wuwei23@gmail.com> - 2012-10-28 21:44 -0700
          Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:24 -0700
            Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:37 -0700
              Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:59 -0700
                Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-29 09:36 +0000
                Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 10:34 +0000
              Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:59 -0700
        Re: Negative array indicies and slice() Paul Rubin <no.email@nospam.invalid> - 2012-10-28 22:14 -0700
          Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:08 -0700
            Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:26 -0700
      Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-28 21:09 -0700
    Re: Negative array indicies and slice() MRAB <python@mrabarnett.plus.com> - 2012-10-29 03:45 +0000
    Re: Negative array indicies and slice() 88888 Dihedral <dihedral88888@googlemail.com> - 2012-11-01 18:08 -0700

Page 2 of 4 — ← Prev page 1 [2] 3 4 Next page →

#32427

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2012-10-29 22:14 +0000
Message-ID	<508effd1$0$29967$c3e8da3$5496439d@news.astraweb.com>
In reply to	#32359

On Mon, 29 Oct 2012 11:19:38 +0000, Steven D'Aprano wrote:

> Because xrange represents a concrete sequence of numbers, all three of
> start, end and stride must be concrete, known, integers:
> 
> py> xrange(4, None, 2)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: an integer is required
> 
> Whereas slices can trivially include blanks that get filled in only when
> actually used:
> 
> py> "hello world"[aslice]
> 'owrd'
> py> "NOBODY expects the Spanish Inquisition!"[aslice] 
> 'D xet h pns nusto!'

/me facepalms/


Argggh, I forgot to copy-and-paste the critical line defining aslice:

aslice = slice(4, None, 2)


Sorry about that.



-- 
Steven

[toc] | [prev] | [next] | [standalone]

#32434

From	Andrew Robinson <andrew3@r3dsolutions.com>
Date	2012-10-29 08:42 -0700
Message-ID	<mailman.3053.1351550917.27098.python-list@python.org>
In reply to	#32359

On 10/29/2012 10:09 AM, Ian Kelly wrote:
> On Oct 29, 2012 7:10 AM, "Andrew Robinson"<andrew3@r3dsolutions.com>  wrote:
>> I will be porting Python 3.xx to a super low power embedded processor (MSP430), both space and speed are at a premium.
>> Running Python on top of Java would be a *SERIOUS* mistake.  .NET won't even run on this system. etc.
> If that's the case, then running Python at all is probably a mistake.
> You know the interpreter alone has an overhead of nearly 6 MB?
There's already a version of the python interpreter which fits in under 
100K:
http://code.google.com/p/python-on-a-chip/
It's not the 3.x series, though; and I don't want to redo this once 2.7 
really does become obsolete.
>> Yes, I realize that.
>> But, why can't I just overload the existing __getitem__ for lists and not bother writing an entire class?
> You can just overload that one method in a subclass of list.  Being
> able to monkey-patch __getitem__ for the list class itself would not
> be advisable, as it would affect all list slicing anywhere in your
> program and possibly lead to some unexpected behaviors.
That's what I am curious about.
What unexpected behaviors would a "monkey patch" typically cause?
If no one really uses negative and positive indexes in the same slice 
operation, because there is no reason to do so...
It will only break the occasional esoteric application.

>
> 20 million is nothing.  On a 32-bit system, sys.maxsize == 2 ** 31 -
> 1.  If the error you were seeing was MemoryError, then more likely you
> were running into dynamic allocation issues due to fragmentation of
> virtual memory.
>
>

No, there was no error at all.  Pthon just crashed & exited; not even an 
exception that I can recall.   It was if it exited normally!

The list was generated in a single pass by many .append() 's, and then 
copied once -- the original was left in place; and then I attempted to 
slice it.

I am able to routinely to 5 million length lists, copy, slice, cut, 
append, and delete from them without this ever happening.
If fragmentation were the issue, I'd think the shorter lists would cause 
the problem after many manipulations...

It may not be a bug in python itself, though, of course.  There are 
libraries it uses which might have a bug.

[toc] | [prev] | [next] | [standalone]

#32444

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2012-10-30 00:02 +0000
Message-ID	<508f1910$0$29967$c3e8da3$5496439d@news.astraweb.com>
In reply to	#32434

On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote:

>>> But, why can't I just overload the existing __getitem__ for lists and
>>> not bother writing an entire class?

You say that as if writing "an entire class" was a big complicated 
effort. It isn't. It is trivially simple, a single line:

class MyList(list):
    ...

plus the __getitem__ definition, which you would have to write anyway. It 
is a trivial amount of extra effort.

>> You can just overload that one method in a subclass of list.  Being
>> able to monkey-patch __getitem__ for the list class itself would not be
>> advisable, as it would affect all list slicing anywhere in your program
>> and possibly lead to some unexpected behaviors.
>
> That's what I am curious about.
> What unexpected behaviors would a "monkey patch" typically cause? 

What part of "unexpected" is unclear?

Monkey-patching is poor programming technique because it leads to 
*unexpected* and *impossible to predict* interactions between *distant* 
parts of the code. It leads to impossible to predict differences between 
the source code on disk and the actual running code. It leads to 
impossible to predict differences between documented behaviour and actual 
behaviour.

Let me see if I can illustrate a flavour of the sort of things that can 
happen if monkey-patching built-ins were allowed.

You create a list and print it:

# simulated output
py> x = [5, 2, 4, 1]
py> print(x)
[1, 2, 4, 5]

What? How did that happen? That's not the list you provided. The order 
has been lost.

So you dig deep into your code, and you don't find anything. And you read 
the Python documentation for lists, and don't find anything. And you 
google the Internet, and don't find anything. And you ask for help, and 
everybody says you're crazy because when they duplicate your code they 
get the expected behaviour. And you report a bug in Python, and it gets 
closed as "cannot replicate".

Finally you search deep into the libraries used in your code, and *five 
days later* discover that your code uses library A which uses library B 
which uses library C which uses library D which installs a harmless 
monkey-patch to print, but only if library E is installed, and you just 
happen to have E installed even though your code never uses it, AND that 
monkey-patch clashes with a harmless monkey-patch to list.__getitem__ 
installed by library F. And even though each monkey-patch alone is 
harmless, the combination breaks your code's output.

Python allows, but does not encourage, monkey-patching of code written in 
pure Python, because it sometimes can be useful. It flat out prohibits 
monkey-patching of builtins, because it is just too dangerous.

Ruby allows monkey-patching of everything. And the result was predictable:

http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-ruby/

-- 
Steven

[toc] | [prev] | [next] | [standalone]

#32459

From	Andrew Robinson <andrew3@r3dsolutions.com>
Date	2012-10-29 12:34 -0700
Message-ID	<mailman.3067.1351564823.27098.python-list@python.org>
In reply to	#32444

On 10/29/2012 05:02 PM, Steven D'Aprano wrote:
> On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote:
>
>>>> But, why can't I just overload the existing __getitem__ for lists and
>>>> not bother writing an entire class?
> You say that as if writing "an entire class" was a big complicated
> effort. It isn't. It is trivially simple, a single line:
>
> class MyList(list):
>      ...
No, I don't think it big and complicated.  I do think it has timing 
implications which are undesirable because of how *much* slices are used.
In an embedded target -- I have to optimize; and I will have to reject 
certain parts of Python to make it fit and run fast enough to be useful.

>>> You can just overload that one method in a subclass of list.  Being
>>> able to monkey-patch __getitem__ for the list class itself would not be
>>> advisable, as it would affect all list slicing anywhere in your program
>>> and possibly lead to some unexpected behaviors.
>> That's what I am curious about.
>> What unexpected behaviors would a "monkey patch" typically cause?
> What part of "unexpected" is unclear?
>
Ahh -- The I don't know approach!  It's only unexpected if one is a bad 
programmer...!
> Let me see if I can illustrate a flavour of the sort of things that can
> happen if monkey-patching built-ins were allowed.
>
> You create a list and print it:
>
> # simulated output
> py>  x = [5, 2, 4, 1]
> py>  print(x)
> [1, 2, 4, 5]
<snip>

Finally you search deep into the libraries used in your code, and *five
days later* discover that your code uses library A which uses library B
which uses library C which uses library D which installs a harmless
monkey-patch to print, but only if library E is installed, and you just
happen to have E installed even though your code never uses it, AND that
monkey-patch clashes with a harmless monkey-patch to list.__getitem__
installed by library F. And even though each monkey-patch alone is
harmless, the combination breaks your code's output.

Right, which means that people developing the libraries made 
contradictory assumptions.

> Python allows, but does not encourage, monkey-patching of code written in
> pure Python, because it sometimes can be useful. It flat out prohibits
> monkey-patching of builtins, because it is just too dangerous.
>
> Ruby allows monkey-patching of everything. And the result was predictable:
>
> http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-ruby/
>

I read that post carefully; and the author purposely notes that he is 
exaggerating.
BUT Your point is still well taken.

What you are talking about is namespace preservation; and I am thinking 
about it. I can preserve it -- but only if I disallow true Python 
primitives in my own interpreter; I can't provide two sets in the memory 
footprint I am using.

 From my perspective, the version of Python that I compile will not be 
supported by the normal python help; The predecessor which first forged 
this path, Pymite, has the same problems -- however, the benefits 
ought-weigh the disadvantages; and the experiment yielded useful 
information on what is redundant in Python (eg: range is not supported) 
and when that redundancy is important for some reason.

If someone had a clear explanation of the disadvantages of allowing an 
iterator, or a tuple -- in place of a slice() -- I would have no qualms 
dropping the subject.  However, I am not finding that yet.  I am finding 
very small optimization issues...

The size of an object is at least 8 bytes.  Hence, three numbers is 
going to be at least 24 bytes; and that's 24 bytes in *excess* of the 
size of slice() or tuple () which are merely containers.  So -- There 
*ARE* savings in memory when using slice(), but it isn't really 2x 
memory -- its more like 20% -- once the actual objects are considered.

The actual *need* for a slice() object still hasn't been demonsrated.  I 
am thinking that the implementation of __getitem__() is very poor 
probably because of legacy issues.

A tuple can also hold None, so ( 1, None, 2 ) is still a valid Tuple.
Alternately:  An iterator, like xrange(), could be made which takes None 
as a parameter, or a special value like 'inf'.
Since these two values would never be passed to xrange by already 
developed code, allowing them would not break working code.

I am only aware of one possible reason that slice() was once thought to 
be necessary; and that is because accessing the element of a tuple would 
recursively call __getitem__ on the tuple.  But, even that is easily 
dismissed once the fixed integer indexes are considered.

Your thoughts?  Do you have any show stopper insights?

[toc] | [prev] | [next] | [standalone]

#32476

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2012-10-30 08:17 +0000
Message-ID	<508f8cfd$0$29884$c3e8da3$5496439d@news.astraweb.com>
In reply to	#32459

By the way Andrew, the timestamps on your emails appear to be off, or 
possibly the time zone. Your posts are allegedly arriving before the 
posts you reply to, at least according to my news client.

On Mon, 29 Oct 2012 12:34:24 -0700, Andrew Robinson wrote:

> On 10/29/2012 05:02 PM, Steven D'Aprano wrote:
>> On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote:
>>
>>>>> But, why can't I just overload the existing __getitem__ for lists
>>>>> and not bother writing an entire class?
>> You say that as if writing "an entire class" was a big complicated
>> effort. It isn't. It is trivially simple, a single line:
>>
>> class MyList(list):
>>      ...
> No, I don't think it big and complicated.  I do think it has timing
> implications which are undesirable because of how *much* slices are
> used. In an embedded target -- I have to optimize; and I will have to
> reject certain parts of Python to make it fit and run fast enough to be
> useful.

Then I look forward to seeing your profiling results that show that the 
overhead of subclassing list is the bottleneck in your application.

Until then, you are making the classic blunder of the premature optimizer:

"More computing sins are committed in the name of efficiency (without 
necessarily achieving it) than for any other single reason — including 
blind stupidity." — W.A. Wulf

I am not impressed by performance arguments when you have (apparently) 
neither identified the bottlenecks in your code, nor even measured the 
performance. You are essentially *guessing* where the bottlenecks are, 
and *hoping* that some suggested change will be an optimization rather 
than a pessimization.

Of course I may be wrong, and you have profiled your code and determined 
that the overhead of inheritance is a problem. If so, that's a different 
ball game. But your posts so far suggest to me that you're trying to 
predict performance optimizations rather than measure them.

>>>> You can just overload that one method in a subclass of list.  Being
>>>> able to monkey-patch __getitem__ for the list class itself would not
>>>> be advisable, as it would affect all list slicing anywhere in your
>>>> program and possibly lead to some unexpected behaviors.
>>> That's what I am curious about.
>>> What unexpected behaviors would a "monkey patch" typically cause?
>> What part of "unexpected" is unclear?
>>
> Ahh -- The I don't know approach!  It's only unexpected if one is a bad
> programmer...!

No, it is unexpected because once you start relying on monkey-patching, 
and the libraries you install are monkey-patching, you have a 
combinational explosion of interactions. Any piece of code, anywhere, 
could monkey-patch any other piece of code -- it is a form of action-at-a-
distance coding, like GOTOs and global variables. Use it with caution, in 
moderation.

>> Let me see if I can illustrate a flavour of the sort of things that can
>> happen if monkey-patching built-ins were allowed.
[...]
> Right, which means that people developing the libraries made
> contradictory assumptions.

Not necessarily. Not only can monkey-patches conflict, but they can 
combine in bad ways. It isn't just that Fred assumes X and Barney assumes 
not-X, but also that Fred assumes X and Barney assumes Y and *nobody* 
imagined that there was some interaction between X and Y.

[...]
>> Ruby allows monkey-patching of everything. And the result was
>> predictable:
>>
>> http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-
ruby/
>>
>>
> I read that post carefully; and the author purposely notes that he is
> exaggerating.

Not carefully enough. He notes that he was using a provocative title and 
that he doesn't actually think that Ruby is being destroyed. But the 
actual harm he describes is real, e.g. bugs that take months to track 
down.

> What you are talking about is namespace preservation; 

I haven't mentioned namespaces. Nothing I have said has anything to do 
with namespaces. I remember Apple monkey-patching routines in ROM back in 
the mid 1980s, long before there was anything corresponding to namespaces 
in Apple's programming model. 

> and I am thinking
> about it. I can preserve it -- but only if I disallow true Python
> primitives in my own interpreter; I can't provide two sets in the memory
> footprint I am using.

If you want to write a language that is not Python, go right ahead.

> If someone had a clear explanation of the disadvantages of allowing an
> iterator, or a tuple -- in place of a slice() -- I would have no qualms
> dropping the subject.  However, I am not finding that yet.  I am finding
> very small optimization issues...

Python has certain public interfaces. If your language does not support 
those public interfaces, then it might be an awesome language, but it is 
not Python.

Python slices have certain features:

* they can be used repeatedly;

* they have public attributes start, step and stop;

* The stop attribute can be None, and the slice will default to the
  length of the thing being sliced, which is not known until call-time.

* Slices have a public indices method.

These are known characteristics of slices, and there is Python code that 
relies on it. If your language does not support these, then it is not 
Python.

Iterators cannot replace slices, because once you have looked up an 
iterator value, that value is gone, never to be seen again.

Tuples cannot replace slices, because tuples do not have start, step and 
stop attributes; most tuples have no need of them, and if you care about 
memory footprint, the last thing you want is to give every tuple three 
unused or meaningless named attributes. Besides, what values should they 
take in an empty tuple?

xrange cannot replaces slices, because there is no way to create an xrange 
object with an empty or missing end; besides, xrange objects have no 
public start/stop/step attributes.

And none of them have a public indices method.

> The actual *need* for a slice() object still hasn't been demonsrated.

Hell, the actual *need* for slicing hasn't been demonstrated! Or Python! 
Since C doesn't have slicing, nobody needs it! Am I right?

Needed or not, that is what Python has, so if your language doesn't have 
it, it isn't Python.

> I
> am thinking that the implementation of __getitem__() is very poor
> probably because of legacy issues.

You haven't demonstrated that it is very poor.

> A tuple can also hold None, so ( 1, None, 2 ) is still a valid Tuple.

Sure. And a tuple can also be () or (1, "x", [], None, None, None, None). 
Now your list.__getitem__ code has to deal with those instead.

> Alternately:  An iterator, like xrange(), could be made which takes None
> as a parameter, or a special value like 'inf'. 

So you want to get rid of slices because you want to save every byte of 
memory you can, and your way to do that is to introduce a new, *heavier* 
type that does more than slices? 

Oh man, that's funny.

-- 
Steven

[toc] | [prev] | [next] | [standalone]

#32511

From	Andrew Robinson <andrew3@r3dsolutions.com>
Date	2012-10-30 08:47 -0700
Message-ID	<mailman.3110.1351637364.27098.python-list@python.org>
In reply to	#32476

On 10/30/2012 01:17 AM, Steven D'Aprano wrote:
> By the way Andrew, the timestamps on your emails appear to be off, or
> possibly the time zone. Your posts are allegedly arriving before the
> posts you reply to, at least according to my news client.
:D -- yes, I know about that problem.  Every time I reboot it shows up 
again...
It's a distribution issue, my hardware clock is in local time -- but 
when the clock is read by different scripts in my distribution, some 
refuse to accept that the system clock is not UTC.
I'll be upgrading in a few weeks -- so I'm just limping along until 
then. My apology.

> Then I look forward to seeing your profiling results that show that the
> overhead of subclassing list is the bottleneck in your application.
>
> Until then, you are making the classic blunder of the premature optimizer:
>
> "More computing sins are committed in the name of efficiency (without
> necessarily achieving it) than for any other single reason — including
> blind stupidity." — W.A. Wulf

I'm sure that's true.  Optimization, though, is a very general word.

On a highway in my neighborhood -- the government keeps trying to put 
more safety restrictions on it, because it statistically registers as 
the "highest accident rate road" in the *entire* region.

Naturally, the government assumes that people in my neighborhood are 
worse drivers than usual and need to be policed more -- but the truth 
is, that highway is the *ONLY* access road in the region for dozens of 
miles in any direction for a densely populated area, so if there is 
going to be an accident it will happen there; the extra safety 
precautions are not necessary when the accident rate is looked at from a 
per-capita perspective of those driving the highway.

I haven't made *the* blunder of the premature optimizer because I 
haven't implemented anything yet.  Premature optimizers don't bother to 
hold public conversation and take correction.
OTOH:  people who don't ever optimize out of fear, pay an increasing 
bloat price with time.

> I am not impressed by performance arguments when you have (apparently)
> neither identified the bottlenecks in your code, nor even measured the
> performance.
Someone else already did a benchmark between a discrete loop and a slice 
operation.
The difference in speed was an order of magnitude different.
I bench-marked a map operation, which was *much* better -- but also 
still very slow in comparison.

Let's not confound an issue here -- I am going to implement the python 
interpreter; and am not bound by optimization considerations of the 
present python interpreter -- There are things I can do which as a 
python programmer -- you can't.  I have no choice but to re-implement 
and optimize the interpreter -- the question is merely how to go about it.

>   You are essentially *guessing* where the bottlenecks are,
> and *hoping* that some suggested change will be an optimization rather
> than a pessimization.
>
> Of course I may be wrong, and you have profiled your code and determined
> that the overhead of inheritance is a problem. If so, that's a different
> ball game. But your posts so far suggest to me that you're trying to
> predict performance optimizations rather than measure them.
Not really; Inheritance itself and it's timing aren't my main concern.  
Even if the time was *0* that wouldn't change my mind.

There are man hours in debugging time caused by not being able to wrap 
around in a slice. (I am not ignoring the contrary man hours of an API 
change's bugs).

Human psychology is important; and it's a double edged sword.

I would refer you to a book written by Steve Maguire, Writing Solid 
Code; Chapter 5; Candy machine interfaces.

He uses the "C" function "realloc()" as an excellent example of a bad 
API; but still comments on one need that it *does* fulfill -- "I've 
found it better to have one function that both shrinks and expands 
blocks so that I don't have to write *ifs* constructs every time I need 
to resize memory.  True, I give up some extra argument checking, but 
this is offset by the *ifs* that I no longer need to write (*and 
possibly mess up*).

* Extra steps that a programmer must take to achieve a task are places 
where bugs get introduced.

* API's which must be debugged to see what particular operation it is 
performing rather than knowing what that operation is from looking at 
the un-compiled code are places where bugs get introduced.

These two points are not friendly with each other -- they are in fact, 
generally in conflict.
>> Right, which means that people developing the libraries made
>> contradictory assumptions.
> Not necessarily. Not only can monkey-patches conflict, but they can
> combine in bad ways. It isn't just that Fred assumes X and Barney assumes
> not-X, but also that Fred assumes X and Barney assumes Y and *nobody*
> imagined that there was some interaction between X and Y.
They *STILL* made contradictory assumptions; each of them assumed the 
interaction mechanism would not be applied in a certain way -- and then 
used it based on that assumption.
> Not carefully enough. He notes that he was using a provocative title and
> that he doesn't actually think that Ruby is being destroyed. But the
> actual harm he describes is real, e.g. bugs that take months to track
> down.
Yes.  I read that *carefully*.
BTW:  I'm not planning on monkey patching -- I did say your comments 
were well taken.

>> What you are talking about is namespace preservation;
> I haven't mentioned namespaces. Nothing I have said has anything to do
> with namespaces.
keywords are a namespace.  xrange is a keyword, etc.
You don't want pre-defined API method name to have its operation 
altered; you want the original names preserved. Overloading pollutes 
namespaces, etc.

> If you want to write a language that is not Python, go right ahead.

That's not the only possibility.  It will still be called Python and for 
good reason.

>
>> If someone had a clear explanation of the disadvantages of allowing an
>> iterator, or a tuple -- in place of a slice() -- I would have no qualms
>> dropping the subject.  However, I am not finding that yet.  I am finding
>> very small optimization issues...
> Python has certain public interfaces. If your language does not support
> those public interfaces, then it might be an awesome language, but it is
> not Python.
I didn't say I wouldn't support it.  How do you mean this?

> Iterators cannot replace slices, because once you have looked up an
> iterator value, that value is gone, never to be seen again.
Yes they can replace slices, but not always.
For example;  Lets say an iterator were allowed to replace a slice;

then a[ 1:3 ] would still be a slice, but a[ xrange(1,3) ] would not.
The second piece of code does not deny that slices exist, it just allows 
an iterator to be passed to __getitems__.

> Tuples cannot replace slices, because tuples do not have start, step and
> stop attributes;
I refer you to my comments to IAN.

>
> And none of them have a public indices method.
Hmm.. a new point.  Do you have a link?
>
>> The actual *need* for a slice() object still hasn't been demonsrated.
> Hell, the actual *need* for slicing hasn't been demonstrated! Or Python!
> Since C doesn't have slicing, nobody needs it! Am I right?
I asked about the need for slices() not the others, in the OP.
Thanks for your comments on the names, etc.  That's helpful -- and I 
hope I haven't driven your blood pressure up too much unintentionally.

> Needed or not, that is what Python has, so if your language doesn't have
> it, it isn't Python.
Fine.  But if mine still has these, but only as an *optional* wrapper 
function -- it still *RUNS* all python progams.

> Sure. And a tuple can also be () or (1, "x", [], None, None, None, None).
> Now your list.__getitem__ code has to deal with those instead.
>
And the present __getitem__ can be passed a perverted slice() as IAN 
demonstrated and the bug-list examined as "goofy".  These strange things 
simply cause an exception.

> So you want to get rid of slices because you want to save every byte 
> of memory you can, and your way to do that is to introduce a new, 
> *heavier* type that does more than slices? Oh man, that's funny.

I'm not Introducing a new type !
Besides; tuples will likely be lighter on memory in my target and I 
don't recall saying a tuple does more than slices.  I did say that 
*iterators* were more *flexible* -- is that what you are thinking of?

[toc] | [prev] | [next] | [standalone]

#32513

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2012-10-30 23:48 +0000
Message-ID	<mailman.3112.1351640879.27098.python-list@python.org>
In reply to	#32476

On 30/10/2012 15:47, Andrew Robinson wrote:
>
> I would refer you to a book written by Steve Maguire, Writing Solid
> Code; Chapter 5; Candy machine interfaces.
>

The book that took a right hammering here 
http://accu.org/index.php?module=bookreviews&func=search&rid=467 ?

-- 
Cheers.

Mark Lawrence.

[toc] | [prev] | [next] | [standalone]

#32518

From	Michael Torrie <torriem@gmail.com>
Date	2012-10-30 23:29 -0600
Message-ID	<mailman.3116.1351661388.27098.python-list@python.org>
In reply to	#32476

On 10/30/2012 09:47 AM, Andrew Robinson wrote:
> Let's not confound an issue here -- I am going to implement the python 
> interpreter; and am not bound by optimization considerations of the 
> present python interpreter -- There are things I can do which as a 
> python programmer -- you can't.  I have no choice but to re-implement 
> and optimize the interpreter -- the question is merely how to go about it.

As this is the case, why this long discussion?  If you are arguing for a
change in Python to make it compatible with what this fork you are going
to create will do, this has already been fairly thoroughly addressed
earl on, and reasons why the semantics will not change anytime soon have
been given.

[toc] | [prev] | [next] | [standalone]

#32465

From	Michael Torrie <torriem@gmail.com>
Date	2012-10-29 23:53 -0600
Message-ID	<mailman.3072.1351576425.27098.python-list@python.org>
In reply to	#32444

On 10/29/2012 01:34 PM, Andrew Robinson wrote:
> No, I don't think it big and complicated.  I do think it has timing 
> implications which are undesirable because of how *much* slices are used.
> In an embedded target -- I have to optimize; and I will have to reject 
> certain parts of Python to make it fit and run fast enough to be useful.

Since you can't port the full Python system to your embedded machine
anyway, why not just port a subset of python and modify it to suit your
needs right there in the C code.  It would be a fork, yes, but any port
to this target will be a fork anyway.  No matter how you cut it, it
won't be easy at all, and won't be easy to maintain.  You'll basically
be writing your own implementation of Python (that's what
python-on-a-chip is, and that's why it's closer to Python 2.x than
Python 3).  That's totally fine, though.  I get the impression you think
you will be able to port cPython as is to your target.  Without a libc,
an MMU on the CPU, and a kernel, it's not going to just compile and run.

Anyway, the only solution, given your constraints, is to implement your
own python interpreter to handle a subset of Python, and modify it to
suit your tastes.  What you want with slicing behavior changes has no
place in the normal cPython implementation, for a lot of reasons.  The
main one is that it is already possible to implement what you are
talking about in your own python class, which is a fine solution for a
normal computer with memory and CPU power available.

[toc] | [prev] | [next] | [standalone]

#32471

From	Andrew Robinson <andrew3@r3dsolutions.com>
Date	2012-10-29 17:04 -0700
Message-ID	<mailman.3078.1351581043.27098.python-list@python.org>
In reply to	#32444

On 10/29/2012 10:53 PM, Michael Torrie wrote:
> On 10/29/2012 01:34 PM, Andrew Robinson wrote:
>> No, I don't think it big and complicated.  I do think it has timing
>> implications which are undesirable because of how *much* slices are used.
>> In an embedded target -- I have to optimize; and I will have to reject
>> certain parts of Python to make it fit and run fast enough to be useful.
> Since you can't port the full Python system to your embedded machine
> anyway, why not just port a subset of python and modify it to suit your
> needs right there in the C code.  It would be a fork, yes,
You're exactly right;  That's what I *know* I am faced with.

>    Without a libc, an MMU on the CPU, and a kernel, it's not going to just compile and run.
I have libc.  The MMU is a problem; but the compiler implements the 
standard "C" math library; floats, though, instead of doubles.  That's 
the only problem -- there.
>   What you want with slicing behavior changes has no
> place in the normal cPython implementation, for a lot of reasons.  The
> main one is that it is already possible to implement what you are
> talking about in your own python class, which is a fine solution for a
> normal computer with memory and CPU power available.
If the tests I outlined in the previous post inaccurately describe a 
major performance improvement and at least a modest code size reduction; 
or will *often* introduce bugs -- I *AGREE* with you.

Otherwise, I don't.  I don't think wasting extra CPU power is a good 
thing -- Extra CPU power can always be used by something else....

I won't belabor the point further.  I'd love to see a counter example to 
the specific criteria I just provided to IAN -- it would end my quest; 
and be a good reference to point others to.

[toc] | [prev] | [next] | [standalone]

#32436

From	Chris Angelico <rosuav@gmail.com>
Date	2012-10-30 09:55 +1100
Message-ID	<mailman.3054.1351551326.27098.python-list@python.org>
In reply to	#32359

On Tue, Oct 30, 2012 at 2:42 AM, Andrew Robinson
<andrew3@r3dsolutions.com> wrote:
> No, there was no error at all.  Pthon just crashed & exited; not even an
> exception that I can recall.   It was if it exited normally!

Can you create a reproducible test case? There's usually a cause to
these sorts of things.

ChrisA

[toc] | [prev] | [next] | [standalone]

#32439

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-29 17:07 -0600
Message-ID	<mailman.3056.1351552107.27098.python-list@python.org>
In reply to	#32359

On Mon, Oct 29, 2012 at 9:42 AM, Andrew Robinson
<andrew3@r3dsolutions.com> wrote:
> The list was generated in a single pass by many .append() 's, and then
> copied once -- the original was left in place; and then I attempted to slice
> it.

Note that if the list was generated by .appends, then it was copied
more than once.  Python reserves a specific amount of space for the
list.  When it grows past that, the list must be reallocated and
copied.  It grows the list exponentially in order to keep the
amortized time complexity of append at O(1), but the point is that a
list of 20 million items is going to be resized and copied several
times before it is complete.

[toc] | [prev] | [next] | [standalone]

#32442

From	Roy Smith <roy@panix.com>
Date	2012-10-29 19:24 -0400
Message-ID	<roy-4A06AD.19241029102012@news.panix.com>
In reply to	#32439

In article <mailman.3056.1351552107.27098.python-list@python.org>,
 Ian Kelly <ian.g.kelly@gmail.com> wrote:

> On Mon, Oct 29, 2012 at 9:42 AM, Andrew Robinson
> <andrew3@r3dsolutions.com> wrote:
> > The list was generated in a single pass by many .append() 's, and then
> > copied once -- the original was left in place; and then I attempted to slice
> > it.
> 
> Note that if the list was generated by .appends, then it was copied
> more than once.  Python reserves a specific amount of space for the
> list.  When it grows past that, the list must be reallocated and
> copied.  It grows the list exponentially in order to keep the
> amortized time complexity of append at O(1), but the point is that a
> list of 20 million items is going to be resized and copied several
> times before it is complete.

I think you're missing the point of "amortized constant time".  Yes, the 
first item appended to the list will be copied lg(20,000,000) ~= 25 
times, because the list will be resized that many times(*).  But, on 
average (I'm not sure if "average" is strictly the correct word here), 
each item will be copied only once.

Still, it always stuck me as odd that there's no preallocate() method.  
There are times when you really do know how many items you're going to 
add to the list, and doing a single allocation would be a win.  And it 
doesn't cost anything to provide it.  I suppose, however, if you're 
adding enough items that preallocating would really matter, then maybe 
you want an array instead.

(*) I don't know the exact implementation; I'm assuming each resize is a 
factor of 2.

[toc] | [prev] | [next] | [standalone]

#32443

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-29 17:43 -0600
Message-ID	<mailman.3057.1351554215.27098.python-list@python.org>
In reply to	#32442

On Mon, Oct 29, 2012 at 5:24 PM, Roy Smith <roy@panix.com> wrote:
> I think you're missing the point of "amortized constant time".  Yes, the
> first item appended to the list will be copied lg(20,000,000) ~= 25
> times, because the list will be resized that many times(*).  But, on
> average (I'm not sure if "average" is strictly the correct word here),
> each item will be copied only once.
>
> Still, it always stuck me as odd that there's no preallocate() method.
> There are times when you really do know how many items you're going to
> add to the list, and doing a single allocation would be a win.  And it
> doesn't cost anything to provide it.  I suppose, however, if you're
> adding enough items that preallocating would really matter, then maybe
> you want an array instead.
>
> (*) I don't know the exact implementation; I'm assuming each resize is a
> factor of 2.

The growth factor is approximately 1.125.  "Approximately" because
there is also a small constant term.  The average number of copies per
item converges on 8.

[toc] | [prev] | [next] | [standalone]

#32448

From	Roy Smith <roy@panix.com>
Date	2012-10-29 20:17 -0400
Message-ID	<roy-31BCE0.20171729102012@news.panix.com>
In reply to	#32443

In article <mailman.3057.1351554215.27098.python-list@python.org>,
 Ian Kelly <ian.g.kelly@gmail.com> wrote:

> On Mon, Oct 29, 2012 at 5:24 PM, Roy Smith <roy@panix.com> wrote:
> > I think you're missing the point of "amortized constant time".  Yes, the
> > first item appended to the list will be copied lg(20,000,000) ~= 25
> > times, because the list will be resized that many times(*).  But, on
> > average (I'm not sure if "average" is strictly the correct word here),
> > each item will be copied only once.
> >
> > Still, it always stuck me as odd that there's no preallocate() method.
> > There are times when you really do know how many items you're going to
> > add to the list, and doing a single allocation would be a win.  And it
> > doesn't cost anything to provide it.  I suppose, however, if you're
> > adding enough items that preallocating would really matter, then maybe
> > you want an array instead.
> >
> > (*) I don't know the exact implementation; I'm assuming each resize is a
> > factor of 2.
> 
> The growth factor is approximately 1.125.  "Approximately" because
> there is also a small constant term.  The average number of copies per
> item converges on 8.

Wow, that's surprising.  It also makes it that much more surprising that 
there's no way to pre-allocate.

[toc] | [prev] | [next] | [standalone]

#32446

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-29 18:05 -0600
Message-ID	<mailman.3059.1351555560.27098.python-list@python.org>
In reply to	#32442

On Mon, Oct 29, 2012 at 5:43 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> The growth factor is approximately 1.125.  "Approximately" because
> there is also a small constant term.  The average number of copies per
> item converges on 8.

Of course, that is the *maximum* number of copies.  The actual number
could be much less if realloc() performs well.

[toc] | [prev] | [next] | [standalone]

#32450

From	Andrew Robinson <andrew3@r3dsolutions.com>
Date	2012-10-29 11:00 -0700
Message-ID	<mailman.3062.1351559168.27098.python-list@python.org>
In reply to	#32359

On 10/29/2012 06:53 AM, Chris Angelico wrote:
> Can you provide links to these notes? I'm looking at
> cpython/Include/sliceobject.h that has this comment:
>
> /*
>
> A slice object containing start, stop, and step data members (the
> names are from range).  After much talk with Guido, it was decided to
> let these be any arbitrary python type.  Py_None stands for omitted values.
> */
>
> Also, the code for slice objects in CPython works with Py_ssize_t (a
> signed quantity of the same length as size_t), which will allow at
> least 2**31 for an index. I would guess that your crashes were nothing
> to do with 20 million elements and slices.
>
> ChrisA
Let's look at the source code rather than the web notes -- the source 
must be the true answer anyhow.

I downloaded the source code for python 3.3.0, as the tbz;
In the directory "Python-3.3.0/Python", look at Python-ast.c, line 2089 
& ff.

Clearly a slice is malloced for a slice_ty type.
It has four elements: kind, lower, upper, and step.

So, tracing it back to the struct definition...

"Include/Python-ast.h"  has "typedef struct _slice *slice_ty;"

And, here's the answer!:

enum _slice_kind {Slice_kind=1, ExtSlice_kind=2, Index_kind=3};
struct _slice {
         enum _slice_kind kind;
         union {
                 struct {
                         expr_ty lower;
                         expr_ty upper;
                         expr_ty step;
                 } Slice;

                 struct {
                         asdl_seq *dims;
                 } ExtSlice;

                 struct {
                         expr_ty value;
                 } Index;

         } v;
};


So, slice() does indeed have arbitrary python types included in it; 
contrary to what I read elsewhere.
expr_ty is a pointer to an arbitrary expression, so the actual structure 
is 4 pointers, at 32 bits each = 16 bytes.
The size of the structure itself, given in an earlier post, is 20 bytes 
-- which means one more pointer is involved, perhaps the one pointing to 
the slice structure itself.

Hmm...!

An empty tuple gives sys.getsizeof( () ) = 24.

But, I would expect a tuple to be merely a list of object pointers; 
hence I would expect 4 bytes for len(), and then a head pointer 4 bytes, 
and then a pointer for each object.
3 objects gives 12 bytes, + 8 = 16 bytes.

Then we need one more pointer so Python knows where the struct is...
So a Tuple of 3 objects ought to fit nicely into 20 bytes; the same size 
as slice() --

but it's 24, even when empty...
And 36 when initialized...
What are the extra 16 bytes for?

All I see is:
typedef struct { object** whatever } PyTupleObject;

[toc] | [prev] | [next] | [standalone]

#32454

From	Chris Kaynor <ckaynor@zindagigames.com>
Date	2012-10-29 18:49 -0700
Message-ID	<mailman.3064.1351561789.27098.python-list@python.org>
In reply to	#32359

On Mon, Oct 29, 2012 at 11:00 AM, Andrew Robinson
<andrew3@r3dsolutions.com> wrote:
>
> Let's look at the source code rather than the web notes -- the source must
> be the true answer anyhow.
>
> I downloaded the source code for python 3.3.0, as the tbz;
> In the directory "Python-3.3.0/Python", look at Python-ast.c, line 2089 &
> ff.
>
> Clearly a slice is malloced for a slice_ty type.
> It has four elements: kind, lower, upper, and step.
>
> So, tracing it back to the struct definition...
>
> "Include/Python-ast.h"  has "typedef struct _slice *slice_ty;"
>
> And, here's the answer!:
>
> enum _slice_kind {Slice_kind=1, ExtSlice_kind=2, Index_kind=3};
> struct _slice {
>         enum _slice_kind kind;
>         union {
>                 struct {
>                         expr_ty lower;
>                         expr_ty upper;
>                         expr_ty step;
>                 } Slice;
>
>                 struct {
>                         asdl_seq *dims;
>                 } ExtSlice;
>
>                 struct {
>                         expr_ty value;
>                 } Index;
>
>         } v;
> };
>
>
> So, slice() does indeed have arbitrary python types included in it; contrary
> to what I read elsewhere.
> expr_ty is a pointer to an arbitrary expression, so the actual structure is
> 4 pointers, at 32 bits each = 16 bytes.
> The size of the structure itself, given in an earlier post, is 20 bytes --
> which means one more pointer is involved, perhaps the one pointing to the
> slice structure itself.
>
> Hmm...!
>
> An empty tuple gives sys.getsizeof( () ) = 24.
>
> But, I would expect a tuple to be merely a list of object pointers; hence I
> would expect 4 bytes for len(), and then a head pointer 4 bytes, and then a
> pointer for each object.
> 3 objects gives 12 bytes, + 8 = 16 bytes.
>
> Then we need one more pointer so Python knows where the struct is...
> So a Tuple of 3 objects ought to fit nicely into 20 bytes; the same size as
> slice() --
>
> but it's 24, even when empty...
> And 36 when initialized...
> What are the extra 16 bytes for?

Every Python object requires two pieces of data, both of which are
pointer-sized (one is a pointer, one is an int the size of a pointer).
These are: a pointer to the object's type, and the object's reference
count.

A tuple actually does not need a head pointer: the head pointer is
merely an offset from the tuple's pointer. It merely has a ref count,
type, an item count, and pointers to its contents.

A slice has the same type pointer and reference count, then three
pointers to the start, stop, and step objects. This means a slice
object should be the same size as a two-item tuple: the tuple needs a
count, while that is fixed at 3 for a slice (though some items may be
unset).

NOTE: The above is taken from reading the source code for Python 2.6.
For some odd reason, I am getting that an empty tuple consists of 6
pointer-sized objects (48 bytes on x64), rather than the expected 3
pointer-sized (24 bytes on x64). Slices are showing up as the expected
5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1
pointer (8 bytes on x64) per item. I imagine I am missing something,
but cannot figure out what that would be.

>
> All I see is:
> typedef struct { object** whatever } PyTupleObject;
>

[toc] | [prev] | [next] | [standalone]

#32464

From	Andrew Robinson <andrew3@r3dsolutions.com>
Date	2012-10-29 15:39 -0700
Message-ID	<mailman.3071.1351575542.27098.python-list@python.org>
In reply to	#32359

On 10/29/2012 06:49 PM, Chris Kaynor wrote:
> Every Python object requires two pieces of data, both of which are 
> pointer-sized (one is a pointer, one is an int the size of a pointer). 
> These are: a pointer to the object's type, and the object's reference 
> count. A tuple actually does not need a head pointer: the head pointer 
> is merely an offset from the tuple's pointer. It merely has a ref 
> count, type, an item count, and pointers to its contents. A slice has 
> the same type pointer and reference count, then three pointers to the 
> start, stop, and step objects. This means a slice object should be the 
> same size as a two-item tuple: the tuple needs a count, while that is 
> fixed at 3 for a slice (though some items may be unset). NOTE: The 
> above is taken from reading the source code for Python 2.6. For some 
> odd reason, I am getting that an empty tuple consists of 6 
> pointer-sized objects (48 bytes on x64), rather than the expected 3 
> pointer-sized (24 bytes on x64). Slices are showing up as the expected 
> 5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1 
> pointer (8 bytes on x64) per item. I imagine I am missing something, 
> but cannot figure out what that would be.
>> All I see is:
>> typedef struct { object** whatever } PyTupleObject;
>>
It's fairly straight forward in 3.2.0.  I debugged the code with GDB and 
watched.
Perhaps it is the same in 2.6 ?

In addition to those items you mention, of which the reference count is 
not even *inside* the struct -- there is additional debugging 
information not mentioned.  Built in objects contain a "line number", a 
"column number", and a "context" pointer.  These each require a full 
word of storage.

Also, built in types appear to have a "kind" field which indicates the 
object "type" but is not a pointer.  That suggests two "object" type 
indicators, a generic pointer (probably pointing to "builtin"? somewhere 
outside the struct) and a specific one (an enum) inside the "C" struct.

Inside the tuple struct, I count 4 undocumented words of information.
Over all, there is a length, the list of pointers, a "kind", "line", 
"col" and "context"; making 6 pieces in total.

Although your comment says the head pointer is not required; I found in 
3.3.0 that it is a true head pointer; The Tuple() function on line 2069 
of Python-ast.c, (3.3 version) -- is passed in a pointer called *elts.  
That pointer is copied into the Tuple struct.

How ironic,  slices don't have debugging info, that's the main reason 
they are smaller.
When I do slice(3,0,2), suprisingly "Slice()" is NOT called.
But when I do a[1:2:3] it *IS* called.

[toc] | [prev] | [next] | [standalone]

#32466

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-10-29 23:55 -0600
Message-ID	<mailman.3073.1351576533.27098.python-list@python.org>
In reply to	#32359

On Mon, Oct 29, 2012 at 12:00 PM, Andrew Robinson
<andrew3@r3dsolutions.com> wrote:
> I downloaded the source code for python 3.3.0, as the tbz;
> In the directory "Python-3.3.0/Python", look at Python-ast.c, line 2089 &
> ff.

Python-ast.c is part of the compiler code.  That's not the struct used
to represent the object at runtime, but the struct used to represent
the AST node while compiling.

For the runtime definition of slices, look at sliceobject.h and
sliceobject.c.  Slices are represented as:

typedef struct {
    PyObject_HEAD
    PyObject *start, *stop, *step;	/* not NULL */
} PySliceObject;

"PyObject_HEAD" is a macro that incorporates the object type pointer
and the reference count.  Hence, 5 words.

[toc] | [prev] | [next] | [standalone]

Page 2 of 4 — ← Prev page 1 [2] 3 4 Next page →

csiph-web

Negative array indicies and slice()

Contents

#32427

#32434

#32444

#32459

#32476

#32511

#32513

#32518

#32465

#32471

#32436

#32439

#32442

#32443

#32448

#32446

#32450

#32454

#32464

#32466