Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #32329 > unrolled thread
| Started by | andrewr3mail@gmail.com |
|---|---|
| First post | 2012-10-28 20:12 -0700 |
| Last post | 2012-11-01 18:08 -0700 |
| Articles | 20 on this page of 73 — 16 participants |
Back to article view | Back to comp.lang.python
Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 20:12 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-28 21:42 -0600
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 21:00 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-28 22:25 -0600
Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-29 00:54 -0700
Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:18 -0700
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 11:19 +0000
Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-29 22:32 +1100
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-28 21:52 -0700
Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-29 23:40 +1100
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 22:02 +0000
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-28 23:01 -0700
Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 09:52 -0400
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 08:20 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:01 -0600
Re: Negative array indicies and slice() Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-10-30 00:04 +0000
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 16:54 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 02:15 -0600
Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-30 00:53 +1100
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 11:09 -0600
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 22:14 +0000
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 08:42 -0700
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-30 00:02 +0000
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 12:34 -0700
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-30 08:17 +0000
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-30 08:47 -0700
Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 23:48 +0000
Re: Negative array indicies and slice() Michael Torrie <torriem@gmail.com> - 2012-10-30 23:29 -0600
Re: Negative array indicies and slice() Michael Torrie <torriem@gmail.com> - 2012-10-29 23:53 -0600
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 17:04 -0700
Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-30 09:55 +1100
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:07 -0600
Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 19:24 -0400
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 17:43 -0600
Re: Negative array indicies and slice() Roy Smith <roy@panix.com> - 2012-10-29 20:17 -0400
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 18:05 -0600
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 11:00 -0700
Re: Negative array indicies and slice() Chris Kaynor <ckaynor@zindagigames.com> - 2012-10-29 18:49 -0700
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 15:39 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 23:55 -0600
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 00:51 -0600
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 17:17 -0700
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 01:21 -0600
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 01:32 -0600
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 02:46 -0600
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 12:02 -0600
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-30 07:21 -0700
Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 21:33 +0000
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-31 10:07 +0000
Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-31 16:01 +0000
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 15:47 -0600
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 15:55 -0600
Re: Negative array indicies and slice() Chris Angelico <rosuav@gmail.com> - 2012-10-31 09:00 +1100
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-30 16:02 -0600
Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-30 23:30 +0000
Re: Negative array indicies and slice() Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-29 11:27 -0600
Re: Negative array indicies and slice() Andrew Robinson <andrew3@r3dsolutions.com> - 2012-10-29 16:33 -0700
Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-29 00:54 -0700
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-28 21:00 -0700
Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-28 21:09 -0700
Re: Negative array indicies and slice() alex23 <wuwei23@gmail.com> - 2012-10-28 21:44 -0700
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:24 -0700
Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:37 -0700
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:59 -0700
Re: Negative array indicies and slice() Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-29 09:36 +0000
Re: Negative array indicies and slice() Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-29 10:34 +0000
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:59 -0700
Re: Negative array indicies and slice() Paul Rubin <no.email@nospam.invalid> - 2012-10-28 22:14 -0700
Re: Negative array indicies and slice() andrewr3mail@gmail.com - 2012-10-29 01:08 -0700
Re: Negative array indicies and slice() Chris Rebert <clp2@rebertia.com> - 2012-10-29 01:26 -0700
Re: Negative array indicies and slice() Andrew <andrewr3mail@gmail.com> - 2012-10-28 21:09 -0700
Re: Negative array indicies and slice() MRAB <python@mrabarnett.plus.com> - 2012-10-29 03:45 +0000
Re: Negative array indicies and slice() 88888 Dihedral <dihedral88888@googlemail.com> - 2012-11-01 18:08 -0700
Page 2 of 4 — ← Prev page 1 [2] 3 4 Next page →
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-10-29 22:14 +0000 |
| Message-ID | <508effd1$0$29967$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #32359 |
On Mon, 29 Oct 2012 11:19:38 +0000, Steven D'Aprano wrote: > Because xrange represents a concrete sequence of numbers, all three of > start, end and stride must be concrete, known, integers: > > py> xrange(4, None, 2) > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: an integer is required > > Whereas slices can trivially include blanks that get filled in only when > actually used: > > py> "hello world"[aslice] > 'owrd' > py> "NOBODY expects the Spanish Inquisition!"[aslice] > 'D xet h pns nusto!' /me facepalms/ Argggh, I forgot to copy-and-paste the critical line defining aslice: aslice = slice(4, None, 2) Sorry about that. -- Steven
[toc] | [prev] | [next] | [standalone]
| From | Andrew Robinson <andrew3@r3dsolutions.com> |
|---|---|
| Date | 2012-10-29 08:42 -0700 |
| Message-ID | <mailman.3053.1351550917.27098.python-list@python.org> |
| In reply to | #32359 |
On 10/29/2012 10:09 AM, Ian Kelly wrote: > On Oct 29, 2012 7:10 AM, "Andrew Robinson"<andrew3@r3dsolutions.com> wrote: >> I will be porting Python 3.xx to a super low power embedded processor (MSP430), both space and speed are at a premium. >> Running Python on top of Java would be a *SERIOUS* mistake. .NET won't even run on this system. etc. > If that's the case, then running Python at all is probably a mistake. > You know the interpreter alone has an overhead of nearly 6 MB? There's already a version of the python interpreter which fits in under 100K: http://code.google.com/p/python-on-a-chip/ It's not the 3.x series, though; and I don't want to redo this once 2.7 really does become obsolete. >> Yes, I realize that. >> But, why can't I just overload the existing __getitem__ for lists and not bother writing an entire class? > You can just overload that one method in a subclass of list. Being > able to monkey-patch __getitem__ for the list class itself would not > be advisable, as it would affect all list slicing anywhere in your > program and possibly lead to some unexpected behaviors. That's what I am curious about. What unexpected behaviors would a "monkey patch" typically cause? If no one really uses negative and positive indexes in the same slice operation, because there is no reason to do so... It will only break the occasional esoteric application. > > 20 million is nothing. On a 32-bit system, sys.maxsize == 2 ** 31 - > 1. If the error you were seeing was MemoryError, then more likely you > were running into dynamic allocation issues due to fragmentation of > virtual memory. > > No, there was no error at all. Pthon just crashed & exited; not even an exception that I can recall. It was if it exited normally! The list was generated in a single pass by many .append() 's, and then copied once -- the original was left in place; and then I attempted to slice it. I am able to routinely to 5 million length lists, copy, slice, cut, append, and delete from them without this ever happening. If fragmentation were the issue, I'd think the shorter lists would cause the problem after many manipulations... It may not be a bug in python itself, though, of course. There are libraries it uses which might have a bug.
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-10-30 00:02 +0000 |
| Message-ID | <508f1910$0$29967$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #32434 |
On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote:
>>> But, why can't I just overload the existing __getitem__ for lists and
>>> not bother writing an entire class?
You say that as if writing "an entire class" was a big complicated
effort. It isn't. It is trivially simple, a single line:
class MyList(list):
...
plus the __getitem__ definition, which you would have to write anyway. It
is a trivial amount of extra effort.
>> You can just overload that one method in a subclass of list. Being
>> able to monkey-patch __getitem__ for the list class itself would not be
>> advisable, as it would affect all list slicing anywhere in your program
>> and possibly lead to some unexpected behaviors.
>
> That's what I am curious about.
> What unexpected behaviors would a "monkey patch" typically cause?
What part of "unexpected" is unclear?
Monkey-patching is poor programming technique because it leads to
*unexpected* and *impossible to predict* interactions between *distant*
parts of the code. It leads to impossible to predict differences between
the source code on disk and the actual running code. It leads to
impossible to predict differences between documented behaviour and actual
behaviour.
Let me see if I can illustrate a flavour of the sort of things that can
happen if monkey-patching built-ins were allowed.
You create a list and print it:
# simulated output
py> x = [5, 2, 4, 1]
py> print(x)
[1, 2, 4, 5]
What? How did that happen? That's not the list you provided. The order
has been lost.
So you dig deep into your code, and you don't find anything. And you read
the Python documentation for lists, and don't find anything. And you
google the Internet, and don't find anything. And you ask for help, and
everybody says you're crazy because when they duplicate your code they
get the expected behaviour. And you report a bug in Python, and it gets
closed as "cannot replicate".
Finally you search deep into the libraries used in your code, and *five
days later* discover that your code uses library A which uses library B
which uses library C which uses library D which installs a harmless
monkey-patch to print, but only if library E is installed, and you just
happen to have E installed even though your code never uses it, AND that
monkey-patch clashes with a harmless monkey-patch to list.__getitem__
installed by library F. And even though each monkey-patch alone is
harmless, the combination breaks your code's output.
Python allows, but does not encourage, monkey-patching of code written in
pure Python, because it sometimes can be useful. It flat out prohibits
monkey-patching of builtins, because it is just too dangerous.
Ruby allows monkey-patching of everything. And the result was predictable:
http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-ruby/
--
Steven
[toc] | [prev] | [next] | [standalone]
| From | Andrew Robinson <andrew3@r3dsolutions.com> |
|---|---|
| Date | 2012-10-29 12:34 -0700 |
| Message-ID | <mailman.3067.1351564823.27098.python-list@python.org> |
| In reply to | #32444 |
On 10/29/2012 05:02 PM, Steven D'Aprano wrote: > On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote: > >>>> But, why can't I just overload the existing __getitem__ for lists and >>>> not bother writing an entire class? > You say that as if writing "an entire class" was a big complicated > effort. It isn't. It is trivially simple, a single line: > > class MyList(list): > ... No, I don't think it big and complicated. I do think it has timing implications which are undesirable because of how *much* slices are used. In an embedded target -- I have to optimize; and I will have to reject certain parts of Python to make it fit and run fast enough to be useful. >>> You can just overload that one method in a subclass of list. Being >>> able to monkey-patch __getitem__ for the list class itself would not be >>> advisable, as it would affect all list slicing anywhere in your program >>> and possibly lead to some unexpected behaviors. >> That's what I am curious about. >> What unexpected behaviors would a "monkey patch" typically cause? > What part of "unexpected" is unclear? > Ahh -- The I don't know approach! It's only unexpected if one is a bad programmer...! > Let me see if I can illustrate a flavour of the sort of things that can > happen if monkey-patching built-ins were allowed. > > You create a list and print it: > > # simulated output > py> x = [5, 2, 4, 1] > py> print(x) > [1, 2, 4, 5] <snip> Finally you search deep into the libraries used in your code, and *five days later* discover that your code uses library A which uses library B which uses library C which uses library D which installs a harmless monkey-patch to print, but only if library E is installed, and you just happen to have E installed even though your code never uses it, AND that monkey-patch clashes with a harmless monkey-patch to list.__getitem__ installed by library F. And even though each monkey-patch alone is harmless, the combination breaks your code's output. Right, which means that people developing the libraries made contradictory assumptions. > Python allows, but does not encourage, monkey-patching of code written in > pure Python, because it sometimes can be useful. It flat out prohibits > monkey-patching of builtins, because it is just too dangerous. > > Ruby allows monkey-patching of everything. And the result was predictable: > > http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-ruby/ > I read that post carefully; and the author purposely notes that he is exaggerating. BUT Your point is still well taken. What you are talking about is namespace preservation; and I am thinking about it. I can preserve it -- but only if I disallow true Python primitives in my own interpreter; I can't provide two sets in the memory footprint I am using. From my perspective, the version of Python that I compile will not be supported by the normal python help; The predecessor which first forged this path, Pymite, has the same problems -- however, the benefits ought-weigh the disadvantages; and the experiment yielded useful information on what is redundant in Python (eg: range is not supported) and when that redundancy is important for some reason. If someone had a clear explanation of the disadvantages of allowing an iterator, or a tuple -- in place of a slice() -- I would have no qualms dropping the subject. However, I am not finding that yet. I am finding very small optimization issues... The size of an object is at least 8 bytes. Hence, three numbers is going to be at least 24 bytes; and that's 24 bytes in *excess* of the size of slice() or tuple () which are merely containers. So -- There *ARE* savings in memory when using slice(), but it isn't really 2x memory -- its more like 20% -- once the actual objects are considered. The actual *need* for a slice() object still hasn't been demonsrated. I am thinking that the implementation of __getitem__() is very poor probably because of legacy issues. A tuple can also hold None, so ( 1, None, 2 ) is still a valid Tuple. Alternately: An iterator, like xrange(), could be made which takes None as a parameter, or a special value like 'inf'. Since these two values would never be passed to xrange by already developed code, allowing them would not break working code. I am only aware of one possible reason that slice() was once thought to be necessary; and that is because accessing the element of a tuple would recursively call __getitem__ on the tuple. But, even that is easily dismissed once the fixed integer indexes are considered. Your thoughts? Do you have any show stopper insights?
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-10-30 08:17 +0000 |
| Message-ID | <508f8cfd$0$29884$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #32459 |
By the way Andrew, the timestamps on your emails appear to be off, or possibly the time zone. Your posts are allegedly arriving before the posts you reply to, at least according to my news client. On Mon, 29 Oct 2012 12:34:24 -0700, Andrew Robinson wrote: > On 10/29/2012 05:02 PM, Steven D'Aprano wrote: >> On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote: >> >>>>> But, why can't I just overload the existing __getitem__ for lists >>>>> and not bother writing an entire class? >> You say that as if writing "an entire class" was a big complicated >> effort. It isn't. It is trivially simple, a single line: >> >> class MyList(list): >> ... > No, I don't think it big and complicated. I do think it has timing > implications which are undesirable because of how *much* slices are > used. In an embedded target -- I have to optimize; and I will have to > reject certain parts of Python to make it fit and run fast enough to be > useful. Then I look forward to seeing your profiling results that show that the overhead of subclassing list is the bottleneck in your application. Until then, you are making the classic blunder of the premature optimizer: "More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason — including blind stupidity." — W.A. Wulf I am not impressed by performance arguments when you have (apparently) neither identified the bottlenecks in your code, nor even measured the performance. You are essentially *guessing* where the bottlenecks are, and *hoping* that some suggested change will be an optimization rather than a pessimization. Of course I may be wrong, and you have profiled your code and determined that the overhead of inheritance is a problem. If so, that's a different ball game. But your posts so far suggest to me that you're trying to predict performance optimizations rather than measure them. >>>> You can just overload that one method in a subclass of list. Being >>>> able to monkey-patch __getitem__ for the list class itself would not >>>> be advisable, as it would affect all list slicing anywhere in your >>>> program and possibly lead to some unexpected behaviors. >>> That's what I am curious about. >>> What unexpected behaviors would a "monkey patch" typically cause? >> What part of "unexpected" is unclear? >> > Ahh -- The I don't know approach! It's only unexpected if one is a bad > programmer...! No, it is unexpected because once you start relying on monkey-patching, and the libraries you install are monkey-patching, you have a combinational explosion of interactions. Any piece of code, anywhere, could monkey-patch any other piece of code -- it is a form of action-at-a- distance coding, like GOTOs and global variables. Use it with caution, in moderation. >> Let me see if I can illustrate a flavour of the sort of things that can >> happen if monkey-patching built-ins were allowed. [...] > Right, which means that people developing the libraries made > contradictory assumptions. Not necessarily. Not only can monkey-patches conflict, but they can combine in bad ways. It isn't just that Fred assumes X and Barney assumes not-X, but also that Fred assumes X and Barney assumes Y and *nobody* imagined that there was some interaction between X and Y. [...] >> Ruby allows monkey-patching of everything. And the result was >> predictable: >> >> http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying- ruby/ >> >> > I read that post carefully; and the author purposely notes that he is > exaggerating. Not carefully enough. He notes that he was using a provocative title and that he doesn't actually think that Ruby is being destroyed. But the actual harm he describes is real, e.g. bugs that take months to track down. > What you are talking about is namespace preservation; I haven't mentioned namespaces. Nothing I have said has anything to do with namespaces. I remember Apple monkey-patching routines in ROM back in the mid 1980s, long before there was anything corresponding to namespaces in Apple's programming model. > and I am thinking > about it. I can preserve it -- but only if I disallow true Python > primitives in my own interpreter; I can't provide two sets in the memory > footprint I am using. If you want to write a language that is not Python, go right ahead. > If someone had a clear explanation of the disadvantages of allowing an > iterator, or a tuple -- in place of a slice() -- I would have no qualms > dropping the subject. However, I am not finding that yet. I am finding > very small optimization issues... Python has certain public interfaces. If your language does not support those public interfaces, then it might be an awesome language, but it is not Python. Python slices have certain features: * they can be used repeatedly; * they have public attributes start, step and stop; * The stop attribute can be None, and the slice will default to the length of the thing being sliced, which is not known until call-time. * Slices have a public indices method. These are known characteristics of slices, and there is Python code that relies on it. If your language does not support these, then it is not Python. Iterators cannot replace slices, because once you have looked up an iterator value, that value is gone, never to be seen again. Tuples cannot replace slices, because tuples do not have start, step and stop attributes; most tuples have no need of them, and if you care about memory footprint, the last thing you want is to give every tuple three unused or meaningless named attributes. Besides, what values should they take in an empty tuple? xrange cannot replaces slices, because there is no way to create an xrange object with an empty or missing end; besides, xrange objects have no public start/stop/step attributes. And none of them have a public indices method. > The actual *need* for a slice() object still hasn't been demonsrated. Hell, the actual *need* for slicing hasn't been demonstrated! Or Python! Since C doesn't have slicing, nobody needs it! Am I right? Needed or not, that is what Python has, so if your language doesn't have it, it isn't Python. > I > am thinking that the implementation of __getitem__() is very poor > probably because of legacy issues. You haven't demonstrated that it is very poor. > A tuple can also hold None, so ( 1, None, 2 ) is still a valid Tuple. Sure. And a tuple can also be () or (1, "x", [], None, None, None, None). Now your list.__getitem__ code has to deal with those instead. > Alternately: An iterator, like xrange(), could be made which takes None > as a parameter, or a special value like 'inf'. So you want to get rid of slices because you want to save every byte of memory you can, and your way to do that is to introduce a new, *heavier* type that does more than slices? Oh man, that's funny. -- Steven
[toc] | [prev] | [next] | [standalone]
| From | Andrew Robinson <andrew3@r3dsolutions.com> |
|---|---|
| Date | 2012-10-30 08:47 -0700 |
| Message-ID | <mailman.3110.1351637364.27098.python-list@python.org> |
| In reply to | #32476 |
On 10/30/2012 01:17 AM, Steven D'Aprano wrote: > By the way Andrew, the timestamps on your emails appear to be off, or > possibly the time zone. Your posts are allegedly arriving before the > posts you reply to, at least according to my news client. :D -- yes, I know about that problem. Every time I reboot it shows up again... It's a distribution issue, my hardware clock is in local time -- but when the clock is read by different scripts in my distribution, some refuse to accept that the system clock is not UTC. I'll be upgrading in a few weeks -- so I'm just limping along until then. My apology. > Then I look forward to seeing your profiling results that show that the > overhead of subclassing list is the bottleneck in your application. > > Until then, you are making the classic blunder of the premature optimizer: > > "More computing sins are committed in the name of efficiency (without > necessarily achieving it) than for any other single reason — including > blind stupidity." — W.A. Wulf I'm sure that's true. Optimization, though, is a very general word. On a highway in my neighborhood -- the government keeps trying to put more safety restrictions on it, because it statistically registers as the "highest accident rate road" in the *entire* region. Naturally, the government assumes that people in my neighborhood are worse drivers than usual and need to be policed more -- but the truth is, that highway is the *ONLY* access road in the region for dozens of miles in any direction for a densely populated area, so if there is going to be an accident it will happen there; the extra safety precautions are not necessary when the accident rate is looked at from a per-capita perspective of those driving the highway. I haven't made *the* blunder of the premature optimizer because I haven't implemented anything yet. Premature optimizers don't bother to hold public conversation and take correction. OTOH: people who don't ever optimize out of fear, pay an increasing bloat price with time. > I am not impressed by performance arguments when you have (apparently) > neither identified the bottlenecks in your code, nor even measured the > performance. Someone else already did a benchmark between a discrete loop and a slice operation. The difference in speed was an order of magnitude different. I bench-marked a map operation, which was *much* better -- but also still very slow in comparison. Let's not confound an issue here -- I am going to implement the python interpreter; and am not bound by optimization considerations of the present python interpreter -- There are things I can do which as a python programmer -- you can't. I have no choice but to re-implement and optimize the interpreter -- the question is merely how to go about it. > You are essentially *guessing* where the bottlenecks are, > and *hoping* that some suggested change will be an optimization rather > than a pessimization. > > Of course I may be wrong, and you have profiled your code and determined > that the overhead of inheritance is a problem. If so, that's a different > ball game. But your posts so far suggest to me that you're trying to > predict performance optimizations rather than measure them. Not really; Inheritance itself and it's timing aren't my main concern. Even if the time was *0* that wouldn't change my mind. There are man hours in debugging time caused by not being able to wrap around in a slice. (I am not ignoring the contrary man hours of an API change's bugs). Human psychology is important; and it's a double edged sword. I would refer you to a book written by Steve Maguire, Writing Solid Code; Chapter 5; Candy machine interfaces. He uses the "C" function "realloc()" as an excellent example of a bad API; but still comments on one need that it *does* fulfill -- "I've found it better to have one function that both shrinks and expands blocks so that I don't have to write *ifs* constructs every time I need to resize memory. True, I give up some extra argument checking, but this is offset by the *ifs* that I no longer need to write (*and possibly mess up*). * Extra steps that a programmer must take to achieve a task are places where bugs get introduced. * API's which must be debugged to see what particular operation it is performing rather than knowing what that operation is from looking at the un-compiled code are places where bugs get introduced. These two points are not friendly with each other -- they are in fact, generally in conflict. >> Right, which means that people developing the libraries made >> contradictory assumptions. > Not necessarily. Not only can monkey-patches conflict, but they can > combine in bad ways. It isn't just that Fred assumes X and Barney assumes > not-X, but also that Fred assumes X and Barney assumes Y and *nobody* > imagined that there was some interaction between X and Y. They *STILL* made contradictory assumptions; each of them assumed the interaction mechanism would not be applied in a certain way -- and then used it based on that assumption. > Not carefully enough. He notes that he was using a provocative title and > that he doesn't actually think that Ruby is being destroyed. But the > actual harm he describes is real, e.g. bugs that take months to track > down. Yes. I read that *carefully*. BTW: I'm not planning on monkey patching -- I did say your comments were well taken. >> What you are talking about is namespace preservation; > I haven't mentioned namespaces. Nothing I have said has anything to do > with namespaces. keywords are a namespace. xrange is a keyword, etc. You don't want pre-defined API method name to have its operation altered; you want the original names preserved. Overloading pollutes namespaces, etc. > If you want to write a language that is not Python, go right ahead. That's not the only possibility. It will still be called Python and for good reason. > >> If someone had a clear explanation of the disadvantages of allowing an >> iterator, or a tuple -- in place of a slice() -- I would have no qualms >> dropping the subject. However, I am not finding that yet. I am finding >> very small optimization issues... > Python has certain public interfaces. If your language does not support > those public interfaces, then it might be an awesome language, but it is > not Python. I didn't say I wouldn't support it. How do you mean this? > Iterators cannot replace slices, because once you have looked up an > iterator value, that value is gone, never to be seen again. Yes they can replace slices, but not always. For example; Lets say an iterator were allowed to replace a slice; then a[ 1:3 ] would still be a slice, but a[ xrange(1,3) ] would not. The second piece of code does not deny that slices exist, it just allows an iterator to be passed to __getitems__. > Tuples cannot replace slices, because tuples do not have start, step and > stop attributes; I refer you to my comments to IAN. > > And none of them have a public indices method. Hmm.. a new point. Do you have a link? > >> The actual *need* for a slice() object still hasn't been demonsrated. > Hell, the actual *need* for slicing hasn't been demonstrated! Or Python! > Since C doesn't have slicing, nobody needs it! Am I right? I asked about the need for slices() not the others, in the OP. Thanks for your comments on the names, etc. That's helpful -- and I hope I haven't driven your blood pressure up too much unintentionally. > Needed or not, that is what Python has, so if your language doesn't have > it, it isn't Python. Fine. But if mine still has these, but only as an *optional* wrapper function -- it still *RUNS* all python progams. > Sure. And a tuple can also be () or (1, "x", [], None, None, None, None). > Now your list.__getitem__ code has to deal with those instead. > And the present __getitem__ can be passed a perverted slice() as IAN demonstrated and the bug-list examined as "goofy". These strange things simply cause an exception. > So you want to get rid of slices because you want to save every byte > of memory you can, and your way to do that is to introduce a new, > *heavier* type that does more than slices? Oh man, that's funny. I'm not Introducing a new type ! Besides; tuples will likely be lighter on memory in my target and I don't recall saying a tuple does more than slices. I did say that *iterators* were more *flexible* -- is that what you are thinking of?
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2012-10-30 23:48 +0000 |
| Message-ID | <mailman.3112.1351640879.27098.python-list@python.org> |
| In reply to | #32476 |
On 30/10/2012 15:47, Andrew Robinson wrote: > > I would refer you to a book written by Steve Maguire, Writing Solid > Code; Chapter 5; Candy machine interfaces. > The book that took a right hammering here http://accu.org/index.php?module=bookreviews&func=search&rid=467 ? -- Cheers. Mark Lawrence.
[toc] | [prev] | [next] | [standalone]
| From | Michael Torrie <torriem@gmail.com> |
|---|---|
| Date | 2012-10-30 23:29 -0600 |
| Message-ID | <mailman.3116.1351661388.27098.python-list@python.org> |
| In reply to | #32476 |
On 10/30/2012 09:47 AM, Andrew Robinson wrote: > Let's not confound an issue here -- I am going to implement the python > interpreter; and am not bound by optimization considerations of the > present python interpreter -- There are things I can do which as a > python programmer -- you can't. I have no choice but to re-implement > and optimize the interpreter -- the question is merely how to go about it. As this is the case, why this long discussion? If you are arguing for a change in Python to make it compatible with what this fork you are going to create will do, this has already been fairly thoroughly addressed earl on, and reasons why the semantics will not change anytime soon have been given.
[toc] | [prev] | [next] | [standalone]
| From | Michael Torrie <torriem@gmail.com> |
|---|---|
| Date | 2012-10-29 23:53 -0600 |
| Message-ID | <mailman.3072.1351576425.27098.python-list@python.org> |
| In reply to | #32444 |
On 10/29/2012 01:34 PM, Andrew Robinson wrote: > No, I don't think it big and complicated. I do think it has timing > implications which are undesirable because of how *much* slices are used. > In an embedded target -- I have to optimize; and I will have to reject > certain parts of Python to make it fit and run fast enough to be useful. Since you can't port the full Python system to your embedded machine anyway, why not just port a subset of python and modify it to suit your needs right there in the C code. It would be a fork, yes, but any port to this target will be a fork anyway. No matter how you cut it, it won't be easy at all, and won't be easy to maintain. You'll basically be writing your own implementation of Python (that's what python-on-a-chip is, and that's why it's closer to Python 2.x than Python 3). That's totally fine, though. I get the impression you think you will be able to port cPython as is to your target. Without a libc, an MMU on the CPU, and a kernel, it's not going to just compile and run. Anyway, the only solution, given your constraints, is to implement your own python interpreter to handle a subset of Python, and modify it to suit your tastes. What you want with slicing behavior changes has no place in the normal cPython implementation, for a lot of reasons. The main one is that it is already possible to implement what you are talking about in your own python class, which is a fine solution for a normal computer with memory and CPU power available.
[toc] | [prev] | [next] | [standalone]
| From | Andrew Robinson <andrew3@r3dsolutions.com> |
|---|---|
| Date | 2012-10-29 17:04 -0700 |
| Message-ID | <mailman.3078.1351581043.27098.python-list@python.org> |
| In reply to | #32444 |
On 10/29/2012 10:53 PM, Michael Torrie wrote: > On 10/29/2012 01:34 PM, Andrew Robinson wrote: >> No, I don't think it big and complicated. I do think it has timing >> implications which are undesirable because of how *much* slices are used. >> In an embedded target -- I have to optimize; and I will have to reject >> certain parts of Python to make it fit and run fast enough to be useful. > Since you can't port the full Python system to your embedded machine > anyway, why not just port a subset of python and modify it to suit your > needs right there in the C code. It would be a fork, yes, You're exactly right; That's what I *know* I am faced with. > Without a libc, an MMU on the CPU, and a kernel, it's not going to just compile and run. I have libc. The MMU is a problem; but the compiler implements the standard "C" math library; floats, though, instead of doubles. That's the only problem -- there. > What you want with slicing behavior changes has no > place in the normal cPython implementation, for a lot of reasons. The > main one is that it is already possible to implement what you are > talking about in your own python class, which is a fine solution for a > normal computer with memory and CPU power available. If the tests I outlined in the previous post inaccurately describe a major performance improvement and at least a modest code size reduction; or will *often* introduce bugs -- I *AGREE* with you. Otherwise, I don't. I don't think wasting extra CPU power is a good thing -- Extra CPU power can always be used by something else.... I won't belabor the point further. I'd love to see a counter example to the specific criteria I just provided to IAN -- it would end my quest; and be a good reference to point others to.
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-10-30 09:55 +1100 |
| Message-ID | <mailman.3054.1351551326.27098.python-list@python.org> |
| In reply to | #32359 |
On Tue, Oct 30, 2012 at 2:42 AM, Andrew Robinson <andrew3@r3dsolutions.com> wrote: > No, there was no error at all. Pthon just crashed & exited; not even an > exception that I can recall. It was if it exited normally! Can you create a reproducible test case? There's usually a cause to these sorts of things. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-29 17:07 -0600 |
| Message-ID | <mailman.3056.1351552107.27098.python-list@python.org> |
| In reply to | #32359 |
On Mon, Oct 29, 2012 at 9:42 AM, Andrew Robinson <andrew3@r3dsolutions.com> wrote: > The list was generated in a single pass by many .append() 's, and then > copied once -- the original was left in place; and then I attempted to slice > it. Note that if the list was generated by .appends, then it was copied more than once. Python reserves a specific amount of space for the list. When it grows past that, the list must be reallocated and copied. It grows the list exponentially in order to keep the amortized time complexity of append at O(1), but the point is that a list of 20 million items is going to be resized and copied several times before it is complete.
[toc] | [prev] | [next] | [standalone]
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2012-10-29 19:24 -0400 |
| Message-ID | <roy-4A06AD.19241029102012@news.panix.com> |
| In reply to | #32439 |
In article <mailman.3056.1351552107.27098.python-list@python.org>, Ian Kelly <ian.g.kelly@gmail.com> wrote: > On Mon, Oct 29, 2012 at 9:42 AM, Andrew Robinson > <andrew3@r3dsolutions.com> wrote: > > The list was generated in a single pass by many .append() 's, and then > > copied once -- the original was left in place; and then I attempted to slice > > it. > > Note that if the list was generated by .appends, then it was copied > more than once. Python reserves a specific amount of space for the > list. When it grows past that, the list must be reallocated and > copied. It grows the list exponentially in order to keep the > amortized time complexity of append at O(1), but the point is that a > list of 20 million items is going to be resized and copied several > times before it is complete. I think you're missing the point of "amortized constant time". Yes, the first item appended to the list will be copied lg(20,000,000) ~= 25 times, because the list will be resized that many times(*). But, on average (I'm not sure if "average" is strictly the correct word here), each item will be copied only once. Still, it always stuck me as odd that there's no preallocate() method. There are times when you really do know how many items you're going to add to the list, and doing a single allocation would be a win. And it doesn't cost anything to provide it. I suppose, however, if you're adding enough items that preallocating would really matter, then maybe you want an array instead. (*) I don't know the exact implementation; I'm assuming each resize is a factor of 2.
[toc] | [prev] | [next] | [standalone]
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-29 17:43 -0600 |
| Message-ID | <mailman.3057.1351554215.27098.python-list@python.org> |
| In reply to | #32442 |
On Mon, Oct 29, 2012 at 5:24 PM, Roy Smith <roy@panix.com> wrote: > I think you're missing the point of "amortized constant time". Yes, the > first item appended to the list will be copied lg(20,000,000) ~= 25 > times, because the list will be resized that many times(*). But, on > average (I'm not sure if "average" is strictly the correct word here), > each item will be copied only once. > > Still, it always stuck me as odd that there's no preallocate() method. > There are times when you really do know how many items you're going to > add to the list, and doing a single allocation would be a win. And it > doesn't cost anything to provide it. I suppose, however, if you're > adding enough items that preallocating would really matter, then maybe > you want an array instead. > > (*) I don't know the exact implementation; I'm assuming each resize is a > factor of 2. The growth factor is approximately 1.125. "Approximately" because there is also a small constant term. The average number of copies per item converges on 8.
[toc] | [prev] | [next] | [standalone]
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2012-10-29 20:17 -0400 |
| Message-ID | <roy-31BCE0.20171729102012@news.panix.com> |
| In reply to | #32443 |
In article <mailman.3057.1351554215.27098.python-list@python.org>, Ian Kelly <ian.g.kelly@gmail.com> wrote: > On Mon, Oct 29, 2012 at 5:24 PM, Roy Smith <roy@panix.com> wrote: > > I think you're missing the point of "amortized constant time". Yes, the > > first item appended to the list will be copied lg(20,000,000) ~= 25 > > times, because the list will be resized that many times(*). But, on > > average (I'm not sure if "average" is strictly the correct word here), > > each item will be copied only once. > > > > Still, it always stuck me as odd that there's no preallocate() method. > > There are times when you really do know how many items you're going to > > add to the list, and doing a single allocation would be a win. And it > > doesn't cost anything to provide it. I suppose, however, if you're > > adding enough items that preallocating would really matter, then maybe > > you want an array instead. > > > > (*) I don't know the exact implementation; I'm assuming each resize is a > > factor of 2. > > The growth factor is approximately 1.125. "Approximately" because > there is also a small constant term. The average number of copies per > item converges on 8. Wow, that's surprising. It also makes it that much more surprising that there's no way to pre-allocate.
[toc] | [prev] | [next] | [standalone]
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-29 18:05 -0600 |
| Message-ID | <mailman.3059.1351555560.27098.python-list@python.org> |
| In reply to | #32442 |
On Mon, Oct 29, 2012 at 5:43 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote: > The growth factor is approximately 1.125. "Approximately" because > there is also a small constant term. The average number of copies per > item converges on 8. Of course, that is the *maximum* number of copies. The actual number could be much less if realloc() performs well.
[toc] | [prev] | [next] | [standalone]
| From | Andrew Robinson <andrew3@r3dsolutions.com> |
|---|---|
| Date | 2012-10-29 11:00 -0700 |
| Message-ID | <mailman.3062.1351559168.27098.python-list@python.org> |
| In reply to | #32359 |
On 10/29/2012 06:53 AM, Chris Angelico wrote:
> Can you provide links to these notes? I'm looking at
> cpython/Include/sliceobject.h that has this comment:
>
> /*
>
> A slice object containing start, stop, and step data members (the
> names are from range). After much talk with Guido, it was decided to
> let these be any arbitrary python type. Py_None stands for omitted values.
> */
>
> Also, the code for slice objects in CPython works with Py_ssize_t (a
> signed quantity of the same length as size_t), which will allow at
> least 2**31 for an index. I would guess that your crashes were nothing
> to do with 20 million elements and slices.
>
> ChrisA
Let's look at the source code rather than the web notes -- the source
must be the true answer anyhow.
I downloaded the source code for python 3.3.0, as the tbz;
In the directory "Python-3.3.0/Python", look at Python-ast.c, line 2089
& ff.
Clearly a slice is malloced for a slice_ty type.
It has four elements: kind, lower, upper, and step.
So, tracing it back to the struct definition...
"Include/Python-ast.h" has "typedef struct _slice *slice_ty;"
And, here's the answer!:
enum _slice_kind {Slice_kind=1, ExtSlice_kind=2, Index_kind=3};
struct _slice {
enum _slice_kind kind;
union {
struct {
expr_ty lower;
expr_ty upper;
expr_ty step;
} Slice;
struct {
asdl_seq *dims;
} ExtSlice;
struct {
expr_ty value;
} Index;
} v;
};
So, slice() does indeed have arbitrary python types included in it;
contrary to what I read elsewhere.
expr_ty is a pointer to an arbitrary expression, so the actual structure
is 4 pointers, at 32 bits each = 16 bytes.
The size of the structure itself, given in an earlier post, is 20 bytes
-- which means one more pointer is involved, perhaps the one pointing to
the slice structure itself.
Hmm...!
An empty tuple gives sys.getsizeof( () ) = 24.
But, I would expect a tuple to be merely a list of object pointers;
hence I would expect 4 bytes for len(), and then a head pointer 4 bytes,
and then a pointer for each object.
3 objects gives 12 bytes, + 8 = 16 bytes.
Then we need one more pointer so Python knows where the struct is...
So a Tuple of 3 objects ought to fit nicely into 20 bytes; the same size
as slice() --
but it's 24, even when empty...
And 36 when initialized...
What are the extra 16 bytes for?
All I see is:
typedef struct { object** whatever } PyTupleObject;
[toc] | [prev] | [next] | [standalone]
| From | Chris Kaynor <ckaynor@zindagigames.com> |
|---|---|
| Date | 2012-10-29 18:49 -0700 |
| Message-ID | <mailman.3064.1351561789.27098.python-list@python.org> |
| In reply to | #32359 |
On Mon, Oct 29, 2012 at 11:00 AM, Andrew Robinson
<andrew3@r3dsolutions.com> wrote:
>
> Let's look at the source code rather than the web notes -- the source must
> be the true answer anyhow.
>
> I downloaded the source code for python 3.3.0, as the tbz;
> In the directory "Python-3.3.0/Python", look at Python-ast.c, line 2089 &
> ff.
>
> Clearly a slice is malloced for a slice_ty type.
> It has four elements: kind, lower, upper, and step.
>
> So, tracing it back to the struct definition...
>
> "Include/Python-ast.h" has "typedef struct _slice *slice_ty;"
>
> And, here's the answer!:
>
> enum _slice_kind {Slice_kind=1, ExtSlice_kind=2, Index_kind=3};
> struct _slice {
> enum _slice_kind kind;
> union {
> struct {
> expr_ty lower;
> expr_ty upper;
> expr_ty step;
> } Slice;
>
> struct {
> asdl_seq *dims;
> } ExtSlice;
>
> struct {
> expr_ty value;
> } Index;
>
> } v;
> };
>
>
> So, slice() does indeed have arbitrary python types included in it; contrary
> to what I read elsewhere.
> expr_ty is a pointer to an arbitrary expression, so the actual structure is
> 4 pointers, at 32 bits each = 16 bytes.
> The size of the structure itself, given in an earlier post, is 20 bytes --
> which means one more pointer is involved, perhaps the one pointing to the
> slice structure itself.
>
> Hmm...!
>
> An empty tuple gives sys.getsizeof( () ) = 24.
>
> But, I would expect a tuple to be merely a list of object pointers; hence I
> would expect 4 bytes for len(), and then a head pointer 4 bytes, and then a
> pointer for each object.
> 3 objects gives 12 bytes, + 8 = 16 bytes.
>
> Then we need one more pointer so Python knows where the struct is...
> So a Tuple of 3 objects ought to fit nicely into 20 bytes; the same size as
> slice() --
>
> but it's 24, even when empty...
> And 36 when initialized...
> What are the extra 16 bytes for?
Every Python object requires two pieces of data, both of which are
pointer-sized (one is a pointer, one is an int the size of a pointer).
These are: a pointer to the object's type, and the object's reference
count.
A tuple actually does not need a head pointer: the head pointer is
merely an offset from the tuple's pointer. It merely has a ref count,
type, an item count, and pointers to its contents.
A slice has the same type pointer and reference count, then three
pointers to the start, stop, and step objects. This means a slice
object should be the same size as a two-item tuple: the tuple needs a
count, while that is fixed at 3 for a slice (though some items may be
unset).
NOTE: The above is taken from reading the source code for Python 2.6.
For some odd reason, I am getting that an empty tuple consists of 6
pointer-sized objects (48 bytes on x64), rather than the expected 3
pointer-sized (24 bytes on x64). Slices are showing up as the expected
5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1
pointer (8 bytes on x64) per item. I imagine I am missing something,
but cannot figure out what that would be.
>
> All I see is:
> typedef struct { object** whatever } PyTupleObject;
>
[toc] | [prev] | [next] | [standalone]
| From | Andrew Robinson <andrew3@r3dsolutions.com> |
|---|---|
| Date | 2012-10-29 15:39 -0700 |
| Message-ID | <mailman.3071.1351575542.27098.python-list@python.org> |
| In reply to | #32359 |
On 10/29/2012 06:49 PM, Chris Kaynor wrote:
> Every Python object requires two pieces of data, both of which are
> pointer-sized (one is a pointer, one is an int the size of a pointer).
> These are: a pointer to the object's type, and the object's reference
> count. A tuple actually does not need a head pointer: the head pointer
> is merely an offset from the tuple's pointer. It merely has a ref
> count, type, an item count, and pointers to its contents. A slice has
> the same type pointer and reference count, then three pointers to the
> start, stop, and step objects. This means a slice object should be the
> same size as a two-item tuple: the tuple needs a count, while that is
> fixed at 3 for a slice (though some items may be unset). NOTE: The
> above is taken from reading the source code for Python 2.6. For some
> odd reason, I am getting that an empty tuple consists of 6
> pointer-sized objects (48 bytes on x64), rather than the expected 3
> pointer-sized (24 bytes on x64). Slices are showing up as the expected
> 5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1
> pointer (8 bytes on x64) per item. I imagine I am missing something,
> but cannot figure out what that would be.
>> All I see is:
>> typedef struct { object** whatever } PyTupleObject;
>>
It's fairly straight forward in 3.2.0. I debugged the code with GDB and
watched.
Perhaps it is the same in 2.6 ?
In addition to those items you mention, of which the reference count is
not even *inside* the struct -- there is additional debugging
information not mentioned. Built in objects contain a "line number", a
"column number", and a "context" pointer. These each require a full
word of storage.
Also, built in types appear to have a "kind" field which indicates the
object "type" but is not a pointer. That suggests two "object" type
indicators, a generic pointer (probably pointing to "builtin"? somewhere
outside the struct) and a specific one (an enum) inside the "C" struct.
Inside the tuple struct, I count 4 undocumented words of information.
Over all, there is a length, the list of pointers, a "kind", "line",
"col" and "context"; making 6 pieces in total.
Although your comment says the head pointer is not required; I found in
3.3.0 that it is a true head pointer; The Tuple() function on line 2069
of Python-ast.c, (3.3 version) -- is passed in a pointer called *elts.
That pointer is copied into the Tuple struct.
How ironic, slices don't have debugging info, that's the main reason
they are smaller.
When I do slice(3,0,2), suprisingly "Slice()" is NOT called.
But when I do a[1:2:3] it *IS* called.
[toc] | [prev] | [next] | [standalone]
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-10-29 23:55 -0600 |
| Message-ID | <mailman.3073.1351576533.27098.python-list@python.org> |
| In reply to | #32359 |
On Mon, Oct 29, 2012 at 12:00 PM, Andrew Robinson
<andrew3@r3dsolutions.com> wrote:
> I downloaded the source code for python 3.3.0, as the tbz;
> In the directory "Python-3.3.0/Python", look at Python-ast.c, line 2089 &
> ff.
Python-ast.c is part of the compiler code. That's not the struct used
to represent the object at runtime, but the struct used to represent
the AST node while compiling.
For the runtime definition of slices, look at sliceobject.h and
sliceobject.c. Slices are represented as:
typedef struct {
PyObject_HEAD
PyObject *start, *stop, *step; /* not NULL */
} PySliceObject;
"PyObject_HEAD" is a macro that incorporates the object type pointer
and the reference count. Hence, 5 words.
[toc] | [prev] | [next] | [standalone]
Page 2 of 4 — ← Prev page 1 [2] 3 4 Next page →
Back to top | Article view | comp.lang.python
csiph-web