From: Aaron Brady
Newsgroups: comp.lang.python
Subject: Re: set and dict iteration
Date: Mon, 3 Sep 2012 17:24:02 -0700 (PDT)

On Monday, September 3, 2012 3:28:28 PM UTC-5, Dave Angel wrote:
> On 09/03/2012 04:04 PM, Aaron Brady wrote:
> > On Monday, September 3, 2012 2:30:24 PM UTC-5, Ian wrote:
> >> On Sun, Sep 2, 2012 at 11:43 AM, Aaron Brady wrote:
> >>> We could use a Python long object for the version index to prevent
> >>> overflow. Combined with P. Rubin's idea to count the number of open
> >>> iterators, most use cases still wouldn't exceed a single word
> >>> comparison; we could reset the counter when there weren't any.
> >>
> >> We could use a Python long; I just don't think the extra overhead is
> >> justified in a data structure that is already highly optimized for
> >> speed. Incrementing and testing a C int is *much* faster than doing
> >> the same with a Python long.
> >
> > I think the technique would require two python longs and a bool in the
> > set, and a python long in the iterator.
> >
> > One long counts the number of existing (open) iterators. Another counts
> > the version. The bool keeps track of whether an iterator has been
> > created since the last modification, in which case the next modification
> > requires incrementing the version and resetting the flag.
>
> I think you're over-engineering the problem. it's a bug if an iterator
> is used after some change is made to the set it's iterating over. We
> don't need to catch every possible instance of the bug, that's what
> testing is for. The point is to "probably" detect it, and for that, all
> we need is a counter in the set and a counter in the open iterator.
> Whenever changing the set, increment its count. And whenever iterating,
> check the two counters. if they don't agree, throw an exception, and
> destroy the iterator. i suppose that could be done with a flag, but it
> could just as easily be done by zeroing the pointer to the set.
>
> I'd figure a byte or two would be good enough for the counts, but a C
> uint would be somewhat faster, and wouldn't wrap as quickly.
>
> --
> DaveA

Hi D. Angel,

The serial index constantly reminds me of upper limits. I have the same problem with PHP arrays, though it's not a problem with the language itself.

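To make the "upper limits" worry concrete, here is a minimal Python sketch of the plain counter scheme (one counter in the set, one in the iterator). It is only an illustration, not how CPython's set is implemented. The names VersionedSet, COUNTER_BITS and COUNTER_MASK are made up for this post, and the counter is deliberately one byte wide so the wrap-around is easy to see.

```python
COUNTER_BITS = 8  # hypothetical width: "a byte or two"
COUNTER_MASK = (1 << COUNTER_BITS) - 1


class VersionedSet:
    """Toy wrapper around set with a fixed-width modification counter."""

    def __init__(self, items=()):
        self._items = set(items)
        self._version = 0

    def add(self, item):
        self._items.add(item)
        # Every call counts as a modification (even for an item that is
        # already present); the counter wraps at 2**COUNTER_BITS.
        self._version = (self._version + 1) & COUNTER_MASK

    def __iter__(self):
        # The iterator remembers the counter value it started with.
        expected = self._version
        for item in list(self._items):  # walk a snapshot of the items
            if self._version != expected:
                raise RuntimeError("set changed during iteration")
            yield item
```

With this sketch, one add() between two next() calls raises RuntimeError, but exactly 256 add() calls bring the counter back to its starting value and the change goes unnoticed. That is the "probably detect it" trade-off.
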
The linked list doesn't have a counter; it invalidates iterators when a modification is made, so it's the "correct" structure in my interpretation. But it does seem more precarious by comparison, IMHO.

Both strategies solve the problem I posed originally, both involve trade-offs, and it's too late to include either in 3.3.0.
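
For comparison, a rough Python sketch of the "two longs and a bool" variant I described earlier in the thread. It assumes a wrapper class rather than the real C set object, and TrackedSet, TrackedSetIterator and every attribute name are hypothetical. The version only advances when an iterator has been created since the last modification, and it is reset once no iterators remain open.

```python
class TrackedSet:
    """Toy set wrapper: version, open-iterator count, and a flag."""

    def __init__(self, items=()):
        self._items = set(items)
        self._version = 0             # Python int, so it cannot overflow
        self._open_iterators = 0      # iterators not yet closed
        self._iter_since_mod = False  # iterator created since last change?

    def add(self, item):
        self._items.add(item)
        # Bump the version only if some iterator could observe the change.
        if self._iter_since_mod:
            self._version += 1
            self._iter_since_mod = False

    def __iter__(self):
        return TrackedSetIterator(self)

    def _iterator_closed(self):
        self._open_iterators -= 1
        if self._open_iterators == 0:
            # Nobody holds an expected version any more: safe to reset.
            self._version = 0
            self._iter_since_mod = False


class TrackedSetIterator:
    def __init__(self, owner):
        self._closed = False
        self._owner = owner
        self._expected = owner._version
        self._snapshot = iter(list(owner._items))
        owner._open_iterators += 1
        owner._iter_since_mod = True

    def __iter__(self):
        return self

    def __next__(self):
        if self._closed:
            raise StopIteration
        if self._owner._version != self._expected:
            self.close()  # "destroy" the stale iterator
            raise RuntimeError("set changed during iteration")
        try:
            return next(self._snapshot)
        except StopIteration:
            self.close()
            raise

    def close(self):
        if not self._closed:
            self._closed = True
            self._owner._iterator_closed()

    def __del__(self):
        # Stands in for the C-level dealloc of an abandoned iterator.
        self.close()
```

In this sketch, modifications made while no iterator exists leave the version untouched, so the version grows only with pairs of (iterator created, set modified). The cost is the extra bookkeeping on every iterator creation and teardown, which is the overhead Ian objected to.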