Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!news.albasani.net!news.stack.nl!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.005 X-Spam-Evidence: '*H*': 0.99; '*S*': 0.00; 'element': 0.07; 'none:': 0.07; 'rename': 0.07; 'exception,': 0.09; 'iterate': 0.09; 'lst': 0.09; 'subject:iterable': 0.09; 'val': 0.09; 'cc:addr:python- list': 0.11; 'def': 0.12; 'wrote': 0.14; '[1].': 0.16; 'bit.': 0.16; 'collections': 0.16; 'compatible,': 0.16; 'from:addr:rosuav': 0.16; 'from:name:chris angelico': 0.16; 'iterable)': 0.16; 'iterable,': 0.16; 'iterables': 0.16; 'iterating': 0.16; 'iteration': 0.16; 'iterator': 0.16; 'iterator,': 0.16; 'iterators': 0.16; 'once.': 0.16; 'roy': 0.16; 'rule.': 0.16; 'set,': 0.16; 'subclass': 0.16; 'wrote:': 0.18; 'implementing': 0.19; 'thu,': 0.19; 'meant': 0.20; '>>>': 0.22; 'previously': 0.22; 'cc:addr:python.org': 0.22; 'load': 0.23; 'case.': 0.24; 'library,': 0.24; 'space.': 0.24; 'cc:2**0': 0.24; 'this:': 0.26; 'least': 0.26; 'distribute': 0.26; 'asking': 0.27; 'skip:_ 20': 0.27; 'values': 0.27; 'header:In-Reply-To:1': 0.27; 'specifically': 0.29; '(this': 0.29; '[1]': 0.29; 'raise': 0.29; '(like': 0.30; 'matching': 0.30; 'message-id:@mail.gmail.com': 0.30; "i'm": 0.30; 'getting': 0.31; 'requests': 0.31; 'bunch': 0.31; 'sep': 0.31; 'though.': 0.31; 'with,': 0.31; 'class': 0.32; 'lists': 0.32; 'probably': 0.32; 'cases': 0.33; 'minimal': 0.33; 'skip:_ 10': 0.34; 'could': 0.34; 'something': 0.35; 'objects': 0.35; 'operations': 0.35; 'but': 0.35; 'received:google.com': 0.35; 'there': 0.35; 'consistent': 0.36; 'i.e.': 0.36; 'ordered': 0.36; 'returning': 0.36; 'set.': 0.36; 'subject:?': 0.36; 'example,': 0.37; 'list': 0.37; 'starting': 0.37; 'server': 0.38; 'skip:[ 10': 0.38; 'pm,': 0.38; 'that,': 0.38; 'anything': 0.39; 'expect': 0.39; 'itself': 0.39; 'either': 0.39; 'most': 0.60; 'tell': 0.60; 'simple': 0.61; "you'll": 0.62; 'name': 0.63; 'kind': 0.63; 'skip:n 10': 0.64; 'different': 0.65; 'between': 0.67; 'anything.': 0.68; 'smith': 0.68; 'subject:there': 0.68; 'records,': 0.69; 'therefore': 0.72; 'records': 0.73; 'unusual': 0.74; 'special': 0.74; 'guaranteed': 0.75; 'guaranteed.': 0.84; 'results,': 0.84; 'subject:check': 0.84; 'yielded': 0.84; '"how': 0.91; 'do:': 0.91; 'to:none': 0.92; 'imagine': 0.93 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:cc :content-type; bh=PZB6orWwqfnM4BOxFECJvL+bqRTHvmjQ4SKooEi3B6o=; b=hpk5mnAd/okpdqMnXR9io9T7pZHZsWtkiPFn6pJtyJZiHLJAv7UUWB31THn4QGDGdJ mwoO7DlFVcdH+Hxk8Qjnqx+Z7LE3l57C3rdpChUrpdqdMiflgRK3rdgkUbzdDfO7OjTT 0Yy9iaHlPOMuzPYbX9ylXoSf5zxI6MGAEa6d+IUsBZR2Dd6fst0E+DOmdiT8JOsb2nrI jjI0APDWf3BuZW5gDisU+cJY5ic7PJ+ZTJKcRBds52gaNmKGZGSxYkBOdKfSO9g09LNv w3dMJl8r2aE8NlyQTeUwYtbbg0nOHWsbD8sQ+g+xODO7DdWHqwG0/l+QAO8KhGLJLC5I I24Q== MIME-Version: 1.0 X-Received: by 10.43.94.7 with SMTP id bw7mr5017074icc.26.1411047205738; Thu, 18 Sep 2014 06:33:25 -0700 (PDT) In-Reply-To: References: Date: Thu, 18 Sep 2014 23:33:25 +1000 Subject: Re: Is there a canonical way to check whether an iterable is ordered? From: Chris Angelico Cc: "python-list@python.org" Content-Type: text/plain; charset=UTF-8 X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 88 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1411047208 news.xs4all.nl 2912 [2001:888:2000:d::a6]:59290 X-Complaints-To: abuse@xs4all.nl Xref: csiph.com comp.lang.python:78015 On Thu, Sep 18, 2014 at 10:58 PM, Roy Smith wrote: > I suspect what he meant was "How can I tell if I'm iterating over an > ordered collection?", i.e. iterating over a list vs. iterating over a > set. Right, which is what I meant by asking if the order mattered. When you iterate over a set, you'll get some kind of order (because iterables have to have an order), but it won't mean anything. > Is there anything which requires an iterator to be deterministic? For > example, let's say I have an iterable, i, and I do: > > list1 = [item for item in i] > list2 = [item for item in i] > > am I guaranteed that list1 == list2? It will be for all the collections > I can think of in the standard library, but if I wrote my own class with > an __iter__() which yielded the items in a non-deterministic order, > would I be violating something other than the principle of least > astonishment? It's not guaranteed. If you do exactly that, with no operations in between, then yes, all the stdlib collections will (AFAIK) give you matching lists, but that's definitely not required by iterator protocol. The one thing you can rely on (and therefore must comply with, when you design an iterable) is that iteration will hit every element exactly once. Implementing that on most collections means returning the values in some internal representational order, something that's consistent across the lifetime of the iterator; having that not be consistent across multiple iterators would be the exception, not the rule. There would be a few special cases where you specifically *want to* have the lists differ, though. Imagine DNS records: perhaps you have four A records for some name, and you want to distribute load between them [1]. You could have your name server do something like this: class _cl_iter: def __init__(self, lst, start): self._data = lst self._start = self._next = start def __iter__(self): return self def __next__(self): if self._next is None: raise StopIteration val = self._data[self._next] self._next = (self._next + 1) % len(self._data) if self._next == self._start: self._next = None return val class CircularList: def __init__(self, it): self._data = list(it) self._next = -1 def __iter__(self): self._next = (self._next + 1) % len(self._data) return _cl_iter(self._data, self._next) Every time you iterate over a given CircularList, you'll get the same results, but starting at a different point: >>> lst = CircularList(("192.0.2.1","192.0.2.2","192.0.2.3","192.0.2.4")) >>> list(lst) ['192.0.2.1', '192.0.2.2', '192.0.2.3', '192.0.2.4'] >>> list(lst) ['192.0.2.2', '192.0.2.3', '192.0.2.4', '192.0.2.1'] >>> list(lst) ['192.0.2.3', '192.0.2.4', '192.0.2.1', '192.0.2.2'] >>> list(lst) ['192.0.2.4', '192.0.2.1', '192.0.2.2', '192.0.2.3'] So if you have a whole bunch of these for your different A records, and you return them in iteration order to each client, you'll end up with different clients getting them in a different order, with minimal extra storage space. (This is Py3 code; to make it Py2 compatible, probably all you need to do is subclass object and rename __next__ to next.) But this is a pretty unusual case. I would expect that most objects will either iterate consistently until mutated, or return only what wasn't previously consumed (like an iterator, which is itself iterable). ChrisA [1] This isn't true load-balancing, of course, but it's a simple way to distribute requests a bit. It's better than nothing.