Path: csiph.com!usenet.pasdenom.info!weretis.net!feeder4.news.weretis.net!ecngs!feeder2.ecngs.de!newsfeed.freenet.ag!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <CAN1F8qU1T42YFcE5VSPDTOyXud3anwPSptDvOF9MHxoR5H3ftA@mail.gmail.com>
References: <506C4B23.6020809@gmail.com> <CAHVvXxSOkSEsMVR=3Yh8CMx+qDFd=hd=wK6U8sW96WKXpqk5AA@mail.gmail.com> <CAN1F8qUy8e-Micsi_B3X7+8Z4Co-4xi1E6ovnfwwXf8j4n9S8g@mail.gmail.com> <CAAvjzF0KWVspZikM+xa_XXKOmsMq+mybPGo_5UfpAr=UKMHw0Q@mail.gmail.com> <CAN1F8qU1T42YFcE5VSPDTOyXud3anwPSptDvOF9MHxoR5H3ftA@mail.gmail.com>
Date: Thu, 4 Oct 2012 17:12:38 +0200
Subject: Re: Combinations of lists
From: Steen Lysgaard <boxeakasteen@gmail.com>
To: Joshua Landau <joshua.landau.ws@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Python <python-list@python.org>
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.1801.1349363561.27098.python-list@python.org>
Lines: 82
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:30733

2012/10/4 Joshua Landau <joshua.landau.ws@gmail.com>:
> On 3 October 2012 21:15, Steen Lysgaard <boxeakasteen@gmail.com> wrote:
>>
>> Hi,
>>
>> thanks for your interest. Sorry for not being completely clear, yes
>> the length of m will always be half of the length of h.
>
>
> (Please don't top post)
>
> I have a solution to this, then.
> It's not short or fast, but it's a lot faster than yours.
>
> But first let me explain the most obvious optimization to your version of
> the code:
>
>> combs = set()
>>
>>
>> for a in permutations(range(len(h)),len(h)):
>>     comb = []
>>     for i in range(len(h)):
>>         comb.append(c[i][a[i]])
>>     comb.sort()
>>
>>     frzn = tuple(comb)
>>     if frzn not in combs:
>>         combs.add(frzn)
>
>
> What I have done here is make your "combs" a set. This helps because you
> are searching inside it, and a membership test on a list is O(N).
> A set does the same lookup in O(1) on average. Simplez.
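
To illustrate the point about sets, here is a minimal sketch of the same de-duplication idea on toy data. The helper name `unique_sorted_tuples` and the input are made up for illustration and are not from the thread's code; `add()` on a set simply does nothing when the element is already present, so no explicit membership check is needed.

```python
from itertools import permutations


def unique_sorted_tuples(rows):
    """Collect each row as a sorted tuple, keeping one copy of each."""
    seen = set()
    for row in rows:
        # Sorting first makes different orderings of the same items compare equal;
        # set.add() is a no-op when the tuple is already stored.
        seen.add(tuple(sorted(row)))
    return seen


# Hypothetical toy input: every ordering of three letters collapses to one entry.
result = unique_sorted_tuples(permutations("abc"))
```

Each insertion and lookup hashes the tuple, so the cost per row does not grow with the number of rows already collected, which is what a list would do.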
>
> first  = list("AABBCCDDEE")
> second = list("abcde")
> import itertools
> #
> # Generator, so ignoring case convention
> class force_unique_combinations:
>     def __init__(self, lst, n):
>         self.cache = set()
>         self.internal_iter = itertools.combinations(lst, n)
>     def __iter__(self):
>         return self
>     def __next__(self):
>         while True:
>             nxt = next(self.internal_iter)
>             if nxt not in self.cache:
>                 self.cache.add(nxt)
>                 return nxt
> def combine(first, second):
>     sletter = second[0]
>     first_combinations = force_unique_combinations(first, 2)
>     if len(second) == 1:
>         for combination in first_combinations:
>             yield [sletter+combination[0], sletter+combination[1]]
>     else:
>         for combination in first_combinations:
>             first_ = first[:]
>             first_.remove(combination[0])
>             first_.remove(combination[1])
>             prefix = [sletter+combination[0], sletter+combination[1]]
>             for inner in combine(first_, second[1:]):
>                 yield prefix + inner
>
>
> This is quite naive, because I don't know how to properly implement
> force_unique_combinations, but it runs. I hope this is right. If you need
> significantly more speed your best chance is probably Cython or C, although
> I don't doubt that a 10x speedup is possible from within Python.
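
For reference, a small self-contained run of the approach quoted above. This is only a sketch: `unique_combinations` is a hypothetical generator-function stand-in for the `force_unique_combinations` class (same filter-through-a-set idea, not a cleverer algorithm), and the four-letter input is made up. It assumes the first list is sorted so that equal letters sit next to each other.

```python
import itertools


def unique_combinations(items, n):
    """Yield each distinct n-combination of items once, filtering repeats via a set."""
    seen = set()
    for combo in itertools.combinations(items, n):
        if combo not in seen:
            seen.add(combo)
            yield combo


def combine(first, second):
    """Pair each letter of `second` with two letters of `first`, recursively."""
    letter = second[0]
    for a, b in unique_combinations(first, 2):
        prefix = [letter + a, letter + b]
        if len(second) == 1:
            yield prefix
        else:
            # Remove the two letters just used before recursing on the rest.
            rest = first[:]
            rest.remove(a)
            rest.remove(b)
            for inner in combine(rest, second[1:]):
                yield prefix + inner


# Hypothetical small input, so the output is easy to check by hand.
results = list(combine(list("AABB"), list("ab")))
for row in results:
    print(row)
# ['aA', 'aA', 'bB', 'bB']
# ['aA', 'aB', 'bA', 'bB']
# ['aB', 'aB', 'bA', 'bA']
```

The set in `unique_combinations` still walks every raw combination and only discards the repeats, so the naivety mentioned above is unchanged here.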
>
>
> Also, 88888 Dihedral is a bot, or at least pretending like crazy to be one.

Great, I've now got a solution much faster than what I could come up with.
Thanks to both of you.
And good spot on 88... I could not for the life of me understand what he
(it) had written.

/Steen