Re: Combinations of lists

Started by	Steen Lysgaard <boxeakasteen@gmail.com>
First post	2012-10-04 17:12 +0200
Last post	2012-10-04 12:20 -0700
Articles	3 — 2 participants

This discussion starts earlier than the indexed window, so earlier articles aren't shown. The article labeled "Started by" above is the oldest one visible, not the original post.

  Re: Combinations of lists Steen Lysgaard <boxeakasteen@gmail.com> - 2012-10-04 17:12 +0200
    Re: Combinations of lists 88888 Dihedral <dihedral88888@googlemail.com> - 2012-10-04 12:20 -0700
    Re: Combinations of lists 88888 Dihedral <dihedral88888@googlemail.com> - 2012-10-04 12:20 -0700

#30733 — Re: Combinations of lists

From	Steen Lysgaard <boxeakasteen@gmail.com>
Date	2012-10-04 17:12 +0200
Subject	Re: Combinations of lists
Message-ID	<mailman.1801.1349363561.27098.python-list@python.org>

2012/10/4 Joshua Landau <joshua.landau.ws@gmail.com>:
> On 3 October 2012 21:15, Steen Lysgaard <boxeakasteen@gmail.com> wrote:
>>
>> Hi,
>>
>> thanks for your interest. Sorry for not being completely clear, yes
>> the length of m will always be half of the length of h.
>
>
> (Please don't top post)
>
> I have a solution to this, then.
> It's not short or fast, but it's a lot faster than yours.
>
> But first let me explain the most obvious optimization to your version of
> the code:
>
>> combs = set()
>>
>>
>> for a in permutations(range(len(h)),len(h)):
>>     comb = []
>>     for i in range(len(h)):
>>         comb.append(c[i][a[i]])
>>     comb.sort()
>>
>>     frzn = tuple(comb)
>>     if frzn not in combs:
>>         combs.add(frzn)
>
>
>  What I have done here is make your "combs" a set. This helps because you
> are searching inside it and that is an O(N) operation... for lists.
> A set can do the same in O(1). Simplez.
>
> first  = list("AABBCCDDEE")
> second = list("abcde")
> import itertools
> #
> # Generator, so ignoring case convention
> class force_unique_combinations:
>     def __init__(self, lst, n):
>         self.cache = set()
>         self.internal_iter = itertools.combinations(lst, n)
>     def __iter__(self):
>         return self
>     def __next__(self):
>         while True:
>             nxt = next(self.internal_iter)
>             if not nxt in self.cache:
>                 self.cache.add(nxt)
>                 return nxt
> def combine(first, second):
>     sletter = second[0]
>     first_combinations = force_unique_combinations(first, 2)
>     if len(second) == 1:
>         for combination in first_combinations:
>             yield [sletter+combination[0], sletter+combination[1]]
>     else:
>         for combination in first_combinations:
>             first_ = first[:]
>             first_.remove(combination[0])
>             first_.remove(combination[1])
>             prefix = [sletter+combination[0], sletter+combination[1]]
>             for inner in combine(first_, second[1:]):
>                 yield prefix + inner
>
>
> This is quite naive, because I don't know how to properly implement
> force_unique_combinations, but it runs. I hope this is right. If you need
> significantly more speed your best chance is probably Cython or C, although
> I don't doubt 10x more speed may well be possible from within Python.
>
>
> Also, 88888 Dihedral is a bot, or at least pretending like crazy to be one.

Great, I've now got a solution much faster than what I could come up with.
Thanks to both of you.
And good spot on 88... I could not for the life of me understand what he
(it) had written.

/Steen
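
For readers following along, the set-based deduplication quoted above can be tried on a small input. This is only a sketch: the thread never shows how `c` is built, so the table below (and the names `unique_selections`, `m_doubled`) are assumptions made up for illustration.

```python
from itertools import permutations

def unique_selections(c):
    """Yield each distinct sorted selection c[i][a[i]] exactly once."""
    n = len(c)
    seen = set()  # membership test is O(1) on average; a list would scan every entry
    for a in permutations(range(n)):
        # Sorting gives a canonical form, so reorderings of the same picks collide.
        key = tuple(sorted(c[i][a[i]] for i in range(n)))
        if key not in seen:
            seen.add(key)
            yield key

# Hypothetical data: each letter of "ab" doubled, paired with every letter of h.
m_doubled = "aabb"
h = "AABB"
c = [[m + x for x in h] for m in m_doubled]

for selection in unique_selections(c):
    print(selection)
```

All 24 permutations are still generated here; the set only makes the "seen before?" check cheap, which is why the recursive combine() above is the bigger win.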

#30742

From	88888 Dihedral <dihedral88888@googlemail.com>
Date	2012-10-04 12:20 -0700
Message-ID	<3c6e2683-6328-4225-9ef2-20bf21c3a846@googlegroups.com>
In reply to	#30733

If a unique order is defined, then it is trivial to solve this problem
without any recursion.
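
One possible reading of this remark, offered as an assumption rather than as what the poster meant: if the distinct letters are put in a fixed sorted order, the force_unique_combinations step from #30733 can be written without a cache and without recursion. The sketch below covers only that step, not the full pairing, and the name `unique_combinations` is made up here.

```python
from collections import Counter
from itertools import product

def unique_combinations(items, n):
    """Yield each distinct n-element combination of a multiset once, in sorted order."""
    counts = sorted(Counter(items).items())  # fixed order over the distinct values
    values = [value for value, _ in counts]
    # For each distinct value, the number of copies to take (most copies first).
    choices = [range(min(count, n), -1, -1) for _, count in counts]
    for picks in product(*choices):
        if sum(picks) == n:  # keep only selections of exactly n elements
            yield tuple(v for v, k in zip(values, picks) for _ in range(k))

pairs = list(unique_combinations("AABBCCDDEE", 2))
print(len(pairs))  # 15
print(pairs[:3])   # [('A', 'A'), ('A', 'B'), ('A', 'C')]
```

It enumerates every per-letter count vector and filters by total, so it is only reasonable for a small number of distinct letters.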

#30743

From	88888 Dihedral <dihedral88888@googlemail.com>
Date	2012-10-04 12:20 -0700
Message-ID	<mailman.1808.1349378409.27098.python-list@python.org>
In reply to	#30733

If a unique order is defined, then it is trivial to solve this problem
without any recursion.
