| Date | 2013-11-22 00:01 +1100 |
|---|---|
| From | John O'Hagan <research@johnohagan.com> |
| Subject | Re: Recursive generator for combinations of a multiset? |
| References | <20131121174614.53450d51@mini.home> <CAHVvXxQ0Kdd91nCmmz6fw0fMiA=1nrPT=DLZAxhrFNhsuS89DA@mail.gmail.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3008.1385038886.18130.python-list@python.org> |
On Thu, 21 Nov 2013 11:42:49 +0000
Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
> On 21 November 2013 06:46, John O'Hagan <research@johnohagan.com>
> wrote:
> >
> > I found a verbal description of such an algorithm and came up with
> > this:
> >
> > def multicombs(it, r):
> >     result = it[:r]
> >     yield result
> >     while 1:
> >         for i in range(-1, -r - 1, -1):
> >             rep = result[i]
> >             if rep < it[i]:
> >                 break
> >         else:
> >             break
> >         for j, n in enumerate(it):
> >             if n > rep:
> >                 break
> >         result = result[:i] + it[j:j - i]
> >         yield result
>
> I'm not really sure what it is you're asking for. I thought if I ran
> the code I'd understand but that just confused me more. Is the output
> below correct? If not what should it be?
>
> multicombs("abracadabra", 0)
> ['']
> multicombs("abracadabra", 1)
> ['a']
> multicombs("abracadabra", 2)
> ['ab', 'br', 'ra']
> multicombs("abracadabra", 3)
> ['abr', 'ara', 'bra']
> multicombs("abracadabra", 4)
> ['abra']
> multicombs("abracadabra", 5)
> ['abrac', 'abrbr', 'abrra', 'braca', 'brara', 'brbra', 'racad',
> 'racbr', 'racra']
I neglected to mention that multicombs takes a sorted sequence (it
slices its argument); it doesn't work correctly otherwise. I'd
forgotten that because my wordlists are guaranteed sorted by the way
they're built. Sorry about that.
In my use case the first argument to multicombs is a tuple of words
which may contain duplicates, and it produces all unique combinations
of a given length from those words, e.g.:
list(multicombs(('cat', 'hat', 'in', 'the', 'the'), 3))
[('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'in', 'the'),
('cat', 'the', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'),
('in', 'the', 'the')]
Contrast this with:
list(itertools.combinations(('cat', 'hat', 'in', 'the', 'the'), 3))
[('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'hat', 'the'),
('cat', 'in', 'the'), ('cat', 'in', 'the'), ('cat', 'the', 'the'),
('hat', 'in', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'),
('in', 'the', 'the')]
which produces duplicate results that are redundant for my purposes.
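To make the difference concrete, here is a minimal sketch that removes the repeats from the itertools.combinations output after the fact. The names all_combs and unique_combs are only illustrative. It still generates every redundant tuple before discarding it, which is exactly the wasted work a direct multiset generator avoids:

```python
from itertools import combinations

words = ('cat', 'hat', 'in', 'the', 'the')

# combinations() treats the two 'the' entries as distinct positions,
# so some tuples come out twice.
all_combs = list(combinations(words, 3))

# dict.fromkeys() drops repeats while keeping first-seen order
# (dict ordering is guaranteed from Python 3.7 on).
unique_combs = list(dict.fromkeys(all_combs))

print(len(all_combs), len(unique_combs))  # prints: 10 7
```

For this input, unique_combs holds the same seven tuples as the multicombs result above.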
What I'm looking for is a recursive algorithm which does what
multicombs does (order unimportant) so that I can apply a pruning
shortcut like the one I used in the recursive cartesian product
algorithm in my original post.
Multiset combination algorithms seem pretty thin on the ground out
there - as I said, I could only find a description of the procedure
above, no actual code. The ones I did find are non-recursive. I'm
hoping some combinatorics and/or recursion experts can offer advice.
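For illustration only, one possible shape for such a recursive generator is sketched below. It is not the procedure described above but a different, commonly used approach: collapse the multiset into (value, count) pairs and, for each distinct value in turn, choose how many copies to take. The names multicombs_rec and rec are hypothetical, and the sketch assumes the items are hashable and mutually comparable. A pruning test on the partial result could be placed just before the recursive call.

```python
from collections import Counter

def multicombs_rec(items, r):
    """Yield each distinct r-combination of the multiset `items` once."""
    # Collapse the multiset into sorted (value, count) pairs.
    # Assumes items are hashable and can be compared with each other.
    pairs = sorted(Counter(items).items())

    def rec(idx, remaining, prefix):
        if remaining == 0:
            # The combination is complete.
            yield prefix
            return
        if idx == len(pairs):
            # Ran out of distinct values before filling r slots.
            return
        value, count = pairs[idx]
        # Try the largest number of copies first so results come out
        # in sorted order; a pruning check on `prefix` could go here.
        for k in range(min(count, remaining), -1, -1):
            yield from rec(idx + 1, remaining - k, prefix + (value,) * k)

    yield from rec(0, r, ())

print(list(multicombs_rec(('cat', 'hat', 'in', 'the', 'the'), 3)))
```

For the five-word tuple above this yields the same seven tuples as multicombs, in the same order. Because the input is counted and sorted internally, it does not need to be pre-sorted.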
Regards,
--
John