Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #60271 > unrolled thread

Re: Recursive generator for combinations of a multiset?

Started byMRAB <python@mrabarnett.plus.com>
First post2013-11-23 04:23 +0000
Last post2013-11-23 04:23 +0000
Articles 1 — 1 participant

Back to article view | Back to comp.lang.python

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: Recursive generator for combinations of a multiset? MRAB <python@mrabarnett.plus.com> - 2013-11-23 04:23 +0000

#60271 — Re: Recursive generator for combinations of a multiset?

FromMRAB <python@mrabarnett.plus.com>
Date2013-11-23 04:23 +0000
SubjectRe: Recursive generator for combinations of a multiset?
Message-ID<mailman.3067.1385180625.18130.python-list@python.org>
On 23/11/2013 00:58, John O'Hagan wrote:
> On Thu, 21 Nov 2013 12:59:26 -0800
> Dan Stromberg <drsalists@gmail.com> wrote:
>
>> On Wed, Nov 20, 2013 at 10:46 PM, John O'Hagan
>> <research@johnohagan.com>wrote:
>>
>> >
>> > Short story: the subject says it all, so if you have an answer
>> > already, fire away. Below is the long story of what I'm using it
>> > for, and why I think it needs to be recursive. It may even be of
>> > more general interest in terms of filtering the results of
>> > generators.
>> >
>>
>> I think you probably need permutations rather than combinations.
>>
>> Also, I think you'll need to form a word (partitioned off by spaces),
>> and then check it against a set containing /usr/share/dict/words
>> before recursing for the remainder of the sentence - this should
>> speed things up a LOT.
>
> Thanks for the reply. If I understand you correctly, you are suggesting
> permuting the input _characters_ to form words and then seeing if
> they exist, as opposed to my approach of combining known words and
> seeing if they are anagrams. (Permutations of words would not help find
> anagrams as they merely change the word order). Here is an attempt at
> that:
>
> def anagrams(partition, input_string):
>      """Find anagrams which fit given partition of input string length"""
>      if not partition:
>          yield (), input_string
>          return
>      for words, checkstring in anagrams(partition[:-1], input_string):
>          for word in itertools.permutations(checkstring, partition[-1]):
>              word = ''.join(word)
>              if word in WORDS: #WORDS is collection of dictionary words
>                  newstring = checkstring
>                  for l in word:
>                      newstring = newstring.replace(l, '' , 1)
>                  yield words + (word,), newstring
>
> There are two problems with this. If there are repeated characters in
> the input, redundant results are produced; a multiset-permutation
> algorithm would fix this. But the main problem is it is incredibly
> slow: on my run-of-the-mill laptop, it chokes on anything longer than
> about 10 characters, spending most of its time rejecting non-words.
>
> Or have I misunderstood your suggestion?
>
If you want to know how to get unique permutations, have a look here:

http://mail.python.org/pipermail/python-ideas/2013-October/023610.html

[toc] | [standalone]


Back to top | Article view | comp.lang.python


csiph-web