From: John O'Hagan
Date: Sat, 23 Nov 2013 12:07:32 +1100
To: python-list@python.org
Newsgroups: comp.lang.python
Subject: Re: Recursive generator for combinations of a multiset?
In-Reply-To: <9b7100f3-0d99-4eac-8be2-7f4403da341f@googlegroups.com>
References: <20131121174614.53450d51@mini.home> <9b7100f3-0d99-4eac-8be2-7f4403da341f@googlegroups.com>

On Thu, 21 Nov 2013 18:14:41 -0800 (PST)
James wrote:

> On Thursday, November 21, 2013 5:01:15 AM UTC-8, John O'Hagan wrote:
[...]
> > > On 21 November 2013 06:46, John O'Hagan wrote:
> > [...]
> > > > def multicombs(it, r):
> > > >     result = it[:r]
> > > >     yield result
> > > >     while 1:
> > > >         for i in range(-1, -r - 1, -1):
> > > >             rep = result[i]
> > > >             if rep < it[i]:
> > > >                 break
> > > >         else:
> > > >             break
> > > >         for j, n in enumerate(it):
> > > >             if n > rep:
> > > >                 break
> > > >         result = result[:i] + it[j:j - i]
> > > >         yield result
> > > [...]
> > I neglected to mention that multicombs takes a sorted iterable;
> > it doesn't work right otherwise. I'd forgotten that because my
> > wordlists are guaranteed sorted by the way they're built. Sorry
> > about that.
> >
> > In my use-case the first argument to multicombs is a tuple of words
> > which may contain duplicates, and it produces all unique
> > combinations of a certain length of those words, e.g.:
> >
> > list(multicombs(('cat', 'hat', 'in', 'the', 'the'), 3))
> >
> > [('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'in', 'the'),
> > ('cat', 'the', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'),
> > ('in', 'the', 'the')]
> >
> > [...]
> > What I'm looking for is a recursive algorithm which does what
> > multicombs does (order unimportant) so that I can apply a pruning
> > shortcut like the one I used in the recursive cartesian product
> > algorithm in my original post.
> >
> > Multiset combination algorithms seem pretty thin on the ground out
> > there - as I said, I could only find a description of the procedure
> > above, no actual code. The ones I did find are non-recursive. I'm
> > hoping some combinatorics and/or recursion experts can offer
> > advice.
> >
> > [...]
> >
> > John
>
> Could you convert the following Perl script to Python?
>
> use Data::Dump qw(dump);
> dump combo([@ARGV], 3);
>
> sub combo {
>     my ($t, $k) = @_;
>     my @T = @$t;
>     my @R = ();
>     my %g = ();
>     if ($k == 1) {
>         for (@T) {
>             push @R, $_ unless $g{$_}++;
>         }
>     } else {
>         while (my $x = shift @T) {
>             my $p = combo([@T], $k-1);
>             for (@{$p}) {
>                 my $q = $x.",".$_;
>                 push @R, $q unless $g{$q}++;
>             }
>         }
>     }
>     [@R];
> }
>
> $ prog.pl cat hat in the the
> [
>   "cat,hat,in",
>   "cat,hat,the",
>   "cat,in,the",
>   "cat,the,the",
>   "hat,in,the",
>   "hat,the,the",
>   "in,the,the",
> ]
>
> James

Thanks. Now I just have to learn Perl to understand what that does! :)

Regards,

-- 
John
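
For anyone who would rather not learn Perl first, here is a rough Python sketch of what the quoted script appears to do. It is one reading of the script, not James's code. It keeps the name combo, uses a set called seen where the Perl uses the hash %g, and returns tuples where the Perl builds comma-joined strings. It also walks the input by index instead of calling shift, so a false-valued item such as "0" would not end the loop early as it would in the Perl. Those choices are assumptions made for illustration.

```python
def combo(items, k):
    """Return the unique k-combinations of items, in first-seen order.

    items is any sliceable sequence (tuple or list) that may contain
    duplicates. Each combination is returned as a tuple.
    """
    seen = set()   # plays the role of %g in the Perl script
    result = []    # plays the role of @R
    if k == 1:
        # Base case: every distinct item is a combination of length 1.
        for x in items:
            if x not in seen:
                seen.add(x)
                result.append((x,))
    else:
        for i, x in enumerate(items):
            # Equivalent of shifting x off the front and recursing on
            # what is left: only items after position i are considered.
            for tail in combo(items[i + 1:], k - 1):
                q = (x,) + tail
                if q not in seen:
                    seen.add(q)
                    result.append(q)
    return result


if __name__ == "__main__":
    for c in combo(('cat', 'hat', 'in', 'the', 'the'), 3):
        print(c)
```

With the example words this gives the same seven combinations as the multicombs listing quoted above. Duplicates are removed after they are generated, so on inputs with many repeated items it does more work than multicombs; whether that matters would depend on the sizes involved.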