From: Dennis Lee Bieber
Newsgroups: comp.lang.python
Subject: Re: Probability Algorithm
Date: Sun, 26 Aug 2012 17:43:33 -0400

On Sun, 26 Aug 2012 11:47:57 +0800, ??? declaimed the following in
gmane.comp.python.general:

> Sorry, missing some conditions

	Then I suggest you first produce a proper requirements document
stating ALL conditions, data sources, and results.

	After all, your final result consists of just three items, drawn
from X lists of unspecified length with Xp probabilities.

	Your attempt to build subset lists whose length is equal to the
probability, and THEN draw three items from these sublists, is invalid.

	Consider the case where your DLIST starts out with three entries --
but you are going to reduce it to one entry (probability 1%)... While a 1%
probability is small, technically that list could still be selected three
times in a row. But since you've already trimmed the DLIST to one element,
and you don't want duplicates in the result, you've violated your
probability requirement.
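	To put rough numbers on that, here is a minimal sketch. The DLIST
entries and the helper draw_unique() are made up for illustration; they are
not from your code. It only shows that three selections of the 1% list in a
row is unlikely but possible, and that a list trimmed to one entry cannot
supply three unique items when it happens.

```python
import random

# Chance that the 1% list is selected on all three draws: small, but not zero.
p_d = 0.01
p_three_in_a_row = p_d ** 3

def draw_unique(lst, count):
    """Hypothetical helper: return `count` distinct items, or None if impossible."""
    distinct = sorted(set(lst))
    if len(distinct) < count:
        return None
    return random.sample(distinct, count)

full_dlist = ["d1", "d2", "d3"]     # made-up entries
trimmed_dlist = full_dlist[:1]      # trimmed to one entry (the rejected approach)

print(p_three_in_a_row)
print(draw_unique(full_dlist, 3))     # some ordering of the three entries
print(draw_unique(trimmed_dlist, 3))  # nothing valid can be returned
```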
	If you'd left DLIST with all three entries you'd have been able to
draw those three -- 1% chance each time -- and meet the three-element,
unique, criterion. It may be more complex if DLIST elements duplicate those
in other lists...

	Actually, the only limit I see is that the shortest list must
contain at least 3 elements (if only three elements will be in the final
result). That is, in general, each source list must be at least as long as
the result list, as random probabilities could draw every element from the
same list.

	When actually drawing from a list, if the element had been drawn
from a /different/ list earlier, just remove it from the current list and
redraw -- don't regenerate the "probability" random number. You've already
determined which list satisfies the probability clause; you just need to
find a unique item in that list. Since each list is at least the length
needed for the result, there must be at least one value that hasn't been
drawn from any other list.

-=-=-=-=-
import random
import copy

RESULT_LEN = 3
# cumulative thresholds for the four source lists: 43%, 37%, 19%, 1%
PROBABILITIES = [ 43, 37 + 43, 19 + 37 + 43, 1 + 19 + 37 + 43 ]
ALREADY_PICKED_LIST = "a the in on at is not".split()

A_SOURCE = "ask not for whom the bell tolls".split()
B_SOURCE = "it tolls for thee".split()
C_SOURCE = "in the dead of the night the mice will play".split()
D_SOURCE = "thee hast caught me out".split()

def clean(lst, picked=ALREADY_PICKED_LIST):
    # drop any items that have already been picked
    return [itm for itm in lst if itm not in picked]

def pickOne(lst, picked):
    # redraw from the same list until an unpicked item turns up;
    # already-picked items are removed so they can't be drawn again
    while True:
        itm = random.choice(lst)
        if itm not in picked: break
        lst.remove(itm)
    return itm

def generateResult(lsts, probabilities=PROBABILITIES,
                   resultLen=RESULT_LEN):
    # ensure enough data: every list needs at least resultLen distinct items
    if min(len(set(lst)) for lst in lsts) < resultLen:
        print("Insufficient unique data to ensure desired results")
        return None
    else:
        result = []
        for i in range(resultLen):
            # one random number selects the source list for this draw
            p = random.randint(0, 99)
            if p < probabilities[0]:
                lst = lsts[0]
            elif p < probabilities[1]:
                lst = lsts[1]
            elif p < probabilities[2]:
                lst = lsts[2]
            elif p < probabilities[3]:
                lst = lsts[3]
            else:
                print("Probabilities outside range of 1..100")
                break
            result.append(pickOne(lst, result))
        return result

if __name__ == "__main__":
    # build initial lists
    sources = [ clean(A_SOURCE), clean(B_SOURCE),
                clean(C_SOURCE), clean(D_SOURCE) ]
    print("")
    for i in range(15):
        # use copy.deepcopy() so duplicate removal doesn't change master lists
        result = generateResult(copy.deepcopy(sources))
        print("Final result: %s" % " | ".join(sorted(result)))
-=-=-=-

Final result: dead | it | thee
Final result: for | it | tolls
Final result: bell | thee | whom
Final result: for | thee | tolls
Final result: night | thee | tolls
Final result: ask | for | it
Final result: for | it | night
Final result: for | whom | will
Final result: it | of | whom
Final result: ask | tolls | whom
Final result: ask | for | tolls
Final result: for | play | thee
Final result: dead | thee | whom
Final result: of | tolls | will
Final result: for | of | will

	{Actually, you could avoid the deepcopy() above by changing
pickOne() to NOT remove already-picked items, and instead just retry
random.choice() until a new/unique item is picked}
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
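
	The deepcopy()-free variant mentioned in the closing note might look
roughly like the sketch below. The name pick_one_no_remove() is made up for
illustration, and the sketch assumes the list always holds at least one item
that is not already in picked (which the length check in generateResult() is
meant to ensure); without that it would loop forever.

```python
import random

def pick_one_no_remove(lst, picked):
    """Redraw from lst until an item not already in picked turns up.

    Hypothetical variant of pickOne(): lst is never modified, so the
    caller does not need to pass in a copy of the master list.
    Assumes lst contains at least one item that is not in picked.
    """
    while True:
        itm = random.choice(lst)
        if itm not in picked:
            return itm

source = "it tolls for thee".split()
picked = ["tolls", "for"]          # pretend these were drawn earlier
before = list(source)              # snapshot to show the list is untouched

item = pick_one_no_remove(source, picked)
print(item)                        # either "it" or "thee"
print(source == before)            # the master list is unchanged
```

	The trade-off is that repeated misses cost extra random.choice()
calls instead of a list copy; with short lists and a three-item result
that should not matter much.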