From: Peter Otten <__peter__@web.de>
To: python-list@python.org
Subject: Re: Dictionaries
Date: Thu, 20 Mar 2014 15:08:31 +0100
Newsgroups: comp.lang.python

ishish wrote:

> This might sound weird, but is there a limit how many dictionaries a
> can create/use in a single script?

No.

> My reason for asking is I split a 2-column-csv (phone#, ref#) file into
> a dict and am trying to put duplicated phone numbers with different ref
> numbers into new dictionaries. The script deducts the duplicated 46
> numbers but it only creates batch1.csv. Since I obviously can't see the
> wood for the trees here, can someone pls punch me into the right
> direction....
> ...(No has_key is fine, its python 2.7)
>
> f = open("file.csv", 'r')

Consider a csv with the lines

Number...
123,first
123,second
456,third

> myDict = {}
> Batch1 = {}
> Batch2 = {}
> Batch3 = {}
>
> for line in f:
>     if line.startswith('Number' ):
>         print "First line ignored..."
>     else:
>         k, v = line.split(',')
>         myDict[k] = v

The first time around the assignment is

    myDict["123"] = "first\n"

the second time it is

    myDict["123"] = "second\n"

i.e. you are overwriting the previous value and only keep the value
corresponding to the last occurrence of a key.
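The overwriting can be reproduced in isolation. The following is only a minimal sketch: the in-memory list `lines` is an assumed stand-in for the data rows of the file, and `my_dict` is an illustrative name rather than the variable from the quoted script.

```python
# Assumed stand-in for the data rows of the csv (header already skipped).
lines = ["123,first\n", "123,second\n", "456,third\n"]

my_dict = {}
for line in lines:
    k, v = line.split(",")
    # Plain assignment: a second "123" silently replaces the first value.
    my_dict[k] = v

# Only one entry per phone number survives, holding the last ref seen.
print(my_dict)
```

Three rows go in, but only two entries remain, which is why the duplicates never reach a second batch.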
A good approach to solve the problem of keeping an arbitrary number of
values per key is to make the dict value a list:

    myDict = {}
    with open("data.csv") as f:
        next(f) # skip first line
        for line in f:
            k, v = line.split(",")
            myDict.setdefault(k, []).append(v)

This will produce a myDict

    {
        "123": ["first\n", "second\n"],
        "456": ["third\n"]
    }

You can then proceed to find out the number of batches:

    num_batches = max(len(v) for v in myDict.values())

Now write the files:

    for index in range(num_batches):
        with open("batch%s.csv" % (index+1), "w") as f:
            for key, values in myDict.items():
                if len(values) > index:
                    # there are more than index duplicates
                    f.write("%s,%s" % (key, values[index]))
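For a self-contained run of the same idea, here is a hedged variant that swaps in the standard `csv` module and `collections.defaultdict` in place of `line.split(",")` and `setdefault`. It is a sketch, not the code from the post: the helper names `group_rows` and `write_batches`, the header text `Number,Ref`, and the temporary directory are all assumptions made for illustration. Because `csv.reader` strips the line endings, the newline is added back explicitly when writing.

```python
import csv
import os
import tempfile
from collections import defaultdict


def group_rows(path):
    """Map each phone number to the list of all ref numbers seen for it."""
    groups = defaultdict(list)
    with open(path) as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row, if there is one
        for row in reader:
            if len(row) != 2:
                continue  # ignore blank or malformed lines
            number, ref = row
            groups[number].append(ref)
    return groups


def write_batches(groups, out_dir):
    """Write batch1.csv, batch2.csv, ...; batch N holds the Nth ref per number."""
    num_batches = max(len(refs) for refs in groups.values()) if groups else 0
    paths = []
    for index in range(num_batches):
        path = os.path.join(out_dir, "batch%d.csv" % (index + 1))
        with open(path, "w") as f:
            for number, refs in sorted(groups.items()):
                if len(refs) > index:
                    # csv.reader removed the line ending, so add it back here
                    f.write("%s,%s\n" % (number, refs[index]))
        paths.append(path)
    return paths


# Hypothetical sample data mirroring the three rows discussed above.
work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, "data.csv")
with open(src, "w") as f:
    f.write("Number,Ref\n123,first\n123,second\n456,third\n")

groups = group_rows(src)
batch_paths = write_batches(groups, work_dir)
```

With the sample rows, the first batch file should contain one line per distinct number and the second only the extra ref for "123". Sorting the items is a choice made here to get a stable row order; the original loop writes in whatever order the dict yields.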