| From | Peter Otten <__peter__@web.de> |
|---|---|
| Subject | Re: Dictionaries |
| Date | Thu, 20 Mar 2014 15:08:31 +0100 |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.8301.1395324531.18130.python-list@python.org> |
| References | <CANXBEFogXsze6WByg_iCit-oLEQK-RXsgJCDM=ocZ7JPNgdb8g@mail.gmail.com> <cc9658ed24ed4af1b65b5098a00f9aac@home.minuskel.de> |
ishish wrote:
> This might sound weird, but is there a limit to how many dictionaries I
> can create/use in a single script?
No.
> My reason for asking is I split a 2-column-csv (phone#, ref#) file into
> a dict and am trying to put duplicated phone numbers with different ref
> numbers into new dictionaries. The script detects the 46 duplicated
> numbers but it only creates batch1.csv. Since I obviously can't see the
> wood for the trees here, can someone pls punch me into the right
> direction....
> ...(No has_key is fine, it's Python 2.7)
>
> f = open("file.csv", 'r')
Consider a csv with the lines
Number...
123,first
123,second
456,third
> myDict = {}
> Batch1 = {}
> Batch2 = {}
> Batch3 = {}
>
> for line in f:
>     if line.startswith('Number' ):
>         print "First line ignored..."
>     else:
>         k, v = line.split(',')
>         myDict[k] = v
The first time around, the assignment is
    myDict["123"] = "first\n"
the second time it is
    myDict["123"] = "second\n"
i.e. you are overwriting the previous value and only keep the value
corresponding to the last occurrence of a key.
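To see the overwriting in isolation, here is a minimal sketch that uses the three sample rows from above. The names `rows` and `my_dict` are made up for illustration and the values are shown without the trailing newline:

```python
# Sample rows standing in for the CSV lines above (illustrative data).
rows = [("123", "first"), ("123", "second"), ("456", "third")]

my_dict = {}
for key, value in rows:
    # Plain assignment replaces whatever was stored under the key before.
    my_dict[key] = value

# Only one entry per phone number survives; "first" has been lost.
print(my_dict)
```

Three rows go in, but only two entries come out, which is why the script never finds anything to put into a second batch.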
A good approach to solve the problem of keeping an arbitrary number of
values per key is to make the dict value a list:
myDict = {}
with open("data.csv") as f:
    next(f) # skip first line
    for line in f:
        k, v = line.split(",")
        myDict.setdefault(k, []).append(v)
This will produce a myDict
{
    "123": ["first\n", "second\n"],
    "456": ["third\n"]
}
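As a variation on the same idea, and not something the original script used, `collections.defaultdict` combined with the `csv` module avoids both the `setdefault()` call and the trailing newlines in the values. This is only a sketch: the helper name `group_rows` is hypothetical, the header text `Number,Ref` is an assumption (the real header is not shown in full), and the data lives in an in-memory `io.StringIO` rather than a real file.

```python
import csv
import io
from collections import defaultdict


def group_rows(lines):
    """Map each phone number to the list of ref numbers seen for it."""
    groups = defaultdict(list)
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row, if there is one
    for row in reader:
        if len(row) != 2:
            continue  # ignore blank or malformed lines
        phone, ref = row
        groups[phone].append(ref)
    return dict(groups)


# Assumed header line followed by the sample rows from the post.
sample = io.StringIO("Number,Ref\n123,first\n123,second\n456,third\n")
grouped = group_rows(sample)
print(grouped)
```

Because the `csv` reader strips the line endings, code that writes these values back out has to add the newline itself.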
You can then proceed to find out the number of batches:
num_batches = max(len(v) for v in myDict.values())
Now write the files:
for index in range(num_batches):
    with open("batch%s.csv" % (index+1), "w") as f:
        for key, values in myDict.items():
            if len(values) > index: # there are more than index duplicates
                f.write("%s,%s" % (key, values[index]))
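Putting the batch count and the writing loop together, the following sketch runs against the sample dictionary and writes into a scratch directory so that it does not touch real files. The use of `tempfile.mkdtemp()` and the names `my_dict` and `out_dir` are assumptions made for the demonstration only.

```python
import os
import tempfile

# Grouped data as produced by the setdefault() loop (newlines kept).
my_dict = {
    "123": ["first\n", "second\n"],
    "456": ["third\n"],
}

# The longest value list determines how many batch files are needed.
num_batches = max(len(values) for values in my_dict.values())

out_dir = tempfile.mkdtemp()  # scratch directory, used only for this demo
for index in range(num_batches):
    path = os.path.join(out_dir, "batch%s.csv" % (index + 1))
    with open(path, "w") as f:
        for key, values in my_dict.items():
            # Only keys that still have a value at this position take part.
            if len(values) > index:
                f.write("%s,%s" % (key, values[index]))

# Show what ended up in each batch file.
for name in sorted(os.listdir(out_dir)):
    with open(os.path.join(out_dir, name)) as f:
        print(name, repr(f.read()))
```

Each phone number appears at most once per batch file, and a number with n refs shows up in the first n files. Note that `max()` raises a `ValueError` for an empty dictionary, so an input without any data rows would need a guard.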