


Re: Looking for direction

Date Thu, 14 May 2015 09:57:48 -0700
From 20/20 Lab <lab@pacbell.net>
Organization 20/20 Optometric
Subject Re: Looking for direction
References <mailman.465.1431559626.12865.python-list@python.org> <5553f8fe$0$13012$c3e8da3$5496439d@news.astraweb.com>
In-Reply-To <5553f8fe$0$13012$c3e8da3$5496439d@news.astraweb.com>
Newsgroups comp.lang.python
Message-ID <mailman.5.1431622814.17265.python-list@python.org> (permalink)




On 05/13/2015 06:23 PM, Steven D'Aprano wrote:
> On Thu, 14 May 2015 09:24 am, 20/20 Lab wrote:
>
>> I'm a beginner to python.  Reading here and there.  Written a couple of
>> short and simple programs to make life easier around the office.
>>
>> That being said, I'm not even sure what I need to ask for. I've never
>> worked with external data before.
>>
>> I have a LARGE csv file that I need to process.  110+ columns, 72k
>> rows.  I managed to write enough to reduce it to a few hundred rows, and
>> the five columns I'm interested in.
> That's not large. Large is millions of rows, or tens of millions if you have
> enough memory. What's large to you and me is usually small to the computer.
>
> You should use the csv module for handling the CSV file, if you aren't
> already doing so. Do you need a url to the docs?
>
I actually stumbled across the csv module after coding enough to make a
list of lists.  So that is more the reason I approached the list:
nothing like spending hours (or days) coding something that already
exists and that I just don't know about.
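
For readers following along, here is a minimal sketch of how the csv module could replace a hand-rolled list-of-lists parser. The column names in WANTED and the sample data are hypothetical, since the thread never shows the real file's headers, and select_columns is an illustrative helper, not code from this thread.

```python
import csv
import io

# Hypothetical column names; the real file's headers are not given in the thread.
WANTED = ["Account", "Staff", "Item", "Qty", "Notes"]

def select_columns(lines, wanted=WANTED):
    """Return a list of lists holding only the wanted columns of each row."""
    reader = csv.DictReader(lines)  # uses the first line as the header row
    return [[record[name] for name in wanted] for record in reader]

# Tiny in-memory stand-in for the real 110-column file.
sample = io.StringIO(
    "Account,Staff,Item,Qty,Notes,Extra\n"
    "123,XXX,Widget,2,n/a,ignored\n"
    "72976,YYY,Gadget,5,n/a,ignored\n"
)
rows = select_columns(sample)
```

With a real file, the same function could be handed an object from open(path, newline=""). Note that the csv module returns every field as a string, so numeric columns such as Qty still need converting before any arithmetic.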
>> Now is where I have my problem:
>>
>> myList = [ [123, "XXX", "Item", "Qty", "Noise"],
>>              [72976, "YYY", "Item", "Qty", "Noise"],
>>              [123, "XXX", "ItemTypo", "Qty", "Noise"]    ]
>>
>> Basically, I need to check for rows with duplicate accounts (row[0]) and
>> staff (row[1]), and if so, remove that row, and add its Qty to the
>> original row. I really don't have a clue how to go about this.
> Is the order of the rows important? If not, the problem is simpler.
>
>
> processed = {}  # hold the processed data in a dict
>
> for row in myList:
>      account, staff = row[0:2]
>      key = (account, staff)  # Put them in a tuple.
>      if key in processed:
>          # We've already seen this combination.
>          processed[key][3] += row[3]  # Add the quantities.
>      else:
>          # Never seen this combination before.
>          processed[key] = row
>
> newlist = list(processed.values())
>
>
> Does that help?
>
>
>
It does, immensely.  I'll make this work.  Thank you again for the link 
from yesterday and apologies for hitting the wrong reply button.  I'll 
have to study more on the usage and implementations of dictionaries and 
tuples.
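
As a worked illustration of the dict-and-tuple approach quoted above, here is a minimal sketch. It assumes the quantities are whole numbers stored as strings (as they would be after reading a CSV file); merge_duplicates, qty_index, and the sample values are made up for the example and are not from the thread.

```python
def merge_duplicates(rows, qty_index=3):
    """Combine rows sharing (row[0], row[1]), summing the quantity column."""
    processed = {}  # maps (account, staff) -> merged row
    for row in rows:
        key = (row[0], row[1])
        qty = int(row[qty_index])  # assumption: whole-number quantities
        if key in processed:
            # Seen this account/staff pair before: add the quantities.
            processed[key][qty_index] += qty
        else:
            # First occurrence: store a copy so the input rows stay untouched.
            merged = list(row)
            merged[qty_index] = qty
            processed[key] = merged
    return list(processed.values())

# Made-up quantities standing in for the "Qty" placeholders in the post.
my_list = [
    [123, "XXX", "Item", "2", "Noise"],
    [72976, "YYY", "Item", "5", "Noise"],
    [123, "XXX", "ItemTypo", "3", "Noise"],
]
new_list = merge_duplicates(my_list)
```

The first row seen for each account/staff pair supplies the other columns, so the "ItemTypo" row only contributes its quantity. On Python 3.7 and later, dicts keep insertion order, so the result follows the order in which each pair first appeared.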



Thread

Looking for direction 20/20 Lab <lab@pacbell.net> - 2015-05-13 16:24 -0700
  Re: Looking for direction Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-05-14 11:23 +1000
    Re: Looking for direction 20/20 Lab <lab@pacbell.net> - 2015-05-14 09:57 -0700
    Re: Looking for direction Tim Chase <python.list@tim.thechases.com> - 2015-05-14 12:17 -0500
    Re: Looking for direction Ziqi Xiong <xiongziqi84@gmail.com> - 2015-05-15 03:31 +0000
  Re: Looking for direction darnold <darnold992000@yahoo.com> - 2015-05-20 05:50 -0700
    Re: Looking for direction 20/20 Lab <lab@pacbell.net> - 2015-05-20 14:18 -0700
