Re: Looking for direction

Date Wed, 13 May 2015 20:07:07 -0400
From Dave Angel <davea@davea.name>
Subject Re: Looking for direction
Newsgroups comp.lang.python


On 05/13/2015 07:24 PM, 20/20 Lab wrote:
> I'm a beginner to Python.  Reading here and there.  Written a couple of
> short and simple programs to make life easier around the office.
>
Welcome to Python, and to this mailing list.

> That being said, I'm not even sure what I need to ask for. I've never
> worked with external data before.
>
> I have a LARGE csv file that I need to process.  110+ columns, 72k
> rows.

That's not very large at all.

>  I managed to write enough to reduce it to a few hundred rows, and
> the five columns I'm interested in.

>
> Now is where I have my problem:
>
> myList = [ [123, "XXX", "Item", "Qty", "Noise"],
>             [72976, "YYY", "Item", "Qty", "Noise"],
>             [123, "XXX", "ItemTypo", "Qty", "Noise"]    ]
>

It'd probably be useful to give your columns names, even if it's
just in a comment.  Guessing from the paragraph below, I figure the
first two columns are "account" and "staff".
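
One lightweight way to name the columns is a set of index constants.
This is only a sketch: the names ACCOUNT, STAFF, ITEM, QTY and NOISE are
my guesses at what the five columns mean, not anything from your file.

```python
# Column positions in each reduced row. The names are guesses based on
# the sample data; adjust them to match the real columns.
ACCOUNT, STAFF, ITEM, QTY, NOISE = range(5)

row = [123, "XXX", "Item", "Qty", "Noise"]

# row[ACCOUNT] reads better than row[0] and documents the layout.
key = (row[ACCOUNT], row[STAFF])
```

With names like these, the rest of the code says what it is doing
without needing a comment on every index.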

> Basically, I need to check for rows with duplicate accounts (row[0]) and
> staff (row[1]), and if so, remove that row, and add its Qty to the
> original row.

And which column is that supposed to be?  Shouldn't there be a number 
there, rather than a string?
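
If the rows come from the csv module, every field arrives as a string,
so the quantity has to be converted before it can be added.  A minimal
sketch, assuming the five-column layout above and using made-up sample
data held in memory in place of your real file:

```python
import csv
import io

# Hypothetical sample standing in for the reduced CSV file.
sample = io.StringIO(
    "123,XXX,Item,4,Noise\n"
    "72976,YYY,Item,10,Noise\n"
)

rows = []
for account, staff, item, qty, noise in csv.reader(sample):
    # csv.reader yields strings; convert the numeric columns explicitly.
    rows.append([int(account), staff, item, int(qty), noise])
```

For a real file you would pass an open file object to csv.reader
instead of the StringIO.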

> I really don't have a clue how to go about this.  The
> number of rows changes based on which run it is, so I couldn't even get
> away with using hundreds of compare loops.
>
> If someone could point me to some documentation on the functions I would
> need, or a tutorial it would be a great help.
>

Is the order significant?  Do you have to preserve the order in which
the accounts appear?  I'll assume not.

Have you studied dictionaries?  Seems to me the way to handle the
problem is to read in the rows one at a time and build a single
dictionary whose key is the tuple (account, staff) and whose value is
the rest of the line.

Each time you read a row, you check if the key is already in the 
dictionary.  If not, add it.  If it's already there, merge the data as 
you say.

Then when you're done, turn the dict back into a list of lists.
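
The steps above might look something like this.  It's a sketch under a
few assumptions of mine: the quantity is already a number and sits at
index 3, the other fields are kept from the first row seen, and
merge_duplicates is just a name I made up.

```python
def merge_duplicates(rows, qty_index=3):
    """Combine rows that share (account, staff), summing the quantity."""
    merged = {}
    for row in rows:
        key = (row[0], row[1])            # (account, staff)
        if key not in merged:
            merged[key] = list(row)       # copy, so the input is untouched
        else:
            merged[key][qty_index] += row[qty_index]
    return list(merged.values())          # dict back to a list of lists


# Sample data modelled on the original post, with numeric quantities.
my_list = [
    [123, "XXX", "Item", 2, "Noise"],
    [72976, "YYY", "Item", 5, "Noise"],
    [123, "XXX", "ItemTypo", 3, "Noise"],
]

result = merge_duplicates(my_list)
```

On Python 3.7 and later a dict remembers insertion order, so the
result happens to keep the order in which each account first appeared.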

-- 
DaveA