


Re: Looking for direction

Started by: 20/20 Lab <lab@pacbell.net>
First post: 2015-05-14 08:58 -0700
Last post: 2015-05-14 08:58 -0700
Articles: 1 (1 participant)


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.



#90653 — Re: Looking for direction

From: 20/20 Lab <lab@pacbell.net>
Date: 2015-05-14 08:58 -0700
Subject: Re: Looking for direction
Message-ID: <mailman.27.1431674916.17265.python-list@python.org>

On 05/13/2015 06:12 PM, Dave Angel wrote:
> On 05/13/2015 08:45 PM, 20/20 Lab wrote:
>
> You accidentally replied to me, rather than the mailing list. Please 
> use reply-list, or if your mailer can't handle that, do a Reply-All, 
> and remove the parts you don't want.
>
> >
> > On 05/13/2015 05:07 PM, Dave Angel wrote:
> >> On 05/13/2015 07:24 PM, 20/20 Lab wrote:
> >>> I'm a beginner to python.  Reading here and there. Written a couple of
> >>> short and simple programs to make life easier around the office.
> >>>
> >> Welcome to Python, and to this mailing list.
> >>
> >>> That being said, I'm not even sure what I need to ask for. I've never
> >>> worked with external data before.
> >>>
> >>> I have a LARGE csv file that I need to process.  110+ columns, 72k
> >>> rows.
> >>
> >> That's not very large at all.
> >>
> > In the grand scheme, I guess not.  However I'm currently doing this
> > whole process using office.  So it can be a bit daunting.
>
> I'm not familiar with the "office" operating system.
>
> >>>  I managed to write enough to reduce it to a few hundred rows, and
> >>> the five columns I'm interested in.
> >>
> >>>
> >>> Now is where I have my problem:
> >>>
> >>> myList = [ [123, "XXX", "Item", "Qty", "Noise"],
> >>>             [72976, "YYY", "Item", "Qty", "Noise"],
> >>>             [123, "XXX", "ItemTypo", "Qty", "Noise"]    ]
> >>>
> >>
> >> It'd probably be useful to identify names for your columns, even if
> >> it's just in a comment.  Guessing from the paragraph below, I figure
> >> the first two columns are "account" & "staff"
> >
> > The columns that I pull are Account, Staff, Item Sold, Quantity sold,
> > and notes about the sale (notes aren't particularly needed, but the
> > higher ups would like them in the report)
> >>
> >>> Basically, I need to check for rows with duplicate accounts (row[0]) and
> >>> staff (row[1]), and if so, remove that row, and add its Qty to the
> >>> original row.
> >>
> >> And which column is that supposed to be?  Shouldn't there be a number
> >> there, rather than a string?
> >>
> >>> I really don't have a clue how to go about this.  The
> >>> number of rows changes based on which run it is, so I couldn't even get
> >>> away with using hundreds of compare loops.
> >>>
> >>> If someone could point me to some documentation on the functions I would
> >>> need, or a tutorial it would be a great help.
> >>>
> >>
> >> Is the order significant?  Do you have to preserve the order that the
> >> accounts appear?  I'll assume not.
> >>
> >> Have you studied dictionaries?  Seems to me the way to handle the
> >> problem is to read in a row, create a dictionary with key of (account,
> >> staff), and data of the rest of the line.
> >>
> >> Each time you read a row, you check if the key is already in the
> >> dictionary.  If not, add it.  If it's already there, merge the data as
> >> you say.
> >>
> >> Then when you're done, turn the dict back into a list of lists.
> >>
> > The order is irrelevant.  No, I've not really studied dictionaries, but
> > a few people have mentioned it.  I'll have to read up on them and, more
> > importantly, their applications.  Seems that they are more versatile
> > than I thought.
> >
> > Thank you.
>
> You have to realize that a tuple can be used as a key, in your case a 
> tuple of Account and Staff.
>
> You'll have to decide how you're going to merge the ItemSold, 
> QuantitySold, and notes.
>
Tells you how often I actually talk in mailing lists.  My apologies, and 
thank you again.
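
For anyone following along, the dictionary approach Dave describes above might look roughly like the sketch below. It is only an illustration under a few assumptions that are not settled in the thread: the Qty column has already been converted to a number, the first row seen for an (Account, Staff) pair supplies the Item and notes that are kept, and the names merge_rows and sample_rows are made up for the example.

```python
def merge_rows(rows):
    """Combine rows that share (account, staff), summing their quantities.

    Assumes each row is [account, staff, item, qty, notes] with a numeric qty.
    """
    merged = {}
    for account, staff, item, qty, notes in rows:
        key = (account, staff)  # a tuple is hashable, so it can be a dict key
        if key not in merged:
            # First time this pair is seen: keep a copy of the whole row.
            merged[key] = [account, staff, item, qty, notes]
        else:
            # Duplicate pair: fold its quantity into the row already stored.
            merged[key][3] += qty
    # Turn the dict back into a list of lists.
    return list(merged.values())


# Hypothetical data shaped like the example in the thread, with numeric Qty.
sample_rows = [
    [123, "XXX", "Item", 2, "Noise"],
    [72976, "YYY", "Item", 1, "Noise"],
    [123, "XXX", "ItemTypo", 3, "Noise"],
]
result = merge_rows(sample_rows)
```

How to merge the Item and notes columns is still an open choice, as Dave says; joining the notes or collecting the items in a list would work just as well as keeping the first row's values.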




