

Groups > comp.lang.python > #61606 > unrolled thread

adding values from a csv column and getting the mean. beginner help

Started by: brian cleere <briancleere@gmail.com>
First post: 2013-12-11 11:10 -0800
Last post: 2013-12-12 09:36 +0800
Articles: 12 (6 participants)



Contents

  adding values from a csv column and getting the mean. beginner help brian cleere <briancleere@gmail.com> - 2013-12-11 11:10 -0800
    Re: adding values from a csv column and getting the mean. beginner help Tim Chase <python.list@tim.thechases.com> - 2013-12-11 13:20 -0600
    Re: adding values from a csv column and getting the mean. beginner help Chris Angelico <rosuav@gmail.com> - 2013-12-12 06:22 +1100
    Re: adding values from a csv column and getting the mean. beginner help Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-12-11 19:34 +0000
    Re: adding values from a csv column and getting the mean. beginner help Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-12-11 19:41 +0000
    Re: adding values from a csv column and getting the mean. beginner help Chris Angelico <rosuav@gmail.com> - 2013-12-12 06:46 +1100
    Re: adding values from a csv column and getting the mean. beginner help Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-12-11 20:00 +0000
    Re: adding values from a csv column and getting the mean. beginner help Chris Angelico <rosuav@gmail.com> - 2013-12-12 07:03 +1100
    Re: adding values from a csv column and getting the mean. beginner help Tim Chase <python.list@tim.thechases.com> - 2013-12-11 14:20 -0600
    Re: adding values from a csv column and getting the mean. beginner help Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-12-11 20:20 +0000
    Re: adding values from a csv column and getting the mean. beginner help Christopher Welborn <cjwelborn@live.com> - 2013-12-11 18:49 -0600
    Re: adding values from a csv column and getting the mean. beginner help "M.F." <morefool@gmail.com> - 2013-12-12 09:36 +0800

#61606 — adding values from a csv column and getting the mean. beginner help

From: brian cleere <briancleere@gmail.com>
Date: 2013-12-11 11:10 -0800
Subject: adding values from a csv column and getting the mean. beginner help
Message-ID: <e3a75021-32a2-471e-b486-51d1b28ffef6@googlegroups.com>
I know the problem is with the for loop but don't know how to fix. Any help with explanation would be appreciated.

#!/bin/env python
import csv
import sys

if len(sys.argv) < 3:
    print('Please specify a filename and column number: {} [csvfile] [column]'.format(sys.argv[0]))
    sys.exit(1)

filename = sys.argv[1]
column = int(sys.argv[2])

for line in filename() , column ():
    elements = line.strip().split(',')
    values.append(int(elements[col]))

csum = sum(values)
cavg = sum(values)/len(values)
print("Sum of column %d: %f" % (col, csum))
print("Avg of column %d: %f" % (col, cavg))



#61607

From: Tim Chase <python.list@tim.thechases.com>
Date: 2013-12-11 13:20 -0600
Message-ID: <mailman.3921.1386789554.18130.python-list@python.org>
In reply to: #61606
On 2013-12-11 11:10, brian cleere wrote:
> filename = sys.argv[1]
> column = int(sys.argv[2])
> 
> for line in filename() , column ():
>     elements = line.strip().split(',')
>     values.append(int(elements[col]))

1) you need to open the file

2) you need to make use of the csv module on that file

3) you need to extract the column

Thus it would look something like

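  values = []
  column = int(sys.argv[2])
  f = open(sys.argv[1], "rb")  # binary mode suits Python 2's csv module
  r = csv.reader(f)
  try:
    for row in r:
      values.append(int(row[column]))
  finally:
    f.close()  # always close the file, even if a row fails to convert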

which can be obtusely written as

  values = [int(row[column]) for row in csv.reader(open(sys.argv[1], "rb"))]

though the more expanded version allows you to do better error
handling (rows with insufficient columns, non-numeric/non-integer
values in the specified column, etc).
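
A minimal sketch of the error handling mentioned above, written for Python 3 (where the file is opened in text mode rather than "rb"). The helper name read_column and the decision to skip bad rows instead of aborting are assumptions made for illustration, not part of the original post.

```python
import csv
import io


def read_column(lines, column):
    """Return (values, skipped) for one integer column of CSV input.

    `lines` is any iterable of text lines, e.g. an open file.
    Rows that are too short or hold a non-integer value are skipped
    and recorded by row number instead of aborting the whole run.
    """
    values = []
    skipped = []
    for rownum, row in enumerate(csv.reader(lines), start=1):
        try:
            values.append(int(row[column]))
        except IndexError:
            skipped.append((rownum, "too few columns"))
        except ValueError:
            skipped.append((rownum, "not an integer"))
    return values, skipped


# In-memory sample data standing in for a real file.
sample = io.StringIO("a,1\nb,2\nc\nd,x\ne,5\n")
values, skipped = read_column(sample, 1)
```

With a real file, the same function could be called as read_column(f, column) inside "with open(filename, newline='') as f:", which is how the Python 3 csv documentation recommends opening CSV files.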

-tkc




#61608

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-12-12 06:22 +1100
Message-ID: <mailman.3922.1386789733.18130.python-list@python.org>
In reply to: #61606
On Thu, Dec 12, 2013 at 6:10 AM, brian cleere <briancleere@gmail.com> wrote:
> I know the problem is with the for loop but don't know how to fix. Any help with explanation would be appreciated.

Your problem is akin to debugging an empty file :) It's not so much a
matter of fixing what's not working as of starting at the very
beginning: How do you iterate over the content of a CSV file?

Now, you're almost there... partly. You have the split() call, which
will split on the comma, so if you go that route, all you need to do
is open the file, using the aptly-named builtin function "open".
You'll find docs on that if you do a quick search.

But you're actually part-way to the better solution. You're importing
the 'csv' module, which is exactly what you need here. All you need is
to read up on its docs:

http://docs.python.org/3/library/csv.html

I'm sure you can figure out the rest of your homework from there!

Now, with that out of the way, I'd like to just mention a couple of
other things.

>    print('Please specify a filename and column number: {} [csvfile] [column]'.format(sys.argv[0]))

Square brackets in a usage description often mean "optional". You may
want to be careful of that. There's no really good solution though.

> csum = sum(values)
> cavg = sum(values)/len(values)

Once you've calculated the sum once, you can reuse that to calculate
the average. Can you see how? :)
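
One possible answer to the question above, as a small sketch with made-up sample data. It assumes Python 3, where / between two integers gives a float; on Python 2 the division would truncate unless one operand is converted with float() first.

```python
values = [3, 5, 10]  # sample data, not from the original post

csum = sum(values)         # the list is summed only once
cavg = csum / len(values)  # reuses csum instead of calling sum() again
```

An empty list would raise ZeroDivisionError here, so a real script may want to check len(values) before dividing.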

And finally: You're using Google Groups to post, which means your
paragraphs are unwrapped, and - unless you fight very hard against a
stupidly buggy piece of software - your replies will be malformed and
ugly. Don't make yourself look bad; switch to a better newsreader, or
to the mailing list:

https://mail.python.org/mailman/listinfo/python-list

The content is the same, you just subscribe to the list and read and
write as email.

Thanks! And welcome to the group.

ChrisA



#61610

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2013-12-11 19:34 +0000
Message-ID: <mailman.3924.1386790461.18130.python-list@python.org>
In reply to: #61606
On 11/12/2013 19:10, brian cleere wrote:
> I know the problem is with the for loop but don't know how to fix. Any help with explanation would be appreciated.
>
> #!/bin/env python
> import csv

You never use the csv module.

> import sys
>
> if len(sys.argv) < 3:
>      print('Please specify a filename and column number: {} [csvfile] [column]'.format(sys.argv[0]))
>      sys.exit(1)
>
> filename = sys.argv[1]
> column = int(sys.argv[2])
>
> for line in filename() , column ():

You're trying to loop around the filename and the column, you need to 
open the file and loop around that.

>      elements = line.strip().split(',')

Please don't do this when you've got the csv module to do things for you.

>      values.append(int(elements[col]))

Where did values come from?  Is it col or column, please make your mind up?

So let's stick things together.  Something like.

values = []
with open(filename) as csvfile:
     valuereader = csv.reader(csvfile)
     for row in valuereader:
         values.append(int(row[column]))

>
> csum = sum(values)
> cavg = sum(values)/len(values)
> print("Sum of column %d: %f" % (col, csum))
> print("Avg of column %d: %f" % (col, cavg))

I like consistency: new-style formatting above, old-style here. Still, if 
it works for you.
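
For illustration, the two report lines could use the same str.format style as the usage message. This is only a sketch; the sample numbers are made up and the names sum_line and avg_line are not from the original script.

```python
column = 2   # sample values standing in for the script's real results
csum = 18
cavg = 6.0

# {} takes the column number, {:f} prints a fixed-point number
# with six decimal places, matching the old %d and %f specifiers.
sum_line = "Sum of column {}: {:f}".format(column, csum)
avg_line = "Avg of column {}: {:f}".format(column, cavg)
print(sum_line)
print(avg_line)
```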

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence



#61612

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2013-12-11 19:41 +0000
Message-ID: <mailman.3926.1386790860.18130.python-list@python.org>
In reply to: #61606
On 11/12/2013 19:22, Chris Angelico wrote:
> On Thu, Dec 12, 2013 at 6:10 AM, brian cleere <briancleere@gmail.com> wrote:
>> I know the problem is with the for loop but don't know how to fix. Any help with explanation would be appreciated.
>
> Your problem is akin to debugging an empty file :) It's not so much a
> matter of fixing what's not working as of starting at the very
> beginning: How do you iterate over the content of a CSV file?
>
> Now, you're almost there... partly. You have the split() call, which
> will split on the comma, so if you go that route, all you need to do
> is open the file, using the aptly-named builtin function "open".
> You'll find docs on that if you do a quick search.
>
> But you're actually part-way to the better solution. You're importing
> the 'csv' module, which is exactly what you need here. All you need is
> to read up on its docs:
>
> http://docs.python.org/3/library/csv.html
>
> I'm sure you can figure out the rest of your homework from there!
>
> Now, with that out of the way, I'd like to just mention a couple of
> other things.
>
>>     print('Please specify a filename and column number: {} [csvfile] [column]'.format(sys.argv[0]))
>
> Square brackets in a usage description often mean "optional". You may
> want to be careful of that. There's no really good solution though.

There is, https://pypi.python.org/pypi/docopt/0.6.1 :)

>
>> csum = sum(values)
>> cavg = sum(values)/len(values)
>
> Once you've calculated the sum once, you can reuse that to calculate
> the average. Can you see how? :)
>
> And finally: You're using Google Groups to post, which means your
> paragraphs are unwrapped, and - unless you fight very hard against a
> stupidly buggy piece of software - your replies will be malformed and
> ugly. Don't make yourself look bad; switch to a better newsreader, or
> to the mailing list:
>
> https://mail.python.org/mailman/listinfo/python-list
>
> The content is the same, you just subscribe to the list and read and
> write as email.

Ooh 'eck, we'll have the Popular Front for the Liberation of Google 
Groups squad out in force again, vainly trying to defend the bug ridden 
crap that they insist on using, and which I obviously won't mention. 
Whoops!!!

Oh Lord, won't you buy me Mozilla Thunderbird ?
My friends all use GG, I think that's absurd.
Worked hard all my lifetime, no help from the nerds,
So Lord, won't you buy me Mozilla Thunderbird ?

With apologies to the late, great Janis Joplin.

>
> Thanks! And welcome to the group.
>
> ChrisA
>

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence



#61613

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-12-12 06:46 +1100
Message-ID: <mailman.3927.1386791221.18130.python-list@python.org>
In reply to: #61606
On Thu, Dec 12, 2013 at 6:41 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> Square brackets in a usage description often mean "optional". You may
>> want to be careful of that. There's no really good solution though.
>
> There is, https://pypi.python.org/pypi/docopt/0.6.1 :)

That appears to use <x> for a mandatory argument x, which is then
slightly ambiguous with shell redirection. But that's the best
notation I've ever seen for distinguishing mandatory args from fixed
keywords.

ChrisA



#61614

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2013-12-11 20:00 +0000
Message-ID: <mailman.3928.1386792023.18130.python-list@python.org>
In reply to: #61606
On 11/12/2013 19:46, Chris Angelico wrote:
> On Thu, Dec 12, 2013 at 6:41 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>>> Square brackets in a usage description often mean "optional". You may
>>> want to be careful of that. There's no really good solution though.
>>
>> There is, https://pypi.python.org/pypi/docopt/0.6.1 :)
>
> That appears to use <x> for a mandatory argument x, which is then
> slightly ambiguous with shell redirection. But that's the best
> notation I've ever seen for distinguishing mandatory args from fixed
> keywords.
>
> ChrisA
>

I use the alternative X for a mandatory argument X.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence



#61615

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-12-12 07:03 +1100
Message-ID: <mailman.3929.1386792229.18130.python-list@python.org>
In reply to: #61606
On Thu, Dec 12, 2013 at 7:00 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
> I use the alternative X for a mandatory argument X.

Also common, but how do you specify a keyword, then? Say you have a
command with subcommands:

$0 foo x y
Move the foo to (x,y)
$0 bar x y z
Go to bar X, order a Y, and Z it [eg 'compress', 'gzip', 'drink']

How do you show that x/y/z are mandatory args, but foo/bar are
keywords to be typed exactly? In some formats italicized text can make
that distinction, but not in pure text.

ChrisA



#61617

From: Tim Chase <python.list@tim.thechases.com>
Date: 2013-12-11 14:20 -0600
Message-ID: <mailman.3931.1386793170.18130.python-list@python.org>
In reply to: #61606
On 2013-12-12 07:03, Chris Angelico wrote:
> Also common, but how do you specify a keyword, then? Say you have a
> command with subcommands:
> 
> $0 foo x y
> Move the foo to (x,y)
> $0 bar x y z
> Go to bar X, order a Y, and Z it [eg 'compress', 'gzip', 'drink']
> 
> How do you show that x/y/z are mandatory args, but foo/bar are
> keywords to be typed exactly? In some formats italicized text can
> make that distinction, but not in pure text.

I prefer {} notation:

  $0 mv [--optional] {x} {y}
  $0 bar [--mutually|--exclusive] {x} {y} {z}

-tkc



#61618

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2013-12-11 20:20 +0000
Message-ID: <mailman.3932.1386793211.18130.python-list@python.org>
In reply to: #61606
On 11/12/2013 20:03, Chris Angelico wrote:
> On Thu, Dec 12, 2013 at 7:00 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> I use the alternative X for a mandatory argument X.
>
> Also common, but how do you specify a keyword, then? Say you have a
> command with subcommands:
>
> $0 foo x y
> Move the foo to (x,y)
> $0 bar x y z
> Go to bar X, order a Y, and Z it [eg 'compress', 'gzip', 'drink']
>
> How do you show that x/y/z are mandatory args, but foo/bar are
> keywords to be typed exactly? In some formats italicized text can make
> that distinction, but not in pure text.
>
> ChrisA
>

Haven't a clue off the top of my head so read all about it here 
https://github.com/docopt/docopt

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence



#61631

From: Christopher Welborn <cjwelborn@live.com>
Date: 2013-12-11 18:49 -0600
Message-ID: <mailman.3941.1386809370.18130.python-list@python.org>
In reply to: #61606
On 12/11/2013 01:41 PM, Mark Lawrence wrote:
> On 11/12/2013 19:22, Chris Angelico wrote:
> There is, https://pypi.python.org/pypi/docopt/0.6.1 :)
>

+1 for docopt. It makes everything very clear. Just type out your usage 
string, and then run docopt(usage_str) on it to get a dict of your args. 
When I saw the video at http://docopt.org my jaw dropped. I couldn't 
believe all of the arg parsing junk I had been writing for even the 
smallest scripts. The other arg parsing libs make it easier than 
manually doing it, but docopt is magic.
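
A rough sketch of that workflow applied to the original poster's script. It assumes the third-party docopt package is installed; the script name colmean.py is hypothetical, and argv is passed explicitly only so the example runs without a real command line.

```python
USAGE = """Usage:
  colmean.py <csvfile> <column>
"""

try:
    from docopt import docopt  # third-party package, not in the standard library
except ImportError:
    docopt = None  # lets the sketch run even where docopt is missing

if docopt is not None:
    # By default docopt parses sys.argv[1:]; argv is given here for the demo.
    args = docopt(USAGE, argv=["data.csv", "2"])
    column = int(args["<column>"])  # docopt returns the values as strings
else:
    args = None
```

In docopt's notation, names in angle brackets (or in upper case) are positional arguments and bare words are commands to be typed exactly.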


-- 

- Christopher Welborn <cjwelborn@live.com>
   http://welbornprod.com



#61640

From: "M.F." <morefool@gmail.com>
Date: 2013-12-12 09:36 +0800
Message-ID: <l8b3u8$lru$1@speranza.aioe.org>
In reply to: #61606
On 12/12/2013 03:10 AM, brian cleere wrote:
> I know the problem is with the for loop but don't know how to fix. Any help with explanation would be appreciated.
>
> #!/bin/env python
> import csv
> import sys
>
> if len(sys.argv) < 3:
>      print('Please specify a filename and column number: {} [csvfile] [column]'.format(sys.argv[0]))
>      sys.exit(1)
>
> filename = sys.argv[1]
> column = int(sys.argv[2])


> for line in filename() , column ():
>      elements = line.strip().split(',')
>      values.append(int(elements[col]))

"filename" is a string and "column" is an integer, so the interpreter 
raises a TypeError because neither of them is callable.
The above three lines could be changed to
==>
values = []
for line in open(filename):
      elements = line.strip().split(',')
      values.append(int(elements[column]))

or now that you have the "csv" module, you can use "csv.reader" to parse 
the file:
==>
values = [int(row[column]) for row in csv.reader(open(filename, 'rb'))
          if len(row) > column]
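
The "not callable" point made earlier in this post can be checked directly. A tiny sketch, with a hypothetical file name standing in for sys.argv[1]:

```python
filename = "data.csv"  # hypothetical; values from sys.argv are always strings

try:
    filename()  # calling a str, as the original loop header did
except TypeError as exc:
    error_message = str(exc)  # explains that 'str' objects are not callable
```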

>
> csum = sum(values)
> cavg = sum(values)/len(values)
> print("Sum of column %d: %f" % (col, csum))
> print("Avg of column %d: %f" % (col, cavg))
>
