
The sum of numbers in a line from a file

Started by	kxjakkk <kjakupak@gmail.com>
First post	2014-02-20 08:22 -0800
Last post	2014-02-24 04:02 -0800
Articles	13 — 11 participants


  The sum of numbers in a line from a file kxjakkk <kjakupak@gmail.com> - 2014-02-20 08:22 -0800
    Re: The sum of numbers in a line from a file John Gordon <gordon@panix.com> - 2014-02-20 16:46 +0000
      Re: The sum of numbers in a line from a file kjakupak@gmail.com - 2014-02-20 10:16 -0800
        Re: The sum of numbers in a line from a file Joel Goldstick <joel.goldstick@gmail.com> - 2014-02-20 13:32 -0500
          Re: The sum of numbers in a line from a file kjakupak@gmail.com - 2014-02-20 10:37 -0800
            Re: The sum of numbers in a line from a file Peter Otten <__peter__@web.de> - 2014-02-20 19:59 +0100
        Re: The sum of numbers in a line from a file Chris “Kwpolska” Warrick <kwpolska@gmail.com> - 2014-02-20 19:56 +0100
        Re: The sum of numbers in a line from a file MRAB <python@mrabarnett.plus.com> - 2014-02-20 19:17 +0000
    Re:The sum of numbers in a line from a file Dave Angel <davea@davea.name> - 2014-02-20 11:54 -0500
    Re: The sum of numbers in a line from a file Travis Griggs <travisgriggs@gmail.com> - 2014-02-20 14:35 -0800
    Re: The sum of numbers in a line from a file Johannes Schneider <johannes.schneider@galileo-press.de> - 2014-02-21 09:16 +0100
    Re: The sum of numbers in a line from a file sffjunkie@gmail.com - 2014-02-24 03:48 -0800
      Re: The sum of numbers in a line from a file sffjunkie@gmail.com - 2014-02-24 04:02 -0800

#66769 — The sum of numbers in a line from a file

From	kxjakkk <kjakupak@gmail.com>
Date	2014-02-20 08:22 -0800
Subject	The sum of numbers in a line from a file
Message-ID	<882091da-a499-477e-8f50-c5bdde7cdfec@googlegroups.com>

Let's say I have a sample file like this:

Name        1           2     3    4     5      6     7    8
------------------------------------------------------------------------
name1    099-66-7871   A-F    Y    100    67    81    59    98
name2    999-88-7766   A-F    N    99   100    96    91    90
name3    000-00-0110    AUD        5    100    28    19    76
name4    398-72-3333    P/F    Y    76    84    49    69    78
name5    909-37-3689    A-F    Y    97    94    100    61    79

For name1, I want to add together columns 4, 5, 6, and get an average from that, then do the same for the last two columns. I want to do this for every name. 

All I've got is
sum([int(s.strip()) for s in open('file').readlines()])


#66775

From	John Gordon <gordon@panix.com>
Date	2014-02-20 16:46 +0000
Message-ID	<le5bhe$461$1@reader1.panix.com>
In reply to	#66769

In <882091da-a499-477e-8f50-c5bdde7cdfec@googlegroups.com> kxjakkk <kjakupak@gmail.com> writes:

> Let's say I have a sample file like this:

> Name        1           2     3    4     5      6     7    8
> ------------------------------------------------------------------------
> name1    099-66-7871   A-F    Y    100    67    81    59    98
> name2    999-88-7766   A-F    N    99   100    96    91    90
> name3    000-00-0110    AUD        5    100    28    19    76
> name4    398-72-3333    P/F    Y    76    84    49    69    78
> name5    909-37-3689    A-F    Y    97    94    100    61    79

> For name1, I want to add together columns 4, 5, 6, and get an average from that, then do the same for the last two columns. I want to do this for every name. 

> All I've got is
> sum([int(s.strip()) for s in open('file').readlines()])

This should get you started.  However, this code does not work for all
your input lines, as 'name3' is missing the third field.  You'll have to
modify the code to do something smarter than simple space-delimited
parsing.  But as I said, it's a start.

# open the file
with open('file') as fp:

    # process each line
    for line in fp.readlines():

        # split the line into a list of fields, delimited by spaces
        fields = line.split()

        # grab the name
        name = fields[0]
        
        # convert text values to integers and sum them
        sum1 = int(fields[4]) + int(fields[5]) + int(fields[6])
        sum2 = int(fields[7]) + int(fields[8])

        # compute the averages
        average1 = sum1 / 3.0
        average2 = sum2 / 2.0

        # display output
        print '%s %f %f' % (name, average1, average2)
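
One possible way to cope with the blank field on the 'name3' line is to count fields from the right instead of from the left. The sketch below is not from the original post. It assumes the five score columns are always present at the end of every row, and the helper name `averages` and the inline `sample` rows are made up for illustration.

```python
def averages(line):
    """Return (name, average of first three scores, average of last two)."""
    fields = line.split()
    name = fields[0]
    # Assumption: the five scores are always the last five fields, even
    # when an earlier column (such as the Y/N flag) is blank.
    scores = [int(value) for value in fields[-5:]]
    return name, sum(scores[:3]) / 3.0, sum(scores[3:]) / 2.0

# Two rows copied from the sample table; 'name3' has the blank column.
sample = [
    "name1    099-66-7871   A-F    Y    100    67    81    59    98",
    "name3    000-00-0110    AUD        5    100    28    19    76",
]
results = [averages(line) for line in sample]
for name, average1, average2 in results:
    print('%s %f %f' % (name, average1, average2))
```

Negative slicing only helps when the missing column sits to the left of the data you need. If a score itself could be blank, a fixed-width parse would be the safer choice.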

-- 
John Gordon         Imagine what it must be like for a real medical doctor to
gordon@panix.com    watch 'House', or a real serial killer to watch 'Dexter'.


#66778

From	kjakupak@gmail.com
Date	2014-02-20 10:16 -0800
Message-ID	<9036049f-7d08-4de7-83f8-61e5fa2c2d2f@googlegroups.com>
In reply to	#66775

What I've got is 
def stu_scores():
    lines = []
    with open("file.txt") as f:
        lines.extend(f.readlines())
    return ("".join(lines[11:]))

scores = stu_scores()
for line in scores:
    fields = line.split()
    name = fields[0]
    sum1 = int(fields[4]) + int(fields[5]) + int(fields[6])
    sum2 = int(fields[7]) + int(fields[8])
    average1 = sum1 / 3.0
    average2 = sum2 / 2.0
    print ("%s %f %f %") (name, average1, average2)

It says that the list index is out of range on the sum1 line. I need stu_scores because the table from above starts on line 11.


#66779

From	Joel Goldstick <joel.goldstick@gmail.com>
Date	2014-02-20 13:32 -0500
Message-ID	<mailman.7201.1392921172.18130.python-list@python.org>
In reply to	#66778


On Feb 20, 2014 1:20 PM, <kjakupak@gmail.com> wrote:
>
> What I've got is
> def stu_scores():
>     lines = []
>     with open("file.txt") as f:
>         lines.extend(f.readlines())
>     return ("".join(lines[11:]))
>
> scores = stu_scores()
> for line in scores:
>     fields = line.split()
>     name = fields[0]
Print fields here to see what's up
>     sum1 = int(fields[4]) + int(fields[5]) + int(fields[6])
>     sum2 = int(fields[7]) + int(fields[8])
>     average1 = sum1 / 3.0
>     average2 = sum2 / 2.0
>     print ("%s %f %f %") (name, average1, average2)
>
> It says that the list index is out of range on the sum1 line. I need stu_scores because the table from above starts on line 11.
> --
> https://mail.python.org/mailman/listinfo/python-list


#66780

From	kjakupak@gmail.com
Date	2014-02-20 10:37 -0800
Message-ID	<5dacc2f1-811e-4c30-9873-2f7573a7bb76@googlegroups.com>
In reply to	#66779

scores = stu_scores()
for line in scores:
    fields = line.split()
    name = fields[0]
print (fields)

Error comes up saying "IndexError: list index out of range."


#66782

From	Peter Otten <__peter__@web.de>
Date	2014-02-20 19:59 +0100
Message-ID	<mailman.7203.1392922774.18130.python-list@python.org>
In reply to	#66780

kjakupak@gmail.com wrote:

> Error comes up saying "IndexError: list index out of range."

OK, then the 12th line starts with a whitespace char. Add more print 
statements to see the problem:

> scores = stu_scores()

print scores

> for line in scores:

      print repr(line)

>     fields = line.split()
>     name = fields[0]
> print (fields)

Hint:

> def stu_scores():
>     lines = []
>     with open("file.txt") as f:
>         lines.extend(f.readlines())
>     return ("".join(lines[11:]))
 
Why did you "".join the lines?


#66781

From	Chris “Kwpolska” Warrick <kwpolska@gmail.com>
Date	2014-02-20 19:56 +0100
Message-ID	<mailman.7202.1392922619.18130.python-list@python.org>
In reply to	#66778

On Thu, Feb 20, 2014 at 7:16 PM,  <kjakupak@gmail.com> wrote:
> What I've got is
> def stu_scores():
>     lines = []
>     with open("file.txt") as f:
>         lines.extend(f.readlines())
>     return ("".join(lines[11:]))

This returns a string, not a list.  Moreover, lines.extend() is
useless. Replace this with:

def stu_scores():
    with open("file.txt") as f:
        lines = f.readlines()
    return lines[11:]

> scores = stu_scores()
> for line in scores:

`for` operating on strings iterates over each character.

>     fields = line.split()

Splitting one character will turn it into a list containing itself, or
nothing if it was whitespace.

>     name = fields[0]

This is not what you want it to be — it’s only a single letter.

>     sum1 = int(fields[4]) + int(fields[5]) + int(fields[6])

Thus it fails here, because ['n'] has just one item, and not nine.
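
The difference between looping over a joined string and looping over a list of lines can be seen in a small sketch. The two rows here are shortened, made-up stand-ins for the real file, and the variable names are illustrative only.

```python
# What "".join(lines) produces: one long string.
text = "".join(["name1  100  67\n", "name2  99  100\n"])

# Iterating over a string yields single characters, so split() sees
# one character at a time and returns [] for whitespace.
per_char = [piece.split() for piece in text[:7]]
print(per_char)

# Iterating over a list of lines yields whole rows, which is what the
# field indexing expects.
per_line = [line.split() for line in text.splitlines()]
print(per_line)
```

The first list holds one-item or empty lists, so any index beyond 0 raises IndexError, and index 0 itself fails on the whitespace characters.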

>     sum2 = int(fields[7]) + int(fields[8])
>     average1 = sum1 / 3.0
>     average2 = sum2 / 2.0
>     print ("%s %f %f %") (name, average1, average2)
>
> It says that the list index is out of range on the sum1 line. I need stu_scores because the table from above starts on line 11.
> --
> https://mail.python.org/mailman/listinfo/python-list



-- 
Chris “Kwpolska” Warrick <http://kwpolska.tk>
PGP: 5EAAEA16
stop html mail | always bottom-post | only UTF-8 makes sense


#66783

From	MRAB <python@mrabarnett.plus.com>
Date	2014-02-20 19:17 +0000
Message-ID	<mailman.7204.1392923835.18130.python-list@python.org>
In reply to	#66778

On 2014-02-20 18:16, kjakupak@gmail.com wrote:
> What I've got is
> def stu_scores():
>      lines = []
>      with open("file.txt") as f:
>          lines.extend(f.readlines())
>      return ("".join(lines[11:]))
>
> scores = stu_scores()
> for line in scores:
>      fields = line.split()
>      name = fields[0]
>      sum1 = int(fields[4]) + int(fields[5]) + int(fields[6])
>      sum2 = int(fields[7]) + int(fields[8])
>      average1 = sum1 / 3.0
>      average2 = sum2 / 2.0
>      print ("%s %f %f %") (name, average1, average2)
>
> It says that the list index is out of range on the sum1 line. I need stu_scores because the table from above starts on line 11.
>
Apart from the other replies, the final print is wrong. It should be:

     print "%s %f %f" % (name, average1, average2)
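
A short sketch of why the original form fails, written so that it runs on both Python 2 and Python 3. The values are taken from the 'name1' row, and `bad_format` is a name introduced only for this illustration. Storing the format in a variable reproduces the Python 2 behaviour, where the parenthesised string ends up being called like a function.

```python
name, average1, average2 = "name1", 248 / 3.0, 157 / 2.0

# Correct: the % operator sits between the format string and the tuple.
formatted = "%s %f %f" % (name, average1, average2)
print(formatted)

# The original form has no % operator, so the string is called like a
# function, which raises TypeError. It also ends in a stray "%".
bad_format = "%s %f %f %"
try:
    bad_format(name, average1, average2)
    error = None
except TypeError as exc:
    error = exc
print(repr(error))
```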


#66777

From	Dave Angel <davea@davea.name>
Date	2014-02-20 11:54 -0500
Message-ID	<mailman.7199.1392915307.18130.python-list@python.org>
In reply to	#66769

 kxjakkk <kjakupak@gmail.com> Wrote in message:
> Let's say I have a sample file like this:
> 
> Name        1           2     3    4     5      6     7    8
> ------------------------------------------------------------------------
> name1    099-66-7871   A-F    Y    100    67    81    59    98
> name2    999-88-7766   A-F    N    99   100    96    91    90
> name3    000-00-0110    AUD        5    100    28    19    76
> name4    398-72-3333    P/F    Y    76    84    49    69    78
> name5    909-37-3689    A-F    Y    97    94    100    61    79
> 
> For name1, I want to add together columns 4, 5, 6, and get an average from that, then do the same for the last two columns. I want to do this for every name. 
> 
> All I've got is
> sum([int(s.strip()) for s in open('file').readlines()])
> 

Don'ttrytodoitallinoneline.thatwayyouactuallymighthaveaplacetoinse
rtsomeextralogic.
-- 
DaveA


#66784

From	Travis Griggs <travisgriggs@gmail.com>
Date	2014-02-20 14:35 -0800
Message-ID	<mailman.7205.1392935724.18130.python-list@python.org>
In reply to	#66769

On Feb 20, 2014, at 8:54 AM, Dave Angel <davea@davea.name> wrote:

> kxjakkk <kjakupak@gmail.com> Wrote in message:
>> Let's say I have a sample file like this:
>> 
>> Name        1           2     3    4     5      6     7    8
>> ------------------------------------------------------------------------
>> name1    099-66-7871   A-F    Y    100    67    81    59    98
>> name2    999-88-7766   A-F    N    99   100    96    91    90
>> name3    000-00-0110    AUD        5    100    28    19    76
>> name4    398-72-3333    P/F    Y    76    84    49    69    78
>> name5    909-37-3689    A-F    Y    97    94    100    61    79
>> 
>> For name1, I want to add together columns 4, 5, 6, and get an average from that, then do the same for the last two columns. I want to do this for every name. 
>> 
>> All I've got is
>> sum([int(s.strip()) for s in open('file').readlines()])
>> 
> 
> Don'ttrytodoitallinoneline.thatwayyouactuallymighthaveaplacetoinse
> rtsomeextralogic.
> 
Yes.

Clearly

the

preferred

way

to

do

it

is

with

lots

of

lines

with

room

for

expandability.

Sorry Dave, couldn’t resist. Clearly a balance between extremes is desirable.

(Mark, I intentionally put the blank lines in this time <grin>)

Travis Griggs
“Every institution tends to perish by an excess of its own basic principle.” -- Lord Acton


#66811

From	Johannes Schneider <johannes.schneider@galileo-press.de>
Date	2014-02-21 09:16 +0100
Message-ID	<mailman.7218.1392970584.18130.python-list@python.org>
In reply to	#66769

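s = 4
e = 7
name = 'name1'  # assumed: the row to look up (undefined in the original post)
with open('path_to_file') as fp:
    for line in fp:
        if line.startswith(name):
            # split() with no argument already drops empty fields
            avg = sum(map(int, line.split()[s:e])) / float(e - s)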






On 20.02.2014 17:22, kxjakkk wrote:
> Let's say I have a sample file like this:
>
> Name        1           2     3    4     5      6     7    8
> ------------------------------------------------------------------------
> name1    099-66-7871   A-F    Y    100    67    81    59    98
> name2    999-88-7766   A-F    N    99   100    96    91    90
> name3    000-00-0110    AUD        5    100    28    19    76
> name4    398-72-3333    P/F    Y    76    84    49    69    78
> name5    909-37-3689    A-F    Y    97    94    100    61    79
>
> For name1, I want to add together columns 4, 5, 6, and get an average from that, then do the same for the last two columns. I want to do this for every name.
>
> All I've got is
> sum([int(s.strip()) for s in open('file').readlines()])
>


-- 
Johannes Schneider
Webentwicklung
johannes.schneider@galileo-press.de
Tel.: +49.228.42150.xxx

Galileo Press GmbH
Rheinwerkallee 4 - 53227 Bonn - Germany
Tel.: +49.228.42.150.0 (Zentrale) .77 (Fax)
http://www.galileo-press.de/

Geschäftsführer: Tomas Wehren, Ralf Kaulisch, Rainer Kaltenecker
HRB 8363 Amtsgericht Bonn


#66977

From	sffjunkie@gmail.com
Date	2014-02-24 03:48 -0800
Message-ID	<e1d5596c-81d2-4422-8f8f-a3b38a596451@googlegroups.com>
In reply to	#66769

On Thursday, 20 February 2014 16:22:00 UTC, kxjakkk  wrote:
> Let's say I have a sample file like this:
> Name        1           2     3    4     5      6     7    8
> ------------------------------------------------------------------------
> name1    099-66-7871   A-F    Y    100    67    81    59    98
> name2    999-88-7766   A-F    N    99   100    96    91    90
> name3    000-00-0110    AUD        5    100    28    19    76
> name4    398-72-3333    P/F    Y    76    84    49    69    78
> name5    909-37-3689    A-F    Y    97    94    100    61    79
> 
> For name1, I want to add together columns 4, 5, 6, and get an average from that, then do the same for the last two columns. I want to do this for every name. 

The following solution requires Python 3 (it relies on extended unpacking with the * syntax)

----

from collections import defaultdict, namedtuple

info = namedtuple('info', 'sum avg')

interesting_data = (x.strip(' \n') for idx, x in enumerate(open('file').readlines()) if idx > 1 and len(x.strip(' \n')) > 0)

split_points = [2, 4, 5]

results = defaultdict(list)
for line in interesting_data:
    name, _, _, _, *rest = line.split()
    
    last_point = 0
    for point in split_points:
        s = sum(map(int, rest[last_point:point]))
        i = info(s, s / (point - last_point))
        
        results[name].append(i)
        last_point = point
        
print(results)
print(results['name3'][0].avg)

--Simon Kennedy


#66979

From	sffjunkie@gmail.com
Date	2014-02-24 04:02 -0800
Message-ID	<638bd989-e610-4629-92ab-40045f564af2@googlegroups.com>
In reply to	#66977

On Monday, 24 February 2014 11:48:23 UTC, sffj...@gmail.com  wrote:

> split_points = [2, 4, 5]

Change this to `split_points = [3, 5]` for your requirements
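
To see what `split_points = [3, 5]` does, here is a reduced sketch of the slicing loop applied to one row. The `rest` list is hard-coded from the 'name1' line rather than read from the file, and the `averages` list is an illustrative stand-in for the namedtuple results.

```python
# The five scores left over after unpacking the 'name1' row.
rest = ["100", "67", "81", "59", "98"]
split_points = [3, 5]

averages = []
last_point = 0
for point in split_points:
    # Each split point closes a group: rest[0:3], then rest[3:5].
    chunk = [int(value) for value in rest[last_point:point]]
    averages.append(sum(chunk) / float(len(chunk)))
    last_point = point

print(averages)
```

The first group covers the three scores in columns 4 to 6 and the second covers the last two columns, which matches the original request.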

--Simon Kennedy
