

Newsgroup: comp.lang.python

parse a csv file into a text file

Started by: Zhen Zhang <zhen.zhang.uoft@gmail.com>
First post: 2014-02-05 16:10 -0800
Last post: 2014-02-06 12:51 -0600
Articles: 20 on this page of 30, 12 participants



Contents

  parse a csv file into a text file Zhen Zhang <zhen.zhang.uoft@gmail.com> - 2014-02-05 16:10 -0800
    Re: parse a csv file into a text file Asaf Las <roegltd@gmail.com> - 2014-02-05 16:17 -0800
      Re: parse a csv file into a text file Zhen Zhang <zhen.zhang.uoft@gmail.com> - 2014-02-05 23:56 -0800
    Re: parse a csv file into a text file Roy Smith <roy@panix.com> - 2014-02-05 19:33 -0500
      Re: parse a csv file into a text file Zhen Zhang <zhen.zhang.uoft@gmail.com> - 2014-02-05 23:52 -0800
        Re: parse a csv file into a text file Asaf Las <roegltd@gmail.com> - 2014-02-06 00:15 -0800
          Re: parse a csv file into a text file Asaf Las <roegltd@gmail.com> - 2014-02-06 00:48 -0800
        Re: parse a csv file into a text file MRAB <python@mrabarnett.plus.com> - 2014-02-06 13:16 +0000
          Re: parse a csv file into a text file Rustom Mody <rustompmody@gmail.com> - 2014-02-06 05:20 -0800
    Re: parse a csv file into a text file MRAB <python@mrabarnett.plus.com> - 2014-02-06 00:34 +0000
      Re: parse a csv file into a text file Zhen Zhang <zhen.zhang.uoft@gmail.com> - 2014-02-06 00:01 -0800
    Re: parse a csv file into a text file Tim Chase <python.list@tim.thechases.com> - 2014-02-05 18:46 -0600
      Re: parse a csv file into a text file Asaf Las <roegltd@gmail.com> - 2014-02-05 19:59 -0800
        Re: parse a csv file into a text file Tim Chase <python.list@tim.thechases.com> - 2014-02-05 22:09 -0600
          Re: parse a csv file into a text file Asaf Las <roegltd@gmail.com> - 2014-02-05 20:17 -0800
      Re: parse a csv file into a text file Zhen Zhang <zhen.zhang.uoft@gmail.com> - 2014-02-06 00:07 -0800
        Re: parse a csv file into a text file Tim Chase <python.list@tim.thechases.com> - 2014-02-06 12:49 -0600
    Re: parse a csv file into a text file Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-06 00:50 +0000
    Re:parse a csv file into a text file Dave Angel <davea@davea.name> - 2014-02-05 19:57 -0500
      Re: parse a csv file into a text file Zhen Zhang <zhen.zhang.uoft@gmail.com> - 2014-02-06 00:12 -0800
        Re: parse a csv file into a text file Jussi Piitulainen <jpiitula@ling.helsinki.fi> - 2014-02-06 10:49 +0200
        Re: parse a csv file into a text file Dave Angel <davea@davea.name> - 2014-02-06 06:35 -0500
        Re: parse a csv file into a text file Dave Angel <davea@davea.name> - 2014-02-06 06:53 -0500
    Re: parse a csv file into a text file Terry Reedy <tjreedy@udel.edu> - 2014-02-05 22:01 -0500
    Re: parse a csv file into a text file Neil Cerutti <neilc@norwich.edu> - 2014-02-06 14:02 +0000
    Re: parse a csv file into a text file Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-06 17:40 +0000
    Re: parse a csv file into a text file Tim Chase <python.list@tim.thechases.com> - 2014-02-06 11:51 -0600
    Re: parse a csv file into a text file Tim Golden <mail@timgolden.me.uk> - 2014-02-06 18:05 +0000
    Re: parse a csv file into a text file Neil Cerutti <neilc@norwich.edu> - 2014-02-06 18:34 +0000
    Re: parse a csv file into a text file Tim Chase <python.list@tim.thechases.com> - 2014-02-06 12:51 -0600



#65491 — parse a csv file into a text file

From: Zhen Zhang <zhen.zhang.uoft@gmail.com>
Date: 2014-02-05 16:10 -0800
Subject: parse a csv file into a text file
Message-ID: <5c268845-003f-4e24-b27a-c89e9fbfcc6c@googlegroups.com>
Hi, everyone.

I am a second-year EE student.
I just started learning Python for my project.

I intend to parse a CSV file with a format like

3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1
2466023,"Montréal (Que.)",V  ,F,1620693,1583590,T,F,2.3,787060,743204,365.1303,4438.7,2
5915022,"Vancouver (B.C.)",CY ,F,578041,545671,F,F,5.9,273804,253212,114.7133,5039.0,8
3519038,"Richmond Hill (Ont.)",T  ,F,162704,132030,F,F,23.2,53028,51000,100.8917,1612.7,28

into a text file like the following

Toronto 2503281
Montreal 1620693
Vancouver 578041

I am extracting the 1st and 5th columns and saving them into a text file.

This is what I have so far.


[code]

import csv
file = open('raw.csv')
reader = csv.reader(file)

f = open('NicelyDone.text','w')

for line in reader:
      f.write("%s %s"%line[1],%line[5])

[/code]

This is not working for me. I was able to extract the data from the CSV file as line[1] and line[5] (I am able to print them out),
but I don't know how to write them to a .text file in the format I want.

I also have to process the first column, e.g. turn "Toronto (Ont.)" into "Toronto".
I am familiar with the function find(), and I assume I could extract Toronto out of "Toronto (Ont.)" using "(" as the stopping character,
but based on my research I have no idea how to use it to get back the string "Toronto".

Here are my questions:
1: What is the data type of line[1]? If it is a string, why does f.write() not work? If it is not a string, how do I convert it to one?
2: How do I extract the word Toronto out of "Toronto (Ont.)" as a string, using find() or other methods?

My thinking is that I could add the two strings together like c=a+' ' +b, which would give me the format I want,
so I can use f.write() to write it into a file  ;)

Sorry if my questions sound too easy or stupid.

Thanks in advance

Zhen



#65492

From: Asaf Las <roegltd@gmail.com>
Date: 2014-02-05 16:17 -0800
Message-ID: <f9877711-7672-48a3-a488-8c986ac8ce15@googlegroups.com>
In reply to: #65491
On Thursday, February 6, 2014 2:10:16 AM UTC+2, Zhen Zhang wrote:
> Hi, every one.
> Zhen
str_t = '3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1' 
list_t = str_t.split(',')
print(list_t)
print("split result ", list_t[1], list_t[5])
print(list_t[1].split('"')[1])



#65514

From: Zhen Zhang <zhen.zhang.uoft@gmail.com>
Date: 2014-02-05 23:56 -0800
Message-ID: <fe850b95-4461-4dfd-aaa5-d009544f38d5@googlegroups.com>
In reply to: #65492
On Wednesday, February 5, 2014 7:17:17 PM UTC-5, Asaf Las wrote:
> On Thursday, February 6, 2014 2:10:16 AM UTC+2, Zhen Zhang wrote:
> 
> > Hi, every one.
> 
> > Zhen
> 
> str_t = '3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1' 
> 
> list_t = str_t.split(',')
> 
> print(list_t)
> 
> print("split result ", list_t[1], list_t[5])
> 
> print(list_t[1].split('"')[1])

Thanks for the reply.
I did not understand the line
str_t = '3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1' 

I am processing an entire file, not a single line, so should I do
str_t=line? Maybe.

list_t = str_t.split(',')
I think you are trying to split a line into a list.

But the line is already a list, right? That is why it allows me to do
something like line[1].
But I am not sure.



#65493

From: Roy Smith <roy@panix.com>
Date: 2014-02-05 19:33 -0500
Message-ID: <roy-4D8195.19330005022014@news.panix.com>
In reply to: #65491
In article <5c268845-003f-4e24-b27a-c89e9fbfcc6c@googlegroups.com>,
 Zhen Zhang <zhen.zhang.uoft@gmail.com> wrote:

> [code]
> 
> import csv
> file = open('raw.csv')
> reader = csv.reader(file)
> 
> f = open('NicelyDone.text','w')
> 
> for line in reader:
>       f.write("%s %s"%line[1],%line[5])
> 
> [/code]

Are you using Python 2 or 3?

> Here is my question:
> 1:What is the data format for line[1],

That's something you can easily figure out by printing out the 
intermediate values.  Try something like:

> for line in reader:
>       print type(line[1]), repr(line(1))

See if that prints what you expect.

> how come f.write() does not work.

What does "does not work" mean?  What does get written to the file?  Or 
do you get some sort of error?

I'm pretty sure I see your error, but I'm trying to lead you to being 
able to diagnose it yourself :-)



#65513

From: Zhen Zhang <zhen.zhang.uoft@gmail.com>
Date: 2014-02-05 23:52 -0800
Message-ID: <b84f4c7e-eaeb-4827-bf12-5ea656e40bf3@googlegroups.com>
In reply to: #65493
On Wednesday, February 5, 2014 7:33:00 PM UTC-5, Roy Smith wrote:
> In article <5c268845-003f-4e24-b27a-c89e9fbfcc6c@googlegroups.com>,
> 
>  Zhen Zhang <zhen.zhang.uoft@gmail.com> wrote:
> 
> 
> 
> > [code]
> 
> > 
> 
> > import csv
> 
> > file = open('raw.csv')
> 
> > reader = csv.reader(file)
> 
> > 
> 
> > f = open('NicelyDone.text','w')
> 
> > 
> 
> > for line in reader:
> 
> >       f.write("%s %s"%line[1],%line[5])
> 
> > 
> 
> > [/code]
> 
> 
> 
> Are you using Python 2 or 3?
> 
> 
> 
> > Here is my question:
> 
> > 1:What is the data format for line[1],
> 
> 
> 
> That's something you can easily figure out by printing out the 
> 
> intermediate values.  Try something like:
> 
> 
> 
> > for line in reader:
> 
> >       print type(line[1]), repr(line(1))
> 
> 
> 
> See if that prints what you expect.
> 
> 
> 
> > how come f.write() does not work.
> 
> 
> 
> What does "does not work" mean?  What does get written to the file?  Or 
> 
> do you get some sort of error?
> 
> 
> 
> I'm pretty sure I see your error, but I'm trying to lead you to being 
> 
> able to diagnose it yourself :-)

Hi Roy,

Thank you so much for the reply.
I am currently running Python 2.7.

I ran
 print type(line[1]), repr(line(1))
and it tells me that a 'list' object is not callable.

It seems the entire line is of type list, not some "line" type as I had thought.

So line[1] is a string element of the list after all.

f.write("%s %s %s" %(output,location,output)) works great;
as MRAB mentioned, I have to write the arguments as a tuple.

This is the code I am currently using

for line in reader:
     location ="%s"%(line[1])
     if '(' in location:
        # at this point, bits = ['Toronto ', 'Ont.)'] 
        bits = location.split('(')  
        location = bits[0].strip()
     output = "%s %s\n" %(location,line[5])
     f.write("%s" %(output))

It extracts the desired information into a text file as I wanted.
However, the Python program gives me an error after the execution:
 location="%s"%(line[1])
 IndexError: list index out of range

I have failed to figure out why.



#65520

From: Asaf Las <roegltd@gmail.com>
Date: 2014-02-06 00:15 -0800
Message-ID: <2039ab62-d18a-4127-8fc0-917fdb0fe00a@googlegroups.com>
In reply to: #65513
On Thursday, February 6, 2014 9:52:43 AM UTC+2, Zhen Zhang wrote:
> On Wednesday, February 5, 2014 7:33:00 PM UTC-5, Roy Smith wrote:
> I failed to figure out why.

OK, you should look at what I posted the second time. The first one is
irrelevant. Note that the file was emulated using StringIO; in your
case it will be a file name.
You can grab the script below and run it directly as a Python script:

<------------------------------------ start of script
import io
import csv

str_t = '''3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1
2466023,"Montréal (Que.)",V  ,F,1620693,1583590,T,F,2.3,787060,743204,365.1303,4438.7,2
5915022,"Vancouver (B.C.)",CY ,F,578041,545671,F,F,5.9,273804,253212,114.7133,5039.0,8
3519038,"Richmond Hill (Ont.)",T  ,F,162704,132030,F,F,23.2,53028,51000,100.8917,1612.7,28 '''

file_t = io.StringIO(str_t)

csv_t = csv.reader(file_t, delimiter = ',')
for row in csv_t: 
    print("split result ", row[1].strip('"').split('(')[0] , row[5])


<----------------------------- end of script
The output should be as follows (this is what I got when I ran it):

split result  Toronto  2481494
split result  Montréal  1583590
split result  Vancouver  545671
split result  Richmond Hill  132030



row[1].strip('"').split('(')[0]    is the city name
row[5]                             is the number at index 5 that you wanted



Both are strings, so you can save them to a file later.
As for the first expression, you can split the operations up as below to see what is
happening:
row[1]
row[1].strip('"')
row[1].strip('"').split('(')
row[1].strip('"').split('(')[0]
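
The four expressions above can be traced by hand. The following is only a sketch: the name `field` and the `step` variables are made up for illustration, and the starting value assumes that csv.reader has already removed the surrounding double quotes, which makes the strip('"') call a no-op here.

```python
# What csv.reader yields for row[1]: the quotes are already gone.
field = 'Toronto (Ont.)'

step1 = field.strip('"')   # nothing to strip, still 'Toronto (Ont.)'
step2 = step1.split('(')   # ['Toronto ', 'Ont.)']
step3 = step2[0]           # 'Toronto ' (note the trailing space)
step4 = step3.strip()      # 'Toronto'

for value in (step1, step2, step3, step4):
    print(repr(value))
```

The trailing space in `step3` is why the printed results above show two spaces after each city name; a final strip() removes it.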

Have a nice day 

/Asaf



#65522

From: Asaf Las <roegltd@gmail.com>
Date: 2014-02-06 00:48 -0800
Message-ID: <b547c5b9-9a7c-4035-816c-b93afb708bb8@googlegroups.com>
In reply to: #65520
On Thursday, February 6, 2014 10:15:14 AM UTC+2, Asaf Las wrote:
> On Thursday, February 6, 2014 9:52:43 AM UTC+2, Zhen Zhang wrote:
> case it will be file name. 

A small correction: not a file name but a file object. file_t is the result of open(),
as in your example.



#65533

From: MRAB <python@mrabarnett.plus.com>
Date: 2014-02-06 13:16 +0000
Message-ID: <mailman.6444.1391692605.18130.python-list@python.org>
In reply to: #65513
On 2014-02-06 07:52, Zhen Zhang wrote:
 > On Wednesday, February 5, 2014 7:33:00 PM UTC-5, Roy Smith wrote:
 >> In article <5c268845-003f-4e24-b27a-c89e9fbfcc6c@googlegroups.com>,
 >>  Zhen Zhang <zhen.zhang.uoft@gmail.com> wrote:
 >>
 >> > [code]
 >> >
 >> > import csv
 >> > file = open('raw.csv')
 >> > reader = csv.reader(file)
 >> >
 >> > f = open('NicelyDone.text','w')
 >> >
 >> > for line in reader:
 >> >       f.write("%s %s"%line[1],%line[5])
 >> >
 >> > [/code]
 >>
 >> Are you using Python 2 or 3?
 >>
 >> > Here is my question:
 >> > 1:What is the data format for line[1],
 >>
 >> That's something you can easily figure out by printing out the
 >> intermediate values.  Try something like:
 >>
 >> > for line in reader:
 >> >       print type(line[1]), repr(line(1))
 >>
 >> See if that prints what you expect.
 >>
 >> > how come f.write() does not work.
 >>
 >> What does "does not work" mean?  What does get written to the file?
 >> Or do you get some sort of error?
 >>
 >> I'm pretty sure I see your error, but I'm trying to lead you to being
 >> able to diagnose it yourself :-)
 >
 > Hi Roy ,
 >
 > Thank you so much for the reply,
 > I am currenly running python 2.7
 >
 > i run the
 >   print type(line[1]), repr(line(1))
 > It tells me that 'list object is not callable
 >
"line" is a list and within repr you're using (...) (parentheses)
instead of [...] (square brackets).

It might be clearer if you call the variable "row" because the CSV
reader returns rows, and each row is a list of strings.

 > It seems the entire line is a data type of list instead of a data
 > type of "line" as i thought.
 >
 > The line[1] is a string element of list after all.
 >
 > f.write("%s %s %s" %(output,location,output))works great,
 > as MRAB mentioned, I have to do write it in term of tuples.
 >
 > This is the code I am currently using
 >
 > for line in reader:
 >       location ="%s"%(line[1])
 >       if '(' in location:
 >          # at this point, bits = ['Toronto ', 'Ont.)']
 >          bits = location.split('(')
 >          location = bits[0].strip()
 >       output = "%s %s\n" %(location,line[5])
 >       f.write("%s" %(output))
 >
A 1-tuple (a tuple containing one item) is:

     (item, )

It's actually the comma that makes it a tuple (except for the 0-tuple
"()"); it's just that it's often necessary to wrap it in (...), and
people then think it's those that are making it a tuple, but it's not!
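
A short sketch of the point about commas, using made-up single-letter names. It shows why "%s" % (output) in the quoted code behaves exactly like "%s" % output: the parentheses alone do not create a tuple.

```python
a = (5)    # parentheses only: still the integer 5
b = (5,)   # a 1-tuple
c = 5,     # also a 1-tuple, the comma does the work
d = ()     # the empty tuple is the one exception

for name, value in (('a', a), ('b', b), ('c', c), ('d', d)):
    print(name, type(value).__name__, repr(value))

# With a single non-tuple argument both spellings give the same string.
same = ("%s" % ('Toronto')) == ("%s" % 'Toronto')
```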

 > It extracts desired information into a text file as i wanted.
 > however, the python program gives me a Error after the execution.
 >   location="%s"%(line[1])
 >   IndexError: list index out of range
 >
 > I failed to figure out why.
 >
What is the value of "line" at that point?



#65534

From: Rustom Mody <rustompmody@gmail.com>
Date: 2014-02-06 05:20 -0800
Message-ID: <aeb43378-30f5-4718-b7c6-2fcc0cf85484@googlegroups.com>
In reply to: #65533
On Thursday, February 6, 2014 6:46:37 PM UTC+5:30, MRAB wrote:
> 
> It's actually the comma that makes it a tuple (except for the 0-tuple
> "()"); it's just that it's often necessary to wrap it in (...), and
> people then think it's those that are making it a tuple, but it's not!

Interesting viewpoint -- didn't know that!



#65494

From: MRAB <python@mrabarnett.plus.com>
Date: 2014-02-06 00:34 +0000
Message-ID: <mailman.6428.1391646904.18130.python-list@python.org>
In reply to: #65491
On 2014-02-06 00:10, Zhen Zhang wrote:
> Hi, every one.
>
> I am a second year EE student.
> I just started learning python for my project.
>
> I intend to parse a csv file with a format like
>
> 3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1
> 2466023,"Montréal (Que.)",V  ,F,1620693,1583590,T,F,2.3,787060,743204,365.1303,4438.7,2
> 5915022,"Vancouver (B.C.)",CY ,F,578041,545671,F,F,5.9,273804,253212,114.7133,5039.0,8
> 3519038,"Richmond Hill (Ont.)",T  ,F,162704,132030,F,F,23.2,53028,51000,100.8917,1612.7,28
>
> into a text file like the following
>
> Toronto 2503281
> Montreal 1620693
> Vancouver 578041
>
> I am extracting the 1st and 5th column and save it into a text file.
>
> This is what i have so far.
>
>
> [code]
>
> import csv
> file = open('raw.csv')
> reader = csv.reader(file)
>
> f = open('NicelyDone.text','w')
>
> for line in reader:
>        f.write("%s %s"%line[1],%line[5])
>
> [/code]
>
> This is not working for me, I was able to extract the data from the csv file as line[1],line[5]. (I am able to print it out)
> But I dont know how to write it to a .text file in the format i wanted.
>
% is an operator. When used with a format string on its left, its
arguments go on its right. In the general case, those arguments should
be put in a tuple, although if there's only one argument and it's not a
tuple, you can write just that argument:

     f.write("%s %s" % (line[1], line[5]))
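
To illustrate the two cases just described, here is a minimal sketch. The `row` list is hand-built from the first sample line rather than read from a file, and the names `two_fields` and `one_field` are only for illustration.

```python
# A row as csv.reader would return it: a list of strings.
row = ['3520005', 'Toronto (Ont.)', 'C  ', 'F', '2503281', '2481494']

# Several arguments: put them in a tuple on the right of %.
two_fields = "%s %s" % (row[1], row[5])

# A single argument that is not a tuple may be written bare.
one_field = "%s" % row[1]

print(two_fields)
print(one_field)
```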

> Also, I have to process the first column eg, "Toronto (Ont.)" into "Toronto".
> I am familiar with the function find(), I assume that i could extract Toronto out of Toronto(Ont.) using "(" as the stopping character,
> but based on my research , I have no idea how to use it and ask it to return me the string(Toronto).
>
Use find to tell you the index of the "(" (if there isn't one then
it'll return -1) and then slice the string to get the part preceding it.

Another way is to use the "partition" method.

Also, have a look at the "strip"/"lstrip"/"rstrip" methods.
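
Both suggestions can be sketched on one sample value. The variable names here are hypothetical, and the handling of a missing "(" is an assumption about what the real data may need.

```python
name = "Toronto (Ont.)"

# find() gives the index of "(" or -1 when it is absent.
idx = name.find("(")
city_by_find = name[:idx].strip() if idx != -1 else name.strip()

# partition() splits once into (before, separator, after);
# if "(" is missing, "before" is simply the whole string.
before, sep, after = name.partition("(")
city_by_partition = before.strip()

# A value without parentheses passes through unchanged.
plain = "Vancouver".partition("(")[0].strip()

print(city_by_find, city_by_partition, plain)
```

The partition() form needs no special case for the missing separator, which is one reason to prefer it over find() plus slicing.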

> Here is my question:
> 1:What is the data format for line[1], if it is string how come f.write()does not work. if it is not string, how do i convert it to a string?
> 2:How do i extract the word Toronto out of Toronto(Ont) into a string form using find() or other methods.
>
> My thinking is that I could add those 2 string together like c=a+' ' +b, that would give me the format i wanted.
> So i can use f.write() to write into a file  ;)
>
> Sorry if my questions sounds too easy or stupid.
>
> Thanks ahead
>
> Zhen
>



#65516

From: Zhen Zhang <zhen.zhang.uoft@gmail.com>
Date: 2014-02-06 00:01 -0800
Message-ID: <4cb12232-5677-442a-8ffb-5af3ea05e203@googlegroups.com>
In reply to: #65494
On Wednesday, February 5, 2014 7:34:57 PM UTC-5, MRAB wrote:
> On 2014-02-06 00:10, Zhen Zhang wrote:
> 
> > Hi, every one.
> 
> >
> 
> > I am a second year EE student.
> 
> > I just started learning python for my project.
> 
> >
> 
> > I intend to parse a csv file with a format like
> 
> >
> 
> > 3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1
> 
> > 2466023,"Montréal (Que.)",V  ,F,1620693,1583590,T,F,2.3,787060,743204,365.1303,4438.7,2
> 
> > 5915022,"Vancouver (B.C.)",CY ,F,578041,545671,F,F,5.9,273804,253212,114.7133,5039.0,8
> 
> > 3519038,"Richmond Hill (Ont.)",T  ,F,162704,132030,F,F,23.2,53028,51000,100.8917,1612.7,28
> 
> >
> 
> > into a text file like the following
> 
> >
> 
> > Toronto 2503281
> 
> > Montreal 1620693
> 
> > Vancouver 578041
> 
> >
> 
> > I am extracting the 1st and 5th column and save it into a text file.
> 
> >
> 
> > This is what i have so far.
> 
> >
> 
> >
> 
> > [code]
> 
> >
> 
> > import csv
> 
> > file = open('raw.csv')
> 
> > reader = csv.reader(file)
> 
> >
> 
> > f = open('NicelyDone.text','w')
> 
> >
> 
> > for line in reader:
> 
> >        f.write("%s %s"%line[1],%line[5])
> 
> >
> 
> > [/code]
> 
> >
> 
> > This is not working for me, I was able to extract the data from the csv file as line[1],line[5]. (I am able to print it out)
> 
> > But I dont know how to write it to a .text file in the format i wanted.
> 
> >
> 
> % is an operator. When used with a format string on its left, its
> 
> arguments go on its right. In the general case, those arguments should
> 
> be put in a tuple, although if there's only one argument and it's not a
> 
> tuple, you can write just that argument:
> 
> 
> 
>      f.write("%s %s" % (line[1], line[5]))
> 
> 
> 
> > Also, I have to process the first column eg, "Toronto (Ont.)" into "Toronto".
> 
> > I am familiar with the function find(), I assume that i could extract Toronto out of Toronto(Ont.) using "(" as the stopping character,
> 
> > but based on my research , I have no idea how to use it and ask it to return me the string(Toronto).
> 
> >
> 
> Use find to tell you the index of the "(" (if there isn't one then
> 
> it'll return -1) and then slice the string to get the part preceding it.
> 
> 
> 
> Another way is to use the "partition" method.
> 
> 
> 
> Also, have a look at the "strip"/"lstrip"/"rstrip" methods.
> 
> 
> 
> > Here is my question:
> 
> > 1:What is the data format for line[1], if it is string how come f.write()does not work. if it is not string, how do i convert it to a string?
> 
> > 2:How do i extract the word Toronto out of Toronto(Ont) into a string form using find() or other methods.
> 
> >
> 
> > My thinking is that I could add those 2 string together like c=a+' ' +b, that would give me the format i wanted.
> 
> > So i can use f.write() to write into a file  ;)
> 
> >
> 
> > Sorry if my questions sounds too easy or stupid.
> 
> >
> 
> > Thanks ahead
> 
> >
> 
> > Zhen
> 
> >

Thanks for the reply, especially the part about tuples.
I was not familiar with this data type,
but I guess I should be :)



#65495

From: Tim Chase <python.list@tim.thechases.com>
Date: 2014-02-05 18:46 -0600
Message-ID: <mailman.6429.1391647532.18130.python-list@python.org>
In reply to: #65491
On 2014-02-05 16:10, Zhen Zhang wrote:
> import csv
> file = open('raw.csv')

Asaf recommended using string methods to split the file.  Keep doing
what you're doing (using the csv module), as it attends to a lot of
edge-cases that will trip you up otherwise.  I learned this the hard
way several years into my Python career. :-)
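
One such edge case, sketched with a hypothetical line (it is not taken from the sample data) in which the quoted field itself contains a comma:

```python
import csv

# Hypothetical input: the quoted city field contains a comma.
line = '3520005,"Toronto, Ont.",2503281'

naive = line.split(',')            # cuts the quoted field in two
parsed = next(csv.reader([line]))  # respects the quotes

print(len(naive), naive)
print(len(parsed), parsed)
```

The plain split() yields four pieces and leaves the quote characters in place, while the csv module returns the three intended fields.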

> reader = csv.reader(file)
> 
> f = open('NicelyDone.text','w')
> 
> for line in reader:
>       f.write("%s %s"%line[1],%line[5])

Here, I'd start by naming the pieces that you get, so do

  for line in reader:
    location = line[1]
    value = line[5]

> Also, I have to process the first column eg, "Toronto (Ont.)" into
> "Toronto". I am familiar with the function find(), I assume that i
> could extract Toronto out of Toronto(Ont.) using "(" as the
> stopping character, but based on my research , I have no idea how
> to use it and ask it to return me the string(Toronto).

You can use the .split() method to split a string, so you could do
something like

  if '(' in location:
    bits = location.split('(')
    # at this point, bits = ['Toronto ', 'Ont.)']
    location = bits[0].strip() # also strip it to remove whitespace

> 1:What is the data format for line[1], if it is string how come
> f.write()does not work. if it is not string, how do i convert it to
> a string?

The problem is not that "it is not a string" but that you are passing
multiple parameters, the second of which is invalid Python because it
has an extra percent-sign.  First create the one string that you
want to output:

  output = "%s %s\n" % (location, value)

and then write it out to the file:

  f.write(output)

rather than trying to do it all in one pass.
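
Put together, the pieces above might look like the following sketch. It is not a definitive version: the function name `convert` is hypothetical, io.StringIO stands in for the real files opened with open(), the sample rows are shortened, index 5 simply follows the code in the thread, and the length check that skips short rows is an added assumption.

```python
import csv
import io

def convert(infile, outfile):
    """Write one 'City value' line per CSV row read from infile."""
    for row in csv.reader(infile):
        if len(row) < 6:
            continue  # assumption: skip blank or short rows
        location = row[1]
        value = row[5]
        if '(' in location:
            # 'Toronto (Ont.)' -> ['Toronto ', 'Ont.)'] -> 'Toronto'
            location = location.split('(')[0].strip()
        outfile.write("%s %s\n" % (location, value))

# In-memory stand-ins for open('raw.csv') and open('NicelyDone.text', 'w').
sample = io.StringIO(
    '3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9\n'
    '5915022,"Vancouver (B.C.)",CY ,F,578041,545671,F,F,5.9\n'
)
result = io.StringIO()
convert(sample, result)
print(result.getvalue())
```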

-tkc






#65506

From: Asaf Las <roegltd@gmail.com>
Date: 2014-02-05 19:59 -0800
Message-ID: <6a865ea9-c3e3-486a-a43e-31a28b4ac3a6@googlegroups.com>
In reply to: #65495
On Thursday, February 6, 2014 2:46:04 AM UTC+2, Tim Chase wrote:
> On 2014-02-05 16:10, Zhen Zhang wrote:
> Asaf recommended using string methods to split the file.  Keep doing
> what you're doing (using the csv module), as it attends to a lot of
> edge-cases that will trip you up otherwise.  I learned this the hard
> way several years into my Python career. :-)

I did not recommend anything :-)

import io
import csv

str_t = '''3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1
2466023,"Montréal (Que.)",V  ,F,1620693,1583590,T,F,2.3,787060,743204,365.1303,4438.7,2
5915022,"Vancouver (B.C.)",CY ,F,578041,545671,F,F,5.9,273804,253212,114.7133,5039.0,8
3519038,"Richmond Hill (Ont.)",T  ,F,162704,132030,F,F,23.2,53028,51000,100.8917,1612.7,28 '''

file_t = io.StringIO(str_t)

csv_t = csv.reader(file_t, delimiter = ',')
for row in csv_t: 
    print("split result ", row[1].strip('"'), row[5])



#65507

From: Tim Chase <python.list@tim.thechases.com>
Date: 2014-02-05 22:09 -0600
Message-ID: <mailman.6434.1391659752.18130.python-list@python.org>
In reply to: #65506
On 2014-02-05 19:59, Asaf Las wrote:
> On Thursday, February 6, 2014 2:46:04 AM UTC+2, Tim Chase wrote:
> > On 2014-02-05 16:10, Zhen Zhang wrote:
> > Asaf recommended using string methods to split the file.  Keep
> > doing what you're doing (using the csv module), as it attends to
> > a lot of edge-cases that will trip you up otherwise.  I learned
> > this the hard way several years into my Python career. :-)  
> 
> i did not recommend anything :-) 

From your code,

  list_t = str_t.split(',')

It might have been a short-hand for obtaining the results of a CSV
row, but it might be better written something like

  list_t = csv.reader([str_t])
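
Note that csv.reader returns a reader object rather than a list, so to get the row itself one more step is needed. A minimal sketch, with the sample line shortened for readability:

```python
import csv

str_t = '3520005,"Toronto (Ont.)",C  ,F,2503281,2481494'

# csv.reader accepts any iterable of lines, so a one-element list works.
# next() pulls the single parsed row out of the reader object.
list_t = next(csv.reader([str_t]))

print(list_t[1], list_t[5])
```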

-tkc



#65508

From: Asaf Las <roegltd@gmail.com>
Date: 2014-02-05 20:17 -0800
Message-ID: <dfb930b8-4615-4ceb-8f54-5374f3929218@googlegroups.com>
In reply to: #65507
On Thursday, February 6, 2014 6:09:52 AM UTC+2, Tim Chase wrote:
> On 2014-02-05 19:59, Asaf Las wrote:
> From your code,
>   list_t = str_t.split(',')
> It might have been a short-hand for obtaining the results of a CSV
> row, but it might be better written something like
>   list_t = csv.reader([str_t])
> -tkc

I was too fast to reply. You are correct!

/Asaf



#65517

From: Zhen Zhang <zhen.zhang.uoft@gmail.com>
Date: 2014-02-06 00:07 -0800
Message-ID: <d74338a6-2969-4288-a462-4cc8701fd098@googlegroups.com>
In reply to: #65495
On Wednesday, February 5, 2014 7:46:04 PM UTC-5, Tim Chase wrote:
> On 2014-02-05 16:10, Zhen Zhang wrote:
> 
> > import csv
> 
> > file = open('raw.csv')
> 
> 
> 
> Asaf recommended using string methods to split the file.  Keep doing
> 
> what you're doing (using the csv module), as it attends to a lot of
> 
> edge-cases that will trip you up otherwise.  I learned this the hard
> 
> way several years into my Python career. :-)
> 
> 
> 
> > reader = csv.reader(file)
> 
> > 
> 
> > f = open('NicelyDone.text','w')
> 
> > 
> 
> > for line in reader:
> 
> >       f.write("%s %s"%line[1],%line[5])
> 
> 
> 
> Here, I'd start by naming the pieces that you get, so do
> 
> 
> 
>   for line in reader:
> 
>     location = line[1]
> 
>     value = line[5]
> 
> 
> 
> > Also, I have to process the first column eg, "Toronto (Ont.)" into
> 
> > "Toronto". I am familiar with the function find(), I assume that i
> 
> > could extract Toronto out of Toronto(Ont.) using "(" as the
> 
> > stopping character, but based on my research , I have no idea how
> 
> > to use it and ask it to return me the string(Toronto).
> 
> 
> 
> You can use the .split() method to split a string, so you could do
> 
> something like
> 
> 
> 
>   if '(' in location:
> 
>     bits = location.split('(')
> 
>     # at this point, bits = ['Toronto ', 'Ont.)']
> 
>     location = bits[0].strip() # also strip it to remove whitespace
> 
> 
> 
> > 1:What is the data format for line[1], if it is string how come
> 
> > f.write()does not work. if it is not string, how do i convert it to
> 
> > a string?
> 
> 
> 
> The problem is not that "it is not a string" but that you passing
> 
> multiple parameters, the second of which is invalid Python because it
> 
> has an extra percent-sign.  First create the one string that you
> 
> want to output:
> 
> 
> 
>   output = "%s %s\n" % (location, bits)
> 
> 
> 
> and then write it out to the file:
> 
> 
> 
>   f.write(output)
> 
> 
> 
> rather than trying to do it all in one pass.
> 
> 
> 
> -tkc

Hi Tim,

Thanks for the reply.

Does split() make a list or a tuple?

Also,

when I do location=line[1],
it gives me an error even though the program ran correctly and wrote the correct output file:
 location=line[1]
 IndexError: list index out of range

When I do print line[1], there is no error.
It is really strange.



#65550

From: Tim Chase <python.list@tim.thechases.com>
Date: 2014-02-06 12:49 -0600
Message-ID: <mailman.6454.1391712528.18130.python-list@python.org>
In reply to: #65517
[first, it looks like you're posting via Google Groups which
annoyingly double-spaces everything in your reply. It's possible to
work around this, but you might want to subscribe via email or an
actual newsgroup client.  You can read more at
https://wiki.python.org/moin/GoogleGroupsPython ]

On 2014-02-06 00:07, Zhen Zhang wrote:
> Does the split make a list or tuple?

In this case, it happens to return a list, which you can check with

  print type("one two three".split())

However, also in this case, it doesn't matter, since either indexes
just fine.

> when  i do location=line[1],
> it gives me a error even though the program did run correctly and
> output the correct file. location=line[1]
>  IndexError: list index out of range 

Then it looks like you've got a blank line that doesn't actually have
data in it, so when it tries to index into it, there is at most a
[0], not a [1].  As the message suggests :)
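
A small sketch of that situation, with made-up input lines. Whether the real file actually ends in a blank line is only a guess; the example just shows that csv.reader turns an empty line into an empty list, and how a length check avoids the IndexError.

```python
import csv

# Made-up input with an empty line in the middle.
lines = ['1,"Toronto (Ont.)",x', '', '2,"Vancouver (B.C.)",y']

rows = list(csv.reader(lines))
print(rows)

# The empty line became [], so row[1] would raise IndexError there.
safe = [row[1] for row in rows if len(row) > 1]
print(safe)
```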

-tkc



#65496

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2014-02-06 00:50 +0000
Message-ID: <mailman.6430.1391647886.18130.python-list@python.org>
In reply to: #65491
On 06/02/2014 00:46, Tim Chase wrote:
> On 2014-02-05 16:10, Zhen Zhang wrote:
>> import csv
>> file = open('raw.csv')
>
> Asaf recommended using string methods to split the file.  Keep doing
> what you're doing (using the csv module), as it attends to a lot of
> edge-cases that will trip you up otherwise.  I learned this the hard
> way several years into my Python career. :-)

+1

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence



#65497

From: Dave Angel <davea@davea.name>
Date: 2014-02-05 19:57 -0500
Message-ID: <mailman.6431.1391648067.18130.python-list@python.org>
In reply to: #65491
 Zhen Zhang <zhen.zhang.uoft@gmail.com> Wrote in message:
> Hi, every one.
> 
> I am a second year EE student.
> I just started learning python for my project.
> 
> I intend to parse a csv file with a format like 
> 
> 3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1
> 2466023,"Montréal (Que.)",V  ,F,1620693,1583590,T,F,2.3,787060,743204,365.1303,4438.7,2
> 5915022,"Vancouver (B.C.)",CY ,F,578041,545671,F,F,5.9,273804,253212,114.7133,5039.0,8
> 3519038,"Richmond Hill (Ont.)",T  ,F,162704,132030,F,F,23.2,53028,51000,100.8917,1612.7,28
> 
> into a text file like the following
> 
> Toronto 2503281
> Montreal 1620693
> Vancouver 578041
> 
> I am extracting the 1st and 5th column and save it into a text file.

Looks to me like columns 1 and 6.
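
One way to settle which index holds which value is to print every field next to its zero-based position. This is only a sketch on a shortened copy of the first sample line. In that line, the figure shown in the desired output (2503281) sits at index 4, while index 5, the one used in the posted code, holds 2481494.

```python
import csv

sample = '3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9'
row = next(csv.reader([sample]))

# Show each field beside its zero-based index.
for index, field in enumerate(row):
    print(index, repr(field))
```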

> 
> This is what i have so far.
> 
> 
> [code]
> 
> import csv
> file = open('raw.csv')
> reader = csv.reader(file)
> 
> f = open('NicelyDone.text','w')
> 
> for line in reader:
>       f.write("%s %s"%line[1],%line[5])

Why not use print to file f? The approach for redirection is
 different between python 2 and 3, and you neglected to say which
 you're using. 
> 

> 
> My thinking is that I could add those 2 string together like c=a+' ' +b, that would give me the format i wanted.

And don't forget the "\n" at end of line. 

> So i can use f.write() to write into a file  ;) 

Or use print, which defaults to adding in a newline.
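
A sketch of the print approach in its Python 3 form, with io.StringIO standing in for the real output file. The Python 2 statement form appears only in a comment, since the two syntaxes cannot run in the same interpreter.

```python
import io

f = io.StringIO()  # stands in for open('NicelyDone.text', 'w')

# Python 3, or Python 2 with "from __future__ import print_function":
print("Toronto", 2481494, file=f)
# Python 2 statement form would be:  print >>f, "Toronto", 2481494

written = f.getvalue()
print(repr(written))
```

print separates its arguments with a space and appends the newline itself, so no "\n" has to be added by hand.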

> Sorry if my questions sounds too easy or stupid.
> 

Not in the least.

> 
> 


-- 
DaveA



#65519

From: Zhen Zhang <zhen.zhang.uoft@gmail.com>
Date: 2014-02-06 00:12 -0800
Message-ID: <f81bcac3-e0a3-44d8-8285-d05d5971b7ee@googlegroups.com>
In reply to: #65497
On Wednesday, February 5, 2014 7:57:26 PM UTC-5, Dave Angel wrote:
> Zhen Zhang <zhen.zhang.uoft@gmail.com> Wrote in message:
> 
> > Hi, every one.
> 
> > 
> 
> > I am a second year EE student.
> 
> > I just started learning python for my project.
> 
> > 
> 
> > I intend to parse a csv file with a format like 
> 
> > 
> 
> > 3520005,"Toronto (Ont.)",C  ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1
> 
> > 2466023,"Montréal (Que.)",V  ,F,1620693,1583590,T,F,2.3,787060,743204,365.1303,4438.7,2
> 
> > 5915022,"Vancouver (B.C.)",CY ,F,578041,545671,F,F,5.9,273804,253212,114.7133,5039.0,8
> 
> > 3519038,"Richmond Hill (Ont.)",T  ,F,162704,132030,F,F,23.2,53028,51000,100.8917,1612.7,28
> 
> > 
> 
> > into a text file like the following
> 
> > 
> 
> > Toronto 2503281
> 
> > Montreal 1620693
> 
> > Vancouver 578041
> 
> > 
> 
> > I am extracting the 1st and 5th column and save it into a text file.
> 
> 
> 
> Looks to me like columns 1 and 6.
> 
> 
> 
> > 
> 
> > This is what i have so far.
> 
> > 
> 
> > 
> 
> > [code]
> 
> > 
> 
> > import csv
> 
> > file = open('raw.csv')
> 
> > reader = csv.reader(file)
> 
> > 
> 
> > f = open('NicelyDone.text','w')
> 
> > 
> 
> > for line in reader:
> 
> >       f.write("%s %s"%line[1],%line[5])
> 
> 
> 
> Why not use print to file f? The approach for redirection is
> 
>  different between python 2 and 3, and you neglected to say which
> 
>  you're using. 
> 
> > 
> 
> 
> 
> > 
> 
> > My thinking is that I could add those 2 string together like c=a+' ' +b, that would give me the format i wanted.
> 
> 
> 
> And don't forget the "\n" at end of line. 
> 
> 
> 
> > So i can use f.write() to write into a file  ;) 
> 
> 
> 
> Or use print, which defaults to adding in a newline.
> 
> 
> 
> > Sorry if my questions sounds too easy or stupid.
> 
> > 
> 
> 
> 
> Not in the least.
> 
> 
> 
> > 
> 
> > 
> 
> 
> 
> 
> 
> -- 
> 
> DaveA

Hi Dave, thanks for the reply.
I am currently running Python 2.7.

Yes, I thought there must be a print function in Python, like fprintf in C, that lets you print into a file directly.
But when I googled "print string into text file", I got answers using f.write() instead. :)


