
Read csv file and create a new file

Started by	io <maroso@libero.it>
First post	2013-02-28 19:14 +0000
Last post	2013-03-04 23:58 +0000
Articles	15 — 6 participants


  Read csv file and create a new file io <maroso@libero.it> - 2013-02-28 19:14 +0000
    Re: Read csv file and create a new file Neil Cerutti <neilc@norwich.edu> - 2013-02-28 19:32 +0000
    Re: Read csv file and create a new file Joel Goldstick <joel.goldstick@gmail.com> - 2013-02-28 14:34 -0500
    Re: Read csv file and create a new file Dave Angel <davea@davea.name> - 2013-02-28 14:35 -0500
    Re: Read csv file and create a new file io <maroso@libero.it> - 2013-02-28 19:51 +0000
      Re: Read csv file and create a new file Neil Cerutti <neilc@norwich.edu> - 2013-02-28 20:11 +0000
        Re: Read csv file and create a new file io <maroso@libero.it> - 2013-02-28 20:23 +0000
          Re: Read csv file and create a new file io <maroso@libero.it> - 2013-02-28 20:46 +0000
            Re: Read csv file and create a new file io <maroso@libero.it> - 2013-02-28 20:51 +0000
            Re: Read csv file and create a new file Dave Angel <davea@davea.name> - 2013-02-28 16:23 -0500
              Re: Read csv file and create a new file io <maroso@libero.it> - 2013-02-28 22:05 +0000
                Re: Read csv file and create a new file Dave Angel <davea@davea.name> - 2013-02-28 19:13 -0500
    Re: Read csv file and create a new file Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-02-28 21:04 -0500
      Re: Read csv file and create a new file io <maroso@libero.it> - 2013-03-04 19:04 +0000
        Re: Read csv file and create a new file Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-04 23:58 +0000

#40157 — Read csv file and create a new file

From	io <maroso@libero.it>
Date	2013-02-28 19:14 +0000
Subject	Read csv file and create a new file
Message-ID	<512fac9c$0$40355$4fafbaef@reader1.news.tin.it>

Hi,

I have two files.

First file is a csv file
Second file is a plain text file where each row has a value (text)

I want to be able to create a third file using data from the first file 
excluding the values listed in the second file.

Example:

First file:
-----------

mtgoxeur	12	24	36
mtgoxusd	10	12	14
mtgoxpln	2	4	6


Second file:
------------

mtgoxusd



Third File (the resulting one) :
--------------------------------

mtgoxeur	12	24	36
mtgoxpln	2	4	6



Thanks to anyone that can help


#40160

From	Neil Cerutti <neilc@norwich.edu>
Date	2013-02-28 19:32 +0000
Message-ID	<ap9pmvFu2gjU1@mid.individual.net>
In reply to	#40157

On 2013-02-28, io <maroso@libero.it> wrote:
> Hi,
>
> i have to files. 
>
> First file is a csv file
> Second file is a plain text file where each row has a value (text)
>
> I want to be able to create a third file using data from the first file 
> excluding the values listed in the second file.
>
> Example:
>
> First file:
> -----------
>
> mtgoxeur	12	24	36
> mtgoxusd	10	12	14
> mtgoxpln	2	4	6
>
>
> Second file:
> ------------
>
> mtgoxusd
>
>
>
> Third File (the resulting one) :
> --------------------------------
>
> mtgoxeur	12	24	36
> mtgoxpln	2	4	6
>
> Thanks to anyone that can help

You don't appear to need the csv module at all. You'll just need
the startswith string function.

For more help, please show us some code.

-- 
Neil Cerutti
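
A minimal sketch of the startswith idea Neil mentions above, written for Python 3. The function name filter_by_prefix and the in-memory sample rows are assumptions made for illustration; they are not code from the thread.

```python
def filter_by_prefix(lines, excluded):
    """Keep the lines that do not start with any excluded key."""
    # Append the tab separator so "mtgoxusd" does not also match a longer
    # key that merely begins with the same characters.
    prefixes = tuple(key + "\t" for key in excluded)
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return [line for line in lines if not line.startswith(prefixes)]


# Sample rows copied from the example in the original post (tab-separated).
rows = ["mtgoxeur\t12\t24\t36", "mtgoxusd\t10\t12\t14", "mtgoxpln\t2\t4\t6"]
kept = filter_by_prefix(rows, ["mtgoxusd"])
print(kept)
```

This only works when the key is the first field and the separator is known; splitting the line into fields is more robust.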


#40161

From	Joel Goldstick <joel.goldstick@gmail.com>
Date	2013-02-28 14:34 -0500
Message-ID	<mailman.2668.1362080055.2939.python-list@python.org>
In reply to	#40157


On Thu, Feb 28, 2013 at 2:14 PM, io <maroso@libero.it> wrote:

> Hi,
>
> i have to files.
>
> First file is a csv file
> Second file is a plain text file where each row has a value (text)
>

Read the second file so that you have a list of each of its values

Read the first file line by line.  Check if the value at the beginning of
the line is in the list of values from the second file.  If not, print it
to outfile.

You should look up split method on strings to split a string separated by
whitespace.

>
> I want to be able to create a third file using data from the first file
> excluding the values listed in the second file.
>
> Example:
>
> First file:
> -----------
>
> mtgoxeur        12      24      36
> mtgoxusd        10      12      14
> mtgoxpln        2       4       6
>
>
> Second file:
> ------------
>
> mtgoxusd
>
>
>
> Third File (the resulting one) :
> --------------------------------
>
> mtgoxeur        12      24      36
> mtgoxpln        2       4       6
>
>
>
> Thanks to anyone that can help
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>



-- 
Joel Goldstick
http://joelgoldstick.com


#40162

From	Dave Angel <davea@davea.name>
Date	2013-02-28 14:35 -0500
Message-ID	<mailman.2669.1362080157.2939.python-list@python.org>
In reply to	#40157

On 02/28/2013 02:14 PM, io wrote:
> Hi,
>
> i have to files.
>
> First file is a csv file
> Second file is a plain text file where each row has a value (text)
>
> I want to be able to create a third file using data from the first file
> excluding the values listed in the second file.
>
> Example:
>
> First file:
> -----------
>
> mtgoxeur	12	24	36
> mtgoxusd	10	12	14
> mtgoxpln	2	4	6
>
>
> Second file:
> ------------
>
> mtgoxusd
>
>
>
> Third File (the resulting one) :
> --------------------------------
>
> mtgoxeur	12	24	36
> mtgoxpln	2	4	6
>
>
>
> Thanks to anyone that can help
>
>


Start by making a set out of the second file.  Don't forget to remove 
whitespace with strip()

Then loop through the first file.  For each line, split() it by 
whitespace, and conditionally write the line to the third file.  The 
test would be something like:
      if fields[0] not in myset:
            outfile.write(line)


-- 
DaveA
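
The two steps Dave Angel describes above (build a set from the second file, then filter the first file on its first field) can be sketched as follows for Python 3. The function name filter_rows is hypothetical, and io.StringIO objects stand in for the real files so the example is self-contained.

```python
import io


def filter_rows(data_file, exclusions_file, out_file):
    """Copy lines from data_file to out_file unless their first field is excluded."""
    # Build the set once; strip() drops the newline, and blank lines are skipped.
    excluded = {line.strip() for line in exclusions_file if line.strip()}
    for line in data_file:
        fields = line.split()  # split on any run of whitespace (tabs or spaces)
        # Drop blank lines and lines whose first field is in the exclusion set.
        if fields and fields[0] not in excluded:
            out_file.write(line)


# In-memory stand-ins for the three files from the original post.
data = io.StringIO("mtgoxeur\t12\t24\t36\nmtgoxusd\t10\t12\t14\nmtgoxpln\t2\t4\t6\n")
exclusions = io.StringIO("mtgoxusd\n")
result = io.StringIO()
filter_rows(data, exclusions, result)
print(result.getvalue(), end="")
```

With real files, the same function could be called with objects returned by open(); a set is used because membership tests stay fast as the exclusion list grows.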


#40164

From	io <maroso@libero.it>
Date	2013-02-28 19:51 +0000
Message-ID	<512fb533$0$40355$4fafbaef@reader1.news.tin.it>
In reply to	#40157

I'm a noob in python but my code looks like this :


import json
import urllib
import csv

url = "http://bitcoincharts.com/t/markets.json"
response = urllib.urlopen(url);
data = json.loads(response.read())

f = open("/home/io/markets.csv","wb")
c = csv.writer(f)

# open a text file and read its contents into a string
esclusioni = open('/home/io/exclusions.txt','r')
string = ""
while 1:
    line = esclusioni.readline()
    if not line:break
    string += line
print string



# write headers
c.writerow(["Currency","Symbol","Bid", "Ask", "Volume"])

for d in data :
    if d["currency"] != "SLL":  # exclude the Second Life currency SLL
        if d["bid"] is not None and d["ask"] is not None:
            if not any(str(d["symbol"]) in s for s in string):
                c.writerow([str(d["currency"]),str(d["symbol"]),str(d["bid"]),str(d["ask"]),str(d["currency_volume"])])
    
esclusioni.close()


#40168

From	Neil Cerutti <neilc@norwich.edu>
Date	2013-02-28 20:11 +0000
Message-ID	<ap9s02Fuf0kU1@mid.individual.net>
In reply to	#40164

On 2013-02-28, io <maroso@libero.it> wrote:
> I'm a noob in python but my code looks like this :
>
>
> import json
> import urllib
> import csv

I take back what I said about the csv module. It appears you need
access to at least one of the data fields, so this is a good use
of csv.

> url = "http://bitcoincharts.com/t/markets.json"
> response = urllib.urlopen(url);
> data = json.loads(response.read())
>
> f = open("/home/io/markets.csv","wb")
> c = csv.writer(f)
>
> # open a text file and read its contents into a string
> esclusioni = open('/home/io/exclusions.txt','r')
> string = ""

The list of exclusions should be stored in a set or list, not a
string. This is your main "bug."

esclusioni_file = open('/home/io/exclusions.txt','r')
esclusioni = []

> while 1:
>     line = esclusioni.readline()
>     if not line:break
>     string += line
> print string

Iterate over the file instead of looping manually.

 for line in esclusioni_file:
     esclusioni.append(line.strip())
 print(esclusioni)

> # write headers
> c.writerow(["Currency","Symbol","Bid", "Ask", "Volume"])
>
> for d in data:
>     if d["currency"] != "SLL":  # exclude the Second Life currency SLL
>         if d["bid"] is not None and d["ask"] is not None:
>             if not any(str(d["symbol"]) in s for s in string):

Why are you checking d["symbol"] instead of d["currency"]? Maybe
I misunderstood the question.

Test like this for either set or list container type. Use
whichever json field is appropriate:

              if d["currency"] not in esclusioni:

>                 c.writerow([str(d["currency"]),str(d["symbol"]),str(d
> ["bid"]),str(d["ask"]),str(d["currency_volume"])])
>     
> esclusioni.close()

-- 
Neil Cerutti
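
A sketch of the membership test Neil suggests above, applied to JSON-style records and written for Python 3. The function name write_markets and every value in the sample list are made up for illustration; only the field names come from the code posted in the thread.

```python
import csv
import io


def write_markets(records, excluded_symbols, out_file):
    """Write one CSV row per record, skipping excluded or incomplete ones."""
    writer = csv.writer(out_file)
    writer.writerow(["Currency", "Symbol", "Bid", "Ask", "Volume"])
    excluded = set(excluded_symbols)  # a set makes each "in" test cheap
    for d in records:
        if d["currency"] == "SLL":  # skip the Second Life currency
            continue
        if d["bid"] is None or d["ask"] is None:  # skip incomplete quotes
            continue
        if d["symbol"] in excluded:  # skip marketplaces listed for exclusion
            continue
        writer.writerow([d["currency"], d["symbol"], d["bid"], d["ask"],
                         d["currency_volume"]])


# Hypothetical records shaped like the JSON used in the thread.
sample = [
    {"currency": "EUR", "symbol": "mtgoxEUR", "bid": 12, "ask": 24, "currency_volume": 36},
    {"currency": "USD", "symbol": "mtgoxUSD", "bid": 10, "ask": 12, "currency_volume": 14},
    {"currency": "PLN", "symbol": "mtgoxPLN", "bid": None, "ask": 4, "currency_volume": 6},
    {"currency": "SLL", "symbol": "exampleSLL", "bid": 1, "ask": 2, "currency_volume": 3},
]
out = io.StringIO()
write_markets(sample, ["mtgoxUSD"], out)
print(out.getvalue(), end="")
```

When writing to a real file in Python 3, the csv documentation recommends opening it with newline="" so the writer controls line endings.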


#40169

From	io <maroso@libero.it>
Date	2013-02-28 20:23 +0000
Message-ID	<512fbcc7$0$40355$4fafbaef@reader1.news.tin.it>
In reply to	#40168

> Iterate over the file instead of looping manually.
> 
>  for line in esclusioni_file:
>      esclusioni.append(line.strip())
>  print(esclusioni)

The print was only to check that it was reading the correct data; I don't 
need to see it.

 
> Why are you checking d["symbol"] instead of d["currency"]? Maybe I
> misunderstood the question.

The file esclusioni.txt has the names of the marketplaces that I want to 
exclude. The corresponding marketplace name is found in the symbol 
value of the imported JSON data; that's why I'm comparing it with symbol.



> Test like this for either set or list container type. Use whichever json
> field is appropriate:
> 
>               if d["currency"] not in esclusioni:
> 
>>                 c.writerow([str(d["currency"]),str(d["symbol"]),str(d
>> ["bid"]),str(d["ask"]),str(d["currency_volume"])])
>>     
>> esclusioni.close()

That's a nice approach ( if d["symbol"] not in esclusioni: )!!!

Thanks, I will give it a try ... I was going crazy and my mind was 
looping dangerously!  :-)


#40174

From	io <maroso@libero.it>
Date	2013-02-28 20:46 +0000
Message-ID	<512fc240$0$40355$4fafbaef@reader1.news.tin.it>
In reply to	#40169

Neil, it works great!

Just one question: how can I ignore the case of the symbol?

It wasn't working initially; then I wrote the values in the esclusioni 
file with matching case and everything worked like a charm. I would just 
like to know whether I can make the comparison case-insensitive.

Thanks a lot for your help.

I owe you one, I'll buy you a pizza!


#40176

From	io <maroso@libero.it>
Date	2013-02-28 20:51 +0000
Message-ID	<512fc361$0$40355$4fafbaef@reader1.news.tin.it>
In reply to	#40174

The final working code is :

import json
import urllib
import csv

url = "http://bitcoincharts.com/t/markets.json"
response = urllib.urlopen(url);
data = json.loads(response.read())

f = open("/home/io/markets.csv","wb")
c = csv.writer(f)

# open a text file and read its contents into a list
esclusioni_file = open('/home/io/exclusions.txt','r')
esclusioni = []

for line in esclusioni_file:
     esclusioni.append(line.strip())
#print(esclusioni)


# write headers
c.writerow(["Currency","Symbol","Bid", "Ask", "Volume"])

for d in data:
    if d["currency"] != "SLL":  # exclude the Second Life currency SLL
        if d["bid"] is not None and d["ask"] is not None:
            if d["symbol"] not in esclusioni:
                #print d["symbol"]
                c.writerow([str(d["currency"]),str(d["symbol"]),str(d["bid"]),str(d["ask"]),str(d["currency_volume"])])

# close both files so the CSV output is flushed to disk
esclusioni_file.close()
f.close()


#40186

From	Dave Angel <davea@davea.name>
Date	2013-02-28 16:23 -0500
Message-ID	<mailman.2684.1362086631.2939.python-list@python.org>
In reply to	#40174

On 02/28/2013 03:46 PM, io wrote:
> Neil, it works great!
>
> Just one question : what can i do for ignoring the case sensitive of the
> symbol?
>
> It wasn't working initially, then i wrote the values respecting case
> sensitive in the file esclusioni and all worked as a charm. I would just
> like to know if i could ignore the case senstive function.
>
> Thanks alot for your help.
>
> I'm in debt with you, i'll spend a pizza for you!
>

Just use a tolower() method on both strings when you're comparing them. 
  Of course, that may not work well with international character sets. 
Some characters in some languages have no lowercase equivalent, and 
using toupper() has the same problem in other languages.

Also, the approach to case insensitive differs between Python 2.x and 3.x

-- 
DaveA


#40194

From	io <maroso@libero.it>
Date	2013-02-28 22:05 +0000
Message-ID	<512fd4ab$0$40355$4fafbaef@reader1.news.tin.it>
In reply to	#40186

> Just use a tolower() method on both strings when you're comparing them.
>   Of course, that may not work well with international character sets.
> Some characters in some languages have no lowercase equivalent, and
> using toupper() has the same problem in other languages.
> 
> Also, the approach to case insensitive differs between Python 2.x and
> 3.x

Thanks for the info !
What about str().lower()?


#40205

From	Dave Angel <davea@davea.name>
Date	2013-02-28 19:13 -0500
Message-ID	<mailman.2694.1362096823.2939.python-list@python.org>
In reply to	#40194

On 02/28/2013 05:05 PM, io wrote:
>
>> Just use a tolower() method on both strings when you're comparing them.
>>    Of course, that may not work well with international character sets.
>> Some characters in some languages have no lowercase equivalent, and
>> using toupper() has the same problem in other languages.
>>
>> Also, the approach to case insensitive differs between Python 2.x and
>> 3.x
>
> Thanks for the info !
> What about str().lower()?
>

Actually, it's str.lower(), and that's what I meant.  Must be some other 
language that uses tolower().

-- 
DaveA
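
A small sketch of the case-insensitive comparison discussed above, for Python 3. The helper names normalize and is_excluded are hypothetical, and the exclusion names are sample values. It uses str.casefold(), which exists in Python 3.3 and later; on Python 2 the closest substitute would be str.lower(), with the caveats Dave mentions for international text.

```python
def normalize(text):
    """Return a form of text suitable for case-insensitive comparison."""
    # strip() removes the trailing newline and stray spaces read from a file.
    return text.strip().casefold()


def is_excluded(symbol, excluded_normalized):
    """Check a symbol against a set of already-normalized exclusion names."""
    return normalize(symbol) in excluded_normalized


# Normalize the exclusion list once, then normalize each symbol when testing.
excluded = {normalize(name) for name in ["MtGoxUSD\n", " mtgoxpln"]}
print(is_excluded("mtgoxUSD", excluded))
print(is_excluded("mtgoxEUR", excluded))
```

Normalizing both sides means the exclusions file no longer has to match the case used in the JSON data.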


#40211

From	Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date	2013-02-28 21:04 -0500
Message-ID	<mailman.2697.1362103483.2939.python-list@python.org>
In reply to	#40157

On 28 Feb 2013 19:14:36 GMT, io <maroso@libero.it> declaimed the
following in gmane.comp.python.general:

> Hi,
> 
> i have to files. 
> 
> First file is a csv file
> Second file is a plain text file where each row has a value (text)
> 
> I want to be able to create a third file using data from the first file 
> excluding the values listed in the second file.
> 
> Example:
> 
> First file:
> -----------
> 
> mtgoxeur	12	24	36
> mtgoxusd	10	12	14
> mtgoxpln	2	4	6
> 
> 
> Second file:
> ------------
> 
> mtgoxusd
> 
> 
> 
> Third File (the resulting one) :
> --------------------------------
> 
> mtgoxeur	12	24	36
> mtgoxpln	2	4	6
> 
> 
> 
> Thanks to anyone that can help
> 

	This appears to be a variation of standard sort/merge operation. The
variation being that you /do not/ write lines of one file when they
match the other file.

	While an in-memory sort may be feasible (unless the data file is
extremely large) I would probably invoke your OS sort module on each
file...

	Then (pseudo-code, not executable python)

data = csvin.next()
key = keyfile.readln().strip()

while data and key:
	if data[0] < key:
		csvout.write(data)
		data = csvin.next()
	elif data[0] == key:
		data = csvin.next()
		#don't increment key to catch duplicates in data
	elif data[0] > key:
		key = keyfile.readln().strip()

while data:
	#no more keys to remove, copy rest of data file
	csvout.write(data)
	data = csvin.next()
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
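
A runnable sketch of the sort/merge idea Dennis outlines above, for Python 3. It is an interpretation of his pseudo-code rather than his implementation: the generator name merge_exclude is hypothetical, rows are assumed to be lists whose first element is the key, and both inputs are sorted in memory here instead of with an operating-system sort.

```python
def merge_exclude(sorted_rows, sorted_keys):
    """Yield rows whose first field is not in sorted_keys.

    Both inputs must already be sorted by the same ordering.
    """
    keys = iter(sorted_keys)
    key = next(keys, None)  # None means the key list is exhausted
    for row in sorted_rows:
        # Advance past keys that sort before this row's key.
        while key is not None and key < row[0]:
            key = next(keys, None)
        # On a match, skip the row but keep the key, so duplicate
        # data rows with the same key are dropped as well.
        if key is None or row[0] != key:
            yield row


rows = sorted([
    ["mtgoxeur", "12", "24", "36"],
    ["mtgoxusd", "10", "12", "14"],
    ["mtgoxpln", "2", "4", "6"],
])
keys = sorted(["mtgoxusd"])
kept = list(merge_exclude(rows, keys))
for row in kept:
    print("\t".join(row))
```

The merge reads each input once and keeps only one key in memory at a time, which is why it suits files too large for a set; for small exclusion lists the set-based approach from earlier in the thread is simpler.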


#40473

From	io <maroso@libero.it>
Date	2013-03-04 19:04 +0000
Message-ID	<5134f02e$0$40360$4fafbaef@reader1.news.tin.it>
In reply to	#40211

What you wrote seems interesting, but I haven't understood it.
Can you explain it in simple words? I'm Italian and I don't 
understand some of the terms you use very well.
Sorry, I'm sure you are suggesting something really valid, but I can't 
understand it.

Marco.


#40487

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-03-04 23:58 +0000
Message-ID	<51353535$0$30001$c3e8da3$5496439d@news.astraweb.com>
In reply to	#40473

On Mon, 04 Mar 2013 19:04:14 +0000, io wrote:

> What you wrote seems interesting but i haven't understood. Can you
> explain in simple words considering i'm italian and i'm not
> understanding so well some terms you use. Sorry, i'm sure you are
> suggesting something really valid but can't understand it.
> 
> Marco.

Who are you talking to? Your reply has no context.

When replying to a message, please quote enough of the previous message 
so that readers can understand what you are replying to. There is usually 
no need to quote the entire message. Often just a few lines is enough.

If you spend some time reading other people's questions and answers, you 
will see what I mean.

-- 
Steven

