Groups > comp.lang.python > #57015 > unrolled thread
| Started by | "torque.india@gmail.com" <torque.india@gmail.com> |
|---|---|
| First post | 2013-10-18 07:18 +0530 |
| Last post | 2013-10-18 15:54 -0700 |
| Articles | 4 — 4 participants |
finding data from two different files. "torque.india@gmail.com" <torque.india@gmail.com> - 2013-10-18 07:18 +0530
Re: finding data from two different files. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-10-18 02:24 +0000
Re: finding data from two different files. Roy Smith <roy@panix.com> - 2013-10-18 08:31 -0400
Re: finding data from two different files. Jim Gibson <jimsgibson@gmail.com> - 2013-10-18 15:54 -0700
| From | "torque.india@gmail.com" <torque.india@gmail.com> |
|---|---|
| Date | 2013-10-18 07:18 +0530 |
| Subject | finding data from two different files. |
| Message-ID | <mailman.1193.1382062311.18130.python-list@python.org> |
Hi all,

I am new to python, just was looking for logic to understand to write code in the below scenario.

I am having a file (filea) with multiple columns, and another file(fileb) with again multiple columns, but say i want to use column2 of fileb as a search expression to search for similar value in column3 of filea. and print it with value of rows of filea.

filea:
a 1 ab
b 2 bc
d 3 de
e 4 ef
.
.
.

fileb
z ab 24
y bc 85
x ef 123
w de 33

Regards../ omps
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-10-18 02:24 +0000 |
| Message-ID | <52609bd3$0$29981$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #57015 |
On Fri, 18 Oct 2013 07:18:49 +0530, torque.india@gmail.com wrote:
> Hi all,
>
> I am new to python, just was looking for logic to understand to write
> code in the below scenario.
>
> I am having a file (filea) with multiple columns, and another
> file(fileb) with again multiple columns, but say i want to use column2
> of fileb as a search expression to search for similar value in column3
> of filea. and print it with value of rows of filea.
>
> filea:
> a 1 ab
> b 2 bc
> d 3 de
> e 4 ef
> .
> .
> .
>
> fileb
> z ab 24
> y bc 85
> x ef 123
> w de 33
Can you explain your problem a little better? You've shown some example
data, which is great, but what are we supposed to do with it? Given the
data shown above, what result would you expect to get?
My guess is that you want to do something like this:
* walk through fileB, extract each line in turn
* extract the second column
* then search fileA for lines where column 3 matches
* then... I don't know, maybe print the match?
Repeatedly walking through fileA will be slow. So it is better to do this
only once, ahead of time. I suggest that you probably want to use the csv
module to read the data, but because I'm lazy, I'm going to do it by hand:
# Prepare fileA for later searches
data = {}  # use a dict to map column 3 to the rest of the data
with open("fileA") as f:
    for line in f:
        fields = line.split()  # split on whitespace
        col3 = fields[2]  # remember fields are numbered from 0, not 1
        data[col3] = line
The above assumes that each item in column 3 is unique. If it isn't,
you'll need a different strategy.
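One possible "different strategy", offered only as a sketch: map each column 3 value to a list of lines instead of a single line. The function name `build_index` and the sample rows are hypothetical, the code is written for Python 3, and skipping lines with fewer than three fields is an assumption about how short rows should be treated.

```python
from collections import defaultdict


def build_index(lines):
    """Map each column-3 value to a list of every line that carries it."""
    index = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:  # assumption: ignore blank or short lines
            continue
        index[fields[2]].append(line.rstrip("\n"))
    return index


# Hypothetical sample in which "ab" appears twice in column 3.
sample = ["a 1 ab\n", "b 2 bc\n", "c 9 ab\n"]
index = build_index(sample)
print(index["ab"])  # both lines whose third column is "ab"
```

With a plain dict, the second "ab" line would silently overwrite the first; the list keeps both.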
Now on to the second part:
with open("fileB") as f:
    for line in f:
        col2 = line.split()[1]
        # This next line assumes you're using Python2
        print col2, data.get(col2, '***no match***')
Does this help?
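For readers on Python 3, the two parts above might be combined roughly as follows. This is a sketch, not the poster's code: the function names `index_by_col3` and `lookup` are hypothetical, lists of strings stand in for the two files, and skipping short lines (such as the "." filler rows in the sample data, which would otherwise raise IndexError) is an assumption about the desired behaviour.

```python
def index_by_col3(lines):
    """Return a dict mapping column 3 of each line to the stripped line."""
    data = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:  # assumption: skip blank lines and "." filler rows
            continue
        data[fields[2]] = line.rstrip("\n")
    return data


def lookup(lines, data, missing="***no match***"):
    """Yield (key, matching fileA line) for column 2 of each fileB line."""
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        yield fields[1], data.get(fields[1], missing)


# In-memory stand-ins for the two files. With real files, pass the open
# file objects instead of these lists.
file_a = ["a 1 ab\n", "b 2 bc\n", "d 3 de\n", "e 4 ef\n", ".\n"]
file_b = ["z ab 24\n", "y bc 85\n", "x ef 123\n", "w de 33\n", "v zz 7\n"]

results = list(lookup(file_b, index_by_col3(file_a)))
for key, row in results:
    print(key, row)  # print is a function in Python 3
```

fileA is read once to build the dict, so each fileB line costs a single dictionary lookup rather than another pass over fileA.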
--
Steven
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2013-10-18 08:31 -0400 |
| Message-ID | <roy-F77370.08312718102013@news.panix.com> |
| In reply to | #57015 |
In article <mailman.1193.1382062311.18130.python-list@python.org>,
"torque.india@gmail.com" <torque.india@gmail.com> wrote:
> Hi all,
>
> I am new to python, just was looking for logic to understand to write code in
> the below scenario.
>
> I am having a file (filea) with multiple columns, and another file(fileb)
> with again multiple columns, but say i want to use column2 of fileb as a
> search expression to search for similar value in column3 of filea. and print
> it with value of rows of filea.
>
> filea:
> a 1 ab
> b 2 bc
> d 3 de
> e 4 ef
> .
> .
> .
>
> fileb
> z ab 24
> y bc 85
> x ef 123
> w de 33
>
> Regards../ omps
Start by breaking this down into small tasks. The first thing you need
to be able to do is open filea, read it, and split each line up into
columns. You're going to want something along the lines of:
for line in open("filea"):
    col1, col2, col3 = line.split()
Play with that for a while and make sure you understand what's going on.
There's the iteration over the lines of a file, the splitting of each
line into a list of fields, and the unpacking of that list into three
variables. Each of those is a very common operation that you'll be
using often.
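The three operations named above (iteration, splitting, unpacking) can be tried without any file on disk. In this Python 3 sketch, `io.StringIO` stands in for `open("filea")`, and the `int()` conversion is an illustrative addition that is not part of the original suggestion.

```python
import io

# io.StringIO stands in for open("filea") so the example runs without a file.
filea = io.StringIO("a 1 ab\nb 2 bc\nd 3 de\ne 4 ef\n")

rows = []
for line in filea:              # iterating a file yields one line at a time
    fields = line.split()       # split on whitespace into a list of strings
    col1, col2, col3 = fields   # unpacking; ValueError unless exactly 3 items
    rows.append((col1, int(col2), col3))  # fields are strings until converted

print(rows)
```

Note that unpacking into three names fails on any line that does not have exactly three fields, so real input with blank or filler lines would need a length check first.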
At some point, you're going to want to say, "I've got a line from fileb
whose column 2 is 'ab'; what line from filea has 'ab' in column 3?"
That calls for a map. In Python, it's called a dictionary. As you read
filea, you'll want to build a map, something like:
map = {}
for line in open("filea"):
    col1, col2, col3 = line.split()
    map[col3] = line
Once you've done that, try:
>>> print map
and see what it gives you. Then, read up on dictionaries
http://docs.python.org/2/tutorial/datastructures.html#dictionaries
and see if the hints I've given you are enough to get the rest of the
way yourself. If not, come back and ask more questions.
Oh, also, you didn't say what version of Python you're using. My
examples above assumed Python 2. If you're using Python 3, some minor
details may change, so let us know which you're using.
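As an illustration of those minor details, here is a small Python 3 sketch of inspecting such a dictionary. The name `lookup` is hypothetical (chosen to avoid shadowing the built-in `map`), and the two entries are a hand-written stand-in for what the loop over filea would produce.

```python
# Stand-in for the dictionary built from filea (hypothetical contents).
lookup = {"ab": "a 1 ab", "bc": "b 2 bc"}

print(lookup)              # Python 3: print is a function, not a statement
print("ab" in lookup)      # membership test checks the keys
print(lookup.get("zz"))    # .get() returns None for a missing key
try:
    lookup["zz"]           # plain indexing raises KeyError when the key is absent
except KeyError as exc:
    print("missing key:", exc)
```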
| From | Jim Gibson <jimsgibson@gmail.com> |
|---|---|
| Date | 2013-10-18 15:54 -0700 |
| Message-ID | <181020131554447773%jimsgibson@gmail.com> |
| In reply to | #57015 |
In article <mailman.1193.1382062311.18130.python-list@python.org>,
<"torque.india@gmail.com"> wrote:
> Hi all,
>
> I am new to python, just was looking for logic to understand to write code in
> the below scenario.
>
> I am having a file (filea) with multiple columns, and another file(fileb)
> with again multiple columns, but say i want to use column2 of fileb as a
> search expression to search for similar value in column3 of filea. and print
> it with value of rows of filea.
>
> filea:
> a 1 ab
> b 2 bc
> d 3 de
> e 4 ef
> .
> .
> .
>
> fileb
> z ab 24
> y bc 85
> x ef 123
> w de 33
>
> Regards../ omps
Interestingly, somebody named "Om Prakash Singh" asked the identical
question on the perl beginners list, except with the word "perl"
substituted for "python".
Is this a homework problem? Are you unsure about which language to use?
Are you comparison shopping?
--
Jim Gibson