how to compare two fields in python

Started by: upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com>
First post: 2013-04-30 10:41 -0700
Last post: 2013-04-30 19:54 -0400
Articles: 7 (5 participants)


Contents

  how to compare two fields in python upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com> - 2013-04-30 10:41 -0700
    Re: how to compare two fields in python Joel Goldstick <joel.goldstick@gmail.com> - 2013-04-30 13:53 -0400
    Re: how to compare two fields in python Fábio Santos <fabiosantosart@gmail.com> - 2013-04-30 19:08 +0100
    Re: how to compare two fields in python Tim Chase <python.list@tim.thechases.com> - 2013-04-30 13:19 -0500
      Re: how to compare two fields in python upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com> - 2013-04-30 11:22 -0700
        Re: how to compare two fields in python Fábio Santos <fabiosantosart@gmail.com> - 2013-04-30 19:42 +0100
    Re: how to compare two fields in python Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-04-30 19:54 -0400

#44556 — how to compare two fields in python

From: upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com>
Date: 2013-04-30 10:41 -0700
Subject: how to compare two fields in python
Message-ID: <59afdc2b-ab76-40b6-8f63-4a562e288029@googlegroups.com>

I have a very basic question about Python. I want to go through each line of a CSV file and check whether the first field of one line is the same as the first field of the next line, and so on. If there is a match, I would like to put that field in object1; otherwise I would like to put it in a different object, object2. Finally, I would like to count how many fields are in object1 versus object2. Can this be done in Python? Here is a small example.

BRM_1   679 1929
BRM_1   203 567
BRM_2   367 1308
BRM_3   435 509
As you can see, field1 of line1 is the same as field1 of line2, so the field BRM_1 should be placed in object1, while BRM_2 and BRM_3 should be placed in object2. The final counts are therefore 1 for object1 and 2 for object2.

Thanks in advance..

Upendra



#44557

From: Joel Goldstick <joel.goldstick@gmail.com>
Date: 2013-04-30 13:53 -0400
Message-ID: <mailman.1190.1367344424.3114.python-list@python.org>
In reply to: #44556


On Tue, Apr 30, 2013 at 1:41 PM, upendra kumar Devisetty <
upendrakumar.devisetty@googlemail.com> wrote:

> I have a very basic question in python. I want to go through each line of
> the a csv file and compare to see if the first field of line 1 is same as
> first field of next line and so on. If it finds a match then i would like
> to put that field in an object1 else put that field in a different object2.
> Finally i would like to count how many of the fields in object1 vs object2.
> Can this be done in python? Here is a small example.
>
> BRM_1   679 1929
> BRM_1   203 567
> BRM_2   367 1308
> BRM_3   435 509
> As you can see field1 of line1 is same as field2 of line2 and so that
> field BRM_1 should be place in object1 and BRM_2 and BRM_3 should be placed
> in object2. So the final numbers of object1 is 1 and object2 is 2.
>
>
You should study the csv module.
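
As a minimal sketch of what the csv module offers here, assuming the sample is separated by runs of spaces (if the real file is tab-delimited, `delimiter="\t"` would be the setting to use instead), the first field of each row could be pulled out roughly like this. The `SAMPLE` string and the `first_fields` helper are illustrative names, not something from the thread.

```python
import csv
import io

# Sample data copied from the original post; in practice this would be an open file.
SAMPLE = """BRM_1   679 1929
BRM_1   203 567
BRM_2   367 1308
BRM_3   435 509
"""

def first_fields(lines):
    """Return the first field of every non-empty row."""
    # skipinitialspace=True lets a run of spaces act as a single separator.
    reader = csv.reader(lines, delimiter=" ", skipinitialspace=True)
    return [row[0] for row in reader if row]

fields = first_fields(io.StringIO(SAMPLE))
print(fields)
```

For a real file, `io.StringIO(SAMPLE)` would be replaced by a file object opened with `newline=""`, as the csv documentation recommends.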

> Thanks in advance..
>
> Upendra
> --
> http://mail.python.org/mailman/listinfo/python-list
>



-- 
Joel Goldstick
http://joelgoldstick.com



#44559

From: Fábio Santos <fabiosantosart@gmail.com>
Date: 2013-04-30 19:08 +0100
Message-ID: <mailman.1192.1367345342.3114.python-list@python.org>
In reply to: #44556


... And collections.Counter. This is useful for (you guessed it) counting.

Maybe itertools.groupby will be helpful as well (it could be used to give
you your data grouped by the first column of data), but it could be a tad
advanced for you if you are not too familiar with iterators.
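
To illustrate the Counter suggestion, here is a small sketch that assumes the first fields have already been extracted into a list (the `first_fields` list below is hard-coded from the sample in the original post). It reads "object1" as the names that occur on more than one line and "object2" as the names that occur exactly once, which is one interpretation of the question.

```python
from collections import Counter

# First fields taken from the sample data; a hypothetical stand-in for parsed input.
first_fields = ["BRM_1", "BRM_1", "BRM_2", "BRM_3"]

# Tally how many lines each name appears on.
counts = Counter(first_fields)

# object1: names seen more than once; object2: names seen exactly once.
object1 = sorted(name for name, n in counts.items() if n > 1)
object2 = sorted(name for name, n in counts.items() if n == 1)

print(len(object1), len(object2))
```

Counter does not care whether equal names are adjacent, so this version also works on unsorted input.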



#44562

From: Tim Chase <python.list@tim.thechases.com>
Date: 2013-04-30 13:19 -0500
Message-ID: <mailman.1194.1367345875.3114.python-list@python.org>
In reply to: #44556

On 2013-04-30 10:41, upendra kumar Devisetty wrote:
> I have a very basic question in python. I want to go through each
> line of the a csv file and compare to see if the first field of
> line 1 is same as first field of next line and so on. If it finds a
> match then i would like to put that field in an object1 else put
> that field in a different object2. Finally i would like to count
> how many of the fields in object1 vs object2. Can this be done in
> python? Here is a small example.
> 
> BRM_1   679 1929
> BRM_1   203 567
> BRM_2   367 1308
> BRM_3   435 509
> As you can see field1 of line1 is same as field2 of line2 and so
> that field BRM_1 should be place in object1 and BRM_2 and BRM_3
> should be placed in object2. So the final numbers of object1 is 1
> and object2 is 2.

You underdefine the problem.  What happens in the case of:

  BRM_1 ...
  BRM_1 ...
  BRM_2 ...
  BRM_1 ...   <-- duplicates a (not-immediately) previous line
  BRM_3 ...

Also, do the values that follow have any significance for this, or
are they just noise to be ignored?
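
To make the ambiguity concrete, the following sketch runs the five-line case above through two standard tools. It only looks at the first field (the `keys` list is a hand-written stand-in for the parsed file) and shows that an adjacency-based approach and a whole-file tally disagree when the input is not sorted.

```python
from collections import Counter
from itertools import groupby

# First fields of the hypothetical unsorted input shown above.
keys = ["BRM_1", "BRM_1", "BRM_2", "BRM_1", "BRM_3"]

# groupby only merges *adjacent* equal keys, so BRM_1 appears as two separate runs.
runs = [(key, len(list(group))) for key, group in groupby(keys)]

# Counter tallies across the whole input regardless of position.
totals = Counter(keys)

print(runs)
print(totals["BRM_1"])
```

Which of the two behaviours is wanted depends on the answer to the question above.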

-tkc




#44564

From: upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com>
Date: 2013-04-30 11:22 -0700
Message-ID: <bf43e88c-dcf4-48bf-b416-5f891f1b71a4@googlegroups.com>
In reply to: #44562

The data is sorted, so duplicates can only appear on adjacent lines and never elsewhere in the data. The values that follow the first field have no significance and can safely be ignored.

Thanks
Upendra

On Tuesday, April 30, 2013 11:19:56 AM UTC-7, Tim Chase wrote:
> On 2013-04-30 10:41, upendra kumar Devisetty wrote:
> > I have a very basic question in python. I want to go through each
> > line of the a csv file and compare to see if the first field of
> > line 1 is same as first field of next line and so on. If it finds a
> > match then i would like to put that field in an object1 else put
> > that field in a different object2. Finally i would like to count
> > how many of the fields in object1 vs object2. Can this be done in
> > python? Here is a small example.
> >
> > BRM_1   679 1929
> > BRM_1   203 567
> > BRM_2   367 1308
> > BRM_3   435 509
> > As you can see field1 of line1 is same as field2 of line2 and so
> > that field BRM_1 should be place in object1 and BRM_2 and BRM_3
> > should be placed in object2. So the final numbers of object1 is 1
> > and object2 is 2.
>
> You underdefine the problem.  What happens in the case of:
>
>   BRM_1 ...
>   BRM_1 ...
>   BRM_2 ...
>   BRM_1 ...   <-- duplicates a (not-immediately) previous line
>   BRM_3 ...
>
> Also, do the values that follow have any significance for this, or
> are they just noise to be ignored?
>
> -tkc



#44566

From: Fábio Santos <fabiosantosart@gmail.com>
Date: 2013-04-30 19:42 +0100
Message-ID: <mailman.1197.1367347358.3114.python-list@python.org>
In reply to: #44564


> The data was sorted and so duplicates will not appear anywhere in the
> dataframe.
>

I guess that's it. Use the standard csv module and itertools.groupby.
Groupby will produce an iterator of (key, group) pairs, where each group
is itself an iterator over the consecutive rows that share that key. So
you can group by the first column by supplying a key function which just
returns the first column.

Check this out for an example:
http://stackoverflow.com/questions/773/how-do-i-use-pythons-itertools-groupby
(most upvoted answer)
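
A rough sketch of that approach, assuming the rows have already been parsed into lists of strings (the `rows` list is hard-coded from the sample in the original post) and that the input is sorted on the first column as stated:

```python
from itertools import groupby
from operator import itemgetter

# Rows as csv.reader would yield them; hard-coded here from the sample data.
rows = [
    ["BRM_1", "679", "1929"],
    ["BRM_1", "203", "567"],
    ["BRM_2", "367", "1308"],
    ["BRM_3", "435", "509"],
]

object1, object2 = [], []
# itemgetter(0) is the key function: it returns the first column of each row.
for key, group in groupby(rows, key=itemgetter(0)):
    size = sum(1 for _ in group)  # consume the group iterator to count its rows
    # Names on more than one consecutive line go to object1, the rest to object2.
    (object1 if size > 1 else object2).append(key)

print(len(object1), len(object2))
```

Each group has to be consumed before moving on to the next key, because groupby shares a single underlying iterator across all groups.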



#44576

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2013-04-30 19:54 -0400
Message-ID: <mailman.1203.1367366071.3114.python-list@python.org>
In reply to: #44556

On Tue, 30 Apr 2013 10:41:56 -0700 (PDT), upendra kumar Devisetty
<upendrakumar.devisetty@googlemail.com> declaimed the following in
gmane.comp.python.general:

> I have a very basic question in python. I want to go through each line of the a csv file and compare to see if the first field of line 1 is same as first field of next line and so on. If it finds a match then i would like to put that field in an object1 else put that field in a different object2. Finally i would like to count how many of the fields in object1 vs object2. Can this be done in python? Here is a small example.
>

	You are essentially describing a "control-break" (or "report break")
http://en.wikipedia.org/wiki/Control_break

	The basic algorithm requires one to keep track of "previous" record
and do a comparison. While the "control" field is the same, you do one
action. When the control changes you close out the previous group and
start a new one. 

> BRM_1   679 1929
> BRM_1   203 567
> BRM_2   367 1308
> BRM_3   435 509
> As you can see field1 of line1 is same as field2 of line2 and so that field BRM_1 should be place in object1 and BRM_2 and BRM_3 should be placed in object2. So the final numbers of object1 is 1 and object2 is 2.
>

	Pseudo-code:

control = None
group = []
for record in records:          # records: any iterable of field lists
    if control is not None and record[0] != control:
        output(group)           # close out previous group data
        group = []              # initialize new group data
    control = record[0]         # remember the current control field
    group.append(record)        # add current record to group
if group:
    output(group)               # handle non-empty last group
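
One way the control-break loop could look as runnable Python is sketched below. It is a sketch under assumptions: the `control_break` generator name and the hard-coded `lines` list are illustrative, records are split on whitespace, and the control field is refreshed on every record so that a change is always detected.

```python
def control_break(records):
    """Yield lists of consecutive records that share the same first field."""
    control = None
    group = []
    for record in records:
        if group and record[0] != control:
            yield group          # close out the previous group
            group = []           # start a new group
        control = record[0]      # remember the current control field
        group.append(record)     # add the current record to its group
    if group:
        yield group              # emit the final non-empty group

# Sample lines from the original post, split on whitespace.
lines = ["BRM_1   679 1929", "BRM_1   203 567", "BRM_2   367 1308", "BRM_3   435 509"]
records = [line.split() for line in lines]

# Pair each control value with the number of records in its group.
sizes = [(group[0][0], len(group)) for group in control_break(records)]
print(sizes)
```

Groups with more than one record correspond to the asker's object1 and single-record groups to object2.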


> Thanks in advance..
> 
> Upendra
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/


