Groups > comp.lang.python > #44556 > unrolled thread
| Started by | upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com> |
|---|---|
| First post | 2013-04-30 10:41 -0700 |
| Last post | 2013-04-30 19:54 -0400 |
| Articles | 7 — 5 participants |
how to compare two fields in python upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com> - 2013-04-30 10:41 -0700
Re: how to compare two fields in python Joel Goldstick <joel.goldstick@gmail.com> - 2013-04-30 13:53 -0400
Re: how to compare two fields in python Fábio Santos <fabiosantosart@gmail.com> - 2013-04-30 19:08 +0100
Re: how to compare two fields in python Tim Chase <python.list@tim.thechases.com> - 2013-04-30 13:19 -0500
Re: how to compare two fields in python upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com> - 2013-04-30 11:22 -0700
Re: how to compare two fields in python Fábio Santos <fabiosantosart@gmail.com> - 2013-04-30 19:42 +0100
Re: how to compare two fields in python Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-04-30 19:54 -0400
| From | upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com> |
|---|---|
| Date | 2013-04-30 10:41 -0700 |
| Subject | how to compare two fields in python |
| Message-ID | <59afdc2b-ab76-40b6-8f63-4a562e288029@googlegroups.com> |
I have a very basic question about Python. I want to go through each line of a CSV file and check whether the first field of a line is the same as the first field of the next line, and so on. If there is a match, I would like to put that field in object1; otherwise, put it in a different object2. Finally, I would like to count how many fields are in object1 vs. object2. Can this be done in Python? Here is a small example.

BRM_1 679 1929
BRM_1 203 567
BRM_2 367 1308
BRM_3 435 509

As you can see, field1 of line1 is the same as field1 of line2, so the field BRM_1 should be placed in object1, and BRM_2 and BRM_3 should be placed in object2. So the final count for object1 is 1 and for object2 is 2.

Thanks in advance.

Upendra
| From | Joel Goldstick <joel.goldstick@gmail.com> |
|---|---|
| Date | 2013-04-30 13:53 -0400 |
| Message-ID | <mailman.1190.1367344424.3114.python-list@python.org> |
| In reply to | #44556 |
On Tue, Apr 30, 2013 at 1:41 PM, upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com> wrote:
> I have a very basic question in python. I want to go through each line of the a csv file and compare to see if the first field of line 1 is same as first field of next line and so on. If it finds a match then i would like to put that field in an object1 else put that field in a different object2. Finally i would like to count how many of the fields in object1 vs object2. Can this be done in python? Here is a small example.
>
> BRM_1 679 1929
> BRM_1 203 567
> BRM_2 367 1308
> BRM_3 435 509
>
> As you can see field1 of line1 is same as field2 of line2 and so that field BRM_1 should be place in object1 and BRM_2 and BRM_3 should be placed in object2. So the final numbers of object1 is 1 and object2 is 2.

You should study the csv module.

> Thanks in advance..
>
> Upendra

--
Joel Goldstick
http://joelgoldstick.com
| From | Fábio Santos <fabiosantosart@gmail.com> |
|---|---|
| Date | 2013-04-30 19:08 +0100 |
| Message-ID | <mailman.1192.1367345342.3114.python-list@python.org> |
| In reply to | #44556 |
... And collections.Counter. This is useful for (you guessed it) counting. Maybe itertools.groupby will be helpful as well (it could be used to give you your data grouped by the first column of data), but it could be a tad advanced for you if you are not too familiar with iterators.
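A minimal sketch of the Counter suggestion, assuming the rows are whitespace-separated and already held in a string. The names `sample`, `repeated`, and `unique` are illustrative and do not come from the thread:

```python
from collections import Counter

# Sample rows from the original question, held in a string for illustration.
sample = """BRM_1 679 1929
BRM_1 203 567
BRM_2 367 1308
BRM_3 435 509"""

# Count how often each first field occurs.
counts = Counter(line.split()[0] for line in sample.splitlines())

# Keys seen more than once vs. exactly once (the poster's object1 / object2).
repeated = [key for key, n in counts.items() if n > 1]
unique = [key for key, n in counts.items() if n == 1]

print(len(repeated), len(unique))  # prints: 1 2
```

Counter totals occurrences across the whole input, so this approach does not depend on the rows being sorted.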
| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Date | 2013-04-30 13:19 -0500 |
| Message-ID | <mailman.1194.1367345875.3114.python-list@python.org> |
| In reply to | #44556 |
On 2013-04-30 10:41, upendra kumar Devisetty wrote:
> I have a very basic question in python. I want to go through each line of the a csv file and compare to see if the first field of line 1 is same as first field of next line and so on. If it finds a match then i would like to put that field in an object1 else put that field in a different object2. Finally i would like to count how many of the fields in object1 vs object2. Can this be done in python? Here is a small example.
>
> BRM_1 679 1929
> BRM_1 203 567
> BRM_2 367 1308
> BRM_3 435 509
>
> As you can see field1 of line1 is same as field2 of line2 and so that field BRM_1 should be place in object1 and BRM_2 and BRM_3 should be placed in object2. So the final numbers of object1 is 1 and object2 is 2.

You underdefine the problem. What happens in the case of:

BRM_1 ...
BRM_1 ...
BRM_2 ...
BRM_1 ... <-- duplicates a (not-immediately) previous line
BRM_3 ...

Also, do the values that follow have any significance for this, or are they just noise to be ignored?

-tkc
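To illustrate why the question matters, here is a small sketch contrasting an adjacent-only comparison with a whole-file count on the unsorted case above. The list `keys` and the names `runs` and `totals` are made up for this illustration:

```python
from collections import Counter
from itertools import groupby

# First fields from the unsorted example, in file order.
keys = ["BRM_1", "BRM_1", "BRM_2", "BRM_1", "BRM_3"]

# Adjacent-only view: each run of equal neighbours is its own group.
runs = [(key, len(list(group))) for key, group in groupby(keys)]

# Whole-file view: occurrences are totalled regardless of position.
totals = Counter(keys)

print(runs)             # BRM_1 appears as two separate runs
print(totals["BRM_1"])  # prints: 3
```

With unsorted input the two views disagree about BRM_1, so the intended behaviour has to be stated before picking an approach.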
| From | upendra kumar Devisetty <upendrakumar.devisetty@googlemail.com> |
|---|---|
| Date | 2013-04-30 11:22 -0700 |
| Message-ID | <bf43e88c-dcf4-48bf-b416-5f891f1b71a4@googlegroups.com> |
| In reply to | #44562 |
The data was sorted, so non-adjacent duplicates will not appear anywhere in the dataframe. The values do not have any significance and can safely be ignored.

Thanks
Upendra
| From | Fábio Santos <fabiosantosart@gmail.com> |
|---|---|
| Date | 2013-04-30 19:42 +0100 |
| Message-ID | <mailman.1197.1367347358.3114.python-list@python.org> |
| In reply to | #44564 |
> The data was sorted and so duplicates will not appear anywhere in the dataframe.

I guess that's it. Use the standard csv module and itertools.groupby.

groupby yields (key, group) pairs, one for each run of consecutive rows that share a key. So you can group by the first column by supplying a key function which just returns the first column.

Check this out for an example (most upvoted answer): http://stackoverflow.com/questions/773/how-do-i-use-pythons-itertools-groupby
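A rough sketch of that combination, assuming the file is space-delimited and sorted on its first column. The `io.StringIO` object stands in for an open file, and the names `group_sizes`, `object1`, and `object2` are illustrative:

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

# Stand-in for an open file; the real delimiter may differ (assumption).
data = io.StringIO(
    "BRM_1 679 1929\nBRM_1 203 567\nBRM_2 367 1308\nBRM_3 435 509\n"
)
reader = csv.reader(data, delimiter=" ")

# groupby only merges adjacent rows, so it relies on the input being sorted.
group_sizes = {
    key: sum(1 for _ in rows)
    for key, rows in groupby(reader, key=itemgetter(0))
}

# First fields shared by neighbouring rows vs. those that stand alone.
object1 = [key for key, size in group_sizes.items() if size > 1]
object2 = [key for key, size in group_sizes.items() if size == 1]

print(len(object1), len(object2))  # prints: 1 2
```

The key function here is `itemgetter(0)`, which returns the first column of each parsed row.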
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2013-04-30 19:54 -0400 |
| Message-ID | <mailman.1203.1367366071.3114.python-list@python.org> |
| In reply to | #44556 |
On Tue, 30 Apr 2013 10:41:56 -0700 (PDT), upendra kumar Devisetty
<upendrakumar.devisetty@googlemail.com> declaimed the following in
gmane.comp.python.general:
> I have a very basic question in python. I want to go through each line of the a csv file and compare to see if the first field of line 1 is same as first field of next line and so on. If it finds a match then i would like to put that field in an object1 else put that field in a different object2. Finally i would like to count how many of the fields in object1 vs object2. Can this be done in python? Here is a small example.
>
You are essentially describing a "control-break" (or "report break")
http://en.wikipedia.org/wiki/Control_break
The basic algorithm requires one to keep track of "previous" record
and do a comparison. While the "control" field is the same, you do one
action. When the control changes you close out the previous group and
start a new one.
> BRM_1 679 1929
> BRM_1 203 567
> BRM_2 367 1308
> BRM_3 435 509
> As you can see field1 of line1 is same as field2 of line2 and so that field BRM_1 should be place in object1 and BRM_2 and BRM_3 should be placed in object2. So the final numbers of object1 is 1 and object2 is 2.
>
Pseudo-code:
    control = None
    group = []
    for record in file:
        if control is not None and record[0] != control:
            output(group)       #close out previous group data
            group = []          #initialize new group data
        control = record[0]     #reset control break data
        group.append(record)    #add current record to group
    if group:
        output(group)           #handle non-empty last group
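One way the pseudo-code might be made runnable, as a sketch only. It assumes each record has already been split into a list of fields; the generator name `control_break` and the `object1`/`object2` lists are illustrative, with `yield` standing in for the unspecified `output()` step:

```python
def control_break(records):
    """Yield lists of consecutive records that share the same first field."""
    control = None
    group = []
    for record in records:
        if control is not None and record[0] != control:
            yield group          # close out previous group
            group = []           # start a new group
        control = record[0]      # remember the current control field
        group.append(record)     # add current record to the group
    if group:
        yield group              # handle non-empty last group


lines = ["BRM_1 679 1929", "BRM_1 203 567", "BRM_2 367 1308", "BRM_3 435 509"]
records = [line.split() for line in lines]

object1, object2 = [], []
for group in control_break(records):
    key = group[0][0]
    # Groups with more than one record had a matching neighbour.
    (object1 if len(group) > 1 else object2).append(key)

print(len(object1), len(object2))  # prints: 1 2
```

Like the pseudo-code, this only compares each record with the one before it, so it depends on the input being sorted by the control field.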
> Thanks in advance..
>
> Upendra
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/