Groups > comp.lang.python > #46157 > unrolled thread

how to compare two json file line by line using python?

Started by: Avnesh Shakya <avnesh.nitk@gmail.com>
First post: 2013-05-26 21:32 -0700
Last post: 2013-05-27 10:50 +0000
Articles: 7, 5 participants


Contents

  how to compare two json file line by line using python? Avnesh Shakya <avnesh.nitk@gmail.com> - 2013-05-26 21:32 -0700
    Re: how to compare two json file line by line using python? rusi <rustompmody@gmail.com> - 2013-05-26 21:51 -0700
      Re: how to compare two json file line by line using python? Avnesh Shakya <avnesh.nitk@gmail.com> - 2013-05-27 10:35 +0530
    Re: how to compare two json file line by line using python? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-05-27 05:33 +0000
      Re: how to compare two json file line by line using python? Avnesh Shakya <avnesh.nitk@gmail.com> - 2013-05-27 11:58 +0530
      Re: how to compare two json file line by line using python? Grant Edwards <invalid@invalid.invalid> - 2013-05-28 16:00 +0000
    Re: how to compare two json file line by line using python? Denis McMahon <denismfmcmahon@gmail.com> - 2013-05-27 10:50 +0000

#46157 — how to compare two json file line by line using python?

From: Avnesh Shakya <avnesh.nitk@gmail.com>
Date: 2013-05-26 21:32 -0700
Subject: how to compare two json file line by line using python?
Message-ID: <355f934e-bda0-4316-96bb-583c498ecb1a@googlegroups.com>

Hi,
   How can I compare two JSON files line by line using Python? This is what I am doing at the moment:

import simplejson as json

def compare():
    # readlines() exhausts the file, so read each file once and reuse the list.
    with open('newData.json') as newJsonFile:
        newLines = newJsonFile.readlines()
    with open('version1.json') as lastJsonFile:
        lastLines = lastJsonFile.readlines()
    print(newLines)
    sortedNew = sorted(repr(x) for x in newLines)
    sortedLast = sorted(repr(x) for x in lastLines)
    print(sortedNew == sortedLast)

compare()

But I want to compare line by line and value by value. I found that JSON data is unordered, so how can I compare the files without sorting them? Please give me some ideas; I am new to this.
I want to check every value line by line.

Thanks



#46161

From: rusi <rustompmody@gmail.com>
Date: 2013-05-26 21:51 -0700
Message-ID: <bb5d992b-4592-4777-ae47-3b3e090703a1@d8g2000pbe.googlegroups.com>
In reply to: #46157

It really depends on what you mean by the two files being "the same".

For example, does extra or deleted non-significant whitespace matter?

By and large there are two approaches:
1. Treat JSON as serialized Python data structures: read the data
structures into Python and compare them there.

2. Ignore the fact that the file is JSON; just treat it as text and use
string comparison operations.

Naturally there could be other considerations: the files could be huge,
so you might want some hybrid of the JSON and text approaches, and so on.
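
As a rough illustration of the two approaches above, here is a minimal sketch that works on in-memory strings rather than files. The function names and the sample documents are made up for the example; only `json.loads` comes from the standard library.

```python
import json

def same_as_data(text_a, text_b):
    """Approach 1: decode both documents and compare the Python objects."""
    return json.loads(text_a) == json.loads(text_b)

def same_as_text(text_a, text_b):
    """Approach 2: ignore that it is JSON and compare the raw strings."""
    return text_a == text_b

# Hypothetical sample documents: same data, different key order and spacing.
doc_a = '{"name": "spam", "count": 2}'
doc_b = '{"count": 2,\n "name": "spam"}'

print(same_as_data(doc_a, doc_b))  # compares the decoded values
print(same_as_text(doc_a, doc_b))  # compares character by character
```

With these samples the data comparison reports a match while the text comparison does not, which is why the notion of "same" has to be settled first.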



#46164

From: Avnesh Shakya <avnesh.nitk@gmail.com>
Date: 2013-05-27 10:35 +0530
Message-ID: <mailman.2230.1369631143.3114.python-list@python.org>
In reply to: #46161


Actually, I am extracting data from another site in JSON format and
putting it in my database. When I extract the data again, I want to
compare it with the last JSON file: if they are the same there is nothing
to do, otherwise I will add the new data to the database. The data may or
may not change each time, so I think sorting is required, but it would be
good if I could compare line by line. That is how I am thinking about it...
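
For the workflow described here (fetch, compare with the previous fetch, store only what is new), one possible sketch is below. It assumes, purely for illustration, that each file holds a JSON array of objects; the helper names `canonical` and `new_records` and the sample records are hypothetical. Each record is serialized with `json.dumps(..., sort_keys=True)` so that key order inside a record does not affect the comparison.

```python
import json

def canonical(record):
    """Serialize a record with sorted keys so equal records give equal strings."""
    return json.dumps(record, sort_keys=True)

def new_records(old_text, new_text):
    """Return records present in new_text but not in old_text.

    Assumes (hypothetically) that each document is a JSON array of objects.
    """
    seen = {canonical(r) for r in json.loads(old_text)}
    return [r for r in json.loads(new_text) if canonical(r) not in seen]

# Made-up sample data: the new fetch reorders records and keys and adds id 3.
old = '[{"id": 1, "title": "a"}, {"id": 2, "title": "b"}]'
new = '[{"title": "b", "id": 2}, {"id": 3, "title": "c"}, {"id": 1, "title": "a"}]'

for record in new_records(old, new):
    print(record)  # candidates to insert into the database
```

No sorting of lines is needed: the comparison is done on decoded records, so neither the order of records nor the order of keys matters.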





#46167

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-05-27 05:33 +0000
Message-ID: <51a2f016$0$1744$c3e8da3$76491128@news.astraweb.com>
In reply to: #46157

On Sun, 26 May 2013 21:32:40 -0700, Avnesh Shakya wrote:

> But I want to compare line by line and value by value. but i found that
> json data is unordered data, so how can i compare them without sorting
> it. please give me some idea about it. I am new for it. I want to check
> every value line by line.

Why do you care about checking every value line by line? As you say 
yourself, JSON data is unordered, so "line by line" is the wrong way to 
compare it.


The right way is to decode the JSON data, and then compare whether it 
gives you the result you expect:

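import json

# json.load() takes an open file object, not a file name.
with open("file-a") as fa, open("file-b") as fb:
    a = json.load(fa)
    b = json.load(fb)
if a == b:
    print("file-a and file-b contain the same JSON data")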

If what you care about is the *data* stored in the JSON file, this is the 
correct way to check it.
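
One caveat worth keeping in mind, shown in the small sketch below with made-up sample strings: after decoding, JSON objects become dicts, whose equality ignores key order, but JSON arrays become lists, whose equality does depend on element order.

```python
import json

# Objects: key order is not significant, so these compare equal.
a = json.loads('{"x": 1, "y": [1, 2, 3]}')
b = json.loads('{"y": [1, 2, 3], "x": 1}')
print(a == b)

# Arrays: element order is significant, so these do not compare equal.
c = json.loads('{"x": 1, "y": [3, 2, 1]}')
print(a == c)
```

If the order of array elements should not matter for your data, that has to be handled separately, for example by sorting those lists before comparing.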

On the other hand, if you don't care about the data, but you want to 
detect changes to whitespace, blank lines, or other changes that make no 
difference to the JSON data, then there is no need to care that this is 
JSON data. Just treat it as text, and use the difflib library.

http://docs.python.org/2/library/difflib.html
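
A minimal sketch of the text-based route using `difflib.unified_diff`. The helper name `text_diff` and the sample strings are invented for the example; the file names are only labels borrowed from the original post.

```python
import difflib

def text_diff(old_text, new_text):
    """Return a unified diff of two strings as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="version1.json",  # labels only; nothing is read from disk
        tofile="newData.json",
        lineterm="",
    ))

# Made-up sample contents differing in one value.
old = '{\n  "a": 1\n}'
new = '{\n  "a": 2\n}'
print("\n".join(text_diff(old, new)))
```

An empty result means the two texts are identical line for line, including whitespace.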


-- 
Steven



#46174

From: Avnesh Shakya <avnesh.nitk@gmail.com>
Date: 2013-05-27 11:58 +0530
Message-ID: <mailman.2237.1369636125.3114.python-list@python.org>
In reply to: #46167


Thanks a lot, I got it.





#46301

From: Grant Edwards <invalid@invalid.invalid>
Date: 2013-05-28 16:00 +0000
Message-ID: <ko2kae$9oh$3@reader1.panix.com>
In reply to: #46167

On 2013-05-27, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> On Sun, 26 May 2013 21:32:40 -0700, Avnesh Shakya wrote:
>
>> But I want to compare line by line and value by value. but i found that
>> json data is unordered data, so how can i compare them without sorting
>> it. please give me some idea about it. I am new for it. I want to check
>> every value line by line.
>
> Why do you care about checking every value line by line? As you say 
> yourself, JSON data is unordered, so "line by line" is the wrong way to 
> compare it.

There's no such thing as "lines" in JSON anyway. Outside of string
literals, all whitespace is equivalent, so replacing all newlines with
space characters results in equivalent blobs of JSON -- but one is a
single line, and the other is multiple lines.
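
A quick way to convince yourself of this, as a sketch with a made-up sample document: replace the newlines with spaces and compare both the raw text and the decoded data.

```python
import json

multi_line = '{\n  "name": "spam",\n  "tags": ["a", "b"]\n}'
# Replace every newline with a space, as described above.
single_line = multi_line.replace("\n", " ")

print(multi_line == single_line)                          # text comparison
print(json.loads(multi_line) == json.loads(single_line))  # data comparison
```

The text comparison reports a difference while the decoded data is equal, so a line-by-line check would flag a change that is not one.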

> The right way is to decode the JSON data, and then compare whether it 
> gives you the result you expect:
>
> import json
>
> with open("file-a") as fa, open("file-b") as fb:
>     a = json.load(fa)
>     b = json.load(fb)
> if a == b:
>     print("file-a and file-b contain the same JSON data")
>
> If what you care about is the *data* stored in the JSON file, this is
> the correct way to check it.

Yup.

-- 
Grant Edwards               grant.b.edwards        Yow! Are we laid back yet?
                                  at               
                              gmail.com            



#46187

From: Denis McMahon <denismfmcmahon@gmail.com>
Date: 2013-05-27 10:50 +0000
Message-ID: <knvdp8$gvn$4@dont-email.me>
In reply to: #46157

On Sun, 26 May 2013 21:32:40 -0700, Avnesh Shakya wrote:

>    how to compare two json file line by line using python? Actually I am
>    doing it in this way..

Oh what a lot of homework you have today.

Did you ever stop to think what the easiest way to compare two json 
datasets is?

-- 
Denis McMahon, denismfmcmahon@gmail.com
