Decoding JSON file using python

Started by Karthik Sharma <karthik.sharma@gmail.com>
First post 2015-05-27 15:23 -0700
Last post 2015-05-28 18:48 +0000
Articles 11 — 6 participants



Contents

  Decoding JSON file using python Karthik Sharma <karthik.sharma@gmail.com> - 2015-05-27 15:23 -0700
    Re: Decoding JSON file using python random832@fastmail.us - 2015-05-27 19:06 -0400
    Re: Decoding JSON file using python MRAB <python@mrabarnett.plus.com> - 2015-05-28 00:08 +0100
    Re: Decoding JSON file using python Cameron Simpson <cs@zip.com.au> - 2015-05-28 08:52 +1000
      Re: Decoding JSON file using python Karthik Sharma <karthik.sharma@gmail.com> - 2015-05-27 16:51 -0700
        Re: Decoding JSON file using python Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-05-28 01:38 +0000
          Re: Decoding JSON file using python Cameron Simpson <cs@zip.com.au> - 2015-05-28 13:32 +1000
            Re: Decoding JSON file using python Denis McMahon <denismfmcmahon@gmail.com> - 2015-05-28 18:49 +0000
        Re: Decoding JSON file using python random832@fastmail.us - 2015-05-27 23:48 -0400
        Re: Decoding JSON file using python MRAB <python@mrabarnett.plus.com> - 2015-05-28 04:54 +0100
    Re: Decoding JSON file using python Denis McMahon <denismfmcmahon@gmail.com> - 2015-05-28 18:48 +0000

#91329 — Decoding JSON file using python

From Karthik Sharma <karthik.sharma@gmail.com>
Date 2015-05-27 15:23 -0700
Subject Decoding JSON file using python
Message-ID <11dbf982-efc1-4ef0-b286-e3d08ba5030e@googlegroups.com>
I have the JSON structure shown below and the python code shown below to manipulate the JSON structure.

    import json
    
    json_input = { 
        "msgType": "0",
        "tid": "1",
        "data": "[{\"Severity\":\"warn\",\"Subject\":\"Reporting \",\"Message\":\"tdetails:{\\\"Product\\\":\\\"Gecko\\\",\\\"CPUs\\\":8,\\\"Language\\\":\\\"en-GB\\\",\\\"isEvent\\\":\\\">
        "Timestamp": "1432703193431",
        "Host": "myserver.com",
        "Agent": "Experience",
        "AppName": "undefined",
        "AppInstance": "my_server",
        "Group": "UndefinedGroup"
    }
    
    
    data = json_input['data']
    tdetails = data['Message']
    print('json_input {} \n\ndata {} \n\n tdetails {}\n\n'.format(json_input,data,tdetails))  

I am getting the error.

    Traceback (most recent call last):
      File "test_json.py", line 19, in <module>
        tdetails = data['Message']
    TypeError: string indices must be integers, not str

The JSON structure is valid according to http://jsonlint.com/
I want to be able to access the different fields inside `data`, such as `Severity` and `Subject`, and also the fields inside `tdetails`, such as `CPUs` and `Product`. How do I do this?



#91330

From random832@fastmail.us
Date 2015-05-27 19:06 -0400
Message-ID <mailman.104.1432768013.5151.python-list@python.org>
In reply to #91329
On Wed, May 27, 2015, at 18:23, Karthik Sharma wrote:
> The JSON structure is valid as shown by http://jsonlint.com/
>    I want to be able to access the different fields inside `data` such as
>    `severity`, `subject` and also fields inside `tdetails` such as `CPUs`
>    and `Product`. How do I do this?

Your data object is a string; it doesn't have anything inside it. If you
want it to be a JSON object, you'll need to actually use the JSON parser
on it.
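
To make that concrete, here is a minimal sketch, assuming a made-up one-object payload (the `data` value below is illustrative, not the original poster's string). It shows the same text before and after it goes through the parser:

```python
import json

# Illustrative payload only; not the data from the original post.
data = '{"Severity": "warn", "Subject": "Reporting"}'

# Before parsing, data is a plain str, so a key lookup fails.
try:
    data["Severity"]
    lookup_error = None
except TypeError as exc:
    lookup_error = exc

# After parsing, the same text is a dict with real keys.
parsed = json.loads(data)
severity = parsed["Severity"]

print(type(data).__name__, type(parsed).__name__, severity)
```

The TypeError caught here is the same kind of error as in the traceback above: a str only accepts integer (or slice) indices.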



#91331

From MRAB <python@mrabarnett.plus.com>
Date 2015-05-28 00:08 +0100
Message-ID <mailman.105.1432768131.5151.python-list@python.org>
In reply to #91329
On 2015-05-27 23:23, Karthik Sharma wrote:
> I have the JSON structure shown below and the python code shown below to manipulate the JSON structure.
>
>      import json
>
>      json_input = {
>          "msgType": "0",
>          "tid": "1",
>          "data": "[{\"Severity\":\"warn\",\"Subject\":\"Reporting \",\"Message\":\"tdetails:{\\\"Product\\\":\\\"Gecko\\\",\\\"CPUs\\\":8,\\\"Language\\\":\\\"en-GB\\\",\\\"isEvent\\\":\\\">
>          "Timestamp": "1432703193431",
>          "Host": "myserver.com",
>          "Agent": "Experience",
>          "AppName": "undefined",
>          "AppInstance": "my_server",
>          "Group": "UndefinedGroup"
>      }
>
>
>      data = json_input['data']
>      tdetails = data['Message']
>      print('json_input {} \n\ndata {} \n\n tdetails {}\n\n'.format(json_input,data,tdetails))
>
> I am getting the error.
>
>      Traceback (most recent call last):
>        File "test_json.py", line 19, in <module>
>          tdetails = data['Message']
>      TypeError: string indices must be integers, not str
>
> The JSON structure is valid as shown by http://jsonlint.com/
>     I want to be able to access the different fields inside `data` such as `severity`, `subject` and also fields inside `tdetails` such as `CPUs` and `Product`. How do I do this?
>
json_input['data'] is a string, not a dict.



#91334

From Cameron Simpson <cs@zip.com.au>
Date 2015-05-28 08:52 +1000
Message-ID <mailman.108.1432768474.5151.python-list@python.org>
In reply to #91329
On 27May2015 15:23, Karthik Sharma <karthik.sharma@gmail.com> wrote:
>I have the JSON structure shown below and the python code shown below to manipulate the JSON structure.
>
>    import json
>
>    json_input = {
>        "msgType": "0",
>        "tid": "1",
>        "data": "[{\"Severity\":\"warn\",\"Subject\":\"Reporting \",\"Message\":\"tdetails:{\\\"Product\\\":\\\"Gecko\\\",\\\"CPUs\\\":8,\\\"Language\\\":\\\"en-GB\\\",\\\"isEvent\\\":\\\">
>        "Timestamp": "1432703193431",
>        "Host": "myserver.com",
>        "Agent": "Experience",
>        "AppName": "undefined",
>        "AppInstance": "my_server",
>        "Group": "UndefinedGroup"
>    }
>
>
>    data = json_input['data']
>    tdetails = data['Message']
>    print('json_input {} \n\ndata {} \n\n tdetails {}\n\n'.format(json_input,data,tdetails))
>
>I am getting the error.
>
>    Traceback (most recent call last):
>      File "test_json.py", line 19, in <module>
>        tdetails = data['Message']
>    TypeError: string indices must be integers, not str

That will be because of this line:

  tdetails = data['Message']

because "data" is just the string from your dict.

>The JSON structure is valid as shown by http://jsonlint.com/
>   I want to be able to access the different fields inside `data` such as `severity`, `subject` and also fields inside `tdetails` such as `CPUs` and `Product`. How do I do this?

Then you need to decode "data". Example (untested):

  data_decoded = json.loads(data)

and then access:

  data_decoded['Message']

Cheers,
Cameron Simpson <cs@zip.com.au>

Here's a great .sig I wrote, so good it doesn't rhyme.
        Jon Benger <jbenger@agravaine.st.nepean.uws.edu.au>



#91337

From Karthik Sharma <karthik.sharma@gmail.com>
Date 2015-05-27 16:51 -0700
Message-ID <75320b35-72b6-4aae-a47f-630bd7bef812@googlegroups.com>
In reply to #91334
I tried modifying the program as follows as per your suggestion. Doesn't seem to work.

import simplejson as json
import cjson

json_input = { "msgType": "0",
    "tid": "1",
    "data": "[{\"Severity\":\"warn\",\"Subject\":\"Reporting \",\"Message\":\"tdetails:{\\\"Product\\\":\\\"Gecko\\\",\\\"CPUs\\\":8,\\\"Language\\\":\\\"en-GB\\\",\\\"isEvent\\\":\\\">
    "Timestamp": "1432703193431",
    "Host": "myserver.com",
    "Agent": "Experience",
    "AppName": "undefined",
    "AppInstance": "my_server",
    "Group": "UndefinedGroup"
}

print('json input original  {} \n\n'.format(json_input))

data = json_input['data']

print('data {} \n\n'.format(data))

message = json.loads(data)

print('message {} \n\n'.format(message['Message']))

I get the following error.

Traceback (most recent call last):
  File "test_json.py", line 23, in <module>
    print('message {} \n\n'.format(message['Message']))
TypeError: list indices must be integers, not str
karthik.sharma@aukksharma2:~$ vim test_json.py 








#91341

From Jon Ribbens <jon+usenet@unequivocal.co.uk>
Date 2015-05-28 01:38 +0000
Message-ID <slrnmmcsfd.1t4.jon+usenet@frosty.unequivocal.co.uk>
In reply to #91337
On 2015-05-27, Karthik Sharma <karthik.sharma@gmail.com> wrote:
> I tried modifying the program as follows as per your
> suggestion. Doesn't seem to work.

That's because you didn't modify the program as per their suggestion,
you made completely different changes that bore no relation to what
they said.

> import cjson

What do you think 'import json' (or 'import cjson', or whatever) does?
Hint: it doesn't cause Python to henceforth treat all strings as if
they might contain JSON and to interpret them as such just in case.
As Cameron already said, you have to actually *call* the JSON decoder
to convert the data from its JSON string representation into Python
objects.



#91348

From Cameron Simpson <cs@zip.com.au>
Date 2015-05-28 13:32 +1000
Message-ID <mailman.117.1432785446.5151.python-list@python.org>
In reply to #91341
On 28May2015 01:38, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
>On 2015-05-27, Karthik Sharma <karthik.sharma@gmail.com> wrote:
>> I tried modifying the program as follows as per your
>> suggestion. Doesn't seem to work.
>
>That's because you didn't modify the program as per their suggestion,
>you made completely different changes that bore no relation to what
>they said.

Actually, his changes looked good to me. He does print from data first, but 
only for debugging. Then he goes:

  message = json.loads(data)

and tried to access message['Message'].

However I am having trouble reproducing his issue because his quoted code is 
incorrect. I've tried to fix it, as listed below, but I don't know what is 
really meant to be in the "data" string.

Hint: Karthik, use raw strings (r'foo') for complicated strings - it avoids a 
lot of backslashing etc, and thus avoids errors in backslashing.
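
As a rough illustration of the raw-string hint (a sketch with a made-up nested payload, not the original "data" value), the two literals below spell the same text, one with Python-level escaping and one as a raw string:

```python
import json

# Illustrative only: one level of JSON nested inside a JSON string.
# With a normal literal, every backslash and quote needs Python escaping.
escaped = "[{\"Message\": \"{\\\"CPUs\\\": 8}\"}]"

# With a raw literal, only the JSON-level escapes remain visible.
raw = r'[{"Message": "{\"CPUs\": 8}"}]'

same_text = escaped == raw

outer = json.loads(raw)               # list containing one dict
inner_text = outer[0]["Message"]      # still a str holding JSON
inner = json.loads(inner_text)        # dict

print(same_text, inner_text, inner["CPUs"])
```

The raw form still needs the backslashes that JSON itself requires for the nested string; it only removes the extra layer that Python's own string syntax would add.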

Karthik, please correct the code below. Currently I do not get your exception, 
I get an exception from the JSON module parsing the "data" string.

Code below,
Cameron Simpson <cs@zip.com.au>

import json

json_input = { "msgType": "0",
    "tid": "1",
    "data": r'[{"Severity":"warn","Subject":"Reporting ","Message":"tdetails": {"Product":"Gecko","CPUs":8,"Language":"en-GB","isEvent":">}]',
    "Timestamp": "1432703193431",
    "Host": "myserver.com",
    "Agent": "Experience",
    "AppName": "undefined",
    "AppInstance": "my_server",
    "Group": "UndefinedGroup"
}

print('json input original  {} \n\n'.format(json_input))

data = json_input['data']
print('data {} \n\n'.format(data))

message = json.loads(data)
print('message {} \n\n'.format(message['Message']))



#91383

From Denis McMahon <denismfmcmahon@gmail.com>
Date 2015-05-28 18:49 +0000
Message-ID <mk7nvv$qj$3@dont-email.me>
In reply to #91348
On Thu, 28 May 2015 13:32:39 +1000, Cameron Simpson wrote:

> On 28May2015 01:38, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
>>On 2015-05-27, Karthik Sharma <karthik.sharma@gmail.com> wrote:
>>> I tried modifying the program as follows as per your
>>> suggestion. Doesn't seem to work.
>>
>>That's because you didn't modify the program as per their suggestion,
>>you made completely different changes that bore no relation to what they
>>said.
> 
> Actually, his changes looked good to me. He does print from data first,
> but only for debugging. Then he goes:
> 
>   message = json.loads(data)
> 
> and tried to access message['Message'].
> 
> However I am having trouble reproducing his issue because his quoted
> code is incorrect. I've tried to fix it, as listed below, but I don't
> know what is really meant to be in the "data" string.

It looks like data is a broken array of one object, part of which is a 
further quoted JSON string.

There should be a string value after the isEvent but I have no idea what 
it should be, nor what else should come after.

"Message":"tdetails":{att:val pairs}

is also wrong in the first level of inner JSON.

I think he wants data[0]['Message'], but the inner JSON strings are 
broken too, and should look more like:

data: "[{\"Severity\":\"warn\",\"Subject\":\"Reporting\",\"Message\":\"tdetails\",\"attr_name\":\"{\\\"Product\\\":\\\"Gecko\\\",\\\"CPUs\\\":8,\\\"Language\\\":\\\"en-GB\\\",\\\"isEvent\\\":\\\"attr_value\\\"}\"}]",

In this model, if the data attribute is a json string, then

data maps to a list / array

data[0] maps to an object / dictionary

data[0]["Message"] maps to the string literal "tdetails"

data[0]["attr_name"] maps to a string representation of a JSON object with 
another level of escaping.

That string can then be loaded, eg:

attr_name = json.loads(data[0]["attr_name"])

See: http://www.sined.co.uk/python/nested_json.py.txt
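
A self-contained sketch of the model described above, assuming the corrected layout with the placeholder names "attr_name" and "attr_value" (this is not the script at the URL, and the real payload may differ). The nested strings are generated with json.dumps so that no escaping has to be typed by hand:

```python
import json

# Inner object, using the placeholder value "attr_value" from the model above.
inner = {"Product": "Gecko", "CPUs": 8, "Language": "en-GB", "isEvent": "attr_value"}

# One record whose "attr_name" field holds the inner object as a JSON string.
record = {
    "Severity": "warn",
    "Subject": "Reporting",
    "Message": "tdetails",
    "attr_name": json.dumps(inner),
}

# The outer dict stores the whole list as yet another JSON string.
json_input = {"data": json.dumps([record])}

data = json.loads(json_input["data"])        # list / array
first = data[0]                              # object / dictionary
attr_name = json.loads(first["attr_name"])   # second decode for the nested string

print(first["Message"], attr_name["Product"], attr_name["CPUs"])
```

Each level of string nesting needs its own json.loads call; decoding the outer string does not decode the strings inside it.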


-- 
Denis McMahon, denismfmcmahon@gmail.com



#91346

From random832@fastmail.us
Date 2015-05-27 23:48 -0400
Message-ID <mailman.115.1432784915.5151.python-list@python.org>
In reply to #91337

On Wed, May 27, 2015, at 19:51, Karthik Sharma wrote:
> I get the following error.
> 
> Traceback (most recent call last):
>   File "test_json.py", line 23, in <module>
>     print('message {} \n\n'.format(message['Message']))
> TypeError: list indices must be integers, not str
> karthik.sharma@aukksharma2:~$ vim test_json.py 

The string in data appears to be a json _list_ (note the first character
is a square bracket) - try message[0]['Message'].
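
A small sketch of that indexing, assuming a stand-in payload (the "Message" value here is made up, since the original string is truncated):

```python
import json

# Stand-in for the "data" string: a JSON list holding one object.
data = '[{"Severity": "warn", "Message": "example text"}]'

message = json.loads(data)

# message is a list, so pick the element first, then the key.
first = message[0]
text = first["Message"]

print(text)
```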

You should really be able to figure this out yourself.



#91347

From MRAB <python@mrabarnett.plus.com>
Date 2015-05-28 04:54 +0100
Message-ID <mailman.116.1432785275.5151.python-list@python.org>
In reply to #91337
On 2015-05-28 00:51, Karthik Sharma wrote:
> I tried modifying the program as follows as per your suggestion. Doesn't seem to work.
>
> import simplejson as json
> import cjson
>
> json_input = { "msgType": "0",
>      "tid": "1",
>      "data": "[{\"Severity\":\"warn\",\"Subject\":\"Reporting \",\"Message\":\"tdetails:{\\\"Product\\\":\\\"Gecko\\\",\\\"CPUs\\\":8,\\\"Language\\\":\\\"en-GB\\\",\\\"isEvent\\\":\\\">
>      "Timestamp": "1432703193431",
>      "Host": "myserver.com",
>      "Agent": "Experience",
>      "AppName": "undefined",
>      "AppInstance": "my_server",
>      "Group": "UndefinedGroup"
> }
>
> print('json input original  {} \n\n'.format(json_input))
>
> data = json_input['data']
>
> print('data {} \n\n'.format(data))
>
> message = json.loads(data)
>
> print('message {} \n\n'.format(message['Message']))
>
> I get the following error.
>
> Traceback (most recent call last):
>    File "test_json.py", line 23, in <module>
>      print('message {} \n\n'.format(message['Message']))
> TypeError: list indices must be integers, not str
> karthik.sharma@aukksharma2:~$ vim test_json.py
>
You have a different error message now. It says that 'message' is a
list.

Have a look at the JSON data. It represents a list that contains a dict.
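
One way to check this, sketched with a made-up two-record payload (the second record is hypothetical), is to look at the types after decoding and then loop over the list rather than indexing it with a key:

```python
import json

# Hypothetical payload: a JSON list with two objects.
data = '[{"Severity": "warn", "Subject": "Reporting"}, {"Severity": "info", "Subject": "Startup"}]'

message = json.loads(data)

# Inspect what the decoder actually produced.
kinds = (type(message).__name__, type(message[0]).__name__)

# Each element of the list is a dict, so key lookups work per element.
severities = [entry["Severity"] for entry in message]

print(kinds, severities)
```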




#91382

From Denis McMahon <denismfmcmahon@gmail.com>
Date 2015-05-28 18:48 +0000
Message-ID <mk7nuq$qj$2@dont-email.me>
In reply to #91329
On Wed, 27 May 2015 15:23:31 -0700, Karthik Sharma wrote:

> The JSON structure is valid as shown by http://jsonlint.com/

Not when I paste it in it's not. The "data" attribute is an unterminated 
string and is not followed by a comma.
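
The same check can be done from Python instead of a web validator. A minimal sketch, assuming a deliberately broken sample (not the original post's text) and a hypothetical helper named try_parse:

```python
import json

def try_parse(text):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except ValueError as exc:  # the json module's decode errors are ValueErrors
        return None, str(exc)

# Illustrative broken input: the "data" string is never closed.
broken = '{"tid": "1", "data": "unterminated}'
valid = '{"tid": "1"}'

broken_value, broken_error = try_parse(broken)
valid_value, valid_error = try_parse(valid)

print(broken_error)
print(valid_value)
```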

-- 
Denis McMahon, denismfmcmahon@gmail.com


