| Newsgroups | comp.lang.python |
|---|---|
| Date | 2013-05-27 13:47 -0700 |
| Message-ID | <10be5c62-4c58-4b4f-b00a-82d85ee4ef8e@googlegroups.com> |
| Subject | Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() |
| From | Bryan Britten <britten.bryan@gmail.com> |
Hey, everyone!

I'm very new to Python and have only been using it for a couple of days, but have some experience in programming (albeit mostly statistical programming in SAS or R), so I'm hoping someone can answer this question in a technical way, but without using an abundant amount of jargon.

The issue I'm having is that I'm trying to pull information from a website to practice Python with, but I'm having trouble getting the data in a timely fashion. If I use the following code:

<code>
import json
import urllib

urlStr = "https://stream.twitter.com/1/statuses/sample.json"
twtrDict = [json.loads(line) for line in urllib.urlopen(urlStr)]
</code>

I get a memory issue. I'm running 32-bit Python 2.7 with 4 gigs of RAM, if that helps at all.

If I use the following code:

<code>
import urllib

urlStr = "https://stream.twitter.com/1/statuses/sample.json"
fileHandle = urllib.urlopen(urlStr)
twtrText = fileHandle.readlines()
</code>

It takes hours (upwards of 6 or 7, if not more) to finish computing the last command.

With that being said, my question is whether there is a more efficient manner to do this. I'm worried that if it's taking this long to process the .readlines() command, trying to work with the data is going to be a computational nightmare.

Thanks in advance for any insights or advice!
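For readers following along, here is one possible way to frame the problem. Both snippets in the post collect every line before doing anything with it, which can never finish if the URL is a stream that stays open (an assumption here; the post does not say so). The sketch below parses lazily and stops after a fixed number of objects, assuming one JSON object per line. The helper name `iter_json_lines` and its `limit` parameter are hypothetical, and an in-memory list stands in for the network response so the example runs without a connection.

```python
import itertools
import json


def iter_json_lines(lines, limit=None):
    """Lazily yield one parsed object per non-blank line.

    `lines` is any iterable of text lines (hypothetical stand-in for
    the file-like object returned by urlopen). `limit` caps how many
    objects are produced; None means no cap.
    """
    # Generator expression: nothing is read or parsed until consumed.
    parsed = (json.loads(line) for line in lines if line.strip())
    # islice stops pulling from the source once `limit` items are out.
    return itertools.islice(parsed, limit)


# In-memory stand-in for a response body, including a blank line.
sample_lines = [
    '{"id": 1, "text": "a"}\n',
    '\n',
    '{"id": 2, "text": "b"}\n',
    '{"id": 3, "text": "c"}\n',
]

first_two = list(iter_json_lines(sample_lines, limit=2))

# The same helper terminates even on a source that never ends.
endless = itertools.repeat('{"id": 0}\n')
five_from_endless = list(iter_json_lines(endless, limit=5))
```

Because only `limit` objects are ever held at once by the caller's choice, memory use stays bounded no matter how much data the source could produce. Whether this fits the original use case depends on details the post leaves open, such as whether the endpoint requires authentication.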
Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Bryan Britten <britten.bryan@gmail.com> - 2013-05-27 13:47 -0700
Re: Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Roy Smith <roy@panix.com> - 2013-05-27 16:56 -0400
Re: Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Bryan Britten <britten.bryan@gmail.com> - 2013-05-27 14:29 -0700
Re: Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Denis McMahon <denismfmcmahon@gmail.com> - 2013-05-27 21:35 +0000
Re: Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Fábio Santos <fabiosantosart@gmail.com> - 2013-05-28 00:36 +0100
Re: Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Dave Angel <davea@davea.name> - 2013-05-27 19:58 -0400
Re: Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Bryan Britten <britten.bryan@gmail.com> - 2013-05-27 20:11 -0700
Re: Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Fábio Santos <fabiosantosart@gmail.com> - 2013-05-28 08:31 +0100
Re: Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Bryan Britten <britten.bryan@gmail.com> - 2013-05-28 07:32 -0700
Re: Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Alister <alister.ware@ntlworld.com> - 2013-05-28 17:52 +0000
Re: Reading *.json from URL - json.loads() versus urllib.urlopen.readlines() Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-05-27 21:40 -0400