


Html Parsing stuff

Newsgroups comp.lang.python
Date Mon, 21 Jul 2014 00:20:13 -0700 (PDT)
Message-ID <b0caa90b-bf6b-4d3c-b6ff-43246e006b70@googlegroups.com> (permalink)
Subject Html Parsing stuff
From Nicholas Cannon <nicholascannon1@gmail.com>



OK, I get the basics of this, and I have been doing some successful parsing, using regular expressions to find HTML tags. I have tried to find an img tag and write that image to a file, but I have had no success. My try...except statement reports that the image was written to the file successfully, but when I try to open the file, it says the image has not been saved correctly or is damaged. That first attempt just read the src attribute of the tag and tried to save that link to a .jpg file (.jpg being the extension of the image).

So I looked deeper: I added a forward slash to the URL and then appended the image's src attribute to it. I then opened that link with urllib.urlopen(), read the contents, and saved them to the file again. I still got the same result as before.

Is there a function in Beautiful Soup or the urllib module that I can use to save an image? This is just a problem I am sorting out, not a whole application, so the code is small. Thanks
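
For reference, here is a minimal sketch of one way the steps described above could be done. It is not the code from this post (none was included). The sketch assumes Python 3 and only the standard library, whereas the post's urllib.urlopen() is the Python 2 spelling. The names ImgSrcCollector, find_image_urls and save_image are made up for illustration. Two things it does that often cause a "damaged image" result when they are missing: it resolves the src attribute against the page URL with urljoin() rather than gluing strings together with a slash, and it writes the downloaded bytes to a file opened in binary mode ("wb").

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ImgSrcCollector(HTMLParser):
    """Hypothetical helper: collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_image_urls(html, page_url):
    """Return absolute URLs for all <img src=...> values found in html."""
    collector = ImgSrcCollector()
    collector.feed(html)
    # urljoin handles relative paths, root-relative paths and full URLs.
    return [urljoin(page_url, src) for src in collector.sources]


def save_image(image_url, filename):
    """Download image_url and write the raw bytes to filename."""
    with urlopen(image_url) as response:
        data = response.read()
    # Binary mode matters: text mode can corrupt image data on some platforms.
    with open(filename, "wb") as out:
        out.write(data)
    return len(data)
```

A possible usage, with a placeholder page URL: fetch the page with urlopen(), decode it, pass the text and the page URL to find_image_urls(), then call save_image(urls[0], "picture.jpg"). If the saved file still will not open, it may be worth checking whether the server returned an HTML error page instead of the image.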



Thread

Html Parsing stuff Nicholas Cannon <nicholascannon1@gmail.com> - 2014-07-21 00:20 -0700
  Re: Html Parsing stuff Nicholas Cannon <nicholascannon1@gmail.com> - 2014-07-21 02:13 -0700
