| Date | 2013-02-22 12:05 -0500 |
|---|---|
| From | Dave Angel <davea@davea.name> |
| Subject | Re: Urllib's urlopen and urlretrieve |
| References | <34998ea2-6b19-4a98-8ea0-389aca0192ca@googlegroups.com> <mailman.2162.1361451589.2939.python-list@python.org> <07234607-bd77-4ecb-8a19-3c71e9b4f0b4@googlegroups.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.2285.1361552757.2939.python-list@python.org> |
On 02/22/2013 12:09 AM, qoresucks@gmail.com wrote:
> Initially I was just trying the html, but later when I attempted more complicated sites that weren't my own I noticed that large bulks of the site were lost in the process. The urllib code essentially looks like what I was trying but it didn't work as I had expected.
>
> To be more specific, after I got it working for my own little page, I attempted to take it further and get all the lessons from Learn Python The Hard Way. When I tried the same method on the first intro page to see if I was even getting it right, the html code was all there but upon opening it I noticed the format was all wrong, colors were off for the background, images, etc... were all missing.

So how are you opening this html? In a text editor that somehow added colors? Or were you opening it in a browser?

In order for a browser to render a non-trivial page, it may need lots of files other than the html. Colors, for example, can be specified inline, in the header, or in an external css file. If the page was designed to use the external css, and it's missing or not in the right location, then the browser is going to get the colors wrong.

Further, if the location (url) is relative, then you can create a similar directory structure, and the browser will find it. But if it's absolute, then the browser is going to try to go out to the web to fetch it. If it succeeds, then it's masking the fact that you haven't downloaded the "whole web site."

The same is true for other external refs. It may be impossible to host it elsewhere if there are any absolute urls.

--
DaveA
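
To make the relative-versus-absolute point concrete, here is a minimal sketch (not from the original post) that lists the external files a page refers to and says which kind each reference is. It assumes Python 3 and only the standard library; the names `ResourceFinder` and `classify`, the sample markup, and the `example.com` URLs are all made up for illustration, and it only looks at `link`, `img`, and `script` tags, so a real page may need more than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResourceFinder(HTMLParser):
    """Collect references to external files (css, images, scripts)."""

    # Hypothetical, deliberately short mapping: tag -> attribute holding the ref.
    REF_ATTRS = {"link": "href", "img": "src", "script": "src"}

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        wanted = self.REF_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.refs.append(value)


def classify(html, page_url):
    """Return (ref, kind, resolved_url) tuples in document order.

    A ref with a scheme or host is "absolute": the browser fetches it from
    the web no matter where the saved html lives. Anything else is
    "relative" and is looked up next to the page itself.
    """
    finder = ResourceFinder()
    finder.feed(html)
    rows = []
    for ref in finder.refs:
        parts = urlparse(ref)
        kind = "absolute" if parts.scheme or parts.netloc else "relative"
        rows.append((ref, kind, urljoin(page_url, ref)))
    return rows


# Made-up page used only to exercise the sketch.
sample = """<html><head>
<link rel="stylesheet" href="css/main.css">
<script src="http://cdn.example.com/lib.js"></script>
</head><body><img src="images/logo.png"></body></html>"""

for ref, kind, resolved in classify(sample, "http://example.com/book/intro.html"):
    print(kind, ref, "->", resolved)
```

The relative entries are the ones you would have to download as well (for instance with `urllib.request.urlretrieve`) and store in a matching directory layout for the saved page to render correctly; the absolute ones are what the browser will quietly pull from the web.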
Urllib's urlopen and urlretrieve qoresucks@gmail.com - 2013-02-21 04:12 -0800
Re: Urllib's urlopen and urlretrieve Michael Herman <hermanmu@gmail.com> - 2013-02-21 04:59 -0800
Re: Urllib's urlopen and urlretrieve qoresucks@gmail.com - 2013-02-21 21:09 -0800
Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-22 12:05 -0500
Re: Urllib's urlopen and urlretrieve MRAB <python@mrabarnett.plus.com> - 2013-02-22 17:18 +0000
Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-21 10:56 -0500
Re: Urllib's urlopen and urlretrieve rh <richard_hubbe11@lavabit.com> - 2013-02-21 09:47 -0800
Re: Urllib's urlopen and urlretrieve rh <richard_hubbe11@lavabit.com> - 2013-02-21 09:55 -0800
Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-21 13:04 -0500
Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-21 13:53 -0500