| Path | csiph.com!usenet.pasdenom.info!weretis.net!feeder1.news.weretis.net!news1.tnib.de!feed.news.tnib.de!news.tnib.de!newsfeed.freenet.ag!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <davea@davea.name> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| Date | Fri, 22 Feb 2013 12:05:30 -0500 |
| From | Dave Angel <davea@davea.name> |
| User-Agent | Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130106 Thunderbird/17.0.2 |
| MIME-Version | 1.0 |
| To | python-list@python.org |
| Subject | Re: Urllib's urlopen and urlretrieve |
| References | <34998ea2-6b19-4a98-8ea0-389aca0192ca@googlegroups.com> <mailman.2162.1361451589.2939.python-list@python.org> <07234607-bd77-4ecb-8a19-3c71e9b4f0b4@googlegroups.com> |
| In-Reply-To | <07234607-bd77-4ecb-8a19-3c71e9b4f0b4@googlegroups.com> |
| Content-Type | text/plain; charset=ISO-8859-1; format=flowed |
| Content-Transfer-Encoding | 7bit |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.15 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.2285.1361552757.2939.python-list@python.org> (permalink) |
| Lines | 24 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1361552757 news.xs4all.nl 6963 [2001:888:2000:d::a6]:32993 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:39592 |
On 02/22/2013 12:09 AM, qoresucks@gmail.com wrote:
> Initially I was just trying the html, but later when I attempted more complicated sites that weren't my own I noticed that large bulks of the site were lost in the process. The urllib code essentially looks like what I was trying but it didn't work as I had expected.
>
> To be more specific, after I got it working for my own little page, I attempted to take it further and get all the lessons from Learn Python The Hard Way. When I tried the same method on the first intro page to see if I was even getting it right, the html code was all there but upon opening it I noticed the format was all wrong, colors were off for the background, images, etc... were all missing.

So how are you opening this html? In a text editor that somehow added colors? Or were you opening it in a browser?

In order for a browser to render a non-trivial page, it may need lots of files other than the html. Colors for example can be specified inline, in the header, or in an external css file. If the page was designed to use the external css, and it's missing or not in the right location, then the browser is going to get the colors wrong.

Further, if the location (url) is relative, then you can create a similar directory structure, and the browser will find it. But if it's absolute, then the browser is going to try to go out to the web to fetch it. If it succeeds, then it's masking the fact that you haven't downloaded the "whole web site."

The same is true for other external refs. It may be impossible to host it elsewhere if there are any absolute urls.

-- 
DaveA
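To see which external files a saved page depends on, and whether each reference is relative or absolute, something like the following might help. It is only a rough sketch, assuming Python 3 (`html.parser` and `urllib.parse`) rather than the Python 2 `urllib` of the original thread. The names `ResourceFinder`, `classify_refs`, `SAMPLE_HTML`, and the example.com URLs are made up for illustration, and it only looks at `link`, `img`, and `script` tags, so it will miss things such as urls inside css files.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tag -> attribute pairs that commonly point at files the browser fetches
# separately from the html itself. This list is illustrative, not complete.
RESOURCE_ATTRS = {"link": "href", "img": "src", "script": "src"}

# Hypothetical page used only to demonstrate the functions below.
SAMPLE_HTML = (
    '<html><head><link rel="stylesheet" href="css/style.css"></head>'
    '<body><img src="http://example.com/img/logo.png"></body></html>'
)


class ResourceFinder(HTMLParser):
    """Collect the raw URLs of external resources referenced by a page."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        wanted = RESOURCE_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.refs.append(value)


def classify_refs(html, base_url):
    """Return (reference, is_absolute, resolved_url) for each resource.

    A reference counts as absolute when it names a host; a browser will
    fetch those from the web even when the page is opened from disk.
    """
    finder = ResourceFinder()
    finder.feed(html)
    results = []
    for ref in finder.refs:
        is_absolute = bool(urlparse(ref).netloc)
        results.append((ref, is_absolute, urljoin(base_url, ref)))
    return results


if __name__ == "__main__":
    for ref, is_absolute, resolved in classify_refs(
        SAMPLE_HTML, "http://example.com/book/intro.html"
    ):
        kind = "absolute" if is_absolute else "relative"
        print(kind, ref, "->", resolved)
```

The relative references are the ones you would have to download yourself and place in a matching directory structure next to the saved html. The resolved URL in the third field is where each file would be fetched from.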
Urllib's urlopen and urlretrieve qoresucks@gmail.com - 2013-02-21 04:12 -0800
Re: Urllib's urlopen and urlretrieve Michael Herman <hermanmu@gmail.com> - 2013-02-21 04:59 -0800
Re: Urllib's urlopen and urlretrieve qoresucks@gmail.com - 2013-02-21 21:09 -0800
Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-22 12:05 -0500
Re: Urllib's urlopen and urlretrieve MRAB <python@mrabarnett.plus.com> - 2013-02-22 17:18 +0000
Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-21 10:56 -0500
Re: Urllib's urlopen and urlretrieve rh <richard_hubbe11@lavabit.com> - 2013-02-21 09:47 -0800
Re: Urllib's urlopen and urlretrieve rh <richard_hubbe11@lavabit.com> - 2013-02-21 09:55 -0800
Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-21 13:04 -0500
Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-21 13:53 -0500