Re: Urllib's urlopen and urlretrieve

Date	2013-02-21 13:04 -0500
From	Dave Angel <davea@davea.name>
Subject	Re: Urllib's urlopen and urlretrieve
References	<34998ea2-6b19-4a98-8ea0-389aca0192ca@googlegroups.com> <5126439F.8090508@davea.name> <20130221094713.d21dff3ce0cd59e50f62f6f3@lavabit.com>
Newsgroups	comp.lang.python
Message-ID	<mailman.2188.1361469890.2939.python-list@python.org>


On 02/21/2013 12:47 PM, rh wrote:
> On Thu, 21 Feb 2013 10:56:15 -0500
> Dave Angel <davea@davea.name> wrote:
>> On 02/21/2013 07:12 AM, qoresucks@gmail.com wrote:
>>> I only just started Python and given that I know nothing about
>>> network programming or internet programming of any kind really, I
>>> thought it would be interesting to try write something that could
>>> create an archive of a website for myself.
>>
>
>
>> To archive your website, use the rsync command.  No need to write any
>> code, as rsync will descend into all the directories as needed, and
>> it'll get the actual website data, not the stuff that the web server
>> feeds to the browsers.
>
> How many websites let you suck down their content using rsync???
> The request was for creating their own copy of a website.
>

Clearly this was his own website, since it's usually unethical to "suck 
down" someone else's.  And my message specifically said "To archive 
*your* website..."  As to the implied question of why, since he 
presumably has the original sources, I can only relate my own 
experience.  I generate mine with a Python program, but over time obsolete
files are left behind.  Additionally, an overzealous SEO person 
hand-edited my files.  And finally, I reinstalled my system from scratch 
a couple of months ago.  So in order to see exactly what's out there, I 
used rsync, about two weeks ago.
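
For anyone who does want to drive the rsync approach from Python, here is
a minimal sketch. It only builds and prints the command line; nothing is
transferred. The host, the remote path, the local directory, and the
helper name build_rsync_command are all hypothetical placeholders, and
it assumes rsync is installed and you have ssh access to your own server.

```python
import shlex


def build_rsync_command(remote, local_dir, dry_run=True):
    """Build an rsync argument list that mirrors `remote` into `local_dir`.

    -a  archive mode (recursive, preserves times, permissions, symlinks)
    -v  verbose, lists each file considered
    -z  compress data during transfer
    """
    cmd = ["rsync", "-avz"]
    if dry_run:
        # --dry-run reports what would be copied without copying anything.
        cmd.append("--dry-run")
    cmd += [remote, local_dir]
    return cmd


if __name__ == "__main__":
    # Hypothetical account and paths; substitute your own.
    cmd = build_rsync_command("user@example.com:public_html/", "site-archive/")
    print(" ".join(shlex.quote(part) for part in cmd))
```

To actually run it, the list could be handed to subprocess.run(cmd,
check=True). Starting with the dry run seems the safer default, since it
shows which files are really on the server before anything is written
locally.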

-- 
DaveA

Thread

Urllib's urlopen and urlretrieve qoresucks@gmail.com - 2013-02-21 04:12 -0800
  Re: Urllib's urlopen and urlretrieve Michael Herman <hermanmu@gmail.com> - 2013-02-21 04:59 -0800
    Re: Urllib's urlopen and urlretrieve qoresucks@gmail.com - 2013-02-21 21:09 -0800
      Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-22 12:05 -0500
      Re: Urllib's urlopen and urlretrieve MRAB <python@mrabarnett.plus.com> - 2013-02-22 17:18 +0000
  Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-21 10:56 -0500
  Re: Urllib's urlopen and urlretrieve rh <richard_hubbe11@lavabit.com> - 2013-02-21 09:47 -0800
  Re: Urllib's urlopen and urlretrieve rh <richard_hubbe11@lavabit.com> - 2013-02-21 09:55 -0800
  Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-21 13:04 -0500
  Re: Urllib's urlopen and urlretrieve Dave Angel <davea@davea.name> - 2013-02-21 13:53 -0500
