


Re: Urllib's urlopen and urlretrieve

Date Thu, 21 Feb 2013 10:56:15 -0500
From Dave Angel <davea@davea.name>
Subject Re: Urllib's urlopen and urlretrieve
Newsgroups comp.lang.python
Message-ID <mailman.2179.1361462189.2939.python-list@python.org>



On 02/21/2013 07:12 AM, qoresucks@gmail.com wrote:
> I only just started Python and given that I know nothing about network programming or internet programming of any kind really, I thought it would be interesting to try write something that could create an archive of a website for myself.

Please send your emails as plain text, not HTML; this is a text-based
mailing list.

To archive your website, use the rsync command.  No need to write any
code, as rsync (with -a or -r) will descend into all the directories as
needed, and it'll get the actual files stored on the server, not the
rendered pages that the web server feeds to the browsers.  That assumes
you have shell (SSH) access to the machine hosting the site.
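
If you'd still like to drive it from Python, here is a minimal sketch
that only builds the rsync command line.  The host name, the remote path
and the local directory are made-up placeholders, and the helper name
build_rsync_command is mine, not part of any library.  -a (archive
mode), -z (compress in transit) and --dry-run are standard rsync
options.

```python
import shlex


def build_rsync_command(source, destination, dry_run=False):
    """Return the argument list for an archive-mode rsync run.

    -a recurses and preserves permissions, times and symlinks;
    -z compresses data in transit.
    """
    cmd = ["rsync", "-a", "-z"]
    if dry_run:
        # Only report what would be transferred; copy nothing.
        cmd.append("--dry-run")
    cmd += [source, destination]
    return cmd


if __name__ == "__main__":
    # Hypothetical host and paths, for illustration only.
    cmd = build_rsync_command(
        "user@example.com:/var/www/site/", "site-archive/", dry_run=True
    )
    print(shlex.join(cmd))
```

To actually run it you would pass the list to subprocess.run(cmd,
check=True).  Starting with dry_run=True lets you see what would be
transferred before anything is written.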

If for some reason you don't have rsync, you could use scp -r.  But it
only preserves timestamps and permission bits (with -p), not ownership,
and it follows symlinks instead of copying them.  It's also not smart
enough to only copy stuff that's been changed, when you want to update
incrementally.
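
To make "only copy stuff that's been changed" concrete: by default rsync
skips a file when its size and modification time match the copy at the
destination.  The sketch below imitates that quick check for two local
files.  It is an illustration only, not how rsync works over the
network, and the function names are my own.

```python
import os
import shutil


def needs_copy(src, dst):
    """Return True if dst is missing or differs from src in size or mtime."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    # Compare whole seconds so filesystems with coarser timestamps still match.
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def copy_if_changed(src, dst):
    """Copy src to dst only when the quick check says it changed."""
    if needs_copy(src, dst):
        # copy2 also copies the modification time, so the next check matches.
        shutil.copy2(src, dst)
        return True
    return False
```

A plain scp has no equivalent of needs_copy, so every run transfers
every file again.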


-- 
DaveA


