Groups > comp.lang.python > #77434 > unrolled thread

urllib2 redirect error

Started by Sumit Ray <sumitbu.ray@gmail.com>
First post 2014-09-02 02:08 -0400
Last post 2014-09-06 09:24 -0700
Articles 4 — 3 participants

Contents

  urllib2 redirect error Sumit Ray <sumitbu.ray@gmail.com> - 2014-09-02 02:08 -0400
    Re: urllib2 redirect error Steven D'Aprano <steve@pearwood.info> - 2014-09-02 07:28 +0000
      Re: urllib2 redirect error dieter <dieter@handshake.de> - 2014-09-03 08:29 +0200
      Re: urllib2 redirect error Sumit Ray <sumitbu.ray@gmail.com> - 2014-09-06 09:24 -0700

#77434 — urllib2 redirect error

From Sumit Ray <sumitbu.ray@gmail.com>
Date 2014-09-02 02:08 -0400
Subject urllib2 redirect error
Message-ID <mailman.13708.1409638130.18130.python-list@python.org>

Hi,

 I've tried versions of the following but continue to get errors:

------------- snip -------------
import urllib2

url = 'https://www.usps.com/send/official-abbreviations.htm'
request = urllib2.build_opener(urllib2.HTTPRedirectHandler).open(url)
------------- snip -------------

Generates an exception:
urllib2.HTTPError: HTTP Error 301: The HTTP server returned a redirect
error that would lead to an infinite loop.
The last 30x error message was:
Moved Permanently


The following using the requests library hangs:
------------- snip -------------
import requests

response = requests.get(url)
------------- snip -------------

I'm fresh out of ideas and any suggestions you may have would be greatly
appreciated.

Thanks in advance,
Sumit


#77436

From Steven D'Aprano <steve@pearwood.info>
Date 2014-09-02 07:28 +0000
Message-ID <540571b8$0$29876$c3e8da3$5496439d@news.astraweb.com>
In reply to #77434

On Tue, 02 Sep 2014 02:08:47 -0400, Sumit Ray wrote:

> Hi,
> 
>  I've tried versions of the following but continue to get errors:
> 
> ------------- snip -------------
> url = 'https://www.usps.com/send/official-abbreviations.htm' 
> request = urllib2.build_opener(urllib2.HTTPRedirectHandler).open(url)
> ------------- snip -------------
> 
> Generates an exception:
> urllib2.HTTPError: HTTP Error 301: The HTTP server returned a redirect
> error that would lead to an infinite loop. The last 30x error message
> was:
> Moved Permanently

I'm not an expert, but that sounds like a fault at the server end. I just
tried it in Chrome, and it worked, and then with wget, and I get the same
sort of error:


steve@runes:~$ wget https://www.usps.com/send/official-abbreviations.htm
--2014-09-02 17:22:31--  https://www.usps.com/send/official-abbreviations.htm
Resolving www.usps.com... 23.9.230.219, 2600:1415:8:181::1bf2, 2600:1415:8:183::1bf2
Connecting to www.usps.com|23.9.230.219|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.usps.com/root/global/server_responses/webtools-msg.htm [following]
--2014-09-02 17:22:31--  https://www.usps.com/root/global/server_responses/webtools-msg.htm
Reusing existing connection to www.usps.com:443.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.usps.com/root/global/server_responses/webtools-msg.htm [following]
[...]
--2014-09-02 17:22:32--  https://www.usps.com/root/global/server_responses/webtools-msg.htm
Reusing existing connection to www.usps.com:443.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.usps.com/root/global/server_responses/webtools-msg.htm [following]
20 redirections exceeded.
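
A comparable trace can be collected from Python. The sketch below is not from the thread: it uses `urllib.request`, the Python 3 successor to `urllib2`, and the names `RedirectLogger` and `trace_redirects` are made up for illustration. It records each redirect target before following it, so a repeating `Location` is easy to spot.

```python
import urllib.error
import urllib.request


class RedirectLogger(urllib.request.HTTPRedirectHandler):
    """Record every redirect the server issues, then follow it as usual."""

    def __init__(self):
        self.chain = []  # (status code, target URL) pairs, in order seen

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.chain.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def trace_redirects(url, timeout=30):
    """Open url and return the list of redirects that were followed."""
    logger = RedirectLogger()
    opener = urllib.request.build_opener(logger)
    try:
        opener.open(url, timeout=timeout).close()
    except urllib.error.HTTPError as exc:
        # urllib stops following a looping redirect and raises HTTPError.
        print("gave up:", exc)
    return logger.chain


# Hypothetical usage (needs network access):
# for code, target in trace_redirects(
#         "https://www.usps.com/send/official-abbreviations.htm"):
#     print(code, target)
```

If the same target shows up several times in the returned list, the server is redirecting the URL to itself, which matches the wget output above.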


Sounds like if the server doesn't recognise the browser, it gets 
confused and ends up in a redirect loop.

You could try setting the user-agent and see if that helps:


steve@runes:~$ wget -U "firefox 9999 (mozilla compatible)" https://www.usps.com/send/official-abbreviations.htm
--2014-09-02 17:28:03--  https://www.usps.com/send/official-abbreviations.htm
Resolving www.usps.com... 23.9.230.219, 2600:1415:8:183::1bf2, 2600:1415:8:181::1bf2
Connecting to www.usps.com|23.9.230.219|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `official-abbreviations.htm'

    [  <=>                                  ] 102,670      260K/s   in 0.4s    

2014-09-02 17:28:05 (260 KB/s) - `official-abbreviations.htm' saved [102670]
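
The same idea can be tried from Python. This is a minimal sketch, assuming Python 3, where `urllib2` became `urllib.request`; on Python 2, `urllib2.Request(url, headers=...)` takes the same arguments. The helper names `build_request` and `fetch` are illustrative, and the user-agent string is simply the one from the wget call above. Whether the server still behaves this way is not something the sketch can guarantee.

```python
import urllib.request

URL = "https://www.usps.com/send/official-abbreviations.htm"
# Same made-up browser string as in the wget command above.
USER_AGENT = "firefox 9999 (mozilla compatible)"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=USER_AGENT, timeout=30):
    """Fetch url and return (status, body); redirects are followed by default."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read()


# Hypothetical usage (needs network access):
# status, body = fetch(URL)
# print(status, len(body))
```

With the requests library, the equivalent would be `requests.get(url, headers={"User-Agent": ...}, timeout=30)`; passing a timeout also keeps the call from hanging indefinitely.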




-- 
Steven


#77467

From dieter <dieter@handshake.de>
Date 2014-09-03 08:29 +0200
Message-ID <mailman.13725.1409725798.18130.python-list@python.org>
In reply to #77436

Steven D'Aprano <steve@pearwood.info> writes:
> ...
> I'm not an expert, but that sounds like a fault at the server end. I just
> tried it in Chrome, and it worked, and then with wget, and I get the same
> sort of error:
> ...
> Sounds like if the server doesn't recognise the browser, it gets 
> confused and ends up in a redirect loop.
>
> You could try setting the user-agent and see if that helps:

Excellent analysis and advice.


#77659

From Sumit Ray <sumitbu.ray@gmail.com>
Date 2014-09-06 09:24 -0700
Message-ID <mailman.13839.1410025327.18130.python-list@python.org>
In reply to #77436

Steven,

Thank you! The user-agent advice was on point.

Sumit


On Tue, Sep 2, 2014 at 11:29 PM, dieter <dieter@handshake.de> wrote:

> Steven D'Aprano <steve@pearwood.info> writes:
> > ...
> > I'm not an expert, but that sounds like a fault at the server end. I just
> > tried it in Chrome, and it worked, and then with wget, and I get the same
> > sort of error:
> > ...
> > Sounds like if the server doesn't recognise the browser, it gets
> > confused and ends up in a redirect loop.
> >
> > You could try setting the user-agent and see if that helps:
>
> Excellent analysis and advice.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
