


urllib2 FTP Weirdness

Started by: Nick Cash <nick.cash@npcinternational.com>
First post: 2013-01-23 20:07 +0000
Last post: 2013-01-24 11:41 +1100
Articles: 2 (2 participants)




#37501 — urllib2 FTP Weirdness

From: Nick Cash <nick.cash@npcinternational.com>
Date: 2013-01-23 20:07 +0000
Subject: urllib2 FTP Weirdness
Message-ID: <mailman.919.1358971685.2939.python-list@python.org>

Python 2.7.3 on linux

This has me fairly stumped. It looks like
	urllib2.urlopen("ftp://some.ftp.site/path").read()
will either immediately return '' or hang indefinitely. But
	response = urllib2.urlopen("ftp://some.ftp.site/path")
	response.read()
works fine and returns what is expected. This is only an issue with urllib2; vanilla urllib doesn't do it.

The site I first noticed it on is private, but I can reproduce it with "ftp://ftp2.census.gov/".

I've tested the equivalent code on Python 3.2.3 and get the same results, except that one time I got a socket error (may have been a spurious network blip, though). 


I'm at a loss as to how that could even work differently. My only guess is that by not having a reference to the addinfourl response object, something important is getting garbage collected or closed... that seems like a stretch, though. Is this a urllib2 bug, or am I crazy?

-Nick Cash
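
The two-step form that works for Nick can be wrapped in a small helper that binds the response to a name and closes it explicitly. This is only a sketch of that workaround, not a confirmed fix or diagnosis: the helper name fetch and its opener parameter are hypothetical, and the import fallback is there only so the same code loads on Python 2 and Python 3.

```python
from contextlib import closing

try:
    from urllib2 import urlopen  # Python 2
except ImportError:
    from urllib.request import urlopen  # Python 3


def fetch(url, opener=urlopen):
    """Read a URL while holding an explicit reference to the response.

    `fetch` and `opener` are hypothetical names used for illustration.
    """
    # The response stays bound to a name until read() has returned,
    # and closing() then closes it deterministically.
    with closing(opener(url)) as response:
        return response.read()
```

Calling fetch("ftp://ftp2.census.gov/") would exercise the same site mentioned above. Whether this sidesteps the hang depends on the garbage-collection guess being right, which the thread does not establish.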



#37524

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-01-24 11:41 +1100
Message-ID: <51008349$0$9505$c3e8da3$5496439d@news.astraweb.com>
In reply to: #37501

Nick Cash wrote:

> Python 2.7.3 on linux
> 
> This has me fairly stumped. It looks like
>     urllib2.urlopen("ftp://some.ftp.site/path").read()
> will either immediately return '' or hang indefinitely. But
>     response = urllib2.urlopen("ftp://some.ftp.site/path")
>     response.read()
> works fine and returns what is expected. This is only an issue with
> urllib2, vanilla urllib doesn't do it.
> 
> The site I first noticed it on is private, but I can reproduce it with
> "ftp://ftp2.census.gov/".

Then why not give that in your example, to make running your code
easier? :-)

I cannot reproduce the problem:


py> import urllib2
py> x = urllib2.urlopen("ftp://ftp2.census.gov/").read()
py> len(x)
5550


Works fine for me using Python 2.7.2 on Linux. I cannot see how the two
snippets you give could possibly be different. If you are using a proxy,
what happens if you bypass it?

If you can reproduce this at will, with and without proxy, with multiple
sites, then I suppose it is conceivable that it could be some sort of bug.
But I wouldn't bet on it.



-- 
Steven
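
The checks Steven suggests (bypass the proxy, try several sites, compare both forms) could be scripted roughly as follows. This is a sketch under assumptions: direct_opener and compare_forms are hypothetical helper names, and the only library behaviour relied on is that passing an empty dictionary to ProxyHandler disables proxy auto-detection.

```python
try:
    from urllib2 import ProxyHandler, build_opener  # Python 2
except ImportError:
    from urllib.request import ProxyHandler, build_opener  # Python 3


def direct_opener():
    """Build an opener that ignores any configured proxy (hypothetical helper)."""
    # An empty mapping disables proxies, including ones that would
    # otherwise be picked up from environment variables.
    return build_opener(ProxyHandler({}))


def compare_forms(url, opener):
    """Return (chained_length, two_step_length) for one URL (hypothetical helper)."""
    # Form 1: chained call, no name bound to the response.
    chained = opener.open(url).read()
    # Form 2: keep a reference to the response, then read and close.
    response = opener.open(url)
    two_step = response.read()
    response.close()
    return len(chained), len(two_step)
```

Running compare_forms(url, direct_opener()) over a handful of FTP URLs, and again with a default build_opener(), would show whether the two lengths differ only for some sites or only behind the proxy.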




