Python FTP timeout value not effective

Started by: John Nagle <nagle@animats.com>
First post: 2013-09-02 10:43 -0700
Last post: 2013-09-02 20:04 -0400
Articles: 4 (4 participants)


Contents

  Python FTP timeout value not effective John Nagle <nagle@animats.com> - 2013-09-02 10:43 -0700
    Re: Python FTP timeout value not effective David Bolen <db3l.net@gmail.com> - 2013-09-02 18:00 -0400
    Re: Python FTP timeout value not effective Chris Angelico <rosuav@gmail.com> - 2013-09-03 08:35 +1000
    Re: Python FTP timeout value not effective Terry Reedy <tjreedy@udel.edu> - 2013-09-02 20:04 -0400

#53514 — Python FTP timeout value not effective

From: John Nagle <nagle@animats.com>
Date: 2013-09-02 10:43 -0700
Subject: Python FTP timeout value not effective
Message-ID: <l02io1$kug$1@dont-email.me>

    I'm reading files from an FTP server at the U.S. Securities and
Exchange Commission.  This code has been running successfully for
years.  Recently, they imposed a consistent connection delay
of 20 seconds at FTP connection, presumably because they're having
some denial of service attack.  Python 2.7 urllib2 doesn't
seem to use the timeout specified.  After 20 seconds, it
gives up and times out.

Here's the traceback:

Internal error in EDGAR update: <urlopen error ftp error: [Errno 110]
Connection timed out>
....
  File "./edgar/edgarnetutil.py", line 53, in urlopen
    return(urllib2.urlopen(url,timeout=timeout))
  File "/opt/python27/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/opt/python27/lib/python2.7/urllib2.py", line 394, in open
    response = self._open(req, data)
  File "/opt/python27/lib/python2.7/urllib2.py", line 412, in _open
    '_open', req)
  File "/opt/python27/lib/python2.7/urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "/opt/python27/lib/python2.7/urllib2.py", line 1379, in ftp_open
    fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
  File "/opt/python27/lib/python2.7/urllib2.py", line 1400, in connect_ftp
    fw = ftpwrapper(user, passwd, host, port, dirs, timeout)
  File "/opt/python27/lib/python2.7/urllib.py", line 860, in __init__
    self.init()
  File "/opt/python27/lib/python2.7/urllib.py", line 866, in init
    self.ftp.connect(self.host, self.port, self.timeout)
  File "/opt/python27/lib/python2.7/ftplib.py", line 132, in connect
    self.sock = socket.create_connection((self.host, self.port),
self.timeout)
  File "/opt/python27/lib/python2.7/socket.py", line 571, in
create_connection
    raise err
URLError: <urlopen error ftp error: [Errno 110] Connection timed out>

Periodic update completed in 21.1 seconds.
----------------------------------------------

Here's the relevant code:

TIMEOUTSECS = 60	## give up waiting for server after 60 seconds
...
def urlopen(url,timeout=TIMEOUTSECS) :
    if url.endswith(".gz") :	# gzipped file, must decompress first
        nd = urllib2.urlopen(url,timeout=timeout)	# get connection
        ... # (NOT .gz FILE, DOESN'T TAKE THIS PATH)
    else :
        return(urllib2.urlopen(url,timeout=timeout)) # (OPEN FAILS)


TIMEOUTSECS used to be 20 seconds, and I increased it to 60. It didn't
help.

This isn't an OS problem. The above traceback was on a Linux system.
On Windows 7, it fails with

"URLError: <urlopen error ftp error: [Errno 10060] A connection attempt
failed because the connected party did not properly respond after a
period of time, or established connection failed because connected host
has failed to respond>"

But in both cases, the command line FTP client will work, after a
consistent 20 second delay before the login prompt.  So the
Python timeout parameter isn't working.

				John Nagle




#53532

From: David Bolen <db3l.net@gmail.com>
Date: 2013-09-02 18:00 -0400
Message-ID: <m2li3evq84.fsf@valheru.db3l.homeip.net>
In reply to: #53514

John Nagle <nagle@animats.com> writes:

> Here's the relevant code:
>
> TIMEOUTSECS = 60	## give up waiting for server after 60 seconds
> ...
> def urlopen(url,timeout=TIMEOUTSECS) :
>     if url.endswith(".gz") :	# gzipped file, must decompress first
>         nd = urllib2.urlopen(url,timeout=timeout)	# get connection
> 	... # (NOT .gz FILE, DOESN'T TAKE THIS PATH)
>     else :
> 	return(urllib2.urlopen(url,timeout=timeout)) # (OPEN FAILS)
>
>
> TIMEOUTSECS used to be 20 seconds, and I increased it to 60. It didn't
> help.

I apologize if it's an obvious question, but is there any possibility that
the default value to urlopen is not being used, and some other timeout is
being supplied instead?  Or that TIMEOUTSECS is somehow being redefined
before it is used by the urlopen definition?  Have you verified (or can you
verify) the actual timeout value being passed to urllib2.urlopen?

The fact that you seem to still be timing out very close to the prior 20s
timeout value seems a little curious, since there's no timeout by default
(barring an application level global socket default), so it feels like a
value being supplied.
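
Both possibilities can be checked in a few lines. What follows is only a minimal sketch: `open_stub` and `report` are hypothetical stand-ins that open nothing and are not the poster's code. It shows that a default argument is bound when the `def` statement runs, so rebinding `TIMEOUTSECS` later leaves the default unchanged, and that `socket.getdefaulttimeout()` reports any process-wide socket default.

```python
import socket

TIMEOUTSECS = 20


def open_stub(url, timeout=TIMEOUTSECS):
    """Hypothetical stand-in for the wrapper; returns the timeout it would use."""
    return timeout


# Rebinding the module-level name later does not touch the default above,
# because default values are evaluated once, when the def statement runs.
TIMEOUTSECS = 60


def report(url):
    """Collect the values worth logging just before the real urlopen call."""
    return {
        "default_argument": open_stub(url),
        "explicit_argument": open_stub(url, timeout=TIMEOUTSECS),
        "global_socket_default": socket.getdefaulttimeout(),
    }
```

Logging the result of something like `report(url)` next to the real call would show whether a stale 20 is still in play.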

Not sure which 2.7 you're using, but I tried the below with both 2.7.3 and
2.7.5 on Linux since they were handy, and the timeout parameter seems to be
working properly at least in a case I can simulate (xxx is a firewalled
host so the connection attempt just gets black-holed until the timeout):

     >>> import time, urllib2
     >>> def test(timeout):
     ...   print time.ctime()
     ...   try:
     ...     urllib2.urlopen('ftp://xxx', timeout=timeout)
     ...   except:
     ...     print 'Error'
     ...   print time.ctime()
     ... 
     >>> test(5)
     Mon Sep  2 17:36:15 2013
     Error
     Mon Sep  2 17:36:20 2013
     >>> test(20)
     Mon Sep  2 17:36:23 2013
     Error
     Mon Sep  2 17:36:44 2013
     >>> test(60)
     Mon Sep  2 17:36:50 2013
     Error
     Mon Sep  2 17:37:50 2013

It's tougher to simulate a host that artificially delays the connection
attempt but then succeeds, so maybe it's an issue related specifically to
that implementation.  Depending on how the delay is implemented (delaying
SYN response versus accepting the connection but just delaying the welcome
banner, for example), I suppose it may be tickling some very specific bug.

Since all communication essentially boils down to I/O over the socket, it
seems to me likely that those cases should still fail over time periods
related to the timeout supplied, unlike your actual results, which makes me
wonder about the actual urllib2.urlopen timeout parameter.
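
One of the two delay styles, accepting the connection but holding back the welcome banner, can be approximated locally. The sketch below is an assumption-laden stand-in, not the SEC server: it is written for Python 3 rather than the 2.7 discussed here, the helper names are made up, and it does not reproduce a delayed SYN response. It starts a throwaway listener on localhost that greets each client after a chosen delay, then times `ftplib.FTP.connect` against it.

```python
import ftplib
import socket
import threading
import time


def start_delayed_banner_server(delay):
    """Listen on a free localhost port; greet each client after `delay` seconds."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)

    def handle(conn):
        with conn:
            time.sleep(delay)
            try:
                conn.sendall(b"220 delayed test server ready\r\n")
                conn.recv(1024)  # wait until the client hangs up
            except OSError:
                pass  # the client already gave up

    def serve():
        while True:
            conn, _ = listener.accept()
            threading.Thread(target=handle, args=(conn,), daemon=True).start()

    threading.Thread(target=serve, daemon=True).start()
    return listener.getsockname()[1]


def try_connect(port, timeout):
    """Return (outcome, elapsed_seconds) for one FTP connection attempt."""
    ftp = ftplib.FTP()
    started = time.monotonic()
    try:
        ftp.connect("127.0.0.1", port, timeout=timeout)  # also reads the banner
        outcome = "connected"
    except socket.timeout:
        outcome = "timed out"
    finally:
        ftp.close()
    return outcome, time.monotonic() - started
```

With a 0.5 second banner delay, a call such as `try_connect(port, 0.1)` should give up early, while `try_connect(port, 3)` should wait for the banner and connect.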

-- David



#53535

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-09-03 08:35 +1000
Message-ID: <mailman.520.1378161342.19984.python-list@python.org>
In reply to: #53514

On Tue, Sep 3, 2013 at 3:43 AM, John Nagle <nagle@animats.com> wrote:
> "URLError: <urlopen error ftp error: [Errno 10060] A connection attempt
> failed because the connected party did not properly respond after a
> period of time, or established connection failed because connected host
> has failed to respond>"
>
> But in both cases, the command line FTP client will work, after a
> consistent 20 second delay before the login prompt.  So the
> Python timeout parameter isn't working.

That's a socket timeout, not an FTP timeout - that's why the timeout
parameter isn't doing anything. Are you sure it's just a 20-second
delay there? Check if there's something else blocking the connection
somehow. Can you telnet to that computer on port 21?
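
One way to tell the two kinds of timeout apart in code is to look at the exception's errno. This is a hedged sketch, not a diagnosis of the SEC case: `classify_timeout` is a hypothetical helper, and it assumes Python 3. As far as I know, a timeout raised by Python's own socket timeout carries no errno and prints as "timed out", whereas "[Errno 110]" is ETIMEDOUT on Linux, which is what the operating system reports when its own connection attempt gives up. With urllib the underlying exception may be wrapped in a URLError and would have to be dug out first.

```python
import errno
import socket


def classify_timeout(exc):
    """Guess where a timeout-like exception originated (hypothetical helper)."""
    # Check errno first: on recent Python 3 versions both kinds of timeout
    # can be instances of the same exception class.
    if isinstance(exc, OSError) and exc.errno == errno.ETIMEDOUT:
        return "operating system reported ETIMEDOUT"
    if isinstance(exc, socket.timeout):
        return "Python socket timeout expired"
    return "not a timeout"
```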

ChrisA



#53541

From: Terry Reedy <tjreedy@udel.edu>
Date: 2013-09-02 20:04 -0400
Message-ID: <mailman.524.1378166719.19984.python-list@python.org>
In reply to: #53514

On 9/2/2013 1:43 PM, John Nagle wrote:
>      I'm reading files from an FTP server at the U.S. Securities and
> Exchange Commission.  This code has been running successfully for
> years.  Recently, they imposed a consistent connection delay
> of 20 seconds at FTP connection, presumably because they're having
> some denial of service attack.  Python 2.7 urllib2 doesn't
> seem to use the timeout specified.  After 20 seconds, it
> gives up and times out.
>
> Here's the traceback:
>
> Internal error in EDGAR update: <urlopen error ftp error: [Errno 110]
> Connection timed out>
> ....
>    File "./edgar/edgarnetutil.py", line 53, in urlopen
>
>    File "/opt/python27/lib/python2.7/socket.py", line 571, in
> create_connection
...
>      raise err
> URLError: <urlopen error ftp error: [Errno 110] Connection timed out>
>
> Periodic update completed in 21.1 seconds.
> ----------------------------------------------
>
> Here's the relevant code:
>
> TIMEOUTSECS = 60	## give up waiting for server after 60 seconds
> ...
> def urlopen(url,timeout=TIMEOUTSECS) :
>      if url.endswith(".gz") :	# gzipped file, must decompress first
>          nd = urllib2.urlopen(url,timeout=timeout)	# get connection
> 	... # (NOT .gz FILE, DOESN'T TAKE THIS PATH)
>      else :
> 	return(urllib2.urlopen(url,timeout=timeout)) # (OPEN FAILS)

I looked at the 3.3 urllib.request.urlopen code, and timeout is passed
through a couple of layers, but it is hard to see whether it reaches the
socket connection call. I would also try Python 3.3, as timeout handling
may have changed a bit.
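
Rather than reading the layers, the value that reaches the socket call can be recorded at run time. The following is a rough sketch for Python 3 only; it does not exercise the 2.7 urllib2 path from the traceback. The name `timeout_seen_by_socket` is made up, and `socket.create_connection` is temporarily replaced with a stub that records its timeout argument and then fails, so no real connection is attempted.

```python
import socket
import urllib.error
import urllib.request
from unittest import mock


def timeout_seen_by_socket(url, timeout):
    """Return the timeout values that reach socket.create_connection."""
    seen = []

    def fake_create_connection(address, timeout=None, *args, **kwargs):
        seen.append(timeout)
        raise OSError("stubbed out: no real connection is attempted")

    # An empty ProxyHandler keeps any ftp_proxy environment setting out of the way.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with mock.patch.object(socket, "create_connection", fake_create_connection):
        try:
            opener.open(url, timeout=timeout)
        except urllib.error.URLError:
            pass  # expected: the stub always fails
    return seen
```

If the list returned for an ftp:// URL contains the value that was passed in, the parameter is making it down to the socket layer in that Python version.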

There are some 'timeout' issues on the tracker, such as
http://bugs.python.org/issue4079
http://bugs.python.org/issue18417
but these do not obviously apply to an explicitly passed timeout.

I would also try using ftplib, which cuts out many of the general-purpose
layers of urlopen. FTP.__init__ stores timeout in self.timeout and
calls connect(), which passes self.timeout to socket.create_connection.

 >>> import ftplib
 >>> ftp = ftplib.FTP("ftp.sec.gov")
 >>> ftp.login()
'230-Anonymous access granted, restrictions apply\n \n Please read the 
file README.txt\n230    it was last modified on Tue Aug 15 14:29:31 2000 
- 4765 days ago'
 >>> ftp.sendcmd('help')
"214-The following commands are recognized (* =>'s unimplemented):\n CWD 
     XCWD    CDUP    XCUP    SMNT*   QUIT    PORT    PASV    \n EPRT 
EPSV    ALLO*   RNFR    RNTO    DELE    MDTM    RMD     \n XRMD    MKD 
    XMKD    PWD     XPWD    SIZE    SYST    HELP    \n NOOP    FEAT 
OPTS    AUTH*   CCC*    CONF*   ENC*    MIC*    \n PBSZ*   PROT*   TYPE 
    STRU    MODE    RETR    STOR    STOU    \n APPE    REST    ABOR 
USER    PASS    ACCT*   REIN*   LIST    \n NLST    STAT    SITE    MLSD 
    MLST    \n214 Direct comments to root@clone11.sec.gov"

I tried to read README.txt, but I do not know how to use the commands or
the local FTP methods.
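
For reference, reading a text file with ftplib and an explicit timeout might look like the sketch below. It assumes Python 3 and anonymous login; `fetch_text_lines` is a hypothetical helper name, and nothing here has been run against ftp.sec.gov.

```python
import ftplib


def fetch_text_lines(host, path, timeout=60, port=21):
    """Fetch a text file over anonymous FTP and return its lines."""
    lines = []
    with ftplib.FTP(timeout=timeout) as ftp:
        ftp.connect(host, port)  # uses the timeout stored by the constructor
        ftp.login()              # anonymous login
        # RETR in ASCII mode; the callback receives each line without its newline
        ftp.retrlines("RETR " + path, lines.append)
    return lines
```

A call such as `fetch_text_lines("ftp.sec.gov", "README.txt", timeout=60)` would then either return the file's lines or raise whatever error ftplib reports.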

> TIMEOUTSECS used to be 20 seconds, and I increased it to 60. It didn't
> help.
>
> This isn't an OS problem. The above traceback was on a Linux system.
> On Windows 7, it fails with
>
> "URLError: <urlopen error ftp error: [Errno 10060] A connection attempt
> failed because the connected party did not properly respond after a
> period of time, or established connection failed because connected host
> has failed to respond>"
>
> But in both cases, the command line FTP client will work, after a
> consistent 20 second delay before the login prompt.  So the
> Python timeout parameter isn't working.


-- 
Terry Jan Reedy


