From: Dave Angel
Subject: Re: Help regarding urllib
Date: Sat, 24 Aug 2013 13:07:57 +0000 (UTC)
Newsgroups: comp.lang.python
To: python-list@python.org

malhar vora wrote:

> On Saturday, August 24, 2013 4:15:01 PM UTC+5:30, malhar vora wrote:
>> Hello All,
>>
>> I am simply fetching data from robots.txt of a url. Below is my code.
>>
>> siteurl = siteurl.rstrip("/")
>
> Sorry for last complete. It was sent by mistake.
>
> Here is my code.
>
> siteurl = siteurl.rstrip("/")
> roboturl = siteurl + r'/robots.txt'
> robotdata = urllib.urlopen(roboturl).read() # Reading robots.txt of given url
> print robotdata
>
> In above code siteurl is fetched simply from local text file.

Why aren't you showing us what is in that local text file? Or, more
specifically, what siteurl turns out to be? I suspect it's missing the
http:// prefix.

> IOError: [Errno 2] The system cannot find the path specified: 'www.bestrecipes.com.au\\robots.txt'

Looks to me like it decided this url referred to a file. That's the
default behavior when you don't specify the scheme identifier (e.g. "http").

Also, it would have been helpful to say which Python version and OS you're
running this on. For example, the single backslash character is specific
to Windows. (The doubling is presumably an artifact of how the error
message is displayed; e.g., look at how repr() displays strings.)

-- 
DaveA
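
For what it's worth, here is a minimal sketch of the fix suggested above: make sure the string has a scheme before building the robots.txt URL. The helper name robots_url and its default_scheme parameter are made up for illustration, and the check for "://" is a deliberately simple assumption about what the local text file contains (bare host names such as www.bestrecipes.com.au). It only builds the string; it does not fetch anything.

```python
def robots_url(siteurl, default_scheme="http"):
    """Build the robots.txt URL for siteurl (hypothetical helper).

    If siteurl has no scheme (no "://"), prepend default_scheme so that
    the URL is not mistaken for a local file path.
    """
    # Drop surrounding whitespace (e.g. a trailing newline read from a
    # text file) and any trailing slashes.
    siteurl = siteurl.strip().rstrip("/")
    if "://" not in siteurl:
        siteurl = default_scheme + "://" + siteurl
    return siteurl + "/robots.txt"


print(robots_url("www.bestrecipes.com.au\n"))
print(robots_url("http://www.bestrecipes.com.au/"))
```

Both calls should give http://www.bestrecipes.com.au/robots.txt, which can then be passed to urllib.urlopen() as in the original code.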