From: Robin Becker
Subject: Re: data: protocol
Date: Thu, 08 May 2014 11:34:01 +0100
Newsgroups: comp.lang.python

On 08/05/2014 04:46, Steven D'Aprano wrote:
> On Wed, 07 May 2014 11:42:24 +0100, Robin Becker wrote:
>
>> I have an outstanding request for ReportLab to allow images to be opened
>> using the data: scheme. That used to be supported in python 2.7 using
>> urllib, but in python 3.3 urllib2 --> urllib and at least the default
>> urlopener doesn't support data:
>
> It looks like you intended to show an example, but left it out.
>
>> Is there a way to use the residual legacy of the old urllib code that's
>> now in urllib.URLopener to open unusual schemes?
>> I know it can be used directly eg
>>
>> urllib.request.URLopener().open('data:.........')
>>
>> but that seems to leave the splitting & testing logic up to me when it
>> logically belongs in some central place ie urllib.request.urlopen.
>
> You may need to explain in a little more detail. When you say "splitting
> and testing", what are you splitting and testing? It may also help if you
> show some Python 2.7 code that works, and what happens in 3.3.
>

OK, not sure about 3.4, but in 3.3 the urllib module cannot open a request like this:

C:\code-trunk\hg-repos\reportlab\tests>\python33\python.exe
Python 3.3.3 (v3.3.3:c3896275c0f6, Nov 18 2013, 21:18:40) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request
>>> urllib.request.urlopen('data:image/gif;base64,R0lGODdhAQABAIAAAP///////ywAAAAAAQABAAACAkQBADs=').read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\python33\lib\urllib\request.py", line 156, in urlopen
    return opener.open(url, data, timeout)
  File "C:\python33\lib\urllib\request.py", line 469, in open
    response = self._open(req, data)
  File "C:\python33\lib\urllib\request.py", line 492, in _open
    'unknown_open', req)
  File "C:\python33\lib\urllib\request.py", line 447, in _call_chain
    result = func(*args)
  File "C:\python33\lib\urllib\request.py", line 1310, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: data>
>>>

In python27 one can do:

C:\tmp>python
Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> data=urllib.urlopen('data:image/gif;base64,R0lGODdhAQABAIAAAP///////ywAAAAAAQABAAACAkQBADs=').read()
>>> len(data)
35
>>>

and, as indicated by Ian Kelly, in 3.4:

C:\tmp>\python34\python.exe
Python 3.4.0 (v3.4.0:04f714765c13, Mar 16 2014, 19:24:06) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request
>>> data=urllib.request.urlopen('data:image/gif;base64,R0lGODdhAQABAIAAAP///////ywAAAAAAQABAAACAkQBADs=').read()
>>> len(data)
35

In 3.3 we still have the old URLopener class. However, when I use that I see this:

C:\code-trunk\hg-repos\reportlab\tests>\python33\python.exe
Python 3.3.3 (v3.3.3:c3896275c0f6, Nov 18 2013, 21:18:40) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.request import URLopener
>>> data = URLopener().open('data:image/gif;base64,R0lGODdhAQABAIAAAP///////ywAAAAAAQABAAACAkQBADs=').read()
>>> len(data)
115
>>> data
'Date: Thu, 08 May 2014 10:21:45 GMT\nContent-type: image/gif\nContent-Length: 35\n\nGIF87a\x01\x00\x01\x00\x80\x00\x00ÿÿÿÿÿÿ,\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02D\x01\x00;'
>>>

so now I seem to be getting the real data (35 bytes) with some headers prepended. I think this is different from what is expected, but that code is labelled as old/deprecated and is possibly going away.

Since urllib doesn't always work as expected in 3.3, I've had to write a small stub for the special data: case. Splitting off all the headers seems harder than just handling the special case.

However, there are a lot of these 'schemes', so should I be doing this sort of thing? Apparently it's taken 4 versions of python to get urllib in 3.4 to do this, so it's not clear to me whether all schemes are supposed to hang off urllib.request.urlopen, or whether instead of special casing the 3.3 data: I should have written a special handler for it and injected that into my opener (or possibly the default opener).
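For illustration only, the handler route might look roughly like the sketch below. The class name DataURLHandler is made up for this example (it is not ReportLab or standard-library code), and the parsing is deliberately minimal; it assumes the usual data:[mediatype][;base64],payload layout.

```python
import base64
import email
import io
import urllib.error
import urllib.request
import urllib.response
from urllib.parse import unquote_to_bytes


class DataURLHandler(urllib.request.BaseHandler):
    """Hypothetical handler that serves data: URLs from the URL text."""

    def data_open(self, req):
        # OpenerDirector dispatches on "<scheme>_open", so this method
        # is called for requests whose scheme is "data".
        url = req.full_url
        header, sep, payload = url.partition(':')[2].partition(',')
        if not sep:
            raise urllib.error.URLError('malformed data: URL')
        data = unquote_to_bytes(payload)
        if header.lower().endswith(';base64'):
            data = base64.b64decode(data)
            header = header[:-len(';base64')]
        # Headers travel on the response object, not in the body.
        headers = email.message_from_string(
            'Content-Type: %s\nContent-Length: %d\n'
            % (header or 'text/plain;charset=US-ASCII', len(data)))
        return urllib.response.addinfourl(io.BytesIO(data), headers, url)


# A bare OpenerDirector keeps the example down to this one handler.
opener = urllib.request.OpenerDirector()
opener.add_handler(DataURLHandler())
```

With a full opener the class would instead be passed to urllib.request.build_opener(); on 3.4 urlopen already copes with data: itself, as the session above shows, so something like this would only matter on 3.3.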

Doing the handler means I do have to handle the headers stuff, whereas my stub is just returning the data bits.

--
Robin Becker
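
By way of comparison, a stub that only returns the data bits can be very small. This is a sketch under the same assumptions about the data: layout, not the actual ReportLab stub; open_data_url is a hypothetical name.

```python
import base64
from urllib.parse import unquote_to_bytes


def open_data_url(url):
    """Return just the payload bytes of a data: URL (hypothetical helper)."""
    scheme, sep, rest = url.partition(':')
    if not sep or scheme.lower() != 'data':
        raise ValueError('not a data: URL: %r' % url)
    header, sep, payload = rest.partition(',')
    if not sep:
        raise ValueError('malformed data: URL (no comma)')
    # Undo any percent-encoding first, then base64 if the header asks for it.
    data = unquote_to_bytes(payload)
    if header.lower().endswith(';base64'):
        data = base64.b64decode(data)
    return data
```

Called on the GIF URL from the sessions above it gives back the same 35 bytes that 2.7 and 3.4 return, with no headers to strip.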