From: "Thomas L. Shinnick"
Newsgroups: comp.lang.python
To: python-list@python.org
Subject: Re: How do you print a string after it's been searched for an RE?
Date: Thu, 23 Jun 2011 16:47:02 -0500

There is also

    print(match_obj.string)

which gives you the string that was searched. See the end of section
6.2.5, Match Objects.

At 02:58 PM 6/23/2011, John Salerno wrote:
>After I've run the re.search function on a string and no match was
>found, how can I access that string? When I try to print it directly,
>it's an empty string, I assume because it has been "consumed." How do
>I prevent this?
>
>It seems to work fine for this 2.x code:
>
>import urllib.request
>import re
>
>next_nothing = '12345'
>pc_url = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='
>pattern = re.compile(r'[0-9]+')
>
>while True:
>    page = urllib.request.urlopen(pc_url + next_nothing)
>    match_obj = pattern.search(page.read().decode())
>    if match_obj:
>        next_nothing = match_obj.group()
>        print(next_nothing)
>    else:
>        print(page.read().decode())
>        break
>
>But when I try it with my own code (3.2), it won't print the text of
>the page:
>
>import urllib.request
>import re
>
>next_nothing = '12345'
>pc_url = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='
>pattern = re.compile(r'[0-9]+')
>
>while True:
>    page = urllib.request.urlopen(pc_url + next_nothing)
>    match_obj = pattern.search(page.read().decode())
>    if match_obj:
>        next_nothing = match_obj.group()
>        print(next_nothing)
>    else:
>        print(page.read().decode())
>        break
>
>P.S. I plan to clean up my code, I know it's not great right now. But
>my immediate goal is to just figure out why the 2.x code can print
>"text", but my own code can't print "page," which are basically the
>same thing, unless something significant has changed with either the
>urllib.request module, or the way it's decoded, or something, or is it
>just an RE issue?
>
>Thanks.
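
A note on the no-match case in the quoted question: match_obj.string only
exists when search() returns a match object. When nothing matches,
search() returns None, and a second page.read() comes back empty because
the response body has already been read once. Below is a minimal sketch of
one way around this, reading the body a single time into a variable. It is
not the original poster's code: io.BytesIO stands in for the object
returned by urllib.request.urlopen (an assumption, so the example runs
without network access), and search_page and the sample page bodies are
hypothetical names and data used only for illustration.

```python
import io
import re

pattern = re.compile(r'[0-9]+')


def search_page(page):
    """Read the stream once, keep the decoded text, and search it.

    Returns (text, match_obj); match_obj is None when nothing matches.
    """
    text = page.read().decode()      # read exactly once and keep the result
    match_obj = pattern.search(text)
    return text, match_obj


# Assumption: io.BytesIO imitates the file-like response from urlopen().
page = io.BytesIO(b'and the next nothing is 44827')
text, match_obj = search_page(page)

# The stream is exhausted now, so reading again yields an empty bytes object.
second_read = page.read()

# With a match, the searched string is also reachable via match_obj.string.
searched = match_obj.string

# Without a match, search() returns None, so only the saved text is usable.
no_match_text, no_match = search_page(io.BytesIO(b'peak.html'))

if no_match is None:
    print(no_match_text)
```

In the quoted loop this would amount to assigning page.read().decode() to
a variable before calling pattern.search(), then printing that variable in
the else branch.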