Date: Thu, 23 Jun 2011 16:47:02 -0500
To: python-list@python.org
From: "Thomas L. Shinnick" <tshinnic@prismnet.com>
Subject: Re: How do you print a string after it's been searched for an RE?
In-Reply-To: <c4486911-f2cb-4e0e-a93c-3d6ff7e302e9@16g2000yqy.googlegroups.com>
References: <c4486911-f2cb-4e0e-a93c-3d6ff7e302e9@16g2000yqy.googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.342.1308865647.1164.python-list@python.org>

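There is also
       print(match_obj.string)
which gives you the string that was passed to search().  Note that it
is only available when a match was found; after a failed search there
is no match object, only None.  See the end of section 6.2.5, "Match
Objects", in the library reference.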
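
A small sketch of what that attribute holds, using a made-up page text (the value of `text` here is only an assumption for illustration, not the real page):

```python
import re

pattern = re.compile(r'[0-9]+')
text = 'and the next nothing is 44827'   # hypothetical page text

match_obj = pattern.search(text)
if match_obj:
    # .string is the string that was handed to search()
    print(match_obj.string)
    print(match_obj.group())

# With no digits to find, search() returns None, so there is no .string
no_match = pattern.search('no digits here')
print(no_match)
```

So this helps in the "if match_obj:" branch, but not in the "else:" branch, where the match object does not exist.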

At 02:58 PM 6/23/2011, John Salerno wrote:
>After I've run the re.search function on a string and no match was
>found, how can I access that string? When I try to print it directly,
>it's an empty string, I assume because it has been "consumed." How do
>I prevent this?
>
>It seems to work fine for this 2.x code:
>
>import urllib.request
>import re
>
>next_nothing = '12345'
>pc_url = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='
>pattern = re.compile(r'[0-9]+')
>
>while True:
>     page = urllib.request.urlopen(pc_url + next_nothing)
>     match_obj = pattern.search(page.read().decode())
>     if match_obj:
>         next_nothing = match_obj.group()
>         print(next_nothing)
>     else:
>         print(page.read().decode())
>         break
>
>But when I try it with my own code (3.2), it won't print the text of
>the page:
>
>import urllib.request
>import re
>
>next_nothing = '12345'
>pc_url = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='
>pattern = re.compile(r'[0-9]+')
>
>while True:
>     page = urllib.request.urlopen(pc_url + next_nothing)
>     match_obj = pattern.search(page.read().decode())
>     if match_obj:
>         next_nothing = match_obj.group()
>         print(next_nothing)
>     else:
>         print(page.read().decode())
>         break
>
>P.S. I plan to clean up my code, I know it's not great right now. But
>my immediate goal is to just figure out why the 2.x code can print
>"text", but my own code can't print "page," which are basically the
>same thing, unless something significant has changed with either the
>urllib.request module, or the way it's decoded, or something, or is it
>just an RE issue?
>
>Thanks.
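
On the "consumed" part of the question: the object returned by urlopen() is file-like, so the first read() uses up the response and a second read() returns empty bytes. The string itself is not touched by the search. Below is a minimal sketch of reading once and keeping the text. To run without a network connection it uses io.BytesIO as a stand-in for the real response, and the page body is made up; both are assumptions for illustration.

```python
import io
import re

pattern = re.compile(r'[0-9]+')

# io.BytesIO stands in for the object returned by urllib.request.urlopen();
# both are file-like, and a second read() returns empty bytes.
page = io.BytesIO(b'no digits on this page')   # hypothetical page body

text = page.read().decode()       # read once and keep the result
match_obj = pattern.search(text)
if match_obj:
    next_nothing = match_obj.group()
    print(next_nothing)
else:
    print(text)                   # the saved text, not a second read()

leftover = page.read()            # the stream is already at its end
print(repr(leftover))
```

In the real loop the same change would be to assign page.read().decode() to a name before searching, and print that name in the else branch.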