| Header | Value |
|---|---|
| From | Vlastimil Brom <vlastimil.brom@gmail.com> |
| To | Frank Koshti <frank.koshti@gmail.com> |
| Cc | python-list@python.org |
| Subject | Re: Regex Question |
| Date | Sat, 18 Aug 2012 17:50:17 +0200 |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3456.1345305020.4697.python-list@python.org> |
2012/8/18 Frank Koshti <frank.koshti@gmail.com>:
> Hey Steven,
>
> Thank you for the detailed (and well-written) tutorial on this very
> issue. I actually learned a few things! Though, I still have
> unresolved questions.
>
> The reason I don't want to use an XML parser is because the tokens are
> not always placed in HTML, and even in HTML, they may appear in
> strange places, such as <h1 $foo(x=3)>Hello</h1>. My specific issue is
> I need to match, process and replace $foo(x=3), knowing that (x=3) is
> optional, and the token might appear simply as $foo.
>
> To do this, I decided to use:
>
> re.compile('\$\w*\(?.*?\)').findall(mystring)
>
> the issue with this is it doesn't match $foo by itself, and requires
> there to be () at the end.
>
> Thanks,
> Frank
> --
> http://mail.python.org/mailman/listinfo/python-list
Hi,
Although I don't quite follow the pattern you are using (with respect to
the specified task), you should most likely use raw string syntax for the
pattern, e.g. r"..." instead of "...", or else double all the
backslashes, i.e. \\w etc. (Python happens to leave unrecognized escapes
such as \w unchanged in ordinary strings, but relying on that is fragile.)
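To illustrate the point about raw strings, here is a minimal sketch; the sample text and the variable names are made up for the example:

```python
import re

# A raw string and a string with doubled backslashes hold the same pattern text.
raw_pattern = r"\$\w+"
escaped_pattern = "\\$\\w+"
same = (raw_pattern == escaped_pattern)

# Both therefore match the same things; this finds the bare token names.
found = re.findall(raw_pattern, "<h1 $foo(x=3)>Hello</h1> and $bar")
print(same, found)
```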
I am likely misunderstanding the specification, as the following:
>>> re.sub(r"\$foo\(x=3\)", "bar", "<h1 $foo(x=3)>Hello</h1>")
'<h1 bar>Hello</h1>'
>>>
is probably not the desired output.
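If the aim is to match $foo both with and without the argument list, one possible approach is to make the whole parenthesized part an optional non-capturing group. This is only a sketch under my own assumption about the token syntax (a "$", a name made of word characters, then optionally "(...)" with no nested parentheses); the name TOKEN and the sample text are hypothetical:

```python
import re

# Assumed token syntax: "$" + name, optionally followed by "(...)"
# that contains no nested parentheses.
TOKEN = re.compile(r"\$\w+(?:\([^)]*\))?")

text = "<h1 $foo(x=3)>Hello</h1> and a bare $foo here"

# With no capturing groups, findall returns the full text of each match.
matches = TOKEN.findall(text)
print(matches)
```

The trailing "?" applies to the whole (?:...) group, so the parentheses are either matched as a pair or skipped entirely.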
To do some kind of "processing" on the matched text, you can also pass a
replacement function instead of a replacement string to re.sub; see
http://docs.python.org/library/re.html#re.sub
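A rough sketch of the function form of re.sub follows. The token syntax, the function name process and the bracketed output format are all my assumptions for illustration, not anything from your specification:

```python
import re

# Assumed token syntax: "$name" or "$name(args)", no nested parentheses.
# Group 1 captures the name, group 2 the text inside the parentheses.
TOKEN = re.compile(r"\$(\w+)(?:\(([^)]*)\))?")

def process(match):
    # Hypothetical processing: just show the name and any arguments.
    name = match.group(1)
    args = match.group(2)  # None when the token has no "(...)"
    if args is None:
        return "[%s]" % name
    return "[%s: %s]" % (name, args)

# re.sub calls process() once per match and inserts its return value.
result = TOKEN.sub(process, "<h1 $foo(x=3)>Hello</h1> $foo")
print(result)
```

The function receives a match object, so it can inspect the groups and return whatever replacement string the real processing requires.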
hth,
vbr