Re: Regex Question

References <385e732e-1c02-4dd0-ab12-b92890bbed66@o3g2000yqp.googlegroups.com> <502fa524$0$29978$c3e8da3$5496439d@news.astraweb.com> <79aaa167-296a-4a0c-8a06-c4e67cf53597@j19g2000yqi.googlegroups.com>
Date 2012-08-18 17:50 +0200
Subject Re: Regex Question
From Vlastimil Brom <vlastimil.brom@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.3456.1345305020.4697.python-list@python.org>


2012/8/18 Frank Koshti <frank.koshti@gmail.com>:
> Hey Steven,
>
> Thank you for the detailed (and well-written) tutorial on this very
> issue. I actually learned a few things! Though, I still have
> unresolved questions.
>
> The reason I don't want to use an XML parser is because the tokens are
> not always placed in HTML, and even in HTML, they may appear in
> strange places, such as <h1 $foo(x=3)>Hello</h1>. My specific issue is
> I need to match, process and replace $foo(x=3), knowing that (x=3) is
> optional, and the token might appear simply as $foo.
>
> To do this, I decided to use:
>
> re.compile('\$\w*\(?.*?\)').findall(mystring)
>
> the issue with this is it doesn't match $foo by itself, and requires
> there to be () at the end.
>
> Thanks,
> Frank

Hi,
Although I don't quite follow the pattern you are using (with respect to
the specified task), you should use raw string syntax for the pattern,
i.e. r"..." instead of "...", or else double every backslash (\\w etc.),
so that Python's string escaping doesn't interfere with the regex.
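
To illustrate the raw-string point, here is a minimal sketch; the variable
names and the simplified pattern are only assumptions made for the example:

```python
import re

# The same regex written two ways: as a raw string, and as a normal
# string literal with every backslash doubled.
raw_pattern = r"\$\w+"
escaped_pattern = "\\$\\w+"

# Both literals produce the identical pattern text.
same = (raw_pattern == escaped_pattern)

# Simplified, illustrative pattern: it matches only the "$name" part.
text = "<h1 $foo(x=3)>Hello</h1>"
names = re.findall(raw_pattern, text)
print(same, names)
```

The raw form is usually easier to read and avoids surprises with sequences
such as \b, which means a backspace in a normal string literal.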

I am likely misunderstanding the specification, as the following:
>>> re.sub(r"\$foo\(x=3\)", "bar", "<h1 $foo(x=3)>Hello</h1>")
'<h1 bar>Hello</h1>'
>>>
is probably not the desired output.

To do some kind of "processing" of the matched text, you can also pass a
replacement function to re.sub instead of a replacement string; see
http://docs.python.org/library/re.html#re.sub
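
A rough sketch of the function form of re.sub follows. The pattern, the
name expand, and the bracketed output format are only illustrative
assumptions about the token syntax ($name with an optional parenthesised
part), so adjust them to the real specification:

```python
import re

# Assumed token syntax: "$name", optionally followed by "(...)".
# Group 1 is the name; group 2 is the text inside the parentheses,
# or None when the parentheses are absent.
TOKEN = re.compile(r"\$(\w+)(?:\(([^)]*)\))?")

def expand(match):
    """Hypothetical processing step: build the replacement text."""
    name = match.group(1)
    args = match.group(2)
    if args is None:
        return "[{}]".format(name)
    return "[{} {}]".format(name, args)

# re.sub calls expand() once per match and inserts its return value.
result = TOKEN.sub(expand, "<h1 $foo(x=3)>Hello</h1> and $bar")
print(result)
```

Because the parenthesised part is wrapped in an optional non-capturing
group, both $foo(x=3) and a bare $bar are handled by the same pattern.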

hth,
  vbr
