Re: Help with regular expression in python

Date Mon, 22 Aug 2011 15:59:16 +0200
Subject Re: Help with regular expression in python
From Vlastimil Brom <vlastimil.brom@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.317.1314021558.27778.python-list@python.org>

Sorry if I missed some further specification earlier in the thread, or
if the following oversimplifies the original problem (it uses
3 numbers instead of 32), but
would something like the following work for your data?

>>> import re
>>> data = """2.201000e+01 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description
... 2.201000e+01 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description
... 2.201000e+01 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description
... 2.201000e+01 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description"""
>>> for res in re.findall(r"(?m)^(?:[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?\s+){3}(?:.+)$", data): print(res)
...
2.201000e+01 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description
2.201000e+01 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description
2.201000e+01 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description
2.201000e+01 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description
>>>

i.e. all parentheses are non-capturing (?:...), and there are extra
anchors for the beginning and end of the line, ^...$, with the multiline
flag set via (?m).
Each result is one matching line in this sample. If you need to access
the individual numbers, you could process these matches further or use the
new regex implementation mentioned earlier by mrab (its developer) with
its new match method captures(), using an appropriate pattern with
the needed groupings.
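
As a rough illustration of "processing these matches further", here is a
minimal sketch that uses only the standard re module. The names NUMBER, LINE
and numbers_per_line are made up for this example, and the second and third
sample lines are invented test data, not taken from the original problem. It
assumes that each data line starts with the numbers, separated by whitespace.

```python
import re

# Pattern for one float such as 2.201000e+01; the exponent part is optional.
NUMBER = r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?"

# A line that starts with three whitespace-terminated numbers, then any text.
# Every group is non-capturing, so findall() returns the whole matched line.
LINE = re.compile(r"(?m)^(?:%s\s+){3}.+$" % NUMBER)

# Invented sample: one line in the original format, one non-matching line,
# and one line with other number spellings.
data = """2.201000e+01 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description
not a data line
-1.5 .25 3e2 : (instance: 1)       :       another description"""


def numbers_per_line(text, count=3):
    """Return one list of floats for every line of text matched by LINE."""
    rows = []
    for line in LINE.findall(text):
        # The pattern guarantees that the first `count` whitespace-separated
        # tokens are numbers, so a plain split is enough here.
        rows.append([float(token) for token in line.split()[:count]])
    return rows


rows = numbers_per_line(data)
for row in rows:
    print(row)
```

For 32 numbers per line, the {3} in LINE and the count argument would both
have to be changed to 32. The captures() approach with the separate regex
module would avoid the second pass, but it is not shown here because that
module is not part of the standard library.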

regards,
  vbr


