From: Matt Funk <matze999@gmail.com>
To: python-list@python.org
Subject: Re: Help with regular expression in python
Date: Fri, 19 Aug 2011 11:33:49 -0600
User-Agent: KMail/1.13.6 (Linux/2.6.38-10-server; KDE/4.6.2; x86_64; ; )
References: <201108181349.54727.matze999@gmail.com> <mailman.222.1313767221.27778.python-list@python.org> <87hb5d4cik.fsf@dpt-info.u-strasbg.fr>
In-Reply-To: <87hb5d4cik.fsf@dpt-info.u-strasbg.fr>
MIME-Version: 1.0
Content-Type: Text/Plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Precedence: list
Reply-To: matze999@gmail.com
Newsgroups: comp.lang.python
Message-ID: <mailman.227.1313775252.27778.python-list@python.org>

On Friday, August 19, 2011, Alain Ketterlin wrote:
> Matt Funk <matze999@gmail.com> writes:
> > Thanks for the suggestion. I guess I had found another way around the
> > problem as well. But I really wanted to match the line exactly and I
> > wanted to know why it doesn't work. That is less for the purpose of
> > getting the thing to work and more because it greatly annoys me that
> > I can't figure out why it doesn't work, i.e. why the expression is not
> > matched {32} times. I just don't get it.
> 
> Because a line is not 32 times a number, it is a number followed by 31
> times "a space followed by a number". Using Jason's regexp, you can
> build the regexp step by step:
> 
> number = r"\d\.\d+e\+\d+"
> numbersequence = r"%s( %s){31}" % (number,number)
That didn't work either. I used a modified expression, with (.+) appended to 
match the end of the line:

number = r"\d\.\d+e\+\d+"
numbersequence = r"%s( %s){31}(.+)" % (number,number)
instance_linetype_pattern = re.compile(numbersequence)

The results obtained are:
[(' 2.199000e+01', ' : (instance: 0)\t:\tsome description')]
So this matches the last number plus the string at the end of the line, but it 
does not retain the previous numbers.
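
For what it is worth, this result is consistent with how Python's re module
treats a repeated capturing group: the whole line does match, but a group such
as "( %s){31}" keeps only the text of its last repetition. The sketch below
reproduces that and shows one possible variant, assuming the goal is to get all
32 numbers back. It makes the repetition non-capturing with (?:...) and wraps
the whole sequence in a single group. The names repeated, whole, numbers and
tail are illustrative and not from this thread; data is the sample line from
the session quoted further down.

```python
import re

number = r"\d\.\d+e\+\d+"

# Sample line taken from the quoted interpreter session in this thread.
data = (
    "1.002000e+01 2.037000e+01 2.128000e+01 1.908000e+01 1.871000e+01 "
    "1.914000e+01 2.007000e+01 1.664000e+01 2.204000e+01 2.109000e+01 "
    "2.209000e+01 2.376000e+01 2.158000e+01 2.177000e+01 2.152000e+01 "
    "2.267000e+01 1.084000e+01 1.671000e+01 1.888000e+01 1.854000e+01 "
    "2.064000e+01 2.000000e+01 2.200000e+01 2.139000e+01 2.137000e+01 "
    "2.178000e+01 2.179000e+01 2.123000e+01 2.201000e+01 2.150000e+01 "
    "2.150000e+01 2.199000e+01 : (instance: 0)\t:\tsome description"
)

# Pattern as posted: "( %s)" is a capturing group repeated 31 times.
# The match succeeds, but the group holds only its final repetition.
repeated = re.compile(r"%s( %s){31}(.+)" % (number, number))
last_only = repeated.match(data).groups()

# Variant (an assumption about what is wanted, not the only option):
# repeat a non-capturing group and capture the full sequence once.
whole = re.compile(r"(%s(?: %s){31})(.+)" % (number, number))
m = whole.match(data)
numbers = m.group(1).split()   # the 32 number strings
tail = m.group(2)              # everything after the last number

print(last_only[0])
print(len(numbers), numbers[0], numbers[-1])
```

If the values are needed as floats, [float(x) for x in numbers] converts them
after the match has confirmed that the line has the expected shape.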

Anyway, I think at this point I will go another route. I'm not sure where the 
issue lies.

Thanks for all the help,
matt


> 
> There are better ways to build your regexp, but I think this one is
> convenient to answer your question. You still have to append what will
> match the end of the line.
> 
> -- Alain.
> 
> P/S: please do not top-post
> 
> >> $ python
> >> Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
> >> [GCC 4.4.3] on linux2
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> 
> >> >>> data
> >> 
> >> '1.002000e+01 2.037000e+01 2.128000e+01 1.908000e+01 1.871000e+01
> >> 1.914000e+01 2.007000e+01 1.664000e+01 2.204000e+01 2.109000e+01
> >> 2.209000e+01 2.376000e+01 2.158000e+01 2.177000e+01 2.152000e+01
> >> 2.267000e+01 1.084000e+01 1.671000e+01 1.888000e+01 1.854000e+01
> >> 2.064000e+01 2.000000e+01 2.200000e+01 2.139000e+01 2.137000e+01
> >> 2.178000e+01 2.179000e+01 2.123000e+01 2.201000e+01 2.150000e+01
> >> 2.150000e+01 2.199000e+01 : (instance: 0)       :       some
> >> description'
> >> 
> >> >>> import re
> >> >>> re.findall(r"\d\.\d+e\+\d+", data)
> >> 
> >> ['1.002000e+01', '2.037000e+01', '2.128000e+01', '1.908000e+01',
> >> '1.871000e+01', '1.914000e+01', '2.007000e+01', '1.664000e+01',
> >> '2.204000e+01', '2.109000e+01', '2.209000e+01', '2.376000e+01',
> >> '2.158000e+01', '2.177000e+01', '2.152000e+01', '2.267000e+01',
> >> '1.084000e+01', '1.671000e+01', '1.888000e+01', '1.854000e+01',
> >> '2.064000e+01', '2.000000e+01', '2.200000e+01', '2.139000e+01',
> >> '2.137000e+01', '2.178000e+01', '2.179000e+01', '2.123000e+01',
> >> '2.201000e+01', '2.150000e+01', '2.150000e+01', '2.199000e+01']