Re: Help with regular expression in python

Started by: Matt Funk <matze999@gmail.com>
First post: 2011-08-18 17:03 -0600
Last post: 2011-08-18 17:03 -0600
Articles: 1 (1 participant)

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.

#11814 — Re: Help with regular expression in python

From: Matt Funk <matze999@gmail.com>
Date: 2011-08-18 17:03 -0600
Subject: Re: Help with regular expression in python
Message-ID: <mailman.200.1313708602.27778.python-list@python.org>

Hi guys,

thanks for the suggestions. I had already tried matching the whitespace as
well (to no avail). Here is the expression I am using now (based on the
suggestions), but still no success:

instance_linetype_pattern_str =\
	r'(([-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+))?\s+){32}(.+)'
instance_linetype_pattern = re.compile(instance_linetype_pattern_str)
results = instance_linetype_pattern.findall(line)
print "results: "; print results


The match I get is:
results: 
[('2.199000e+01 ', '2.199000', '.199000', 'e+01', ': (instance: 0)\t:\tsome description')]
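
A note on the result above: the output suggests the pattern did match the whole line, and that only the last of the 32 floats shows up because a repeated capturing group in Python's re module keeps just its final repetition. The following is a minimal sketch of that behaviour, not the poster's code. It assumes Python 3 and a shortened, hypothetical sample line with three floats instead of 32; the names FLOAT, repeated and whole_run are made up for illustration.

```python
import re

# Hypothetical shortened sample: three floats instead of 32,
# laid out like the line quoted in the thread.
line = "1.002000e+01 2.037000e+01 2.199000e+01 : (instance: 0)\t:\tsome description"

# One float in plain or scientific notation; non-capturing groups keep
# the group numbering simple.
FLOAT = r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?"

# A capturing group under a repeat count matches every repetition,
# but only the text of the last repetition is kept.
repeated = re.compile(r"(" + FLOAT + r"\s+){3}(.+)")
m = repeated.match(line)
last_only = m.group(1)

# One possible alternative: capture the whole run of floats in a single
# group, then split it afterwards. Using + instead of a fixed count also
# allows the number of floats to vary.
whole_run = re.compile(r"((?:" + FLOAT + r"\s+)+)(.+)")
m2 = whole_run.match(line)
floats = [float(tok) for tok in m2.group(1).split()]
rest = m2.group(2)

print(repr(last_only))
print(floats)
print(repr(rest))
```

Under these assumptions, last_only holds only the third float with its trailing space, while floats holds all three values and rest holds the description part of the line.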


btw: The line to be matched (given below) is ONE line. There are no line 
breaks (even though my email client adds them).


matt


On Thursday, August 18, 2011, Vlastimil Brom wrote:
> 2011/8/18 Matt Funk <matze999@gmail.com>:
> > Hi,
> > I am sorry if this doesn't quite match the subject of the list. If
> > someone takes offense please point me to where this question should go.
> > Anyway, I have a problem using regular expressions. I would like to
> > match the line:
> > 
> > 1.002000e+01 2.037000e+01 2.128000e+01 1.908000e+01 1.871000e+01
> > 1.914000e+01 2.007000e+01 1.664000e+01 2.204000e+01 2.109000e+01
> > 2.209000e+01 2.376000e+01 2.158000e+01 2.177000e+01 2.152000e+01
> > 2.267000e+01 1.084000e+01 1.671000e+01 1.888000e+01 1.854000e+01
> > 2.064000e+01 2.000000e+01 2.200000e+01 2.139000e+01 2.137000e+01
> > 2.178000e+01 2.179000e+01 2.123000e+01 2.201000e+01 2.150000e+01
> > 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description
> > 
> > The number of floats can vary (in this example there are 32). So what I
> > thought I'd do is the following:
> > instance_linetype_pattern_str =
> > '([-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?) {32}'
> > instance_linetype_pattern = re.compile(instance_linetype_pattern_str)
> > Basically the expression in the first major set of parentheses matches a
> > scientific number format. The '{32}' is supposed to match the previous
> > group 32 times. However, it doesn't. I can't figure out why this does not
> > work. I'd really like to understand it if someone can shed light on it.
> > 
> > thanks
> > matt
> > --
> > http://mail.python.org/mailman/listinfo/python-list
> 
> Hi,
> the already suggested handling of whitespace with \s+ etc. at the end
> of the parenthesised pattern should help;
> furthermore, if you are using this pattern in the Python source, you
> should either double all backslashes or use a raw string for the
> pattern, with r prepended before the opening quotation mark:
> pattern_str = r"..."
> so that backslashes are handled literally and not as escape characters.
> hth,
> vbr
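
The raw-string advice in the quoted reply can be illustrated with a short sketch. It assumes Python 3, and the variable names are hypothetical. Sequences such as \d are not string escapes that Python recognises, so they tend to survive in a normal string (newer Python versions warn about this), but \b is a recognised escape and is silently turned into a backspace character.

```python
import re

# A raw string and a normal string with doubled backslashes give the
# same pattern text.
raw = r"\d+\.\d*"
doubled = "\\d+\\.\\d*"
print(raw == doubled)

# \b shows where the difference matters: in a raw string it reaches the
# regex engine as a word boundary, in a normal string Python has already
# replaced it with a single backspace character (0x08).
word_raw = r"\bfoo\b"
word_plain = "\bfoo\b"

found_raw = re.search(word_raw, "a foo b")      # word-boundary match
found_plain = re.search(word_plain, "a foo b")  # looks for literal backspaces

print(found_raw is not None)
print(found_plain is not None)
```

Here the raw pattern finds "foo", while the non-raw one does not, since the sample text contains no backspace characters.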
