| Started by | Fillmore <fillmore_remove@hotmail.com> |
|---|---|
| First post | 2016-03-06 23:38 -0500 |
| Last post | 2016-03-07 07:50 -0500 |
| Articles | 7 — 5 participants |
Regex: Perl to Python Fillmore <fillmore_remove@hotmail.com> - 2016-03-06 23:38 -0500
Re: Regex: Perl to Python Chris Angelico <rosuav@gmail.com> - 2016-03-07 15:45 +1100
Re: Regex: Perl to Python Terry Reedy <tjreedy@udel.edu> - 2016-03-06 23:48 -0500
Re: Regex: Perl to Python Rustom Mody <rustompmody@gmail.com> - 2016-03-06 21:53 -0800
Re: Regex: Perl to Python Rustom Mody <rustompmody@gmail.com> - 2016-03-06 21:53 -0800
Re: Regex: Perl to Python Peter Otten <__peter__@web.de> - 2016-03-07 08:48 +0100
Re: Regex: Perl to Python Fillmore <fillmore_remove@hotmail.com> - 2016-03-07 07:50 -0500
| From | Fillmore <fillmore_remove@hotmail.com> |
|---|---|
| Date | 2016-03-06 23:38 -0500 |
| Subject | Regex: Perl to Python |
| Message-ID | <nbj0k0$12gk$1@gioia.aioe.org> |
Hi, I'm trying to move away from Perl and go to Python.
Regex seems to be the hardest challenge so far.
Perl:
while (<HEADERFILE>) {
    if (/(\d+)\t(.+)$/) {
        print $1." - ". $2."\n";
    }
}
into python
pattern = re.compile(r"(\d+)\t(.+)$")
with open(fields_Indexfile,mode="rt",encoding='utf-8') as headerfile:
    for line in headerfile:
        #sys.stdout.write(line)
        m = pattern.match(line)
        print(m.group(0))
headerfile.close()
but I must be getting something fundamentally wrong because:
Traceback (most recent call last):
File "./slicer.py", line 30, in <module>
print(m.group(0))
AttributeError: 'NoneType' object has no attribute 'group'
why is 'm' a None?
the input data has this format:
:
3 prop1
4 prop2
5 prop3
Thanks
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2016-03-07 15:45 +1100 |
| Message-ID | <mailman.6.1457325934.10335.python-list@python.org> |
| In reply to | #104183 |
On Mon, Mar 7, 2016 at 3:38 PM, Fillmore <fillmore_remove@hotmail.com> wrote:
> pattern = re.compile(r"(\d+)\t(.+)$")
> with open(fields_Indexfile,mode="rt",encoding='utf-8') as headerfile:
>     for line in headerfile:
>         #sys.stdout.write(line)
>         m = pattern.match(line)
>         print(m.group(0))
> headerfile.close()
>
> but I must be getting something fundamentally wrong because:
>
> Traceback (most recent call last):
> File "./slicer.py", line 30, in <module>
> print(m.group(0))
> AttributeError: 'NoneType' object has no attribute 'group'
>
>
> why is 'm' a None?
When the regex doesn't match, Python gives you back None instead of a
match object. Your Perl code has an 'if' to guard that; you can do the
same thing:
    if m:
        print(m.group(0))
ChrisA
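A minimal sketch of the guarded loop, closer to what the Perl version prints. The sample lines are invented for illustration and assume a real tab between the number and the property name; `io.StringIO` stands in for the opened file.

```python
import io
import re

pattern = re.compile(r"(\d+)\t(.+)$")

# Hypothetical sample data standing in for the real file; the first
# line deliberately does not match the pattern.
headerfile = io.StringIO("not a field line\n3\tprop1\n4\tprop2\n")

results = []
for line in headerfile:
    m = pattern.match(line)
    if m:  # m is None when the line does not match
        results.append(m.group(1) + " - " + m.group(2))

print("\n".join(results))
```

Lines that do not match are skipped silently, just as the Perl `if` skips them.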
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2016-03-06 23:48 -0500 |
| Message-ID | <mailman.7.1457326155.10335.python-list@python.org> |
| In reply to | #104183 |
On 3/6/2016 11:38 PM, Fillmore wrote:
>
> Hi, I'm trying to move away from Perl and go to Python.
> Regex seems to be the hardest challenge so far.
>
> Perl:
>
> while (<HEADERFILE>) {
>     if (/(\d+)\t(.+)$/) {
>         print $1." - ". $2."\n";
>     }
> }
>
> into python
>
> pattern = re.compile(r"(\d+)\t(.+)$")
> with open(fields_Indexfile,mode="rt",encoding='utf-8') as headerfile:
>     for line in headerfile:
>         #sys.stdout.write(line)
>         m = pattern.match(line)
>         print(m.group(0))
> headerfile.close()
Delete this line. Files opened in a with statement are automatically
closed when exiting the block. This is a main motivator and use for the
with statement.
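A small sketch that shows the automatic close. The temporary file exists only to keep the example self-contained; it is not part of the original script.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(text=True)
os.close(fd)

with open(path, mode="wt", encoding="utf-8") as f:
    f.write("3\tprop1\n")
    inside = f.closed   # still open inside the block

outside = f.closed      # closed once the block has been left
print(inside, outside)

os.remove(path)
```

Calling `close()` again after the block is harmless but redundant, which is why the line can simply be deleted.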
> but I must be getting something fundamentally wrong because:
>
> Traceback (most recent call last):
> File "./slicer.py", line 30, in <module>
> print(m.group(0))
> AttributeError: 'NoneType' object has no attribute 'group'
>
>
> why is 'm' a None?
Python has a wonderful interactive help facility. Learn to use it.
>>> import re
>>> help(re.match)
Help on function match in module re:

match(pattern, string, flags=0)
    Try to apply the pattern at the start of the string, returning
    a match object, or None if no match was found.

>>>
Add 'if m is not None:' before accessing m.group.
--
Terry Jan Reedy
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2016-03-06 21:53 -0800 |
| Message-ID | <df11597e-a96c-4201-93d7-18caa8f91476@googlegroups.com> |
| In reply to | #104185 |
Also for regex hacking: Try with findall before using match/search
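One way to read this suggestion: `re.findall` returns a plain list of whatever matched, and an empty list when nothing does, so a pattern can be tried out without any None check. The sample text below is made up and assumes tab-separated fields.

```python
import re

text = "3\tprop1\n4\tprop2\nno digits here\n"

# With two groups in the pattern, findall returns a list of tuples.
# re.MULTILINE lets $ match at the end of every line.
pairs = re.findall(r"(\d+)\t(.+)$", text, flags=re.MULTILINE)
print(pairs)

# No match yields an empty list rather than None.
empty = re.findall(r"(\d+)\t(.+)$", "nothing to see")
print(empty)
```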
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2016-03-07 08:48 +0100 |
| Message-ID | <mailman.8.1457336938.10335.python-list@python.org> |
| In reply to | #104183 |
Fillmore wrote:
>
> Hi, I'm trying to move away from Perl and go to Python.
> Regex seems to be the hardest challenge so far.
>
> Perl:
>
> while (<HEADERFILE>) {
>     if (/(\d+)\t(.+)$/) {
>         print $1." - ". $2."\n";
>     }
> }
>
> into python
>
> pattern = re.compile(r"(\d+)\t(.+)$")
> with open(fields_Indexfile,mode="rt",encoding='utf-8') as headerfile:
>     for line in headerfile:
>         #sys.stdout.write(line)
>         m = pattern.match(line)
>         print(m.group(0))
> headerfile.close()
>
> but I must be getting something fundamentally wrong because:
>
> Traceback (most recent call last):
> File "./slicer.py", line 30, in <module>
> print(m.group(0))
> AttributeError: 'NoneType' object has no attribute 'group'
>
>
> why is 'm' a None?
match() only matches at the beginning of the string; use search() instead:
    match = pattern.search(line)
    if match is not None:
        print(match.group(1), "-", match.group(2))
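To illustrate the difference: Perl's `/.../` is unanchored, which corresponds to `search()` rather than `match()`. The sample line with leading whitespace is an assumption; the thread does not show the exact bytes of the input file.

```python
import re

pattern = re.compile(r"(\d+)\t(.+)$")

# Hypothetical line with leading whitespace before the number.
line = "  3\tprop1\n"

at_start = pattern.match(line)    # anchored at position 0
anywhere = pattern.search(line)   # scans forward for the first match

print(at_start)
print(anywhere.group(1), "-", anywhere.group(2))
```

Here `match()` gives None because the line does not start with a digit, while `search()` still finds the number and the value.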
Also, in Python you often can use string methods instead of regular
expressions:
    index, tab, value = line.strip().partition("\t")
    if tab and index.isdigit():
        print(index, "-", value)
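A short sketch of how the `partition` approach treats different lines. The three sample lines are invented to cover a matching line, a line without a tab, and a line whose first field is not a number.

```python
lines = ["3\tprop1\n", "no tab here\n", "x\tprop9\n"]

found = []
for line in lines:
    # partition returns (before, separator, after); the separator is
    # an empty string when no tab is present.
    index, tab, value = line.strip().partition("\t")
    if tab and index.isdigit():
        found.append((index, value))

print(found)
```

This is not an exact equivalent of the regex: it requires the whole first field to be digits once surrounding whitespace is stripped, whereas `search()` would also accept digits preceded by other characters.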
> the input data has this format:
>
> :
> 3 prop1
> 4 prop2
> 5 prop3
>
| From | Fillmore <fillmore_remove@hotmail.com> |
|---|---|
| Date | 2016-03-07 07:50 -0500 |
| Message-ID | <nbjtdv$q48$1@gioia.aioe.org> |
| In reply to | #104183 |
Big thank you to everyone who offered their help!