From: Peter Otten <__peter__@web.de>
Newsgroups: comp.lang.python
Subject: Re: Regex: Perl to Python
Date: Mon, 07 Mar 2016 08:48:46 +0100

Fillmore wrote:

> Hi, I'm trying to move away from Perl and go to Python.
> Regex seems to be the hardest challenge so far.
>
> Perl:
>
> while () {
>     if (/(\d+)\t(.+)$/) {
>         print $1." - ". $2."\n";
>     }
> }
>
> into python
>
> pattern = re.compile(r"(\d+)\t(.+)$")
> with open(fields_Indexfile,mode="rt",encoding='utf-8') as headerfile:
>     for line in headerfile:
>         #sys.stdout.write(line)
>         m = pattern.match(line)
>         print(m.group(0))
>     headerfile.close()
>
> but I must be getting something fundamentally wrong because:
>
> Traceback (most recent call last):
>   File "./slicer.py", line 30, in
>     print(m.group(0))
> AttributeError: 'NoneType' object has no attribute 'group'
>
>
> why is 'm' a None?

match() only matches at the beginning of the string; use search()
instead, and check the result, because both return None for a line
that does not match:

    match = pattern.search(line)
    if match is not None:
        print(match.group(1), "-", match.group(2))

Also, in Python you can often use string methods instead of regular
expressions:

    index, tab, value = line.strip().partition("\t")
    if tab and index.isdigit():
        print(index, "-", value)

> the input data has this format:
>
> :
> 3 prop1
> 4 prop2
> 5 prop3
>
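
For anyone who wants to try the two approaches side by side, here is a small self-contained sketch. The sample lines are an assumption modelled on the quoted input format (digits, a tab, a value), with a made-up header line and one line with leading spaces; the function names with_regex and with_partition are only illustrative.

```python
import re

# Assumed sample data: a header line that should be skipped, two plain
# records, and one record with leading whitespace.
SAMPLE_LINES = ["index\tname\n", "3\tprop1\n", "4\tprop2\n", "  5\tprop3\n"]

PATTERN = re.compile(r"(\d+)\t(.+)$")


def with_regex(lines):
    """Collect "index - value" strings using search() and a None check."""
    result = []
    for line in lines:
        match = PATTERN.search(line)
        if match is not None:  # lines that do not match are skipped
            result.append(f"{match.group(1)} - {match.group(2)}")
    return result


def with_partition(lines):
    """Same output, using str.partition() instead of a regular expression."""
    result = []
    for line in lines:
        index, tab, value = line.strip().partition("\t")
        # tab is "" when the line contains no tab at all
        if tab and index.isdigit():
            result.append(f"{index} - {value}")
    return result


if __name__ == "__main__":
    for text in with_regex(SAMPLE_LINES):
        print(text)
    # match() is anchored at position 0, so leading spaces defeat it
    print(PATTERN.match("  5\tprop3\n"))
```

On this sample both functions produce the same three records. The final print shows None, which is the value that triggered the AttributeError in the original script.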