Date: Wed, 29 Apr 2015 22:28:08 +0100
From: MRAB <python@mrabarnett.plus.com>
To: python-list@python.org
Subject: Re: Python re to extract useful information from each line
Newsgroups: comp.lang.python
Message-ID: <mailman.99.1430342891.3680.python-list@python.org>

On 2015-04-29 22:22, Emile van Sebille wrote:
> On 4/29/2015 1:49 PM, Kashif Rana wrote:
>> pol_elements = re.compile('id\s(?P<p_id>.+?)(?:\sname\s(?P<p_name>.+?))?\sfrom\s(?P<p_from>.+?)\sto\s(?P<p_to>.+?)\s{2}(?P<p_src>[^\s]+?)\s(?P<p_dst>[^\s]+?)\s(?P<p_port>[^\s]+?)(?:(?P<p_nat_status>\snat)\s(?P<p_nat_type>[^\s]+?)(?P<p_nat_ip>\sdip-id\s[^\s]+?)?)?\s(?P<p_action>[^\s]+?)(?:\sschedule\s(?P<p_schedule>[^\s]+?))?(?P<p_log_status>\slog)?$'
>> )
>
>
> ... and that's why we avoid regular expressions... it makes my head hurt
> just looking at that line noise.
>
It might just be easier to split it into a list of fields and then pick
out the ones you want:

fields = re.findall(r'"[^"]+"|\S+', line)
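
As a rough sketch of the "pick out the ones you want" step, something like the following might do. The sample line, the helper names split_fields and value_after, and the choice of keys are all made up for illustration; they are not taken from the original data, so adjust them to whatever the real lines look like.

```python
import re

def split_fields(line):
    """Split a line into fields, keeping double-quoted strings as one field."""
    return re.findall(r'"[^"]+"|\S+', line)

def value_after(fields, keyword):
    """Return the field that follows keyword, without surrounding quotes.

    Returns None if the keyword is absent or is the last field.
    """
    if keyword in fields:
        i = fields.index(keyword)
        if i + 1 < len(fields):
            return fields[i + 1].strip('"')
    return None

# Hypothetical sample line, invented for this example.
line = 'id 10 name "Web Access" from "Trust" to "Untrust"  "Any" "10.1.1.1" "HTTP" permit log'

fields = split_fields(line)

# Keyword/value pairs are looked up by keyword; flags are a membership test.
policy = {
    'id': value_after(fields, 'id'),
    'name': value_after(fields, 'name'),
    'from': value_after(fields, 'from'),
    'to': value_after(fields, 'to'),
    'log': 'log' in fields,
}
print(policy)
```

Quoted values keep their quotes in the list, so a quoted "from" would not be confused with the bare keyword from. An unquoted value that happens to equal a keyword would still trip this up, so it is only a starting point.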