Date: Thu, 13 Jun 2013 09:17:28 +0200
From: Andreas Perstinger <andipersti@gmail.com>
To: python-list@python.org
Subject: Re: Problem creating a regular expression to parse open-iscsi, iscsiadm output (help?)
Newsgroups: comp.lang.python

On 13.06.2013 02:59, rice.cruft@gmail.com wrote:
> I am parsing the output of an open-iscsi command that contains
> several blocks of data for each data set. Each block has the format:
[SNIP]
> I tried using \s* to swallow the whitespace between the two iSCSI
> lines. No joy... However [\s\S]*? allows the regex to succeed. But that
> seems to me to be overkill (I am not trying to skip lines of text here.)
> Also note that I am using \ + to catch spaces between the words. On the
> two problem lines, using \s+ between the label words fails.

Changing
>          # Connection state
>          iSCSI\ +Connection\ +State:\s+(?P<connState>\w+\s*\w*)
>          [\s\S]*?    <<<<<< without this the regex fails
>          # Session state
>          iSCSI\ +Session\ +State:\s+(?P<sessionState>\w+)

to
         # Connection state
         iSCSI\s+Connection\s+State:\s+(?P<connState>\w+\s*\w*)\s*
         # Session state
         iSCSI\s+Session\s+State:\s+(?P<sessionState>\w+)

gives me

 >>> # 'test' is the example string
 >>> myDetails = [ m.groupdict() for m in regex.finditer(test)]
 >>> print myDetails
[{'initiatorIP': '221.128.52.214', 'connState': 'LOGGED IN', 'SID': 
'154', 'ipaddr': '221.128.52.224', 'initiatorName': 
'iqn.1996-04.de.suse:01:7c9741b545b5', 'sessionState': 'LOGGED_IN', 
'iqn': 'iqn.1992-04.com.emc:vplex-000000008460319f-0000000000000007', 
'tag': '7', 'port': '3260'}]

for your example; the original regex gives the same result.
So both versions work here (Python 2.7.3), and something else must be
breaking the regex on your side.
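
As a reduced check, here is a minimal sketch of just the two state lines.
It assumes the regex is compiled with re.VERBOSE (the inline comments and
the escaped spaces suggest so), and the sample string is a hypothetical
stand-in, since your full output was snipped. The names sample,
with_gap, without_gap and details are mine. In verbose mode the line
break inside the pattern is ignored, so something has to consume the
newline and indentation between the two lines; a plain \s* after the
first group appears to be enough for that.

```python
import re

# Hypothetical two-line sample; NOT the original poster's full output.
sample = (
    "\t\tiSCSI Connection State: LOGGED IN\n"
    "\t\tiSCSI Session State: LOGGED_IN\n"
)

# Trailing \s* after the first group consumes the newline and the
# indentation in front of the second line.
with_gap = re.compile(r"""
    # Connection state
    iSCSI\s+Connection\s+State:\s+(?P<connState>\w+\s*\w*)\s*
    # Session state
    iSCSI\s+Session\s+State:\s+(?P<sessionState>\w+)
    """, re.VERBOSE)

# Same pattern without the trailing \s*: in verbose mode the pattern's own
# line break matches nothing, so the newline in the data is never consumed.
without_gap = re.compile(r"""
    iSCSI\s+Connection\s+State:\s+(?P<connState>\w+\s*\w*)
    iSCSI\s+Session\s+State:\s+(?P<sessionState>\w+)
    """, re.VERBOSE)

details = [m.groupdict() for m in with_gap.finditer(sample)]
no_details = [m.groupdict() for m in without_gap.finditer(sample)]
```

On this reduced sample, details holds one dict with both states and
no_details is empty. If \s* still fails on your real data, comparing
repr() of the text between the two lines might show what is in there.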

Bye, Andreas