From: Andreas Perstinger
Date: Thu, 13 Jun 2013 09:17:28 +0200
Newsgroups: comp.lang.python
Subject: Re: Problem creating a regular expression to parse open-iscsi, iscsiadm output (help?)

On 13.06.2013 02:59, rice.cruft@gmail.com wrote:
> I am parsing the output of an open-iscsi command that contains
> several blocks of data for each data set. Each block has the format:
[SNIP]
> I tried using \s* to swallow the whitespace between the two iSCSI
> lines. No joy... However [\s\S]*? allows the regex to succeed. But that
> seems to me to be overkill (I am not trying to skip lines of text here.)
> Also note that I am using \ + to catch spaces between the words. On the
> two problem lines, using \s+ between the label words fails.

Changing

>     # Connection state
>     iSCSI\ +Connection\ +State:\s+(?P<connState>\w+\s*\w*)
>     [\s\S]*?    <<<<<< without this the regex fails
>     # Session state
>     iSCSI\ +Session\ +State:\s+(?P<sessionState>\w+)

to

    # Connection state
    iSCSI\s+Connection\s+State:\s+(?P<connState>\w+\s*\w*)\s*
    # Session state
    iSCSI\s+Session\s+State:\s+(?P<sessionState>\w+)

gives me

>>> # 'test' is the example string
>>> myDetails = [ m.groupdict() for m in regex.finditer(test)]
>>> print myDetails
[{'initiatorIP': '221.128.52.214', 'connState': 'LOGGED IN',
  'SID': '154', 'ipaddr': '221.128.52.224',
  'initiatorName': 'iqn.1996-04.de.suse:01:7c9741b545b5',
  'sessionState': 'LOGGED_IN',
  'iqn': 'iqn.1992-04.com.emc:vplex-000000008460319f-0000000000000007',
  'tag': '7', 'port': '3260'}]

for your example (same for the original regex).

It looks like it works (Python 2.7.3), so something else must be
breaking the regex.

Bye, Andreas
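
P.S. To reproduce the comparison without the full iscsiadm output, here is a minimal sketch. It is written in Python 3 syntax, not the 2.7.3 session above. It assumes the pattern is compiled with re.VERBOSE (the comments and escaped spaces suggest that, but the snipped part of the post would have to confirm it). The two-line SAMPLE is made up to resemble the snipped block, and the names SAMPLE, build and details are invented for this example, as are the connState and sessionState-only pattern and the SEP/GAP placeholders.

```python
import re

# Assumed two-line excerpt; the real iscsiadm block was snipped in the post.
SAMPLE = (
    "\t\tiSCSI Connection State: LOGGED IN\n"
    "\t\tiSCSI Session State: LOGGED_IN\n"
)


def build(sep, gap):
    """Compile the two-line pattern.

    sep: what separates the label words (e.g. r"\\ +" or r"\\s+").
    gap: what bridges the end of the first line and the start of the second.
    """
    pattern = r"""
        # Connection state
        iSCSI SEP Connection SEP State:\s+(?P<connState>\w+\s*\w*)
        GAP
        # Session state
        iSCSI SEP Session SEP State:\s+(?P<sessionState>\w+)
    """.replace("SEP", sep).replace("GAP", gap)
    # re.VERBOSE ignores unescaped whitespace and '#' comments in the pattern.
    return re.compile(pattern, re.VERBOSE)


def details(sep, gap):
    """Return the groupdict of every match in SAMPLE."""
    return [m.groupdict() for m in build(sep, gap).finditer(SAMPLE)]


if __name__ == "__main__":
    for sep in (r"\ +", r"\s+"):
        for gap in (r"[\s\S]*?", r"\s*", ""):
            print(repr(sep), repr(gap), details(sep, gap))
```

On this sample, both separators give one match with connState 'LOGGED IN' and sessionState 'LOGGED_IN' whenever the gap is [\s\S]*? or \s*, and no match when the gap is empty. If \s* really fails on the full output, the cause is probably in a part of the pattern or data not shown here.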