| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Date | Wed, 30 Sep 2015 14:20:15 -0500 |
| To | python-list@python.org |
| Subject | Re: Question about regular expression |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.276.1443641310.28679.python-list@python.org> |
On 2015-09-30 11:34, massi_srb@msn.com wrote:
> firstly the description of my problem. I have a string in the
> following form:
>
> s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..."
>
> that is a string made up of groups in the form 'name' (letters
> only) plus possibly a tuple containing 1 or 2 integer values.
> Blanks can be placed between names and tuples or not, but they
> surely are placed between two groups. I would like to process this
> string in order to get a dictionary like this:
>
> d = {
> "name1":(0, 0),
> "name2":(1, 0),
> "name3":(0, 0),
> "name4":(1, 4),
> "name5":(2, 0),
> }
>
> I guess this problem can be tackled with regular expressions, b
First out of the gate, I suggest you follow Emile's advice and try
using string expressions. However, if you *want* to do it with
regular expressions, you can. It's ugly and might be fragile, but
here's one way:
#############################################################
import re
s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..."
r = re.compile(r"""
    \b            # start at a word boundary
    (\w+)         # capture the word
    \s*           # optional whitespace
    (?:           # start an optional grouping for things in the parens
        \(        # a literal open-paren
        \s*       # optional whitespace
        (\d+)     # capture the number in those parens
        (?:       # start a second optional grouping for the stuff after a comma
            \s*   # optional whitespace
            ,     # a literal comma
            \s*   # optional whitespace
            (\d+) # the second number
        )?        # make the comma and following number optional
        \)        # a literal close-paren
    )?            # make that stuff in parens optional
    """, re.X)
d = {}
for m in r.finditer(s):
    a, b, c = m.groups()
    # groups that did not participate in the match are None, so default to 0
    d[a] = (int(b or 0), int(c or 0))
from pprint import pprint
pprint(d)
#############################################################
I'd stick with the commented version of the regexp if you were to use
this anywhere so that others can follow what you're doing.
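For comparison, a rough sketch of what the non-regex route might look like. This is an illustration only, not code posted in the thread: `parse_groups` is a made-up name, and it assumes well-formed input (balanced parens, integers only inside them) with no validation beyond that.

```python
def parse_groups(s):
    """Parse 'name', 'name(1)' or 'name (1, 4)' groups using only str methods."""
    # Pad the parens with spaces so "name2(1)" and "name4 (1, 4)" tokenize
    # the same way, and turn commas into whitespace before splitting.
    tokens = s.replace("(", " ( ").replace(")", " ) ").replace(",", " ").split()
    found = {}
    name = None
    in_parens = False
    for tok in tokens:
        if tok == "(":
            in_parens = True
        elif tok == ")":
            in_parens = False
        elif in_parens:
            # a number belonging to the most recent name
            if name is not None:
                found[name].append(int(tok))
        elif tok.isalnum():
            name = tok
            found[name] = []
        # anything else (such as the trailing "...") is ignored
    # pad each list of numbers with zeros and keep exactly two values
    return {k: tuple((v + [0, 0])[:2]) for k, v in found.items()}


s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..."
assert parse_groups(s) == {
    "name1": (0, 0),
    "name2": (1, 0),
    "name3": (0, 0),
    "name4": (1, 4),
    "name5": (2, 0),
}
```

It is longer than the regexp, but each step can be read and debugged on its own.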
-tkc