| Path | csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!nntp.club.cc.cmu.edu!195.208.113.1.MISMATCH!goblin3!goblin2!goblin.stu.neva.ru!newsfeed.xs4all.nl!newsfeed4.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail |
|---|---|
| Date | Fri, 15 Feb 2013 08:32:14 +0800 (CST) |
| From | python <mailtomanage@163.com> |
| To | python-list@python.org |
| Subject | Re:Re: how to right the regular expression ? |
| In-Reply-To | <511D062E.4080101@mrabarnett.plus.com> |
| References | <227f6014.405c.13cd90da3d4.Coremail.mailtomanage@163.com> <511D062E.4080101@mrabarnett.plus.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.1791.1360889265.2939.python-list@python.org> (permalink) |
The regex pat = r'([a-z].+?\s)(.+)(?:(\(.+\)))?' does not work at all.
>>> rfile.close()
>>> import re
>>> rfile=open("tv.txt","r")
>>> pat1 = r'([a-z].+?\s)(.+)((\(.+\)))?'
>>> for line in rfile.readlines():
...     Match=re.match(pat1,line)
...     print "1group is ",Match.group(1),"2group is ",Match.group(2),"3group is ",Match.group(3)
...
1group is http://202.177.192.119/radio5 2group is 香港电台第五台(可于Totem/VLC/MPlayer播放) 3group is None
1group is http://202.177.192.119/radio35 2group is 香港电台第五台(DAB版,可于Totem/VLC/MPlayer播放) 3group is None
1group is http://202.177.192.119/radiopth 2group is 香港电台普通话台(可于Totem/VLC/MPlayer播放) 3group is None
1group is http://202.177.192.119/radio31 2group is 香港电台普通话台(DAB版,可于Totem/VLC/MPlayer播放) 3group is None
1group is octoshape:rthk.ch1 2group is 香港电台第一台(粤) 3group is None
1group is octoshape:rthk.ch2 2group is 香港电台第二台(粤) 3group is None
1group is octoshape:rthk.ch6 2group is 香港电台普通话台 3group is None
1group is octoshape:rthk.ch3 2group is 香港电台第三台(英) 3group is None
>>> rfile.close()
>>> import re
>>> rfile=open("tv.txt","r")
>>> pat2 = r'([a-z].+?\s)(.+?)((\(.+\)))?'
>>> for line in rfile.readlines():
...     Match=re.match(pat1,line)
...     print "1group is ",Match.group(1),"2group is ",Match.group(2),"3group is ",Match.group(3)
...
1group is http://202.177.192.119/radio5 2group is 香港电台第五台(可于Totem/VLC/MPlayer播放) 3group is None
1group is http://202.177.192.119/radio35 2group is 香港电台第五台(DAB版,可于Totem/VLC/MPlayer播放) 3group is None
1group is http://202.177.192.119/radiopth 2group is 香港电台普通话台(可于Totem/VLC/MPlayer播放) 3group is None
1group is http://202.177.192.119/radio31 2group is 香港电台普通话台(DAB版,可于Totem/VLC/MPlayer播放) 3group is None
1group is octoshape:rthk.ch1 2group is 香港电台第一台(粤) 3group is None
1group is octoshape:rthk.ch2 2group is 香港电台第二台(粤) 3group is None
1group is octoshape:rthk.ch6 2group is 香港电台普通话台 3group is None
1group is octoshape:rthk.ch3 2group is 香港电台第三台(英) 3group is None
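
In both sessions above group 3 comes back as None because the greedy `(.+)` in group 2 runs to the end of the line, and the optional group that follows is then allowed to match nothing. Note also that the second loop still passes `pat1` to `re.match`, so `pat2` was never actually tested; a lazy `(.+?)` would not help either, since it can stop after a single character and the optional group is then skipped. The following is a minimal sketch, not the fix proposed in the thread: it assumes Python 3, and the names `greedy`, `lazy`, `fixed` and `split_line` are hypothetical. The idea is to forbid `(` inside group 2 so that it has to stop where the parenthesized note begins.

```python
import re

# Sample lines copied from the tv.txt listing quoted in this message.
lines = [
    "http://202.177.192.119/radio5 香港电台第五台(可于Totem/VLC/MPlayer播放)",
    "octoshape:rthk.ch1 香港电台第一台(粤)",
    "octoshape:rthk.ch6 香港电台普通话台",
]

# Pattern suggested in the thread: greedy (.+) swallows the parentheses,
# so the optional third group always matches nothing.
greedy = re.compile(r'([a-z].+?\s)(.+)(?:(\(.+\)))?')

# pat2 from the second session: lazy (.+?) stops after one character.
lazy = re.compile(r'([a-z].+?\s)(.+?)((\(.+\)))?')

# Hypothetical alternative: group 2 may not contain "(".
fixed = re.compile(r'(\S+)\s+([^(]+)(\(.+\))?')


def split_line(pattern, line):
    """Return the tuple of captured groups, or None if the line does not match."""
    m = pattern.match(line)
    return m.groups() if m else None


for line in lines:
    print(split_line(greedy, line))
    print(split_line(lazy, line))
    print(split_line(fixed, line))
```

When the lines come from `rfile.readlines()`, each one still ends in a newline, which `[^(]+` would happily capture, so calling `line.strip()` before matching is probably wanted.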
At 2013-02-14 23:43:42, MRAB <python@mrabarnett.plus.com> wrote:
>On 2013-02-14 14:13, python wrote:
>> my tv.txt is :
>> http://202.177.192.119/radio5 香港电台第五台(可于Totem/VLC/MPlayer播放)
>> http://202.177.192.119/radio35 香港电台第五台(DAB版,可于Totem/VLC/MPlayer播放)
>> http://202.177.192.119/radiopth 香港电台普通话台(可于Totem/VLC/MPlayer播放)
>> http://202.177.192.119/radio31 香港电台普通话台(DAB版,可于Totem/VLC/MPlayer播放)
>> octoshape:rthk.ch1 香港电台第一台(粤)
>> octoshape:rthk.ch2 香港电台第二台(粤)
>> octoshape:rthk.ch6 香港电台普通话台
>> octoshape:rthk.ch3 香港电台第三台(英)
>>
>> The result I want to get is:
>> 1group is http://202.177.192.119/radio5 2group is 香港电台第五台 3group is (可于Totem/VLC/MPlayer播放)
>> 1group is http://202.177.192.119/radio35 2group is 香港电台第五台 3group is (DAB版,可于Totem/VLC/MPlayer播放)
>> 1group is http://202.177.192.119/radiopth 2group is 香港电台普通话台 3group is (可于Totem/VLC/MPlayer播放)
>> 1group is http://202.177.192.119/radio31 2group is 香港电台普通话台 3group is (DAB版,可于Totem/VLC/MPlayer播放)
>> 1group is octoshape:rthk.ch1 2group is 香港电台第一台 3group is (粤)
>> 1group is octoshape:rthk.ch2 2group is 香港电台第二台 3group is (粤)
>> 1group is octoshape:rthk.ch6 2group is 香港电台普通话台 3group is none
>> 1group is octoshape:rthk.ch3 2group is 香港电台第三台 3group is (英)
>>
>> here is my code:
>> # -*- coding: utf-8 -*-
>> import re
>> rfile=open("tv.txt","r")
>> pat=r'([a-z].+?\s)(.+)(\(.+\))'
>> for line in rfile.readlines():
>>     Match=re.match(pat,line)
>>     print "1group is ",Match.group(1),"2group is ",Match.group(2),"3group is ",Match.group(3)
>> rfile.close()
>>
>> the output is :
>> 1group is http://202.177.192.119/radio5 2group is 香港电台第五台 3group is (可于Totem/VLC/MPlayer播放)
>> 1group is http://202.177.192.119/radio35 2group is 香港电台第五台 3group is (DAB版,可于Totem/VLC/MPlayer播放)
>> 1group is http://202.177.192.119/radiopth 2group is 香港电台普通话台 3group is (可于Totem/VLC/MPlayer播放)
>> 1group is http://202.177.192.119/radio31 2group is 香港电台普通话台 3group is (DAB版,可于Totem/VLC/MPlayer播放)
>> 1group is octoshape:rthk.ch1 2group is 香港电台第一台 3group is (粤)
>> 1group is octoshape:rthk.ch2 2group is 香港电台第二台 3group is (粤)
>> 1group is
>> Traceback (most recent call last):
>> File "tv.py", line 7, in <module>
>> print "1group is ",Match.group(1),"2group is ",Match.group(2),"3group is ",Match.group(3)
>> AttributeError: 'NoneType' object has no attribute 'group'
>>
>> How should I revise my code to get the output I want?
>>
>The problem is that the regex makes '(\(.+\))' mandatory, but example 7
>doesn't match it.
>
>You can make it optional by wrapping it in a non-capturing group
>(?:...), like this:
>
>pat = r'([a-z].+?\s)(.+)(?:(\(.+\)))?'
>
>Also, it's highly recommended that you use raw string literals
>(r'...') when writing regex patterns and replacements.
>
>--
>http://mail.python.org/mailman/listinfo/python-list
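
The two points in the quoted reply can be illustrated with a short sketch, assuming Python 3 (the original code is Python 2); the helper name `describe` is hypothetical. With the mandatory `(\(.+\))`, `re.match` returns None for the seventh line of tv.txt, and calling `.group()` on None is what raises the AttributeError in the traceback. The last lines show why raw strings matter: without the `r` prefix, `'\b'` is a backspace character rather than the regex word-boundary escape.

```python
import re

# Original pattern from the quoted code: the parenthesized part is mandatory.
mandatory = re.compile(r'([a-z].+?\s)(.+)(\(.+\))')

# Seventh line of tv.txt, which has no parenthesized note.
line = "octoshape:rthk.ch6 香港电台普通话台"


def describe(pattern, text):
    """Format the three groups, or report a non-match instead of raising."""
    m = pattern.match(text)
    if m is None:
        return "no match: " + text
    return "1group is %s 2group is %s 3group is %s" % m.groups()


print(describe(mandatory, line))
print(describe(mandatory, "octoshape:rthk.ch1 香港电台第一台(粤)"))

# Raw versus ordinary string literals for the same-looking pattern.
print(len(r'\b'), len('\b'))            # two characters versus one
print(re.search(r'\bch6\b', line))      # word boundaries: finds "ch6"
print(re.search('\bch6\b', line))       # literal backspaces: no match
```

Checking for None before calling `.group()` keeps one unexpected line from aborting the whole loop, whichever pattern ends up being used.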