Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #41022 > unrolled thread
| Started by | mukesh tiwari <mukeshtiwari.iiitm@gmail.com> |
|---|---|
| First post | 2013-03-10 10:42 -0700 |
| Last post | 2013-03-11 16:23 -0400 |
| Articles | 15 — 8 participants |
Back to article view | Back to comp.lang.python
Regular expression problem mukesh tiwari <mukeshtiwari.iiitm@gmail.com> - 2013-03-10 10:42 -0700
Re: Regular expression problem Chris Angelico <rosuav@gmail.com> - 2013-03-11 04:59 +1100
Re: Regular expression problem mukesh tiwari <mukeshtiwari.iiitm@gmail.com> - 2013-03-10 11:05 -0700
Re: Regular expression problem mukesh tiwari <mukeshtiwari.iiitm@gmail.com> - 2013-03-10 11:05 -0700
Re: Regular expression problem Chris Angelico <rosuav@gmail.com> - 2013-03-11 05:08 +1100
Re: Regular expression problem mukesh tiwari <mukeshtiwari.iiitm@gmail.com> - 2013-03-10 11:48 -0700
Re: Regular expression problem Chris Angelico <rosuav@gmail.com> - 2013-03-11 05:57 +1100
Re: Regular expression problem mukesh tiwari <mukeshtiwari.iiitm@gmail.com> - 2013-03-10 11:48 -0700
Re: Regular expression problem Terry Reedy <tjreedy@udel.edu> - 2013-03-10 22:06 -0400
Re: Regular expression problem jmfauth <wxjmfauth@gmail.com> - 2013-03-11 02:28 -0700
Re: Regular expression problem Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-11 10:19 +0000
Re: Regular expression problem rusi <rustompmody@gmail.com> - 2013-03-11 06:18 -0700
On topic, please [Was:Re: Regular expression problem] Ned Deily <nad@acm.org> - 2013-03-11 11:13 -0700
Re: Regular expression problem Serhiy Storchaka <storchaka@gmail.com> - 2013-03-11 20:30 +0200
Re: Regular expression problem Terry Reedy <tjreedy@udel.edu> - 2013-03-11 16:23 -0400
| From | mukesh tiwari <mukeshtiwari.iiitm@gmail.com> |
|---|---|
| Date | 2013-03-10 10:42 -0700 |
| Subject | Regular expression problem |
| Message-ID | <c24ab822-a3f4-4faf-8a6d-2e1fde6552ae@googlegroups.com> |
Hello all
I am trying to solve this problem[1] using regular expression. I wrote this code but I am getting time limit exceed. Could some one please tell me how to make this code run faster.
import re
if __name__ == "__main__":
n = int ( raw_input() )
c = 1
while c <= n :
email = filter ( lambda x : x != None , [ re.search ( '[^~!@#$%^&*()<>?,.]*[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*@[a-zA-Z0-9]+.(com|edu|org|co.in)[^~!@#$%^&*()<>?,.a-zA-Z0-9]*' , x ) for x in raw_input().split(' ') ] )
t = len ( email )
print 'Case #' + str ( c ) + ': ' + str ( t )
for i in xrange ( t ):
print email[i].group()
c += 1
Also rather than breaking the string at space, I tried to use findall but I am not getting correct result.
>>> re.findall ( '[^~!@#$%^&*()<>?,.]*[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*@[a-zA-Z0-9]+.(com|edu|org|co.in)[^~!@#$%^&*()<>?,.a-zA-Z0-9]*[ ]+','mukeshtiwari.iiitmmmm@gmail.com mukeshtiwari.iiitmmmmm@gmail.com')
['com']
I am suppose to get [ mukeshtiwari.iiitmmm@gmail.com , mukeshtiwari.iiitmmm@gmail.com] ?
Regards
Mukesh Tiwari
[1] http://www.spoj.com/problems/MAIN12C/
[toc] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-03-11 04:59 +1100 |
| Message-ID | <mailman.3158.1362938407.2939.python-list@python.org> |
| In reply to | #41022 |
On Mon, Mar 11, 2013 at 4:42 AM, mukesh tiwari <mukeshtiwari.iiitm@gmail.com> wrote: > I am trying to solve this problem[1] using regular expression. I wrote this code but I am getting time limit exceed. Could some one please tell me how to make this code run faster. What is the time limit? I just tried it (Python 2.6 under Windows) and it finished in a humanly-immeasurable amount of time. Are you sure that STDIN (eg raw_input()) is where your test data is coming from? ChrisA
[toc] | [prev] | [next] | [standalone]
| From | mukesh tiwari <mukeshtiwari.iiitm@gmail.com> |
|---|---|
| Date | 2013-03-10 11:05 -0700 |
| Message-ID | <d1f38f67-ec36-480d-8c4d-2d6b2e1c1f3d@googlegroups.com> |
| In reply to | #41025 |
Hi Chris On the problem page, it is 3 second. > What is the time limit? I just tried it (Python 2.6 under Windows) and > > it finished in a humanly-immeasurable amount of time. Are you sure > > that STDIN (eg raw_input()) is where your test data is coming from? Yes, on SPOJ we read data from STDIN. Regards Mukesh Tiwari > > > > ChrisA
[toc] | [prev] | [next] | [standalone]
| From | mukesh tiwari <mukeshtiwari.iiitm@gmail.com> |
|---|---|
| Date | 2013-03-10 11:05 -0700 |
| Message-ID | <mailman.3161.1362939467.2939.python-list@python.org> |
| In reply to | #41025 |
Hi Chris On the problem page, it is 3 second. > What is the time limit? I just tried it (Python 2.6 under Windows) and > > it finished in a humanly-immeasurable amount of time. Are you sure > > that STDIN (eg raw_input()) is where your test data is coming from? Yes, on SPOJ we read data from STDIN. Regards Mukesh Tiwari > > > > ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-03-11 05:08 +1100 |
| Message-ID | <mailman.3160.1362938932.2939.python-list@python.org> |
| In reply to | #41022 |
On Mon, Mar 11, 2013 at 4:59 AM, Chris Angelico <rosuav@gmail.com> wrote:
> On Mon, Mar 11, 2013 at 4:42 AM, mukesh tiwari
> <mukeshtiwari.iiitm@gmail.com> wrote:
>> I am trying to solve this problem[1] using regular expression. I wrote this code but I am getting time limit exceed. Could some one please tell me how to make this code run faster.
>
> What is the time limit? I just tried it (Python 2.6 under Windows) and
> it finished in a humanly-immeasurable amount of time. Are you sure
> that STDIN (eg raw_input()) is where your test data is coming from?
Oops, reading comprehension fail. Time limit is 3s on a Pentium III.
I've no idea how long your code will take on that hardware, but I
doubt that it's taking three seconds. So my query regarding source of
test data still stands. Can you put together an uber-simple test
program that just echoes the lines of input, to make sure it really is
coming off stdin?
The problem description certainly does seem to imply stdin, but I
can't see why your code would take three seconds unless it's stalling
for some reason. Though perhaps on a P3 with the maximum 100 tests,
maybe that could take a while...
Something to try: Since you're using re.search(), see if you can drop
the complemented sets at the beginning [^~!@#$%^&*()<>?,.]* and end
[^~!@#$%^&*()<>?,.a-zA-Z0-9]* - they're going to be slow to process.
Also, you can simplify this:
[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*
to this:
[a-zA-Z0-9][a-zA-Z0-9._]{4,}
The brace notation means "at least 4, at most infinity".
Try those out and see if you still get the results you want.
ChrisA
[toc] | [prev] | [next] | [standalone]
| From | mukesh tiwari <mukeshtiwari.iiitm@gmail.com> |
|---|---|
| Date | 2013-03-10 11:48 -0700 |
| Message-ID | <a5a2bd74-4d7e-45c1-aad2-2f67cf9bd9fe@googlegroups.com> |
| In reply to | #41028 |
Hi Chris
Thank you! Now I am getting wrong answer so at least program is faster then previous one and I am looking for wrong answer reason. Thanks again!
import re
if __name__ == "__main__":
n = int ( raw_input() )
c = 1
while c <= n :
email = filter ( lambda x : x != None , [ re.search ( '[a-zA-Z0-9][a-zA-Z0-9._]{4,}@[a-zA-Z0-9]+.(com|edu|org|co.in)' , x ) for x in raw_input().split(' ') ] )
t = len ( email )
print 'Case #' + str ( c ) + ': ' + str ( t )
for i in xrange ( t ) :
print email[i].group()
c += 1
Regards
Mukesh Tiwari
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-03-11 05:57 +1100 |
| Message-ID | <mailman.3163.1362941836.2939.python-list@python.org> |
| In reply to | #41031 |
On Mon, Mar 11, 2013 at 5:48 AM, mukesh tiwari <mukeshtiwari.iiitm@gmail.com> wrote: > Hi Chris > Thank you! Now I am getting wrong answer so at least program is faster then previous one and I am looking for wrong answer reason. Thanks again! Excellent! Have fun. Incidentally, regular expressions aren't the only way to solve this sort of problem. If you get stuck with one method, it may be worth trying another one, to see if you can get around the issue. As they say, now you have two problems... ChrisA
[toc] | [prev] | [next] | [standalone]
| From | mukesh tiwari <mukeshtiwari.iiitm@gmail.com> |
|---|---|
| Date | 2013-03-10 11:48 -0700 |
| Message-ID | <mailman.3162.1362941307.2939.python-list@python.org> |
| In reply to | #41028 |
Hi Chris
Thank you! Now I am getting wrong answer so at least program is faster then previous one and I am looking for wrong answer reason. Thanks again!
import re
if __name__ == "__main__":
n = int ( raw_input() )
c = 1
while c <= n :
email = filter ( lambda x : x != None , [ re.search ( '[a-zA-Z0-9][a-zA-Z0-9._]{4,}@[a-zA-Z0-9]+.(com|edu|org|co.in)' , x ) for x in raw_input().split(' ') ] )
t = len ( email )
print 'Case #' + str ( c ) + ': ' + str ( t )
for i in xrange ( t ) :
print email[i].group()
c += 1
Regards
Mukesh Tiwari
[toc] | [prev] | [next] | [standalone]
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2013-03-10 22:06 -0400 |
| Message-ID | <mailman.3174.1362967600.2939.python-list@python.org> |
| In reply to | #41022 |
On 3/10/2013 1:42 PM, mukesh tiwari wrote: > Hello all > I am trying to solve this problem[1] > [1] http://www.spoj.com/problems/MAIN12C/ As I remember, and as it still appears, this site severely penalizes Python solvers by using the same time limit for all languages. Thus, a 'slow' python program may work correctly but the site will not let you know. A test that refuses to answer is no test at all. In the meanwhile, an algorithmically equivalent C program will be run and judged correct, so the programmer can try to speed up while not losing correctness. By teaching 'speed before correctness", this site promotes bad programming habits and thinking (and the use of low-level but faster languages). I quote your later response: "Now I am getting wrong answer so at least program is faster then previous one". If the previous one was correct and the revision wrong, you should toss the revision and go back to the correct program. I recommend that you work on problems where you have tests that you can actually run even before you code. -- Terry Jan Reedy
[toc] | [prev] | [next] | [standalone]
| From | jmfauth <wxjmfauth@gmail.com> |
|---|---|
| Date | 2013-03-11 02:28 -0700 |
| Message-ID | <d5432f08-2fc4-4e1c-98e5-6f637e4d1c5d@w3g2000vba.googlegroups.com> |
| In reply to | #41049 |
On 11 mar, 03:06, Terry Reedy <tjre...@udel.edu> wrote: > > ... > By teaching 'speed before correctness", this site promotes bad > programming habits and thinking (and the use of low-level but faster > languages). > ... This is exactly what "your" flexible string representation does! And away from technical aspects, you even succeeded to somehow lose unicode compliance. jmf
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2013-03-11 10:19 +0000 |
| Message-ID | <mailman.3177.1362997176.2939.python-list@python.org> |
| In reply to | #41056 |
On 11/03/2013 09:28, jmfauth wrote: > On 11 mar, 03:06, Terry Reedy <tjre...@udel.edu> wrote: > >> >> ... >> By teaching 'speed before correctness", this site promotes bad >> programming habits and thinking (and the use of low-level but faster >> languages). >> ... > > > This is exactly what "your" flexible string representation > does! > > And away from technical aspects, you even succeeded to > somehow lose unicode compliance. > > jmf > Please stick to something you know about such as sexual self abuse. -- Cheers. Mark Lawrence
[toc] | [prev] | [next] | [standalone]
| From | rusi <rustompmody@gmail.com> |
|---|---|
| Date | 2013-03-11 06:18 -0700 |
| Message-ID | <160bcdfd-2400-4d51-8431-93f47d962149@vh9g2000pbb.googlegroups.com> |
| In reply to | #41056 |
On Mar 11, 2:28 pm, jmfauth <wxjmfa...@gmail.com> wrote: > On 11 mar, 03:06, Terry Reedy <tjre...@udel.edu> wrote: > > > > > ... > > By teaching 'speed before correctness", this site promotes bad > > programming habits and thinking (and the use of low-level but faster > > languages). > > ... > > This is exactly what "your" flexible string representation > does! This is an old complaint of your with no new data for supporting it > > And away from technical aspects, you even succeeded to > somehow lose unicode compliance. This is a new complaint. Just to make it clear: 1. All your recent complaints about unicode were in the realm of performance So your complaint that python has lost unicode compliance can mean one of: 2a. The unicode standard mandates performance criteria or 2b. There are problems with python's implementation (of strings?) that have functional problems apart from your old performance complaints or 2c. You need to look up what 'compliance' means My own choice is to have a mid-point between Very early binding: Narrow vs wide builds Very late binding: String reps can change with the characters as they are seen This mid point would be perhaps a commandline switch to choose string- engine attributes. However to make this choice even worth a second look we need to have hard data about performance that you have been unable to provide. [See the recent thread of RoR vs Django to see the problems of excessive spurious choice]
[toc] | [prev] | [next] | [standalone]
| From | Ned Deily <nad@acm.org> |
|---|---|
| Date | 2013-03-11 11:13 -0700 |
| Subject | On topic, please [Was:Re: Regular expression problem] |
| Message-ID | <mailman.3201.1363025617.2939.python-list@python.org> |
| In reply to | #41056 |
A friendly reminder that this forum is for general discussion and questions about Python. "Pretty much anything Python-related is fair game for discussion, and the group is even fairly tolerant of off-topic digressions; there have been entertaining discussions of topics such as floating point, good software design, and other programming languages such as Lisp and Forth." But ... "Rudeness and personal attacks, even in reaction to blatant flamebait, are strongly frowned upon. People may strongly disagree on an issue, but usually discussion remains civil. In case of an actual flamebait posting, you can ignore it, quietly plonk the offending poster in your killfile or mail filters, or write a sharp but still-polite response, but at all costs resist the urge to flame back." http://www.python.org/community/lists/ It's up to all of us to help keep this group/list a place where people enjoy participating, without fear of gratuitous personal sniping. Thanks! -- Ned Deily, nad@acm.org
[toc] | [prev] | [next] | [standalone]
| From | Serhiy Storchaka <storchaka@gmail.com> |
|---|---|
| Date | 2013-03-11 20:30 +0200 |
| Message-ID | <mailman.3202.1363026660.2939.python-list@python.org> |
| In reply to | #41022 |
On 11.03.13 04:06, Terry Reedy wrote: > On 3/10/2013 1:42 PM, mukesh tiwari wrote: >> Hello all >> I am trying to solve this problem[1] >> [1] http://www.spoj.com/problems/MAIN12C/ > > As I remember, and as it still appears, this site severely penalizes > Python solvers by using the same time limit for all languages. Thus, a > 'slow' python program may work correctly but the site will not let you > know. I'm sure the time limits are enough to solve most (if not all) of problems. Actually all submitted solutions on Python for this problem run from 0.47 to 0.61 seconds (http://www.spoj.com/ranks/MAIN12C/).
[toc] | [prev] | [next] | [standalone]
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2013-03-11 16:23 -0400 |
| Message-ID | <mailman.3206.1363033451.2939.python-list@python.org> |
| In reply to | #41022 |
On 3/11/2013 2:30 PM, Serhiy Storchaka wrote: > On 11.03.13 04:06, Terry Reedy wrote: >> On 3/10/2013 1:42 PM, mukesh tiwari wrote: >>> Hello all >>> I am trying to solve this problem[1] >>> [1] http://www.spoj.com/problems/MAIN12C/ >> >> As I remember, and as it still appears, this site severely penalizes >> Python solvers by using the same time limit for all languages. Thus, a >> 'slow' python program may work correctly but the site will not let you >> know. > > I'm sure the time limits are enough to solve most (if not all) of > problems. Actually all submitted solutions on Python for this problem > run from 0.47 to 0.61 seconds (http://www.spoj.com/ranks/MAIN12C/). You do not see the solutions that timed out. I suppose you are pointing to the fact that for this problem there are solutions close to but under the time limit. However, algorithm running times are not evenly distributed. Suppose, for instance) there is a correct O(n**2) solution and a correct O(n) solution and that the ones listed are the O(n) solutions. Then the Python O(n**2) solutions could easily take 10x longer to run and time out, while equivalent C solutions do not. Mukesh is not the first to post here a reasonable looking solution for that site that he could not judge because the test quite and refused to answer. I point out again that he was 'happy' to have a faster but incorrect program, even though it might have been a regression from his original. -- Terry Jan Reedy
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.python
csiph-web