Groups > comp.lang.python > #98121 > unrolled thread

Regular expressions

Started by: Seymore4Head <Seymore4Head@Hotmail.invalid>
First post: 2015-11-02 20:09 -0500
Last post: 2015-11-03 22:15 +0000
Articles 20 on this page of 106 — 30 participants



Contents

  Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 20:09 -0500
    Re: Regular expressions MRAB <python@mrabarnett.plus.com> - 2015-11-03 01:19 +0000
      Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 22:17 -0500
    Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-02 20:42 -0600
      Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 22:17 -0500
        Re: Regular expressions Joel Goldstick <joel.goldstick@gmail.com> - 2015-11-02 22:58 -0500
          Re: Regular expressions rurpy@yahoo.com - 2015-11-02 20:23 -0800
            Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 21:38 -0700
              Re: Regular expressions rurpy@yahoo.com - 2015-11-03 16:33 -0800
                Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-03 19:04 -0700
                  Re: Regular expressions Dan Sommers <dan@tombstonezero.net> - 2015-11-04 02:55 +0000
                    Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:23 +1100
                      Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-03 20:47 -0700
                        Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-04 13:27 +0000
                      Re: Regular expressions Nobody <nobody@nowhere.invalid> - 2015-11-04 05:05 +0000
                      Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-04 09:57 +0100
                        Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:28 +1100
                          Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 20:48 -0600
                          Re: Regular expressions Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 14:03 +1100
                          Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-05 09:33 +0100
                            Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 23:05 +1100
                              Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-05 08:00 -0600
                          Re: Regular expressions Albert van der Horst <albert@spenarnc.xs4all.nl> - 2015-11-05 13:39 +0000
                      Re: Regular expressions Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-04 08:00 -0500
                      Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-04 08:13 -0700
                        Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:00 -0500
                          Re: Regular expressions rurpy@yahoo.com - 2015-11-04 16:24 -0800
                            Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:24 +1100
                              Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:59 -0800
                                Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-05 09:18 +0100
                                  Re: Regular expressions rurpy@yahoo.com - 2015-11-06 11:52 -0800
                                    Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-06 21:36 +0100
                                      Re: Regular expressions Larry Martell <larry.martell@gmail.com> - 2015-11-06 15:42 -0500
                            Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:34 +1100
                              Re: Regular expressions rurpy@yahoo.com - 2015-11-04 22:27 -0800
                      Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 09:42 -0600
                        Re: Regular expressions Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-11-05 20:55 +1300
                          Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 19:06 +1100
                      What does “grep” stand for? (was: Regular expressions) Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 05:24 +1100
                        Re: What does “grep” stand for? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-04 20:38 +0100
                          Re: What does “grep” stand for? Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:42 +1100
                            Re: What does “grep” stand for? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-05 08:32 +0100
                              Re: What does “grep” stand for? Chris Angelico <rosuav@gmail.com> - 2015-11-05 19:00 +1100
                          Re: What does “grep” stand for? Random832 <random832@fastmail.com> - 2015-11-05 10:19 -0500
                            Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-05 18:29 +0000
                              Re: What does “grep” stand for? Random832 <random832@fastmail.com> - 2015-11-05 14:56 -0500
                                Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-05 20:19 +0000
                                  Re: What does “grep” stand for? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-05 20:18 -0500
                                    Re: What does “grep” stand for? Larry Hudson <orgnut@yahoo.com> - 2015-11-05 19:36 -0800
                                      Re: What does “grep” stand for? Dan Sommers <dan@tombstonezero.net> - 2015-11-06 05:31 +0000
                                      Re: What does “grep” stand for? William Ray Wing <wrw@mac.com> - 2015-11-06 08:25 -0500
                                        Re: What does “grep” stand for? Larry Hudson <orgnut@yahoo.com> - 2015-11-06 19:21 -0800
                                    Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-06 14:15 +0000
                                      Re: What does “grep” stand for? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-06 20:03 -0500
                      Re: What does “grep” stand for? (was: Regular expressions) Tim Chase <python.list@tim.thechases.com> - 2015-11-04 13:05 -0600
                      Re: Regular expressions Terry Reedy <tjreedy@udel.edu> - 2015-11-04 18:08 -0500
                        Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:29 -0500
                Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 21:12 -0600
                Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-04 14:26 +1100
                Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:48 +1100
                  Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-04 08:21 +0100
                    Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 19:47 +1100
                      Re: Regular expressions rurpy@yahoo.com - 2015-11-04 06:43 -0800
                  Re: Regular expressions rurpy@yahoo.com - 2015-11-04 06:38 -0800
                    Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 01:52 +1100
                      Re: Regular expressions rurpy@yahoo.com - 2015-11-04 16:13 -0800
                        Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:33 +1100
                          Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:42 -0800
                        Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:26 +1100
                          Re: Regular expressions Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 14:07 +1100
                          Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:54 -0800
                        Re: Regular expressions Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2015-11-05 10:14 +0100
                  Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:02 -0500
                    Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 11:54 +1100
                      Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-05 10:07 -0500
                        Re: Regular expressions rurpy@yahoo.com - 2015-11-06 12:46 -0800
            Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-03 18:15 +1100
              Re: Regular expressions Nick Sarbicki <nick.a.sarbicki@gmail.com> - 2015-11-03 08:43 +0000
              Re: Regular expressions rurpy@yahoo.com - 2015-11-03 16:22 -0800
        Re: Regular expressions Denis McMahon <denismfmcmahon@gmail.com> - 2015-11-03 12:38 +0000
        Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 05:53 -0600
        Re: Regular expressions Joel Goldstick <joel.goldstick@gmail.com> - 2015-11-03 10:34 -0500
          Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-03 11:10 -0500
            Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-04 03:20 +1100
              Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:35 +1100
                Re: Regular expressions Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2015-11-04 12:41 +0100
      Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-03 14:56 +0000
    Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 20:51 -0700
      Re: Regular expressions rurpy@yahoo.com - 2015-11-02 20:23 -0800
        Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 21:33 -0700
        Re: Regular expressions Robin Koch <robin.koch@t-online.de> - 2015-11-03 23:58 +0100
    Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 10:25 +0100
    Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 05:50 -0600
    Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 15:00 +0100
      Re: Regular expressions Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-11-03 17:12 +0200
        Irregular last line in a text file, was Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 16:35 +0100
          Re: Irregular last line in a text file, was Re: Regular expressions Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-11-03 18:42 +0200
        Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 10:56 -0600
          Re: Irregular last line in a text file, was Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:39 +1100
            Re: Irregular last line in a text file, was Re: Regular expressions Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-11-04 10:07 +0000
            Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 09:33 -0600
        Re: Irregular last line in a text file, was Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 18:44 +0100
        Re: Irregular last line in a text file, was Re: Regular expressions Ian Kelly <ian.g.kelly@gmail.com> - 2015-11-03 11:33 -0700
        Re: Irregular last line in a text file, was Re: Regular expressions Ian Kelly <ian.g.kelly@gmail.com> - 2015-11-03 11:39 -0700
        Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 13:45 -0600
          Re: Irregular last line in a text file, was Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-03 22:15 +0000



#98121 — Regular expressions

From: Seymore4Head <Seymore4Head@Hotmail.invalid>
Date: 2015-11-02 20:09 -0500
Subject: Regular expressions
Message-ID: <662g3blobme52hfoududj27err185v2npm@4ax.com>
How do I make a regular expression that returns true if the end of the
line is an asterisk


#98122

From: MRAB <python@mrabarnett.plus.com>
Date: 2015-11-03 01:19 +0000
Message-ID: <mailman.76.1446513578.4463.python-list@python.org>
In reply to: #98121
On 2015-11-03 01:09, Seymore4Head wrote:
> How do I make a regular expression that returns true if the end of the
> line is an asterisk
>
To match an asterisk: \*

To match the end of a line: $

To match an asterisk at the end of a line: \*$
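
A minimal sketch of how that pattern might be used from Python, assuming the standard `re` module; the helper name `ends_with_asterisk` is made up for illustration. Note that in Python `$` also matches just before a newline at the very end of the string, so lines read from a file still work.

```python
import re

# The pattern from the post above: a literal asterisk, then end of line.
ASTERISK_AT_END = re.compile(r"\*$")

def ends_with_asterisk(line):
    """Return True if the line ends with an asterisk (hypothetical helper)."""
    return ASTERISK_AT_END.search(line) is not None

print(ends_with_asterisk("total*"))    # asterisk is the last character
print(ends_with_asterisk("total*\n"))  # $ also matches before a final newline
print(ends_with_asterisk("to*tal"))    # asterisk is not at the end
```

`search` is used rather than `match` because `match` only tries the pattern at the start of the string.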


#98125

From: Seymore4Head <Seymore4Head@Hotmail.invalid>
Date: 2015-11-02 22:17 -0500
Message-ID: <1r9g3bd89q1mp36l7bioasgmsrhsb8c7fg@4ax.com>
In reply to: #98122
On Tue, 3 Nov 2015 01:19:34 +0000, MRAB <python@mrabarnett.plus.com>
wrote:

>On 2015-11-03 01:09, Seymore4Head wrote:
>> How do I make a regular expression that returns true if the end of the
>> line is an asterisk
>>
>To match an asterisk: \*
>
>To match the end of a line: $
>
>To match an asterisk at the end of a line: \*$

Thanks


#98123

From: Tim Chase <python.list@tim.thechases.com>
Date: 2015-11-02 20:42 -0600
Message-ID: <mailman.0.1446519578.8789.python-list@python.org>
In reply to: #98121
On 2015-11-02 20:09, Seymore4Head wrote:
> How do I make a regular expression that returns true if the end of
> the line is an asterisk

Why use a regular expression?

  if line[-1] == '*':
    yep(line)
  else:
    nope(line)

-tkc
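
One caveat with indexing: `line[-1]` raises IndexError on an empty string, and a line read from a file usually still carries its trailing newline. A hedged variant of the same idea, using `str.endswith` (the function name `ends_with_asterisk` is only illustrative):

```python
def ends_with_asterisk(line):
    # rstrip("\n") drops the line terminator that file iteration leaves on;
    # str.endswith returns False for an empty string instead of raising.
    return line.rstrip("\n").endswith("*")

for sample in ["yes*", "yes*\n", "no", ""]:
    print(repr(sample), ends_with_asterisk(sample))
```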



#98124

From: Seymore4Head <Seymore4Head@Hotmail.invalid>
Date: 2015-11-02 22:17 -0500
Message-ID: <hp9g3b9hsn06edb0po8bduegjqkmpo4p8n@4ax.com>
In reply to: #98123
On Mon, 2 Nov 2015 20:42:37 -0600, Tim Chase
<python.list@tim.thechases.com> wrote:

>On 2015-11-02 20:09, Seymore4Head wrote:
>> How do I make a regular expression that returns true if the end of
>> the line is an asterisk
>
>Why use a regular expression?
>
>  if line[-1] == '*':
>    yep(line)
>  else:
>    nope(line)
>
>-tkc
>
>
Because that is the part of Python I am trying to learn at the moment.
Thanks


#98128

From: Joel Goldstick <joel.goldstick@gmail.com>
Date: 2015-11-02 22:58 -0500
Message-ID: <mailman.3.1446523111.8789.python-list@python.org>
In reply to: #98124
On Mon, Nov 2, 2015 at 10:17 PM, Seymore4Head <Seymore4Head@hotmail.invalid>
wrote:

> On Mon, 2 Nov 2015 20:42:37 -0600, Tim Chase
> <python.list@tim.thechases.com> wrote:
>
> >On 2015-11-02 20:09, Seymore4Head wrote:
> >> How do I make a regular expression that returns true if the end of
> >> the line is an asterisk
> >
> >Why use a regular expression?
> >
> >  if line[-1] == '*':
> >    yep(line)
> >  else:
> >    nope(line)
> >
> >-tkc
> >
> >
> Because that is the part of Python I am trying to learn at the moment.
> Thanks

My completely unsolicited advice is that regular expressions shouldn't be
very high on the list of things to learn.  They are very useful, but also
tricky and error-prone, and many problems can and should be solved
with much simpler methods.  If you really want to learn regular
expressions, that's great, but the problem you posed is not one for which
they are the best solution.  Remember, simple is better than complex.

-- 
Joel Goldstick
http://joelgoldstick.com/stats/birthdays


#98130

From: rurpy@yahoo.com
Date: 2015-11-02 20:23 -0800
Message-ID: <d39290cf-cb26-470f-a987-2f71e3860f97@googlegroups.com>
In reply to: #98128
On Monday, November 2, 2015 at 8:58:45 PM UTC-7, Joel Goldstick wrote:
> On Mon, Nov 2, 2015 at 10:17 PM, Seymore4Head <Seymore4Head@hotmail.invalid>
> wrote:
> 
> > On Mon, 2 Nov 2015 20:42:37 -0600, Tim Chase
> > <python.list@tim.thechases.com> wrote:
> >
> > >On 2015-11-02 20:09, Seymore4Head wrote:
> > >> How do I make a regular expression that returns true if the end of
> > >> the line is an asterisk
> > >
> > >Why use a regular expression?
> > >
> > >  if line[-1] == '*':
> > >    yep(line)
> > >  else:
> > >    nope(line)
> > >
> > >-tkc
> > >
> > >
> > Because that is the part of Python I am trying to learn at the moment.
> > Thanks
> 
> My completely unsolicited advice is that regular expressions shouldn't be
> very high on the list of things to learn.  They are very useful, but also
> tricky and error-prone, and many problems can and should be solved
> with much simpler methods.  If you really want to learn regular
> expressions, that's great, but the problem you posed is not one for which
> they are the best solution.  Remember, simple is better than complex.

Regular expressions should be learned by every programmer or by anyone
who wants to use computers as a tool.  They are a fundamental part of
computer science and are used in all sorts of matching and searching 
from compilers down to your work-a-day text editor.

Not knowing how to use them is like an auto mechanic not knowing how to 
use a socket wrench.


#98132

From: Michael Torrie <torriem@gmail.com>
Date: 2015-11-02 21:38 -0700
Message-ID: <mailman.5.1446525488.8789.python-list@python.org>
In reply to: #98130
On 11/02/2015 09:23 PM, rurpy--- via Python-list wrote:
>> My completely unsolicited advice is that regular expressions shouldn't be
>> very high on the list of things to learn.  They are very useful, but also
>> tricky and error-prone, and many problems can and should be solved
>> with much simpler methods.  If you really want to learn regular
>> expressions, that's great, but the problem you posed is not one for which
>> they are the best solution.  Remember, simple is better than complex.
> 
> Regular expressions should be learned by every programmer or by anyone
> who wants to use computers as a tool.  They are a fundamental part of
> computer science and are used in all sorts of matching and searching 
> from compilers down to your work-a-day text editor.
> 
> Not knowing how to use them is like an auto mechanic not knowing how to 
> use a socket wrench.

Not quite.  Core language concepts like ifs, loops, functions,
variables, slicing, etc are the socket wrenches of the programmer's
toolbox.  Regexs are like an electric impact socket wrench.  You can do
the same work without it, but in many cases it's slower. But you have to
learn the other hand tools first in order to really use the electric
driver properly (understanding torques, direction of threads, etc), lest
you wonder why you're breaking off so many bolts with the torque of the
impact drive.


#98195

From: rurpy@yahoo.com
Date: 2015-11-03 16:33 -0800
Message-ID: <bb15756d-7181-421d-835e-b2fbfc1c1774@googlegroups.com>
In reply to: #98132
On Monday, November 2, 2015 at 9:38:24 PM UTC-7, Michael Torrie wrote:
> On 11/02/2015 09:23 PM, rurpy--- via Python-list wrote:
> >> My completely unsolicited advice is that regular expressions shouldn't be
> >> very high on the list of things to learn.  They are very useful, but also
> >> tricky and error-prone, and many problems can and should be solved
> >> with much simpler methods.  If you really want to learn regular
> >> expressions, that's great, but the problem you posed is not one for which
> >> they are the best solution.  Remember, simple is better than complex.
> > 
> > Regular expressions should be learned by every programmer or by anyone
> > who wants to use computers as a tool.  They are a fundamental part of
> > computer science and are used in all sorts of matching and searching 
> > from compilers down to your work-a-day text editor.
> > 
> > Not knowing how to use them is like an auto mechanic not knowing how to 
> > use a socket wrench.
> 
> Not quite.  Core language concepts like ifs, loops, functions,
> variables, slicing, etc are the socket wrenches of the programmer's
> toolbox.  Regexs are like an electric impact socket wrench.  You can do
> the same work without it, but in many cases it's slower. But you have to
> learn the other hand tools first in order to really use the electric
> driver properly (understanding torques, direction of threads, etc), lest
> you wonder why you're breaking off so many bolts with the torque of the
> impact drive.

I consider regexs more fundamental.  One need not even be a programmer
to use them: consider grep, sed, a zillion editors, database query
languages, etc.

When there is a mini-language explicitly developed for describing
string patterns, why, except in very simple cases, would one not
take advantage of it?  Beyond trivial operations a regex, although
terse (overly perhaps), is still likely to be more understandable and
more maintainable than a bunch of ad-hoc code.  And the relative ease
of expressing complex patterns means one is more likely to create
more specific patterns, resulting in detecting unexpected input
earlier than with ad-hoc code.


#98198

From: Michael Torrie <torriem@gmail.com>
Date: 2015-11-03 19:04 -0700
Message-ID: <mailman.0.1446602668.16136.python-list@python.org>
In reply to: #98195
On 11/03/2015 05:33 PM, rurpy--- via Python-list wrote:
> I consider regexs more fundamental.  One need not even be a programmer
> to use them: consider grep, sed, a zillion editors, database query 
> languages, etc.

Grep can use regular expressions (and I do so with it regularly), but
its default mode is certainly not regular expressions, and it is still
very powerful.  I've never used regular expressions in a database query
language; until this moment I didn't know any supported such things in
their queries.  Good to know.  How you would index for regular
expressions in queries I don't know.

> When there is a mini-language explicitly developed for describing
> string patterns, why, except in very simple cases, would one not
> take advantage of it?  

Mainly because the programming language itself often can do it just as
cleanly and just as fast (slicing, string methods, etc).  I certainly
programmed for many years without needing regular expressions in my
small projects.  In fact, REs are a bit of a pain to use in, say, C or
C++, requiring a library.  With Python they are much more readily
accessible so I use them much more.

But honestly it wasn't until college when I learned about finite state
automata that I really grasped what regular expressions were and how to
use them.

> Beyond trivial operations a regex, although
> terse (overly perhaps), is still likely to be more understandable and
> more maintainable than a bunch of ad-hoc code.  And the relative ease
> of expressing complex patterns means one is more likely to create
> more specific patterns, resulting in detecting unexpected input 
> earlier than with ad-hoc code. 

Maybe, maybe not.  Using Python string class methods is probably more
clear when such methods are sufficient.


#98199

From: Dan Sommers <dan@tombstonezero.net>
Date: 2015-11-04 02:55 +0000
Message-ID: <n1bs2g$376$1@dont-email.me>
In reply to: #98198
On Tue, 03 Nov 2015 19:04:23 -0700, Michael Torrie wrote:

> On 11/03/2015 05:33 PM, rurpy--- via Python-list wrote:
>> I consider regexs more fundamental.  One need not even be a programmer
>> to use them: consider grep, sed, a zillion editors, database query 
>> languages, etc.
> 
> Grep can use regular expressions (and I do so with it regularly), but
> its default mode is certainly not regular expressions ...

Its very name indicates that its default mode most certainly is regular
expressions.

Dan


#98203

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2015-11-04 14:23 +1100
Message-ID: <56397a18$0$11094$c3e8da3@news.astraweb.com>
In reply to: #98199
On Wednesday 04 November 2015 13:55, Dan Sommers wrote:

> On Tue, 03 Nov 2015 19:04:23 -0700, Michael Torrie wrote:
> 
>> On 11/03/2015 05:33 PM, rurpy--- via Python-list wrote:
>>> I consider regexs more fundamental.  One need not even be a programmer
>>> to use them: consider grep, sed, a zillion editors, database query
>>> languages, etc.
>> 
>> Grep can use regular expressions (and I do so with it regularly), but
>> its default mode is certainly not regular expressions ...
> 
> Its very name indicates that its default mode most certainly is regular
> expressions.

I don't even know what grep stands for. 

But I think what Michael may mean is that if you "grep foo", no regex magic 
takes place since "foo" contains no metacharacters.




-- 
Steven


#98206

From: Michael Torrie <torriem@gmail.com>
Date: 2015-11-03 20:47 -0700
Message-ID: <mailman.3.1446608844.16136.python-list@python.org>
In reply to: #98203
On 11/03/2015 08:23 PM, Steven D'Aprano wrote:
>>> Grep can use regular expressions (and I do so with it regularly), but
>>> its default mode is certainly not regular expressions ...
>>
>> Its very name indicates that its default mode most certainly is regular
>> expressions.
> 
> I don't even know what grep stands for. 
> 
> But I think what Michael may mean is that if you "grep foo", no regex magic 
> takes place since "foo" contains no metacharacters.

More likely I just don't know what I'm talking about.  I must have been
thinking about something else (shell globbing perhaps).

Certainly most of the times I've seen grep used, it's to look for a word
with no special metacharacters, as you say. Still a valid RE of course.
But I have learned tonight I don't need to resort to grep -e to use
regular expressions.  At least with GNU grep, that's the default.


#98226

From: Grant Edwards <invalid@invalid.invalid>
Date: 2015-11-04 13:27 +0000
Message-ID: <n1d14h$hii$2@reader1.panix.com>
In reply to: #98206
On 2015-11-04, Michael Torrie <torriem@gmail.com> wrote:
> On 11/03/2015 08:23 PM, Steven D'Aprano wrote:
>>
>>>> Grep can use regular expressions (and I do so with it regularly), but
>>>> its default mode is certainly not regular expressions ...
>>>
>>> Its very name indicates that its default mode most certainly is
>>> regular expressions.
>> 
>> I don't even know what grep stands for. 

General Regular Expression Parser (or somesuch)

>> But I think what Michael may mean is that if you "grep foo", no regex
>> magic takes place since "foo" contains no metacharacters.
>
> More likely I just don't know what I'm talking about.  I must have
> been thinking about something else (shell globbing perhaps).
>
> Certainly most of the times I've seen grep used, it's to look for a
> word with no special metacharacters, as you say. Still a valid RE of
> course. But I have learned tonight I don't need to resort to grep -e
> to use regular expressions.

The -E option (capital E) turns on "extended" regexes, which add a few
more features to the regex language it parses; lowercase -e only marks
the next argument as the pattern.  I've never been entirely sure if the
-E regex language is backwards compatible with the default one or not...

> At least with GNU grep, that's the default.

Grep has always by default parsed its first command line argument as a
regex (at least since v7 and Sys5).  If you didn't want it treated as a
regex, you had to specify -F.

-- 
Grant Edwards               grant.b.edwards        Yow! I like your SNOOPY
                                  at               POSTER!!
                              gmail.com            


#98208

From: Nobody <nobody@nowhere.invalid>
Date: 2015-11-04 05:05 +0000
Message-ID: <pan.2015.11.04.05.05.52.633000@nowhere.invalid>
In reply to: #98203
On Wed, 04 Nov 2015 14:23:04 +1100, Steven D'Aprano wrote:

>> Its very name indicates that its default mode most certainly is regular
>> expressions.
> 
> I don't even know what grep stands for.

From the ed command "g /re/p" (where "re" is a placeholder for an
arbitrary regular expression). Tests all lines ("g" for global) against
the specified regexp and prints ("p") any which match.

> But I think what Michael may mean is that if you "grep foo", no regex
> magic takes place since "foo" contains no metacharacters.

At least the GNU version will treat the input as a regexp regardless of
whether it contains only literal characters. I.e. "grep foo" and
"grep [f][o][o]" will both construct the same state machine then process
the input with it.

You need to actually use -F to change the matching algorithm.
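
The "g/re/p" idea can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not of how GNU grep is implemented (Python's `re` is a backtracking engine, not the state machine mentioned); the function `grep` and its `fixed` flag are hypothetical names, with `fixed=True` standing in for `-F`.

```python
import re

def grep(pattern, lines, fixed=False):
    """Yield the lines that contain a match for pattern.

    fixed=True is a hypothetical stand-in for grep -F: the pattern is
    escaped so that every character is matched literally.
    """
    if fixed:
        pattern = re.escape(pattern)
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line

lines = ["foo", "f*o", "bar"]
print(list(grep("foo", lines)))              # plain word, still a regex
print(list(grep("[f][o][o]", lines)))        # same matches as "foo"
print(list(grep("f*o", lines)))              # regex: zero or more f, then o
print(list(grep("f*o", lines, fixed=True)))  # literal f*o only
```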


#98215

From: Peter Otten <__peter__@web.de>
Date: 2015-11-04 09:57 +0100
Message-ID: <mailman.6.1446627436.16136.python-list@python.org>
In reply to: #98203
Michael Torrie wrote:

> On 11/03/2015 08:23 PM, Steven D'Aprano wrote:
>>>> Grep can use regular expressions (and I do so with it regularly), but
>>>> its default mode is certainly not regular expressions ...
>>>
>>> Its very name indicates that its default mode most certainly is regular
>>> expressions.
>> 
>> I don't even know what grep stands for.
>> 
>> But I think what Michael may mean is that if you "grep foo", no regex
>> magic takes place since "foo" contains no metacharacters.
> 
> More likely I just don't know what I'm talking about.  I must have been
> thinking about something else (shell globbing perhaps).
> 
> Certainly most of the times I've seen grep used, it's to look for a word
> with no special metacharacters, as you say. Still a valid RE of course.
>  But I have learned tonight I don't need to resort to grep -e to use
> regular expressions.  At least with GNU grep, that's the default.

Well, I didn't know that grep uses regular expressions by default.

I tried Tim's example

$ seq 5 | grep '1*'
1
2
3
4
5
$ 

which surprised me because I remembered that there usually weren't any 
matching lines when I invoked grep instead of egrep by mistake. So I tried 
another one

$ seq 5 | grep '[1-3]+'
$ 

and then headed for the man page. Apparently there is a subset called "basic 
regular expressions":

"""
  Basic vs Extended Regular Expressions
       In basic regular expressions the meta-characters ?, +, {, |, (,
       and ) lose their special meaning; instead use  the  backslashed
       versions \?, \+, \{, \|, \(, and \).
"""


#98265

From: Steven D'Aprano <steve@pearwood.info>
Date: 2015-11-05 13:28 +1100
Message-ID: <563abee1$0$1614$c3e8da3$5496439d@news.astraweb.com>
In reply to: #98215
On Wed, 4 Nov 2015 07:57 pm, Peter Otten wrote:

> I tried Tim's example
> 
> $ seq 5 | grep '1*'
> 1
> 2
> 3
> 4
> 5
> $

I don't understand this. What on earth is grep matching? How does "4"
match "1*"?


> which surprised me because I remembered that there usually weren't any
> matching lines when I invoked grep instead of egrep by mistake. So I tried
> another one
> 
> $ seq 5 | grep '[1-3]+'
> $
> 
> and then headed for the man page. Apparently there is a subset called
> "basic regular expressions":
> 
> """
>   Basic vs Extended Regular Expressions
>        In basic regular expressions the meta-characters ?, +, {, |, (,
>        and ) lose their special meaning; instead use  the  backslashed
>        versions \?, \+, \{, \|, \(, and \).
> """

None of this appears relevant, as the metacharacter * is not listed. So
what's going on?




-- 
Steven


#98268

From: Tim Chase <python.list@tim.thechases.com>
Date: 2015-11-04 20:48 -0600
Message-ID: <mailman.40.1446691790.16136.python-list@python.org>
In reply to: #98265
On 2015-11-05 13:28, Steven D'Aprano wrote:
> > I tried Tim's example
> > 
> > $ seq 5 | grep '1*'
> > 1
> > 2
> > 3
> > 4
> > 5
> > $  
> 
> I don't understand this. What on earth is grep matching? How does
> "4" match "1*"?

The line with "4" matches "zero or more 1s".  If it was searching for
a literal "1*" (as would happen with fgrep or "grep -F"), it would
return no results:

  $ seq 5 | fgrep '1*'
  $

-tkc
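
The same contrast can be reproduced in Python, as a rough analogy only: the `in` operator does a literal substring test much like fgrep, while `re.search` treats `1*` as a pattern. The sample list is `seq 5` plus one extra line, added here as an assumption so that the literal search has something to find.

```python
import re

lines = ["1", "2", "3", "4", "5", "1*"]   # `seq 5` plus one line holding a literal 1*

# Literal substring test: every character of the needle stands for itself.
literal_hits = [line for line in lines if "1*" in line]

# Regex test: 1* can match the empty string, so every line matches.
regex_hits = [line for line in lines if re.search(r"1*", line)]

print(literal_hits)
print(regex_hits == lines)
```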



#98269

From: Ben Finney <ben+python@benfinney.id.au>
Date: 2015-11-05 14:03 +1100
Message-ID: <mailman.41.1446692637.16136.python-list@python.org>
In reply to: #98265
Steven D'Aprano <steve@pearwood.info> writes:

> On Wed, 4 Nov 2015 07:57 pm, Peter Otten wrote:
>
> > I tried Tim's example
> > 
> > $ seq 5 | grep '1*'
> > 1
> > 2
> > 3
> > 4
> > 5
> > $
>
> I don't understand this. What on earth is grep matching? How does "4"
> match "1*"?

You can experiment with regular expressions to find out. Here's a link
to the RegExr tool for the above pattern <URL:http://regexr.com/3c4ot>.

Matching patterns can include specifications meaning “match some number
of the preceding segment”, with the ‘{n,m}’ notation. That means “match
at least n, and at most m, occurrences of the preceding segment”. Either
‘n’ or ‘m’ can be omitted, meaning “at least 0” and “no maximum”
respectively.

Those are quite useful, so there are shortcuts for the most common
cases: ‘?’ is a short cut for ‘{0,1}’, ‘*’ is a short cut for ‘{0,}’,
and ‘+’ is a short cut for ‘{1,}’.

In this case, ‘*’ is a short cut for ‘{0,}’ meaning “match 0 or more
occurrences of the preceding segment”. The segment here is the atom ‘1’.
Since ‘1*’ is the entirety of the pattern, the pattern can match zero
characters, anywhere within any string. So, it matches every possible
string.

To match (some atom) 1 or more times, ‘+’ is a short cut for ‘{1,}’
meaning “match 1 or more occurrences of the preceding segment”.
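
These equivalences can be checked with Python's `re` module, which supports the same brace notation. A short sketch, assuming Python 3.4 or later for `re.fullmatch`; the sample strings are arbitrary:

```python
import re

# '1*' can match zero characters, so it succeeds even on a string with no '1'.
m = re.search(r"1*", "4")
print(m.span(), repr(m.group()))   # an empty match at the start of the string

# Each shortcut accepts exactly the same strings as its brace form.
pairs = [(r"1?", r"1{0,1}"), (r"1*", r"1{0,}"), (r"1+", r"1{1,}")]
for short, braces in pairs:
    for text in ["", "1", "11", "4"]:
        assert bool(re.fullmatch(short, text)) == bool(re.fullmatch(braces, text))

# '+' needs at least one '1', so this search finds nothing.
print(re.search(r"1+", "4"))
```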

-- 
 \    學而不思則罔,思而不學則殆。 (To study and not think is a waste. |
  `\                             To think and not study is dangerous.) |
_o__)                            —孔夫子 Confucius (551 BCE – 479 BCE) |
Ben Finney


#98291

From: Peter Otten <__peter__@web.de>
Date: 2015-11-05 09:33 +0100
Message-ID: <mailman.48.1446712433.16136.python-list@python.org>
In reply to: #98265
Steven D'Aprano wrote:

> On Wed, 4 Nov 2015 07:57 pm, Peter Otten wrote:
> 
>> I tried Tim's example
>> 
>> $ seq 5 | grep '1*'
>> 1
>> 2
>> 3
>> 4
>> 5
>> $
> 
> I don't understand this. What on earth is grep matching? How does "4"
> match "1*"?

Look for zero or more "1". Written in Python:

import re
import sys

for line in sys.stdin:
    if re.compile("1*").search(line):
        print(line, end="")
 
>> which surprised me because I remembered that there usually weren't any
>> matching lines when I invoked grep instead of egrep by mistake. So I
>> tried another one
>> 
>> $ seq 5 | grep '[1-3]+'
>> $
>> 
>> and then headed for the man page. Apparently there is a subset called
>> "basic regular expressions":
>> 
>> """
>>   Basic vs Extended Regular Expressions
>>        In basic regular expressions the meta-characters ?, +, {, |, (,
>>        and ) lose their special meaning; instead use  the  backslashed
>>        versions \?, \+, \{, \|, \(, and \).
>> """
> 
> None of this appears relevant, as the metacharacter * is not listed. 

That's the very point. 

> So what's going on?

Most special characters are not working with grep, but * is. The quote 
explains why many regular expressions like "[1-3]+" that you may know from 
Python's re don't work, but a small subset including the ominous "1*" do.
