Re: simple regex pattern sought

From Lew <noone@lewscanon.com>
Newsgroups comp.lang.java.programmer
Subject Re: simple regex pattern sought
Date 2012-05-27 11:39 -0700
Organization albasani.net
Message-ID <jptscb$t82$1@news.albasani.net>
References <e9vvr7p7l8l5kem31v5a37apdlubrqjq5e@4ax.com> <dc4ca9b0-9aa9-4fe1-bbc9-2d3a28250a9d@googlegroups.com> <jpovld$9la$1@dont-email.me> <jprgls$vnb$1@news.albasani.net> <jps0a9$58k$1@dont-email.me>


markspace wrote:
> Lew wrote:
>> markspace wrote:
>>> Lew wrote:
>>>> Use a regex like "[\"'][^\"']+[\"']" is one way. The cleanest? I
>>>> don't know.
>>>>
>>> This would match "John's restaurant" as "John'.
>>>
>>> The first quote matches ", John does not contain either ' or " as
>>> specified, and the last character class matches the '. Not I think
>>> what is wanted.
>>
>> As I corrected in my very next post.
>
> Unfortunately that one doesn't work either. The central part, [^"'], doesn't
> allow a match of a ' if the starting delimiter was a ", and that doesn't match
> Roedy's spec. "John's restaurant" wouldn't be matched at all, because the
> matcher couldn't match past the ' to get to the ".
>
> I think the easiest is to write out a grammar for the expression, then
> translate to regex.
>
> QUOTED_STRING := SQUOTED_STRING | DQUOTED_STRING
>
> SQUOTED_STRING := ' NON_S_QUOTE + '
>
> DQUOTED_STRING := " NON_D_QUOTE + "
>
> NON_S_QUOTE := [^']
>
> NON_D_QUOTE := [^"]
>
> At this point the grammar is very clear. (Note I haven't included Robert's \x
> escape sequences.) I think it's worth learning to use antlr rather than regex,
> which tends to obfuscate more than it helps. However, a literal translation
> into regex isn't hard, and a literal translation avoids mis-optimizations.

Very illuminating. Thank you.
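
For the record, a literal translation of the grammar quoted above into java.util.regex might look like the sketch below. The class name QuotedStrings and the helper findAll are made up for illustration and come from nobody's post. The pattern follows the grammar as written, so it does not handle Robert's \x escape sequences, and because of the + it will not match an empty quoted string. The original pattern from upthread is included only for comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not taken from the thread.
final class QuotedStrings {
    // SQUOTED_STRING := ' NON_S_QUOTE + '
    static final String SQUOTED = "'[^']+'";
    // DQUOTED_STRING := " NON_D_QUOTE + "
    static final String DQUOTED = "\"[^\"]+\"";
    // QUOTED_STRING := SQUOTED_STRING | DQUOTED_STRING
    static final Pattern QUOTED = Pattern.compile(SQUOTED + "|" + DQUOTED);

    // The first pattern suggested upthread, kept for comparison.
    static final Pattern MIXED = Pattern.compile("[\"'][^\"']+[\"']");

    // Collects every substring of input that the pattern matches.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Eat at \"John's restaurant\" today";
        System.out.println(findAll(QUOTED, text)); // prints ["John's restaurant"]
        System.out.println(findAll(MIXED, text));  // prints ["John']
    }
}
```

Keeping the two alternatives as separate strings mirrors the grammar one rule at a time, which is the point of markspace's suggestion: the closing delimiter is tied to the opening one by the alternation, not by a shared character class.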

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

