
Re: simple regex pattern sought

From Robert Klemme <shortcutter@googlemail.com>
Newsgroups comp.lang.java.programmer
Subject Re: simple regex pattern sought
Date Sat, 26 May 2012 00:12:34 +0200
Message-ID <a2aeesF2s0U1@mid.individual.net>



On 25.05.2012 23:55, Lew wrote:
> Roedy Green wrote:
>> I often have to search for things of the form
>>
>> "xxxxx"
>> or
>> 'xxxxx'
>>
>> where xxx is anything not " or '.  It might be Russian or English or
>> any other language.
>>
>> What is the cleanest way to do that?
>
> Use a regex like "[\"'][^\"']+[\"']" is one way. The cleanest? I don't know.

That does not match quoting properly, because the opening and closing
quote can be of different kinds.  Better do something like

"([\"'])[^\"']*\\1"

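To make the difference visible, here is a minimal sketch that runs both patterns over the same input. The class name BackrefDemo, the helper findAll and the sample input are made up for illustration; they are not from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackrefDemo {

    // Pattern from the quoted post: any quote, some content, any quote.
    public static final Pattern LOOSE = Pattern.compile("[\"'][^\"']+[\"']");

    // Backreference version: \1 forces the closing quote to equal the opening one.
    public static final Pattern BACKREF = Pattern.compile("([\"'])[^\"']*\\1");

    // Hypothetical helper: collects every match of p in input, in order.
    public static List<String> findAll(Pattern p, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // The second "string" opens with " but closes with '.
        String input = "'def' and \"abc'";
        System.out.println(findAll(LOOSE, input));   // ['def', "abc']
        System.out.println(findAll(BACKREF, input)); // ['def']
    }
}
```

The loose pattern reports the mismatched "abc' as a hit, the backreference pattern does not.
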
Still I prefer

"\"[^\"]*\"|'[^']*'"

because it allows quotes of the other type inside the quoted text.
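
A small sketch of what that means in practice, comparing the backreference pattern with the alternation on text that contains the other quote type. The class name NestedQuoteDemo, the helper findAll and the inputs are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NestedQuoteDemo {

    // Backreference pattern: excludes both quote types from the content.
    public static final Pattern BACKREF = Pattern.compile("([\"'])[^\"']*\\1");

    // Alternation: each branch only excludes its own delimiter.
    public static final Pattern ALTERNATION = Pattern.compile("\"[^\"]*\"|'[^']*'");

    // Hypothetical helper: collects every match of p in input, in order.
    public static List<String> findAll(Pattern p, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String apostrophe = "\"it's here\"";   // "it's here"
        System.out.println(findAll(BACKREF, apostrophe));     // []
        System.out.println(findAll(ALTERNATION, apostrophe)); // ["it's here"]

        String nested = "'say \"hi\"'";        // 'say "hi"'
        System.out.println(findAll(BACKREF, nested));         // ["hi"]
        System.out.println(findAll(ALTERNATION, nested));     // ['say "hi"']
    }
}
```

With the backreference pattern an apostrophe inside double quotes prevents the match altogether, and for nested quotes only the inner part is found.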

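With proper escaping (using \ as the escape character; any other works,
too) this becomes

"\"(?:\\\\.|[^\\\\\"])*\"|'(?:\\\\.|[^\\\\'])*'"

The character classes exclude the backslash as well, so that it can only
be consumed together with the character it escapes.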

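As a rough illustration of the escape handling, the sketch below applies an escape-aware pattern to one well-formed input and one unterminated string. Note the assumptions: the character classes here are written to exclude the backslash in addition to the quote, and the class name EscapedQuoteDemo, the helper findAll and the inputs are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapedQuoteDemo {

    // As a regex: "(?:\\.|[^\\"])*"|'(?:\\.|[^\\'])*'
    // A backslash is only ever consumed as part of an escape pair (\\.),
    // because the character classes exclude it.
    public static final Pattern ESCAPED =
        Pattern.compile("\"(?:\\\\.|[^\\\\\"])*\"|'(?:\\\\.|[^\\\\'])*'");

    // Hypothetical helper: collects every match of p in input, in order.
    public static List<String> findAll(Pattern p, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Input text: 'a' "\"b" 'c'
        System.out.println(findAll(ESCAPED, "'a' \"\\\"b\" 'c'")); // ['a', "\"b", 'c']

        // Input text: "a\"   (the only closing quote is escaped, so nothing matches)
        System.out.println(findAll(ESCAPED, "\"a\\\""));           // []
    }
}
```

If the classes excluded only the quote, the second input would match after backtracking, since the backslash could then be consumed on its own.
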
Kind regards

	robert


package rx;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Quotes {

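   // Backreference: the closing quote must be the same character as the opening one.
   private static final Pattern Q1 = Pattern.compile("([\"'])[^\"']*\\1");
   // One alternative per quote type, so the other quote type may appear inside.
   private static final Pattern Q2 = Pattern.compile("\"[^\"]*\"|'[^']*'");
   // As Q2, plus backslash escapes; the classes exclude the backslash as well.
   private static final Pattern Q3 =
       Pattern.compile("\"(?:\\\\.|[^\\\\\"])*\"|'(?:\\\\.|[^\\\\'])*'");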

   public static void main(String[] args) {
     System.out.println(Q1);
     for (final Matcher m = Q1.matcher("'a' \"b\" 'c'"); m.find();) {
       System.out.println(m.group());
     }

     System.out.println(Q2);
     for (final Matcher m = Q2.matcher("'a' \"b\" 'c'"); m.find();) {
       System.out.println(m.group());
     }

     System.out.println(Q3);
     for (final Matcher m = Q3.matcher("'a' \"\\\"b\" 'c'"); m.find();) {
       System.out.println(m.group());
     }
   }

}


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
