From: Robert Klemme
Newsgroups: comp.lang.java.programmer
Subject: Re: simple regex pattern sought
Date: Sat, 26 May 2012 00:12:34 +0200

On 25.05.2012 23:55, Lew wrote:
> Roedy Green wrote:
>> I often have to search for things of the form
>>
>> "xxxxx"
>> or
>> 'xxxxx'
>>
>> where xxx is anything not " or '. It might be Russian or English or
>> any other language.
>>
>> What is the cleanest way to do that?
>
> Use a regex like "[\"'][^\"']+[\"']" is one way. The cleanest? I don't know.

That does not match quoting properly: it accepts a closing quote that
differs from the opening one. Better to do something like

"([\"'])[^\"']*\\1"

Still, I prefer

"\"[^\"]*\"|'[^']*'"

because it allows quotes of the other type inside the quotes.
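To make the difference between the three patterns so far concrete, here is a minimal sketch that runs each of them against a double-quoted string containing an apostrophe. The class name QuoteCompare, the helper findAll and the constant names ANY, BACKREF and ALT are made up for this illustration and are not part of the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; all names here are illustrative only.
class QuoteCompare {
    // Lew's suggestion: either quote character may open or close.
    static final Pattern ANY = Pattern.compile("[\"'][^\"']+[\"']");
    // Backreference: the closing quote must equal the opening quote,
    // but neither quote type may appear inside.
    static final Pattern BACKREF = Pattern.compile("([\"'])[^\"']*\\1");
    // Alternation: each branch only excludes its own quote character.
    static final Pattern ALT = Pattern.compile("\"[^\"]*\"|'[^']*'");

    // Collects every match of p in input, in order of occurrence.
    static List<String> findAll(Pattern p, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String mixed = "\"it's\"";                    // the text: "it's"
        System.out.println(findAll(ANY, mixed));      // ["it']   (mismatched quotes)
        System.out.println(findAll(BACKREF, mixed));  // []       (apostrophe blocks the match)
        System.out.println(findAll(ALT, mixed));      // ["it's"] (whole string)
    }
}
```

Only the alternation returns the complete quoted string here, which is the reason given above for preferring it.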

With proper escaping (using \ as the escape character; any other works,
too) this becomes

"\"(?:\\\\.|[^\\\\\"])*\"|'(?:\\\\.|[^\\\\'])*'"

Kind regards

	robert


package rx;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Quotes {

  // Closing quote must be the same character as the opening quote;
  // neither quote type may appear inside.
  private static final Pattern Q1 = Pattern.compile("([\"'])[^\"']*\\1");

  // One alternative per quote type, so the other type may appear inside.
  private static final Pattern Q2 = Pattern.compile("\"[^\"]*\"|'[^']*'");

  // As Q2, plus backslash escapes. The character classes exclude the
  // backslash as well as the quote, so a backslash is only ever consumed
  // together with the character it escapes.
  private static final Pattern Q3 =
      Pattern.compile("\"(?:\\\\.|[^\\\\\"])*\"|'(?:\\\\.|[^\\\\'])*'");

  public static void main(String[] args) {
    // Input: 'a' "b" 'c'
    System.out.println(Q1);
    for (final Matcher m = Q1.matcher("'a' \"b\" 'c'"); m.find();) {
      System.out.println(m.group());
    }

    System.out.println(Q2);
    for (final Matcher m = Q2.matcher("'a' \"b\" 'c'"); m.find();) {
      System.out.println(m.group());
    }

    // Input: 'a' "\"b" 'c' (the second item contains an escaped quote)
    System.out.println(Q3);
    for (final Matcher m = Q3.matcher("'a' \"\\\"b\" 'c'"); m.find();) {
      System.out.println(m.group());
    }
  }
}

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/