From: markspace <-@.>
Newsgroups: comp.lang.java.programmer
Subject: Re: simple regex pattern sought
Date: Sat, 26 May 2012 08:06:46 -0700

On 5/26/2012 7:37 AM, Robert Klemme wrote:
> On 26.05.2012 03:43, markspace wrote:
>> On 5/25/2012 3:12 PM, Robert Klemme wrote:
>>
>>> "\"(?:\\\\.|[^\\\"])*\"|'(?:\\\\.|[^\\'])*'"
>> ...
>> and I don't think you need to in a regex
>> either (although I didn't check that).
>
> There is also no regexp escaping of single quotes either. The only
> regexp escaping you can see are the \\\\ which translate into \\ in the
> string which is a literal backslash for the regexp engine.

Yes, there is, although I think it's a typo. Both \\\" and \\' get
passed to the regex as \" and \', which means just a single character "
and ' respectively.

You're right about the rest of it though. With so many \'s floating
around, I have a hard time reading Java regex!

> It's not parenthesis around character classes but around the alternative
> of "match a backslash followed by any char" and "any char which is not
> backslash or the opening quote type of this string variant".

Yup, I totally missed this too. Thanks for pointing it out.
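
To make the escaping point concrete, here is a minimal sketch that prints what the regex engine actually receives from the posted string literal, next to a variant that excludes the backslash from the character classes. The class name QuoteRegexDemo, the helper firstMatch, and the INTENDED pattern are my own illustration; INTENDED is only my guess at what the description ("any char which is not backslash or the opening quote") calls for, not something posted in the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteRegexDemo {
    // The pattern exactly as posted. After Java string unescaping, the
    // character classes reach the regex engine as [^\"] and [^\'].
    static final String POSTED = "\"(?:\\\\.|[^\\\"])*\"|'(?:\\\\.|[^\\'])*'";

    // Assumption: a variant whose classes exclude backslash as well,
    // i.e. [^\\"] and [^\\'] as seen by the regex engine.
    static final String INTENDED = "\"(?:\\\\.|[^\\\\\"])*\"|'(?:\\\\.|[^\\\\'])*'";

    // Hypothetical helper: returns the first match of regex in input, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(POSTED);    // "(?:\\.|[^\"])*"|'(?:\\.|[^\'])*'
        System.out.println(INTENDED);  // "(?:\\.|[^\\"])*"|'(?:\\.|[^\\'])*'

        // An unterminated string whose only other quote is escaped: "abc\" def
        String unterminated = "\"abc\\\" def";
        // POSTED backtracks, lets the class consume the backslash, and
        // treats the escaped quote as the closing quote.
        System.out.println(firstMatch(POSTED, unterminated));    // "abc\"
        // INTENDED cannot consume a lone backslash, so nothing matches.
        System.out.println(firstMatch(INTENDED, unterminated));  // null
    }
}
```

On well-formed input the two patterns agree, because the "backslash followed by any char" alternative is tried first; the difference only shows up when the engine has to backtrack, as in the unterminated example.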