From: Rafael Villar
Newsgroups: comp.lang.java.programmer
Subject: Re: Regex doesn't recognize single quote
Date: Sun, 08 Jan 2012 09:05:11 +1000

On 08/01/12 08:41, Roedy Green wrote:
> On 7 Jan 2012 11:42:26 GMT, ram@zedat.fu-berlin.de (Stefan Ram) wrote,
> quoted or indirectly quoted someone who said :
>
>>> That is not what a regex is for.
>>
>> How do you know what it is for?
>
> Regexes are for searching for patterns. Transforming or deleting
> characters is much simpler done with a for loop.
>
> How do I know what a regex is for? I am familiar with the API. I have
> attempted to use them for various purposes and discovered they were
> suitable for some and not for others.
>>
>>> Just use a StringBuilder the length of your String. Then
>>> loop through the chars with charAt. If the character is a
>>> ' or \w, ignore it, else append. If it gets complex, use a
>>> switch or if it gets really complicated use a BitSet.
>>
>> This might be needless (as far as we know right now)
>> optimization bloating the code reducing its readability and
>> low-level thinking, which might be required sometimes, but
>> does not serve as a general rule. Still it is nice to know
>> how it could be done if required.
>
> What is your simpler implementation?
>
> /** remove ' and \w from string
>  * @param s string to process
>  * @return string without ' or \w
>  */
> private static String scrunch( final String s )
>    {
>    final Stringbuilder sb = new StringBuilder( s.length() );
>    for (int i=0; i<s.length(); i++ )
>       {
>       char c = s.charAt(i);
>       if ( !( c = '\'' || c = '\w' ) )
>          {
>          sb.append ( c );
>          }
>       }
>    return sb.toString();
>    }

In most cases it is better to use a StringBuilder to perform
replacements, but in this particular case String.replaceAll() is better.
By the way, \w is not a Java escape sequence; it belongs to the regex
pattern syntax (although you should already know that, since you say you
are familiar with the API).

Anyway, here is a simpler implementation (and one which works, because
yours doesn't):

/** remove ' and \w from string
 * @param s string to process
 * @return string without ' or \w
 */
private static String scrunch( final String s )
   {
   return s.replaceAll("[^'\\w]+", "");
   }
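
To make the behaviour of the character class concrete, here is a minimal sketch. The class and method names (ScrunchDemo, keepQuotesAndWordChars, removeQuotesAndWordChars, removeWithLoop) are made up for illustration and are not from the thread. Note that the negated class [^'\\w] deletes everything that is neither a single quote nor a word character, while the plain class ['\\w] deletes the quotes and word characters themselves, which is what the Javadoc comment describes. The loop version assumes the default ASCII meaning of \w, i.e. [a-zA-Z_0-9].

```java
import java.util.regex.Pattern;

final class ScrunchDemo {
    // Compiled once and reused; String.replaceAll compiles its pattern on each call.
    private static final Pattern NOT_QUOTE_OR_WORD = Pattern.compile("[^'\\w]+");
    private static final Pattern QUOTE_OR_WORD = Pattern.compile("['\\w]+");

    /** The pattern as posted: keeps only ' and word characters. */
    static String keepQuotesAndWordChars(String s) {
        return NOT_QUOTE_OR_WORD.matcher(s).replaceAll("");
    }

    /** Non-negated class: removes ' and word characters. */
    static String removeQuotesAndWordChars(String s) {
        return QUOTE_OR_WORD.matcher(s).replaceAll("");
    }

    /** Loop equivalent of removeQuotesAndWordChars, assuming ASCII-only \w. */
    static String removeWithLoop(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean wordChar = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '_';
            if (!(c == '\'' || wordChar)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "it's a-b!";
        System.out.println(keepQuotesAndWordChars(input));   // it'sab
        System.out.println(removeQuotesAndWordChars(input)); // " -!" (leading space)
        System.out.println(removeWithLoop(input));           // same as the line above
    }
}
```

Which of the two patterns is wanted depends on whether the quotes and word characters are the part to keep or the part to drop.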