Re: contains

From Lew <lewbloch@gmail.com>
Newsgroups comp.lang.java.programmer
Subject Re: contains
Date 2011-09-08 07:47 -0700
Organization http://groups.google.com
Message-ID <7bd53b6f-ed95-4f77-995e-a179f4f30ad0@glegroupsg2000goo.googlegroups.com> (permalink)
References <ab8bf299-2281-472d-88af-e02778edd279@m38g2000vbn.googlegroups.com> <Zw1aq.8466$CQ4.1852@newsfe09.iad> <13987de0-042f-45e7-8279-25e9f7bcfb0e@glegroupsg2000goo.googlegroups.com> <3n4aq.6818$GV2.28@newsfe20.iad>

On Thursday, September 8, 2011 7:33:34 AM UTC-7, Arved Sandstrom wrote:

Your post had trouble, being in "UTC-7" and dealing with 8-bit characters.

> Lew wrote:
>> Arved Sandstrom wrote:
>>> bob wrote:
>>>> Is there any case-insensitive version of the String contains method?
>>>>
>>> Not that I'm aware of. An easy way to do the deed is to use Pattern;
>>> something like
>>>
>>> boolean isContained  =
>>>     Pattern.compile("isThisStringContained",
>>>         Pattern.CASE_INSENSITIVE)
>>>     .matcher("stringThatMayContainOther").find();
>>>
>>> does the trick.
>>>
>>> A possibly preferable alternative to doing case-insensitive string
>>> operations is simply to uppercase (or lowercase) both Strings before
>>> doing the operation. .toLowerCase() and .toUpperCase() are String
>>> methods that are available for this purpose. If you plan to do this a
>>> lot you can write up a small utility method.
>> 
>> Beware of uppercasing and lowercasing - the results can be surprising.
>> 
>> String whatThe = "ß".toUpperCase().toLowerCase();
>> 
>> What should be the value of '"ß".equalsIgnoreCase("ss")'?
>> 
>> What should be the value of '"ß".toUpperCase().toLowerCase().equals("ß")'?
>> 
> That's a good point, but it's no mystery that uppercasing/lowercasing
> outside the standard 26-letter Latin alphabet has the odd pitfall here
> and there, like Eszett rules.
> 
> So my second suggestion applies in particular to Strings that contain
> Latin characters. For anything else you'd best be aware of the rules and
> the nature of your text.

Which means, for all practical purposes, always be aware of the rules and nature of your text.  Even here in the U.S. of A., we use lots of letters that don't fall into your (apparent) definition of "Latin" characters, for example, in Latin-American names and loan words.  Never mind that in almost any context that a programmer cares about, you have to deal with locales.  Advising a programmer to handle only ASCII is a ludicrous suggestion.  You pretty much always have to be aware, at a minimum, of eight-bit characters, and really of all of Unicode.  To do otherwise is very irresponsible.
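
To make the locale point concrete, here is a minimal sketch, not a definitive implementation. The class name and the containsIgnoreCase helper are hypothetical (nothing like them ships in the JDK); the Pattern flags and Locale calls are standard. It assumes simple per-character case folding is acceptable for the search, so it will still not treat "ß" and "SS" as equal.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class CaseInsensitiveContains {

    // Hypothetical helper: case-insensitive substring test.
    // LITERAL treats the needle as plain text, not as a regex;
    // UNICODE_CASE extends CASE_INSENSITIVE beyond US-ASCII.
    static boolean containsIgnoreCase(String haystack, String needle) {
        return Pattern.compile(needle,
                Pattern.LITERAL | Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
            .matcher(haystack).find();
    }

    public static void main(String[] args) {
        // \u00C9 is E-acute (upper), \u00E9 is e-acute (lower), \u00CD is I-acute.
        // An ASCII-only comparison would miss this match.
        System.out.println(containsIgnoreCase("JOS\u00C9 GARC\u00CDA", "jos\u00E9")); // true

        // Locale matters: Turkish lowercases 'I' to dotless i (\u0131).
        String turkish = "TITLE".toLowerCase(Locale.forLanguageTag("tr"));
        String root = "TITLE".toLowerCase(Locale.ROOT);
        System.out.println(turkish.equals(root)); // false
    }
}
```

If you go the toLowerCase()/toUpperCase() route instead, passing an explicit Locale at least makes the result independent of the machine's default locale.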

Also, by definition "ß" is a Latin character, in the sense that it's in the Latin-1 (ISO 8859-1) character set.
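
The "ß" questions quoted above can be checked directly. This is a small sketch under stated assumptions: the class and method names are made up for illustration, the character is written as the escape \u00DF so the source file's encoding doesn't matter, and Locale.ROOT is used so the default locale plays no part.

```java
import java.util.Locale;

class EszettRoundTrip {

    // U+00DF, LATIN SMALL LETTER SHARP S
    static final String ESZETT = "\u00DF";

    // Hypothetical helper: uppercase, then lowercase again.
    static String roundTrip(String s) {
        return s.toUpperCase(Locale.ROOT).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Uppercasing expands one character into two.
        System.out.println(ESZETT.toUpperCase(Locale.ROOT));  // SS

        // The round trip does not return to the original string.
        System.out.println(roundTrip(ESZETT));                // ss
        System.out.println(roundTrip(ESZETT).equals(ESZETT)); // false

        // equalsIgnoreCase compares char by char, so the lengths must match.
        System.out.println(ESZETT.equalsIgnoreCase("ss"));    // false
    }
}
```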

If you write for the 128
Then you deserve your horrible fate.

-- 
Lew
