Path: csiph.com!x330-a1.tempe.blueboxinc.net!usenet.pasdenom.info!weretis.net!feeder1.news.weretis.net!news.solani.org!.POSTED!not-for-mail From: Mladen Gogala Newsgroups: comp.databases.postgresql Subject: Re: Case-insensitive select Date: Thu, 1 Sep 2011 21:26:13 +0000 (UTC) Organization: solani.org Lines: 48 Message-ID: References: <25ee8bf2-869c-427b-a4a4-5faf86f92d57@o26g2000vbi.googlegroups.com> <1314860211.105064@proxy.dienste.wien.at> <1314888079.999065@proxy.dienste.wien.at> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Trace: solani.org 1314912373 25042 eJwNwoERAEEEBLCWGCzKwdF/Cf+ZmIAxrjConV3G1Xp2HI2wePy5nDC7SpXRlw7IPLGOlx8Y8hD2 (1 Sep 2011 21:26:13 GMT) X-Complaints-To: abuse@news.solani.org NNTP-Posting-Date: Thu, 1 Sep 2011 21:26:13 +0000 (UTC) User-Agent: Pan/0.133 (House of Butterflies) X-User-ID: eJwFwQEBwDAIAzBLo1A6O4eDfwlL6GnZimQGl4uqaQ9qhK64UfbhzpL7bxADkEeHbpopPSM7EP4= Cancel-Lock: sha1:wGuHnUHv8od4jdvltO7WuxoNdx8= X-NNTP-Posting-Host: eJwNydsVADEEBcCWiMeVchD6L2H3zOeYOHtD3VxtfziZYg2urlNVGVNG9wUFmDxSabfn+XRLDN0LUeL7LKGmrf+VHDgPdPkD6GMaHg== Xref: x330-a1.tempe.blueboxinc.net comp.databases.postgresql:227 On Thu, 01 Sep 2011 16:40:58 +0200, Laurenz Albe wrote: > Oh, I see. I think you got confused here, because the sentence you quote > is a description of a drawback of the "standard approach" using "WHERE > lower(col) = LOWER(?)", *not* a drawback of citext. Seems you're right, I was mistaken. mgogala=# create temporary table t1(col1 citext, constraint t1_pk primary key(col1)); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t1_pk" for table "t1" CREATE TABLE mgogala=# insert into t1 values('Laurenz Albe'); INSERT 0 1 mgogala=# insert into t1 values('laurenz albe'); ERROR: duplicate key value violates unique constraint "t1_pk" DETAIL: Key (col1)=(laurenz albe) already exists. mgogala=# insert into t1 values('LAURENZ ALBE'); ERROR: duplicate key value violates unique constraint "t1_pk" DETAIL: Key (col1)=(LAURENZ ALBE) already exists. mgogala=# However, the citext doesn't allow me to limit the column size: mgogala=# create temporary table t1(col1 citext(10), constraint t1_pk primary key(col1)); ERROR: type modifier is not allowed for type "citext" LINE 1: create temporary table t1(col1 citext(10), constraint t1_pk ... It seems to me that citext is a good replacement for the type text, not for the type varchar(n). At any rate, the type citext is useful but rather nonstandard. I would still use the standard varchar type for the vast majority of cases and use trigger and/or functional indexes to resolve the case insensitive search. Case insensitive search was the problem for which functional indexes were invented. Cost based optimizers were having trouble with the functional indexes, Oracle's until version 11 and the extended statistics, because the normal statistics gathering procedures usually do not collect data on lower(col1) or something similar to it. The trick to make it perform better was to add a column, named something like lower_col1 and populate it by a trigger. That would help the optimizer, at the expense of violating the relational rules and storing the duplicate information. That column could then be indexed normally. This method still provides the best performance, on all databases I know of, with the notable exception of Oracle 11i. -- http://mgogala.byethost5.com