From: Erland Sommarskog
Newsgroups: comp.databases.ms-sqlserver
Subject: Re: question on clustered indexes in sql-server
Date: Tue, 29 Nov 2011 23:45:42 +0100

Lennart Jonsson (erik.lennart.jonsson@gmail.com) writes:
> What is the purpose of a clustered index in sql-server (as you probably
> have guessed I have zero to none experience with sql-server)?
>
> The reason I ask is because I look at a database where more or less all
> tables are designed as:
>
> create table T (
>     x int IDENTITY(1,1) NOT NULL,
>     [...]
>     CONSTRAINT ... PRIMARY KEY CLUSTERED ( x ) ...
>
> In db2 I would look at range predicates and order by clauses on queries
> to determine what index that should be clustered (in order to avoid sorts).
>
> What is the rationale to use a clustering index like the one above?

Probably database design on autopilot.

I don't know about DB2, but I know that in Oracle, heaps are the norm, and
index-organised tables are something you use rarely. In SQL Server, it is
the other way around. The clustered index is the normal thing, and heaps are
something you only use sometimes, except in SQL Azure, where heaps are not
even supported. The whole mindset in SQL Server is geared towards clustered
indexes, and you had better know what you are doing if you use a heap.
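
To make the heap versus clustered distinction concrete, here is a minimal
sketch. The table and constraint names (DemoHeap, DemoClustered,
PK_DemoClustered) and the val column are made up for illustration and do not
come from the database in the question; the second table only mirrors the
pattern quoted above.

```sql
-- Hypothetical tables for illustration only; names are not from the thread.

-- No clustered index at all: the table is stored as a heap.
CREATE TABLE dbo.DemoHeap (
    x   int IDENTITY(1,1) NOT NULL,
    val varchar(50) NOT NULL
);

-- The pattern from the question: the primary key is also the clustered index.
CREATE TABLE dbo.DemoClustered (
    x   int IDENTITY(1,1) NOT NULL,
    val varchar(50) NOT NULL,
    CONSTRAINT PK_DemoClustered PRIMARY KEY CLUSTERED (x)
);

-- sys.indexes has one row per index, and one row with index_id 0 for a heap.
-- type_desc reports HEAP for the first table and CLUSTERED for the second.
SELECT OBJECT_NAME(i.object_id) AS table_name, i.index_id, i.type_desc
FROM   sys.indexes AS i
WHERE  i.object_id IN (OBJECT_ID('dbo.DemoHeap'), OBJECT_ID('dbo.DemoClustered'))
ORDER  BY table_name, i.index_id;
```

Running the same query against the tables in your own database is a quick way
to see how many of them are heaps and which index is the clustered one.
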
As a consequence of this preference for clustered indexes, when you define a
primary key, it will by default be clustered, unless there already is a
clustered index on the table. Which there rarely is, since the PK is
typically the first index. Furthermore, many inexperienced developers slap
an IDENTITY column on each table (often with all other columns nullable), so
that's why you get it.

That said, there is also a sound reason to have a clustered index on an
IDENTITY column: it avoids fragmentation. But that does not apply to all
tables you see. And, incidentally, nor does it apply if you want really good
INSERT performance, since all new rows land at the end of the index and you
get a single hot-spot. Thomas Kejser had a good presentation on this at SQL
Rally where he talked about getting really good INSERT performance on flash
drives.

-- 
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx