From: Erland Sommarskog <esquel@sommarskog.se>
Newsgroups: comp.databases.ms-sqlserver
Subject: Re: question on clustered indexes in sql-server
Date: Tue, 29 Nov 2011 23:45:42 +0100
Organization: Erland Sommarskog
Message-ID: <Xns9FACF1B805663Yazorman@127.0.0.1>
References: <jb2ucg$tq7$1@dont-email.me>

Lennart Jonsson (erik.lennart.jonsson@gmail.com) writes:
> What is the purpose of a clustered index in sql-server (as you probably
> have guessed I have zero to none experience with sql-server)?
> 
> The reason I ask is because I look at a databas where more or less all
> tables are designed as:
> 
> create table T (
>      x int IDENTITY(1,1) NOT NULL,
> [...]
>  CONSTRAINT ... PRIMARY KEY CLUSTERED ( x ) ...
>      
> In db2 I would look at range predicates and order by clauses on queries
> to determine what index that should be clustered (inorder to avoid sorts).
> 
> What is the rationale to use a clustering index like the one above?
 
Probably database design on autopilot.

I don't know about DB2, but I know that in Oracle, heaps are the norm,
and index-organised tables are something you use rarely. In SQL Server,
it is the other way around. The clustered index is the normal thing,
and heaps are something you only use sometimes. Except in SQL Azure,
where heaps are not even supported. The whole mindset in SQL Server is
geared towards clustered indexes, and you had better know what you are
doing if you use a heap.

As a consequence of this, by default when you define a primary key,
it will be clustered, unless there already is a clustered index. Which
there rarely is, since the PK is typically the first index.
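
To make the default concrete, here is a rough sketch. The table, column
and index names are made up for illustration and are not from the
database you are looking at. The first table gets a clustered PK simply
because nothing else was said; the second keeps the PK nonclustered and
clusters on the column you would use for range predicates and ORDER BY,
which is closer to how you describe reasoning in DB2.

```sql
-- Hypothetical tables, for illustration only.

-- No CLUSTERED/NONCLUSTERED keyword and no other clustered index on
-- the table: the PK index becomes the clustered index by default.
CREATE TABLE dbo.OrdersDefault (
    OrderID   int IDENTITY(1,1) NOT NULL,
    OrderDate datetime NOT NULL,
    CONSTRAINT PK_OrdersDefault PRIMARY KEY (OrderID)
);

-- Explicit choice: keep the PK nonclustered and cluster on the column
-- used in range predicates and ORDER BY.
CREATE TABLE dbo.OrdersByDate (
    OrderID   int IDENTITY(1,1) NOT NULL,
    OrderDate datetime NOT NULL,
    CONSTRAINT PK_OrdersByDate PRIMARY KEY NONCLUSTERED (OrderID)
);

CREATE CLUSTERED INDEX IX_OrdersByDate_OrderDate
    ON dbo.OrdersByDate (OrderDate);

-- Check which index ended up clustered on each table.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       i.type_desc,
       i.is_primary_key
FROM sys.indexes AS i
WHERE i.object_id IN (OBJECT_ID('dbo.OrdersDefault'),
                      OBJECT_ID('dbo.OrdersByDate'));
```

Whether clustering on the date column is the better choice depends
entirely on the queries; the point is only that it is a choice, not
something the PK has to decide for you.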

Furthermore, many inexperienced developers slap an IDENTITY column on
each table (often with all other columns nullable), so that's why
you get it.

That said, there are also sound reasons to have a clustered index
on an IDENTITY column: new rows always go at the end of the index, which
avoids fragmentation. But that does not apply to all tables you see.
Nor, incidentally, does it apply if you want really good INSERT
performance, since all inserts then hit a single hot-spot. Thomas
Kejser had a good presentation on this at SQL Rally where he talked
about getting really good INSERT performance on flash drives.
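
If you want to see whether fragmentation is an issue for a given table
at all, something along these lines should give a first idea. This is
only a sketch: dbo.T stands in for whichever table you are looking at,
and 'LIMITED' is the cheapest scan mode, so the numbers are approximate.

```sql
-- dbo.T is a placeholder; substitute the table you want to inspect.
SELECT i.name AS index_name,
       i.type_desc,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.T'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id;
```

For small tables (few pages) the fragmentation figure says very little,
so look at page_count before drawing conclusions.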

-- 
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx