Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.databases.ms-sqlserver > #1942

cluster or hash table advise needed

Newsgroups comp.databases.ms-sqlserver
Date 2015-09-29 12:06 -0700
Message-ID <fa7449c7-daca-4a6d-81ba-603063464b4f@googlegroups.com> (permalink)
Subject cluster or hash table advise needed
From "M.G." <michael@gurfinkel.us>

Show all headers | View raw


We are designing a table with high insert / delete activity. The table maintains sequence of actions per specific experiments. These are the attributes:

CREATE TABLE ACTION_SEQUENCE (
	/* ACTION_SEQUENCE_ID int NOT NULL, <<< questionable */
	EXP_ID int NOT NULL,
	ACT_SEQ int NOT NULL,
	ACT_ID int NOT NULL,
	MODIFIED_TIME datetime NULL,
	ACT_TYPE int NOT NULL,
);

EXP_ID and ACT_ID are foreign keys into experiments and actions tables correspondingly.

sample data for two experiments with 2 action for #100 and three actions for #200:
EXP_ID ACT_SEQ ACT_ID ACT_TYPE
100     1       233    0
100     2       560    0
100     3       233    1
200     1       220    0
200     2       220    1
200     3       778    0
200     4       778    1

The nature of EXP_ID - monotonous increment, same for ACT_ID.

How we read data - access one experiment at a time, its actions sorted by ACT_SEQ. Like this:
select * from ACTION_SEQUENCE where EXP_ID=@ID order by ACT_SEQ;
The sequence is of essence here.

How we add / update data - use delete/insert approach (never update), again records per one experiment are always deleted (if exists) and inserted as a group, like this:
delete from ACTION_SEQUENCE where EXP_ID=@ID;
insert into ACTION_SEQUENCE(EXP_ID,ACT_SEQ,ACT_ID) values (@ID,...)

Expected number of records - around 10 million, expected number of inserts (pure additions) around 5000 a day, number of change (delete/insert) around 1000

Expected read / write ratio - 10 reads per 1 update

The question - would you make it a clustered table. or no cluster at all?
Current design - table has ACTION_SEQUENCE_ID primary key (cluster), the sequence itself is maintained through external sequence table (reason - the app development framework). 
I don't see any need for cluster here (and it's maintenance).
We can use EXP_ID, ACT_SEQ, ACT_ID as unique key logically, but again, we never deal with individual records, only with groups of records for a given experiment.  

Your input is highly appreciated.

Back to comp.databases.ms-sqlserver | Previous | NextNext in thread | Find similar


Thread

cluster or hash table advise needed "M.G." <michael@gurfinkel.us> - 2015-09-29 12:06 -0700
  Re: cluster or hash table advise needed Erland Sommarskog <esquel@sommarskog.se> - 2015-09-29 22:15 +0200
    Re: cluster or hash table advise needed "M.G." <michael@gurfinkel.us> - 2015-09-29 13:55 -0700

csiph-web