Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.databases.ms-sqlserver > #1290

Re: table design for massive GPS data

Newsgroups comp.databases.ms-sqlserver
Date 2012-09-17 08:46 -0700
References <37389863-6d11-4092-9f04-2e27dbccbd2b@googlegroups.com> <XnsA0CFC4E06222AYazorman@127.0.0.1>
Message-ID <c7f8f434-150c-4b5b-b223-9c4fe5431912@googlegroups.com> (permalink)
Subject Re: table design for massive GPS data
From rja.carnegie@gmail.com

Show all headers | View raw


On Saturday, September 15, 2012 6:21:13 PM UTC+1, Erland Sommarskog wrote:
> migurus (migurus@yahoo.com) writes: 
> > First idea is to keep data normalized in one table, one record per
> > GPS reading. It looks very simple to implement, but I doubt the 
> > server will be happy with this high number of small records.
> 
> Why not? As long it is properly indexed, I don't see any problem.
> What you should consider though, is to partition the table (which 
> requires Enterprise Edition). When five years has passed, and you
> want to purge old data, this is a swift affair with partitioning, 
> provided that your partition aligns with what you want to purge.

Partitioning might be not a great boost to performance, depending on 
how it's done.  With methods possibly out of date or wrong, I estimate 
the original design row size is 23 bytes, counting one byte to map 
NULL columns (for up to 8 table columns whether nullable or not),
and an overhead of 11 bytes per table row, although I don't remember
how I worked that out - maybe by trying it out.  Then I think that 
one day's new data is a little less than 10 megabytes, and five years
is around 20 GB.  That's before indexing - which would probably be, 
well, somewhat smaller than the data.  If it's stupid to make tables
as large as that, I've been stupider.

300,000 transactions per day, however, I put at about 2 GB or more 
in the transaction log if it's fully logged, so keeping that under 
control might be an important part of the backup/maintenance plan.
On the other hand, since I'm assuming storing one 8 KB page in the 
log for storing each 34 bytes row, it's a lousy way to keep a backup 
version of the data as far as efficiency goes.  But efficient isn't
the same thing as effective.  If you can bear the inefficiency, it'll
work.  Whoops, I forgot that these additions will probably update 
the index as well, so, double that log requirement.

For deleting data /without/ taking table partitioning into account, 
you might remove one day's records each day after five years, but
that won't be logged so badly if the rows are grouped together by 
clustered index - you're probably okay even if they're ordered by
vehicle ID and /then/ date/time.

And then the physical storage of data rows will be merrily jumbled 
within the data file - fragmented somewhat as time goes on - 
but if you only want to query one day's and one vehicle's data
at a time, I think that should be fine.

If partitioning the table, instead, what size of partition would you 
make - one day, one month, one year - and how would you maintain that?
Can it be done automatically?

Backup should be whatever you need to have in order to restore 
as much as you need to restore, as promptly as you need to have it,
with depth of design in case you discover just at the wrong time 
that your backup tape or disc isn't any good.  There are some 
horror stories that it's good to read, but not at bedtime.
So, what you should set up depends on the circumstances, and
the budget.  You may need to explain some of the horror stories
to whoever pays for stuff.

I might include separately storing at least a day's worth of GPS 
messages to re-load into your database from the external format, 
in case of (a) equipment failure and data loss and (b) finding out 
that you were mishandling and corrupting the messages for a while.
There are horror stories about that, too.  And if you can restore
yesterday's data in that way, then you may only need a daily backup 
of the entire database, just after midnight - which will save you 
from storing copies of the transaction log.

On the other hand, your recovery scenario may be, you have database
backups, but someone stole all of your computers...

Also, would it be bad if somebody stole your data... such as, 
to rob the vehicles.  Like, if the truck full of money drives 
the same route every Friday... sure, the bad guys could go out
and follow it anyway, but at least then they have to pay for gas.
I don't think you want to tell /us/ if you have trucks full 
of money.

So, anyway, you should stop that from happening.

Back to comp.databases.ms-sqlserver | Previous | NextPrevious in thread | Find similar


Thread

table design for massive GPS data migurus <migurus@yahoo.com> - 2012-09-13 14:47 -0700
  Re: table design for massive GPS data bradbury9 <ray.bradbury9@gmail.com> - 2012-09-14 00:06 -0700
  Re: table design for massive GPS data "Bob Barrows" <reb01501@NOSPAMyahoo.com> - 2012-09-14 07:33 -0400
    Re: table design for massive GPS data migurus <migurus@yahoo.com> - 2012-09-14 10:13 -0700
  Re: table design for massive GPS data rja.carnegie@gmail.com - 2012-09-14 05:05 -0700
    Re: table design for massive GPS data migurus <migurus@yahoo.com> - 2012-09-14 10:25 -0700
  Re: table design for massive GPS data Erland Sommarskog <esquel@sommarskog.se> - 2012-09-15 19:21 +0200
    Re: table design for massive GPS data rja.carnegie@gmail.com - 2012-09-17 08:46 -0700

csiph-web