Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.databases.postgresql > #245

Re: MySQL vs Postgres for a "data warehouse", 5-10 TB

From Norman Peelman <npeelmandog@cfl.rr.com>
Newsgroups comp.databases.mysql, comp.databases.postgresql
Subject Re: MySQL vs Postgres for a "data warehouse", 5-10 TB
Date 2011-09-10 19:06 -0400
Organization A noiseless patient Spider
Message-ID <j4gqi3$epu$1@dont-email.me> (permalink)
References <H9idndjaqY3Xb_fTnZ2dnUVZ_sGdnZ2d@giganews.com> <vilain-F1171C.23304409092011@news.individual.net> <M9OdnSps5JUM6PbTnZ2dnUVZ_gGdnZ2d@giganews.com>

Cross-posted to 2 groups.

Show all headers | View raw


On 09/10/2011 10:34 AM, Ignoramus13208 wrote:
> On 2011-09-10, Michael Vilain<vilain@NOspamcop.net>  wrote:
>> This is why database consultants get paid big bucks.  Answering
>> questions like "what DB product should I buy?" and "how should I lay out
>> my database system?" are all bread and butter for these guys.  And you
>> want that sort of analysis for free?  Huh.
>
> I was hoping that someone in these two groups has faced similar issues
> before and would be willing to share their relevant experiences.
>
>> Basically, unless you're a high power database person who knows the ins
>> and outs of each vendor's products, how to optimize performance (that's
>> $150K/year easy), and can design and administer such systems, I think
>> you're going to have do the research own your own.
>
> Well, this is what I am trying to do here, do a research.
>
>> Back in my day of system performance monitoring, a customer would
>> ask me how their workload would run on a specific hardware
>> configuration or why it's running poorly and should they buy the
>> larger machine.  Poorly written applications will still perform
>> poorly on a bigger machine, just faster.  Spend time designing
>> something well with growth, scale, archival, and _yes Virginia,
>> fricken backups_ and you'll save yourself some time and stress.
>> Developers never think of "how do I back this up?", they just ask
>> the datacenter to do it.  After they've gone live.
>
> Well, for better or worse, I have already designed a backup system for
> this warehouse.
>
> But there are issues of large datasets that just do not matter in
> small datasets. I liken it to having a pet that happens to be an
> elephant. It is not the same as having a mouse, cat or dog for a pet
> -- a pet elephant comes with its own set of challenges.
>
> For example, I remember the horrors and upsets of REPAIR TABLE
> statements that took almost a day. That was with a MyISAM table around
> 40 gigs (I do not remember the size exactly).
>
> This means that I cannot use MyISAM, period, end of story, for
> something 100 times the size of the above.
>
> And that is why I am asking for people with relevant experience to
> share what it is.
>
>> I have no idea how well you've designed your database or your experience
>> in such matters.  You sound like a hardware guy from what you've
>> described.
>
> I am actually a computer programmer.
>
>> If this is just personal project, build it using MySQL, run some
>> benchmarks, rebuild it with the same data on Postgres and run the
>> same benchmarks.  If they're the same, add more data and see what
>> happens?  Keep adding more data until you see a difference.  I've
>> only used MySQL without transactions, so I can't tell you how well
>> it will perform compared to Postgres which I think had them from day
>> 1.  Some DBMS' have other ways of indexing tables beyond b-trees
>> (DEC's RdB had a hashing method that allowed for very fast reads on
>> certain types of tables).  I don't know if MySQL or Postgres offer
>> that level of optimization.
>>
>> I just backup the stuff.
>
> Doing it in both mysql and postgres, is not such a bad idea.
>
> I have backups relatively well covered, as my paranoia about backups
> has no bounds.
>
> i

   It seems to me that if you could split the db into smaller tables 
based on per site usage or timeframe (monthly, quarterly, etc) any 
REPAIR TABLE operations would be significantly reduced. How much that 
would impact your scripts I don't know. That and a 64bit linux (is the 
hw 64bit) to cover the file structure and data structure limits.

http://jeremy.zawodny.com/blog/archives/000796.html
http://www.fromdual.ch/mysql-limitations




-- 
Norman
Registered Linux user #461062
-Have you been to www.mysql.com yet?-

Back to comp.databases.postgresql | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

MySQL vs Postgres for a "data warehouse", 5-10 TB Ignoramus5263 <ignoramus5263@NOSPAM.5263.invalid> - 2011-09-10 00:14 -0500
  Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Michael Vilain <vilain@NOspamcop.net> - 2011-09-09 23:30 -0700
    Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Luuk <Luuk@invalid.lan> - 2011-09-10 10:13 +0200
    Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Ignoramus13208 <ignoramus13208@NOSPAM.13208.invalid> - 2011-09-10 09:34 -0500
      Re: MySQL vs Postgres for a "data warehouse", 5-10 TB John Levine <johnl@iecc.com> - 2011-09-10 19:47 +0000
        Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Ignoramus13208 <ignoramus13208@NOSPAM.13208.invalid> - 2011-09-10 16:02 -0500
          Re: MySQL vs Postgres for a "data warehouse", 5-10 TB The Natural Philosopher <tnp@invalid.invalid> - 2011-09-11 15:40 +0100
          Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Axel Schwenke <axel.schwenke@gmx.de> - 2011-09-11 18:11 +0200
            Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Ignoramus21330 <ignoramus21330@NOSPAM.21330.invalid> - 2011-09-11 12:07 -0500
      Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Norman Peelman <npeelmandog@cfl.rr.com> - 2011-09-10 19:06 -0400
  Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Matthew Woodcraft <mattheww@chiark.greenend.org.uk> - 2011-09-11 10:24 +0100
    Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Ignoramus21330 <ignoramus21330@NOSPAM.21330.invalid> - 2011-09-11 07:35 -0500
  Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Bodo <bodo@nomail.org.invalid> - 2011-09-23 13:01 +0200

csiph-web