Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.databases.postgresql > #238
| From | Michael Vilain <vilain@NOspamcop.net> |
|---|---|
| Newsgroups | comp.databases.mysql, comp.databases.postgresql |
| Subject | Re: MySQL vs Postgres for a "data warehouse", 5-10 TB |
| Date | 2011-09-09 23:30 -0700 |
| Organization | DexLabs, Inc. |
| Message-ID | <vilain-F1171C.23304409092011@news.individual.net> (permalink) |
| References | <H9idndjaqY3Xb_fTnZ2dnUVZ_sGdnZ2d@giganews.com> |
Cross-posted to 2 groups.
In article <H9idndjaqY3Xb_fTnZ2dnUVZ_sGdnZ2d@giganews.com>, Ignoramus5263 <ignoramus5263@NOSPAM.5263.invalid> wrote: > First of all, this is not an attempt to create a flamewar of any > kind. I have been a MySQL user for years, including owning a few > multi-gigabyte databases for my websites, and using it to host > algebra.com (about 12 GB database). > > I have had my ups and downs with MySQL. The ups were ease of use and > decent performance for small databases such as algebra.com. The downs > were things like twenty hour REPAIR TABLE operations on a 35 GB > table, etc. > > Right now I have a personal (one user) project to create a 5-10 > Terabyte data warehouse. The largest table will consume the most space > and will take, perhaps, 200,000,000 rows. > > I want to use it to obtain valuable business intelligence and to make > money. > > I expect it to grow, never shrink, and to be accessed via batch > queries. I do not care for batch queries to be super fast, for example > an hour per query would be just fine. > > However, while an hour is fine, two weeks per query is NOT fine. > > I have a server with about 18 TB of storage and 48 GB of RAM, and 12 > CPU cores. > > My initial plan was to use MySQL, InnoDB, and deal with problems as > they arise. Perhaps, say, I would implement my own joining > procedures. > > After reading some disparaging stuff about InnoDB performance on large > datasets, however, I am getting cold feet. I have a general feeling > that, perhaps, I will not be able to succeed with MySQL, or, perhaps, > with either MySQL and Postgres. > > I do not know much about Postgres, but I am very eager to learn and > see if I can use it for my purposes more effectively than MySQL. > > I cannot shell out $47,000 per CPU for Oracle for this project. > > To be more specific, the batch queries that I would do, I hope, > would either use small JOINS of a small dataset to a large dataset, or > just SELECTS from one big table. > > So... Can Postgres support a 5-10 TB database with the use pattern > stated above? > > i This is why database consultants get paid big bucks. Answering questions like "what DB product should I buy?" and "how should I lay out my database system?" are all bread and butter for these guys. And you want that sort of analysis for free? Huh. Basically, unless you're a high power database person who knows the ins and outs of each vendor's products, how to optimize performance (that's $150K/year easy), and can design and administer such systems, I think you're going to have do the research own your own. Back in my day of system performance monitoring, a customer would ask me how their workload would run on a specific hardware configuration or why it's running poorly and should they buy the larger machine. Poorly written applications will still perform poorly on a bigger machine, just faster. Spend time designing something well with growth, scale, archival, and _yes Virginia, fricken backups_ and you'll save yourself some time and stress. Developers never think of "how do I back this up?", they just ask the datacenter to do it. After they've gone live. I have no idea how well you've designed your database or your experience in such matters. You sound like a hardware guy from what you've described. If this is just personal project, build it using MySQL, run some benchmarks, rebuild it with the same data on Postgres and run the same benchmarks. If they're the same, add more data and see what happens? Keep adding more data until you see a difference. I've only used MySQL without transactions, so I can't tell you how well it will perform compared to Postgres which I think had them from day 1. Some DBMS' have other ways of indexing tables beyond b-trees (DEC's RdB had a hashing method that allowed for very fast reads on certain types of tables). I don't know if MySQL or Postgres offer that level of optimization. I just backup the stuff. -- DeeDee, don't press that button! DeeDee! NO! Dee... [I filter all Goggle Groups posts, so any reply may be automatically ignored]
Back to comp.databases.postgresql | Previous | Next — Previous in thread | Next in thread | Find similar
MySQL vs Postgres for a "data warehouse", 5-10 TB Ignoramus5263 <ignoramus5263@NOSPAM.5263.invalid> - 2011-09-10 00:14 -0500
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Michael Vilain <vilain@NOspamcop.net> - 2011-09-09 23:30 -0700
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Luuk <Luuk@invalid.lan> - 2011-09-10 10:13 +0200
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Ignoramus13208 <ignoramus13208@NOSPAM.13208.invalid> - 2011-09-10 09:34 -0500
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB John Levine <johnl@iecc.com> - 2011-09-10 19:47 +0000
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Ignoramus13208 <ignoramus13208@NOSPAM.13208.invalid> - 2011-09-10 16:02 -0500
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB The Natural Philosopher <tnp@invalid.invalid> - 2011-09-11 15:40 +0100
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Axel Schwenke <axel.schwenke@gmx.de> - 2011-09-11 18:11 +0200
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Ignoramus21330 <ignoramus21330@NOSPAM.21330.invalid> - 2011-09-11 12:07 -0500
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Norman Peelman <npeelmandog@cfl.rr.com> - 2011-09-10 19:06 -0400
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Matthew Woodcraft <mattheww@chiark.greenend.org.uk> - 2011-09-11 10:24 +0100
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Ignoramus21330 <ignoramus21330@NOSPAM.21330.invalid> - 2011-09-11 07:35 -0500
Re: MySQL vs Postgres for a "data warehouse", 5-10 TB Bodo <bodo@nomail.org.invalid> - 2011-09-23 13:01 +0200
csiph-web