Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.databases.postgresql > #151

Re: Is PostgreSQL good? (was: migrating oracle to postgres)

From Mladen Gogala <gogala.mladen@gmail.com>
Newsgroups comp.databases.postgresql
Subject Re: Is PostgreSQL good? (was: migrating oracle to postgres)
Date 2011-06-24 13:53 +0000
Organization solani.org
Message-ID <pan.2011.06.24.13.53.38@gmail.com> (permalink)
References (1 earlier) <1308640710.210659@proxy.dienste.wien.at> <pan.2011.06.21.13.38.10@gmail.com> <1308738638.232318@proxy.dienste.wien.at> <pan.2011.06.22.12.02.22@gmail.com> <1308911383.819034@proxy.dienste.wien.at>

Show all headers | View raw


On Fri, 24 Jun 2011 12:29:19 +0200, Laurenz Albe wrote:

> Mladen Gogala wrote:
>> So what is this post all about, if you don't want to enter discussion?
> 
> I have no problem to discuss hints and other perceived or real
> shortcomings of PostgreSQL, only before I do this I wanted to ascertain
> that you are still able to post a civil reply. I wanted to bring the
> heat level down.

You wanted to bring the level down by telling me that I sound like a 
broken record? I do find that interesting.

> 
> I think that founded criticism of PostgreSQL is not off-topic here, and
> I have no problem with it. Yet lately you have spread a great deal of
> FUD around here.
> 
> So here I go.
> 
> About hints:
> ============
> 
> Let me say that I personally do not feel very strongly either way - I
> have used hints in Oracle and get by without them in PostgreSQL.

Unfortunately, I was forced to kill a porting project because I couldn't 
have make the SQL perform properly within the allotted time frame. The 
goal was to make an oracle application written in Java work against 
Postgres. The tables and indexes were converted using ora2pg. There were 
few triggers which I recoded manually. However, the performance was 
abysmal because of bad plans and the only chance to make the deadline 
was  to have a mechanism to force specific plans. I didn't have that. I 
was even reasonable enough to suggest that such mechanism should be only 
put on TODO list, but nobody wanted to even discuss that. I was 
steamrolled, so I killed the project and stopped all the plans for 
porting applications. In the end, we both lost: Postgres lost a user, I 
lost an opportunity to save some money for the company.


> Sifting through some of the e-mails, here is a list of some arguments
> against query hints that I found:
> - They create a maintenance burden whenever there is a change in the
> optimizer. - 

They do. The optimizer itself is a maintenance burden. Every new feature 
is a maintenance burden.

> They are a hack that does not address the real problem (a
> deficiency in the   planer) 

This argument is ludicrous. Hints are the way around the deficiencies in 
the planner. Achieving the "perfect optimizer" is impossible at this 
state of technology. Oracle has excellent optimizer and vast resources 
and yet it provides hints. Hints are a solution of the last resort. 
Undesired and unloved, but it has to be there when necessary. If they are 
not there, the database will not get used.

> and may not work well in the next version or when the data
>   change.

Well, that's up to the maintainers. However, the type of decisions 
influenced by hints is unlikely to change between versions.

> - They will cause people to work around deficiencies in the planer,
> rather than
>   report their problem so it can be fixed.

Unfortunately, the real developers and project managers work on projects 
that have deadlines. Users must have tuning methodology. Postgres will 
have to trust its users to report problems. On the other hand, if the 
community doesn't trust its users, it is very likely that it will not 
have many of them. Personally, I don't think that posting the problem on 
the mailing list is a satisfactory tuning methodology. I'd rather kill 
the project than do it that way. So I did.


> 
> The first point is something that could be handled if more able people
> were willing to put resources into it. The third point is probably more
> idealistic than a commercial vendor would be (fewer bug reports mean
> less work for support). Still I think all three are valid points.

On the other hand, I interpret those points as a deep mistrust of the 
Postgres community to its users and the reluctance to provide the 
necessary tools for the users to have chance to help themselves. 


> The only recognizable argument you made so far is "everybody else has
> them, so they must be the right thing". 

No. The argument that I have made is not about everybody else. That 
"argument" was thrown when I was faced with the "everybody else is 
against" argument. The real argument is that hints are needed and that 
the users must have a way of helping themselves. What you and the rest of 
the Postgres community are arguing against is allowing users to help 
themselves. Essentially, Postgres is the only database that doesn't allow 
its users to help themselves. That is the argument.

> PostgreSQL hackers will reply
> that they are out to do things *better* than everybody else, so while it
> is certainly a good idea to seriously consider something that everybody
> else does, that does not dictate that PostgreSQL has to do it the same
> way.

Let's be realistic here. Postgres is one of the available RDBMS systems 
on the market, far behind other systems in popularity, even behind MySQL. 
Nobody uses PgSQL for benchmarks (see www.tpc.org) because of the 
performance problems. Such grandeur is not justified. The goal must be to 
give the users something that they can use as an alternative to the much 
better performing commercial databases like Oracle or DB2. In other 
words, Postgres must be an alternative to Oracle. At present, it isn't. I 
tried, I failed and looked elsewhere. I am not the only one, I am just 
the most vocal one.
When NY PUG was revived in 2010, there were people from BoA, Goldman 
Sachs, Verizon and other major NYC companies. They stopped coming to the 
meetings and PGEast in NYC in March this year was a flop. I haven't 
noticed any attempts to figure out what was the problem.


> 
> If you really want to influence the PostgreSQL planer, there is a number
> of configuration parameters that allow you to do this (only for a whole
> query though). Also, a query rewrite can often do that same thing that a
> hint would do (didn't you post an example recently?).

Yes, I did post an example. There are limited cases, but that is not 
enough. There must be a way of making SQL perform adequately in minutes. 
If a VP is looking over your shoulder and asking "is it working yet", 
there is no time to wait for Tom Lane to respond. 

> 
> Interestingly, Oracle also seems to think that hints are not so great. I
> quote from
> http://download.oracle.com/docs/cd/E11882_01/server.112/e17118/
sql_elements006.htm#SQLRF00219

Don't get me wrong. Hints are the last resort, the solution used only in 
desperation. However, they must be there and they are there, at least in 
Oracle. Real life situations, with real life problems also makes 
desperate situations highly probable. As DBAs, we are frequently in dire 
straits, and have to be the sultans of swing. And we can't swing if don't 
have hints.


> 
>   Hints were introduced in Oracle7, when users had little recourse if
>   the optimizer generated suboptimal plans. Now Oracle provides a number
>   of tools, including the SQL Tuning Advisor, SQL plan management, and
>   SQL Performance Analyzer, to help you address performance problems
>   that are not solved by the optimizer. Oracle strongly recommends that
>   you use those tools rather than hints. The tools are far superior to
>   hints, because when used on an ongoing basis, they provide fresh
>   solutions as your data and database environment change.

Of course that Oracle recommends those tools. Performance and diagnostic 
packs are $5000 per CPU thread, each. Therein lies the problem. If it 
wasn't for Oracle pricing, I wouldn't even look twice at Postgres, would 
try the pilot project and discard it straight away. This is sales pitch. 
Oracle, however, knows better than to take hints away. Users must have 
the self help methodology. The main problem of Postgres is that it 
doesn't provide that. We've both read Cary Millsap's book. This book 
outlines the tuning methodology for Oracle. The tuning methodology has 
two parts to it:

1) Figure out where the time is spent
2) Shorten the time.

What is so groundbreaking in Millsap's book is the recommendation to 
always start from the application instead of the database and to stop 
looking at "performance indicators" like the infamous "buffer cache hit 
ratio" or "BCHR", as it was known before Connor McDonald's method of 
adjusting it according to your heart's desires.
In order to aid users with that methodology, Oracle has developed an 
extensive set of V$ tables which help users dissect the performance 
problems and en extensive system of hints which can be used to solve the 
performance problems. Those "profiles" mentioned in the Oracle manual are 
nothing more than the complete sets of hints. See the recent book by 
Christian Antognini. In other words, OEM will write your hints for you 
and store them as "profile".
Speaking of the tuning methodology, that's precisely the problem with 
Postgres: it has none. When faced with performance problems in Oracle, 
there are things I can do, short of rewriting the application and 
rethinking the data model. In Postgres, I can only post to the list and 
hope for the best.


> 
>   Hints should be used sparingly, and only after you have collected
>   statistics on the relevant tables and evaluated the optimizer plan
>   without hints using the EXPLAIN PLAN statement. Changing database
>   conditions as well as query performance enhancements in subsequent
>   releases can have significant impact on how hints in your code affect
>   performance.

Hints in the code are needed mainly for two purposes: 

1) To ensure/prevent the use of the particular index.
2) To select a join method (hash/merge/nl).

I don't see the need for those decisions being eliminated anytime soon. 
All versions of the optimizer will have to make such decisions in the 
foreseeable future. In the likely situation that the decision will 
sometimes be wrong, the users must be allowed to make the choice for the 
optimizer. The main problem with Postgres is that it has no self help 
methodology for users. I once asked Joshua Drake what is user to do when 
faced with a performance problem. The answer was something that sounds 
like a bad joke: "post it in the mailing list and we'll tell you what to 
do". Unfortunately, I cannot accept such "tuning methodology". Postgres 
will have to come up with a better one or sink into obscurity.

> 
> So it seems that Oracle considers hints a legacy from when their
> optimizer was even worse and they had nothing better to offer.

I don't see it that way, as explained above.

 
> Oracle has to burn extra cycles (and disk access?) to move rows away
> (and maybe back) right when the performance might be needed elsewhere.

Well, Oracle still defeats Postgres in performance, and by a wide margin. 
That means that the cost is not as great as it may seem at the first 
glance.

> Moreover, there's the problem of having to maintain an extra data
> structure (undo space) which may run out, causing an expensive
> long-running operation to fail randomly.

Well, there is a whole philosophy about "snapshot too old". That, I 
suppose, would be open to vote.

> ROWID:
> ======
> 
> You make the point that because of VACUUM there is nothing compared to
> Oracle's ROWID. But do you really need that? Shouldn't a row be
> referenced by its primary key?

Rowid is needed for the global indexes. I am frustrated with Postgres 
partitioning, which leaves a lot to be desired. I can build a global 
index using Sphinx, and yes, it does use the primary key. Unfortunately, 
the problem with building the global index on the partitioned table is 
precisely the lack of the universal primary key, like ROWID. Somebody has 
mentioned CTID, but here is what the manual (
http://www.postgresql.org/docs/9.0/static/ddl-system-columns.html)

says:
************************************************************************
ctid

    The physical location of the row version within its table. Note that 
although the ctid can be used to locate the row version very quickly, a 
row's ctid will change if it is updated or moved by VACUUM FULL. 
Therefore ctid is useless as a long-term row identifier. The OID, or even 
better a user-defined serial number, should be used to identify logical 
rows. 
************************************************************************

That is the reason for nagging against vacuum and CTID.

> 
> Oracle itself says in
> http://download.oracle.com/docs/cd/E11882_01/server.112/e17118/
pseudocolumns008.htm#SQLRF00254
> 
>   You should not use ROWID as the primary key of a table. If you delete
>   and reinsert a row with the Import and Export utilities, for example,
>   then its rowid may change. If you delete a row, then Oracle may
>   reassign its rowid to a new row inserted later.

Oracle, however, does have partitioning with global indexes, which is 
what I was really after. Let me clarify: I do not advocate using ROWID as 
the primary key, nor do I even allow that in the databases entrusted to 
me. 


> 
> Other maintenance commands like ALTER TABLE ... SHRINK SPACE will also
> change ROWID. 

Only if you allow it. There is a command "alter table <name> enable row 
movement". If this command is not executed, rowid remains constant during 
the lifetime of the row. That also makes index maintenance much more 
expensive, which is why people tend to not do that lightly.



> I do not understand how lack of ROWID should affect global indexes on
> partitioned tables. Can you explain?

Index is a mathematical structure called B*Tree. It locates the key and 
with the key, there is the information how to locate the row(s) 
containing this key. If you want to have a global index, spanning several 
tables, you must have a global identifier for rows, something like 
file.block.offset, which is precisely what Oracle ROWID is. Oracle's 
solution to partitioning was to disassociate segment from an object, 
which used to be the same before Oracle8.  Object in Oracle can have many 
segments, which is a sort of dirty solution. I like the Postgres idea of 
inheritance, but I really need global indexes. I don't think that having 
them is possible without a permanent global row identifier. 


> 
> Fixed size of WAL segments:
> ===========================
> 
> Maybe I am dense, but what is the problem there?

Sheer number of files is annoying. I think I can live with that, it's 
just an aesthetic complaint. However, it is indicative of the fact that 
the creators of Postgres have never been DBA's working on the real life 
databases and do not think high of us, "the DBA types". Josh Berkus 
expressed his disdain for Dan Tow and people like him. In return, I can 
express my contempt for Josh and his ilk. Unfortunately for Josh, it is 
people like me who decide which database will get to live in the 
corporate server room. I can only advise an attitude re-adjustment. If 
Postgres is meant for the corporate users, then it should provide us with 
what we need, in order to be able to do our jobs. Only after that can we 
work on the perfect optimizer, the final frontier and the mission to 
boldly go where no RDBMS system has gone before.




-- 
http://mgogala.byethost5.com

Back to comp.databases.postgresql | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

migrating oracle to postgres Username <user@domain.tld> - 2011-06-20 19:28 +0200
  Re: migrating oracle to postgres Mladen Gogala <no@email.here.invalid> - 2011-06-20 20:08 +0000
    Re: migrating oracle to postgres Username <user@domain.tld> - 2011-06-20 23:45 +0200
      Re: migrating oracle to postgres Mladen Gogala <no@email.here.invalid> - 2011-06-20 22:37 +0000
      Re: migrating oracle to postgres Mladen Gogala <gogala.mladen@gmail.com> - 2011-06-21 02:01 +0000
  Re: migrating oracle to postgres "Laurenz Albe" <invite@spam.to.invalid> - 2011-06-21 09:18 +0200
    Re: migrating oracle to postgres Mladen Gogala <gogala.mladen@gmail.com> - 2011-06-21 13:38 +0000
      Is PostgreSQL good? (was: migrating oracle to postgres) "Laurenz Albe" <invite@spam.to.invalid> - 2011-06-22 12:30 +0200
        Re: Is PostgreSQL good? Harry Tuttle <OTPXDAJCSJVU@spammotel.com> - 2011-06-22 13:15 +0200
        Re: Is PostgreSQL good? (was: migrating oracle to postgres) Mladen Gogala <gogala.mladen@gmail.com> - 2011-06-22 12:02 +0000
          Re: Is PostgreSQL good? "M. Strobel" <sorry_no_mail_here@nowhere.dee> - 2011-06-22 21:52 +0200
            Re: Is PostgreSQL good? Mladen Gogala <no@email.here.invalid> - 2011-06-22 22:43 +0000
          Re: Is PostgreSQL good? (was: migrating oracle to postgres) "Laurenz Albe" <invite@spam.to.invalid> - 2011-06-24 12:29 +0200
            Re: Is PostgreSQL good? Hans Castorp <REWYRLXHEGHO@spammotel.com> - 2011-06-24 12:57 +0200
              Re: Is PostgreSQL good? "Laurenz Albe" <invite@spam.to.invalid> - 2011-06-24 14:53 +0200
              Re: Is PostgreSQL good? Mladen Gogala <no@email.here.invalid> - 2011-06-24 16:32 +0000
            Re: Is PostgreSQL good? (was: migrating oracle to postgres) Mladen Gogala <gogala.mladen@gmail.com> - 2011-06-24 13:53 +0000
              Re: Is PostgreSQL good? (was: migrating oracle to postgres) "Laurenz Albe" <invite@spam.to.invalid> - 2011-06-27 11:17 +0200
                Re: Is PostgreSQL good? (was: migrating oracle to postgres) Mladen Gogala <gogala.mladen@gmail.com> - 2011-06-28 06:11 +0000
                Re: Is PostgreSQL good? "Laurenz Albe" <invite@spam.to.invalid> - 2011-06-28 10:34 +0200
                Re: Is PostgreSQL good? Mladen Gogala <gogala.mladen@gmail.com> - 2011-06-28 13:30 +0000
                Re: Is PostgreSQL good? "Laurenz Albe" <invite@spam.to.invalid> - 2011-06-29 09:50 +0200
                Re: Is PostgreSQL good? Mladen Gogala <no@email.here.invalid> - 2011-06-29 21:38 +0000
                Re: Is PostgreSQL good? "Laurenz Albe" <invite@spam.to.invalid> - 2011-06-30 09:14 +0200
                Re: Is PostgreSQL good? Mladen Gogala <mgogala@no.address.invalid> - 2011-06-30 13:22 +0000
            Re: Is PostgreSQL good? Don Y <nowhere@here.com> - 2011-06-30 07:05 -0700
              Re: Is PostgreSQL good? Mladen Gogala <no@email.here.invalid> - 2011-06-30 15:41 +0000
                Re: Is PostgreSQL good? Richard Kettlewell <rjk@greenend.org.uk> - 2011-06-30 16:49 +0100
                Re: Is PostgreSQL good? Mladen Gogala <no@email.here.invalid> - 2011-06-30 19:24 +0000
                Re: Is PostgreSQL good? Don Y <nowhere@here.com> - 2011-06-30 13:21 -0700
                Re: Is PostgreSQL good? Mladen Gogala <no@email.here.invalid> - 2011-06-30 22:42 +0000
                Re: Is PostgreSQL good? Don Y <nowhere@here.com> - 2011-07-01 11:57 -0700
                Re: Is PostgreSQL good? Mladen Gogala <gogala.mladen@gmail.com> - 2011-07-01 22:58 +0000
                Re: Is PostgreSQL good? Richard Kettlewell <rjk@greenend.org.uk> - 2011-07-01 14:07 +0100
                Re: Is PostgreSQL good? Mladen Gogala <gogala.mladen@gmail.com> - 2011-07-01 13:29 +0000
                Re: Is PostgreSQL good? Don Y <nowhere@here.com> - 2011-06-30 13:17 -0700
                Re: Is PostgreSQL good? Mladen Gogala <gogala.mladen@gmail.com> - 2011-06-30 23:53 +0000
              Re: Is PostgreSQL good? "Laurenz Albe" <invite@spam.to.invalid> - 2011-07-01 10:13 +0200
                Re: Is PostgreSQL good? Mladen Gogala <gogala.mladen@gmail.com> - 2011-07-01 13:38 +0000
                Re: Is PostgreSQL good? "Laurenz Albe" <invite@spam.to.invalid> - 2011-07-01 16:58 +0200
                Re: Is PostgreSQL good? Mladen Gogala <no@email.here.invalid> - 2011-07-01 15:33 +0000
                Re: Is PostgreSQL good? Don Y <nowhere@here.com> - 2011-07-01 11:39 -0700
                Re: Is PostgreSQL good? "Laurenz Albe" <invite@spam.to.invalid> - 2011-07-04 16:39 +0200
                Re: Is PostgreSQL good? Don Y <nowhere@here.com> - 2011-07-04 12:11 -0700
                Re: Is PostgreSQL good? "Laurenz Albe" <invite@spam.to.invalid> - 2011-07-05 08:58 +0200
                Re: Is PostgreSQL good? Don Y <nowhere@here.com> - 2011-07-05 11:29 -0700
                Re: Is PostgreSQL good? "Laurenz Albe" <invite@spam.to.invalid> - 2011-07-06 09:00 +0200
                Re: Is PostgreSQL good? Don Y <nowhere@here.com> - 2011-07-06 01:00 -0700
                Re: Is PostgreSQL good? "Laurenz Albe" <invite@spam.to.invalid> - 2011-07-07 09:05 +0200
                Re: Is PostgreSQL good? Don Y <nowhere@here.com> - 2011-07-07 00:34 -0700
                Re: Is PostgreSQL good? Matthew Woodcraft <mattheww@chiark.greenend.org.uk> - 2011-07-04 20:12 +0100
                Re: Is PostgreSQL good? Don Y <nowhere@here.com> - 2011-07-04 12:24 -0700
      Re: migrating oracle to postgres Robert Klemme <shortcutter@googlemail.com> - 2011-06-23 17:48 +0200
        Re: migrating oracle to postgres Mladen Gogala <no@email.here.invalid> - 2011-06-23 16:42 +0000
          Re: migrating oracle to postgres Mladen Gogala <no@email.here.invalid> - 2011-06-23 18:57 +0000
          Re: migrating oracle to postgres Robert Klemme <shortcutter@googlemail.com> - 2011-06-26 17:35 +0200
            Re: migrating oracle to postgres Mladen Gogala <gogala.mladen@gmail.com> - 2011-06-26 19:08 +0000
          Re: migrating oracle to postgres Robert Klemme <shortcutter@googlemail.com> - 2011-06-27 00:32 -0700
            Re: migrating oracle to postgres Mladen Gogala <no@email.here.invalid> - 2011-06-28 16:00 +0000
              Re: migrating oracle to postgres Robert Klemme <shortcutter@googlemail.com> - 2011-06-28 22:28 +0200

csiph-web