Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.java.programmer > #3770 > unrolled thread

SQL Puzzle - too many dimensions

Started byZapanaz <http://joecosby.com/code/mail.pl@foo.com>
First post2011-05-07 16:13 -0700
Last post2011-05-08 07:45 -0400
Articles 5 on this page of 25 — 6 participants

Back to article view | Back to comp.lang.java.programmer


Contents

  SQL Puzzle - too many dimensions Zapanaz <http://joecosby.com/code/mail.pl@foo.com> - 2011-05-07 16:13 -0700
    Re: SQL Puzzle - too many dimensions Lawrence D'Oliveiro <ldo@geek-central.gen.new_zealand> - 2011-05-08 14:00 +1200
      Re: SQL Puzzle - too many dimensions Robert Klemme <shortcutter@googlemail.com> - 2011-05-08 13:08 +0200
        Re: SQL Puzzle - too many dimensions Lawrence D'Oliveiro <ldo@geek-central.gen.new_zealand> - 2011-05-09 00:01 +1200
          Re: SQL Puzzle - too many dimensions Lew <noone@lewscanon.com> - 2011-05-08 08:08 -0400
    Re: SQL Puzzle - too many dimensions Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-05-08 01:02 -0300
      Re: SQL Puzzle - too many dimensions Robert Klemme <shortcutter@googlemail.com> - 2011-05-08 13:03 +0200
        Re: SQL Puzzle - too many dimensions Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-05-08 11:14 -0300
          Re: SQL Puzzle - too many dimensions Robert Klemme <shortcutter@googlemail.com> - 2011-05-08 23:37 +0200
      Re: SQL Puzzle - too many dimensions Zapanaz <http://joecosby.com/code/mail.pl@foo.com> - 2011-05-08 16:06 -0700
        Re: SQL Puzzle - too many dimensions Robert Klemme <shortcutter@googlemail.com> - 2011-05-09 22:10 +0200
          Re: SQL Puzzle - too many dimensions Lawrence D'Oliveiro <ldo@geek-central.gen.new_zealand> - 2011-05-10 13:50 +1200
            Re: SQL Puzzle - too many dimensions Robert Klemme <shortcutter@googlemail.com> - 2011-05-10 00:06 -0700
              Re: SQL Puzzle - too many dimensions Lawrence D'Oliveiro <ldo@geek-central.gen.new_zealand> - 2011-05-13 22:55 +1200
                Re: SQL Puzzle - too many dimensions Robert Klemme <shortcutter@googlemail.com> - 2011-05-15 18:52 +0200
            Re: SQL Puzzle - too many dimensions Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-05-10 06:35 -0300
              Re: SQL Puzzle - too many dimensions Lawrence D'Oliveiro <ldo@geek-central.gen.new_zealand> - 2011-05-13 22:57 +1200
                Re: SQL Puzzle - too many dimensions Lew <noone@lewscanon.com> - 2011-05-13 07:45 -0400
                Re: SQL Puzzle - too many dimensions Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-05-15 13:09 -0300
                  Re: SQL Puzzle - too many dimensions Robert Klemme <shortcutter@googlemail.com> - 2011-05-15 18:56 +0200
    Re: SQL Puzzle - too many dimensions Roedy Green <see_website@mindprod.com.invalid> - 2011-05-07 23:32 -0700
      Re: SQL Puzzle - too many dimensions Roedy Green <see_website@mindprod.com.invalid> - 2011-05-07 23:36 -0700
        Re: SQL Puzzle - too many dimensions Lawrence D'Oliveiro <ldo@geek-central.gen.new_zealand> - 2011-05-08 20:43 +1200
          Re: SQL Puzzle - too many dimensions Robert Klemme <shortcutter@googlemail.com> - 2011-05-09 01:27 -0700
    Re: SQL Puzzle - too many dimensions Lew <noone@lewscanon.com> - 2011-05-08 07:45 -0400

Page 2 of 2 — ← Prev page 1 [2]


#3793

FromRoedy Green <see_website@mindprod.com.invalid>
Date2011-05-07 23:32 -0700
Message-ID<a0ecs6tjngbukgu3rq539hkfuqbfsfl33r@4ax.com>
In reply to#3770
On Sat, 07 May 2011 16:13:52 -0700, Zapanaz
<http://joecosby.com/code/mail.pl@foo.com> wrote, quoted or indirectly
quoted someone who said :

>My job is to come up with one SQL query which will return the
>highest-rated version of each song.


You could create a query ordered by descending rating and just read
the first record.

I presume there is some sort of max function that returns the max
value over a set of some field.

Find the max, then do a query for match on that max value.
-- 
Roedy Green Canadian Mind Products
http://mindprod.com
How long did it take after the car was invented before owners understood 
cars would not work unless you regularly changed the oil and the tires?
We have gone 33 years and still it is rare to uncover a user who 
understands computers don't work without regular backups.

[toc] | [prev] | [next] | [standalone]


#3795

FromRoedy Green <see_website@mindprod.com.invalid>
Date2011-05-07 23:36 -0700
Message-ID<7becs6h10lufrqvb8ciii7vpafjqldv3f0@4ax.com>
In reply to#3793
On Sat, 07 May 2011 23:32:09 -0700, Roedy Green
<see_website@mindprod.com.invalid> wrote, quoted or indirectly quoted
someone who said :

>
>Find the max, then do a query for match on that max value.

You need to combine these into one query, otherwise the max could
change just prior to the second query.
-- 
Roedy Green Canadian Mind Products
http://mindprod.com
How long did it take after the car was invented before owners understood 
cars would not work unless you regularly changed the oil and the tires?
We have gone 33 years and still it is rare to uncover a user who 
understands computers don't work without regular backups.

[toc] | [prev] | [next] | [standalone]


#3798

FromLawrence D'Oliveiro <ldo@geek-central.gen.new_zealand>
Date2011-05-08 20:43 +1200
Message-ID<iq5l3q$pr5$2@lust.ihug.co.nz>
In reply to#3795
In message <7becs6h10lufrqvb8ciii7vpafjqldv3f0@4ax.com>, Roedy Green wrote:

> You need to combine these into one query, otherwise the max could
> change just prior to the second query.

Why does that matter?

[toc] | [prev] | [next] | [standalone]


#3845

FromRobert Klemme <shortcutter@googlemail.com>
Date2011-05-09 01:27 -0700
Message-ID<3c106267-497c-48a4-bdcb-5b002b3229b7@p13g2000yqh.googlegroups.com>
In reply to#3798
On 8 Mai, 10:43, Lawrence D'Oliveiro <l...@geek-
central.gen.new_zealand> wrote:
> In message <7becs6h10lufrqvb8ciii7vpafjqldv...@4ax.com>, Roedy Green wrote:
>
> > You need to combine these into one query, otherwise the max could
> > change just prior to the second query.
>
> Why does that matter?

Because consistency is not guaranteed any more.  If the row with the
max is removed after the first query, the second query will lack a row
for that song_id.

More formally, because TX isolation level "read committed" allows for
non repeatable reads to occur.

If you want to know more please read up on "statement level read
consistency".  The whole chapter is very informative.

http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/consist.htm#CNCPT88960
http://www.postgresql.org/docs/8.4/interactive/transaction-iso.html

Also it's usually faster to do it in one query.

Cheers

robert


PS: I advise anybody doing development of applications which
persistently store data in _any_ way (yes, this includes infamous
"NoSQL" storages) to make themselves familiar with transaction
handling.  Doing otherwise is grossly negligent and can (and will)
cause all sorts of issues from performance degradation to data
inconsistency (which is probably worse than data loss).  Actually, the
concept of "transaction" is not limited to persistent storage but also
to how an application deals with its state in memory.

[toc] | [prev] | [next] | [standalone]


#3803

FromLew <noone@lewscanon.com>
Date2011-05-08 07:45 -0400
Message-ID<iq5voi$7vq$1@news.albasani.net>
In reply to#3770
Zapanaz wrote:
> I know this is a java group, not a SQL group, I am trying to bounce
> this off as many heads as I can and most Java programmers I know are
> good at SQL.

I truly hope you aren't saying that you multi-posted your question to multiple 
groups.  Cross-posting would make sense, but multi-posting is a sin.

> So I've got a table, song_versions, in an Oracle database. It has

Don't let the idiot who spouts anti-Oracle sentiment without any engineering 
basis affect your thinking at all.  Oracle, DB2, PostgreSQL, Derby, H2, SQL 
Server and a few others are excellent DBMS products.  (MySQL is not.)

> columns like this
>
> SONG_ID | SONG_VERSION_ID | RATING |
>
> SONG_ID is a key into the table songs. A song could be something like
> this, 'Money (That's What I Want)' by Berry Gordie
>
> http://www.youtube.com/watch?v=z6xkT7FMyTc
>
> And I have three versions of this song, one by Barrett Strong, one by
> The Beatles, and one by The Rolling Stones.
>
> I also have another song, 'Butterfly Caught' by Massive Attack. I only
> have the one version of this song.
>
> So my table looks like this
>
> SONG_ID | SONG_VERSION_ID | RATING |
> Money (That's What I Want) | Barrett Strong | 3 |
> Money (That's What I Want) | Rolling Stones | 2 |
> Money (That's What I Want) | Beatles | 1 |
> Butterfly Caught | Massive Attack | 1 |
>
> The actual values for the ID columns in the table are integers, this
> is just for clarity.

The "actual" IDs are the ones you show.  The physical *surrogate* IDs for 
which you use integers are a hack.  They are not the real IDs; they are used 
only for the convenience of implementation and should not be exposed to 
domain-relevant logic.

> My job is to come up with one SQL query which will return the
> highest-rated version of each song.

Others have answered this so I will not repeat their good advice.  The key 
point in the solutions presented was sub-SELECTs, or correlated subqueries. 
Study up on those.

> So for Money, the highest  rated version is the Barrett Strong
> version. For Butterfly Caught, there is only one version, so I want to
> return that, the Massive Attack version. So I would want my result to
> be the Barrett Strong version of Money and the Massive Attack version
> fo Butterfuly Caught.
>
> I can't think of a way to do it.
>
> To get the MAX(Rating), I would have to GROUP BY song_id. So I can get
> the numerical highest rating for each song, I can get a
>
> song_id | rating
>
> result set, but what I need is the song_version_id corresponding to
> that rating.
>
> Or I don't know, maybe something like
>
> select song_version_id
>   from song_versions
>   where rating =
>    (select max(rating)
>      from song_versions
>      where song_id = [??? I don't know ... the same song ID I'm talking
> about in the outer query]
>    )
>
> If I group by song_id, I lose the song_version_id.
>
> (Some DBs maybe I could cheat and return columns that aren't group by
> columns, but oracle won't let me pull that one.
>
> I just tried this in MySql
>
> select song_id, song_version_id, max(rating)
>   from song_versions
>   group by song_id
>
> and it worked, I got the Barrett Strong song_version_id, but then
> there's nothing in my query that actually gaurantees that result, so I
> don't know if I would want to depend on that in real code.
>
> But anyway, this is in an Oracle DB, and they don't let you include
> columns in a group-by query that aren't explicitly group-by that way.
> )

Ummm, that's SQL, not just Oracle.  Kinda hard to use an SQL-based product if 
you don't know the language and what is standard for it.

And you're right, this is a Java newsgroup.  People here do know some SQL, but 
you should not be surprised if you get a Java coding solution here instead of 
an SQL one, say one involving JPA.  A SQL or Oracle group is much more 
appropriate for this question.

It would help if you stick around and participate in future discussions and 
don't just eff us then forget us.  We love to help, as you saw from the high 
quality of answers others have given you, but there is a quid pro quo.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

[toc] | [prev] | [standalone]


Page 2 of 2 — ← Prev page 1 [2]

Back to top | Article view | comp.lang.java.programmer


csiph-web