Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.databases.postgresql > #861 > unrolled thread
| Started by | Bill Gunshannon <bill.gunshannon@gmail.com> |
|---|---|
| First post | 2019-04-27 14:59 -0400 |
| Last post | 2020-02-22 09:15 +0100 |
| Articles | 10 — 4 participants |
Back to article view | Back to comp.databases.postgresql
Table with a variable number of elements in a column Bill Gunshannon <bill.gunshannon@gmail.com> - 2019-04-27 14:59 -0400
Re: Table with a variable number of elements in a column John-Paul Stewart <jpstewart@sympatico.ca> - 2019-04-27 15:39 -0400
Re: Table with a variable number of elements in a column Bill Gunshannon <bill.gunshannon@gmail.com> - 2019-04-27 15:48 -0400
Re: Table with a variable number of elements in a column John-Paul Stewart <jpstewart@sympatico.ca> - 2019-04-27 16:02 -0400
Re: Table with a variable number of elements in a column Bill Gunshannon <bill.gunshannon@gmail.com> - 2019-04-27 16:29 -0400
Re: Table with a variable number of elements in a column John-Paul Stewart <jpstewart@sympatico.ca> - 2019-04-27 17:10 -0400
Re: Table with a variable number of elements in a column Bill Gunshannon <bill.gunshannon@gmail.com> - 2019-04-27 18:40 -0400
Re: Table with a variable number of elements in a column John-Paul Stewart <jpstewart@sympatico.ca> - 2019-04-27 19:26 -0400
Re: Table with a variable number of elements in a column alexander.d.mills@gmail.com - 2020-02-21 14:02 -0800
Re: Table with a variable number of elements in a column Christian Barthel <bch@online.de> - 2020-02-22 09:15 +0100
| From | Bill Gunshannon <bill.gunshannon@gmail.com> |
|---|---|
| Date | 2019-04-27 14:59 -0400 |
| Subject | Table with a variable number of elements in a column |
| Message-ID | <gijn3vF68nnU1@mid.individual.net> |
Not sure what the right terminology for this is, but I will provide my example and see if anyone can tell me how I might do this. I want to create a database table for an index of all my record albums. The basic stuff is easy. Title, Artist, Publisher and the Publisher's ID # as a primary unique key. But then I get to the hard part. Number of tracks and then a list of those tracks. This would, obviously, be different and variable from album to album. So, I need a way to define a table that has a variable number of fields depending on the value in Number-of-Tracks. Is there any way in standard SQL to do this without just having a large number of fields and only using the number you need? bill
[toc] | [next] | [standalone]
| From | John-Paul Stewart <jpstewart@sympatico.ca> |
|---|---|
| Date | 2019-04-27 15:39 -0400 |
| Message-ID | <gijpelF6n73U1@mid.individual.net> |
| In reply to | #861 |
On 2019-04-27 2:59 p.m., Bill Gunshannon wrote: > > > Not sure what the right terminology for this is, but I will > provide my example and see if anyone can tell me how I might > do this. > > I want to create a database table for an index of all my record > albums. > > The basic stuff is easy. > > Title, Artist, Publisher and the Publisher's ID # as a primary > unique key. > But then I get to the hard part. > Number of tracks and then a list of those tracks. > This would, obviously, be different and variable from album to album. > > So, I need a way to define a table that has a variable number of fields > depending on the value in Number-of-Tracks. The usual way to do this in a relational database is to have a separate table of tracks and define the relationship between the two tables. For example, add a unique album id number to your albums table. (You won't need the "number of tracks" field you propose.) Then have a tracks table that has album id (which refers back to the albums table), track number, track title, etc. You can then use a JOIN clause in your SQL SELECT statement to associate the track and album info with each other, or other queries to get the number of tracks in an album (e.g., "select count(*) from tracks where album_id = 1"), or whatever else you need to know about it. You probably want to read up on the concept of "foreign keys" and SQL JOIN clauses. That's the usual way to do it in a database, and a big part of the relational model.
[toc] | [prev] | [next] | [standalone]
| From | Bill Gunshannon <bill.gunshannon@gmail.com> |
|---|---|
| Date | 2019-04-27 15:48 -0400 |
| Message-ID | <gijpvpF6sfbU1@mid.individual.net> |
| In reply to | #862 |
On 4/27/19 3:39 PM, John-Paul Stewart wrote: > On 2019-04-27 2:59 p.m., Bill Gunshannon wrote: >> >> >> Not sure what the right terminology for this is, but I will >> provide my example and see if anyone can tell me how I might >> do this. >> >> I want to create a database table for an index of all my record >> albums. >> >> The basic stuff is easy. >> >> Title, Artist, Publisher and the Publisher's ID # as a primary >> unique key. >> But then I get to the hard part. >> Number of tracks and then a list of those tracks. >> This would, obviously, be different and variable from album to album. >> >> So, I need a way to define a table that has a variable number of fields >> depending on the value in Number-of-Tracks. > > The usual way to do this in a relational database is to have a separate > table of tracks and define the relationship between the two tables. For > example, add a unique album id number to your albums table. (You won't > need the "number of tracks" field you propose.) Well, I am going to want that number anyway. :-) > Then have a tracks > table that has album id (which refers back to the albums table), track > number, track title, etc. You can then use a JOIN clause in your SQL > SELECT statement to associate the track and album info with each other, > or other queries to get the number of tracks in an album (e.g., "select > count(*) from tracks where album_id = 1"), or whatever else you need to > know about it. But if I understand this correctly I would need a separate table for every album resulting in, potentially, thousands of tables. (OK, in my case hundreds but others may like this idea, too, when I finish the whole project.) > > You probably want to read up on the concept of "foreign keys" and SQL > JOIN clauses. That's the usual way to do it in a database, and a big > part of the relational model. Not sure what the "foreign keys" have to do with it, but I understand the JOIN part. I was just looking for an efficient way to do this and somehow the though of hundreds to thousands of TABLES scares me. But, thanks for the info. I will likely end out trying it a couple of different ways before deciding which is best. bill
[toc] | [prev] | [next] | [standalone]
| From | John-Paul Stewart <jpstewart@sympatico.ca> |
|---|---|
| Date | 2019-04-27 16:02 -0400 |
| Message-ID | <gijqqoF71obU1@mid.individual.net> |
| In reply to | #863 |
On 2019-04-27 3:48 p.m., Bill Gunshannon wrote: > On 4/27/19 3:39 PM, John-Paul Stewart wrote: >> On 2019-04-27 2:59 p.m., Bill Gunshannon wrote: >>> >>> >>> Not sure what the right terminology for this is, but I will >>> provide my example and see if anyone can tell me how I might >>> do this. >>> >>> I want to create a database table for an index of all my record >>> albums. >>> >>> The basic stuff is easy. >>> >>> Title, Artist, Publisher and the Publisher's ID # as a primary >>> unique key. >>> But then I get to the hard part. >>> Number of tracks and then a list of those tracks. >>> This would, obviously, be different and variable from album to album. >>> >>> So, I need a way to define a table that has a variable number of fields >>> depending on the value in Number-of-Tracks. >> >> The usual way to do this in a relational database is to have a separate >> table of tracks and define the relationship between the two tables. For >> example, add a unique album id number to your albums table. (You won't >> need the "number of tracks" field you propose.) > > Well, I am going to want that number anyway. :-) Yes, but the database can calculate it for you, for each album. Having a separate column could lead to problems where the column says something different than the actual number of tracks stored in the table. >> Then have a tracks >> table that has album id (which refers back to the albums table), track >> number, track title, etc. You can then use a JOIN clause in your SQL >> SELECT statement to associate the track and album info with each other, >> or other queries to get the number of tracks in an album (e.g., "select >> count(*) from tracks where album_id = 1"), or whatever else you need to >> know about it. > > But if I understand this correctly I would need a separate table for > every album resulting in, potentially, thousands of tables. (OK, in > my case hundreds but others may like this idea, too, when I finish > the whole project.) No, it's only one "tracks" table. That's why there's an album id column in it: so that each track is associated with one album. There'd be (album 1, track 1), (album 1, track 2), (album 2, track 1), and so on.... Each (album, track) tuple would have to be unique. But you'd keep all the tracks for all the albums in one table. >> >> You probably want to read up on the concept of "foreign keys" and SQL >> JOIN clauses. That's the usual way to do it in a database, and a big >> part of the relational model. > Not sure what the "foreign keys" have to do with it, The foreign key (in this case, the use of album id in the tracks table) is what connects the individual tracks to the albums, and ensures that the album id actually exists. By declaring album id (in the tracks table) as a foreign key referring to the albums table, Postgres will ensure that each track has a corresponding album. Without declaring it as a foreign key, you can mistakenly have a track who's album id doesn't exist. > but I understand > the JOIN part. I was just looking for an efficient way to do this and > somehow the though of hundreds to thousands of TABLES scares me. It's just two: albums and tracks. > But, thanks for the info. I will likely end out trying it a couple > of different ways before deciding which is best. Another topic to read up on is "database normalization".
[toc] | [prev] | [next] | [standalone]
| From | Bill Gunshannon <bill.gunshannon@gmail.com> |
|---|---|
| Date | 2019-04-27 16:29 -0400 |
| Message-ID | <gijsdrF7d5iU1@mid.individual.net> |
| In reply to | #864 |
On 4/27/19 4:02 PM, John-Paul Stewart wrote: > On 2019-04-27 3:48 p.m., Bill Gunshannon wrote: >> On 4/27/19 3:39 PM, John-Paul Stewart wrote: >>> On 2019-04-27 2:59 p.m., Bill Gunshannon wrote: >>>> >>>> >>>> Not sure what the right terminology for this is, but I will >>>> provide my example and see if anyone can tell me how I might >>>> do this. >>>> >>>> I want to create a database table for an index of all my record >>>> albums. >>>> >>>> The basic stuff is easy. >>>> >>>> Title, Artist, Publisher and the Publisher's ID # as a primary >>>> unique key. >>>> But then I get to the hard part. >>>> Number of tracks and then a list of those tracks. >>>> This would, obviously, be different and variable from album to album. >>>> >>>> So, I need a way to define a table that has a variable number of fields >>>> depending on the value in Number-of-Tracks. >>> >>> The usual way to do this in a relational database is to have a separate >>> table of tracks and define the relationship between the two tables. For >>> example, add a unique album id number to your albums table. (You won't >>> need the "number of tracks" field you propose.) >> >> Well, I am going to want that number anyway. :-) > > Yes, but the database can calculate it for you, for each album. Having > a separate column could lead to problems where the column says something > different than the actual number of tracks stored in the table. Ah yes, I guess I was too tied into the database using the number to determine how many fields it needed. > >>> Then have a tracks >>> table that has album id (which refers back to the albums table), track >>> number, track title, etc. You can then use a JOIN clause in your SQL >>> SELECT statement to associate the track and album info with each other, >>> or other queries to get the number of tracks in an album (e.g., "select >>> count(*) from tracks where album_id = 1"), or whatever else you need to >>> know about it. >> >> But if I understand this correctly I would need a separate table for >> every album resulting in, potentially, thousands of tables. (OK, in >> my case hundreds but others may like this idea, too, when I finish >> the whole project.) > > No, it's only one "tracks" table. That's why there's an album id column > in it: so that each track is associated with one album. There'd be > (album 1, track 1), (album 1, track 2), (album 2, track 1), and so > on.... Each (album, track) tuple would have to be unique. But you'd > keep all the tracks for all the albums in one table. I got that now. But I guess I am still stuck with how I define how many tracks columns there has to be for each album. I think I have been away from this for too long. I am missing something that is probably both simple and very apparent. > >>> >>> You probably want to read up on the concept of "foreign keys" and SQL >>> JOIN clauses. That's the usual way to do it in a database, and a big >>> part of the relational model. >> Not sure what the "foreign keys" have to do with it, > > The foreign key (in this case, the use of album id in the tracks table) > is what connects the individual tracks to the albums, and ensures that > the album id actually exists. By declaring album id (in the tracks > table) as a foreign key referring to the albums table, Postgres will > ensure that each track has a corresponding album. Without declaring it > as a foreign key, you can mistakenly have a track who's album id doesn't > exist. Got it, I understand that part now. Like I said, been away from this for much too long. (Damn retirement.....) > >> but I understand >> the JOIN part. I was just looking for an efficient way to do this and >> somehow the though of hundreds to thousands of TABLES scares me. > > It's just two: albums and tracks. > >> But, thanks for the info. I will likely end out trying it a couple >> of different ways before deciding which is best. > > Another topic to read up on is "database normalization". > I will. Amazing how long it takes to learn and how fast you can lose the learning when you stop doing it. bill
[toc] | [prev] | [next] | [standalone]
| From | John-Paul Stewart <jpstewart@sympatico.ca> |
|---|---|
| Date | 2019-04-27 17:10 -0400 |
| Message-ID | <gijuppF7slsU1@mid.individual.net> |
| In reply to | #865 |
On 2019-04-27 4:29 p.m., Bill Gunshannon wrote: > On 4/27/19 4:02 PM, John-Paul Stewart wrote: >> On 2019-04-27 3:48 p.m., Bill Gunshannon wrote: >>> >>> But if I understand this correctly I would need a separate table for >>> every album resulting in, potentially, thousands of tables. (OK, in >>> my case hundreds but others may like this idea, too, when I finish >>> the whole project.) >> >> No, it's only one "tracks" table. That's why there's an album id column >> in it: so that each track is associated with one album. There'd be >> (album 1, track 1), (album 1, track 2), (album 2, track 1), and so >> on.... Each (album, track) tuple would have to be unique. But you'd >> keep all the tracks for all the albums in one table. > > I got that now. > But I guess I am still stuck with how I define how many tracks columns > there has to be for each album. I think I have been away from this for > too long. I am missing something that is probably both simple and very > apparent. By using a tracks table, it's no longer a question of "how many tracks columns" but "how many tracks rows" instead. And the answer is "however many you populate with the same album id". That's the beauty of it.
[toc] | [prev] | [next] | [standalone]
| From | Bill Gunshannon <bill.gunshannon@gmail.com> |
|---|---|
| Date | 2019-04-27 18:40 -0400 |
| Message-ID | <gik42aF8vneU1@mid.individual.net> |
| In reply to | #866 |
On 4/27/19 5:10 PM, John-Paul Stewart wrote: > On 2019-04-27 4:29 p.m., Bill Gunshannon wrote: >> On 4/27/19 4:02 PM, John-Paul Stewart wrote: >>> On 2019-04-27 3:48 p.m., Bill Gunshannon wrote: >>>> >>>> But if I understand this correctly I would need a separate table for >>>> every album resulting in, potentially, thousands of tables. (OK, in >>>> my case hundreds but others may like this idea, too, when I finish >>>> the whole project.) >>> >>> No, it's only one "tracks" table. That's why there's an album id column >>> in it: so that each track is associated with one album. There'd be >>> (album 1, track 1), (album 1, track 2), (album 2, track 1), and so >>> on.... Each (album, track) tuple would have to be unique. But you'd >>> keep all the tracks for all the albums in one table. >> >> I got that now. >> But I guess I am still stuck with how I define how many tracks columns >> there has to be for each album. I think I have been away from this for >> too long. I am missing something that is probably both simple and very >> apparent. > > By using a tracks table, it's no longer a question of "how many tracks > columns" but "how many tracks rows" instead. And the answer is "however > many you populate with the same album id". That's the beauty of it. > Yeah, I went upstairs to watch the golf and think about it and it hit me. I was looking at it all wrong. I was trying to label the field for each track but being as they are merely sequentially numbered values they don't need individual names. Duh... Hopefully other stuff will come back to the surface easier than this. Thank you very much for kicking me in the head to knock this stuff loose so it could float back up to the surface. bill
[toc] | [prev] | [next] | [standalone]
| From | John-Paul Stewart <jpstewart@sympatico.ca> |
|---|---|
| Date | 2019-04-27 19:26 -0400 |
| Message-ID | <gik6ppF9ho2U1@mid.individual.net> |
| In reply to | #867 |
On 2019-04-27 6:40 p.m., Bill Gunshannon wrote: > > Yeah, I went upstairs to watch the golf and think about it and > it hit me. I was looking at it all wrong. I know the feeling. It's all too easy to get fixated on one idea and then "not see the forest for the trees". I've been there, done that far more times than I'd care to admit. You really don't want to know how badly I mis-used Postgres' arrays in the past to achieve the result you initially described before I learned about this stuff. > Thank you very much for kicking me in the head to knock this stuff > loose so it could float back up to the surface. I'm sure there are plenty of topics where I'll need the kick in the head. And more where no amount of kicking will help me. That's what I like about these newsgroups: they're a great place to get a second opinion. Good luck! Your album archive project sounds like more fun than anything I'm working on.
[toc] | [prev] | [next] | [standalone]
| From | alexander.d.mills@gmail.com |
|---|---|
| Date | 2020-02-21 14:02 -0800 |
| Message-ID | <c9bc3eb5-42c2-491b-9570-cedba269bd41@googlegroups.com> |
| In reply to | #861 |
I have a similar question/answer on StackExchange: https://softwareengineering.stackexchange.com/questions/405567/how-to-use-strict-schema-with-seemingly-fluid-data-type
[toc] | [prev] | [next] | [standalone]
| From | Christian Barthel <bch@online.de> |
|---|---|
| Date | 2020-02-22 09:15 +0100 |
| Message-ID | <87eeunavho.fsf@barthel.ch> |
| In reply to | #888 |
alexander.d.mills@gmail.com writes: > I have a similar question/answer on StackExchange: > https://softwareengineering.stackexchange.com/questions/405567/ > how-to-use-strict-schema-with-seemingly-fluid-data-type I have seen that kind of modelling elsewhere. This is basically an EAV (Entity Attribute Value) table? The book [1,2] describes it as an anti-pattern which should be avoided because it is harder to use, control and query later. Example: The email should probably satisfy some constraints but it is of type "json" and anything can be stored in it. How about typos in the "key" field etc. Note that processing may consume more CPU time as well according to Wikipedia [3] - depending on the planned size of your database. So, before using this pattern, I would think about whether it is really necessary to have that flexibility (at the cost of the disadvantages) or if it is possible to use normalization theory and create relations for the entities. | is there a better way to do this other than using JSON for the | value column? With the example shown, why are you using a "JSON" field instead of a text field? If you are using a JSON field, I would store all emails in one JSON array field and make (user_id, key) unique: | id | user_id | key | value | |----|---------|-------|--------------------------------------| | 1 | 1 | email | ["foo1@bar.com", "foo2@bar.com", ..] | ... Why are there more than one email addresses for each user? Do they serve a specific purpose? With your design, it seems impossible to select some email addresses for certain actions (which may be OK?). An alternative version might be: use a N:M mapping between "user" and "email" (the "email" table contains a list of email addresses a user may specify, i.e. "email at work", "private mail", etc. and link the user with zero or more email addresses). (Further N:M mappings might be necessary for other "key" values which might lead to a larger number of tables but column constraints and queries would be simpler) | If not - is there a way to enforce a schema on the JSON | somehow? You can do this with a trigger that runs before the insertion. You have to carefully check the layout of the JSON and the constraint of its values (i.e. if the key=email, then the value field should have a certain pattern etc.). I would also check the "key" field and only allow existing keys (so that it becomes possible to add "e-mail", "E-Mail", ... ). PostgreSQL comes with nice functions and operators to work with JSON [4]. | Last question - from my brief research the inverse table design | is called an "unpivot" table - but if there is a better name | for it please let me know. "unpivot" sounds a bit "unspecific". Personally, I'd name it "user_meta_data" or "user_contacts" (if there are only email addresses). [1] D. Fontaine: Mastering PostgreSQL In Application Development [2] https://tapoueh.org/blog/2018/03/database-modelization-anti-patterns/#entity-attribute-values [3] https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model [4] https://www.postgresql.org/docs/11/functions-json.html -- Christian Barthel <bch@online.de>
[toc] | [prev] | [standalone]
Back to top | Article view | comp.databases.postgresql
csiph-web