Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #86063 > unrolled thread
| Started by | "TommyVee" <xxxxxxxx@xxxxxx.xxx> |
|---|---|
| First post | 2015-02-21 14:46 -0500 |
| Last post | 2015-02-22 17:02 +0000 |
| Articles | 10 — 9 participants |
Back to article view | Back to comp.lang.python
Algorithm for Creating Supersets of Smaller Sets Based on Common Elements "TommyVee" <xxxxxxxx@xxxxxx.xxx> - 2015-02-21 14:46 -0500
Re: Algorithm for Creating Supersets of Smaller Sets Based on Common Elements Joel Goldstick <joel.goldstick@gmail.com> - 2015-02-21 15:06 -0500
Re: Algorithm for Creating Supersets of Smaller Sets Based on Common Elements Ethan Furman <ethan@stoneleaf.us> - 2015-02-21 12:11 -0800
Re: Algorithm for Creating Supersets of Smaller Sets Based on Common Elements Dave Angel <davea@davea.name> - 2015-02-21 15:18 -0500
Re: Algorithm for Creating Supersets of Smaller Sets Based on Common Elements Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-21 22:40 +0000
Re: Algorithm for Creating Supersets of Smaller Sets Based on Common Elements "TommyVee" <xxxxxxxx@xxxxxx.xxx> - 2015-02-21 19:07 -0500
Re: Algorithm for Creating Supersets of Smaller Sets Based on Common Elements Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-22 15:02 +1100
Re: Algorithm for Creating Supersets of Smaller Sets Based on Common Elements wxjmfauth@gmail.com - 2015-02-22 00:15 -0800
Re: Algorithm for Creating Supersets of Smaller Sets Based on Common Elements Peter Pearson <pkpearson@nowhere.invalid> - 2015-02-22 16:49 +0000
Re: Algorithm for Creating Supersets of Smaller Sets Based on Common Elements duncan smith <buzzard@invalid.invalid> - 2015-02-22 17:02 +0000
| From | "TommyVee" <xxxxxxxx@xxxxxx.xxx> |
|---|---|
| Date | 2015-02-21 14:46 -0500 |
| Subject | Algorithm for Creating Supersets of Smaller Sets Based on Common Elements |
| Message-ID | <Bg5Gw.1344030$No4.494335@fx19.iad> |
Start off with sets of elements as follows: 1. A,B,E,F 2. G,H,L,P,Q 3. C,D,E,F 4. E,X,Z 5. L,M,R 6. O,M,Y Note that sets 1, 3 and 4 all have the element 'E' in common, therefore they are "related" and form the following superset: A,B,C,D,E,F,X,Z Likewise, sets 2 and 5 have the element 'L' in common, then set 5 and 6 have element 'M' in common, therefore they form the following superset: G,H,L,M,O,P,Q,R,Y I think you get the point. As long as sets have at least 1 common element, they combine to form a superset. Also "links" (common elements) between sets may go down multiple levels, as described in the second case above (2->5->6). Cycles thankfully, are not possible. BTW, the number of individual sets (and resultant supersets) will be very large. I don't know where to start with this. I thought about some type of recursive algorithm, but I'm not sure. I could figure out the Python implementation easy enough, I'm just stumped on the algorithm itself. Anybody have an idea? Thanks, Tom
[toc] | [next] | [standalone]
| From | Joel Goldstick <joel.goldstick@gmail.com> |
|---|---|
| Date | 2015-02-21 15:06 -0500 |
| Message-ID | <mailman.18981.1424549202.18130.python-list@python.org> |
| In reply to | #86063 |
On Sat, Feb 21, 2015 at 2:46 PM, TommyVee <xxxxxxxx@xxxxxx.xxx> wrote: > Start off with sets of elements as follows: > > 1. A,B,E,F > 2. G,H,L,P,Q > 3. C,D,E,F > 4. E,X,Z > 5. L,M,R > 6. O,M,Y > > Note that sets 1, 3 and 4 all have the element 'E' in common, therefore they > are "related" and form the following superset: > > A,B,C,D,E,F,X,Z > > Likewise, sets 2 and 5 have the element 'L' in common, then set 5 and 6 have > element 'M' in common, therefore they form the following superset: > > G,H,L,M,O,P,Q,R,Y > > I think you get the point. As long as sets have at least 1 common element, > they combine to form a superset. Also "links" (common elements) between > sets may go down multiple levels, as described in the second case above > (2->5->6). Cycles thankfully, are not possible. > > BTW, the number of individual sets (and resultant supersets) will be very > large. > > I don't know where to start with this. I thought about some type of > recursive algorithm, but I'm not sure. I could figure out the Python > implementation easy enough, I'm just stumped on the algorithm itself. > > Anybody have an idea? > start with reading about python sets. If you take the intersection of two sets it will return a set with common elements. If that is empty, they don't pass your test. If you take the union, you get a set with the set values in each. > Thanks, Tom > -- > https://mail.python.org/mailman/listinfo/python-list -- Joel Goldstick http://joelgoldstick.com
[toc] | [prev] | [next] | [standalone]
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2015-02-21 12:11 -0800 |
| Message-ID | <mailman.18982.1424549507.18130.python-list@python.org> |
| In reply to | #86063 |
[Multipart message — attachments visible in raw view] — view raw
On 02/21/2015 11:46 AM, TommyVee wrote: > Start off with sets of elements as follows: > > 1. A,B,E,F > 2. G,H,L,P,Q > 3. C,D,E,F > 4. E,X,Z > 5. L,M,R > 6. O,M,Y > > Note that sets 1, 3 and 4 all have the element 'E' in common, therefore they are "related" and form the following superset: > > A,B,C,D,E,F,X,Z > > Likewise, sets 2 and 5 have the element 'L' in common, then set 5 and 6 have element 'M' in common, therefore they form > the following superset: > > G,H,L,M,O,P,Q,R,Y > > I think you get the point. As long as sets have at least 1 common element, they combine to form a superset. Also > "links" (common elements) between sets may go down multiple levels, as described in the second case above (2->5->6). > Cycles thankfully, are not possible. > > BTW, the number of individual sets (and resultant supersets) will be very large. > > I don't know where to start with this. I thought about some type of recursive algorithm, but I'm not sure. I could > figure out the Python implementation easy enough, I'm just stumped on the algorithm itself. > > Anybody have an idea? Use a Counter (collections.Counter), add all sets, then keep any keys with a count of 2 or more. -- ~Ethan~
[toc] | [prev] | [next] | [standalone]
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-02-21 15:18 -0500 |
| Message-ID | <mailman.18983.1424549962.18130.python-list@python.org> |
| In reply to | #86063 |
On 02/21/2015 02:46 PM, TommyVee wrote: > Start off with sets of elements as follows: > > 1. A,B,E,F > 2. G,H,L,P,Q > 3. C,D,E,F > 4. E,X,Z > 5. L,M,R > 6. O,M,Y > > Note that sets 1, 3 and 4 all have the element 'E' in common, therefore > they are "related" and form the following superset: > > A,B,C,D,E,F,X,Z > > Likewise, sets 2 and 5 have the element 'L' in common, then set 5 and 6 > have element 'M' in common, therefore they form the following superset: > > G,H,L,M,O,P,Q,R,Y > > I think you get the point. As long as sets have at least 1 common > element, they combine to form a superset. Also "links" (common > elements) between sets may go down multiple levels, as described in the > second case above (2->5->6). Cycles thankfully, are not possible. > > BTW, the number of individual sets (and resultant supersets) will be > very large. > > I don't know where to start with this. I can't see where you've defined "this" yet. I think you're trying to find all combinations of sets that are "related", and that you're assuming some of them won't be. So perhaps you're trying to build "islands" of related sets, and you want each of those islands to be of maximal size. > I thought about some type of > recursive algorithm, but I'm not sure. I could figure out the Python > implementation easy enough, I'm just stumped on the algorithm itself. > Seems like you can have a doubly-nested loop, where your inner loop collects all the sets that are related to the one in the outer loop. You label each group with an island-number, where the island number is simply the lowest number set that's on the island. It's still not clear what you do with each of those islands, but the "superset" you refer to before is simply the OR of everything on the island. Perhaps you want to show all smaller supersets that are composed of at least two sets of an island. That becomes a fairly simple power-of-two iteration of those sets, where you throw out the case where the index is a power of two (ie. contains only one set). As you say, calculating the union or intersection of the various sets in python is trivial. -- DaveA
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2015-02-21 22:40 +0000 |
| Message-ID | <mailman.18984.1424558476.18130.python-list@python.org> |
| In reply to | #86063 |
On 21/02/2015 19:46, TommyVee wrote:
> Start off with sets of elements as follows:
>
> 1. A,B,E,F
> 2. G,H,L,P,Q
> 3. C,D,E,F
> 4. E,X,Z
> 5. L,M,R
> 6. O,M,Y
>
> Note that sets 1, 3 and 4 all have the element 'E' in common, therefore
> they are "related" and form the following superset:
>
> A,B,C,D,E,F,X,Z
>
> Likewise, sets 2 and 5 have the element 'L' in common, then set 5 and 6
> have element 'M' in common, therefore they form the following superset:
>
> G,H,L,M,O,P,Q,R,Y
>
> I think you get the point. As long as sets have at least 1 common
> element, they combine to form a superset. Also "links" (common
> elements) between sets may go down multiple levels, as described in the
> second case above (2->5->6). Cycles thankfully, are not possible.
>
> BTW, the number of individual sets (and resultant supersets) will be
> very large.
>
> I don't know where to start with this. I thought about some type of
> recursive algorithm, but I'm not sure. I could figure out the Python
> implementation easy enough, I'm just stumped on the algorithm itself.
>
> Anybody have an idea?
>
> Thanks, Tom
A naive approach but should give you something to think about.
from collections import defaultdict
sets = ({'A','B','E','F'},
{'G','H','L','P','Q'},
{'C','D','E','F'},
{'E','X','Z'},
{'L','M','R'},
{'O','M','Y'})
d = defaultdict(list)
for i, aSet in enumerate(sets):
for a in aSet:
d[a].append(i)
superSets = []
for k in d:
if len(d[k]) > 1:
superSet = set()
for i in d[k]:
superSet |= sets[i]
superSets.append(superSet)
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
[toc] | [prev] | [next] | [standalone]
| From | "TommyVee" <xxxxxxxx@xxxxxx.xxx> |
|---|---|
| Date | 2015-02-21 19:07 -0500 |
| Message-ID | <w59Gw.1082219$Lq6.132077@fx29.iad> |
| In reply to | #86063 |
"TommyVee" wrote in message news:Bg5Gw.1344030$No4.494335@fx19.iad... Start off with sets of elements as follows: 1. A,B,E,F 2. G,H,L,P,Q 3. C,D,E,F 4. E,X,Z 5. L,M,R 6. O,M,Y Note that sets 1, 3 and 4 all have the element 'E' in common, therefore they are "related" and form the following superset: A,B,C,D,E,F,X,Z Likewise, sets 2 and 5 have the element 'L' in common, then set 5 and 6 have element 'M' in common, therefore they form the following superset: G,H,L,M,O,P,Q,R,Y I think you get the point. As long as sets have at least 1 common element, they combine to form a superset. Also "links" (common elements) between sets may go down multiple levels, as described in the second case above (2->5->6). Cycles thankfully, are not possible. BTW, the number of individual sets (and resultant supersets) will be very large. I don't know where to start with this. I thought about some type of recursive algorithm, but I'm not sure. I could figure out the Python implementation easy enough, I'm just stumped on the algorithm itself. Anybody have an idea? Thanks, Tom Thanks to all! Not only did you turn around my thinking on this, but I also learned about some new Python collection types. I think I have a way to get this to work now.
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-02-22 15:02 +1100 |
| Message-ID | <54e954d6$0$13013$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #86063 |
TommyVee wrote:
> Start off with sets of elements as follows:
>
> 1. A,B,E,F
> 2. G,H,L,P,Q
> 3. C,D,E,F
> 4. E,X,Z
> 5. L,M,R
> 6. O,M,Y
>
> Note that sets 1, 3 and 4 all have the element 'E' in common, therefore
> they are "related" and form the following superset:
>
> A,B,C,D,E,F,X,Z
Sounds to me like you already have an algorithm in mind. It might not be
correct or complete, or it might not be efficient, but you performed some
steps in your own mind to generate that superset. Write down those steps
and you have an algorithm. Once you have the algorithm, you can then turn
it into Python code and start revising it from there.
I went back to first principles and drew a set diagram with multiple
disjoint sets of overlapping sets, and then thought about the process of
*adding* each new set to the set-of-supersets. It's hard to describe, but
if you start drawing an empty Venn diagram, then add one of your sets to
the Venn diagram at a time, merging that set to whatever superset it
belongs to, the process should appear quite obvious.
Let me walk through the process with your six sets above, plus one extra.
First, I create a list of supersets, initially empty:
supersets = []
I look at Set #1, and compare to all the supersets. There aren't any, so I
make a copy of Set #1 and add it to the supersets:
[{A,B,E,F}]
I look at Set #2, and compare it to all the supersets. It doesn't intersect
any of them, so I add it as a new superset:
[{A,B,E,F}, {G,H,L,P,Q}]
I look at Set #3. It intersects the first superset, but not the second, so I
merge them. Likewise for Set #4:
[{A,B,C,D,E,F}, {G,H,L,P,Q}]
[{A,B,C,D,E,F,X,Z}, {G,H,L,P,Q}]
Set #5 intersects the second superset, but not the first. Same for Set #6:
[{A,B,C,D,E,F,X,Z}, {G,H,L,M,P,Q,R}]
[{A,B,C,D,E,F,X,Z}, {G,H,L,M,O,P,Q,R,Y}]
Now let me add an extra Set #7, {A, R}, which intersects *both* supersets.
First I merge it with both:
[{A,B,C,D,E,F,R,X,Z}, {A,G,H,L,M,O,P,Q,R,Y}]
but I don't stop there. Then I recursively run through the supersets with
the same merging process to get:
[{A,B,C,D,E,F,G,H,L,M,O,P,Q,R,X,Y,Z}]
which is the single minimal, disjoint superset of all seven sets.
def merge(sets):
"""Merge a collection of sets into a list of minimal, disjoint
supersets.
"""
supersets = []
for a in sets:
count = 0
for s in supersets:
if a&s:
s.update(a)
count += 1
if count == 0:
supersets.append(a.copy())
elif count > 1:
supersets = merge(supersets)
return supersets
Testing this with your six sets gives:
py> sets = [set(s) for s in ("ABEF", "GHLPQ", "CDEF", "EXZ", "LMR", "OMY")]
py> for s in merge(sets):
... print(','.join(sorted(s)))
...
A,B,C,D,E,F,X,Z
G,H,L,M,O,P,Q,R,Y
Adding set #7 gives:
py> for s in merge(sets + [{'A', 'R'}]):
... print(','.join(sorted(s)))
...
A,B,C,D,E,F,G,H,L,M,O,P,Q,R,X,Y,Z
--
Steven
[toc] | [prev] | [next] | [standalone]
| From | wxjmfauth@gmail.com |
|---|---|
| Date | 2015-02-22 00:15 -0800 |
| Message-ID | <503257ca-a61a-4aaf-be09-826c553b3f31@googlegroups.com> |
| In reply to | #86076 |
>>> z = [set(s) for s in ("ABEF", "GHLPQ", "CDEF", "EXZ", "LMR", "OMY")]
>>> x = set()
>>> for e in z:
... x = x.union(e)
...
>>> x
{'A', 'C', 'B', 'E', 'D', 'G', 'F', 'H', 'M', 'L', 'O', 'Q', 'P', 'R', 'Y', 'X', 'Z'}
>>> ''.join(x)
'ACBEDGFHMLOQPRYXZ'
>>>
[toc] | [prev] | [next] | [standalone]
| From | Peter Pearson <pkpearson@nowhere.invalid> |
|---|---|
| Date | 2015-02-22 16:49 +0000 |
| Message-ID | <ckufkqFd796U1@mid.individual.net> |
| In reply to | #86063 |
On Sat, 21 Feb 2015 14:46:26 -0500, TommyVee wrote: > Start off with sets of elements as follows: > > 1. A,B,E,F > 2. G,H,L,P,Q > 3. C,D,E,F > 4. E,X,Z > 5. L,M,R > 6. O,M,Y > > Note that sets 1, 3 and 4 all have the element 'E' in common, therefore they > are "related" and form the following superset: > > A,B,C,D,E,F,X,Z > > Likewise, sets 2 and 5 have the element 'L' in common, then set 5 and 6 have > element 'M' in common, therefore they form the following superset: > > G,H,L,M,O,P,Q,R,Y > > I think you get the point. [snip] I recommend continuing to work on your statement of the problem until it is detailed, precise, and complete -- something along the lines of, "Given a set of sets, return a set of sets having the following properties: (1)... (2)..." This approach often brings to light logical problems in the loosely sketched requirements. It also produces the outline of a testing regimen to determine whether an implemented solution is correct. -- To email me, substitute nowhere->runbox, invalid->com.
[toc] | [prev] | [next] | [standalone]
| From | duncan smith <buzzard@invalid.invalid> |
|---|---|
| Date | 2015-02-22 17:02 +0000 |
| Message-ID | <54ea0ba0$0$23001$862e30e2@ngroups.net> |
| In reply to | #86063 |
On 21/02/15 19:46, TommyVee wrote:
> Start off with sets of elements as follows:
>
> 1. A,B,E,F
> 2. G,H,L,P,Q
> 3. C,D,E,F
> 4. E,X,Z
> 5. L,M,R
> 6. O,M,Y
>
> Note that sets 1, 3 and 4 all have the element 'E' in common, therefore
> they are "related" and form the following superset:
>
> A,B,C,D,E,F,X,Z
>
> Likewise, sets 2 and 5 have the element 'L' in common, then set 5 and 6
> have element 'M' in common, therefore they form the following superset:
>
> G,H,L,M,O,P,Q,R,Y
>
> I think you get the point. As long as sets have at least 1 common
> element, they combine to form a superset. Also "links" (common
> elements) between sets may go down multiple levels, as described in the
> second case above (2->5->6). Cycles thankfully, are not possible.
>
> BTW, the number of individual sets (and resultant supersets) will be
> very large.
>
> I don't know where to start with this. I thought about some type of
> recursive algorithm, but I'm not sure. I could figure out the Python
> implementation easy enough, I'm just stumped on the algorithm itself.
>
> Anybody have an idea?
>
> Thanks, Tom
Tom,
You could construct a graph G such that there exists an edge {v,w}
for each pair of items that appear in a common set. Each connected
component will contain the nodes of one of the supersets you're looking
for. That's unnecessarily expensive, so you can adopt a similar approach
using trees.
http://en.wikipedia.org/wiki/Disjoint-set_data_structure
Construct a forest of trees (initially one tree for each item) and join
pairs of trees until you have one tree for each of your supersets. For
each set of size n you only need to consider n-1 joins. That will ensure
that any pair of items that are in one of the sets are contained in a
single tree. The find and union operations are amortized constant, so
this should be efficient for your large numbers of sets.
The union_find module can be found at
https://github.com/DuncanSmith147/KVMS. Cheers.
Duncan
>>> import union_find
>>> sets = (['A','B','E','F'],
['G','H','L','P','Q'],
['C','D','E','F'],
['E','X','Z'],
['L','M','R'],
['O','M','Y'])
>>> uf = union_find.UnionFindTree()
>>> for a_set in sets:
for item in a_set:
try:
uf.add(item)
except ValueError:
pass
n = a_set[0]
for item in a_set[1:]:
is_merged = uf.union(n, item)
>>> from collections import defaultdict
>>> res = defaultdict(list)
>>> for item in uf:
res[uf.find(item)].append(item)
>>> res.values()
[['G', 'H', 'M', 'L', 'O', 'Q', 'P', 'R', 'Y'], ['A', 'C', 'B', 'E',
'D', 'F', 'X', 'Z']]
>>>
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.python
csiph-web