Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #76561 > unrolled thread

Collaps arrays/ list of intergers

Started byJurgens de Bruin <debruinjj@gmail.com>
First post2014-08-19 05:54 -0700
Last post2014-08-19 19:05 +0000
Articles 5 — 5 participants

Back to article view | Back to comp.lang.python


Contents

  Collaps arrays/ list of intergers Jurgens de Bruin <debruinjj@gmail.com> - 2014-08-19 05:54 -0700
    Re: Collaps arrays/ list of intergers Peter Pearson <ppearson@nowhere.invalid> - 2014-08-19 16:22 +0000
      Re: Collaps arrays/ list of intergers Rock Neurotiko <miguelglafuente@gmail.com> - 2014-08-19 19:27 +0200
    Re: Collaps arrays/ list of intergers Peter Otten <__peter__@web.de> - 2014-08-19 19:52 +0200
    Re: Collaps arrays/ list of intergers Denis McMahon <denismfmcmahon@gmail.com> - 2014-08-19 19:05 +0000

#76561 — Collaps arrays/ list of intergers

FromJurgens de Bruin <debruinjj@gmail.com>
Date2014-08-19 05:54 -0700
SubjectCollaps arrays/ list of intergers
Message-ID<fb9b097d-1a25-4f28-9b3f-61636e013a91@googlegroups.com>
Hi All,

I do hope somebody can help me with the following:
I have the followings lists which represent the upper and lower value of a range/array.

a = [1,50]
b = [75,150]
c = [25,42]
d = [120,149]
e = [35,55]

What I would like to happen is that overlapping range will "collapse" to a single range meaning the above list would become:

as list a,c and e overlap they can be represented by
f = [1,55]
as list b and d overlap they can be represented by 
g = [75,150]

I have sort of got working solution using networkx and numpy and they work perfect for the above example the problem I have no is my true data set looks more like the following:

x = [135098,19854736]
y = [135098,41639553]
z = [11818998,12587339]

To give only three examples using networkx and numpy does not work as this results in memory error due to the ranges being so large.

I would appreciate any help with this.

Thanks

[toc] | [next] | [standalone]


#76573

FromPeter Pearson <ppearson@nowhere.invalid>
Date2014-08-19 16:22 +0000
Message-ID<c5hbutF3mg8U1@mid.individual.net>
In reply to#76561
On Tue, 19 Aug 2014 05:54:24 -0700 (PDT), Jurgens de Bruin wrote:
>
> I do hope somebody can help me with the following:
> I have the followings lists which represent the upper and lower value
> of a range/array.
>
> a = [1,50]
> b = [75,150]
> c = [25,42]
> d = [120,149]
> e = [35,55]
>
> What I would like to happen is that overlapping range will "collapse"
> to a single range meaning the above list would become:
>
> as list a,c and e overlap they can be represented by
> f = [1,55]
> as list b and d overlap they can be represented by 
> g = [75,150]

Is your question about (a) identifying overlapping subsets of ranges,
or (b) collapsing such subsets once you have identified them?

What output would you want if the inputs were . . .

a = [1,50]
b = [2,10]
c = [40,60]

?

-- 
To email me, substitute nowhere->spamcop, invalid->net.

[toc] | [prev] | [next] | [standalone]


#76594

FromRock Neurotiko <miguelglafuente@gmail.com>
Date2014-08-19 19:27 +0200
Message-ID<mailman.13160.1408473581.18130.python-list@python.org>
In reply to#76573

[Multipart message — attachments visible in raw view] — view raw

Hi,

I made a fast implementation (I'm sure that can be done better) but it
works (for what I understood).

Is tested in Python3.4, if you will execute in Python 2.x, or don't have
mypy or don't like it, you always can remove the function annotations :)

http://gist.github.com/rockneurotiko/017044d907242c2e0482

There are all the code and some own-tests :)

I hope that this is what you was asking for :)

Cheers!




2014-08-19 18:22 GMT+02:00 Peter Pearson <ppearson@nowhere.invalid>:

> On Tue, 19 Aug 2014 05:54:24 -0700 (PDT), Jurgens de Bruin wrote:
> >
> > I do hope somebody can help me with the following:
> > I have the followings lists which represent the upper and lower value
> > of a range/array.
> >
> > a = [1,50]
> > b = [75,150]
> > c = [25,42]
> > d = [120,149]
> > e = [35,55]
> >
> > What I would like to happen is that overlapping range will "collapse"
> > to a single range meaning the above list would become:
> >
> > as list a,c and e overlap they can be represented by
> > f = [1,55]
> > as list b and d overlap they can be represented by
> > g = [75,150]
>
> Is your question about (a) identifying overlapping subsets of ranges,
> or (b) collapsing such subsets once you have identified them?
>
> What output would you want if the inputs were . . .
>
> a = [1,50]
> b = [2,10]
> c = [40,60]
>
> ?
>
> --
> To email me, substitute nowhere->spamcop, invalid->net.
> --
> https://mail.python.org/mailman/listinfo/python-list
>



-- 
Miguel García Lafuente - Rock Neurotiko

Do it, the devil is in the details.
The quieter you are, the more you are able to hear.
Happy Coding. Code with Passion, Decode with Patience.
If we make consistent effort, based on proper education, we can change the
world.

El contenido de este e-mail es privado, no se permite la revelacion del
contenido de este e-mail a gente ajena a él.

[toc] | [prev] | [next] | [standalone]


#76588

FromPeter Otten <__peter__@web.de>
Date2014-08-19 19:52 +0200
Message-ID<mailman.13157.1408470746.18130.python-list@python.org>
In reply to#76561
Jurgens de Bruin wrote:

> Hi All,
> 
> I do hope somebody can help me with the following:
> I have the followings lists which represent the upper and lower value of a
> range/array.
> 
> a = [1,50]
> b = [75,150]
> c = [25,42]
> d = [120,149]
> e = [35,55]
> 
> What I would like to happen is that overlapping range will "collapse" to a
> single range meaning the above list would become:
> 
> as list a,c and e overlap they can be represented by
> f = [1,55]
> as list b and d overlap they can be represented by
> g = [75,150]
> 
> I have sort of got working solution using networkx and numpy and they work
> perfect for the above example the problem I have no is my true data set
> looks more like the following:
> 
> x = [135098,19854736]
> y = [135098,41639553]
> z = [11818998,12587339]
> 
> To give only three examples using networkx and numpy does not work as this
> results in memory error due to the ranges being so large.
> 
> I would appreciate any help with this.

The naive approach would be to sort the intervals, take one and extend it 
while the lower bound is inside the current interval. Here is my attempt (no 
warranties):

$ cat merge_intervals.py         
def merge_intervals(intervals):
    intervals = iter(sorted(intervals))

    current_lo, current_hi = next(intervals)
    for lo, hi in intervals:
        if lo <= current_hi:
            if hi > current_hi:
                current_hi = hi
        else:
            yield [current_lo, current_hi]
            current_lo = lo
            current_hi = hi
    yield [current_lo, current_hi]


def demo(intervals):
    print("before")
    for ivl in intervals:
        print(ivl)
    print("")

    intervals = list(merge_intervals(intervals))
    print("after")
    for ivl in intervals:
        print(ivl)
    print("\n")

demo(
    [
        [1,50],
        [75,150],
        [25,42],
        [120,149],
        [35,55],
    ]
)

demo(
    [
        [135098,19854736],
        [135098,41639553],
        [11818998,12587339],
        [10**17, 10**20],
        [10**15, 10**18],
        [10**20+1, 10**30],
    ]
)
$ python3 merge_intervals.py 
before
[1, 50]
[75, 150]
[25, 42]
[120, 149]
[35, 55]

after
[1, 55]
[75, 150]


before
[135098, 19854736]
[135098, 41639553]
[11818998, 12587339]
[100000000000000000, 100000000000000000000]
[1000000000000000, 1000000000000000000]
[100000000000000000001, 1000000000000000000000000000000]

after
[135098, 41639553]
[1000000000000000, 100000000000000000000]
[100000000000000000001, 1000000000000000000000000000000]



[toc] | [prev] | [next] | [standalone]


#76600

FromDenis McMahon <denismfmcmahon@gmail.com>
Date2014-08-19 19:05 +0000
Message-ID<lt075d$tmv$2@dont-email.me>
In reply to#76561
On Tue, 19 Aug 2014 05:54:24 -0700, Jurgens de Bruin wrote:

> I do hope somebody can help me with the following:
> I have the followings lists which represent the upper and lower value of
> a range/array.
> 
> a = [1,50]
> b = [75,150]
> c = [25,42]
> d = [120,149]
> e = [35,55]

I think you're starting off with the wrong data model.

I think you should be starting with is a list of either tuples or lists, 
eg:

[(1,50),(75,150),(25,42),(120,149),(35,55)]

and looking to output another list of tuples or lists, eg:

[(1,55),(75,150)]

I have a solution, but I don't consider it particularly elegant. :(

-- 
Denis McMahon, denismfmcmahon@gmail.com

[toc] | [prev] | [standalone]


Back to top | Article view | comp.lang.python


csiph-web