Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!newsfeed.xs4all.nl!newsfeed2a.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.004 X-Spam-Evidence: '*H*': 0.99; '*S*': 0.00; 'else:': 0.03; 'skip:[ 20': 0.04; 'python3': 0.07; '[1,': 0.09; 'received:80.91': 0.09; 'received:80.91.229': 0.09; 'received:gmane.org': 0.09; 'received:list': 0.09; 'def': 0.12; 'interval.': 0.16; 'numpy': 0.16; 'received:80.91.229.3': 0.16; 'received:dip0.t-ipconnect.de': 0.16; 'received:plane.gmane.org': 0.16; 'received:t-ipconnect.de': 0.16; 'wrote:': 0.18; 'all,': 0.19; 'skip:1 30': 0.19; 'examples': 0.20; 'memory': 0.22; 'example': 0.22; 'header:User-Agent:1': 0.23; 'error': 0.23; 'skip:l 30': 0.24; 'looks': 0.24; 'sort': 0.25; 'header:X -Complaints-To:1': 0.27; 'subject:list': 0.30; 'large.': 0.31; 'lists': 0.32; 'this.': 0.32; 'extend': 0.32; 'problem': 0.35; 'yield': 0.36; 'list': 0.37; 'being': 0.38; 'represent': 0.38; 'somebody': 0.38; 'skip:[ 10': 0.38; 'to:addr:python-list': 0.38; 'does': 0.39; 'to:addr:python.org': 0.39; 'received:org': 0.40; 'hope': 0.61; 'lower': 0.61; 'range': 0.61; 'happen': 0.63; 'skip:n 10': 0.64; 'more': 0.64; 'skip:1 20': 0.65; 'due': 0.66; 'here': 0.66; 'results': 0.69; 'ranges': 0.74; 'upper': 0.74 X-Injected-Via-Gmane: http://gmane.org/ To: python-list@python.org From: Peter Otten <__peter__@web.de> Subject: Re: Collaps arrays/ list of intergers Date: Tue, 19 Aug 2014 19:52:11 +0200 Organization: None References: Mime-Version: 1.0 Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7Bit X-Gmane-NNTP-Posting-Host: p57bd9bc0.dip0.t-ipconnect.de User-Agent: KNode/4.13.3 X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 116 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1408470746 news.xs4all.nl 2890 [2001:888:2000:d::a6]:42397 X-Complaints-To: abuse@xs4all.nl Xref: csiph.com comp.lang.python:76588 Jurgens de Bruin wrote: > Hi All, > > I do hope somebody can help me with the following: > I have the followings lists which represent the upper and lower value of a > range/array. > > a = [1,50] > b = [75,150] > c = [25,42] > d = [120,149] > e = [35,55] > > What I would like to happen is that overlapping range will "collapse" to a > single range meaning the above list would become: > > as list a,c and e overlap they can be represented by > f = [1,55] > as list b and d overlap they can be represented by > g = [75,150] > > I have sort of got working solution using networkx and numpy and they work > perfect for the above example the problem I have no is my true data set > looks more like the following: > > x = [135098,19854736] > y = [135098,41639553] > z = [11818998,12587339] > > To give only three examples using networkx and numpy does not work as this > results in memory error due to the ranges being so large. > > I would appreciate any help with this. The naive approach would be to sort the intervals, take one and extend it while the lower bound is inside the current interval. Here is my attempt (no warranties): $ cat merge_intervals.py def merge_intervals(intervals): intervals = iter(sorted(intervals)) current_lo, current_hi = next(intervals) for lo, hi in intervals: if lo <= current_hi: if hi > current_hi: current_hi = hi else: yield [current_lo, current_hi] current_lo = lo current_hi = hi yield [current_lo, current_hi] def demo(intervals): print("before") for ivl in intervals: print(ivl) print("") intervals = list(merge_intervals(intervals)) print("after") for ivl in intervals: print(ivl) print("\n") demo( [ [1,50], [75,150], [25,42], [120,149], [35,55], ] ) demo( [ [135098,19854736], [135098,41639553], [11818998,12587339], [10**17, 10**20], [10**15, 10**18], [10**20+1, 10**30], ] ) $ python3 merge_intervals.py before [1, 50] [75, 150] [25, 42] [120, 149] [35, 55] after [1, 55] [75, 150] before [135098, 19854736] [135098, 41639553] [11818998, 12587339] [100000000000000000, 100000000000000000000] [1000000000000000, 1000000000000000000] [100000000000000000001, 1000000000000000000000000000000] after [135098, 41639553] [1000000000000000, 100000000000000000000] [100000000000000000001, 1000000000000000000000000000000]