From: Peter Otten <__peter__@web.de>
Newsgroups: comp.lang.python
Subject: Re: groupby - summing multiple columns in a list of lists
Date: Tue, 17 May 2011 20:24:14 +0200

Jackson wrote:

> I'm currently using a function pasted in below. This allows me to sum
> a column (index) in a list of lists.
>
> So if mylist = [[1, 2, 3], [1, 3, 4], [2, 3, 4], [2, 4, 5]]
> group_results(mylist,[0],1)
>
> Returns:
> [(1, 5), (2, 7)]
>
> What I would like to do is allow a tuple/list of index values, rather
> than a single index value to be summed up, so you could say
> group_results(mylist,[0],[1,2]) would return [(1, 5,7), (2, 7,9)] but
> I'm struggling to do so, any thoughts?
> Cheers
>
> from itertools import groupby as gb
> from operator import itemgetter as ig
>
> def group_results(table,keys,value):
>     res = []
>     nkey = ig(*keys)
>     value = ig(value)
>     for k, group in gb(sorted(table,key=ig(*keys)),nkey):
>         res.append((k,sum(value(row) for row in group)))
>     return res

You could write a version of sum() that can cope with tuples:

from itertools import groupby, imap

def itemgetter(keys, rowtype=tuple):
    def getitem(value):
        return rowtype(value[key] for key in keys)
    return getitem

def sum_all(rows):
    rows = iter(rows)
    sigma = next(rows)
    rowtype = type(sigma)
    sigma = list(sigma)
    for row in rows:
        for i, x in enumerate(row):
            sigma[i] += x
    return rowtype(sigma)

def group_results(table, key, value):
    get_key = itemgetter(key)
    get_value = itemgetter(value)
    table = sorted(table, key=get_key)
    for keyvalue, group in groupby(table, get_key):
        yield keyvalue + sum_all(imap(get_value, group))

but I'd probably use a dict-based approach:

def group_results(table, key, value):
    get_key = itemgetter(key)
    get_value = itemgetter(value)
    grouped = {}
    for row in table:
        key = get_key(row)
        value = get_value(row)
        if key in grouped:
            grouped[key] = tuple(a + b for a, b in zip(grouped[key], value))
        else:
            grouped[key] = value
    return [k + v for k, v in sorted(grouped.iteritems())]

if __name__ == "__main__":
    items = [(1, 2, 3), (1, 3, 4), (2, 3, 4), (2, 4, 5)]
    print list(group_results(items, [0], [1, 2]))

Note that the function built with my version of itemgetter() will always
return a tuple.
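
The code above targets Python 2 (imap, dict.iteritems() and the print statement). As a rough sketch only, here is how the dict-based approach might look on Python 3. The local names k and v and the alias op_itemgetter are my own choices for illustration. The last few lines show why the custom itemgetter() matters: operator.itemgetter() hands back a bare item when given a single index, so the key and the sums could not be concatenated with +.

```python
from operator import itemgetter as op_itemgetter  # alias chosen for this sketch


def itemgetter(keys, rowtype=tuple):
    """Return a getter that always yields a rowtype, even for a single key."""
    def getitem(row):
        return rowtype(row[key] for key in keys)
    return getitem


def group_results(table, key, value):
    """Sum the value columns of table, grouped by the key columns."""
    get_key = itemgetter(key)
    get_value = itemgetter(value)
    grouped = {}
    for row in table:
        k = get_key(row)
        v = get_value(row)
        if k in grouped:
            # Add the new row's values element-wise to the running totals.
            grouped[k] = tuple(a + b for a, b in zip(grouped[k], v))
        else:
            grouped[k] = v
    # dict.items() takes the place of Python 2's iteritems().
    return [k + v for k, v in sorted(grouped.items())]


items = [(1, 2, 3), (1, 3, 4), (2, 3, 4), (2, 4, 5)]
result = group_results(items, [0], [1, 2])
print(result)  # [(1, 5, 7), (2, 7, 9)]

# operator.itemgetter returns a bare item for a single index ...
single = op_itemgetter(0)(items[0])   # 1
# ... while the custom itemgetter wraps it in a tuple.
wrapped = itemgetter([0])(items[0])   # (1,)
```

Because both key and value come back as tuples, k + v works the same way whether you pass one column index or several.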