| From | Peter Otten <__peter__@web.de> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: counting unique numpy subarrays |
| Date | 2015-12-05 00:06 +0100 |
| Organization | None |
| Message-ID | <mailman.213.1449270605.14615.python-list@python.org> (permalink) |
| References | <Q1m8y.334924$rR1.113623@fx19.iad> |
duncan smith wrote:
> Hello,
> I'm trying to find a computationally efficient way of identifying
> unique subarrays, counting them and returning an array containing only
> the unique subarrays and a corresponding 1D array of counts. The
> following code works, but is a bit slow.
>
> ###############
>
> from collections import Counter
> import numpy
>
> def bag_data(data):
>     # data (a numpy array) is bagged along axis 0
>     # returns concatenated array and corresponding array of counts
>     vec_shape = data.shape[1:]
>     counts = Counter(tuple(arr.flatten()) for arr in data)
>     data_out = numpy.zeros((len(counts),) + vec_shape)
>     cnts = numpy.zeros((len(counts,)))
>     for i, (tup, cnt) in enumerate(counts.iteritems()):
>         data_out[i] = numpy.array(tup).reshape(vec_shape)
>         cnts[i] = cnt
>     return data_out, cnts
>
> ###############
>
> I've been looking through the numpy docs, but don't seem to be able to
> come up with a clean solution that avoids Python loops.
Me neither :(
> TIA for any
> useful pointers. Cheers.
Here's what I have so far:
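import numpy

def bag_data(data):
    # counts[i] holds the count for the subarray first seen at index i
    counts = numpy.zeros(data.shape[0])
    seen = {}  # raw bytes of a subarray -> index of its first occurrence
    for i, arr in enumerate(data):
        # tobytes() is the current name for the deprecated tostring()
        sarr = arr.tobytes()
        if sarr in seen:
            counts[seen[sarr]] += 1
        else:
            seen[sarr] = i
            counts[i] = 1
    # keep only the first occurrence of each distinct subarray
    nz = counts != 0
    return numpy.compress(nz, data, axis=0), numpy.compress(nz, counts)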
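For anyone reading this later: a loop-free variant may be possible with `numpy.unique`, assuming a NumPy version that supports its `axis` and `return_counts` arguments (1.13 or newer, which postdates this thread). The following is only a sketch under that assumption; `bag_data_unique` is a hypothetical name, not something from the posts above.

```python
import numpy as np

def bag_data_unique(data):
    # Hypothetical helper, assuming NumPy >= 1.13 for unique(axis=...).
    data = np.asarray(data)
    # Flatten each subarray into one row so whole subarrays are compared.
    flat = data.reshape(data.shape[0], -1)
    uniq, counts = np.unique(flat, axis=0, return_counts=True)
    # Restore the original subarray shape on the unique rows.
    return uniq.reshape((-1,) + data.shape[1:]), counts

# Three 2x2 subarrays, the first and last being identical.
data = np.array([[[1, 2], [3, 4]],
                 [[5, 6], [7, 8]],
                 [[1, 2], [3, 4]]])
uniq, counts = bag_data_unique(data)
print(uniq.shape)  # (2, 2, 2)
print(counts)      # [2 1]
```

Note that this returns the unique subarrays in sorted order rather than in order of first appearance, and the counts come back as an integer array, so the output is not identical to the loop version.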
counting unique numpy subarrays duncan smith <duncan@invalid.invalid> - 2015-12-04 19:43 +0000
RE: counting unique numpy subarrays Albert-Jan Roskam <sjeik_appie@hotmail.com> - 2015-12-04 22:36 +0000
Re: counting unique numpy subarrays duncan smith <duncan@invalid.invalid> - 2015-12-05 00:13 +0000
Re: counting unique numpy subarrays Peter Otten <__peter__@web.de> - 2015-12-05 00:06 +0100
Re: counting unique numpy subarrays duncan smith <duncan@invalid.invalid> - 2015-12-05 00:18 +0000