
RE: counting unique numpy subarrays

From Albert-Jan Roskam <sjeik_appie@hotmail.com>
Newsgroups comp.lang.python
Subject RE: counting unique numpy subarrays
Date 2015-12-04 22:36 +0000
Message-ID <mailman.208.1449268679.14615.python-list@python.org>
References <Q1m8y.334924$rR1.113623@fx19.iad>



Hi

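(Sorry for top-posting)

numpy.ravel is faster than numpy.flatten (ravel returns a view where it can; flatten always copies)
numpy.empty is faster than numpy.zeros (it skips the zero-fill, so every element must be assigned before it is read)
numpy.fromiter might be useful to avoid the loop (just a hunch)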
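
A rough sketch of how those three tips might slot into the quoted bag_data, untested against the real data. Assumptions of mine, not from the original post: the output keeps data.dtype rather than defaulting to float, the counts come back as integers, and fromiter only replaces the counts part of the loop (the rows are still filled one by one).

```python
from collections import Counter
import numpy

def bag_data(data):
    # data (a numpy array) is bagged along axis 0
    # returns array of unique subarrays and corresponding array of counts
    vec_shape = data.shape[1:]
    # ravel() returns a view where it can, avoiding flatten()'s copy per subarray
    counts = Counter(tuple(arr.ravel()) for arr in data)
    # empty() skips the zero-fill; safe here because every row is assigned below
    # (assumption: keep the input dtype instead of the float default)
    data_out = numpy.empty((len(counts),) + vec_shape, dtype=data.dtype)
    for i, tup in enumerate(counts):
        data_out[i] = numpy.array(tup).reshape(vec_shape)
    # fromiter builds the 1D counts array straight from the Counter's values,
    # which iterate in the same order as its keys
    cnts = numpy.fromiter(counts.values(), dtype=numpy.intp, count=len(counts))
    return data_out, cnts
```

The Counter is still built in a Python-level loop over the subarrays, so this trims overhead rather than removing the loop.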

Albert-Jan

> From: duncan@invalid.invalid
> Subject: counting unique numpy subarrays
> Date: Fri, 4 Dec 2015 19:43:35 +0000
> To: python-list@python.org
> 
> Hello,
>       I'm trying to find a computationally efficient way of identifying
> unique subarrays, counting them and returning an array containing only
> the unique subarrays and a corresponding 1D array of counts. The
> following code works, but is a bit slow.
> 
> ###############
> 
> from collections import Counter
> import numpy
> 
> def bag_data(data):
>     # data (a numpy array) is bagged along axis 0
>     # returns concatenated array and corresponding array of counts
>     vec_shape = data.shape[1:]
>     counts = Counter(tuple(arr.flatten()) for arr in data)
>     data_out = numpy.zeros((len(counts),) + vec_shape)
>     cnts = numpy.zeros((len(counts),))
>     for i, (tup, cnt) in enumerate(counts.items()):
>         data_out[i] = numpy.array(tup).reshape(vec_shape)
>         cnts[i] = cnt
>     return data_out, cnts
> 
> ###############
> 
> I've been looking through the numpy docs, but don't seem to be able to
> come up with a clean solution that avoids Python loops. TIA for any
> useful pointers. Cheers.
> 
> Duncan
> -- 
> https://mail.python.org/mailman/listinfo/python-list
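
On the loop-free version asked for above: a minimal sketch, assuming a NumPy recent enough that numpy.unique accepts the axis argument (added in NumPy 1.13, so later than this thread). This is not something proposed in the thread, and the name bag_data_unique is made up for illustration. Note that the unique subarrays come back sorted rather than in first-seen order.

```python
import numpy

def bag_data_unique(data):
    # Treat each data[i] as one item, find the distinct items along axis 0
    # and count how often each occurs, without a Python-level loop.
    # Assumes NumPy >= 1.13 for the axis argument.
    uniq, cnts = numpy.unique(data, axis=0, return_counts=True)
    return uniq, cnts
```

For data of shape (n, a, b) this should return an array of shape (k, a, b) and a 1D integer array of length k, where k is the number of distinct subarrays.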
