Date: Fri, 21 Sep 2012 00:58:03 +0100
From: MRAB <python@mrabarnett.plus.com>
To: python-list@python.org
Subject: Re: looping in array vs looping in a dic
Newsgroups: comp.lang.python
Message-ID: <mailman.984.1348185486.27098.python-list@python.org>

On 2012-09-21 00:35, giuseppe.amatulli@gmail.com wrote:
> Hi Ian and MRAB,
> Thanks to your input I have improved the speed of my code. Reading in a dict is definitely faster. I have one more question.
> In the dict I calculate the sum of the values, but I also want to count the number of observations, so that I can calculate the average at the end.
> Should I create a new dict, or is it possible to do this in the same dict?
> Here is the final code.
> Thanks, Giuseppe
>
Keep it simple. Use 2 dicts.

>
>
> rows = dsCategory.RasterYSize
> cols = dsCategory.RasterXSize
>
> print("Generating output file %s" %(dst_file))
>
> start = time()
>
> unique=dict()
>
> for irows in xrange(rows):
>      valuesRaster=dsRaster.GetRasterBand(1).ReadAsArray(0,irows,cols,1)
>      valuesCategory=dsCategory.GetRasterBand(1).ReadAsArray(0,irows,cols,1)
>      for icols in xrange(cols):
>          if ( valuesRaster[0,icols] != no_data_Raster ) and ( valuesCategory[0,icols] != no_data_Category ) :
>              row = valuesCategory[0, icols],valuesRaster[0, icols]
>              if row[0] in unique :
>                  unique[row[0]] += row[1]
>              else:
>                  unique[row[0]] = 0+row[1] # the 0 was added because otherwise the first observation was counted as 0
>
You could use defaultdict instead:

from collections import defaultdict

unique = defaultdict(int)
...
              category, raster = valuesCategory[0, icols], valuesRaster[0, icols]
              unique[category] += raster
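
Putting both suggestions together (two dicts, each a defaultdict), the averages could be computed roughly as below. This is only a sketch: the sample lists and the names sums, counts and averages are made up for illustration and stand in for the rows returned by ReadAsArray() in the real script.

```python
from collections import defaultdict

# Hypothetical sample data standing in for one raster row; in the real
# script these values would come from ReadAsArray().
no_data_category = -1
no_data_raster = -9999
values_category = [1, 2, 1, -1, 2, 1]
values_raster = [10, 20, 30, 40, -9999, 50]

sums = defaultdict(int)    # category -> running sum of raster values
counts = defaultdict(int)  # category -> number of valid observations

for category, raster in zip(values_category, values_raster):
    # Skip the cell if either layer holds its no-data value.
    if category != no_data_category and raster != no_data_raster:
        sums[category] += raster   # a missing key starts at int() == 0
        counts[category] += 1

# float() keeps the division exact under Python 2 as well as Python 3.
averages = {category: sums[category] / float(counts[category])
            for category in sums}
print(averages)
```

A category only gets an entry in counts when it also gets one in sums, so the division at the end never sees a zero count.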