Path: csiph.com!usenet.pasdenom.info!weretis.net!feeder4.news.weretis.net!newsfeed.straub-nv.de!feed.xsnews.nl!border-1.ams.xsnews.nl!newsfeed.xs4all.nl!newsfeed5.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <583C055C-9E99-4EAC-8F3A-D578C399826E@gmail.com>
References: <930ab3d8-4ab9-446d-9970-ee811eb70a44@googlegroups.com> <F1B463BB-19A6-4DB1-99B3-929CCBFB5920@gmail.com> <50241F14.2060209@tim.thechases.com> <36EA3847-6713-4C12-B47B-9B5E10325F00@gmail.com> <502429C3.5000600@tim.thechases.com> <583C055C-9E99-4EAC-8F3A-D578C399826E@gmail.com>
Date: Thu, 9 Aug 2012 16:53:49 -0500
Subject: Re: save dictionary to a file without brackets.
From: Giuseppe Amatulli <giuseppe.amatulli@gmail.com>
To: python-list@python.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.3127.1344549233.4697.python-list@python.org>
Lines: 123
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:26822

Thanks a lot for the clarification.
Actually, my problem is this: given two raster datasets in GeoTIFF format, find
the unique pair combinations and count the observations of each,
the unique values in rast1 and count the observations of each,
the unique values in rast2 and count the observations of each.

I tried different solutions, and this one seems to me the fastest:


Rast00 = dsRast00.GetRasterBand(1).ReadAsArray()
Rast10 = dsRast10.GetRasterBand(1).ReadAsArray()

# maybe this masking operation can be included in the for loop
mask = (Rast00 != 0) & (Rast10 != 0)

Rast00_mask = Rast00[mask]
Rast10_mask = Rast10[mask]

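# one (Rast00, Rast10) pair per row; unlike np.array(zip(...)), this also
# works where zip() returns an iterator
array2D = np.column_stack((Rast00_mask, Rast10_mask))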
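
Once the pairs sit in a 2-D array, the tallies could perhaps be left to NumPy instead of a Python loop. The following is only a sketch: the small arrays are made-up stand-ins for Rast00_mask and Rast10_mask, the lowercase names are hypothetical, and the `axis` argument of `np.unique` assumes NumPy 1.13 or later.

```python
import numpy as np

# Made-up masked values standing in for Rast00_mask and Rast10_mask.
rast00_mask = np.array([4, 5, 4, 4, 2, 4, 4])
rast10_mask = np.array([5, 4, 4, 4, 3, 3, 3])

# One (rast00, rast10) pair per row.
pairs = np.column_stack((rast00_mask, rast10_mask))

# Unique rows (pairs) and how many times each occurs.
pair_values, pair_counts = np.unique(pairs, axis=0, return_counts=True)

# Unique values and their counts for each raster on its own.
k1_values, k1_counts = np.unique(rast00_mask, return_counts=True)
k2_values, k2_counts = np.unique(rast10_mask, return_counts=True)
```

Note that the results come back sorted by value rather than in order of first appearance, so the output lines would be ordered differently from the dict-based version.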

unique_u = dict()
unique_k1 = dict()
unique_k2 = dict()

for key1, key2 in array2D:
    row = (key1, key2)
    if row in unique_u:
        unique_u[row] += 1
    else:
        unique_u[row] = 1
    if key1 in unique_k1:
        unique_k1[key1] += 1
    else:
        unique_k1[key1] = 1
    if key2 in unique_k2:
        unique_k2[key2] += 1
    else:
        unique_k2[key2] = 1
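
The three if/else tallies above follow the same pattern, which the standard library's `collections.Counter` handles directly. A minimal sketch, assuming the masked values are available as plain Python lists (for example via `.tolist()`); `vals00` and `vals10` are hypothetical names with made-up data:

```python
from collections import Counter

# Made-up stand-ins for Rast00_mask.tolist() and Rast10_mask.tolist().
vals00 = [4, 5, 4, 4, 2, 4, 4]
vals10 = [5, 4, 4, 4, 3, 3, 3]

unique_u = Counter(zip(vals00, vals10))  # count of each (rast00, rast10) pair
unique_k1 = Counter(vals00)              # count of each rast00 value
unique_k2 = Counter(vals10)              # count of each rast10 value
```

A `Counter` is a dict subclass, so the `.items()` loops used for writing the files would work on it unchanged. Whether this is actually faster on large rasters would need to be measured.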

output = open(dst_file_rast0010, "w")
for (a, b), c in unique_u.items():
    print(a, b, c, file=output)
output.close()

output = open(dst_file_rast00, "w")
for a, b in unique_k1.items():
    print(a, b, file=output)
output.close()

output = open(dst_file_rast10, "w")
for a, b in unique_k2.items():
    print(a, b, file=output)
output.close()
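
The three writing blocks differ only in the file name and in whether the key is a pair or a single value, so they could be folded into one helper. This is only a sketch for Python 3: `write_counts` and `demo_path` are hypothetical names, and the demo writes to a temporary directory rather than to the real dst_file_* paths.

```python
import os
import tempfile


def write_counts(path, counts):
    """Write one line per entry: the key fields, then the count."""
    with open(path, "w") as output:  # closed automatically, even on error
        for key, count in counts.items():
            # A tuple key such as (4, 5) becomes separate columns;
            # a scalar key becomes a single column.
            fields = key if isinstance(key, tuple) else (key,)
            print(*fields, count, file=output)


# Demo with made-up counts and a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "rast0010.txt")
write_counts(demo_path, {(4, 5): 1, (4, 4): 2})
with open(demo_path) as f:
    written = f.read()
```

With the real data this would be called once per dictionary, e.g. `write_counts(dst_file_rast0010, unique_u)`.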

What do you think? Is there a way to speed up the process?
Thanks
Giuseppe





On 9 August 2012 16:34, Roman Vashkevich <vashkevichrb@gmail.com> wrote:
> Actually, they are different.
> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
> Dict uses hashing to get a value from the dict and this is why it's O(1).
>
> On 10.08.2012, at 1:21, Tim Chase wrote:
>
>> On 08/09/12 15:41, Roman Vashkevich wrote:
>>> On 10.08.2012, at 0:35, Tim Chase wrote:
>>>> On 08/09/12 15:22, Roman Vashkevich wrote:
>>>>>> {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
>>>>>> and I want to print to a file without the brackets, commas and colons in order to obtain something like this?
>>>>>> 4 5 1
>>>>>> 5 4 1
>>>>>> 4 4 2
>>>>>> 2 3 1
>>>>>> 4 3 2
>>>>>
>>>>> for key in dict:
>>>>>    print key[0], key[1], dict[key]
>>>>
>>>> This might read more cleanly with tuple unpacking:
>>>>
>>>> for (edge1, edge2), cost in d.iteritems(): # or .items()
>>>>   print edge1, edge2, cost
>>>>
>>>> (I'm making the assumption that this is an edge/cost graph...use
>>>> appropriate names according to what they actually mean)
>>>
>>> dict.items() is a list - linear access time whereas with 'for
>>> key in dict:' access time is constant:
>>> http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#use-in-where-possible-1
>>
>> That link doesn't actually discuss dict.{iter}items()
>>
>> Both are O(N) because you have to touch each item in the dict--you
>> can't iterate over N entries in less than O(N) time.  For small
>> data-sets, building the list and then iterating over it may be
>> faster; for larger data-sets, the cost of building the list
>> overshadows the (minor) overhead of a generator.  Either way, the
>> iterate-and-fetch-the-associated-value of .items() & .iteritems()
>> can (should?) be optimized in Python's internals to the point I
>> wouldn't think twice about using the more readable version.
>>
>> -tkc
>>
>>
>



-- 
Giuseppe Amatulli
Web: www.spatial-ecology.net