Re: Storing a big amount of path names

From Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt>
Newsgroups comp.lang.python
Subject Re: Storing a big amount of path names
Date 2016-02-12 04:45 +0000
Organization Aioe.org NNTP Server
Message-ID <n9jo24$o42$1@gioia.aioe.org>
References <n9j94f$712$1@gioia.aioe.org> <56BD4DCD.4080401@mrabarnett.plus.com> <mailman.66.1455248959.22075.python-list@python.org> <n9jm9n$m77$1@gioia.aioe.org> <mailman.67.1455251003.22075.python-list@python.org>


At 04:23 on 12-02-2016, Chris Angelico wrote:
> On Fri, Feb 12, 2016 at 3:15 PM, Paulo da Silva
> <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> wrote:
>> At 03:49 on 12-02-2016, Chris Angelico wrote:
>>> On Fri, Feb 12, 2016 at 2:13 PM, MRAB <python@mrabarnett.plus.com> wrote:
>>>> Apart from all of the other answers that have been given:
>>>>
>> ...
>>>
>>> Simpler to let the language do that for you:
>>>
>>> >>> import sys
>>> >>> p1 = sys.intern('foo/bar')
>>> >>> p2 = sys.intern('foo/bar')
>>> >>> id(p1), id(p2)
>>> (139621017266528, 139621017266528)
>>>
>>
>> I didn't know about id or sys.intern :-)
>> I need to look at them ...
>>
>> As far as I understand, in the MyFile class I can do
>>
>> self.dirname=sys.intern(dirname) # dirname passed as arg to the __init__
>>
>> and the character string doesn't get repeated.
>> Is this correct?
> 
> Correct. Two equal strings, passed to sys.intern(), will come back as
> identical strings, which means they use the same memory. You can have
> a million references to the same string and it takes up no additional
> memory.
I have been playing with this and found that it is not always true!
For example:

In [1]: def f(s):
   ...:     print(id(sys.intern(s)))
   ...:

In [2]: import sys

In [3]: f("12345")
139805480756480

In [4]: f("12345")
139805480755640

In [5]: f("12345")
139805480756480

In [6]: f("12345")
139805480756480

In [7]: f("12345")
139805480750864
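
A note on reading the ids above: id() is only guaranteed to be unique among objects that are alive at the same time, and the sys.intern documentation says that interned strings are not immortal, so a reference to the returned value has to be kept to benefit from it. Since f() drops the result right away, that may be why the numbers differ; this is an assumption, not something verified in this session. A minimal sketch that keeps both results alive while comparing them (make_path is a hypothetical helper, used only to build the strings at run time):

```python
import sys


def make_path():
    # Hypothetical helper: builds the string at run time, so it is not
    # a literal constant taken from the source code.
    return "".join(["foo", "/", "bar"])


a = make_path()
b = make_path()
print(a == b)       # equal contents

# Keep both return values referenced while comparing them.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)     # True: both names refer to the one interned object
```

If this holds in the real program, storing the return value of sys.intern() on each object would be enough to share the string.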

I think a dict, as MRAB suggested, is needed.
At the end of the store process I may delete the dict.
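
A rough sketch of that dict idea, not necessarily what MRAB had in mind: the dict maps each dirname to the first copy seen, and every later equal string is replaced by that copy. The names pool and basename, and the sample paths, are made up for illustration.

```python
pool = {}  # hypothetical name: dirname -> the single shared copy of it


class MyFile:
    def __init__(self, dirname, basename):
        # setdefault stores dirname the first time it is seen and
        # returns the already stored copy on every later call.
        self.dirname = pool.setdefault(dirname, dirname)
        self.basename = basename


# Two equal dirnames built separately at run time.
d1 = "".join(["/data/", "photos"])
d2 = "".join(["/data/", "photos"])

f1 = MyFile(d1, "a.jpg")
f2 = MyFile(d2, "b.jpg")
print(f1.dirname is f2.dirname)  # True: one string shared by both objects

# Once everything is stored the dict itself is no longer needed;
# the shared strings stay alive through the MyFile objects.
del pool
print(f1.dirname)
```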

> 
> But I reiterate: Don't even bother with this unless you know your
> program is running short of memory.

Yes, it is.
This is part of a previous post (sets of equal files) and I need lots of
memory for performance reasons. I only have 2 GB of RAM in this computer.

I had already implemented a solution using two dicts: one to map
dirnames to an int handle and the other to map the handle back to the
dirname. At the end I deleted the first one, because I only need to get
the dirname from the handle. But I thought there should be a better choice.
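
For reference, the two-dict scheme described above could look roughly like this. It is a sketch under assumptions: the names handle_of, dirname_of and get_handle are invented here, and the sample paths are made up.

```python
import os

handle_of = {}    # dirname -> int handle (only needed while storing)
dirname_of = {}   # int handle -> dirname (kept for later lookups)


def get_handle(dirname):
    """Return the int handle for dirname, assigning a new one if needed."""
    h = handle_of.get(dirname)
    if h is None:
        h = len(dirname_of)          # next unused integer
        handle_of[dirname] = h
        dirname_of[h] = dirname
    return h


paths = ["/a/b/x.txt", "/a/b/y.txt", "/a/c/z.txt"]

# Each record keeps a small int instead of its own copy of the dirname.
records = [(get_handle(os.path.dirname(p)), os.path.basename(p))
           for p in paths]

# After the store process only the handle -> dirname direction is needed.
del handle_of

for h, name in records:
    print(dirname_of[h], name)
```

Because the handles here are consecutive integers starting at 0, a plain list could probably stand in for the second dict.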

Thanks
Paulo
