From: MRAB
Newsgroups: comp.lang.python
Subject: Re: Storing a big amount of path names
Date: Fri, 12 Feb 2016 03:13:17 +0000

On 2016-02-12 00:31, Paulo da Silva wrote:
> Hi!
>
> What is the best (shortest memory usage) way to store lots of pathnames
> in memory where:
>
> 1. Path names are pathname=(dirname,filename)
> 2. There many different dirnames but much less than pathnames
> 3. dirnames have in general many chars
>
> The idea is to share the common dirnames.
>
> More realistically not only the pathnames are stored but objects each
> object being a MyFile containing
> self.name -
> getPathname(self) -
> other stuff
>
> class MyFile:
>
>     __allfiles=[]
>
>     def __init__(self,dirname,filename):
>         self.dirname=dirname # But I want to share this with other files
>         self.name=filename
>         MyFile.__allfiles.append(self)
>         ...
>
>     def getPathname(self):
>         return os.path.join(self.dirname,self.name)
>
>     ...
>
Apart from all of the other answers that have been given:

 >>> p1 = 'foo/bar'
 >>> p2 = 'foo/bar'
 >>> id(p1), id(p2)
(982008930176, 982008930120)
 >>> d = {}
 >>> id(d.setdefault(p1, p1))
982008930176
 >>> id(d.setdefault(p2, p2))
982008930176

The dict maps equal strings (dirnames) to the same string, so you won't
have multiple copies.
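
Applied to the MyFile class from the question, the same trick might look
roughly like the sketch below. The `__dirnames` dict is a name made up
here for illustration (it is not in the original post), and the two
dirname strings are built at run time only to make sure they start out
as separate objects.

```python
import os


class MyFile:
    __allfiles = []   # every MyFile created so far, as in the question
    __dirnames = {}   # hypothetical cache: dirname -> the one shared copy

    def __init__(self, dirname, filename):
        # setdefault returns the string already stored under an equal key,
        # or stores and returns this one if no equal key exists yet.
        self.dirname = MyFile.__dirnames.setdefault(dirname, dirname)
        self.name = filename
        MyFile.__allfiles.append(self)

    def getPathname(self):
        return os.path.join(self.dirname, self.name)


# Two equal strings built separately at run time, so they are
# distinct objects before they go through the cache.
d1 = "/".join(["some", "long", "dir"])
d2 = "/".join(["some", "long", "dir"])

a = MyFile(d1, "a.txt")
b = MyFile(d2, "b.txt")

# Both instances now refer to the single copy held by the dict.
shared = a.dirname is b.dirname
```

After this, `shared` is true even though `d1` and `d2` were separate
objects, so each distinct dirname is kept only once however many files
live in it. The standard library's `sys.intern()` does a similar job for
strings without a dict of your own, which may or may not suit your case.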