Path: csiph.com!news.mixmin.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail From: Tim Rentsch Newsgroups: comp.lang.c Subject: Re: Dymamic arrays: memory management and naming Date: Tue, 19 Sep 2023 07:58:57 -0700 Organization: A noiseless patient Spider Lines: 117 Message-ID: <86o7hyky4u.fsf@linuxsc.com> References: <20230909132336.99cdb303ad685bc1e40f92df@gmail.moc> <86ediwmk8b.fsf@linuxsc.com> <20230919011018.d9ed14da6e842291d47b69d7@gmail.moc> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Injection-Info: dont-email.me; posting-host="2708376e434a174750b1cbd4a8683397"; logging-data="2530628"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/hYcKi1HnEiki2vY4slRrk1AxEK/sL0lM=" User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux) Cancel-Lock: sha1:JKuZTWMTjl7pN2ri+OWWVunK3NU= sha1:J8dUbw8cfHqQDkYvonTgN1zB48c= Xref: csiph.com comp.lang.c:176032 Anton Shepelev writes: > Tim Rentsch: > >> Another point to consider: if you are using malloc(), and >> the array elements might be small (eg chars or shorts), > > I thought such effects should be prominent only with small > array elements /and/ small array lengthes... I expected them > to drown below the noise floor (so to say) when the length > becomes large. I think it's not as simple as that. Consider a scenario: a program is making use of a growable character buffer, with eventually 150 bytes needed. Let's take an initial guess of 8 elements, and growing by a factor of two. The chain of events in the two cases would look like (ignore grain, etc) 8 16 32 64 128 256 (272 used) (factor in grain, etc) 24 56 120 248 (256 used) Taking granularity, etc, into account the program needs just four allocations (and uses only slightly less memory). Ignoring these factors the program needs six allocations. Granted, the memory savings is only about 6 percent, but there is also a time savings by virtue of needing fewer re-allocations. (If the numbers in the second line are confusing, remember that at each stage the numbers are rounded up to be 8 mod 16. The initial 24 is because that's the smallest amount of memory that the allocator will fit in a single block; any smaller amount will end up returning 24 usable bytes, so we might as well ask for that.) I admit this is only one scenario, and other scenarios will give different results, including some where a naive approach does better. Still, the potential upside seems worth getting, since the cost of getting it (basically some simple arithmetic in the allocation function) is pretty small. >> you >> might try to take into account minimum allocation size, >> grain, and overhead. Here is a program to help explain: >> >> #include >> #include >> >> int >> main(){ >> enum { N = 90 }; >> char *blocks[N+1]; >> unsigned i; >> for( i = 0; i < N+1; i++ ){ >> blocks[i] = malloc( i ); >> } >> for( i = 0; i < N; i++ ){ >> printf( "i: %2u delta: %3td0, i, blocks[i+1] - blocks[i] ); >> } >> } >> >> Running this program on my system here, it indicates a grain >> of 16 bytes, an overhead of 8 bytes > > Same here (on the Freeshell.de server). I had not thought > of that at all, assuming subconsciously that the allocation > map (some sort of RAM filesystem) is kept separately from > the actual allocated blocks. May I rely on the > contiguousness of each overhead data element with the > corresponding allocated block of data? I think most allocators use a little bit of memory just before each allocated block, to facilitate free()-ing and coalescing of adjacent free blocks. That isn't guaranteed of course, but it's common. > May I rely in > general on the sequential order of allocations that your > program clearly demonstrated in both our cases? When a program starts it's likely that the allocator has a single large block of memory available, and will fill successive calls at one end of that block. Again, not guaranteed, but I expect it occurs often enough that it's worth making a stab. If an attempt to discover the allocator characteristics fails, I think a reasonable fallback is to guess a granularity of 16 and an overhead of 8 (minimum size 24) for 64-bit systems, and half of those for 32-bit systems. > For aught I > know, the memory manager is free to do howsoever it wishes... Certainly it is, but it's not just a coincidence that the numbers you got match the numbers that I got. Having said that, I should add that my experience with MS Windows is that it has enormous amount of overhead for each allocated block. > Thanks for the observation, Tim. As you see, it is somewhat > below the horizon (geologically speaking) of my knowledge. > If you should care to provide a strategy that estimates > these parameters of the memory manager and then uses them to > optimise the allocation of dynamic arrays, I shall gladly > include it as as an option with my implementation. Right now > I am casting for some lean, clean, and not git-driven platform > to host the code. My first suggestion is to provide an option (runtime selectable) that guesses 'sizeof (size_t)' for overhead, twice that for the granularity, and three times that for the minimum request size. It's probably worth adding an initialization call to your allocator interface, so that you can later add code that tries to discover the true values of these parameters. (Note: some spelling corrections to your message have been made.)