Groups > comp.programming.threads > #1180

Re: Filling a histogram

Date	2012-10-24 09:28 +0200
From	Marcel Müller <news.5.maazl@spamgourmet.com>
Newsgroups	comp.programming.threads
Subject	Re: Filling a histogram
References	<alpine.LNX.2.00.1210231714320.3074@coulomb.univ-paris12.fr>
Message-ID	<5087988c$0$6581$9b4e6d93@newsspool3.arcor-online.net> (permalink)
Organization	Arcor

Show all headers | View raw

On 23.10.2012 17:22, Lucas Levrel wrote:
> I have several threads that should concurrently fill a histogram,
> represented by a shared array. Each thread locally (and repeatedly)
> computes the bin to increment, then does ++(histo[bin]) . What do you
> think is the more efficient way to achieve this:
> - each thread updates a copy of the histogram, then reduce;
> - use atomic<>?

It depends on the update rate. Let's say all threads together do some 
update approximately 10 times per second. Then feel free to update a 
shared array directly. (I assume that no two threads update the same 
bin. Otherwise see below.)

If the update rate is high (let's say >= 1000/s), then you will run into 
a cache invalidation issue, because many platforms will invalidate the 
cache of other CPU cores on write access to a cache line.

However, if each of your threads is responsible for a longer, continuous 
run of bins. The only updates close to the border of the thread 
responsibility counts, i.e. the shared cache lines.

A common approach to work around the cache problem is to use another 
level of indirection. Make your histo array of type int* histo[] (or 
something similar) and allocate single objects of type int by the thread 
that writes them. The latter is essential, because allocators try to use 
different heap areas for different threads.

If the lifetime of your worker threads is longer than the lifetime of 
histo, then you also could initialize histo with pointers to local 
members of your worker class. This avoids the additional allocation and 
you do not need to deal with null references in the view engine.

Assuming that read access to int is implicitly atomic on your platform, 
there is no need to protect each integer value by std::atomic. In theory 
this could delay the updates for an infinite time until they are visible 
by the viewer thread. In practice it will work in any existing real 
world scenario.

If your threads all update all bins (e.g. in case of Monte Carlo 
integration), then it depends on how often you want to update the global 
array. Working with local copies in each thread and adding them to the 
global one from time to time has the least synchronization overhead. The 
latter could be done by the viewer thread on demand. You may not need 
any synchronization in this case as long as you do not care when the 
time index where the bins are sampled from the local thread histos 
slightly differs for each bin.

If you want to update the global array immediately, then you need an 
array of std::atomic<int>. But be aware that nothing can protect you 
from the cache invalidation issue mentioned above in this case.

Marcel

Back to comp.programming.threads | Previous | Next — Previous in thread | Next in thread | Find similar

Thread

Filling a histogram Lucas Levrel <lucas.levrel@u-pec.fr> - 2012-10-23 17:22 +0200
  Re: Filling a histogram Marcel Müller <news.5.maazl@spamgourmet.com> - 2012-10-24 09:28 +0200
    Re: Filling a histogram Lucas Levrel <lucas.levrel@u-pec.fr> - 2012-10-24 15:19 +0200

csiph-web