Groups > comp.lang.java.programmer > #14544 > unrolled thread

multithreaded cache?

Started by	bugbear <bugbear@trim_papermule.co.uk_trim>
First post	2012-05-15 10:14 +0100
Last post	2012-06-02 21:54 +0200
Articles	20 on this page of 40 — 12 participants

Back to article view | Back to comp.lang.java.programmer

  multithreaded cache? bugbear <bugbear@trim_papermule.co.uk_trim> - 2012-05-15 10:14 +0100
    Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-15 11:41 +0200
      Re: multithreaded cache? Roedy Green <see_website@mindprod.com.invalid> - 2012-05-15 15:57 -0700
        Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-15 16:13 -0700
          Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 01:22 +0200
        Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 01:19 +0200
          Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-15 20:26 -0400
            Re: multithreaded cache? Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-05-15 21:41 -0300
            Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 09:26 +0200
              Re: multithreaded cache? "Mike Schilling" <mscottschilling@hotmail.com> - 2012-05-26 14:54 -0700
          Re: multithreaded cache? Roedy Green <see_website@mindprod.com.invalid> - 2012-05-16 22:03 -0700
            Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-21 15:25 +0200
    Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-15 08:22 -0400
    Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-15 09:56 -0700
    Re: multithreaded cache? bugbear <bugbear@trim_papermule.co.uk_trim> - 2012-05-16 14:33 +0100
    Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 11:54 +0200
      Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 12:06 +0200
        Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-17 09:51 -0700
          Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 20:48 +0200
        Re: multithreaded cache? Sebastian <sebastian@undisclosed.invalid> - 2012-05-20 01:35 +0200
          Re: multithreaded cache? Sebastian <sebastian@undisclosed.invalid> - 2012-05-20 01:48 +0200
            Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-19 18:11 -0700
      Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-18 20:42 +0200
        Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-18 23:45 +0200
          Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-18 18:31 -0400
            Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-18 22:15 -0700
              Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-19 07:09 -0400
            Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-19 12:33 +0200
              Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-19 14:24 -0700
                Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-21 21:15 +0200
                  Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-23 23:11 -0700
                    Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-24 23:05 +0200
                      Re: multithreaded cache? Gene Wirchenko <genew@ocis.net> - 2012-05-24 20:11 -0700
                        Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-25 17:42 +0200
                  Re: multithreaded cache? Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-05-25 07:09 -0300
          Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-21 15:21 +0200
    Re: multithreaded cache? Kevin McMurtrie <mcmurtrie@pixelmemory.us> - 2012-05-24 23:22 -0700
      Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-25 17:36 +0200
        Re: multithreaded cache? Kevin McMurtrie <mcmurtrie@pixelmemory.us> - 2012-06-02 12:27 -0700
          Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-06-02 21:54 +0200

Page 2 of 2 — ← Prev page 1 [2]

#14659

From	Sebastian <sebastian@undisclosed.invalid>
Date	2012-05-20 01:48 +0200
Message-ID	<jp9bfh$b9f$1@news.albasani.net>
In reply to	#14658

Am 20.05.2012 01:35, schrieb Sebastian:
> Am 17.05.2012 12:06, schrieb Robert Klemme:
>> On 05/17/2012 11:54 AM, Robert Klemme wrote:
>>> On 05/15/2012 11:14 AM, bugbear wrote:
>>>> However, if the underlying function is slow
>>>> and/or resource hungry (consider cacheing
>>>> a ray traced image!) many threads can
>>>> end up calling the real function (second
>>>> and subsequent threads to the first get a miss
>>>> during the first threads call to the underlying function).
>>>>
>>>> "obviously..." what I want is for only
>>>> the thread that FIRST has a cache miss
>>>> calls the underlying function, whilst other
>>>> threads (for the same key) wait.
>>>
>>> I provide a variant of Silvio's, Eric's and Daniel's solution which
>>> should yield higher throughput because it works without read write
>>> locking. You can find it as gist in case the code is garbled in the
>>> newsgroup posting:
>>> https://gist.github.com/2717818
>>
>> There was one important detail missing. This is the corrected code (gist
>> is updated as well):
>>
> [snip]
> The important detail obiously is the extra check with the "here" flag.
> Would you mind explaining why that is necessary? Everyone seems to have
> got it, but I must admit I haven't. Why is it not enough that
> Reference#get() is synchronized? As you yourself have missed this at
> your first attempt, I hope the reason isn't trivial...
> -- Sebastian
>
I think I've got it -  the check prevents double calculation of the 
value by two threads calling get() on that same reference instance.
Which may occur if putIfAbsent() returns a non-null value before the
"calculating reference" has had time to replace itself with the
constant reference. Is that right? -- Sebastian

[toc] | [prev] | [next] | [standalone]

#14662

From	Lew <noone@lewscanon.com>
Date	2012-05-19 18:11 -0700
Message-ID	<jp9gc1$7s3$1@news.albasani.net>
In reply to	#14659

Sebastian wrote:
> schrieb Sebastian:
>> The important detail obiously is the extra check with the "here" flag.
>> Would you mind explaining why that is necessary? Everyone seems to have
>> got it, but I must admit I haven't. Why is it not enough that
>> Reference#get() is synchronized? As you yourself have missed this at
>> your first attempt, I hope the reason isn't trivial...
>>
> I think I've got it - the check prevents double calculation of the value by
> two threads calling get() on that same reference instance.
> Which may occur if putIfAbsent() returns a non-null value before the
> "calculating reference" has had time to replace itself with the
> constant reference. Is that right? -- Sebastian

Yep.

https://gist.github.com/2717818

============
LazyCache
============

   public V get(final K key) {
     Reference<V> ref = map.get(key);

     if (ref == null) {
       // miss
       ref = new Reference<V>() {
         private V val;
         private boolean here = false; // superfluous but explicit

         @Override
         public synchronized V get() {
           if (!here) {
             val = calc.get(key);
             here = true;
             // next time lock free access:
             Reference<V> x = map.put(key, new Ref<V>(val));
             assert x == this;
           }

           return val;
         }
       };

       final Reference<V> old = map.putIfAbsent(key, ref);

       if (old != null)
         ref = old; // someone else was faster
     }

     return ref.get();
   }

============

The first cluster of requestors to 'LazyCache#get(K key)' enter the
'if (ref == null) {' block.

They all construct a new anonymous-type instance of 'Reference<V>'.

The first one to hit 'putIfAbsent()' wins. The rest replace their 
painstakingly - actually very, very quickly - contructed anonymous-type 
'Reference<V>' instances with the anonymous-type instance the first client put 
into the 'Map'.

So only one 'Reference<V>' comes back from all the clients that first hit the 
map and saw 'null'.

It is of the anonymous type.

The purpose of the 'here' variable is for the anonymous-type instance. That 
one-and-only possible instance can tell that it has never been 'here' before, 
i.e., inside the 'get()' method.

The next batch of 'LazyCache' clients hit the map when that anonymous-type 
instance is still in there.

They all call 'get()' on that anonymous-type instance. First one sees 'here' 
is still false.  Up until now, no one has called the expensive operation 
inside of 'calc'. No matter how many threads hit the map when the value was 
'null', and no matter how many of them hit it when the map contained the 
anonymous-type instance, all those threads have to wait until that one and 
only instance calls 'get()' in turn, and the first one of that whole cluster 
is the only one to see '!here'.

It calls the expensive operation and replaces its own internal value with the 
newly-calculated one.

In addition, it replaces itself in the map with a simple, final-value holder.

All the threads currently blocked on the anonymous-type instance 'get()' will 
see the stashed value.

None of them will call the expensive operation again.

Any clients thereafter will hit the 'LazyCache' map and find only the simple, 
final-value named-type instance, whose 'get()' returns the pre-calculated 
value in a completely thread-safe manner without any more locking.

Once the named-type instance is in the map, and all threads that found the 
anonymous one finish their 'get()' calls, the anonymous-type instance is 
eligible for garbage collection, thus letting go of an extra reference to the 
calculator. I've chased that rabbit down the hole a bit without finding a real 
danger to the circular reference in this scenario, but it's never bad to break 
reference-graph cycles. They are often a major bug breeding ground, especially 
on resource managers and types with non-trivial 'finalize()' methods.

Again, while here the lifetime of the calculator is that of the 'LazyCache' 
instance, it's good practice to decouple value clients from their producers' 
finalizers. It simplifies memory management. See Josh Bloch's /Effective Java/ 
for details on finalizers.

You get tremendous results from the twin engines of scope and lifetime management.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

[toc] | [prev] | [next] | [standalone]

#14628

From	Silvio Bierman <silvio@moc.com>
Date	2012-05-18 20:42 +0200
Message-ID	<4fb69812$0$6849$e4fe514c@news2.news.xs4all.nl>
In reply to	#14571

On 05/17/2012 11:54 AM, Robert Klemme wrote:
> On 05/15/2012 11:14 AM, bugbear wrote:
>> However, if the underlying function is slow
>> and/or resource hungry (consider cacheing
>> a ray traced image!) many threads can
>> end up calling the real function (second
>> and subsequent threads to the first get a miss
>> during the first threads call to the underlying function).
>>
>> "obviously..." what I want is for only
>> the thread that FIRST has a cache miss
>> calls the underlying function, whilst other
>> threads (for the same key) wait.
>
> I provide a variant of Silvio's, Eric's and Daniel's solution which
> should yield higher throughput because it works without read write
> locking. You can find it as gist in case the code is garbled in the
> newsgroup posting:
> https://gist.github.com/2717818
>
> Kind regards
>
> robert
>
>
> The code (untested):
>
> package clj;
>
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.ConcurrentMap;
>
> /**
> * The cache works with as few locking as possible. Lookup is done in two
> steps
> * on cache miss:
> * <ol>
> * <li>On a cache miss a retriever is inserted into the cache which will
> obtain
> * the value synchronized from a {@link Calculator}.</li>
> * <li>Once calculation has finished a simple lock free reference to the
> value
> * replaces the retriever in the cache and the value is returned.</li>
> * </ol>
> *
> * @author robert klemme
> *
> * @param <K>
> * key type
> * @param <V>
> * value type
> */
> public final class LazyCache<K, V> {
> /**
> * Calculate values from given keys.
> *
> * @param <K>
> * key type
> * @param <V>
> * value type
> */
> public interface Calculator<K, V> {
> V get(K key);
> }
>
> /**
> * Obtain a value.
> *
> * @param <V>
> * value type.
> */
> private interface Reference<V> {
> V get();
> }
>
> /**
> * Stupid simple reference which only hands out a fixed value all the time
> * without synchronization.
> *
> * @param <V>
> * value type.
> */
> private static final class Ref<V> implements Reference<V> {
> private final V val;
>
> public Ref(V val) {
> this.val = val;
> }
>
> @Override
> public V get() {
> return val;
> }
> }
>
> /** Mapping from keys to objects which yield values. */
> private final ConcurrentMap<K, Reference<V>> map = new
> ConcurrentHashMap<K, Reference<V>>();
>
> /** User provided. */
> private final Calculator<K, V> calc;
>
> /**
> * Create a cache.
> *
> * @param calc
> * user must provide a reasonable implementation, not
> * <code>null</code>.
> */
> public LazyCache(final Calculator<K, V> calc) {
> if (calc == null)
> throw new NullPointerException();
> this.calc = calc;
> }
>
> /**
> * Get a value from the cache. The value might have to be calculated.
> *
> * @param key
> * lookup key.
> * @return value, might even be <code>null</code> depending on algorithm.
> */
> public V get(final K key) {
> Reference<V> ref = map.get(key);
>
> if (ref == null) {
> // miss
> ref = new Reference<V>() {
> @Override
> public synchronized V get() {
> final V val = calc.get(key);
> // next time lock free access:
> Reference<V> x = map.put(key, new Ref<V>(val));
> assert x == this;
> return val;
> }
> };
>
> final Reference<V> old = map.putIfAbsent(key, ref);
>
> if (old != null)
> ref = old; // someone else was faster
> }
>
> return ref.get();
> }
> }


I think you have as many locks as I suggested (being one)? My initial 
implementations of something like this used a plain map with an extra 
lock but later cases used the by then available ConcurrentHashMap as 
well, making one lock redundant.

Silvio

[toc] | [prev] | [next] | [standalone]

#14633

From	Robert Klemme <shortcutter@googlemail.com>
Date	2012-05-18 23:45 +0200
Message-ID	<a1nu7sFnnpU1@mid.individual.net>
In reply to	#14628

On 18.05.2012 20:42, Silvio Bierman wrote:
> On 05/17/2012 11:54 AM, Robert Klemme wrote:

>> I provide a variant of Silvio's, Eric's and Daniel's solution which
>> should yield higher throughput because it works without read write
>> locking. You can find it as gist in case the code is garbled in the
>> newsgroup posting:
>> https://gist.github.com/2717818

> I think you have as many locks as I suggested (being one)? My initial
> implementations of something like this used a plain map with an extra
> lock but later cases used the by then available ConcurrentHashMap as
> well, making one lock redundant.

You didn't show it here, did you?  I can's seem to find it in the 
thread.  Note that CHM does also do synchronization.  I am not sure from 
your statement what exact locking scheme you apply.  There does seem to 
be one difference though: I my version the second lock goes away after 
the value has been computed so there is only the sync of CHM left.

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

[toc] | [prev] | [next] | [standalone]

#14636

From	Eric Sosman <esosman@ieee-dot-org.invalid>
Date	2012-05-18 18:31 -0400
Message-ID	<jp6iji$n6m$1@dont-email.me>
In reply to	#14633

On 5/18/2012 5:45 PM, Robert Klemme wrote:
> On 18.05.2012 20:42, Silvio Bierman wrote:
>> On 05/17/2012 11:54 AM, Robert Klemme wrote:
>
>>> I provide a variant of Silvio's, Eric's and Daniel's solution which
>>> should yield higher throughput because it works without read write
>>> locking. You can find it as gist in case the code is garbled in the
>>> newsgroup posting:
>>> https://gist.github.com/2717818
>
>> I think you have as many locks as I suggested (being one)? My initial
>> implementations of something like this used a plain map with an extra
>> lock but later cases used the by then available ConcurrentHashMap as
>> well, making one lock redundant.
>
> You didn't show it here, did you? I can's seem to find it in the thread.
> Note that CHM does also do synchronization. I am not sure from your
> statement what exact locking scheme you apply. There does seem to be one
> difference though: I my version the second lock goes away after the
> value has been computed so there is only the sync of CHM left.

     It seems to me that if N threads query the same key at about
the same time, they may all miss the map and go off to perform
the slow computation.  If "slow" is large compared to the cost of
a lock-release pair (and if it weren't, why cache?), the tradeoff
seems suspect.

     Also, different threads may wind up using different value
instances.  If the cache is purely a convenience for a value-only
object that may be all right, but it's not all right if the values
are supposed to be singletons.

     Finally, there's more than a whiff of the double-checked locking
antipattern about what you're doing with the `here' flag.  I'm not
absolutely sure what you've got is in fact DCL (hence, wrong), but
I'm also not absolutely sure it's DCL-free.  Before using the pattern
in any important way, I'd want to check with a major-league guru,
just as "due diligence."

-- 
Eric Sosman
esosman@ieee-dot-org.invalid

[toc] | [prev] | [next] | [standalone]

#14637

From	Daniel Pitts <newsgroup.nospam@virtualinfinity.net>
Date	2012-05-18 22:15 -0700
Message-ID	<J%Ftr.22162$TC4.17002@newsfe14.iad>
In reply to	#14636

On 5/18/12 3:31 PM, Eric Sosman wrote:
> On 5/18/2012 5:45 PM, Robert Klemme wrote:
>> On 18.05.2012 20:42, Silvio Bierman wrote:
>>> On 05/17/2012 11:54 AM, Robert Klemme wrote:
>>
>>>> I provide a variant of Silvio's, Eric's and Daniel's solution which
>>>> should yield higher throughput because it works without read write
>>>> locking. You can find it as gist in case the code is garbled in the
>>>> newsgroup posting:
>>>> https://gist.github.com/2717818
>>
>>> I think you have as many locks as I suggested (being one)? My initial
>>> implementations of something like this used a plain map with an extra
>>> lock but later cases used the by then available ConcurrentHashMap as
>>> well, making one lock redundant.
>>
>> You didn't show it here, did you? I can's seem to find it in the thread.
>> Note that CHM does also do synchronization. I am not sure from your
>> statement what exact locking scheme you apply. There does seem to be one
>> difference though: I my version the second lock goes away after the
>> value has been computed so there is only the sync of CHM left.
>
> It seems to me that if N threads query the same key at about
> the same time, they may all miss the map and go off to perform
> the slow computation.
Nope, they will all create a "Reference" object that *can* perform the 
calculation, however because the method uses "putIfAbsent", only the 
first calculating "Reference" object is actually ever used.

After that, the Reference.get() method will block until that particular 
calculation is complete.  Once that calculation is done, the calculating 
Reference will replace itself with a constant Reference for future 
accesses to use.
> If "slow" is large compared to the cost of
> a lock-release pair (and if it weren't, why cache?), the tradeoff
> seems suspect.
It would be suspect, but there isn't a tradeoff here. All threads lock 
until one value is calculated, basically coalescing the requests for 
that value.
>
> Also, different threads may wind up using different value
> instances. If the cache is purely a convenience for a value-only
> object that may be all right, but it's not all right if the values
> are supposed to be singletons.
Again, invalid if you recognize what is actually happening.
>
> Finally, there's more than a whiff of the double-checked locking
> antipattern about what you're doing with the `here' flag. I'm not
> absolutely sure what you've got is in fact DCL (hence, wrong), but
> I'm also not absolutely sure it's DCL-free. Before using the pattern
> in any important way, I'd want to check with a major-league guru,
> just as "due diligence."
>

"double-checked" locking is only broken if you don't have the 
appropriate memory semantics enforced elsewhere.  The reason the 
"classical" double-check locking fails is because there could be a 
data-race condition, where the reference is not null, but the object 
state hasn't been flushed between threads.

In this case, the "first-check" of obtaining a value from the 
ConcurrentMap causes happens-before relationships, and causes the object 
state to flush.

Robert Klemme's code is correct, according to everything I've read in 
JCIP and elsewhere.

Cheers,
Daniel.

[toc] | [prev] | [next] | [standalone]

#14641

From	Eric Sosman <esosman@ieee-dot-org.invalid>
Date	2012-05-19 07:09 -0400
Message-ID	<jp7v1f$c15$1@dont-email.me>
In reply to	#14637

On 5/19/2012 1:15 AM, Daniel Pitts wrote:
> On 5/18/12 3:31 PM, Eric Sosman wrote:
>> On 5/18/2012 5:45 PM, Robert Klemme wrote:
>>>[...]
>>> You didn't show it here, did you? I can's seem to find it in the thread.
>>> Note that CHM does also do synchronization. I am not sure from your
>>> statement what exact locking scheme you apply. There does seem to be one
>>> difference though: I my version the second lock goes away after the
>>> value has been computed so there is only the sync of CHM left.
>>
>> It seems to me that if N threads query the same key at about
>> the same time, they may all miss the map and go off to perform
>> the slow computation.
> Nope, they will all create a "Reference" object that *can* perform the
> calculation, however because the method uses "putIfAbsent", only the
> first calculating "Reference" object is actually ever used.
> [...]

     Re-reading, I think you're right.  "Never mind," as Emily
Litella used to say.

-- 
Eric Sosman
esosman@ieee-dot-org.invalid

[toc] | [prev] | [next] | [standalone]

#14640

From	Robert Klemme <shortcutter@googlemail.com>
Date	2012-05-19 12:33 +0200
Message-ID	<a1pb7oFpc0U1@mid.individual.net>
In reply to	#14636

On 19.05.2012 00:31, Eric Sosman wrote:
> On 5/18/2012 5:45 PM, Robert Klemme wrote:
>> On 18.05.2012 20:42, Silvio Bierman wrote:
>>> On 05/17/2012 11:54 AM, Robert Klemme wrote:
>>
>>>> I provide a variant of Silvio's, Eric's and Daniel's solution which
>>>> should yield higher throughput because it works without read write
>>>> locking. You can find it as gist in case the code is garbled in the
>>>> newsgroup posting:
>>>> https://gist.github.com/2717818
>>
>>> I think you have as many locks as I suggested (being one)? My initial
>>> implementations of something like this used a plain map with an extra
>>> lock but later cases used the by then available ConcurrentHashMap as
>>> well, making one lock redundant.
>>
>> You didn't show it here, did you? I can's seem to find it in the thread.
>> Note that CHM does also do synchronization. I am not sure from your
>> statement what exact locking scheme you apply. There does seem to be one
>> difference though: I my version the second lock goes away after the
>> value has been computed so there is only the sync of CHM left.
>
> It seems to me that if N threads query the same key at about
> the same time, they may all miss the map and go off to perform
> the slow computation. If "slow" is large compared to the cost of
> a lock-release pair (and if it weren't, why cache?), the tradeoff
> seems suspect.
>
> Also, different threads may wind up using different value
> instances. If the cache is purely a convenience for a value-only
> object that may be all right, but it's not all right if the values
> are supposed to be singletons.
>
> Finally, there's more than a whiff of the double-checked locking
> antipattern about what you're doing with the `here' flag. I'm not
> absolutely sure what you've got is in fact DCL (hence, wrong), but
> I'm also not absolutely sure it's DCL-free. Before using the pattern
> in any important way, I'd want to check with a major-league guru,
> just as "due diligence."

There is no double checked locking going on - neither in LazyCache 
itself nor in the retriever instance.
http://en.wikipedia.org/wiki/Double-checked_locking

Daniel has explained all the details already.  I'd just add that 
creation of multiple retriever objects is the price you pay for 
increased concurrency due to CHM.  But only one of them is ever going to 
be used per key.  (Documenting this is one of the tasks of the assert 
Daniel questioned earlier.)

I think rw-locking is an inferior technique compared to CHM in this case 
because in case of cache miss you cannot escalate a r-lock to a w-lock 
(to avoid deadlocks) and hence need to release the r-lock, acquire the 
w-lock, *and check again*.  In other words you have two more 
synchronization points with memory barriers, scheduling effects etc.

In case of cache hit you might still suffer throughput because of thread 
starvation caused by all lookups for values missing from the cache might 
be blocked for indefinite time when using an unfair rw-lock.  Using a 
fair rw-lock on the other hand is more expensive and still has the 
downside of a global lock which affects the complete range of keys.

CHM improves this by partitioning value space and only lock individual 
partitions for atomic operations.  These partitions are completely 
independent from a locking perspective.  That's the main advantage of CHM.

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

[toc] | [prev] | [next] | [standalone]

#14653

From	Lew <noone@lewscanon.com>
Date	2012-05-19 14:24 -0700
Message-ID	<jp932s$r59$1@news.albasani.net>
In reply to	#14640

Robert Klemme wrote:
> Eric Sosman wrote:
>> Robert Klemme wrote:
>>> Silvio Bierman wrote:
>>>> Robert Klemme wrote:
>>>
>>>>> I provide a variant of Silvio's, Eric's and Daniel's solution which
>>>>> should yield higher throughput because it works without read write
>>>>> locking. You can find it as gist in case the code is garbled in the
>>>>> newsgroup posting:
>>>>> https://gist.github.com/2717818
>>>
>>>> I think you have as many locks as I suggested (being one)? My initial
>>>> implementations of something like this used a plain map with an extra
>>>> lock but later cases used the by then available ConcurrentHashMap as
>>>> well, making one lock redundant.
>>>
>>> You didn't show it here, did you? I can's seem to find it in the thread.
>>> Note that CHM does also do synchronization. I am not sure from your
>>> statement what exact locking scheme you apply. There does seem to be one
>>> difference though: I my version the second lock goes away after the
>>> value has been computed so there is only the sync of CHM left.
>>
>> It seems to me that if N threads query the same key at about
>> the same time, they may all miss the map and go off to perform
>> the slow computation. If "slow" is large compared to the cost of
>> a lock-release pair (and if it weren't, why cache?), the tradeoff
>> seems suspect.
>>
>> Also, different threads may wind up using different value
>> instances. If the cache is purely a convenience for a value-only
>> object that may be all right, but it's not all right if the values
>> are supposed to be singletons.
>>
>> Finally, there's more than a whiff of the double-checked locking
>> antipattern about what you're doing with the `here' flag. I'm not
>> absolutely sure what you've got is in fact DCL (hence, wrong), but
>> I'm also not absolutely sure it's DCL-free. Before using the pattern
>> in any important way, I'd want to check with a major-league guru,
>> just as "due diligence."
>
> There is no double checked locking going on - neither in LazyCache itself nor
> in the retriever instance.
> http://en.wikipedia.org/wiki/Double-checked_locking
>
> Daniel has explained all the details already. I'd just add that creation of
> multiple retriever objects is the price you pay for increased concurrency due
> to CHM. But only one of them is ever going to be used per key. (Documenting
> this is one of the tasks of the assert Daniel questioned earlier.)
>
> I think rw-locking is an inferior technique compared to CHM in this case
> because in case of cache miss you cannot escalate a r-lock to a w-lock (to
> avoid deadlocks) and hence need to release the r-lock, acquire the w-lock,
> *and check again*. In other words you have two more synchronization points
> with memory barriers, scheduling effects etc.
>
> In case of cache hit you might still suffer throughput because of thread
> starvation caused by all lookups for values missing from the cache might be
> blocked for indefinite time when using an unfair rw-lock. Using a fair rw-lock
> on the other hand is more expensive and still has the downside of a global
> lock which affects the complete range of keys.
>
> CHM improves this by partitioning value space and only lock individual
> partitions for atomic operations. These partitions are completely independent
> from a locking perspective. That's the main advantage of CHM.

First off, thank you for the very professional and elegant code.

I shall study it frequently.

I had experience with CHM on an Enterprise Java project a few years back that 
involved processing documents up to 1GB or so at millions of
documents per hour.

As you can imagine, concurrency was a consideration there, but due to 
bureaucracy was not properly managed for a few years. I was hired around the 
time they started to pay attention to such issues.

The code involved a properly but naively synchronized Map at one point. 
Detailed profiling revealed that lock contention for the Map was the number 
one throughput chokepoint in the whole system. Even above database concurrency 
and I/O.

Boy howdy, the pundits are right to recommend hard measurement.

Lock contention has a cascade effect. In modern JVMs, like IBM's 
mainframe-level ones that we used, uncontended locks process quite quickly. 
Contention introduces roadblocks that delay threads, allowing more to queue 
up, causing more contention, slowing things down still more, causing more, 
yada. It only takes one skunk in the middle of the main road to completely tie 
up rush hour everywhere.

CHM by default partitions lock space (though I'm not clear it uses locks 
exactly) into sixteen independent slices. This meant far more than sixteen 
times faster for us. Writes tend to happen in a solitary thread without a 
whole lot of fight while reads run like greased pigs through the other 
fifteen. With our mix of reads and writes, and transaction volume, CHM pretty 
near eliminated lock contention. YMMV, as always, but in this case that 
chokepoint went from number one to off the list.

It was still the wrong solution, since a simple, effortless, non-concurrent 
better one that would also have eliminated a raft of other problems was 
available, but had no political traction. However, good enough to eliminate 
the throughput impact was good enough, so I didn't raise a fuss when they 
decided against it.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

[toc] | [prev] | [next] | [standalone]

#14744

From	Robert Klemme <shortcutter@googlemail.com>
Date	2012-05-21 21:15 +0200
Message-ID	<a1vijnFv1jU1@mid.individual.net>
In reply to	#14653

On 19.05.2012 23:24, Lew wrote:

> First off, thank you for the very professional and elegant code.
>
> I shall study it frequently.

Thank you!

> I had experience with CHM on an Enterprise Java project a few years back
> that involved processing documents up to 1GB or so at millions of
> documents per hour.
>
> As you can imagine, concurrency was a consideration there, but due to
> bureaucracy was not properly managed for a few years. I was hired around
> the time they started to pay attention to such issues.
>
> The code involved a properly but naively synchronized Map at one point.
> Detailed profiling revealed that lock contention for the Map was the
> number one throughput chokepoint in the whole system. Even above
> database concurrency and I/O.
>
> Boy howdy, the pundits are right to recommend hard measurement.
>
> Lock contention has a cascade effect. In modern JVMs, like IBM's
> mainframe-level ones that we used, uncontended locks process quite
> quickly. Contention introduces roadblocks that delay threads, allowing
> more to queue up, causing more contention, slowing things down still
> more, causing more, yada. It only takes one skunk in the middle of the
> main road to completely tie up rush hour everywhere.
>
> CHM by default partitions lock space (though I'm not clear it uses locks
> exactly) into sixteen independent slices. This meant far more than
> sixteen times faster for us. Writes tend to happen in a solitary thread
> without a whole lot of fight while reads run like greased pigs through
> the other fifteen. With our mix of reads and writes, and transaction
> volume, CHM pretty near eliminated lock contention. YMMV, as always, but
> in this case that chokepoint went from number one to off the list.
>
> It was still the wrong solution, since a simple, effortless,
> non-concurrent better one that would also have eliminated a raft of
> other problems was available, but had no political traction. However,
> good enough to eliminate the throughput impact was good enough, so I
> didn't raise a fuss when they decided against it.

Thanks for sharing that story.  What I find amazing about this is that 
what you did isn't exactly rocket science and yet they didn't do it 
before.  You would guess that it's just what every engineer would do but 
no: something prevents this from happening.

And it is true for a number of other techniques I would consider bread 
and butter tools:

- Ensuring requirements are gathered properly and understood before 
starting to code (and design of course).

- Testing code _before_ shipping.

- When writing unit tests, making sure to also include tests for 
critical values (usually corner cases such as -1, 0, 1, limit, limit - 
1, limit + 1, null, "", etc.).

- Thinking about the person who must use what you produce, regardless 
whether it's a document, a configuration file layout, a DSL, a class, a 
library.  It seems many people in software development are far more 
concerned with the inner workings of what they create instead of 
considering how it will be used.  Maybe it's easier or it is because 
making it work takes the largest part of coding - still the outcome 
often is dissatisfying and a little more thought upfront goes a long way 
at avoiding maintenance headaches and worse things.

...

This can't be so difficult, can it?

Kind regards

	robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

[toc] | [prev] | [next] | [standalone]

#14765

From	Lew <noone@lewscanon.com>
Date	2012-05-23 23:11 -0700
Message-ID	<jpkjdg$a8f$1@news.albasani.net>
In reply to	#14744

Robert Klemme wrote:
> Thanks for sharing that story. What I find amazing about this is that what you
> did isn't exactly rocket science and yet they didn't do it before. You would
> guess that it's just what every engineer would do but no: something prevents
> this from happening.

It was worse.

Turns out the resource acquisition managed by the inter-thread map coulda 
woulda shoulda been handled better by an object instantiated with a simple new 
..() call.

The returned reference and access to the object would perforce have been 
limited to the instantiating thread. Any other thread woulda shoulda coulda 
had its own independent resource, object, reference, whatever. Sharing was not 
needed.

Loosely speaking -

  Desirable desire = getFromMap(FANCY_CAR);
  // map is shared, with fancy-schmancy synch

vs.

  Desirable desire = new FancyCar();
  // Map? Meshugge! I'm not sharing!

> And it is true for a number of other techniques I would consider bread and
> butter tools:
>
> - Ensuring requirements are gathered properly and understood before starting
> to code (and design of course).

True, with provisos.

  = Requirements are not fixed. What you should have now is the best available 
for now.

  = "Properly"? "Understood"? There are books on this, but it boils down to:

    Your software will be used. By whom? Why? What would annoy them? Don't do 
that! What would make them forget the software is there and just get their job 
done? Do that!

"Proper" is always from the user's standpoint. Not what they say, by the way, 
but what they think and feel - really. Behind the bullshit and self-deception, 
when they aren't posturing for your benefit and really are just trying to get 
their work done. And that's where "understood" comes in. Give'em what they 
want, not necessarily what they ask for.

> - Testing code _before_ shipping.

Some espouse before even writing it. I say, sure, great idea, when you make it 
easy to express my ideas in tests instead of imperative code.

Hmm.

BTW, this being Java, have you guys tried EasyMock? (AndroidMock in Android 
Java.) That shtuff rocks. It makes your very code better, not just tested but 
idiomatically. Tricky stuff. I'll post about that when I get back into it. 
I've discovered a Right Way and a Wrong Way to do mocks, just like I did with JPA.

> - When writing unit tests, making sure to also include tests for critical
> values (usually corner cases such as -1, 0, 1, limit, limit - 1, limit + 1,
> null, "", etc.).

Consider this. No matter what input a module receives, it must achieve a valid 
program state in response.

No matter what.

Coding to this formalism means, for example, that a method can handle all 
possible argument values.

public interface Caromer
{
   public Fromitz carom(Flamo flame);
}

As Herr Klemme points out, 'null' is one possible input for 'flame'. 
Non-'null' is another. Presumably a 'Flamo' has observable state that 
distinguishes input values. If the type were 'Integer' or 'int', Robert's 
example int values might pertain.

But you cannot give infinite test cases to a unit test. So you pick edge 
cases, on both the "happy path" (positive) and "unhappy path" (negative) 
scenarios. Consider this:

public interface Quadrupler
{
   public int quadruple(int val);
}

What if it were called with an auto-unboxed Integer?

  public int someClient(Integer val)
  {
    Quadrupler quadrupler = QuadFactory.get();
    Integer quadded = quadrupler.quadruple(val);
  }

> - Thinking about the person who must use what you produce, regardless whether
> it's a document, a configuration file layout, a DSL, a class, a library. It
> seems many people in software development are far more concerned with the
> inner workings of what they create instead of considering how it will be used.
> Maybe it's easier or it is because making it work takes the largest part of
> coding - still the outcome often is dissatisfying and a little more thought
> upfront goes a long way at avoiding maintenance headaches and worse things.
>
> ...
>
> This can't be so difficult, can it?

You have a fine sense of humor.

It's as easy as empathy, and getting inside the mind of a professional in a 
field of which you know little to nothing.

Testing certainly helps. Functional tests (as opposed to unit tests) can and 
should be part of and produced from requirements. At the user-facing level you 
can send test scenarios back as memos of understanding (MoUs) - what better 
way to make clear what you understand and how it will play out?

Prototypes count, too. They can meet tests, even. Building a prototype that 
meets tests, starting within a day or so of starting, kicks ass for converging 
on clarity. Keep it up - every day the prototype runs, works (up to whatever 
you've done that day), can be checked out and run in any directory, by any 
developer, and run, etc.

Add a CI tool like Jenkins to run your tests every time you "git commit" to 
"origin/master" (or whatever); you got yourself a one-entity professional shop.

<http://www.junit.org/>
<http://www.easymock.org/>
<http://code.google.com/p/android-mock/>
<http://jenkins-ci.org/>

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

[toc] | [prev] | [next] | [standalone]

#14783

From	Robert Klemme <shortcutter@googlemail.com>
Date	2012-05-24 23:05 +0200
Message-ID	<a27m4iFv94U1@mid.individual.net>
In reply to	#14765

On 24.05.2012 08:11, Lew wrote:
> Robert Klemme wrote:

>> And it is true for a number of other techniques I would consider bread
>> and
>> butter tools:
>>
>> - Ensuring requirements are gathered properly and understood before
>> starting
>> to code (and design of course).
>
> True, with provisos.
>
> = Requirements are not fixed. What you should have now is the best
> available for now.

But there is a minimum standard: there must be enough information to at 
least get an idea of what is to be done.  I have seen three sentence 
requirements for a feature in a complex system which left engineers 
wondering what they are supposed to do.

> = "Properly"? "Understood"? There are books on this, but it boils down to:
>
> Your software will be used. By whom? Why? What would annoy them? Don't
> do that! What would make them forget the software is there and just get
> their job done? Do that!

!

> "Proper" is always from the user's standpoint. Not what they say, by the
> way, but what they think and feel - really. Behind the bullshit and
> self-deception, when they aren't posturing for your benefit and really
> are just trying to get their work done. And that's where "understood"
> comes in. Give'em what they want, not necessarily what they ask for.

Pretty early in my career I had a quite instructive experience: we were 
building a web catalog for a business customer with user management, a 
lot of technical data about their products - and of course a link to 
their ERP system.  They wanted to have a price calculation in that 
system which could take several routes depending on the user.  They 
explained it, I wrote a document which also included an UML activity 
diagram of the flow with explanations.  We sent it in.  They rejected: 
"No, it must be completely different."  So we sat together.  End of the 
story: it was exactly what they wanted.  So they either did not 
understood the document and didn't want to ask - or their own 
requirements. :-)

>> - Testing code _before_ shipping.
>
> Some espouse before even writing it. I say, sure, great idea, when you
> make it easy to express my ideas in tests instead of imperative code.
>
> Hmm.
>
> BTW, this being Java, have you guys tried EasyMock? (AndroidMock in
> Android Java.) That shtuff rocks. It makes your very code better, not
> just tested but idiomatically. Tricky stuff. I'll post about that when I
> get back into it. I've discovered a Right Way and a Wrong Way to do
> mocks, just like I did with JPA.

Not yet, thank you for the pointer!

>> - When writing unit tests, making sure to also include tests for critical
>> values (usually corner cases such as -1, 0, 1, limit, limit - 1, limit
>> + 1,
>> null, "", etc.).
>
> Consider this. No matter what input a module receives, it must achieve a
> valid program state in response.
>
> No matter what.

Absolutely!

> Coding to this formalism means, for example, that a method can handle
> all possible argument values.
>
> public interface Caromer
> {
> public Fromitz carom(Flamo flame);
> }
>
> As Herr Klemme points out, 'null' is one possible input for 'flame'.
> Non-'null' is another. Presumably a 'Flamo' has observable state that
> distinguishes input values. If the type were 'Integer' or 'int',
> Robert's example int values might pertain.
>
> But you cannot give infinite test cases to a unit test. So you pick edge
> cases, on both the "happy path" (positive) and "unhappy path" (negative)
> scenarios.

Exactly.

>> - Thinking about the person who must use what you produce, regardless
>> whether
>> it's a document, a configuration file layout, a DSL, a class, a
>> library. It
>> seems many people in software development are far more concerned with the
>> inner workings of what they create instead of considering how it will
>> be used.
>> Maybe it's easier or it is because making it work takes the largest
>> part of
>> coding - still the outcome often is dissatisfying and a little more
>> thought
>> upfront goes a long way at avoiding maintenance headaches and worse
>> things.
>>
>> ...
>>
>> This can't be so difficult, can it?
>
> You have a fine sense of humor.

Well, the fun disappears when you see the same mistakes made over and 
over again and attempts to improve it do not have effects...

> It's as easy as empathy, and getting inside the mind of a professional
> in a field of which you know little to nothing.

I agree for library API design: that takes a bit of experience to get 
right.  But when it comes to writing documents I think the rules for 
readability are quite straight forward (some universities even have them 
in their curriculum): do not start with the details but give the reader 
an overview first.  Give previews and summaries.  Generally watch 
document structure.  Do not put too much in a single diagram.  Things 
like that.

> Testing certainly helps. Functional tests (as opposed to unit tests) can
> and should be part of and produced from requirements. At the user-facing
> level you can send test scenarios back as memos of understanding (MoUs)
> - what better way to make clear what you understand and how it will play
> out?

Good point.

> Prototypes count, too. They can meet tests, even. Building a prototype
> that meets tests, starting within a day or so of starting, kicks ass for
> converging on clarity. Keep it up - every day the prototype runs, works
> (up to whatever you've done that day), can be checked out and run in any
> directory, by any developer, and run, etc.

And prototypes are a great way to test whether ideas work with low 
effort before basing a whole design on an idea.

> Add a CI tool like Jenkins to run your tests every time you "git commit"
> to "origin/master" (or whatever); you got yourself a one-entity
> professional shop.

Jenkins is great.  We use it.  Broken tests are quickly discovered. :-)

Kind regards

	robert

PS: Hey, you managed to sneak in a German word. :-)

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

[toc] | [prev] | [next] | [standalone]

#14788

From	Gene Wirchenko <genew@ocis.net>
Date	2012-05-24 20:11 -0700
Message-ID	<arttr7lumqps4v6h19ihf16a4rqf2d1v32@4ax.com>
In reply to	#14783

On Thu, 24 May 2012 23:05:19 +0200, Robert Klemme
<shortcutter@googlemail.com> wrote:

[snip]

>Pretty early in my career I had a quite instructive experience: we were 
>building a web catalog for a business customer with user management, a 
>lot of technical data about their products - and of course a link to 
>their ERP system.  They wanted to have a price calculation in that 
>system which could take several routes depending on the user.  They 
>explained it, I wrote a document which also included an UML activity 
>diagram of the flow with explanations.  We sent it in.  They rejected: 
>"No, it must be completely different."  So we sat together.  End of the 
>story: it was exactly what they wanted.  So they either did not 
>understood the document and didn't want to ask - or their own 
>requirements. :-)
               ^^^
     What is this?

     (I think you called it right, no joking.)

     A UML diagram might as well be code.  It is something of ours.
Why expect them to be able to read it?

     As to possibly not knowing their own requirements, that is quite
possible.  Many a person knows how to do his job as it is currently
configured.  That does not mean that he knows how to discuss the whys
and wherefores of it.

[snip]

Sincerely,

Gene Wirchenko

[toc] | [prev] | [next] | [standalone]

#14794

From	Robert Klemme <shortcutter@googlemail.com>
Date	2012-05-25 17:42 +0200
Message-ID	<a29njiFto3U1@mid.individual.net>
In reply to	#14788

On 25.05.2012 05:11, Gene Wirchenko wrote:
>       A UML diagram might as well be code.  It is something of ours.
> Why expect them to be able to read it?

I did not expect them to read it out of the box.  But it was embedded in 
a document and accompanied with explanation of the process.  Also, if 
they did not understand it, my expectation would have been for them to 
ask.  But they simply rejected it as if they had understood it.  It 
always takes two sides for successful communication - in this case we 
luckily resolved it but I think generally this communication pattern is 
more likely to produce errors than success.

>       As to possibly not knowing their own requirements, that is quite
> possible.  Many a person knows how to do his job as it is currently
> configured.  That does not mean that he knows how to discuss the whys
> and wherefores of it.

That is true.  Doing and reasoning about what one does are really two 
things.  However, in this case they actually _had_ described their 
requirements - and even in a way that I was capable of coming up with 
the proper algorithm.  This means a certain amount of thought must have 
been done already.  They just did not recognize it when they saw the 
result and for an unknown reason immediately jumped to a conclusion.

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

[toc] | [prev] | [next] | [standalone]

#14792

From	Arved Sandstrom <asandstrom3minus1@eastlink.ca>
Date	2012-05-25 07:09 -0300
Message-ID	<fTIvr.34159$_l.17077@newsfe15.iad>
In reply to	#14744

On 12-05-21 04:15 PM, Robert Klemme wrote:
>
[ SNIP ]>
> And it is true for a number of other techniques I would consider bread
> and butter tools:
>
> - Ensuring requirements are gathered properly and understood before
> starting to code (and design of course).

I want to blame an improper understanding of agile and lean and 
iterative here, but I won't. On the other hand, I do think that a 
widespread kneejerk rejection of waterfall has also led, in many cases, 
to people rejecting formal requirements analysis and intensive BA work. 
Maybe they've read enough about agile to think that they'll find out 
everything they need to find out when it's time to build Slice 3, and 
it's OK to find out about Slice 3 requirements then, not upfront.

My philosophy - perhaps old-school - is based on these observations over 
the years:

1) In the vast majority of projects of all sizes, the large majority of 
requirements were known and identifiable at the beginning;

2) Corollary to #1: the large majority of requirements do not change 
over the lifetime of a project, of any size. They may be identified 
late, or not at all (before release), but they were always knowable;

3) An impression of changing requirements is almost always due to 
imprecise stakeholder/BA/architect/designer interactions and 
conversations. If you get one answer in April and a different one in 
October, it's either down to who you are asking or what your questions 
are, or both. The actual final "real" requirement probably never changed.

As I say, these are my observations. YMMV. In my case I do substantially 
more requirements analysis up front than maybe other folks do. It has 
served me well. I also do a lot of prototyping and wire-framing and 
proofs-of-concept/technology, all of which in some developer camps seem 
to have been relegated.

> - Testing code _before_ shipping.
>
> - When writing unit tests, making sure to also include tests for
> critical values (usually corner cases such as -1, 0, 1, limit, limit -
> 1, limit + 1, null, "", etc.).

Testing is a different art. There is a lot of truth to the saying that a 
good tester is a mediocre coder, and vice versa. At a minimum, if a 
single person does have the expertise to do both, they need to be 
schizophrenic.

Although I've not read anything about it, I've occasionally thought that 
by generally accepted testing principles, that a person who writes unit 
tests (just like any other tests) cannot, and should not, be the 
developer. And yet it's the developer who invariably writes unit tests 
for his own code. Food for thought.

> - Thinking about the person who must use what you produce, regardless
> whether it's a document, a configuration file layout, a DSL, a class, a
> library. It seems many people in software development are far more
> concerned with the inner workings of what they create instead of
> considering how it will be used. Maybe it's easier or it is because
> making it work takes the largest part of coding - still the outcome
> often is dissatisfying and a little more thought upfront goes a long way
> at avoiding maintenance headaches and worse things.
>
> ...
See my thoughts about prototypes, wireframes, Pocs/PoTs above. Using 
those methods I've saved a lot of time over the years. All of them show 
the end-user a good impression of *their* final interface. Which is what 
matters.
>
> This can't be so difficult, can it?
>
> Kind regards
>
> robert
>
>
AHS

-- 
Never interrupt your enemy when he is making a mistake.
--Napoleon

[toc] | [prev] | [next] | [standalone]

#14734

From	Silvio Bierman <silvio@moc.com>
Date	2012-05-21 15:21 +0200
Message-ID	<4fba416a$0$6950$e4fe514c@news2.news.xs4all.nl>
In reply to	#14633

On 05/18/2012 11:45 PM, Robert Klemme wrote:
> On 18.05.2012 20:42, Silvio Bierman wrote:
>> On 05/17/2012 11:54 AM, Robert Klemme wrote:
>
>>> I provide a variant of Silvio's, Eric's and Daniel's solution which
>>> should yield higher throughput because it works without read write
>>> locking. You can find it as gist in case the code is garbled in the
>>> newsgroup posting:
>>> https://gist.github.com/2717818
>
>> I think you have as many locks as I suggested (being one)? My initial
>> implementations of something like this used a plain map with an extra
>> lock but later cases used the by then available ConcurrentHashMap as
>> well, making one lock redundant.
>
> You didn't show it here, did you? I can's seem to find it in the thread.
> Note that CHM does also do synchronization. I am not sure from your
> statement what exact locking scheme you apply. There does seem to be one
> difference though: I my version the second lock goes away after the
> value has been computed so there is only the sync of CHM left.
>
> Kind regards
>
> robert
>

Actually, I didn't, and I did mention the word "lock" alongside the 
wrapper object.

Since the only code I have at hand is a Scala version there was little 
point in posting it. It uses a Scala "lazy val" in the wrapper object 
which translates to a compiler generated DCL (assuming 1.5+). So the 
only lock that would remain would be the CHM one.

I vaguely remember a DCL in one of the more recent Java versions I 
coughed up.

[toc] | [prev] | [next] | [standalone]

#14789

From	Kevin McMurtrie <mcmurtrie@pixelmemory.us>
Date	2012-05-24 23:22 -0700
Message-ID	<4fbf252e$0$87578$742ec2ed@news.sonic.net>
In reply to	#14544

In article <zsWdnVufD87Egy_SnZ2dnUVZ7oednZ2d@brightview.co.uk>,
 bugbear <bugbear@trim_papermule.co.uk_trim> wrote:

> I'm using various Apache Commons Maps as a
> multithread cache, protected using
> ReentrantReadWriteLock, so that getting() uses a read lock,
> and putting() uses a write lock.
> 
> But I've got an issue; in the
> case of a cache miss (protected by a read lock),
> the required value is acquired using the "underlying function"
> that the cache is over; this value is then put() into
> the cache (protected by a write lock)
> 
> This is all perfectly thread safe, and gives
> correct results.
> 
> However, if the underlying function is slow
> and/or resource hungry (consider cacheing
> a ray traced image!) many threads can
> end up calling the real function (second
> and subsequent threads to the first get a miss
> during the first threads call to the underlying function).
> 
> "obviously..." what I want is for only
> the thread that FIRST has a cache miss
> calls the underlying function, whilst other
> threads (for the same key) wait.
> 
> This seems "obvious" enough that I'm guessing
> there's a standard solution.
> 
> Googling led me to several "heavy" libraries;
> 
> This appears more a locking/cacheing issue
> than a Java issue, although my implementation
> is in Java.
> 
> Can anyone (please) point me at a
> canonical solution/implementation?
> 
>   BugBear

Ha, this is an interview question that I use.

What you need is row level locking for the cache load.

Step 1)
Use synchronized operations to map your key to a value; creating an 
uninitialized value in the map if needed.  Use whatever tech you want.  
A synchronized block on a HashMap is simplest and performs the fastest 
on 1 or 2 core systems.  A ConcurrentHashMap sometimes performs better 
with 4+ core systems.

Step 2)
Synchronize on the value.  Initialize it if needed.

Step 1 blocks all cache access for only for a very short moment to make 
sure that a key always has a value.  Step 2 blocks access independently 
for each cache value to make sure that it is loaded.  It will perform 
well for continuous use by several CPU cores.  Google has some high 
concurrency Maps that aren't too bad either.

In the 16 core range you'll find that any kind of exclusive lock causes 
stalls where threads suspend while holding locks, causing a backlog that 
reinforces itself. A concurrency expert can fix that using complex 
Compare-And-Swap designs.  In the 32+ core range you'll sometimes find 
that harmless race conditions are better than multiple threads 
frequently writing to the same memory page.
-- 
I will not see posts from Google because I must filter them as spam

[toc] | [prev] | [next] | [standalone]

#14793

From	Robert Klemme <shortcutter@googlemail.com>
Date	2012-05-25 17:36 +0200
Message-ID	<a29n7cFr61U1@mid.individual.net>
In reply to	#14789

On 25.05.2012 08:22, Kevin McMurtrie wrote:
> Ha, this is an interview question that I use.
>
> What you need is row level locking for the cache load.

But not all the time.  See https://gist.github.com/2717818

> Step 1)
> Use synchronized operations to map your key to a value; creating an
> uninitialized value in the map if needed.  Use whatever tech you want.
> A synchronized block on a HashMap is simplest and performs the fastest
> on 1 or 2 core systems.  A ConcurrentHashMap sometimes performs better
> with 4+ core systems.

In my experience a CHM performs better even on a 1 or 2 core system.

> Step 2)
> Synchronize on the value.  Initialize it if needed.
>
>
> Step 1 blocks all cache access for only for a very short moment to make
> sure that a key always has a value.  Step 2 blocks access independently
> for each cache value to make sure that it is loaded.  It will perform
> well for continuous use by several CPU cores.  Google has some high
> concurrency Maps that aren't too bad either.

Actually once the value is in the cache you do not need any step 2 
synchronization any more.

> In the 16 core range you'll find that any kind of exclusive lock causes
> stalls where threads suspend while holding locks, causing a backlog that
> reinforces itself. A concurrency expert can fix that using complex
> Compare-And-Swap designs.

Basically this is what CHM does with putIfAbsent() internally.

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

[toc] | [prev] | [next] | [standalone]

#15008

From	Kevin McMurtrie <mcmurtrie@pixelmemory.us>
Date	2012-06-02 12:27 -0700
Message-ID	<4fca6924$0$35601$742ec2ed@news.sonic.net>
In reply to	#14793

In article <a29n7cFr61U1@mid.individual.net>,
 Robert Klemme <shortcutter@googlemail.com> wrote:

> On 25.05.2012 08:22, Kevin McMurtrie wrote:
> > Ha, this is an interview question that I use.
> >
> > What you need is row level locking for the cache load.
> 
> But not all the time.  See https://gist.github.com/2717818

The original post stated that duplicate element creation was too 
expensive.  That code may create duplicate elements for a key and 
discard the extras.

> 
> > Step 1)
> > Use synchronized operations to map your key to a value; creating an
> > uninitialized value in the map if needed.  Use whatever tech you want.
> > A synchronized block on a HashMap is simplest and performs the fastest
> > on 1 or 2 core systems.  A ConcurrentHashMap sometimes performs better
> > with 4+ core systems.
> 
> In my experience a CHM performs better even on a 1 or 2 core system.

This depends on usage patterns.  The Java 1.5 Concurrency locks used in 
some parts of CHM are very slow compared to synchronization.  The 
Concurrency locks only have a chance of performing better when there are 
a lot of concurrent shared read locks.

> > Step 2)
> > Synchronize on the value.  Initialize it if needed.
> >
> >
> > Step 1 blocks all cache access for only for a very short moment to make
> > sure that a key always has a value.  Step 2 blocks access independently
> > for each cache value to make sure that it is loaded.  It will perform
> > well for continuous use by several CPU cores.  Google has some high
> > concurrency Maps that aren't too bad either.
> 
> Actually once the value is in the cache you do not need any step 2 
> synchronization any more.

Step 2 is initializing the value.  It's critical to synchronize there or 
the second thread to request the same key may see an element that is not 
yet fully initialized.  (Again, the original post stated that elements 
were expensive to create.)

> 
> > In the 16 core range you'll find that any kind of exclusive lock causes
> > stalls where threads suspend while holding locks, causing a backlog that
> > reinforces itself. A concurrency expert can fix that using complex
> > Compare-And-Swap designs.
> 
> Basically this is what CHM does with putIfAbsent() internally.

Yes, but you don't have access to its internal container class so CHM is 
not much use for caching elements that are very expensive to create.  
The Googlely version does provide a way to initialize the container with 
a factory callback; combining steps 1 and 2 here.

> 
> Kind regards
> 
> 	robert
-- 
I will not see posts from Google because I must filter them as spam

[toc] | [prev] | [next] | [standalone]

#15010

From	Robert Klemme <shortcutter@googlemail.com>
Date	2012-06-02 21:54 +0200
Message-ID	<a2v9biF8s3U1@mid.individual.net>
In reply to	#15008

On 02.06.2012 21:27, Kevin McMurtrie wrote:
> In article<a29n7cFr61U1@mid.individual.net>,
>   Robert Klemme<shortcutter@googlemail.com>  wrote:
>
>> On 25.05.2012 08:22, Kevin McMurtrie wrote:
>>> Ha, this is an interview question that I use.
>>>
>>> What you need is row level locking for the cache load.
>>
>> But not all the time.  See https://gist.github.com/2717818
>
> The original post stated that duplicate element creation was too
> expensive.  That code may create duplicate elements for a key and
> discard the extras.

No, it does not create duplicate elements and hence does not discard 
them.  The only thing which might be duplicated is the proxy instance 
which calls the factory method (created in line 98).  But that is cheap. 
  The design ensures that a value is only calculated once per key and 
neither CPU is wasted nor duplicate value objects.  Please see Lew's 
excellent explanation of what happens or look at the code again.

>>> Step 1)
>>> Use synchronized operations to map your key to a value; creating an
>>> uninitialized value in the map if needed.  Use whatever tech you want.
>>> A synchronized block on a HashMap is simplest and performs the fastest
>>> on 1 or 2 core systems.  A ConcurrentHashMap sometimes performs better
>>> with 4+ core systems.
>>
>> In my experience a CHM performs better even on a 1 or 2 core system.
>
> This depends on usage patterns.  The Java 1.5 Concurrency locks used in
> some parts of CHM are very slow compared to synchronization.  The
> Concurrency locks only have a chance of performing better when there are
> a lot of concurrent shared read locks.

Even if they are slower than synchronization they perform better than a 
single lock on the whole Map (what you suggested above) with multiple 
threads even on a few core machine just because there are more of them 
and the whole Map is partitioned.  I haven't strictly measured the 
synchronization mechanisms used inside CHM but I did also not notice bad 
timing (and frankly, I cannot believe Doug Lea would have used 
inefficient mechanisms).  Bottom line: when in doubt measure.

>>> Step 2)
>>> Synchronize on the value.  Initialize it if needed.
>>>
>>>
>>> Step 1 blocks all cache access for only for a very short moment to make
>>> sure that a key always has a value.  Step 2 blocks access independently
>>> for each cache value to make sure that it is loaded.  It will perform
>>> well for continuous use by several CPU cores.  Google has some high
>>> concurrency Maps that aren't too bad either.
>>
>> Actually once the value is in the cache you do not need any step 2
>> synchronization any more.
>
> Step 2 is initializing the value.  It's critical to synchronize there or
> the second thread to request the same key may see an element that is not
> yet fully initialized.  (Again, the original post stated that elements
> were expensive to create.)

Maybe my wording was not clear enough.  Once the value is in the Map 
synchronization is needed no more for this key and only the Map level 
locking remains.  Again, please look at the code.

>>> In the 16 core range you'll find that any kind of exclusive lock causes
>>> stalls where threads suspend while holding locks, causing a backlog that
>>> reinforces itself. A concurrency expert can fix that using complex
>>> Compare-And-Swap designs.
>>
>> Basically this is what CHM does with putIfAbsent() internally.
>
> Yes, but you don't have access to its internal container class so CHM is
> not much use for caching elements that are very expensive to create.

I beg to differ (see code).

> The Googlely version does provide a way to initialize the container with
> a factory callback; combining steps 1 and 2 here.

Which version?  Please share a reference.  Thank you!

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

[toc] | [prev] | [standalone]

Page 2 of 2 — ← Prev page 1 [2]

csiph-web

multithreaded cache?

Contents

#14659

#14662

#14628

#14633

#14636

#14637

#14641

#14640

#14653

#14744

#14765

#14783

#14788

#14794

#14792

#14734

#14789

#14793

#15008

#15010