Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.java.programmer > #14662

Re: multithreaded cache?

From Lew <noone@lewscanon.com>
Newsgroups comp.lang.java.programmer
Subject Re: multithreaded cache?
Date 2012-05-19 18:11 -0700
Organization albasani.net
Message-ID <jp9gc1$7s3$1@news.albasani.net> (permalink)
References <zsWdnVufD87Egy_SnZ2dnUVZ7oednZ2d@brightview.co.uk> <a1k06fFtdtU1@mid.individual.net> <a1k0taF206U1@mid.individual.net> <jp9anv$a6a$1@news.albasani.net> <jp9bfh$b9f$1@news.albasani.net>

Show all headers | View raw


Sebastian wrote:
> schrieb Sebastian:
>> The important detail obiously is the extra check with the "here" flag.
>> Would you mind explaining why that is necessary? Everyone seems to have
>> got it, but I must admit I haven't. Why is it not enough that
>> Reference#get() is synchronized? As you yourself have missed this at
>> your first attempt, I hope the reason isn't trivial...
>>
> I think I've got it - the check prevents double calculation of the value by
> two threads calling get() on that same reference instance.
> Which may occur if putIfAbsent() returns a non-null value before the
> "calculating reference" has had time to replace itself with the
> constant reference. Is that right? -- Sebastian

Yep.

https://gist.github.com/2717818

============
LazyCache
============

   public V get(final K key) {
     Reference<V> ref = map.get(key);

     if (ref == null) {
       // miss
       ref = new Reference<V>() {
         private V val;
         private boolean here = false; // superfluous but explicit

         @Override
         public synchronized V get() {
           if (!here) {
             val = calc.get(key);
             here = true;
             // next time lock free access:
             Reference<V> x = map.put(key, new Ref<V>(val));
             assert x == this;
           }

           return val;
         }
       };

       final Reference<V> old = map.putIfAbsent(key, ref);

       if (old != null)
         ref = old; // someone else was faster
     }

     return ref.get();
   }

============

The first cluster of requestors to 'LazyCache#get(K key)' enter the
'if (ref == null) {' block.

They all construct a new anonymous-type instance of 'Reference<V>'.

The first one to hit 'putIfAbsent()' wins. The rest replace their 
painstakingly - actually very, very quickly - contructed anonymous-type 
'Reference<V>' instances with the anonymous-type instance the first client put 
into the 'Map'.

So only one 'Reference<V>' comes back from all the clients that first hit the 
map and saw 'null'.

It is of the anonymous type.

The purpose of the 'here' variable is for the anonymous-type instance. That 
one-and-only possible instance can tell that it has never been 'here' before, 
i.e., inside the 'get()' method.

The next batch of 'LazyCache' clients hit the map when that anonymous-type 
instance is still in there.

They all call 'get()' on that anonymous-type instance. First one sees 'here' 
is still false.  Up until now, no one has called the expensive operation 
inside of 'calc'. No matter how many threads hit the map when the value was 
'null', and no matter how many of them hit it when the map contained the 
anonymous-type instance, all those threads have to wait until that one and 
only instance calls 'get()' in turn, and the first one of that whole cluster 
is the only one to see '!here'.

It calls the expensive operation and replaces its own internal value with the 
newly-calculated one.

In addition, it replaces itself in the map with a simple, final-value holder.

All the threads currently blocked on the anonymous-type instance 'get()' will 
see the stashed value.

None of them will call the expensive operation again.

Any clients thereafter will hit the 'LazyCache' map and find only the simple, 
final-value named-type instance, whose 'get()' returns the pre-calculated 
value in a completely thread-safe manner without any more locking.

Once the named-type instance is in the map, and all threads that found the 
anonymous one finish their 'get()' calls, the anonymous-type instance is 
eligible for garbage collection, thus letting go of an extra reference to the 
calculator. I've chased that rabbit down the hole a bit without finding a real 
danger to the circular reference in this scenario, but it's never bad to break 
reference-graph cycles. They are often a major bug breeding ground, especially 
on resource managers and types with non-trivial 'finalize()' methods.

Again, while here the lifetime of the calculator is that of the 'LazyCache' 
instance, it's good practice to decouple value clients from their producers' 
finalizers. It simplifies memory management. See Josh Bloch's /Effective Java/ 
for details on finalizers.

You get tremendous results from the twin engines of scope and lifetime management.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

Back to comp.lang.java.programmer | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

multithreaded cache? bugbear <bugbear@trim_papermule.co.uk_trim> - 2012-05-15 10:14 +0100
  Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-15 11:41 +0200
    Re: multithreaded cache? Roedy Green <see_website@mindprod.com.invalid> - 2012-05-15 15:57 -0700
      Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-15 16:13 -0700
        Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 01:22 +0200
      Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 01:19 +0200
        Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-15 20:26 -0400
          Re: multithreaded cache? Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-05-15 21:41 -0300
          Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 09:26 +0200
            Re: multithreaded cache? "Mike Schilling" <mscottschilling@hotmail.com> - 2012-05-26 14:54 -0700
        Re: multithreaded cache? Roedy Green <see_website@mindprod.com.invalid> - 2012-05-16 22:03 -0700
          Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-21 15:25 +0200
  Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-15 08:22 -0400
  Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-15 09:56 -0700
  Re: multithreaded cache? bugbear <bugbear@trim_papermule.co.uk_trim> - 2012-05-16 14:33 +0100
  Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 11:54 +0200
    Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 12:06 +0200
      Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-17 09:51 -0700
        Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 20:48 +0200
      Re: multithreaded cache? Sebastian <sebastian@undisclosed.invalid> - 2012-05-20 01:35 +0200
        Re: multithreaded cache? Sebastian <sebastian@undisclosed.invalid> - 2012-05-20 01:48 +0200
          Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-19 18:11 -0700
    Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-18 20:42 +0200
      Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-18 23:45 +0200
        Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-18 18:31 -0400
          Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-18 22:15 -0700
            Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-19 07:09 -0400
          Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-19 12:33 +0200
            Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-19 14:24 -0700
              Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-21 21:15 +0200
                Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-23 23:11 -0700
                Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-24 23:05 +0200
                Re: multithreaded cache? Gene Wirchenko <genew@ocis.net> - 2012-05-24 20:11 -0700
                Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-25 17:42 +0200
                Re: multithreaded cache? Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-05-25 07:09 -0300
        Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-21 15:21 +0200
  Re: multithreaded cache? Kevin McMurtrie <mcmurtrie@pixelmemory.us> - 2012-05-24 23:22 -0700
    Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-25 17:36 +0200
      Re: multithreaded cache? Kevin McMurtrie <mcmurtrie@pixelmemory.us> - 2012-06-02 12:27 -0700
        Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-06-02 21:54 +0200

csiph-web