Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.java.programmer > #14544 > unrolled thread
| Started by | bugbear <bugbear@trim_papermule.co.uk_trim> |
|---|---|
| First post | 2012-05-15 10:14 +0100 |
| Last post | 2012-06-02 21:54 +0200 |
| Articles | 20 on this page of 40 — 12 participants |
Back to article view | Back to comp.lang.java.programmer
multithreaded cache? bugbear <bugbear@trim_papermule.co.uk_trim> - 2012-05-15 10:14 +0100
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-15 11:41 +0200
Re: multithreaded cache? Roedy Green <see_website@mindprod.com.invalid> - 2012-05-15 15:57 -0700
Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-15 16:13 -0700
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 01:22 +0200
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 01:19 +0200
Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-15 20:26 -0400
Re: multithreaded cache? Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-05-15 21:41 -0300
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 09:26 +0200
Re: multithreaded cache? "Mike Schilling" <mscottschilling@hotmail.com> - 2012-05-26 14:54 -0700
Re: multithreaded cache? Roedy Green <see_website@mindprod.com.invalid> - 2012-05-16 22:03 -0700
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-21 15:25 +0200
Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-15 08:22 -0400
Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-15 09:56 -0700
Re: multithreaded cache? bugbear <bugbear@trim_papermule.co.uk_trim> - 2012-05-16 14:33 +0100
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 11:54 +0200
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 12:06 +0200
Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-17 09:51 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 20:48 +0200
Re: multithreaded cache? Sebastian <sebastian@undisclosed.invalid> - 2012-05-20 01:35 +0200
Re: multithreaded cache? Sebastian <sebastian@undisclosed.invalid> - 2012-05-20 01:48 +0200
Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-19 18:11 -0700
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-18 20:42 +0200
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-18 23:45 +0200
Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-18 18:31 -0400
Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-18 22:15 -0700
Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-19 07:09 -0400
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-19 12:33 +0200
Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-19 14:24 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-21 21:15 +0200
Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-23 23:11 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-24 23:05 +0200
Re: multithreaded cache? Gene Wirchenko <genew@ocis.net> - 2012-05-24 20:11 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-25 17:42 +0200
Re: multithreaded cache? Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-05-25 07:09 -0300
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-21 15:21 +0200
Re: multithreaded cache? Kevin McMurtrie <mcmurtrie@pixelmemory.us> - 2012-05-24 23:22 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-25 17:36 +0200
Re: multithreaded cache? Kevin McMurtrie <mcmurtrie@pixelmemory.us> - 2012-06-02 12:27 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-06-02 21:54 +0200
Page 1 of 2 [1] 2 Next page →
| From | bugbear <bugbear@trim_papermule.co.uk_trim> |
|---|---|
| Date | 2012-05-15 10:14 +0100 |
| Subject | multithreaded cache? |
| Message-ID | <zsWdnVufD87Egy_SnZ2dnUVZ7oednZ2d@brightview.co.uk> |
I'm using various Apache Commons Maps as a multithread cache, protected using ReentrantReadWriteLock, so that getting() uses a read lock, and putting() uses a write lock. But I've got an issue; in the case of a cache miss (protected by a read lock), the required value is acquired using the "underlying function" that the cache is over; this value is then put() into the cache (protected by a write lock) This is all perfectly thread safe, and gives correct results. However, if the underlying function is slow and/or resource hungry (consider cacheing a ray traced image!) many threads can end up calling the real function (second and subsequent threads to the first get a miss during the first threads call to the underlying function). "obviously..." what I want is for only the thread that FIRST has a cache miss calls the underlying function, whilst other threads (for the same key) wait. This seems "obvious" enough that I'm guessing there's a standard solution. Googling led me to several "heavy" libraries; This appears more a locking/cacheing issue than a Java issue, although my implementation is in Java. Can anyone (please) point me at a canonical solution/implementation? BugBear
[toc] | [next] | [standalone]
| From | Silvio Bierman <silvio@moc.com> |
|---|---|
| Date | 2012-05-15 11:41 +0200 |
| Message-ID | <4fb224b2$0$6867$e4fe514c@news2.news.xs4all.nl> |
| In reply to | #14544 |
On 05/15/2012 11:14 AM, bugbear wrote: > I'm using various Apache Commons Maps as a > multithread cache, protected using > ReentrantReadWriteLock, so that getting() uses a read lock, > and putting() uses a write lock. > > But I've got an issue; in the > case of a cache miss (protected by a read lock), > the required value is acquired using the "underlying function" > that the cache is over; this value is then put() into > the cache (protected by a write lock) > > This is all perfectly thread safe, and gives > correct results. > > However, if the underlying function is slow > and/or resource hungry (consider cacheing > a ray traced image!) many threads can > end up calling the real function (second > and subsequent threads to the first get a miss > during the first threads call to the underlying function). > > "obviously..." what I want is for only > the thread that FIRST has a cache miss > calls the underlying function, whilst other > threads (for the same key) wait. > > This seems "obvious" enough that I'm guessing > there's a standard solution. > > Googling led me to several "heavy" libraries; > > This appears more a locking/cacheing issue > than a Java issue, although my implementation > is in Java. > > Can anyone (please) point me at a > canonical solution/implementation? > > BugBear The simplest approach is to wrap your cached values in an object and make all cache access go through two stages. Stage one is to lookup the wrapper in the cache, possibly inserting a new (empty) wrapper if not present. Stage two is to get a lock on the wrapper to get the actual cache value. If that is not present (or perhaps stale) then the thread owing the lock on the wrapper can recompute the value. If you cache values that are slow to compute this is usually the way to go.
[toc] | [prev] | [next] | [standalone]
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2012-05-15 15:57 -0700 |
| Message-ID | <rpn5r7lbt7ckkj0l67n7us7m9srs2plprp@4ax.com> |
| In reply to | #14545 |
On Tue, 15 May 2012 11:41:06 +0200, Silvio Bierman <silvio@moc.com> wrote, quoted or indirectly quoted someone who said : >The simplest approach is to wrap your cached values in an object and >make all cache access go through two stages. Where did you learn this? -- Roedy Green Canadian Mind Products http://mindprod.com Programmers love to create simplified replacements for HTML. They forget that the simplest language is the one you already know. They also forget that their simple little markup language will bit by bit become even more convoluted and complicated than HTML because of the unplanned way it grows. .
[toc] | [prev] | [next] | [standalone]
| From | Daniel Pitts <newsgroup.nospam@virtualinfinity.net> |
|---|---|
| Date | 2012-05-15 16:13 -0700 |
| Message-ID | <qqBsr.11866$3y3.7423@newsfe20.iad> |
| In reply to | #14553 |
On 5/15/12 3:57 PM, Roedy Green wrote: > On Tue, 15 May 2012 11:41:06 +0200, Silvio Bierman<silvio@moc.com> > wrote, quoted or indirectly quoted someone who said : > >> The simplest approach is to wrap your cached values in an object and >> make all cache access go through two stages. > > Where did you learn this? I think he meant "an easy efficient approach", rather than "the simplest". The simplest of course is a global lock for the cache.
[toc] | [prev] | [next] | [standalone]
| From | Silvio Bierman <silvio@moc.com> |
|---|---|
| Date | 2012-05-16 01:22 +0200 |
| Message-ID | <4fb2e52f$0$6854$e4fe514c@news2.news.xs4all.nl> |
| In reply to | #14554 |
On 05/16/2012 01:13 AM, Daniel Pitts wrote: > On 5/15/12 3:57 PM, Roedy Green wrote: >> On Tue, 15 May 2012 11:41:06 +0200, Silvio Bierman<silvio@moc.com> >> wrote, quoted or indirectly quoted someone who said : >> >>> The simplest approach is to wrap your cached values in an object and >>> make all cache access go through two stages. >> >> Where did you learn this? > I think he meant "an easy efficient approach", rather than "the simplest". > > The simplest of course is a global lock for the cache. Yep, with "simplest" I meant the simplest way to get around the mentioned disadvantage of a global lock.
[toc] | [prev] | [next] | [standalone]
| From | Silvio Bierman <silvio@moc.com> |
|---|---|
| Date | 2012-05-16 01:19 +0200 |
| Message-ID | <4fb2e46c$0$6854$e4fe514c@news2.news.xs4all.nl> |
| In reply to | #14553 |
On 05/16/2012 12:57 AM, Roedy Green wrote: > On Tue, 15 May 2012 11:41:06 +0200, Silvio Bierman<silvio@moc.com> > wrote, quoted or indirectly quoted someone who said : > >> The simplest approach is to wrap your cached values in an object and >> make all cache access go through two stages. > > Where did you learn this? Why do you ask? Do you disagree? It is something I came up with the first time I encountered the same situation, probably many years ago. It is obvious you need to decouple key lookups/insertions, which are inherently cache-global, from value insertions/updates, which could be considered key-local. Eric and Daniel responded with more specific solutions that both match my general pattern.
[toc] | [prev] | [next] | [standalone]
| From | Eric Sosman <esosman@ieee-dot-org.invalid> |
|---|---|
| Date | 2012-05-15 20:26 -0400 |
| Message-ID | <jous86$cc7$2@dont-email.me> |
| In reply to | #14555 |
On 5/15/2012 7:19 PM, Silvio Bierman wrote:
> On 05/16/2012 12:57 AM, Roedy Green wrote:
>> On Tue, 15 May 2012 11:41:06 +0200, Silvio Bierman<silvio@moc.com>
>> wrote, quoted or indirectly quoted someone who said :
>>
>>> The simplest approach is to wrap your cached values in an object and
>>> make all cache access go through two stages.
>>
>> Where did you learn this?
>
> Why do you ask? Do you disagree?
>
> It is something I came up with the first time I encountered the same
> situation, probably many years ago.
> It is obvious you need to decouple key lookups/insertions, which are
> inherently cache-global, from value insertions/updates, which could be
> considered key-local.
>
> Eric and Daniel responded with more specific solutions that both match
> my general pattern.
Three independent votes for one pattern: What we have here is a
M A N D A T E !!!
--
Eric Sosman
esosman@ieee-dot-org.invalid
[toc] | [prev] | [next] | [standalone]
| From | Arved Sandstrom <asandstrom3minus1@eastlink.ca> |
|---|---|
| Date | 2012-05-15 21:41 -0300 |
| Message-ID | <DICsr.2195$9Q6.238@newsfe18.iad> |
| In reply to | #14558 |
On 12-05-15 09:26 PM, Eric Sosman wrote: > On 5/15/2012 7:19 PM, Silvio Bierman wrote: >> On 05/16/2012 12:57 AM, Roedy Green wrote: >>> On Tue, 15 May 2012 11:41:06 +0200, Silvio Bierman<silvio@moc.com> >>> wrote, quoted or indirectly quoted someone who said : >>> >>>> The simplest approach is to wrap your cached values in an object and >>>> make all cache access go through two stages. >>> >>> Where did you learn this? >> >> Why do you ask? Do you disagree? >> >> It is something I came up with the first time I encountered the same >> situation, probably many years ago. >> It is obvious you need to decouple key lookups/insertions, which are >> inherently cache-global, from value insertions/updates, which could be >> considered key-local. >> >> Eric and Daniel responded with more specific solutions that both match >> my general pattern. > > Three independent votes for one pattern: What we have here is a > > M A N D A T E !!! > Dunno about that, but I did go out and buy a lottery ticket... AHS -- Never interrupt your enemy when he is making a mistake. --Napoleon
[toc] | [prev] | [next] | [standalone]
| From | Silvio Bierman <silvio@moc.com> |
|---|---|
| Date | 2012-05-16 09:26 +0200 |
| Message-ID | <4fb35691$0$6944$e4fe514c@news2.news.xs4all.nl> |
| In reply to | #14558 |
On 05/16/2012 02:26 AM, Eric Sosman wrote: > On 5/15/2012 7:19 PM, Silvio Bierman wrote: >> On 05/16/2012 12:57 AM, Roedy Green wrote: >>> On Tue, 15 May 2012 11:41:06 +0200, Silvio Bierman<silvio@moc.com> >>> wrote, quoted or indirectly quoted someone who said : >>> >>>> The simplest approach is to wrap your cached values in an object and >>>> make all cache access go through two stages. >>> >>> Where did you learn this? >> >> Why do you ask? Do you disagree? >> >> It is something I came up with the first time I encountered the same >> situation, probably many years ago. >> It is obvious you need to decouple key lookups/insertions, which are >> inherently cache-global, from value insertions/updates, which could be >> considered key-local. >> >> Eric and Daniel responded with more specific solutions that both match >> my general pattern. > > Three independent votes for one pattern: What we have here is a > > M A N D A T E !!! > Let's hope this is inspires the Greeks...
[toc] | [prev] | [next] | [standalone]
| From | "Mike Schilling" <mscottschilling@hotmail.com> |
|---|---|
| Date | 2012-05-26 14:54 -0700 |
| Message-ID | <jprjfe$1je$1@dont-email.me> |
| In reply to | #14561 |
"Silvio Bierman" <silvio@moc.com> wrote in message news:4fb35691$0$6944$e4fe514c@news2.news.xs4all.nl... > On 05/16/2012 02:26 AM, Eric Sosman wrote: >> On 5/15/2012 7:19 PM, Silvio Bierman wrote: >>> On 05/16/2012 12:57 AM, Roedy Green wrote: >>>> On Tue, 15 May 2012 11:41:06 +0200, Silvio Bierman<silvio@moc.com> >>>> wrote, quoted or indirectly quoted someone who said : >>>> >>>>> The simplest approach is to wrap your cached values in an object and >>>>> make all cache access go through two stages. >>>> >>>> Where did you learn this? >>> >>> Why do you ask? Do you disagree? >>> >>> It is something I came up with the first time I encountered the same >>> situation, probably many years ago. >>> It is obvious you need to decouple key lookups/insertions, which are >>> inherently cache-global, from value insertions/updates, which could be >>> considered key-local. >>> >>> Eric and Daniel responded with more specific solutions that both match >>> my general pattern. >> >> Three independent votes for one pattern: What we have here is a >> >> M A N D A T E !!! >> > > Let's hope this is inspires the Greeks... They've been dating men for millenia.
[toc] | [prev] | [next] | [standalone]
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2012-05-16 22:03 -0700 |
| Message-ID | <rh19r7th8q0r7cvvbn9ldmd2forprodisi@4ax.com> |
| In reply to | #14555 |
On Wed, 16 May 2012 01:19:07 +0200, Silvio Bierman <silvio@moc.com> wrote, quoted or indirectly quoted someone who said : > >Why do you ask? Do you disagree? No. I thought there were two possibilities. There is a some great book you read or website that I should reference in the Java glossary, or that you figured this out by exhaustive experiment, in which case I should consider worship. -- Roedy Green Canadian Mind Products http://mindprod.com Plants" with "leaves" no more efficient than today’s solar cells could out-compete real plants, crowding the biosphere with an inedible foliage. Tough omnivorous "bacteria" could out-compete real bacteria: They could spread like blowing pollen, replicate swiftly, and reduce the biosphere to dust in a matter of days. Dangerous replicators could easily be too tough, small, and rapidly spreading to stop -- at least if we make no preparation. We have trouble enough controlling viruses and fruit flies. ~ Eric Drexler (born: 1955-04-25 age: 57) Engines of Creation: The Coming Era of Nanotechnology. .
[toc] | [prev] | [next] | [standalone]
| From | Silvio Bierman <silvio@moc.com> |
|---|---|
| Date | 2012-05-21 15:25 +0200 |
| Message-ID | <4fba4237$0$6846$e4fe514c@news2.news.xs4all.nl> |
| In reply to | #14568 |
On 05/17/2012 07:03 AM, Roedy Green wrote: > On Wed, 16 May 2012 01:19:07 +0200, Silvio Bierman<silvio@moc.com> > wrote, quoted or indirectly quoted someone who said : > >> >> Why do you ask? Do you disagree? > > No. I thought there were two possibilities. There is a some great > book you read or website that I should reference in the Java glossary, > or that you figured this out by exhaustive experiment, in which case I > should consider worship. Actually, exhaustive experiment would be a huge overstatement. It was the second thing that came up after I immediately discarded the first impulse of a global lock. Some people would call that premature optimization...
[toc] | [prev] | [next] | [standalone]
| From | Eric Sosman <esosman@ieee-dot-org.invalid> |
|---|---|
| Date | 2012-05-15 08:22 -0400 |
| Message-ID | <jothpq$pub$1@dont-email.me> |
| In reply to | #14544 |
On 5/15/2012 5:14 AM, bugbear wrote:
> I'm using various Apache Commons Maps as a
> multithread cache, protected using
> ReentrantReadWriteLock, so that getting() uses a read lock,
> and putting() uses a write lock.
>
> But I've got an issue; in the
> case of a cache miss (protected by a read lock),
> the required value is acquired using the "underlying function"
> that the cache is over; this value is then put() into
> the cache (protected by a write lock)
>
> This is all perfectly thread safe, and gives
> correct results.
>
> However, if the underlying function is slow
> and/or resource hungry (consider cacheing
> a ray traced image!) many threads can
> end up calling the real function (second
> and subsequent threads to the first get a miss
> during the first threads call to the underlying function).
>
> "obviously..." what I want is for only
> the thread that FIRST has a cache miss
> calls the underlying function, whilst other
> threads (for the same key) wait.
>
> This seems "obvious" enough that I'm guessing
> there's a standard solution.
>
> Googling led me to several "heavy" libraries;
>
> This appears more a locking/cacheing issue
> than a Java issue, although my implementation
> is in Java.
>
> Can anyone (please) point me at a
> canonical solution/implementation?
Don't know whether it's "canonical," but one fairly obvious
approach is to insert the missed key in the map immediately, but
with a value that means "Coming Soon To A Cache Near You." The
original thread can then release its read lock and fetch the true
value from its source.
Other threads that look for the same value will "hit" the
cache, but will recognize the "Coming Soon" as an indication to
wait until the first thread comes up with the value.
There must be many ways to arrange this, but one that requires
very little awareness on the part of the querying threads is to
make the hashed "value" a value-holder object. Then a querying
thread could do something like
mapLock.lockForReading();
ValueHolder holder = map.get(key);
mapLock.release();
if (holder == null) {
mapLock.lockForWriting();
holder = map.get(key); // inserted by other thread?
if (holder == null) {
holder = new ValueHolder(key);
map.put(key, holder);
}
mapLock.release();
}
Value = holder.getValue();
The ValueHolder class might look like
class ValueHolder {
private final Key key;
private Value cachedValue;
private SourceException fetchError;
ValueHolder(Key key) {
this.key = key;
}
synchronized
Value getValue() throws SourceException {
if (cachedValue == null) {
if (fetchError != null) {
// notify new thread of old error
throw fetchError;
}
try {
cachedValue = fetchValueFromSource(key);
} catch (SourceException ex) {
fetchError = ex;
throw fetchError;
}
}
return cachedValue;
}
}
(That's not the only way to deal with errors in fetching, but it
seems perfectly reasonable to me even if it's a little odd to throw
the same exception object multiple times.)
The cache-querying threads lock the map only briefly, long
enough to determine whether a ValueHolder is present. ValueHolder
can delay its callers for a long time (if not, why cache at all?),
but will only delay callers that are interested in that particular
key; callers looking for other keys will get different ValueHolder
instances and won't be interfered with. And only one thread will
actually attempt fetchValueFromSource() on any given key.
--
Eric Sosman
esosman@ieee-dot-org.invalid
[toc] | [prev] | [next] | [standalone]
| From | Daniel Pitts <newsgroup.nospam@virtualinfinity.net> |
|---|---|
| Date | 2012-05-15 09:56 -0700 |
| Message-ID | <bVvsr.9500$TC4.4320@newsfe14.iad> |
| In reply to | #14544 |
On 5/15/12 2:14 AM, bugbear wrote:
> I'm using various Apache Commons Maps as a
> multithread cache, protected using
> ReentrantReadWriteLock, so that getting() uses a read lock,
> and putting() uses a write lock.
>
> But I've got an issue; in the
> case of a cache miss (protected by a read lock),
> the required value is acquired using the "underlying function"
> that the cache is over; this value is then put() into
> the cache (protected by a write lock)
>
> This is all perfectly thread safe, and gives
> correct results.
>
> However, if the underlying function is slow
> and/or resource hungry (consider cacheing
> a ray traced image!) many threads can
> end up calling the real function (second
> and subsequent threads to the first get a miss
> during the first threads call to the underlying function).
>
> "obviously..." what I want is for only
> the thread that FIRST has a cache miss
> calls the underlying function, whilst other
> threads (for the same key) wait.
>
> This seems "obvious" enough that I'm guessing
> there's a standard solution.
>
> Googling led me to several "heavy" libraries;
>
> This appears more a locking/cacheing issue
> than a Java issue, although my implementation
> is in Java.
>
> Can anyone (please) point me at a
> canonical solution/implementation?
>
> BugBear
What I think I would do in this situation is have the "write" operation
be fast. It would put in a "FutureTask<V>" object, which hasn't been run
yet.
The FutureTask<V> class would basically be a "lazy loader". The reason
to do this is to be able to release the lock on the Map, while still
blocking the return of the get until the value is ready.
public V get(K key) throw ExecutionException, InterruptedException {
readLock.lockInterruptibly();
try {
final Future<V> future = map.get(key);
if (future != null) { return future.getValue(); }
} finally {
readLock.unlock();
}
final FutureTask<V> future;
writeLock.lockInterruptibly();
try {
// We need to double check, to make sure no one else
// has added the future.
final FutureTask<V> cachedFuture = map.get(key);
if (cachedFuture != null) {
future = cachedFuture;
} else {
future = new FutureTask<V>(new
UnderlyingFunctionCallable<V>(key));
map.put(key, future);
}
} finally {
writeLock.unlock();
}
future.run();
return future.getValue();
}
Note, since the "get" and "put" operations should be fairly fast, a
read/write lock may be over-kill, and the whole thing could be simplified:
public V get(K key) throw ExecutionException, InterruptedException {
synchronize(mySync) {
final FutureTask<V> cachedFuture = map.get(key);
if (cachedFuture != null) {
future = cachedFuture;
} else {
future = new FutureTask<V>(new
UnderlyingFunctionCallable<V>(key));
map.put(key, future);
}
}
future.run();
return future.getValue();
}
This basically gives you "per key" synchronization, with the "whole map"
synch being only for an O(1) operation of "check for key, add place-hold
if absent".
[toc] | [prev] | [next] | [standalone]
| From | bugbear <bugbear@trim_papermule.co.uk_trim> |
|---|---|
| Date | 2012-05-16 14:33 +0100 |
| Message-ID | <_P2dnU4rLbwxMS7SnZ2dnUVZ8uGdnZ2d@brightview.co.uk> |
| In reply to | #14544 |
bugbear wrote: > > Can anyone (please) point me at a > canonical solution/implementation? The answer is "yes" !! Thanks to the members for some great thoughts, analysis and reasoned discussion. BugBear
[toc] | [prev] | [next] | [standalone]
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2012-05-17 11:54 +0200 |
| Message-ID | <a1k06fFtdtU1@mid.individual.net> |
| In reply to | #14544 |
On 05/15/2012 11:14 AM, bugbear wrote:
> However, if the underlying function is slow
> and/or resource hungry (consider cacheing
> a ray traced image!) many threads can
> end up calling the real function (second
> and subsequent threads to the first get a miss
> during the first threads call to the underlying function).
>
> "obviously..." what I want is for only
> the thread that FIRST has a cache miss
> calls the underlying function, whilst other
> threads (for the same key) wait.
I provide a variant of Silvio's, Eric's and Daniel's solution which
should yield higher throughput because it works without read write
locking. You can find it as gist in case the code is garbled in the
newsgroup posting:
https://gist.github.com/2717818
Kind regards
robert
The code (untested):
package clj;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
/**
* The cache works with as few locking as possible. Lookup is done in
two steps
* on cache miss:
* <ol>
* <li>On a cache miss a retriever is inserted into the cache which
will obtain
* the value synchronized from a {@link Calculator}.</li>
* <li>Once calculation has finished a simple lock free reference to
the value
* replaces the retriever in the cache and the value is returned.</li>
* </ol>
*
* @author robert klemme
*
* @param <K>
* key type
* @param <V>
* value type
*/
public final class LazyCache<K, V> {
/**
* Calculate values from given keys.
*
* @param <K>
* key type
* @param <V>
* value type
*/
public interface Calculator<K, V> {
V get(K key);
}
/**
* Obtain a value.
*
* @param <V>
* value type.
*/
private interface Reference<V> {
V get();
}
/**
* Stupid simple reference which only hands out a fixed value all
the time
* without synchronization.
*
* @param <V>
* value type.
*/
private static final class Ref<V> implements Reference<V> {
private final V val;
public Ref(V val) {
this.val = val;
}
@Override
public V get() {
return val;
}
}
/** Mapping from keys to objects which yield values. */
private final ConcurrentMap<K, Reference<V>> map = new
ConcurrentHashMap<K, Reference<V>>();
/** User provided. */
private final Calculator<K, V> calc;
/**
* Create a cache.
*
* @param calc
* user must provide a reasonable implementation, not
* <code>null</code>.
*/
public LazyCache(final Calculator<K, V> calc) {
if (calc == null)
throw new NullPointerException();
this.calc = calc;
}
/**
* Get a value from the cache. The value might have to be calculated.
*
* @param key
* lookup key.
* @return value, might even be <code>null</code> depending on
algorithm.
*/
public V get(final K key) {
Reference<V> ref = map.get(key);
if (ref == null) {
// miss
ref = new Reference<V>() {
@Override
public synchronized V get() {
final V val = calc.get(key);
// next time lock free access:
Reference<V> x = map.put(key, new Ref<V>(val));
assert x == this;
return val;
}
};
final Reference<V> old = map.putIfAbsent(key, ref);
if (old != null)
ref = old; // someone else was faster
}
return ref.get();
}
}
[toc] | [prev] | [next] | [standalone]
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2012-05-17 12:06 +0200 |
| Message-ID | <a1k0taF206U1@mid.individual.net> |
| In reply to | #14571 |
On 05/17/2012 11:54 AM, Robert Klemme wrote:
> On 05/15/2012 11:14 AM, bugbear wrote:
>> However, if the underlying function is slow
>> and/or resource hungry (consider cacheing
>> a ray traced image!) many threads can
>> end up calling the real function (second
>> and subsequent threads to the first get a miss
>> during the first threads call to the underlying function).
>>
>> "obviously..." what I want is for only
>> the thread that FIRST has a cache miss
>> calls the underlying function, whilst other
>> threads (for the same key) wait.
>
> I provide a variant of Silvio's, Eric's and Daniel's solution which
> should yield higher throughput because it works without read write
> locking. You can find it as gist in case the code is garbled in the
> newsgroup posting:
> https://gist.github.com/2717818
There was one important detail missing. This is the corrected code
(gist is updated as well):
package clj;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
/**
* The cache works with as few locking as possible. Lookup is done in
two steps
* on cache miss:
* <ol>
* <li>On a cache miss a retriever is inserted into the cache which
will obtain
* the value synchronized from a {@link Calculator}.</li>
* <li>Once calculation has finished a simple lock free reference to
the value
* replaces the retriever in the cache and the value is returned.</li>
* </ol>
*
* @author robert klemme
*
* @param <K>
* key type
* @param <V>
* value type
*/
public final class LazyCache<K, V> {
/**
* Calculate values from given keys.
*
* @param <K>
* key type
* @param <V>
* value type
*/
public interface Calculator<K, V> {
V get(K key);
}
/**
* Obtain a value.
*
* @param <V>
* value type.
*/
private interface Reference<V> {
V get();
}
/**
* Stupid simple reference which only hands out a fixed value all
the time
* without synchronization.
*
* @param <V>
* value type.
*/
private static final class Ref<V> implements Reference<V> {
private final V val;
public Ref(V val) {
this.val = val;
}
@Override
public V get() {
return val;
}
}
/** Mapping from keys to objects which yield values. */
private final ConcurrentMap<K, Reference<V>> map = new
ConcurrentHashMap<K, Reference<V>>();
/** User provided. */
private final Calculator<K, V> calc;
/**
* Create a cache.
*
* @param calc
* user must provide a reasonable implementation, not
* <code>null</code>.
*/
public LazyCache(final Calculator<K, V> calc) {
if (calc == null)
throw new NullPointerException();
this.calc = calc;
}
/**
* Get a value from the cache. The value might have to be calculated.
*
* @param key
* lookup key.
* @return value, might even be <code>null</code> depending on
algorithm.
*/
public V get(final K key) {
Reference<V> ref = map.get(key);
if (ref == null) {
// miss
ref = new Reference<V>() {
private V val;
private boolean here = false; // superfluous but explicit
@Override
public synchronized V get() {
if (!here) {
val = calc.get(key);
here = true;
// next time lock free access:
Reference<V> x = map.put(key, new Ref<V>(val));
assert x == this;
}
return val;
}
};
final Reference<V> old = map.putIfAbsent(key, ref);
if (old != null)
ref = old; // someone else was faster
}
return ref.get();
}
}
[toc] | [prev] | [next] | [standalone]
| From | Daniel Pitts <newsgroup.nospam@virtualinfinity.net> |
|---|---|
| Date | 2012-05-17 09:51 -0700 |
| Message-ID | <v0atr.23997$_l.2497@newsfe15.iad> |
| In reply to | #14572 |
On 5/17/12 3:06 AM, Robert Klemme wrote:
> On 05/17/2012 11:54 AM, Robert Klemme wrote:
>> On 05/15/2012 11:14 AM, bugbear wrote:
>>> However, if the underlying function is slow
>>> and/or resource hungry (consider cacheing
>>> a ray traced image!) many threads can
>>> end up calling the real function (second
>>> and subsequent threads to the first get a miss
>>> during the first threads call to the underlying function).
>>>
>>> "obviously..." what I want is for only
>>> the thread that FIRST has a cache miss
>>> calls the underlying function, whilst other
>>> threads (for the same key) wait.
>>
>> I provide a variant of Silvio's, Eric's and Daniel's solution which
>> should yield higher throughput because it works without read write
>> locking. You can find it as gist in case the code is garbled in the
>> newsgroup posting:
>> https://gist.github.com/2717818
>
> There was one important detail missing. This is the corrected code (gist
> is updated as well):
[snip]
> if (!here) {
> val = calc.get(key);
> here = true;
> // next time lock free access:
> Reference<V> x = map.put(key, new Ref<V>(val));
> assert x == this;
> }
This assert (while true) is unnecessary and potentially harmful.
Nothing becomes inconsistent if the value you replace is not the
calculating reference. However, your assert may break at runtime if
something else changes about this code. Assert should be used to
validate consistency.
I like this class, though I have a couple of style comments.
I personally would name "Ref" as "ConstantReference", and your anonymous
Reference implementation I would make a private static class
CalculatingReference.
In my own personal library of utilities, I actually have two interfaces
"Factory<T>" with a "T create()" method and "ParameterizedFactory<T,P>"
with a "T create(P parameter)". Including a few interesting
implementations of those. In my case, I would probably implement this
cache as a ParameterizedFactory impl that takes another
ParameterizedFactory as the "calc" field.
[toc] | [prev] | [next] | [standalone]
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2012-05-17 20:48 +0200 |
| Message-ID | <a1kvglF557U1@mid.individual.net> |
| In reply to | #14583 |
On 17.05.2012 18:51, Daniel Pitts wrote:
> On 5/17/12 3:06 AM, Robert Klemme wrote:
>> On 05/17/2012 11:54 AM, Robert Klemme wrote:
>>> On 05/15/2012 11:14 AM, bugbear wrote:
>>>> However, if the underlying function is slow
>>>> and/or resource hungry (consider cacheing
>>>> a ray traced image!) many threads can
>>>> end up calling the real function (second
>>>> and subsequent threads to the first get a miss
>>>> during the first threads call to the underlying function).
>>>>
>>>> "obviously..." what I want is for only
>>>> the thread that FIRST has a cache miss
>>>> calls the underlying function, whilst other
>>>> threads (for the same key) wait.
>>>
>>> I provide a variant of Silvio's, Eric's and Daniel's solution which
>>> should yield higher throughput because it works without read write
>>> locking. You can find it as gist in case the code is garbled in the
>>> newsgroup posting:
>>> https://gist.github.com/2717818
>>
>> There was one important detail missing. This is the corrected code (gist
>> is updated as well):
> [snip]
>> if (!here) {
>> val = calc.get(key);
>> here = true;
>> // next time lock free access:
>> Reference<V> x = map.put(key, new Ref<V>(val));
>> assert x == this;
>> }
> This assert (while true) is unnecessary and potentially harmful. Nothing
> becomes inconsistent if the value you replace is not the calculating
> reference.
The algorithm works under the assumption that the first element placed
into the map is the one doing the calculation and only after that the
constant reference is inserted. If that is not the case (e.g. because
another calculator is put into the map) then obviously something must be
wrong and we might even have more than one lengthy calculation of the
cached value. This assumption is checked with the assert which also
serves as documentation.
> However, your assert may break at runtime if something else
> changes about this code. Assert should be used to validate consistency.
Well, if the logic is changed then of course asserts need to change,
too. There is nothing strange or harmful about that. With your
argument no asserts would make sense because they would need to be
changed if a class is changed. Note that internal class invariants can
change without changing the observable invariant. So even though a
class "looks" the same and behaves the same from the outside asserts
might have to be changed if internal logic of the class changes.
> I like this class, though I have a couple of style comments.
Thank you!
> I personally would name "Ref" as "ConstantReference",
Well, I was lazy typing and the class isn't public anyway. This is not
production code but an illustrating example for cljp.
> and your anonymous
> Reference implementation I would make a private static class
> CalculatingReference.
Why? You would need to either pass in a reference to LazyCache or pass
map and calc - plus declaring a constructor and type parameters. In
this case I don't really see the benefit.
> In my own personal library of utilities, I actually have two interfaces
> "Factory<T>" with a "T create()" method and "ParameterizedFactory<T,P>"
> with a "T create(P parameter)". Including a few interesting
> implementations of those. In my case, I would probably implement this
> cache as a ParameterizedFactory impl that takes another
> ParameterizedFactory as the "calc" field.
That could be done. But then method naming could be considered
misleading because the create() of the cache isn't really a create in
all cases.
Kind regards
robert
--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
[toc] | [prev] | [next] | [standalone]
| From | Sebastian <sebastian@undisclosed.invalid> |
|---|---|
| Date | 2012-05-20 01:35 +0200 |
| Message-ID | <jp9anv$a6a$1@news.albasani.net> |
| In reply to | #14572 |
Am 17.05.2012 12:06, schrieb Robert Klemme: > On 05/17/2012 11:54 AM, Robert Klemme wrote: >> On 05/15/2012 11:14 AM, bugbear wrote: >>> However, if the underlying function is slow >>> and/or resource hungry (consider cacheing >>> a ray traced image!) many threads can >>> end up calling the real function (second >>> and subsequent threads to the first get a miss >>> during the first threads call to the underlying function). >>> >>> "obviously..." what I want is for only >>> the thread that FIRST has a cache miss >>> calls the underlying function, whilst other >>> threads (for the same key) wait. >> >> I provide a variant of Silvio's, Eric's and Daniel's solution which >> should yield higher throughput because it works without read write >> locking. You can find it as gist in case the code is garbled in the >> newsgroup posting: >> https://gist.github.com/2717818 > > There was one important detail missing. This is the corrected code (gist > is updated as well): > [snip] The important detail obiously is the extra check with the "here" flag. Would you mind explaining why that is necessary? Everyone seems to have got it, but I must admit I haven't. Why is it not enough that Reference#get() is synchronized? As you yourself have missed this at your first attempt, I hope the reason isn't trivial... -- Sebastian
[toc] | [prev] | [next] | [standalone]
Page 1 of 2 [1] 2 Next page →
Back to top | Article view | comp.lang.java.programmer
csiph-web