Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.java.programmer > #14653
| From | Lew <noone@lewscanon.com> |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | Re: multithreaded cache? |
| Date | 2012-05-19 14:24 -0700 |
| Organization | albasani.net |
| Message-ID | <jp932s$r59$1@news.albasani.net> (permalink) |
| References | (1 earlier) <a1k06fFtdtU1@mid.individual.net> <4fb69812$0$6849$e4fe514c@news2.news.xs4all.nl> <a1nu7sFnnpU1@mid.individual.net> <jp6iji$n6m$1@dont-email.me> <a1pb7oFpc0U1@mid.individual.net> |
Robert Klemme wrote: > Eric Sosman wrote: >> Robert Klemme wrote: >>> Silvio Bierman wrote: >>>> Robert Klemme wrote: >>> >>>>> I provide a variant of Silvio's, Eric's and Daniel's solution which >>>>> should yield higher throughput because it works without read write >>>>> locking. You can find it as gist in case the code is garbled in the >>>>> newsgroup posting: >>>>> https://gist.github.com/2717818 >>> >>>> I think you have as many locks as I suggested (being one)? My initial >>>> implementations of something like this used a plain map with an extra >>>> lock but later cases used the by then available ConcurrentHashMap as >>>> well, making one lock redundant. >>> >>> You didn't show it here, did you? I can's seem to find it in the thread. >>> Note that CHM does also do synchronization. I am not sure from your >>> statement what exact locking scheme you apply. There does seem to be one >>> difference though: I my version the second lock goes away after the >>> value has been computed so there is only the sync of CHM left. >> >> It seems to me that if N threads query the same key at about >> the same time, they may all miss the map and go off to perform >> the slow computation. If "slow" is large compared to the cost of >> a lock-release pair (and if it weren't, why cache?), the tradeoff >> seems suspect. >> >> Also, different threads may wind up using different value >> instances. If the cache is purely a convenience for a value-only >> object that may be all right, but it's not all right if the values >> are supposed to be singletons. >> >> Finally, there's more than a whiff of the double-checked locking >> antipattern about what you're doing with the `here' flag. I'm not >> absolutely sure what you've got is in fact DCL (hence, wrong), but >> I'm also not absolutely sure it's DCL-free. Before using the pattern >> in any important way, I'd want to check with a major-league guru, >> just as "due diligence." > > There is no double checked locking going on - neither in LazyCache itself nor > in the retriever instance. > http://en.wikipedia.org/wiki/Double-checked_locking > > Daniel has explained all the details already. I'd just add that creation of > multiple retriever objects is the price you pay for increased concurrency due > to CHM. But only one of them is ever going to be used per key. (Documenting > this is one of the tasks of the assert Daniel questioned earlier.) > > I think rw-locking is an inferior technique compared to CHM in this case > because in case of cache miss you cannot escalate a r-lock to a w-lock (to > avoid deadlocks) and hence need to release the r-lock, acquire the w-lock, > *and check again*. In other words you have two more synchronization points > with memory barriers, scheduling effects etc. > > In case of cache hit you might still suffer throughput because of thread > starvation caused by all lookups for values missing from the cache might be > blocked for indefinite time when using an unfair rw-lock. Using a fair rw-lock > on the other hand is more expensive and still has the downside of a global > lock which affects the complete range of keys. > > CHM improves this by partitioning value space and only lock individual > partitions for atomic operations. These partitions are completely independent > from a locking perspective. That's the main advantage of CHM. First off, thank you for the very professional and elegant code. I shall study it frequently. I had experience with CHM on an Enterprise Java project a few years back that involved processing documents up to 1GB or so at millions of documents per hour. As you can imagine, concurrency was a consideration there, but due to bureaucracy was not properly managed for a few years. I was hired around the time they started to pay attention to such issues. The code involved a properly but naively synchronized Map at one point. Detailed profiling revealed that lock contention for the Map was the number one throughput chokepoint in the whole system. Even above database concurrency and I/O. Boy howdy, the pundits are right to recommend hard measurement. Lock contention has a cascade effect. In modern JVMs, like IBM's mainframe-level ones that we used, uncontended locks process quite quickly. Contention introduces roadblocks that delay threads, allowing more to queue up, causing more contention, slowing things down still more, causing more, yada. It only takes one skunk in the middle of the main road to completely tie up rush hour everywhere. CHM by default partitions lock space (though I'm not clear it uses locks exactly) into sixteen independent slices. This meant far more than sixteen times faster for us. Writes tend to happen in a solitary thread without a whole lot of fight while reads run like greased pigs through the other fifteen. With our mix of reads and writes, and transaction volume, CHM pretty near eliminated lock contention. YMMV, as always, but in this case that chokepoint went from number one to off the list. It was still the wrong solution, since a simple, effortless, non-concurrent better one that would also have eliminated a raft of other problems was available, but had no political traction. However, good enough to eliminate the throughput impact was good enough, so I didn't raise a fuss when they decided against it. -- Lew Honi soit qui mal y pense. http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
Back to comp.lang.java.programmer | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
multithreaded cache? bugbear <bugbear@trim_papermule.co.uk_trim> - 2012-05-15 10:14 +0100
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-15 11:41 +0200
Re: multithreaded cache? Roedy Green <see_website@mindprod.com.invalid> - 2012-05-15 15:57 -0700
Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-15 16:13 -0700
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 01:22 +0200
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 01:19 +0200
Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-15 20:26 -0400
Re: multithreaded cache? Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-05-15 21:41 -0300
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-16 09:26 +0200
Re: multithreaded cache? "Mike Schilling" <mscottschilling@hotmail.com> - 2012-05-26 14:54 -0700
Re: multithreaded cache? Roedy Green <see_website@mindprod.com.invalid> - 2012-05-16 22:03 -0700
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-21 15:25 +0200
Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-15 08:22 -0400
Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-15 09:56 -0700
Re: multithreaded cache? bugbear <bugbear@trim_papermule.co.uk_trim> - 2012-05-16 14:33 +0100
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 11:54 +0200
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 12:06 +0200
Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-17 09:51 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-17 20:48 +0200
Re: multithreaded cache? Sebastian <sebastian@undisclosed.invalid> - 2012-05-20 01:35 +0200
Re: multithreaded cache? Sebastian <sebastian@undisclosed.invalid> - 2012-05-20 01:48 +0200
Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-19 18:11 -0700
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-18 20:42 +0200
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-18 23:45 +0200
Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-18 18:31 -0400
Re: multithreaded cache? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-05-18 22:15 -0700
Re: multithreaded cache? Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-05-19 07:09 -0400
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-19 12:33 +0200
Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-19 14:24 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-21 21:15 +0200
Re: multithreaded cache? Lew <noone@lewscanon.com> - 2012-05-23 23:11 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-24 23:05 +0200
Re: multithreaded cache? Gene Wirchenko <genew@ocis.net> - 2012-05-24 20:11 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-25 17:42 +0200
Re: multithreaded cache? Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-05-25 07:09 -0300
Re: multithreaded cache? Silvio Bierman <silvio@moc.com> - 2012-05-21 15:21 +0200
Re: multithreaded cache? Kevin McMurtrie <mcmurtrie@pixelmemory.us> - 2012-05-24 23:22 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-05-25 17:36 +0200
Re: multithreaded cache? Kevin McMurtrie <mcmurtrie@pixelmemory.us> - 2012-06-02 12:27 -0700
Re: multithreaded cache? Robert Klemme <shortcutter@googlemail.com> - 2012-06-02 21:54 +0200
csiph-web