Re: Single Thread Performance

From MitchAlsup <MitchAlsup@aol.com>
Newsgroups comp.arch
Subject Re: Single Thread Performance
Date 2012-02-14 09:26 -0800
Organization http://groups.google.com
Message-ID <21210493.750.1329240371457.JavaMail.geo-discussion-forums@ynt13>
References (7 earlier) <4F38875A.2050407@SPAM.comp-arch.net> <12562993.824.1329107776365.JavaMail.geo-discussion-forums@ynnj2> <4F392235.1020804@SPAM.comp-arch.net> <16314332.739.1329152301273.JavaMail.geo-discussion-forums@ynnk21> <ae37e65b-43ea-42b6-80cf-d7ba5598bbdc@j14g2000vba.googlegroups.com>



On Monday, February 13, 2012 6:19:57 PM UTC-6, Paul A. Clayton wrote:
> On Feb 13, 11:58 am, MitchAlsup <MitchAl...@aol.com> wrote:
> > On Monday, February 13, 2012 8:46:13 AM UTC-6, Andy (Super) Glew wrote:
> > > What does that look like?
> >
> > We had a device in the machine (much like a TLB) that kept
> > track of the order of synchronization events. When a conflict
> > was detected, it provided a number (instead of yes/no) and
> > that number could be used by SW to proactively avoid
> > interference on the subsequent attempt.
> >
> > So, let's say an interrupt went off and 18 tasks simultaneously
> > began to access a concurrent data structure. On the first
> > attempt at most 1 task makes forward progress and 17 others
> 
> The Intel RTM (and AMD ASF) specifications do not guarantee
> forward progress.  (I am guessing that this is meant to allow
> simpler implementations while allowing a more complex
> implementation to provide such in general.)

Ever wonder why the phrase "at most" was in that sentence?

> > receive numbers from 1 through 17. On a subsequent access to
> > the CDS, those tasks can use the number to index deeper into
> > the CDS and access an element that is not being simultaneously
> > accessed by 16 others.
> 
> I suppose one could have a mildly bad case where another thread
> (that was busy before) wins the transaction for the first
> entry.  (Cache block granularity of monitoring might also
> force a bloating of the data structure.)
> 
> I wonder if something like a coalescing fetch-and-add
> could be useful in such cases.  (It might even be
> possible to use something like coalescing fetch-and-add
> to support limited conflicting updates within transactions.

My ASF worked on addresses, not data. That is, memory was not necessarily accessed, but the memory addresses that were participating were definitely known. This got around several cache coherence issues that loomed close to unsolvable.

> Unfortunately, it seems that such would require all
> transactions 'later' than a failing transaction to fail
> [or a complex fix-up mechanism would be needed to provide
> the correct value and re-execute any dependent operations--
> yuck!].)
> 
> It seems that such a mechanism might be used to support a
> low overhead lock queue when the conflict is essential.

My ASF was in essence: Multiple Simultaneous Locks

The ASF prolog would announce the participating addresses, then bundle them up and send the set to an Address Resolution Unit (ARU). The ARU would check whether all of the addresses were conflict free. If so, it would return success; if not, it would return a count of the contenders in front of this one.
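
As a rough illustration of the ARU behaviour just described, here is a minimal software model in Python. It is only a sketch: the class name, the method names, and the choice to keep failed requesters queued in arrival order are assumptions made for the example, not details of the actual hardware.

```python
class AddressResolutionUnit:
    """Toy model of an ARU. All names here are hypothetical."""

    def __init__(self):
        # address -> task ids in arrival order (assumed bookkeeping)
        self._order = {}

    def request(self, task, addresses):
        """Return 0 for success, else the number of contenders in front."""
        ahead = set()
        for addr in addresses:
            order = self._order.setdefault(addr, [])
            if task not in order:
                order.append(task)
            # Everyone who announced this address earlier is in front.
            ahead.update(order[:order.index(task)])
        return len(ahead)

    def release(self, task, addresses):
        """Drop a task from the addresses it announced."""
        for addr in addresses:
            order = self._order.get(addr, [])
            if task in order:
                order.remove(task)


if __name__ == "__main__":
    aru = AddressResolutionUnit()
    # 18 tasks announce the same two addresses, one after another.
    print([aru.request(t, [0x1000, 0x1040]) for t in range(18)])
```

In this model the first requester sees 0 (success) and the remaining 17 see 1 through 17, matching the numbers in the earlier example.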

Once a processor got a success message back from the ARU, it was allowed to violate cache coherence rules for a short time. That processor could go out and fetch cache lines and then hold on to them, NAKing requests from other processors. After all, those lines had been blessed by the ARU.

It was this ability to NAK other requests under special, limited, checked (blessed) circumstances that provided soft guarantees of forward progress.
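
The NAK rule can be pictured with a very small Python model. This is a sketch under stated assumptions: the `Core` class, its method names, and the "ACK"/"NAK" strings are invented for illustration and say nothing about how the real coherence protocol encoded these responses.

```python
class Core:
    """Toy core that may refuse snoops for ARU-blessed lines (hypothetical)."""

    def __init__(self):
        self._blessed = set()  # cache lines granted by the ARU

    def bless(self, lines):
        # Called after the ARU reports success for this set of lines.
        self._blessed = set(lines)

    def done(self):
        # End of the critical section: normal coherence rules resume.
        self._blessed.clear()

    def snoop(self, line):
        # Requests from other processors are refused only for blessed lines.
        return "NAK" if line in self._blessed else "ACK"
```

The point of the sketch is that refusal is scoped: it applies only to lines the ARU has approved, and only until the holder finishes.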

> Each thread could have a cache line that is monitored and
> written by the previous thread when it releases its hold
> on the lock.  If the monitor was removed (e.g., by the OS
> scheduler), the OS would have to do something to avoid
> lock-up (perhaps change the next pointer in the previous
> entry or manage the lock queue in the OS).
> 
> > Presto, on this worst case event, where lock based CDS access is
> > O(n**2), ASF would give O(3), that is, near best case. In
> > average CDS accesses, properly programmed ASF could achieve
> > O(ln(n)) instead of O(n**2), where n is the number of interfering
> > accessors.
> 
> Did this provide an instruction or data address of the conflict
> or was it intended for simple transactions?

I have no idea as to what you are asking.
> 
> > <snip>
> >
> > > What more do you want?
> >
> > Feedback from the HW that can be used by properly written SW to
> > proactively avoid interference on subsequent accesses.
> 
> In addition to accessing a given entry in a queue, information
> _might_ be useful to break large transactions into separate
> non-conflicting pieces and use locking to handle the conflicting
> portion.

In particular, my ASF was intended to deal with concurrent data structures such as the run queues in a typical OS, where multiple threads are inserting and removing things all the time. The ability to lock the 4 cache lines necessary to guarantee forward progress was the main target. My ASF was never intended to be a TM-lite.

> Even knowing the degree of conflict could be useful, especially
> if one thread is (nearly) guaranteed to make forward progress.
> In that case, just retrying the transaction (without additional
> delay) may be more appropriate.

In particular, if one knows how many CPUs are contending for entries in a run queue, one can access different portions of the run queue and end up interference free. So, from the above example: the first access is <basically> thrown away, detecting interference in the cache (avoiding the latency of the ARU). When interference has been detected, the CPU changes mode and the next ASF prolog is done in a "slow and methodical" way, directing its lock requests to the ARU. The third trip through the logic remains in the "S&M" mode, but this time the SW takes the conflict number and marches down the run queue to an element that will not be under contention. So on the 3rd pass all 17 processors grab all 17 entries from the queue conflict free. Presto: O(3) time.
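
The three passes can be walked through in a short Python simulation. It is a sketch of one reading of the description above: the function name is hypothetical, arrival order at the ARU is taken to be list order, and the CPU that gets success (number 0) is assumed to take the head of the queue.

```python
def three_pass_dequeue(cpus, run_queue):
    """Return {cpu: entry} after the three passes (illustrative only)."""
    if len(cpus) > len(run_queue):
        raise ValueError("more contenders than queue entries")

    # Pass 1: every CPU optimistically goes for the head of the queue.
    # With more than one CPU the attempt is interfered with and discarded.
    if len(cpus) == 1:
        return {cpus[0]: run_queue[0]}

    # Pass 2: lock requests go to the (modeled) ARU in arrival order.
    # The first arrival gets 0; the rest get the count of contenders ahead.
    conflict_number = {cpu: position for position, cpu in enumerate(cpus)}

    # Pass 3: each CPU marches down the queue by its conflict number,
    # so no two CPUs touch the same entry.
    return {cpu: run_queue[n] for cpu, n in conflict_number.items()}


if __name__ == "__main__":
    queue = [f"task{i}" for i in range(18)]
    grabbed = three_pass_dequeue(list(range(18)), queue)
    print(len(set(grabbed.values())), "distinct entries taken")
```

The number of passes stays fixed at three however many CPUs contend, which is the sense in which the scheme is O(3).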

Mitch


