Groups > comp.arch > #5703 > unrolled thread

Single Thread Performance

Started by	"Unspecified" <partha@perfectvips.com>
First post	2012-02-04 21:54 +0530
Last post	2012-02-08 01:04 -0800
Articles	12 on this page of 72 — 25 participants

Back to article view | Back to comp.arch

  Single Thread Performance "Unspecified" <partha@perfectvips.com> - 2012-02-04 21:54 +0530
    Re: Single Thread Performance Brett Davis <ggtgp@yahoo.com> - 2012-02-06 05:55 -0600
      Re: Single Thread Performance nmm1@cam.ac.uk - 2012-02-06 13:06 +0000
        Re: Single Thread Performance Robert Myers <rbmyersusa@gmail.com> - 2012-02-06 14:12 -0500
          Re: Single Thread Performance BGB <cr88192@hotmail.com> - 2012-02-06 13:36 -0700
            Re: Single Thread Performance nmm1@cam.ac.uk - 2012-02-06 20:47 +0000
              Re: Single Thread Performance BGB <cr88192@hotmail.com> - 2012-02-06 15:07 -0700
                Re: Single Thread Performance Brett Davis <ggtgp@yahoo.com> - 2012-02-06 16:32 -0600
                  Re: Single Thread Performance Robert Wessel <robertwessel2@yahoo.com> - 2012-02-06 17:45 -0600
                    Re: Single Thread Performance Brett Davis <ggtgp@yahoo.com> - 2012-02-07 06:01 -0600
                      Re: Single Thread Performance BGB <cr88192@hotmail.com> - 2012-02-07 13:32 -0700
                        Re: Single Thread Performance timcaffrey@aol.com (Tim McCaffrey) - 2012-02-09 19:08 +0000
                          Re: Single Thread Performance BGB <cr88192@hotmail.com> - 2012-02-10 08:56 -0700
          Re: Single Thread Performance nmm1@cam.ac.uk - 2012-02-06 20:42 +0000
            Re: Single Thread Performance Robert Myers <rbmyersusa@gmail.com> - 2012-02-06 19:36 -0500
            Re: Single Thread Performance "Andy (Super) Glew" <andy@SPAM.comp-arch.net> - 2012-02-06 18:28 -0800
              Re: Single Thread Performance Robert Myers <rbmyersusa@gmail.com> - 2012-02-06 22:23 -0500
              Re: Single Thread Performance nmm1@cam.ac.uk - 2012-02-07 06:52 +0000
          Re: Single Thread Performance MitchAlsup <MitchAlsup@aol.com> - 2012-02-06 12:10 -0800
            Re: Single Thread Performance Thomas Womack <twomack@chiark.greenend.org.uk> - 2012-02-07 10:13 +0000
              Re: Single Thread Performance Brett Davis <ggtgp@yahoo.com> - 2012-02-20 23:58 -0600
            Re: Single Thread Performance timcaffrey@aol.com (Tim McCaffrey) - 2012-02-07 17:33 +0000
              Re: Single Thread Performance nedbrek <nedbrek@yahoo.com> - 2012-02-15 08:10 -0500
      Re: Single Thread Performance Robert Myers <rbmyersusa@gmail.com> - 2012-02-06 14:17 -0500
        Re: Single Thread Performance del cecchi <delcecchi@gmail.com> - 2012-02-25 22:07 -0800
      Re: Single Thread Performance jgk@panix.com (Joe keane) - 2012-02-07 17:57 +0000
    Re: Single Thread Performance Quadibloc <jsavard@ecn.ab.ca> - 2012-02-05 13:13 -0800
    Re: Single Thread Performance MitchAlsup <MitchAlsup@aol.com> - 2012-02-05 21:35 -0800
      Re: Single Thread Performance timcaffrey@aol.com (Tim McCaffrey) - 2012-02-07 17:38 +0000
        Re: Single Thread Performance Stephen Sprunk <stephen@sprunk.org> - 2012-02-07 14:54 -0600
          Re: Single Thread Performance timcaffrey@aol.com (Tim McCaffrey) - 2012-02-07 21:33 +0000
            Re: Single Thread Performance Stephen Sprunk <stephen@sprunk.org> - 2012-02-07 23:13 -0600
              Re: Single Thread Performance timcaffrey@aol.com (Tim McCaffrey) - 2012-02-08 18:54 +0000
                Re: Single Thread Performance Stephen Sprunk <stephen@sprunk.org> - 2012-02-08 15:17 -0600
                  Re: Single Thread Performance Terje Mathisen <"terje.mathisen at tmsw.no"> - 2012-02-09 08:13 +0100
                  Re: Single Thread Performance timcaffrey@aol.com (Tim McCaffrey) - 2012-02-09 17:08 +0000
                    Re: Single Thread Performance Stephen Sprunk <stephen@sprunk.org> - 2012-02-09 16:01 -0600
                Re: Single Thread Performance Terje Mathisen <"terje.mathisen at tmsw.no"> - 2012-02-09 07:56 +0100
                  Re: Single Thread Performance timcaffrey@aol.com (Tim McCaffrey) - 2012-02-09 17:18 +0000
            Re: Single Thread Performance Terje Mathisen <"terje.mathisen at tmsw.no"> - 2012-02-08 10:17 +0100
          Re: Single Thread Performance Jon <jon@beniston.com> - 2012-02-08 05:32 -0800
        Re: Single Thread Performance MitchAlsup <MitchAlsup@aol.com> - 2012-02-07 16:00 -0800
          Re: Single Thread Performance timcaffrey@aol.com (Tim McCaffrey) - 2012-02-08 18:35 +0000
      Re: Single Thread Performance Partha <parthaspanda22@gmail.com> - 2012-02-10 11:32 -0800
        Re: Single Thread Performance nmm1@cam.ac.uk - 2012-02-10 20:31 +0000
          Re: Single Thread Performance "Unspecified" <partha@perfectvips.com> - 2012-02-11 02:12 +0530
            Re: Single Thread Performance nmm1@cam.ac.uk - 2012-02-10 21:04 +0000
            Re: Single Thread Performance MitchAlsup <MitchAlsup@aol.com> - 2012-02-10 16:43 -0800
              Re: Single Thread Performance "Andy (Super) Glew" <andy@SPAM.comp-arch.net> - 2012-02-10 19:48 -0800
                Re: Single Thread Performance EricP <ThatWouldBeTelling@thevillage.com> - 2012-02-12 14:31 -0500
              Re: Single Thread Performance Stefan Monnier <monnier@iro.umontreal.ca> - 2012-02-12 21:50 -0500
                Re: Single Thread Performance "Andy (Super) Glew" <andy@SPAM.comp-arch.net> - 2012-02-12 19:45 -0800
                  Re: Single Thread Performance MitchAlsup <MitchAlsup@aol.com> - 2012-02-12 20:36 -0800
                    Re: Single Thread Performance "Andy (Super) Glew" <andy@SPAM.comp-arch.net> - 2012-02-13 06:46 -0800
                      Re: Single Thread Performance MitchAlsup <MitchAlsup@aol.com> - 2012-02-13 08:58 -0800
                        Re: Single Thread Performance "Paul A. Clayton" <paaronclayton@gmail.com> - 2012-02-13 16:19 -0800
                          Re: Single Thread Performance Rick Jones <rick.jones2@hp.com> - 2012-02-14 03:55 +0000
                            Re: Single Thread Performance Terje Mathisen <"terje.mathisen at tmsw.no"> - 2012-02-14 10:30 +0100
                              Re: Single Thread Performance Andrew Reilly <areilly---@bigpond.net.au> - 2012-02-14 10:49 +0000
                                Re: Single Thread Performance Terje Mathisen <"terje.mathisen at tmsw.no"> - 2012-02-14 13:21 +0100
                                  Re: Single Thread Performance Stephen Fuld <SFuld@alumni.cmu.edu.invalid> - 2012-02-14 13:11 -0800
                              Re: Single Thread Performance MitchAlsup <MitchAlsup@aol.com> - 2012-02-14 09:29 -0800
                              Re: Single Thread Performance EricP <ThatWouldBeTelling@thevillage.com> - 2012-02-14 12:40 -0500
                                Re: Single Thread Performance EricP <ThatWouldBeTelling@thevillage.com> - 2012-02-14 16:12 -0500
                                Re: Single Thread Performance Rick Jones <rick.jones2@hp.com> - 2012-02-14 21:14 +0000
                                  Re: Single Thread Performance Rick Jones <rick.jones2@hp.com> - 2012-02-14 21:16 +0000
                              Re: Single Thread Performance Rick Jones <rick.jones2@hp.com> - 2012-02-14 21:09 +0000
                          Re: Single Thread Performance MitchAlsup <MitchAlsup@aol.com> - 2012-02-14 09:26 -0800
                            Re: Single Thread Performance Terje Mathisen <"terje.mathisen at tmsw.no"> - 2012-02-15 08:44 +0100
                              Re: Single Thread Performance "Andy (Super) Glew" <andy@SPAM.comp-arch.net> - 2012-02-15 01:07 -0800
                        Re: Single Thread Performance Terje Mathisen <"terje.mathisen at tmsw.no"> - 2012-02-14 10:16 +0100
    Re: Single Thread Performance Michael S <already5chosen@yahoo.com> - 2012-02-08 01:04 -0800

Page 4 of 4 — ← Prev page 1 2 3 [4]

#5937

From	Stephen Fuld <SFuld@alumni.cmu.edu.invalid>
Date	2012-02-14 13:11 -0800
Message-ID	<jheim1$15r$1@dont-email.me>
In reply to	#5930

On 2/14/2012 4:21 AM, Terje Mathisen wrote:

> I think the main problem is that the performance goes down so very
> quickly, i.e. any given programmer will believe that "I can do much
> better, I don't need all the overhead of a random exp backoff, I can
> simply spin a couple of times until the current owner is finished."

How about moving the problem into hardware?  Perhaps an instruction taht 
tests the lock.  If it is clear, it get set and the thread continues. 
If it is locked, the hardware waits an exponentially distributed pseudo 
random amount of time (perhaps a ROM table indexed by the low order bits 
of an internal clock counter), then repeats the test.  Of course, you 
need a timeout of some sort, and you have to insure that all the cores 
are not waiting on the same lock, but you have time to check this as the 
thread is going to wait anyway.  In the event of a timeout, etc. you 
revert to the software to do what it did before.  Hopefully this will be 
rare.

-- 
  - Stephen Fuld
(e-mail address disguised to prevent spam)

[toc] | [prev] | [next] | [standalone]

#5932

From	MitchAlsup <MitchAlsup@aol.com>
Date	2012-02-14 09:29 -0800
Message-ID	<12511083.751.1329240598042.JavaMail.geo-discussion-forums@ynt13>
In reply to	#5925

On Tuesday, February 14, 2012 3:30:05 AM UTC-6, Terje Mathisen wrote:
> Rick Jones wrote:
> > It may not really apply at all, but for some reason the lack of a
> > guarantee of forward progress reminds me of CSMA/CD in Ethernet and
> > "capture effect" which was prevalent before full-duplex became the
> > norm.
> 
> Standard Ethernet has one huge advantage:
> 
> Exponential randomized backoff is part of the spec. As long as everyone 
> implements this you get stable operation and guaranteed forward progress.

Would it not be better if the interface to Ethernet provided the backoff number instead of having to compute it on the fly? Thus, instead of a loop that exponentiates the backoff counter outside a loop that does the waiting, there would only be the waiting loop.

This is why My ASF went to the trouble to build a device to give SW that number.

Mitch

[toc] | [prev] | [next] | [standalone]

#5935

From	EricP <ThatWouldBeTelling@thevillage.com>
Date	2012-02-14 12:40 -0500
Message-ID	<A0x_q.3$Tk4.0@newsfe21.iad>
In reply to	#5925

Terje Mathisen wrote:
> Rick Jones wrote:
>> It may not really apply at all, but for some reason the lack of a
>> guarantee of forward progress reminds me of CSMA/CD in Ethernet and
>> "capture effect" which was prevalent before full-duplex became the
>> norm.
> 
> Standard Ethernet has one huge advantage:
> 
> Exponential randomized backoff is part of the spec. As long as everyone 
> implements this you get stable operation and guaranteed forward progress.
> 
> (This would correspond to having a similar exponential wait in every 
> busy loop retrying a lock, everyone accessing the lock using the same 
> algorithm. This is not going to happen. :-()

The difference here from Ethernet is that Ethernet can listen
at no cost before transmitting and without disturbing the
current transmission. If there is a collision, all colliders
know that it happened.

With RTM/ASF a listener aborts the current transaction.
That is similar to Arpanet which didn't listen first
but just broadcast - it plugged up real quick.
Also here the act of listening requires sending
and receiving packets and has some cost.

TTS spinlocks are also not like Ethernet because there
is a cost for each listener.

Exponential backoff in this situation would tend to produce
LIFO ordering, and does not have a bounded delay.
It is also difficult to figure out how much to back off
because when B reads, A's transaction aborts and B knows nothing.
When A retries it is B that aborts and A doesn't know.

When I tried to solve the above using atomic ops
the costs just grew without success.

I gedanken'd about with these algorithms whan ASF was announced
in AMD's white paper and I couldn't find any way to do lock free.
It has to fall back to locks.

Eric

[toc] | [prev] | [next] | [standalone]

#5938

From	EricP <ThatWouldBeTelling@thevillage.com>
Date	2012-02-14 16:12 -0500
Message-ID	<O6A_q.2392$cH7.586@newsfe02.iad>
In reply to	#5935

EricP wrote:
> 
> With RTM/ASF a listener aborts the current transaction.
> That is similar to Arpanet which didn't listen first
> but just broadcast - it plugged up real quick.

Make that the original ALOHAnet.

Eric

[toc] | [prev] | [next] | [standalone]

#5942

From	Rick Jones <rick.jones2@hp.com>
Date	2012-02-14 21:14 +0000
Message-ID	<jheiqs$jsc$3@usenet01.boi.hp.com>
In reply to	#5935

EricP <ThatWouldBeTelling@thevillage.com> wrote:

> The difference here from Ethernet is that Ethernet can listen at no
> cost before transmitting and without disturbing the current
> transmission. If there is a collision, all colliders know that it
> happened.

> With RTM/ASF a listener aborts the current transaction.  That is
> similar to Arpanet which didn't listen first but just broadcast - it
> plugged up real quick.  Also here the act of listening requires
> sending and receiving packets and has some cost.

I think you mean the ALOHA Net?
http://en.wikipedia.org/wiki/ALOHAnet#Pure_ALOHA

rick jones
-- 
denial, anger, bargaining, depression, acceptance, rebirth...
                                     where do you want to be today?
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com  but NOT BOTH...

[toc] | [prev] | [next] | [standalone]

#5943

From	Rick Jones <rick.jones2@hp.com>
Date	2012-02-14 21:16 +0000
Message-ID	<jheiuk$jsc$4@usenet01.boi.hp.com>
In reply to	#5942

Rick Jones <rick.jones2@hp.com> wrote:
> EricP <ThatWouldBeTelling@thevillage.com> wrote:

> > The difference here from Ethernet is that Ethernet can listen at no
> > cost before transmitting and without disturbing the current
> > transmission. If there is a collision, all colliders know that it
> > happened.

> > With RTM/ASF a listener aborts the current transaction.  That is
> > similar to Arpanet which didn't listen first but just broadcast - it
> > plugged up real quick.  Also here the act of listening requires
> > sending and receiving packets and has some cost.

> I think you mean the ALOHA Net?
> http://en.wikipedia.org/wiki/ALOHAnet#Pure_ALOHA

Never mind - race condition with the followup...

rick jones
-- 
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

[toc] | [prev] | [next] | [standalone]

#5941

From	Rick Jones <rick.jones2@hp.com>
Date	2012-02-14 21:09 +0000
Message-ID	<jheii8$jsc$2@usenet01.boi.hp.com>
In reply to	#5925

Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:
> Rick Jones wrote:
> > It may not really apply at all, but for some reason the lack of a
> > guarantee of forward progress reminds me of CSMA/CD in Ethernet and
> > "capture effect" which was prevalent before full-duplex became the
> > norm.

> Standard Ethernet has one huge advantage:

> Exponential randomized backoff is part of the spec. As long as everyone 
> implements this you get stable operation and guaranteed forward progress.

Is there anything more than a statistical "guarantee" in CSMA/CD that
a node will get an opportunity to make forward progress?  I cannot
recall all the chapter and verse from before full-duplex made the
discussion moot, but I thought that even a standards conforming NIC,
with a fast enough system, could grab and hold the net indefinitely.

rick jones
-- 
portable adj, code that compiles under more than one compiler
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

[toc] | [prev] | [next] | [standalone]

#5933

From	MitchAlsup <MitchAlsup@aol.com>
Date	2012-02-14 09:26 -0800
Message-ID	<21210493.750.1329240371457.JavaMail.geo-discussion-forums@ynt13>
In reply to	#5920

On Monday, February 13, 2012 6:19:57 PM UTC-6, Paul A. Clayton wrote:
> On Feb 13, 11:58 am, MitchAlsup <MitchAl...@aol.com> wrote:
> > On Monday, February 13, 2012 8:46:13 AM UTC-6, Andy (Super) Glew wrote:
> > > What does that look like?
> >
> > We had a device in the machine (much like a TLB) that kept
> > track of the order of synchronization events. When a conflict
> > was detected, it provided a number (instead of yes/no) and
> > that number could be used by SW to proactively avoid
> > interference on the subsequent attempt.
> >
> > So, lets say an interrupt went off and 18 tasks simultaneously
> > began to access a concurrent data structure. ON the first
> > attempt at most 1 task makes forward progress and 17 others
> 
> The Intel RTM (and AMD ASF) specifications do not guarantee
> forward progress.  (I am guessing that this is meant to allow
> simpler implementations while allowing a more complex
> implementation to provide such in general.)

Ever wonder why the phrase "at most" was in that sentance?

> > receive numbers from 1 through 17. On a subsequent access to
> > the CDS, those tasks can use the number to index deeped into
> > the CDS and access an element that is not being simultaneously
> > accessed my 16 others.
> 
> I suppose one could have a mildly bad case where another thread
> (that was busy before) wins the transaction for the first
> entry.  (Cache block granularity of monitoring might also
> force a bloating of the data structure.)
> 
> I wonder if something like a coallescing fetch-and-add
> could be useful in such cases.  (It might even be
> possible to use something like coallescing fetch-and-add
> to support limited conflicting updates within transactions.

My ASF worked on addresses not data. That is memory was not necessarily accessed, but the memory addresses that were participating were definatly known. This got around several/many cache coherent issues that loomed close to unsolvable.

> Unfortunately, it seems that such would require all
> transactions 'later' than a failing transaction to fail
> [or a complex fix-up mechanism would be needed to provide
> the correct value and re-execute any dependent operations--
> yuck!].)
> 
> It seems that such a mechanism might be used to support a
> low overhead lock queue when the conflict is essential.

My ASF was in essence: Multiple Simultaneous Locks

The ASF prolog would anounce the participating addresses, and then bundle them up and send the set of addresses to an Address Resolution Unit. The ARU would check to see if all of the addresses were conflict free, and if so it would return success, if not it would return a counter  representing the number of contenders in front of this one.

Once the processors got a success message back from the ARU it was allowed to violate cache coherence rules for a short time. Thus that processors could go out and fetch cache lines and then hold on to them NAKing requests from other processors. After all, those lines had been blessed by the ARU.

It was this ability to NAK other request under special, limited, checked (blessed) circumstances that provided soft guarentees of forward progress.

> Each thread could have a cache line that is monitored and
> written by the previous thread when it releases its hold
> on the lock.  If the monitor was removed (e.g., by the OS
> scheduler), the OS would have to do something to avoid
> lock-up (perhaps change the next pointer in the previous
> entry or manage the lock queue in the OS).
> 
> > Presto, on this worst case event in lock based CDS accesss
> > O(N**2), ASF would give O(3), that is near best case. In
> > average CDS accesses properly programmed ASF could achieve
> > O(ln(n)) instead O(n**2) where n is the number in interfering
> > acccessors.
> 
> Did this provide an instruction or data address of the conflict
> or was it intended for simple transactions?

I have no idea as to what you are asking.
> 
> > <snip>
> >
> > > What more do you want?
> >
> > Feedback from the HW that can be used by properly written SW to
> > proactively avoid interference on subsequent accesses.
> 
> In addition to accessing a given entry in a queue, information
> _might_ be useful to break large transactions into separate
> non-conflicting pieces and use locking to handle the conflicting
> portion.

In particular My ASF was intended to deal with concurrent data structures such as the RUN queues in a typical OS where multiple threads are inserting and removing things all the time. The ability to lock the 4 cache lines necessary to guarentee forward progress was the main target. My ASF was never intended to be a TM-lite.

> Even known the degree of conflict could be useful, especially
> if one thread is (nearly) guaranteed to make forward progress.
> In that case, just retrying the transaction may be more
> appropriate (without additional delay) may be appropriate).

In particular, if one knows how many CPUs are contending for entries in a run queue, one can access different portions fo the run queue and end up interference free. So from the above example: The first access is <basically> thrown away detecting interference in the cache (avoiding the latency of the ARU). When interference has been detected, the CPU changes mode and the nest ASF prolog is done in a "slow and methodological" way directing its lock requests to the ARU. The third trip thruogh the logic remains in the "S&M" mode, but this time the SW takes the conflict number and marches down the run queue to an element that will not be under contention. So on the 3rd pass all 17 processors grab all 17 entries from the queue conflict free. Presto O(3) time.

Mitch

[toc] | [prev] | [next] | [standalone]

#5945

From	Terje Mathisen <"terje.mathisen at tmsw.no">
Date	2012-02-15 08:44 +0100
Message-ID	<6c4s09-g6d1.ln1@ntp6.tmsw.no>
In reply to	#5933

MitchAlsup wrote:
> In particular, if one knows how many CPUs are contending for entries
> in a run queue, one can access different portions fo the run queue
> and end up interference free. So from the above example: The first
> access is<basically>  thrown away detecting interference in the cache
> (avoiding the latency of the ARU). When interference has been
> detected, the CPU changes mode and the nest ASF prolog is done in a
> "slow and methodological" way directing its lock requests to the ARU.
> The third trip thruogh the logic remains in the "S&M" mode, but this
> time the SW takes the conflict number and marches down the run queue
> to an element that will not be under contention. So on the 3rd pass
> all 17 processors grab all 17 entries from the queue conflict free.
> Presto O(3) time.

Mitch, how does this compare with an XADD-based approach?

I.e. all 17 cpus tries to do a fetch&add on the queue index, one of them 
wins immediately, the other 16 will do a hw retry, another one succeeds etc.

I guess the total number of bus attempts/transactions would be 
17+16+15..., i.e. O(n^2) instead of O(3)?

Terje

-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

[toc] | [prev] | [next] | [standalone]

#5948

From	"Andy (Super) Glew" <andy@SPAM.comp-arch.net>
Date	2012-02-15 01:07 -0800
Message-ID	<4F3B75C3.40708@SPAM.comp-arch.net>
In reply to	#5945

On 2/14/2012 11:44 PM, Terje Mathisen wrote:
> MitchAlsup wrote:
>> In particular, if one knows how many CPUs are contending for entries
>> in a run queue, one can access different portions fo the run queue
>> and end up interference free. So from the above example: The first
>> access is<basically> thrown away detecting interference in the cache
>> (avoiding the latency of the ARU). When interference has been
>> detected, the CPU changes mode and the nest ASF prolog is done in a
>> "slow and methodological" way directing its lock requests to the ARU.
>> The third trip thruogh the logic remains in the "S&M" mode, but this
>> time the SW takes the conflict number and marches down the run queue
>> to an element that will not be under contention. So on the 3rd pass
>> all 17 processors grab all 17 entries from the queue conflict free.
>> Presto O(3) time.
>
> Mitch, how does this compare with an XADD-based approach?
>
> I.e. all 17 cpus tries to do a fetch&add on the queue index, one of them
> wins immediately, the other 16 will do a hw retry, another one succeeds
> etc.
>
> I guess the total number of bus attempts/transactions would be
> 17+16+15..., i.e. O(n^2) instead of O(3)?

(0) Yes, that O(N^2) pattern is what happens when you blindly implement 
any test-and-set. Including XADD.

(1) However, Alan Kagi showed that by adding delays, and/or queues in 
the lock controller, you can make it O(n) - 1 request per winner.

Alain's work descended from the QOLB, Queue on Lock Bit, work.  But the 
key is, those queues don't need to be architectural - you can just do it 
in the lock controller.

(2) For extra credit, you can imagine a fetch-and-op combining style 
thing, making it O(1) in cost.

[toc] | [prev] | [next] | [standalone]

#5924

From	Terje Mathisen <"terje.mathisen at tmsw.no">
Date	2012-02-14 10:16 +0100
Message-ID	<mclp09-s091.ln1@ntp6.tmsw.no>
In reply to	#5915

MitchAlsup wrote:
> On Monday, February 13, 2012 8:46:13 AM UTC-6, Andy (Super) Glew
>> What more do you want?
>
> Feedback from the HW that can be used by properly written SW to
> proactively avoid interference on subsequent accesses.

This is exactly how we solved (in sw) a "thundering herd" problem where 
after a power failure every single VoIP (over WLAN) telephone in a new 
hospital would try to reconnect at the same time:

If the core switch (which was distributed/redundant) had less than M 
devices currently going through the login process, it would accept the 
new device at once.

Otherwise it would signal back with a failure code and a number S which 
was the number of seconds the handset should wait before retrying.

This worked perfectly, and it meant that each phone would send a maximum 
of one extra packet in order to get this processing ticket.

Terje
PS. The real problem was that the core switch vendor had a setup process 
which required 600+(!) packets just to connect a new handset...
-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

[toc] | [prev] | [next] | [standalone]

#5812

From	Michael S <already5chosen@yahoo.com>
Date	2012-02-08 01:04 -0800
Message-ID	<9dfef7f5-3ff8-4e17-aaa6-1d29c77d2568@t30g2000vbx.googlegroups.com>
In reply to	#5703

On Feb 4, 6:24 pm, "Unspecified" <par...@perfectvips.com> wrote:
> Hi,
>
>      I was wondering if Developed Economies are considering
>      pursuing Single Thread Performance as worthwhile.
>
>      Any pointers in that direction?
>
> --
>
> Sincerely,
> Partha Sarathi Panda.

Ask yourself why "Developed Economies" are such.
One possible answer would be - because they don't try to manage minor
issues like pursuing single thread performance at The Economy scale.

[toc] | [prev] | [standalone]

Page 4 of 4 — ← Prev page 1 2 3 [4]

csiph-web

Single Thread Performance

Contents

#5937

#5932

#5935

#5938

#5942

#5943

#5941

#5933

#5945

#5948

#5924

#5812