Groups | Search | Server Info | Login | Register


Groups > comp.lang.forth > #22660

Re: The current object

From Bernd Paysan <bernd.paysan@gmx.de>
Newsgroups comp.lang.forth
Subject Re: The current object
Date 2013-05-15 23:23 +0200
Organization 1&1 Internet AG
Message-ID <kn0udd$spc$1@online.de> (permalink)
References (10 earlier) <klsb20$7hk$1@online.de> <2013May2.162843@mips.complang.tuwien.ac.at> <klu9fd$ifv$1@online.de> <kn0aqb$d3v$1@online.de> <R5CdndY-5plDUg7MnZ2dnUVZ_hOdnZ2d@supernews.com>

Show all headers | View raw


Andrew Haley wrote:

> Bernd Paysan <bernd.paysan@gmx.de> wrote:
>> Please do some measurements, if you don't believe me.  A call
>> *offset(reg) is, when correctly predicted, exactly as fast as a
>> direct call (with constant address) on a Core i7.
> 
> I don't doubt that.  However, gotchas arise when the vtable isn't in
> the chache.  The question is how well on avergae, on a real load, it
> works, and how badly an inline cache works.

"Cold" code executes slowly.

>>> But even if vtable dispatch is somewhat better, a cacheing "virtual
>>> function" approach is much more general-purpose, for very little
>>> extra cost.  It is the Rosetta Stone that allows different OOP
>>> systems, static and dynamic, to talk to each other.
>> 
>> The cost of misprediction is way too high.
> 
> I disagree.  It's only another test and branch, which may well be
> quicker than a vtable lookup if the vtable isn't in L1 cache.

Actually, the likelyhood that the object itself (and therefore the class 
information) is not in the cache might be higher.  There aren't that many 
vtables in a reasonable large Forth OOP program.  Maybe in Java.

>> That's one implementation option.  It has the disadvantage that any
>> occasional polymorphic call makes the whole thing slow.
> 
> Not really, and if it's only occasional it's not going to make the
> "whole thing slow", is it?

No, you didn't read that correctly: If you have a mostly monomorphic call, 
which has the occasional polymorphic call, it's going to be slow if you 
convert from monomorphic to polymorphic on the first of those ocassional 
calls, and never go back.

So one occasional call to another method makes all other calls to the usual 
suspect slow.

>>> I'm not surprised you think this technique is slow, given your
>>> misunderstanding of how it actually works.  What are you thinking?
>>> That the people who write such systems are incompetent?
>> 
>> Always is a possibility.
> 
> The thing that bothers me is that this is often your default
> assumption.  As in "If you want it done right, you have to do it
> yourself"

Ah, no, I'm lazy.  I take whatever works for me.  The problem is: it too 
often doesn't.  Then I curse (the above is the curse), and do the thing 
myself.

Because there are many ways to do it wrong, and most of the time we do it 
wrong, especially for other people - "works for me" is a very common excuse 
to close valid bug reports.  So if you follow my advice, chances are high 
that it at least works for you.  It could well be about as buggy as all the 
other stuff, but that stuff worked for them, not for you.  The worst case is 
if you have code written by some employer, who doesn't even need what he 
writes.  Then it works for nobody but the unit test (if there is one).

A few years ago I slowly started to reinvent the Internet, because I thought 
a few things weren't as good as they should.  While doing this, I (and 
others) not only discovered that the pile of shit was way deeper, but also 
the bold claims of people who said they had a solution to some of the 
problems were just that: bold claims.  Like LEDBAT was supposed to be a 
delay minimizing algorithm, but due to the long latency of the regulation 
loop it just doesn't work.  The only way to get BitTorrent, which uses 
LEDBAD, to behave (really have low delays) is to set a limit sufficiently 
below your actual available bandwidth.  And then, even TCP based BitTorrent 
works.  LEDBAT works somewhat when you have only one connection.  But that's 
not the point of a P2P network...

Due to those bold claims, I grossly underestimated the effort for the flow 
control.  I thought this problem has been solved.  After all, TCP is 30 
years old.  The reality is: At the time I thought this problem has been 
solved, Postel hadn't even acknowledged that it exists.  The term "buffer 
bloat" came later, and it was shifting the culprit to the router 
manufacturers.  It turned out that they were right, and adding tons of 
buffers does help the Internet.  The design problem of TCP becomes worse if 
you have less buffer, because if you hit the buffer limit, the bandwidth 
goes down by a factor two, and then the window slowly increases again, one 
packet per round trip.

This was all no problem when the round trip was three packets, and not 300, 
as it is today.

>> So you have to optimize for polymorphic code, and take advantage of
>> the improvements of current CPU architectures.  The monomorphic code
>> is less likely.
> 
> I take your point, but even in the polymorphic case I'm not convinced
> that vtables are the best approach when dcache effects are taken into
> account.

Anton did some calculations, a typical Forth OOP program like MINOS or 
GLForth has about 100 classes with 50 methods each.  So the vtables 
alltogether are 5k words.  Considering that vtables are likely to be 
frequently accessed, they should at least reside in the L2 cache.  It is far 
more likely that the object itself isn't in the cache than that it's vtable 
isn't.  The vtable is hot when some other object of the same class was used 
recently.

>> But I'm already in the last phase.
> 
> I know.  That's the problem!  So many systems already implemented...
> :-)

Yes, we know pretty well what people have needed ;-).

>> The other standard Forth suggestion is to do all the complicated
>> stuff at compile time.  IMHO the only benefit you can get from
>> inline caching is when you can completely inline the called method,
>> and thus save the call completely, and open up opportunities to
>> optimize more things (which inlining usually does).
> 
> But you're not inlining when you use an inline cache, so that doesn't
> make any sense.

But you could, and AFAIK TraceMonkey does exactly this: Once you know that a 
method is "hot" and worth the time to compile, you do all this inlining, 
including the presumed monomorphic calls.  If this assumption turns out to 
be wrong, you revert back to the less aggressively optimized code.  asm.js 
relies on this property.

>> What's different with your vtables?
> 
> Nothing, save for the fact that you're hoping for good prediction and
> neglecting the cache miss effects when fetching from the vtable.  The
> prediction of an inline cache is going to be far better.

Yes, that's what I question.  The current CPUs have branch targets cached 
for several thousand locations, so completely sufficient for a typical Forth 
OOP program.  The branch target cache doesn't care if the vtable lookup is a 
hit or miss, it just executes what it assumes is the destination.  If the 
data cache is cold, it's just more speculated instructions.  An L1 cache hit 
on Core i7 takes 4 cycles, an L2 cache hit 10 cycles, that's both 
significantly below a branch miss.  If you need to go to L3 (40 cycles to 75 
cycles) it becomes significant, but still be within the number of cycles a 
Core i7 can speculatively execute.  Only when you go off-chip to DRAM, it 
will stall.

BTW: The branch target cache does recognize monomorphic calls even when 
different classes are used, but the method itself is the same.  Example: 
MINOS has a cached glue method; most widgets can cache their glue, and do.  
Some can't (their content is too dynamic), and therefore use the uncached 
glue method instead.  So when you have a bunch of different widgets and you 
walk through all of them, calculating the overall minimum and maximum size, 
you get branch hits most of the time, even though there is not just a single 
class involved, but a dozend or two.

>> My guestimate is that a
>> 
>> if(class==constant) then call <method> else rewrite_polymorphic(); fi
>> 
>> is at least one cycle worse on a Core i7 than a predicted vtable call,
>> because the Core i7 can't do two branches in one cycle.
> 
> That's not what we do.  It's
> 
>    mov r1, #constant
>    call method
>    ...
>    mov r2, [obj]
>    cmp r2, r1
>    bne ...

Still two branches, still needs two cycles per invocation.  Even a predicted 
fall-through branch costs one cycle.  The order here doesn't really matter, 
it's just reducing the code size (more call sites than methods - but then, 
you need to create that code again for every subclass, even if that subclass 
doesn't change the method at all).  I hope that the polymorphic dispatch 
jumps right after the bne into the method, without double checking.

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

Back to comp.lang.forth | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Re: The current object (was: Went open source with my GA144 ...) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-02 14:28 +0000
  Re: The current object (was: Went open source with my GA144 ...) Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-02 19:58 +0200
    Re: The current object (was: Went open source with my GA144 ...) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-03 16:50 +0000
      Re: The current object (was: Went open source with my GA144 ...) Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-04 18:58 +0200
        Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-04 13:35 -0500
          Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-05 01:08 +0200
            Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-04 18:29 -0500
              Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-05 03:00 +0200
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-05 03:48 -0500
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-05 23:15 +0200
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-06 05:01 -0500
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-07 03:23 +0200
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-07 03:39 -0500
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-07 19:00 +0200
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-07 13:34 -0500
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-07 22:00 +0200
                Re: The current object anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-08 07:26 +0000
                Re: The current object Doug Hoffman <glidedog@gmail.com> - 2013-05-05 06:39 -0400
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-05 09:15 -0500
                Re: The current object Paul Rubin <no.email@nospam.invalid> - 2013-05-05 09:25 -0700
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-05 11:43 -0500
                Re: The current object Paul Rubin <no.email@nospam.invalid> - 2013-05-05 09:59 -0700
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-06 05:06 -0500
                Re: The current object Mark Wills <forthfreak@gmail.com> - 2013-05-06 00:19 -0700
                Re: The current object "WJ" <w_a_x_man@yahoo.com> - 2013-05-13 22:45 +0000
            Re: The current object Doug Hoffman <glidedog@gmail.com> - 2013-05-05 06:39 -0400
              Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-05 23:40 +0200
        Re: The current object Doug Hoffman <glidedog@gmail.com> - 2013-05-05 06:39 -0400
          Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-05 23:29 +0200
            Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-06 05:10 -0500
            Re: The current object Doug Hoffman <glidedog@gmail.com> - 2013-05-08 07:35 -0400
              Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-09 00:06 +0200
                Re: The current object Doug Hoffman <glidedog@gmail.com> - 2013-05-09 06:41 -0400
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-11 21:31 +0200
                Re: The current object Doug Hoffman <glidedog@gmail.com> - 2013-05-12 07:40 -0400
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-14 01:10 +0200
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-14 03:58 -0500
                Re: The current object anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-14 14:51 +0000
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-15 03:48 -0500
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-14 18:37 +0200
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-14 14:42 -0500
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-15 00:22 +0200
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-15 02:56 -0500
                Re: The current object stephenXXX@mpeforth.com (Stephen Pelc) - 2013-05-15 13:52 +0000
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-15 17:58 +0200
                Re: The current object albert@spenarnc.xs4all.nl (Albert van der Horst) - 2013-05-17 14:01 +0000
                Re: The current object johno <email@address.com> - 2013-05-15 22:38 +0100
                Re: The current object Doug Hoffman <glidedog@gmail.com> - 2013-05-14 06:27 -0400
                Re: The current object albert@spenarnc.xs4all.nl (Albert van der Horst) - 2013-05-09 16:06 +0000
        Re: The current object (was: Went open source with my GA144 ...) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-06 16:12 +0000
          Re: The current object (was: Went open source with my GA144 ...) Alex McDonald <blog@rivadpm.com> - 2013-05-06 10:25 -0700
          Re: The current object (was: Went open source with my GA144 ...) Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-07 04:00 +0200
            Re: The current object (was: Went open source with my GA144 ...) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-07 15:02 +0000
          Re: The current object (was: Went open source with my GA144 ...) Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-07 04:09 +0200
            Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-07 03:41 -0500
              Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-07 19:02 +0200
            Re: The current object (was: Went open source with my GA144 ...) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-07 14:33 +0000
              Re: The current object (was: Went open source with my GA144 ...) Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-07 19:40 +0200
    Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-15 16:46 +0200
      Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-15 13:11 -0500
        Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-15 23:23 +0200
          Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-16 02:24 -0500
            Re: The current object anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-16 13:55 +0000
            Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-16 17:45 +0200
              Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-18 05:13 -0500
                Re: The current object Doug Hoffman <glidedog@gmail.com> - 2013-05-18 06:43 -0400
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-18 10:03 -0500
                Re: The current object Doug Hoffman <glidedog@gmail.com> - 2013-05-18 11:30 -0400
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-19 01:20 +0200
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-19 12:33 -0500
                Re: The current object Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-19 01:07 +0200
                Re: The current object Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-19 03:58 -0500
        OO dispatch (was: The current object) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-16 14:07 +0000
          Re: OO dispatch Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-18 05:28 -0500
            Re: OO dispatch anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-20 11:28 +0000
              Re: OO dispatch Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-20 09:00 -0500
                Re: OO dispatch Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-21 01:37 +0200
                Re: OO dispatch Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-21 03:21 -0500
                Re: OO dispatch anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-21 12:24 +0000
                Re: OO dispatch Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-21 10:18 -0500
                Re: OO dispatch anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2013-05-21 16:03 +0000
                Re: OO dispatch Andrew Haley <andrew29@littlepinkcloud.invalid> - 2013-05-21 12:55 -0500
                Re: OO dispatch Bernd Paysan <bernd.paysan@gmx.de> - 2013-05-29 00:00 +0200
                Re: OO dispatch Mark Wills <markrobertwills@yahoo.co.uk> - 2013-05-28 15:49 -0700

csiph-web