
Readable code and refactoring for optimization

Started by	Wendell <wendellxe@yahoo.com>
First post	2011-12-05 18:14 -0800
Last post	2011-12-14 15:49 +0200
Articles	13 on this page of 73 — 21 participants


  Readable code and refactoring for optimization Wendell <wendellxe@yahoo.com> - 2011-12-05 18:14 -0800
    Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-05 17:54 -1000
      Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-05 17:58 -1000
        Re: Readable code and refactoring for optimization Hans Bezemer <thebeez@xs4all.nl> - 2011-12-06 08:49 +0100
      Re: Readable code and refactoring for optimization mhx@iae.nl (Marcel Hendrix) - 2011-12-06 07:26 +0200
        Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-05 22:02 -1000
        Re: Readable code and refactoring for optimization Bernd Paysan <bernd.paysan@gmx.de> - 2011-12-07 00:49 +0100
      Re: Readable code and refactoring for optimization Paul Rubin <no.email@nospam.invalid> - 2011-12-08 00:08 -0800
        Re: Readable code and refactoring for optimization anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-12-08 11:25 +0000
          Re: Readable code and refactoring for optimization Paul Rubin <no.email@nospam.invalid> - 2011-12-08 23:51 -0800
        Re: Readable code and refactoring for optimization Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2011-12-08 14:47 +0000
          Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2011-12-08 14:05 -0500
            Re: Readable code and refactoring for optimization Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2011-12-09 17:21 +0000
              Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2011-12-10 09:08 -0500
              Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-10 07:37 -1000
                Re: Readable code and refactoring for optimization Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2011-12-12 10:27 +0000
                  Re: Readable code and refactoring for optimization Andrew Haley <andrew29@littlepinkcloud.invalid> - 2011-12-12 05:02 -0600
                    Re: Readable code and refactoring for optimization Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2011-12-12 11:18 +0000
                      Re: Readable code and refactoring for optimization Andrew Haley <andrew29@littlepinkcloud.invalid> - 2011-12-12 05:25 -0600
                        Re: Readable code and refactoring for optimization Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2011-12-12 12:41 +0000
                          Re: Readable code and refactoring for optimization Andrew Haley <andrew29@littlepinkcloud.invalid> - 2011-12-12 07:23 -0600
                            Re: Readable code and refactoring for optimization Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2011-12-12 20:28 +0000
                              Re: Readable code and refactoring for optimization Andrew Haley <andrew29@littlepinkcloud.invalid> - 2011-12-13 04:07 -0600
                                Re: Readable code and refactoring for optimization Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2011-12-13 21:04 +0000
                  Re: Readable code and refactoring for optimization Arnold Doray <thinksquared@gmail.com> - 2011-12-12 16:01 +0000
                    Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2011-12-26 09:30 -0500
                      Re: Readable code and refactoring for optimization Arnold Doray <invalid@invalid.com> - 2012-01-08 16:28 +0000
                        Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2012-01-08 20:17 -0500
                          Re: Readable code and refactoring for optimization Arnold Doray <invalid@invalid.com> - 2012-01-09 10:41 +0000
                            Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2012-01-09 08:55 -0500
                              Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2012-01-09 11:50 -0500
                                Re: Readable code and refactoring for optimization Arnold Doray <invalid@invalid.com> - 2012-01-10 06:56 +0000
                                  Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2012-01-10 16:34 -0500
                                    Re: Readable code and refactoring for optimization Arnold Doray <invalid@invalid.com> - 2012-01-11 06:57 +0000
                                      Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2012-01-11 09:50 -0500
                                        Re: Readable code and refactoring for optimization Arnold Doray <invalid@invalid.com> - 2012-01-11 16:28 +0000
                  Re: Readable code and refactoring for optimization Fritz Wuehler <fritz@spamexpire-201112.rodent.frell.theremailer.net> - 2011-12-12 22:57 +0100
                    Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2011-12-26 09:30 -0500
          Re: Readable code and refactoring for optimization Paul Rubin <no.email@nospam.invalid> - 2011-12-08 23:53 -0800
          Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-09 17:46 -1000
            Re: Readable code and refactoring for optimization Josh Grams <josh@qualdan.com> - 2011-12-10 11:39 +0000
              Re: Readable code and refactoring for optimization Ian Osgood <iano@quirkster.com> - 2011-12-21 13:09 -0800
            Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2011-12-10 08:52 -0500
            Re: Readable code and refactoring for optimization Paul Rubin <no.email@nospam.invalid> - 2011-12-12 09:11 -0800
              Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-12 07:48 -1000
                Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2011-12-12 13:47 -0500
                  Re: Readable code and refactoring for optimization Paul Rubin <no.email@nospam.invalid> - 2011-12-12 11:46 -0800
                    Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2011-12-12 16:20 -0500
                    Re: Readable code and refactoring for optimization BruceMcF <agila61@netscape.net> - 2011-12-12 13:48 -0800
                      Re: Readable code and refactoring for optimization anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-12-13 10:31 +0000
                Re: Readable code and refactoring for optimization Paul Rubin <no.email@nospam.invalid> - 2011-12-12 11:42 -0800
                  Re: Readable code and refactoring for optimization Mark Wills <markrobertwills@yahoo.co.uk> - 2011-12-12 13:35 -0800
                    Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-12 11:49 -1000
                      Re: Readable code and refactoring for optimization Paul Rubin <no.email@nospam.invalid> - 2011-12-12 23:50 -0800
                        Re: Readable code and refactoring for optimization JennyB <jennybrien@googlemail.com> - 2011-12-13 03:04 -0800
    Re: Readable code and refactoring for optimization stephenXXX@mpeforth.com (Stephen Pelc) - 2011-12-06 11:04 +0000
      Re: Readable code and refactoring for optimization John Passaniti <john.passaniti@gmail.com> - 2011-12-06 05:52 -0800
    Re: Readable code and refactoring for optimization Arnold Doray <thinksquared@gmail.com> - 2011-12-06 13:52 +0000
      Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-06 08:22 -1000
        Re: Readable code and refactoring for optimization Arnold Doray <thinksquared@gmail.com> - 2011-12-07 08:55 +0000
    Re: Readable code and refactoring for optimization Doug Hoffman <glidedog@gmail.com> - 2011-12-11 07:29 -0500
      Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-11 08:59 -1000
        Re: Readable code and refactoring for optimization Mark Wills <forthfreak@gmail.com> - 2011-12-13 05:33 -0800
          Re: Readable code and refactoring for optimization Andrew Haley <andrew29@littlepinkcloud.invalid> - 2011-12-13 09:05 -0600
            Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-13 08:10 -1000
              Re: Readable code and refactoring for optimization anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-12-15 16:44 +0000
                Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-15 09:15 -1000
                  Re: Readable code and refactoring for optimization anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-12-16 17:13 +0000
                    Re: Readable code and refactoring for optimization "Elizabeth D. Rather" <erather@forth.com> - 2011-12-16 08:11 -1000
                      Re: Readable code and refactoring for optimization anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-12-21 13:57 +0000
          Re: Readable code and refactoring for optimization anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-12-13 15:06 +0000
          Re: Readable code and refactoring for optimization Bernd Paysan <bernd.paysan@gmx.de> - 2011-12-13 16:18 +0100
            Re: Readable code and refactoring for optimization mhx@iae.nl (Marcel Hendrix) - 2011-12-14 15:49 +0200


#7939

From	Doug Hoffman <glidedog@gmail.com>
Date	2011-12-11 07:29 -0500
Message-ID	<4ee4a216$0$283$14726298@news.sunsite.dk>
In reply to	#7740

I had a high school calculus teacher that firmly taught the idea of 
always being very careful of beginning and ending index conditions.  I 
think that's where most of the code here had problems.

-Doug

p.s., I question that the use of HERE in the "Unfactored" website 
example is safe.
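
One way to read this concern, as a sketch and not a diagnosis of the website code: space at HERE that has not been reserved with ALLOT does not belong to the program, and on some systems transient buffers (PAD, pictured numeric output) are placed relative to HERE. The words RISKY, SAFER and DOORS below are hypothetical names, and the sizes are only illustrative.

```forth
\ Sketch only; DOORS, RISKY and SAFER are hypothetical names.

\ Risky: unallocated space at HERE is used as a scratch buffer.
: risky ( -- )
  here 101 erase        \ nothing has reserved these 101 bytes
  1 here 4 + c!         \ mark door 4 as open
  4 .                   \ number output may use a transient buffer near HERE
  here 4 + c@ . ;       \ so this flag is not guaranteed to have survived

\ Safer: reserve the buffer with CREATE ... ALLOT first.
create doors  101 allot
: safer ( -- )
  doors 101 erase
  1 doors 4 + c!
  4 .
  doors 4 + c@ . ;      \ the flag lives in space the program owns
```

Whether RISKY actually misbehaves depends on the system; the point is only that SAFER does not rely on where the system keeps its transient regions.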


#7944

From	"Elizabeth D. Rather" <erather@forth.com>
Date	2011-12-11 08:59 -1000
Message-ID	<HfKdndNFcsLgYHnTnZ2dnUVZ_tCdnZ2d@supernews.com>
In reply to	#7939

On 12/11/11 2:29 AM, Doug Hoffman wrote:
> I had a high school calculus teacher that firmly taught the idea of
> always being very careful of beginning and ending index conditions. I
> think that's where most of the code here had problems.
>
> -Doug
>
> p.s., I question that the use of HERE in the "Unfactored" website
> example is safe.

That whole "Unfactored" example is a mess. It not only is atrocious 
Forth, it doesn't even prove what the clown who wrote it was attempting 
to prove (that it would be faster than the "Factored" version).

Cheers,
Elizabeth

-- 
==================================================
Elizabeth D. Rather   (US & Canada)   800-55-FORTH
FORTH Inc.                         +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================


#8031

From	Mark Wills <forthfreak@gmail.com>
Date	2011-12-13 05:33 -0800
Message-ID	<16318476.428.1323783206964.JavaMail.geo-discussion-forums@vbmi7>
In reply to	#7944

Would it be true to say that a highly factored ITC program would run slower than an un-factored program (more nesting) but that it aint necessarily so with a native-code generating Forth compiler?


#8033

From	Andrew Haley <andrew29@littlepinkcloud.invalid>
Date	2011-12-13 09:05 -0600
Message-ID	<B-qdndU7Qtkl9HrTnZ2dnUVZ_ridnZ2d@supernews.com>
In reply to	#8031

Mark Wills <forthfreak@gmail.com> wrote:

> Would it be true to say that a highly factored ITC program would run
> slower than an un-factored program (more nesting) but that it aint
> necessarily so with a native-code generating Forth compiler?

Yes, I think that's generally true.  However, it sometimes turns out
differently; you have to measure.  Processor caches, for example, can
have odd effects.

Andrew.


#8044

From	"Elizabeth D. Rather" <erather@forth.com>
Date	2011-12-13 08:10 -1000
Message-ID	<i6ydndAtDLmGCHrTnZ2dnUVZ_sadnZ2d@supernews.com>
In reply to	#8033

On 12/13/11 5:05 AM, Andrew Haley wrote:
> Mark Wills<forthfreak@gmail.com>  wrote:
>
>> Would it be true to say that a highly factored ITC program would run
>> slower than an un-factored program (more nesting) but that it aint
>> necessarily so with a native-code generating Forth compiler?
>
> Yes, I think that's generally true.  However, it sometimes turns out
> differently; you have to measure.  Processor caches, for example, can
> have odd effects.

Too many variables.  To begin with, some ITC implementations are as much 
as 10x faster than others.  An "unfactored" program can be simply bad 
code, as in the "unfactored" example in Rosetta, which is slower than 
the "factored" version (which is also bad code, but not *as* bad). And, 
as Andrew suggests, there are factors involving cache and even how the 
timing is done that will introduce further variation.

It's pretty safe to say that a native-code generating Forth will run 
faster than an ITC Forth.  But the effect of nesting or not on the same 
implementation is less clear. Forth is designed to minimize the cost of 
nesting, and in a good implementation that is so successful that other 
factors often count more.

Cheers,
Elizabeth


#8099

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2011-12-15 16:44 +0000
Message-ID	<2011Dec15.174434@mips.complang.tuwien.ac.at>
In reply to	#8044

"Elizabeth D. Rather" <erather@forth.com> writes:
>It's pretty safe to say that a native-code generating Forth will run 
>faster than an ITC Forth.

Not even that.  Look at Figure 10 of
<http://www.complang.tuwien.ac.at/papers/ertl%26gregg04pact.ps.gz>.
There gforth-plain (a threaded-code system) outperforms bigforth (a
native-code system) on CD16sim by a factor of more than 3.

The reason is that, like many native-code (and some threaded-code)
Forth systems, bigForth places code close to (variable) data,
resulting in significant cache consistency overhead.  These issues
have been known in the Forth community since 1995, but AFAIK most
native-code systems still have not addressed them properly.

For more recent data (but without a pure threaded-code system), look
at slide 12 of
<http://www.complang.tuwien.ac.at/anton/euroforth/ef09/papers/ertl-slides.pdf>

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2011: http://www.euroforth.org/ef11/


#8117

From	"Elizabeth D. Rather" <erather@forth.com>
Date	2011-12-15 09:15 -1000
Message-ID	<dPSdneCbj5Kh2nfTnZ2dnUVZ_oednZ2d@supernews.com>
In reply to	#8099

On 12/15/11 6:44 AM, Anton Ertl wrote:
> "Elizabeth D. Rather"<erather@forth.com>  writes:
>> It's pretty safe to say that a native-code generating Forth will run
>> faster than an ITC Forth.
>
> Not even that.  Look at Figure 10 of
> <http://www.complang.tuwien.ac.at/papers/ertl%26gregg04pact.ps.gz>.
> There gforth-plain (a threaded-code system) outperforms bigforth (a
> native-code system) on CD16sim by a factor of more than 3.
>
> The reason is that, like many native-code (and some threaded-code)
> Forth systems, bigForth places code close to (variable) data,
> resulting in significant cache consistency overhead.  These issues
> have been known in the Forth community since 1995, but AFAIK most
> native-code systems still have not addressed them properly.
>
> For more recent data (but without a pure threaded-code system), look
> at slide 12 of
> <http://www.complang.tuwien.ac.at/anton/euroforth/ef09/papers/ertl-slides.pdf>

Interesting.  More reinforcement for rejecting any blanket claim that a 
particular model or programming style (e.g. factored/unfactored) will 
"always" be faster/smaller/more readable/etc.  Good code is better than 
bad code by most measures.  That's about it.

As has often been observed, Forth tends to amplify the difference: good 
Forth tends to be *lots* faster/smaller/more readable/etc. A good Forth 
programmer has a lot of leverage, and taking care with your code pays 
big dividends.

Cheers,
Elizabeth


#8144

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2011-12-16 17:13 +0000
Message-ID	<2011Dec16.181350@mips.complang.tuwien.ac.at>
In reply to	#8117

"Elizabeth D. Rather" <erather@forth.com> writes:
>On 12/15/11 6:44 AM, Anton Ertl wrote:
>> Look at Figure 10 of
>> <http://www.complang.tuwien.ac.at/papers/ertl%26gregg04pact.ps.gz>.
>> There gforth-plain (a threaded-code system) outperforms bigforth (a
>> native-code system) on CD16sim by a factor of more than 3.
>>
>> The reason is that, like many native-code (and some threaded-code)
>> Forth systems, bigForth places code close to (variable) data,
>> resulting in significant cache consistency overhead.  These issues
>> have been known in the Forth community since 1995, but AFAIK most
>> native-code systems still have not addressed them properly.
...
>As has often been observed, Forth tends to amplify the difference: good 
>Forth tends to be *lots* faster/smaller/more readable/etc. A good Forth 
>programmer has a lot of leverage, and taking care with your code pays 
>big dividends.

I don't think that this conclusion can be drawn from this basis.  At
least I don't consider ALLOTing unused space between modifiable data
and code to be a sign of particularly good code, yet it is the typical
solution to this cache consistency problem.
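
A minimal sketch of the kind of workaround meant here, assuming a system where VARIABLE cells and compiled code share one dictionary. The names /GAP, GAP, HITS and COUNT-HIT are hypothetical, and the gap size is an assumption, not a measured value; the distance actually needed depends on the hardware and the Forth system.

```forth
\ Sketch only; /GAP, GAP, HITS and COUNT-HIT are hypothetical names.
64 constant /gap                  \ assumed gap size in bytes, not a measured value
: gap ( -- )  /gap allot ;        \ reserve unused data space

gap  variable hits  gap           \ unused space on both sides of the cell
: count-hit ( -- )  1 hits +! ;   \ code compiled after the second gap
```

The extra ALLOTs add nothing to what the program means; they only move a frequently written cell away from neighbouring code.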

- anton

#8148

From	"Elizabeth D. Rather" <erather@forth.com>
Date	2011-12-16 08:11 -1000
Message-ID	<h-SdnUvxD7BLFHbTnZ2dnUVZ_oSdnZ2d@supernews.com>
In reply to	#8144

On 12/16/11 7:13 AM, Anton Ertl wrote:
> "Elizabeth D. Rather"<erather@forth.com>  writes:
>> On 12/15/11 6:44 AM, Anton Ertl wrote:
>>> Look at Figure 10 of
>>> <http://www.complang.tuwien.ac.at/papers/ertl%26gregg04pact.ps.gz>.
>>> There gforth-plain (a threaded-code system) outperforms bigforth (a
>>> native-code system) on CD16sim by a factor of more than 3.
>>>
>>> The reason is that, like many native-code (and some threaded-code)
>>> Forth systems, bigForth places code close to (variable) data,
>>> resulting in significant cache consistency overhead.  These issues
>>> have been known in the Forth community since 1995, but AFAIK most
>>> native-code systems still have not addressed them properly.
> ...
>> As has often been observed, Forth tends to amplify the difference: good
>> Forth tends to be *lots* faster/smaller/more readable/etc. A good Forth
>> programmer has a lot of leverage, and taking care with your code pays
>> big dividends.
>
> I don't think that this conclusion can be drawn from this basis.  At
> least I don't consider ALLOTing unused space between modifiable data
> and code to be a sign of particularly good code, yet it is the typical
> solution to this cache consistency problem.

The context of my observation was the OP's assertion that factoring for 
readability implied slower code, and Mark's question regarding how this 
applied to ITC vs. compile-to-code implementations.  I believe "good" 
code wins in multiple categories.  Of course, "good" depends on context. 
If extra ALLOTments (particularly if done invisibly by the compiler) 
gain performance in an environment where performance is more important 
than size (as on most PCs), who's to say that's not "good"?

Cheers,
Elizabeth


#8277

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2011-12-21 13:57 +0000
Message-ID	<2011Dec21.145716@mips.complang.tuwien.ac.at>
In reply to	#8148

"Elizabeth D. Rather" <erather@forth.com> writes:
>On 12/16/11 7:13 AM, Anton Ertl wrote:
>> "Elizabeth D. Rather"<erather@forth.com>  writes:
>>> As has often been observed, Forth tends to amplify the difference: good
>>> Forth tends to be *lots* faster/smaller/more readable/etc. A good Forth
>>> programmer has a lot of leverage, and taking care with your code pays
>>> big dividends.
>>
>> I don't think that this conclusion can be drawn from this basis.  At
>> least I don't consider ALLOTing unused space between modifiable data
>> and code to be a sign of particularly good code, yet it is the typical
>> solution to this cache consistency problem.
>
>The context of my observation was the OP's assertion that factoring for 
>readability implied slower code, and Mark's question regarding how this 
>applied to ITC vs. compile-to-code implementations.  I believe "good" 
>code wins in multiple categories.  Of course, "good" depends on context. 
>If extra ALLOTments (particularly if done invisibly by the compiler) 
>gain performance in an environment where performance is more important 
>than size (as on most PCs), who's to say that's not "good"?

That's a circular argument: Good Forth code is fast, because we define
code that is fast to be good.

I rather had criteria for goodness in mind that are typically taught
to Forth programmers in books such as "Thinking Forth" and endorsed in
style guides.  Things such as factoring and readability.

I don't think that ALLOTing unused space between code and data was
ever encouraged in these sources; it's just a workaround for the
deficiency of some Forth implementations on some hardware.  Hopefully
the Forth system implementors will fix this deficiency eventually;
then this practice will just waste space and reduce the readability
without providing any benefit.

- anton

#8035

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2011-12-13 15:06 +0000
Message-ID	<2011Dec13.160603@mips.complang.tuwien.ac.at>
In reply to	#8031

Mark Wills <forthfreak@gmail.com> writes:
>Would it be true to say that a highly factored ITC program would run slower than an un-factored program (more nesting) but that it aint necessarily so with a native-code generating Forth compiler?

Not in general.  ITC and native-code are independent of inlining.
Also, in some cases unfactoring can increase the cache footprint,
which may cause a slowdown (and even so, changing cache arrangements
can have unexpected effects on performance).  If you want to know for
sure, measure both.

In any case, it seems to me that some people worry far too much about
performance.  Only worry about it at this level if the program is
definitely too slow.  Otherwise just keep it flexible (i.e., factored)
and worry about correctness and flexibility (and maybe efficient data
types).
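
"Measure both" could look roughly like the sketch below. It is not the code from the website under discussion: all word names are hypothetical, the 100-doors logic is a reconstruction of the general idea, and UTIME is a Gforth word (other systems provide different timers), so treat the timing part as an assumption about the system in use.

```forth
\ Sketch only; all names are hypothetical. UTIME is Gforth-specific.
101 constant ndoors                  \ doors 1..100; index 0 is unused
create doors  ndoors allot

\ Factored: toggling and a single pass are separate words.
: toggle ( c-addr -- )  dup c@ 1 xor swap c! ;
: pass   ( n -- )  ndoors over do  doors i + toggle  dup +loop  drop ;
: run-factored ( -- )
  doors ndoors erase  ndoors 1 do  i pass  loop ;

\ Unfactored: the same work written inline in one definition.
: run-unfactored ( -- )
  doors ndoors erase
  ndoors 1 do
    ndoors i do
      doors i +  dup c@ 1 xor swap c!
    j +loop
  loop ;

\ Run xt u times and print the elapsed microseconds.
: timed ( xt u -- )
  utime 2>r
  0 ?do  dup execute  loop  drop
  utime 2r> d-  d. ." us" cr ;

' run-factored   100000 timed
' run-unfactored 100000 timed
```

Neither version does any I/O inside the timed region, so the comparison is about nesting and loop structure only; results will differ between systems and runs.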

- anton

#8037

From	Bernd Paysan <bernd.paysan@gmx.de>
Date	2011-12-13 16:18 +0100
Message-ID	<jc7qct$i87$1@online.de>
In reply to	#8031

Mark Wills wrote:

> Would it be true to say that a highly factored ITC program would run
> slower than an un-factored program (more nesting) but that it aint
> necessarily so with a native-code generating Forth compiler?

Yes, native code generating Forth compilers perform inlining (at least 
VFX and iForth).  On the other hand, we have seen benchmarks with 
factored and unfactored entries, and the factored already were faster, 
because they were, well, better factored (i.e. the performance critical 
loop was better optimized ;-).

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/


#8064

From	mhx@iae.nl (Marcel Hendrix)
Date	2011-12-14 15:49 +0200
Message-ID	<97868889918436@frunobulax.edu>
In reply to	#8037

Bernd Paysan <bernd.paysan@gmx.de> writes Re: Readable code and refactoring for optimization

> Mark Wills wrote:

>> Would it be true to say that a highly factored ITC program would run
>> slower than an un-factored program (more nesting) but that it aint
>> necessarily so with a native-code generating Forth compiler?

> Yes, native code generating Forth compilers perform inlining (at least 
> VFX and iForth).  On the other hand, we have seen benchmarks with 
> factored and unfactored entries, and the factored already were faster, 
> because they were, well, better factored (i.e. the performance critical 
> loop was better optimized ;-).

The two versions have nearly the same LOC (in ASM).
The slowest version is the one doing the largest amount of I/O (the unfactored one).

IMO factoring is useful to keep down the complexity of a program. For doors1 and doors2
it is difficult to see an advantage (they have the same number of lines; doors1 can be
thought of as doors2 with a starting comment per line :-)

I certainly don't like 300 line programs to have 300 one-liners.


-marcel

-- warning: linelength exceeds 153 characters ----------------------------------------------------------------------------------------------------------
DOC
(*

	FORTH> see doors1								   FORTH> see doors2
	Flags: ANSI									   Flags: ANSI
	$01249A00  : doors1								   $01249AC0  : doors2
	$01249A0A  lea           rbp, [rbp -8 +] qword					   $01249ACA  lea           rbp, [rbp -8 +] qword
	$01249A0E  mov           [rbp 0 +] qword, $01249A1B d#				   $01249ACE  mov           [rbp 0 +] qword, $01249ADB d#
	$01249A16  jmp           open+10 ( $012499CA ) offset NEAR			   $01249AD6  jmp           HERE+10 ( $01138C42 ) offset NEAR
	$01249A1B  mov           rcx, #100 d#						   $01249ADB  pop           rbx
	$01249A22  mov           rbx, 1 d#						   $01249ADC  mov           rdi, [rsp] qword
	$01249A29  lea           rax, [rax 0 +] qword					   $01249AE0  push          rbx
	$01249A30  cmp           rbx, rcx						   $01249AE1  push          rdi
	$01249A33  push          rcx							   $01249AE2  lea           rbp, [rbp -8 +] qword
	$01249A34  jg            $01249A78 offset NEAR					   $01249AE6  mov           [rbp 0 +] qword, $01249AF3 d#
	$01249A3A  push          rbx							   $01249AEE  jmp           ERASE+10 ( $01138CD2 ) offset NEAR
	$01249A3B  push          rbx							   $01249AF3  mov           rbx, [rsp] qword
	$01249A3C  lea           rbx, [rbx $012491FF +] qword				   $01249AF7  push          rbx
	$01249A43  lea           rax, [rax 0 +] qword					   $01249AF8  xor           rbx, rbx
	$01249A48  cmp           rbx, $01249264 d#					   $01249AFB  pop           rcx
	$01249A4F  jge           $01249A6C offset NEAR					   $01249AFC  call          (DO) offset NEAR
	$01249A55  movzx         rdi, [rbx] byte					   $01249B06  nop
	$01249A59  xor           rdi, 1 b#						   $01249B07  nop
	$01249A5D  mov           [rbx] byte, dil					   $01249B08  push          rbx
	$01249A60  mov           rdi, [rsp] qword					   $01249B09  lea           rbp, [rbp -8 +] qword
	$01249A64  lea           rbx, [rbx rdi*1] qword					   $01249B0D  mov           [rbp 0 +] qword, $01249B1A d#
	$01249A68  jmp           $01249A48 offset SHORT					   $01249B15  jmp           HERE+10 ( $01138C42 ) offset NEAR
	$01249A6A  push          rbx							   $01249B1A  pop           rbx
	$01249A6B  pop           rbx							   $01249B1B  mov           rdi, [rsp] qword
	$01249A6C  pop           rdi							   $01249B1F  lea           rbx, [rbx rdi*1] qword
	$01249A6D  pop           rbx							   $01249B23  push          rbx
	$01249A6E  lea           rbx, [rbx 1 +] qword					   $01249B24  lea           rbp, [rbp -8 +] qword
	$01249A72  pop           rcx							   $01249B28  mov           [rbp 0 +] qword, $01249B35 d#
	$01249A73  jmp           $01249A30 offset SHORT					   $01249B30  jmp           HERE+10 ( $01138C42 ) offset NEAR
	$01249A75  push          rcx							   $01249B35  mov           rbx, [rbp 0 +] qword
	$01249A76  push          rbx							   $01249B39  pop           rdi
	$01249A77  pop           rbx							   $01249B3A  lea           rbx, [rdi rbx*1] qword
	$01249A78  pop           rdi							   $01249B3E  pop           rcx
	$01249A79  jmp           report+10 ( $0124994A ) offset NEAR			   $01249B3F  call          (DO) offset NEAR
	$01249A7E  ;									   $01249B49  lea           rax, [rax 0 +] qword
											   $01249B50  mov           rdi, [rbp 0 +] qword
	FORTH> see open									   $01249B54  movzx         rdi, [rdi] byte
	Flags: ANSI									   $01249B58  xor           rdi, 1 b#
	$012499C0  : open								   $01249B5C  mov           rax, [rbp 0 +] qword
	$012499CA  push          $01249200 d#						   $01249B60  mov           [rax] byte, dil
	$012499CF  push          #100 b#						   $01249B63  mov           rdi, [rbp #24 +] qword
	$012499D1  jmp           ERASE+10 ( $01138CD2 ) offset NEAR			   $01249B67  push          rbx
	$012499D6  ;									   $01249B68  lea           rbx, [rdi 1 +] qword
											   $01249B6C  add           [rbp 0 +] qword, rbx
	FORTH> see report								   $01249B70  add           [rbp 8 +] qword, rbx
	Flags: ANSI									   $01249B74  pop           rbx
	$01249940  : report								   $01249B75  jno           $01249B50 offset NEAR
	$0124994A  lea           rbp, [rbp -8 +] qword					   $01249B7B  add           rbp, #24 b#
	$0124994E  mov           [rbp 0 +] qword, $0124995B d#				   $01249B7F  add           [rbp 0 +] qword, 1 b#
	$01249956  jmp           CR+10 ( $0113909A ) offset NEAR			   $01249B84  add           [rbp 8 +] qword, 1 b#
	$0124995B  mov           rcx, $01249200 d#					   $01249B89  jno           $01249B08 offset NEAR
	$01249962  mov           rbx, #100 d#						   $01249B8F  add           rbp, #24 b#
	$01249969  lea           rax, [rax 0 +] qword					   $01249B93  push          rbx
	$01249970  cmp           rbx, 0 b#						   $01249B94  lea           rbp, [rbp -8 +] qword
	$01249974  push          rcx							   $01249B98  mov           [rbp 0 +] qword, $01249BA5 d#
	$01249975  je            $0124999E offset NEAR					   $01249BA0  jmp           CR+10 ( $0113909A ) offset NEAR
	$0124997B  push          rbx							   $01249BA5  push          $01246960 d#
	$0124997C  lea           rbp, [rbp -8 +] qword					   $01249BAA  push          #14 b#
	$01249980  mov           [rbp 0 +] qword, $0124998D d#				   $01249BAC  lea           rbp, [rbp -8 +] qword
	$01249988  jmp           .closed+10 ( $0124990A ) offset NEAR			   $01249BB0  mov           [rbp 0 +] qword, $01249BBD d#
	$0124998D  pop           rbx							   $01249BB8  jmp           TYPE+10 ( $01138EA2 ) offset NEAR
	$0124998E  pop           rdi							   $01249BBD  xor           rbx, rbx
	$0124998F  lea           rdi, [rdi 1 +] qword					   $01249BC0  pop           rcx
	$01249993  push          rdi							   $01249BC1  call          (DO) offset NEAR
	$01249994  lea           rbx, [rbx -1 +] qword					   $01249BCB  lea           rax, [rax 0 +] qword
	$01249998  pop           rcx							   $01249BD0  push          rbx
	$01249999  jmp           $01249970 offset SHORT					   $01249BD1  lea           rbp, [rbp -8 +] qword
	$0124999B  push          rcx							   $01249BD5  mov           [rbp 0 +] qword, $01249BE2 d#
	$0124999C  push          rbx							   $01249BDD  jmp           HERE+10 ( $01138C42 ) offset NEAR
	$0124999D  pop           rbx							   $01249BE2  mov           rbx, [rbp 0 +] qword
	$0124999E  pop           rdi							   $01249BE6  pop           rdi
	$0124999F  ;									   $01249BE7  cmp           [rdi rbx*1] byte, 0 b#
											   $01249BEB  je            $01249C0B offset NEAR
	FORTH> see .closed								   $01249BF1  mov           rbx, [rbp 0 +] qword
	Flags: ANSI									   $01249BF5  lea           rbx, [rbx 1 +] qword
	$01249900  : .closed								   $01249BF9  push          rbx
	$0124990A  pop           rbx							   $01249BFA  lea           rbp, [rbp -8 +] qword
	$0124990B  pop           rdi							   $01249BFE  mov           [rbp 0 +] qword, $01249C0B d#
	$0124990C  cmp           [rdi] byte, 0 b#					   $01249C06  jmp           .+10 ( $01139722 ) offset NEAR
	$0124990F  push          rdi							   $01249C0B  pop           rbx
	$01249910  je            $01249935 offset NEAR					   $01249C0C  add           [rbp 0 +] qword, 1 b#
	$01249916  mov           rdi, [rsp] qword					   $01249C11  add           [rbp 8 +] qword, 1 b#
	$0124991A  push          rbx							   $01249C16  jno           $01249BD0 offset NEAR
	$0124991B  lea           rbx, [rdi $FFFFFFFFFEDB6E01 +] qword			   $01249C1C  add           rbp, #24 b#
	$01249922  push          rbx							   $01249C20  push          rbx
	$01249923  lea           rbp, [rbp -8 +] qword					   $01249C21  ;
	$01249927  mov           [rbp 0 +] qword, $01249934 d#
	$0124992F  jmp           .+10 ( $01139722 ) offset NEAR
	$01249934  pop           rbx
	$01249935  push          rbx
	$01249936  ;

*)
ENDDOC

\ Factored

100 constant /doors
create 'doors  /doors allot 
'doors /doors + constant <doors 

: toggle    1 over c@ xor swap c! ;
: pass      dup 1- 'doors + begin dup <doors < while dup toggle over + repeat 2drop ;
: passes    /doors 1 begin 2dup >= while dup pass 1+ repeat 2drop ;
: .closed   over c@ if over 'doors - 1+ . then ;
: report    cr 'doors /doors begin dup while .closed 1 /string repeat 2drop ;
: open      'doors /doors erase ; 
: doors1    open passes report ; 

doors1


\ Unfactored 
: doors2 ( n -- )
  here over erase                    \ open all doors
  dup 0 do
    here over +  here i +  do
      i c@  1 xor  i c!              \ toggle
    j 1+ +loop
  loop
  cr ." Closed doors: "
  ( n ) 0 do  here i + c@ if  i 1+ .  then  loop ;

/doors doors2
