From: peter
Newsgroups: comp.lang.forth
Subject: Re: locals
Date: Sat, 25 Apr 2026 16:07:47 +0200
Message-ID: <20260425160747.00007f4a@tin.it>
References: <10qunhm$1nnbt$1@dont-email.me> <2026Apr25.064712@mips.complang.tuwien.ac.at> <87o6j74l1z.fsf@nightsong.com> <2026Apr25.084323@mips.complang.tuwien.ac.at> <2026Apr25.122216@mips.complang.tuwien.ac.at>

On Sat, 25 Apr 2026 10:22:16 GMT
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

> albert@spenarnc.xs4all.nl writes:
> >String handling and move operation are the exception, because
> >they are both simpler and faster in low level.
> >Simpler is the argument (especially for i86).
> >Faster is the bonus.
>
> In other words, Forth without locals is not well suited for words
> that have so much active data.  That is also reflected in hardware
> designed for Forth, which got additional registers like A or B (or
> additional capabilities for the top of the return stack register R),
> which make it simpler and faster to implement such words.
>
> A definition of STRCMP in the paper is
>
> : strcmp { addr1 u1 addr2 u2 -- n }
>   addr1 addr2
>   u1 u2 min 0
>   ?do { s1 s2 }
>     s1 c@ s2 c@ - ?dup
>     if
>       unloop exit
>     then
>     s1 char+ s2 char+
>   loop
>   2drop
>   u1 u2 - ;
>
> So in the loop we have a loop count (on the return stack), two cursors
> (s1 and s2) into the compared strings, and within the loop body we
> additionally have the two characters, for a total of five live values,
> three of which survive across iterations and are changed in every
> iteration.  One could implement it as
>
> \ untested, and the following versions, too
> : strcmp { addr1 u1 addr2 u2 -- n }
>   addr1 addr2
>   u1 u2 min 0
>   ?do
>     addr1 i + c@ addr2 i + c@ - ?dup
>     if
>       unloop exit
>     then
>   loop
>   u1 u2 - ;
>
> where only one of the values changes in each iteration, but now the
> ?DO...LOOP cannot be replaced with a version that does not store a
> second value but counts down (or up) to 0, so now we have a total of 6
> live values, four of which survive across iterations, and one is
> changed on every iteration.
>
> One can reduce this by one value by keeping one of the cursors in the
> loop counter:
>
> : strcmp {: addr1 u1 addr2 u2 -- n :}
>   addr2 addr1 - {: offset :}
>   u1 u2 min addr1 + addr1 ?do
>     i c@ i offset + c@ - ?dup
>     if
>       unloop exit
>     then
>   loop
>   u1 u2 - ;
>
> So now we have five live values in the body of the loop at the same
> time, three of which live across iterations, and one of which changes
> in each iteration.  Keeping the loop parameters separate significantly
> lessens the load on the data stack.
>
> Let's see if we can eliminate the local from the loop body:
>
> : strcmp {: addr1 u1 addr2 u2 -- n :}
>   addr2 addr1 - ( offset )
>   u1 u2 min addr1 + addr1 ?do ( offset )
>     dup i + c@ i c@ - ?dup
>     if
>       nip unloop exit
>     then
>   loop
>   drop u1 u2 - ;
>
> That leaves stack purists with the task of eliminating the locals from
> the prologue and epilogue of this word.
> Two items have to be stored
> across the loop, or the difference could be computed speculatively and
> only one item stored across the loop.  And the computations before the
> loop involve four values alive at the same time (fortunately addr2
> does not live long).  Let's see:
>
> : strcmp {: addr1 u1 addr2 u2 -- n :}
>   rot 2dup - >r ( addr1 addr2 u1 u2 R: n1 )
>   min -rot over - ( u12 addr1 offset R: n1 )
>   swap rot bounds ( offset limit start R: n1 )
>   ?do ( offset R: n1 loop-sys )
>     dup i + c@ i c@ - ?dup
>     if
>       nip unloop r> drop exit
>     then
>   loop
>   drop r> negate ;
>
> As can be seen by the many stack comments, the stack load here is more
> than I can easily deal with.
>
> Maybe a stack purist can improve on that.  But can he improve it
> enough to make it as easy to understand as any of the versions with
> locals?

I recently reviewed the string comparison for search-wordlist
and came up with the following.

The string stored in the word header is already uppercased,
so only the search string is uppercased and the comparison
is case-insensitive.

: UC ( c -- c' ) \ uppercase char
  dup $61 $7B within $20 and - ;

: NCOMP4 ( addr n addr' n' -- f ) \ 0 is match
  dup >r
  begin
    rot =
  while                   \ str cstr
    r> dup 1- >r
  while                   \ str cstr
    swap count uc         \ cstr str' s1
    rot count             \ str' s1 cstr' c1
  repeat
    2drop r> drop 0 exit
  then
  2drop r> drop 1 ;

In its first iteration the loop does not compare chars but the lengths!

BR
Peter

>
> - anton
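
To make the first-iteration trick in NCOMP4 easier to follow, here is a commented sketch with a small test word. UC and NCOMP4 are copied from the post; DEMO and its test strings are hypothetical additions for illustration. It assumes a Forth-2012 system (for the $ hex prefix) with case-insensitive word lookup.

```forth
\ Sketch only: UC and NCOMP4 are copied from the post; DEMO is a
\ hypothetical test word.  Assumes a Forth-2012 system ($ hex prefix)
\ with case-insensitive word lookup.

: UC ( c -- c' ) \ uppercase char
  dup $61 $7B within $20 and - ;

: NCOMP4 ( addr n addr' n' -- f ) \ 0 is match
  dup >r                  \ R: count of chars still to fetch
  begin
    rot =                 \ pass 1: n' = n ; later passes: c1 = s1
  while                   \ str cstr
    r> dup 1- >r          \ old count is the flag, count-1 goes back on R
  while                   \ str cstr
    swap count UC         \ cstr str' s1
    rot count             \ str' s1 cstr' c1
  repeat
    2drop r> drop 0 exit  \ count ran out with no mismatch: match
  then
  2drop r> drop 1 ;       \ lengths or chars differed: no match

: DEMO ( -- )
  s" dup" s" DUP"  NCOMP4 .    \ equal after uppercasing
  s" dup" s" DROP" NCOMP4 .    \ lengths differ, caught in the first pass
  s" dup" s" DUB"  NCOMP4 . ;  \ last char differs

DEMO  \ should print 0 1 1
```

The same `rot =` serves both purposes: on the first pass the two items under the addresses are the lengths, on every later pass they are the two characters just fetched, so no separate length check is needed before the loop.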