From: Tim Rentsch
Newsgroups: comp.lang.c
Subject: Re: Command line globber/tokenizer library for C?
Date: Tue, 17 Sep 2024 18:31:10 -0700
Organization: A noiseless patient Spider
Lines: 123
Message-ID: <867cb9agtd.fsf@linuxsc.com>
References: <20240912181625.00006e68@yahoo.com> <20240912223828.00005c10@yahoo.com> <861q1nfsjz.fsf@linuxsc.com> <20240915122211.000058b1@yahoo.com> <20240918024611.000002f3@yahoo.com>
Michael S writes:
> On Tue, 17 Sep 2024 22:34:33 -0000 (UTC)
> antispam@fricas.org wrote:
>
>> Michael S wrote:
>>
>>> On Fri, 13 Sep 2024 09:05:04 -0700
>>> Tim Rentsch wrote:
>>>
>>>> Michael S writes:
>>>>
>>>> [..iterate over words in a string..]
>>>>
>>>> I couldn't resist writing some code along similar lines. The
>>>> entry point is words_do(), which returns one on success and
>>>> zero if the end of string is reached inside double quotes.
>>>>
>>>>
>>>> typedef struct gopher_s *Gopher;
>>>> struct gopher_s { void (*f)( Gopher, const char *, const char * ); };
>>>>
>>>> static _Bool collect_word( const char *, const char *, _Bool, Gopher );
>>>> static _Bool is_space( char );
>>>>
>>>>
>>>> _Bool
>>>> words_do( const char *s, Gopher go ){
>>>>     char c = *s;
>>>>
>>>>     return
>>>>         is_space(c)  ?  words_do( s+1, go )
>>>>       : c            ?  collect_word( s, s, 1, go )
>>>>       : /***************/  1;
>>>> }
>>>>
>>>> _Bool
>>>> collect_word( const char *s, const char *r, _Bool w, Gopher go ){
>>>>     char c = *s;
>>>>
>>>>     return
>>>>         c == 0            ?  go->f( go, r, s ),  w
>>>>       : is_space(c) && w  ?  go->f( go, r, s ),  words_do( s, go )
>>>>       : /***************/    collect_word( s+1, r, w ^ c == '"', go );
>>>> }
>>>>
>>>> _Bool
>>>> is_space( char c ){
>>>>     return  c == ' '  ||  c == '\t';
>>>> }
>>>
>>>
>>
>>
>>
>>> Tested on godbolt.
>>> gcc -O2 turns it into iteration starting from v.4.4
>>> clang -O2 turns it into iteration starting from v.4.0
>>> Latest icc still does not turn it into iteration along at least one
>>> code path.
>>> Latest MSVC implements it as written, 100% recursion.
>>
>> I tested using gcc 12. AFAICS the calls to 'go->f' in 'collect_word'
>> are not tail calls, and gcc 12 compiles them as normal calls.
>
> Naturally.
>
>> The other calls are compiled to jumps. But the call to 'collect_word'
>> in 'words_do' is not a "sibcall", and depending on the calling
>> convention the compiler may treat it as a normal call. The two other
>> calls, that is, the call to 'words_do' in 'words_do' and the call to
>> 'collect_word' in 'collect_word', are clearly tail self-recursion and
>> the compiler should always optimize them to a jump.
>
> "Should" or not, MSVC does not eliminate them.
>
> The funny thing is that it does eliminate all four calls after I rewrote
> the code in more boring style.
>
> _Bool
> words_do( const char *s, Gopher go ){
>     char c = *s;
> #if 1
>     if (is_space(c))
>         return words_do( s+1, go );
>     if (c)
>         return collect_word( s, s, 1, go );
>     return 1;
> #else
>     return
>         is_space(c) ? words_do( s+1, go ) :
>         c           ? collect_word( s, s, 1, go ) :
>         /***************/ 1;
> #endif
> }
>
> static
> _Bool
> collect_word( const char *s, const char *r, _Bool w, Gopher go ){
>     char c = *s;
> #if 1
>     if (c == 0) {
>         go->f( go, r, s );
>         return w;
>     }
>     if (is_space(c) && w) {
>         go->f( go, r, s );
>         return words_do( s, go );
>     }
>     return collect_word( s+1, r, w ^ c == '"', go );
> #else
>     return
>         c == 0           ? go->f( go, r, s ), w :
>         is_space(c) && w ? go->f( go, r, s ), words_do( s, go ) :
>         /***************/  collect_word( s+1, r, w ^ c == '"', go );
> #endif
> }
That's amusing. :)
Do you know if icc will do tail call elimination for
the boring version of the code?
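
For comparison, here is a rough sketch of the loop form that full tail
call elimination should amount to, for any compiler that declines to
do the transformation itself. It is meant to behave the same as
words_do()/collect_word() above, but it is a hand translation rather
than anything a compiler produced. The names words_do_iter(),
struct counter, count_one() and count_words() are made up for
illustration and are not part of the code posted earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct gopher_s *Gopher;
struct gopher_s { void (*f)( Gopher, const char *, const char * ); };

static bool
is_space( char c ){
    return  c == ' '  ||  c == '\t';
}

/* Hypothetical loop version of words_do()/collect_word(): both self
 * tail calls and the two cross calls become plain control flow.
 * Returns true on success, false if the string ends inside quotes. */
bool
words_do_iter( const char *s, Gopher go ){
    assert( s != NULL  &&  go != NULL );

    for(;;){
        /* words_do(): skip leading blanks, stop at end of string */
        while(  is_space( *s )  )  s++;
        if(  *s == 0  )  return  true;

        /* collect_word(): w is true while outside double quotes */
        const char *r = s;
        bool        w = true;
        while(  *s != 0  &&  !( is_space( *s ) && w )  ){
            w ^= *s == '"';
            s++;
        }

        go->f( go, r, s );              /* word is the range [r, s) */
        if(  *s == 0  )  return  w;
    }
}

/* Illustrative gopher (an assumption, not from the thread): embeds
 * struct gopher_s as its first member and counts the words seen. */
struct counter { struct gopher_s base; int n; };

void
count_one( Gopher go, const char *begin, const char *end ){
    (void) begin;  (void) end;
    ((struct counter *) go)->n++;
}

bool
count_words( const char *s, int *n ){
    struct counter c = { { count_one }, 0 };
    bool ok = words_do_iter( s, &c.base );
    *n = c.n;
    return  ok;
}
```

The calls to go->f() remain ordinary calls, as in the recursive
version; only the four calls between words_do() and collect_word()
are replaced. Putting the struct gopher_s first in struct counter is
one way of letting a callback carry state, since a pointer to the
structure may be converted to a pointer to its first member and back.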