From: Tim Rentsch
Newsgroups: comp.lang.c
Subject: Re: Command line globber/tokenizer library for C?
Date: Tue, 17 Sep 2024 18:31:10 -0700
Message-ID: <867cb9agtd.fsf@linuxsc.com>
References: <20240912181625.00006e68@yahoo.com> <20240912223828.00005c10@yahoo.com> <861q1nfsjz.fsf@linuxsc.com> <20240915122211.000058b1@yahoo.com> <20240918024611.000002f3@yahoo.com>

Michael S writes:

> On Tue, 17 Sep 2024 22:34:33 -0000 (UTC)
> antispam@fricas.org wrote:
>
>> Michael S wrote:
>>
>>> On Fri, 13 Sep 2024 09:05:04 -0700
>>> Tim Rentsch wrote:
>>>
>>>> Michael S writes:
>>>>
>>>> [..iterate over words in a string..]
>>>>
>>>> I couldn't resist writing some code along similar lines.  The
>>>> entry point is words_do(), which returns one on success and
>>>> zero if the end of string is reached inside double quotes.
>>>>
>>>>
>>>> typedef struct gopher_s *Gopher;
>>>> struct gopher_s { void (*f)( Gopher, const char *, const char * ); };
>>>>
>>>> static _Bool collect_word( const char *, const char *, _Bool, Gopher );
>>>> static _Bool is_space( char );
>>>>
>>>>
>>>> _Bool
>>>> words_do( const char *s, Gopher go ){
>>>>     char c = *s;
>>>>
>>>>     return
>>>>         is_space(c) ? words_do( s+1, go )
>>>>         : c ? collect_word( s, s, 1, go )
>>>>         : /***************/ 1;
>>>> }
>>>>
>>>> _Bool
>>>> collect_word( const char *s, const char *r, _Bool w, Gopher go ){
>>>>     char c = *s;
>>>>
>>>>     return
>>>>         c == 0 ? go->f( go, r, s ), w
>>>>         : is_space(c) && w ? go->f( go, r, s ), words_do( s, go )
>>>>         : /***************/ collect_word( s+1, r, w ^ c == '"', go );
>>>> }
>>>>
>>>> _Bool
>>>> is_space( char c ){
>>>>     return c == ' ' || c == '\t';
>>>> }
>>>
>>
>>> Tested on godbolt.
>>> gcc -O2 turns it into iteration starting from v.4.4
>>> clang -O2 turns it into iteration starting from v.4.0
>>> Latest icc still does not turn it into iteration, at least along
>>> one code path.
>>> Latest MSVC implements it as written, 100% recursion.
>>
>> I tested using gcc 12.  AFAICS calls to 'go->f' in 'collect_word'
>> are not tail calls and gcc 12 compiles them as normal calls.
>
> Naturally.
>
>> The other calls are compiled to jumps.  But the call to 'collect_word'
>> in 'words_do' is not a "sibcall" and, depending on the calling
>> convention, the compiler may treat it as a normal call.  Two other
>> calls, that is the call to 'words_do' in 'words_do' and the call to
>> 'collect_word' in 'collect_word', are clearly tail self recursion
>> and the compiler should always optimize them to a jump.
>
> "Should" or not, MSVC does not eliminate them.
>
> The funny thing is that it does eliminate all four calls after I
> rewrote the code in a more boring style.
>
> _Bool
> words_do( const char *s, Gopher go ){
>     char c = *s;
> #if 1
>     if (is_space(c))
>         return words_do( s+1, go );
>     if (c)
>         return collect_word( s, s, 1, go );
>     return 1;
> #else
>     return
>         is_space(c) ? words_do( s+1, go ) :
>         c ? collect_word( s, s, 1, go ):
>         /***************/ 1;
> #endif
> }
>
> static
> _Bool
> collect_word( const char *s, const char *r, _Bool w, Gopher go ){
>     char c = *s;
> #if 1
>     if (c == 0) {
>         go->f( go, r, s );
>         return w;
>     }
>     if (is_space(c) && w) {
>         go->f( go, r, s );
>         return words_do( s, go );
>     }
>     return collect_word( s+1, r, w ^ c == '"', go );
> #else
>     return
>         c == 0 ? go->f( go, r, s ), w :
>         is_space(c) && w ? go->f( go, r, s ), words_do( s, go ) :
>         /***************/ collect_word( s+1, r, w ^ c == '"', go );
> #endif
> }

That's amusing. :)

Do you know if icc will do tail call elimination for the boring
version of the code?
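
For a compiler that leaves the recursion in place, the tail calls can also be removed by hand. The following is only a sketch and not code from the thread: words_do_loop(), struct counter, count_word() and count_words() are made-up names, and the loop is my reading of what the two recursive functions do, with each tail call turned into a parameter update and another pass around the loop. The Gopher type and is_space() are copied from the quoted code.

```c
typedef struct gopher_s *Gopher;
struct gopher_s { void (*f)( Gopher, const char *, const char * ); };

static _Bool
is_space( char c ){
    return c == ' ' || c == '\t';
}

/* Hypothetical loop form of words_do()/collect_word(). */
_Bool
words_do_loop( const char *s, Gopher go ){
    for(;;){
        /* words_do(): skip blanks; end of string here is success */
        while( is_space( *s ) )  s++;
        if( *s == 0 )  return 1;

        /* collect_word(): r marks the word start, w is 0 inside quotes */
        const char *r = s;
        _Bool w = 1;
        while( *s != 0  &&  !( is_space( *s ) && w ) ){
            w ^= *s == '"';     /* toggle on every double quote */
            s++;
        }
        go->f( go, r, s );      /* word is [r, s), quotes included */
        if( *s == 0 )  return w;
    }
}

/* Hypothetical callback state.  The gopher_s is the first member, so
   the callback may convert the Gopher back to the enclosing struct. */
struct counter { struct gopher_s base; int count; };

static void
count_word( Gopher go, const char *r, const char *s ){
    (void) r, (void) s;
    ((struct counter *) go)->count++;
}

/* Hypothetical helper: returns the number of words found in s and
   stores the result of words_do_loop() in *ok. */
int
count_words( const char *s, _Bool *ok ){
    struct counter c = { { count_word }, 0 };
    *ok = words_do_loop( s, &c.base );
    return c.count;
}
```

With this sketch, count_words( "  foo \"bar baz\"\tqux", &ok ) returns 3 and sets ok to 1, while an input that ends inside double quotes, such as "a \"b c", still reports its last word but sets ok to 0, matching the return convention described for words_do().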