Path: csiph.com!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: Rainer Weikusat
Newsgroups: comp.unix.shell,comp.unix.programmer,comp.lang.misc
Subject: Re: Command Languages Versus Programming Languages
Date: Thu, 21 Nov 2024 15:07:42 +0000
Lines: 29
Message-ID: <874j40sk01.fsf@doppelsaurus.mobileactivedefense.com>
References: <875xohbxre.fsf@doppelsaurus.mobileactivedefense.com>
Mime-Version: 1.0
Content-Type: text/plain
X-Trace: individual.net O3HcDZSMoC5bQXIZ0hwVuAeEjigWzFJ1OS8imXmOeGiqDPeXk=
Cancel-Lock: sha1:D+T9SN8ldpc3ZEPVfdyih7+3rWk= sha1:wzBVcF/2jEJO9xuiHEarTsOEDVY= sha256:yxX+5rPLFNjEL0zy75Ml/zE7579viIF4ErT31+sX0js=
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
Xref: csiph.com comp.unix.shell:25923 comp.unix.programmer:16554 comp.lang.misc:11051

cross@spitfire.i.gajendra.net (Dan Cross) writes:
> Rainer Weikusat wrote:
>>Janis Papanagnou writes:
>>
>>[...]
>>
>>> Personally I think that writing bulky procedural stuff for something
>>> like [0-9]+ can only be much worse, and that further abbreviations
>>> like \d+ are the better direction to go if targeting a good interface.
>>> YMMV.
>>
>>Assuming that p is a pointer to the current position in a string, e is a
>>pointer to the end of it (ie, point just past the last byte) and -
>>that's important - both are pointers to unsigned quantities, the 'bulky'
>>C equivalent of [0-9]+ is
>>
>>while (p < e && *p - '0' < 10) ++p;
>>
>>That's not too bad. And it's really a hell lot faster than a
>>general-purpose automaton programmed to recognize the same pattern
>>(which might not matter most of the time, but sometimes, it does).
>
> It's also not exactly right. `[0-9]+` would match one or more
> characters; this possibly matches 0 (ie, if `p` pointed to
> something that wasn't a digit).

The regex won't match any digits if there aren't any. In this case, the
match will fail. I didn't include the code for handling that because it
seemed pretty pointless for the example.
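
For reference, a minimal sketch of what the omitted handling might look like. It assumes p and e point to unsigned char; the function name match_digits is hypothetical and not taken from the thread. With unsigned char, *p is promoted to int before the subtraction, so the sketch converts the difference to unsigned explicitly to keep bytes below '0' from passing the range test. A return value of 0 corresponds to the case where [0-9]+ fails to match.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the post): returns the number of leading
 * ASCII digits in [p, e), i.e. the length of a [0-9]+ match anchored at p,
 * or 0 if the match fails because the first byte is not a digit.
 */
static size_t match_digits(const unsigned char *p, const unsigned char *e)
{
    const unsigned char *start = p;

    /*
     * *p is promoted to int before the subtraction, so the difference is
     * converted to unsigned explicitly; bytes below '0' then wrap around
     * to large values and fail the < 10 test.
     */
    while (p < e && (unsigned)(*p - '0') < 10u)
        ++p;

    /* Zero means "no digits here", which a caller treats as a failed match. */
    return (size_t)(p - start);
}
```

A caller would treat match_digits(p, e) == 0 as "no match" and otherwise advance p by the returned length.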