| Path | csiph.com!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail |
|---|---|
| From | David Brown <david.brown@hesbynett.no> |
| Newsgroups | comp.arch.embedded |
| Subject | Re: Static regex for embedded systems |
| Date | Wed, 22 Jan 2025 10:59:03 +0100 |
| Organization | A noiseless patient Spider |
| Lines | 72 |
| Message-ID | <vmqfh7$uiuc$1@dont-email.me> (permalink) |
| References | <vmob4o$3ssqn$2@dont-email.me> <vmok15.1gs.1@stefan.msgid.phost.de> <vmok1j$3ssqn$3@dont-email.me> <9me0pjpctevm2k0vjf07iei0a1isf58tqa@4ax.com> |
| MIME-Version | 1.0 |
| Content-Type | text/plain; charset=UTF-8; format=flowed |
| Content-Transfer-Encoding | 7bit |
| Injection-Date | Wed, 22 Jan 2025 10:59:05 +0100 (CET) |
| Injection-Info | dont-email.me; posting-host="6f7dcd413dcfed51b6dbd849bb948bac"; logging-data="1002444"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19F+CYdfPoqX5VgzIYMEIooqoo6O3xZ14Y=" |
| User-Agent | Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0 |
| Cancel-Lock | sha1:xjaIOB5io+2MlLPYUiLQ87GZqus= |
| In-Reply-To | <9me0pjpctevm2k0vjf07iei0a1isf58tqa@4ax.com> |
| Content-Language | en-GB |
| Xref | csiph.com comp.arch.embedded:32323 |
On 22/01/2025 01:38, George Neuner wrote:
> On Tue, 21 Jan 2025 18:03:48 +0100, pozz <pozzugno@gmail.com> wrote:
>
>> On 21/01/2025 17:03, Stefan Reuther wrote:
>>> On 21.01.2025 at 15:31, pozz wrote:
>>>> Many times I need to parse/decode a text string that comes from an
>>>> external system, over a serial bus, MQTT, and so on.
>>>>
>>>> Many times this string has a fixed syntax/layout. In order to parse this
>>>> string, I every time create a custom parser that can be tedious,
>>>> cumbersome and error prone.
>>> [...]
>>>
>>> I don't see a question in this posting,
>>
>> The hidden question was if there's a better approach than handcrafted
>> parsers.
>>
>>
>>> but isn't this the task that
>>> 'lex' is intended to be used for?
>>
>> I will look at it.
>>
>>
>>> (Personally, I have no problem with handcrafted parsers.)
>
> So long as they are correct 8-)
>

This is vital. You want a /lot/ of test cases to check the algorithm.

>
>>> Stefan
>
> Lex and Flex create table driven lexers (and driver code for them).
> Under certain circumstances Flex can create far smaller tables than
> Lex, but likely either would be massive overkill for the scenario you
> described.
>
> Minding David's warnings about lexer size, if you really want to try
> using regex, I would recommend RE2C. RE2C is a preprocessor that
> generates simple recursive code to directly implement matching of
> regex strings in your code. There are versions available for several
> languages.
> https://re2c.org/
>

The "best" solution depends on the OP's knowledge, the variety of the
patterns needed, the resources of the target system, and restrictions on
things like programming language support. For example, the C++ template
based project I suggested earlier (which I have not tried myself) should
give quite efficient results, but it requires a modern C++ compiler.

I think if the OP is only looking for a few patterns, or styles of
pattern, then regexes and powerful code generator systems are overkill.
It will take more work to learn and understand them, and code generated
by tools like lex and flex is not designed to be human-friendly, nor is
it likely to match well with coding standards for small embedded systems.

I'd probably just have a series of matcher functions for different parts
(fixed string, numeric field as integer, flag field as boolean, etc.)
and have manual parsers for the different message types. As a C++ user
I'd be returning std::optional<> types here and using the new C++23
"and_then" methods to give neat chains, but a C programmer might want to
pass a pointer to a value variable and return "bool" for success.

If I had a lot of such patterns to match, then I might use templates for
generating the higher level matchers - for C, it would be either a macro
system or an external Python script.

Or just use sscanf() :-)
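A minimal sketch of the C variant described above, in case it helps: small matcher functions that take a pointer to a value variable and return "bool", chained with && into a manual parser. The message layout "TEMP=<int>,ALARM=<0|1>", the struct, and all function names are made up for illustration and are not from any real protocol or library. An sscanf() version of the same parser is included for comparison.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Each matcher takes a cursor into the input. On success it advances the
   cursor past the matched text and returns true; on failure it leaves the
   cursor where it was and returns false. */

/* Match a fixed string such as "TEMP=". */
static bool match_literal(const char **cursor, const char *literal)
{
    size_t len = strlen(literal);
    if (strncmp(*cursor, literal, len) != 0)
        return false;
    *cursor += len;
    return true;
}

/* Match a decimal integer field (optional sign) and store it in *value. */
static bool match_int(const char **cursor, int *value)
{
    unsigned char first = (unsigned char)**cursor;
    char *end;
    long v;

    /* strtol() would skip leading whitespace; reject it explicitly. */
    if (!isdigit(first) && first != '-' && first != '+')
        return false;
    errno = 0;
    v = strtol(*cursor, &end, 10);
    if (end == *cursor || errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return false;
    *value = (int)v;
    *cursor = end;
    return true;
}

/* Match a flag field encoded as '0' or '1' and store it in *flag. */
static bool match_flag(const char **cursor, bool *flag)
{
    char c = **cursor;
    if (c != '0' && c != '1')
        return false;
    *flag = (c == '1');
    *cursor += 1;
    return true;
}

/* Hypothetical message layout: "TEMP=<int>,ALARM=<0|1>" */
struct reading {
    int temp;
    bool alarm;
};

/* Manual parser built from the matchers. *out is written only on success. */
static bool parse_reading(const char *text, struct reading *out)
{
    const char *p = text;
    struct reading r = {0};
    bool ok = match_literal(&p, "TEMP=")
           && match_int(&p, &r.temp)
           && match_literal(&p, ",ALARM=")
           && match_flag(&p, &r.alarm)
           && *p == '\0';           /* reject trailing garbage */
    if (ok)
        *out = r;
    return ok;
}

/* The same layout using sscanf(). %n records how many characters were
   consumed so trailing garbage can be rejected. Note that %d is more
   lenient than match_int(): it skips leading whitespace and does not
   give defined behaviour on out-of-range input. */
static bool parse_reading_sscanf(const char *text, struct reading *out)
{
    int temp, alarm, consumed = 0;
    if (sscanf(text, "TEMP=%d,ALARM=%d%n", &temp, &alarm, &consumed) != 2)
        return false;
    if (text[consumed] != '\0' || (alarm != 0 && alarm != 1))
        return false;
    out->temp = temp;
    out->alarm = (alarm == 1);
    return true;
}
```

Calling parse_reading("TEMP=23,ALARM=1", &r) fills in r and returns true, while input with a bad flag value or trailing characters returns false and leaves r alone. Since the matchers are tiny and have no hidden state, each can be covered by the /lot/ of test cases mentioned above.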