Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: luser droog
Newsgroups: comp.compilers
Subject: Re: Wrestling with phase 1 of a C compiler
Date: Thu, 15 Sep 2022 20:11:15 -0700 (PDT)
Organization: Compilers Central
Lines: 87
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-010@comp.compilers>
References: <22-09-001@comp.compilers> <22-09-004@comp.compilers> <22-09-008@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="9865"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, design
Posted-Date: 20 Sep 2022 11:17:38 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-09-008@comp.compilers>
Xref: csiph.com comp.compilers:3162

On Thursday, September 15, 2022 at 11:17:50 AM UTC-5, luser droog wrote:
> On Monday, September 12, 2022 at 2:46:47 PM UTC-5, christoph...@compiler-resources.com wrote:
> >And, your efforts to put the
> > state behind pointers, while necessary only get you part of the way there.
> That's also true. For the present case, I'll also need to dynamically allocate
> integer objects and keep them in the environment for the calling
> function which is one of these closures, a suspension function that converts
> the first element of a stream and returns a list with a suspension in the cdr.
>
> That's the only way I'll get row and column counters to exist on a per-file
> basis.

So here's the rest of it. This ought to do the whole phase 1 of the C
compilation process while also supplementing each byte with its row
and column numbers.
And it holds the state in the local environment of the closure, although
it has to create a new environment for each iteration because I don't
have a function for updating definitions (it's supposed to be "functional"
after all, as much as practical). The header file defines the names
POS_ROW, POS_COL, and POS_INPUT in an enum.

static fSuspension force_chars_with_positions;
static list position( object item, int *row, int *col );
static parser position_grammar( void );
static fOperator new_line;

/* Wrap the input in a suspension whose environment starts at row 0, col 0. */
list
chars_with_positions( list input ){
  return  Suspension( env( NIL_, 3,
                           Symbol(POS_ROW), Int( 0 ),
                           Symbol(POS_COL), Int( 0 ),
                           Symbol(POS_INPUT), input ),
                      force_chars_with_positions );
}

/* Convert the first element of the stream and return it consed onto
   a new suspension for the rest. */
list
force_chars_with_positions( list ev ){
  list input = assoc_symbol( POS_INPUT, ev );
  integer row = assoc_symbol( POS_ROW, ev );
  integer col = assoc_symbol( POS_COL, ev );
  static parser position_parser;  /* built once, on first use */
  if(  ! position_parser  ) position_parser = position_grammar();
  object result = parse( position_parser, input );
  if(  not_ok( result )  ) return  rest( rest( result ) );
  object payload = rest( result );
  /* position() updates the two counters in place through these pointers */
  list pos = position( first( payload ), &row->Int.i, &col->Int.i );
  return  cons( pos,
                Suspension( env( NIL_, 3,
                                 Symbol(POS_ROW), row,
                                 Symbol(POS_COL), col,
                                 Symbol(POS_INPUT), rest( payload ) ),
                            force_chars_with_positions ) );
}

/* Pair the item with ( row . col ); a newline bumps the row and
   resets the column, anything else bumps the column. */
static list
position( object item, int *row, int *col ){
  if(  valid( eq_int( '\n', item ) )  )
    return  cons( item, cons( Int( ++ *row ), Int( *col = 0 ) ) );
  else
    return  cons( item, cons( Int( *row ), Int( ++ *col ) ) );
}

/* Match "\r\n", '\r', or '\n' and replace it with new_line's result;
   otherwise accept any single item. */
static parser
position_grammar( void ){
  return  either( bind( ANY( str("\r\n"),
                             chr('\r'),
                             chr('\n') ),
                        Operator( NIL_, new_line ) ),
                  item() );
}

/* Every line-ending form collapses to a single '\n'. */
static object
new_line( list env, object input ){
  return  Int('\n');
}

I think it's pretty nice and readable, while hiding some of the magic.

One big unsolved issue is how to use a GC in C without access to the stack.
My only solution is to call the GC only from the top level, or otherwise
to carefully cultivate the root set and call the GC from near the top
level, where the root set is easy to manage. But as this is off topic here,
I'd invite any thoughts about user space GC over in comp.lang.c.
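
To make the "cultivate the root set" idea concrete, here is a minimal sketch of a mark-and-sweep collector that never looks at the C stack and only traces from variables that were registered by hand. Everything in it (obj, push_root, pop_roots, new_obj, collect, gc_demo) is a hypothetical name made up for illustration and is not part of the library used above. It assumes the allocator never triggers a collection on its own, so unrooted temporaries are safe until someone calls collect() explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cons-like object; NOT the object type from the post. */
typedef struct obj {
    struct obj *car, *cdr;  /* children; NULL means "no child" */
    struct obj *next;       /* chains every allocated object for the sweep */
    int marked;
} obj;

enum { MAX_ROOTS = 256 };

static obj   *all_objects;        /* head of the allocation chain */
static obj  **roots[MAX_ROOTS];   /* addresses of registered C variables */
static size_t nroots;

/* Register a C variable as a root. */
void push_root(obj **var) {
    assert(nroots < MAX_ROOTS);
    roots[nroots++] = var;
}

/* Unregister the n most recently pushed roots. */
void pop_roots(size_t n) {
    assert(n <= nroots);
    nroots -= n;
}

/* Allocate an object. This never collects, by design. */
obj *new_obj(obj *car, obj *cdr) {
    obj *o = malloc(sizeof *o);
    if (!o) abort();
    o->car = car;
    o->cdr = cdr;
    o->marked = 0;
    o->next = all_objects;
    all_objects = o;
    return o;
}

/* Mark everything reachable from o: recurse on car, loop on cdr. */
void mark(obj *o) {
    while (o && !o->marked) {
        o->marked = 1;
        mark(o->car);
        o = o->cdr;
    }
}

/* Mark from the explicit roots, free whatever stayed unmarked,
   and return the number of objects that survived. */
size_t collect(void) {
    for (size_t i = 0; i < nroots; i++)
        mark(*roots[i]);
    size_t live = 0;
    obj **link = &all_objects;
    while (*link) {
        obj *o = *link;
        if (o->marked) {
            o->marked = 0;      /* reset for the next collection */
            live++;
            link = &o->next;
        } else {
            *link = o->next;    /* unlink and release */
            free(o);
        }
    }
    return live;
}

/* Top-level driver: build data, root only what must survive, collect. */
size_t gc_demo(void) {
    obj *keep = new_obj(new_obj(NULL, NULL), new_obj(NULL, NULL));
    (void)new_obj(NULL, NULL);  /* unreachable garbage */
    push_root(&keep);
    size_t live = collect();    /* safe point: no temporaries in flight */
    pop_roots(1);
    return live;
}
```

The trade-off is the one described above: because collect() is only called at points where every live value has been registered, nothing deeper in the call chain has to worry about temporaries, but memory is not reclaimed until control gets back to such a point.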