From: Tim Rentsch
Newsgroups: comp.lang.c
Subject: Re: Programming exercise/challenge
Date: Mon, 07 Dec 2020 01:03:57 -0800
Organization: A noiseless patient Spider
Lines: 94
Message-ID: <86v9dehts2.fsf@linuxsc.com>
References: <86wnxwkyol.fsf@linuxsc.com> <871rg2rffu.fsf@bsb.me.uk>
Ben Bacarisse writes:
> Tim Rentsch writes:
>
>> Prompted by some recent discussion regarding 'goto' statements and
>> state machines, I would like to propose a programming exercise.
>> (It is perhaps a bit too large to be called an exercise, but not so
>> difficult that it deserves the label of challenge. On the other
>> hand there are some constraints so maybe challenge is apropos. In
>> any case somewhere in between those two bounds.)
>>
>> Short problem statement: a C program to remove comments from C
>> source input.
>
> Apologies for non-C code but I could not resist...
>
> In all the goto/no goto discussion it's easy to lose track of the fact
> that C does not necessarily have the best primitives for constructing
> this sort of program. I don't mean things like, say, complex regular
> expressions, or built-in grammar parsing (a la Perl6). I mean basic
> language constructs.
>
> Here is a solution in Haskell. It is deliberately not very
> "Haskelly" because I want to keep the code really simple for
> programmers unfamiliar with the language.
>
> main = do input <- getContents
>           putStr (stripComments (logicalLines input))
>
> stripComments [] = []
> stripComments ('/' : '/' : rest) = stripComments (dropWhile (/= '\n') rest)
> stripComments ('/' : '*' : rest) = ' ' : stripComments (nestedComment rest)
> stripComments (c : rest) = c : if c == '\'' || c == '"'
>                                then skipQuote c rest
>                                else stripComments rest
>
> nestedComment ('*' : '/' : rest) = rest
> nestedComment (c : rest) = nestedComment rest
> nestedComment [] = error "Unterminated comment"
>
> skipQuote q ('\\' : c : rest) = '\\' : c : skipQuote q rest
> skipQuote q (c : rest) = c : if c == q
>                              then stripComments rest
>                              else skipQuote q rest
> skipQuote q [] = error "Unterminated literal"
>
> logicalLines ('\\' : '\n' : rest) = logicalLines rest
> logicalLines (c : rest) = c : logicalLines rest
> logicalLines [] = []
>
> [..some explanation of Haskell language..]
It's a nice program, but it doesn't quite do what was asked for.
The problem statement says to remove comments. This program does
remove comments, but it also takes out line splices to replace
multiple physical lines with their corresponding logical line.
Line continuation sequences outside of comments should be
preserved in the output.
> Haskell has two big advantages for this exercise: pattern matched
> cases and lazy evaluation. [..explanation..]
Also one not mentioned: 'input <- getContents'. The analogous
facility in C would read the entire input into a character
array, which would make the C program easier to write.
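As a rough illustration of what a C counterpart to 'getContents' might look like, the sketch below reads an entire stream into one growing, NUL-terminated array. The name read_all and the starting capacity are assumptions made for this example; nothing like it appears in the thread.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything from `in` into a malloc'd,
   NUL-terminated buffer.  Returns NULL on allocation or read failure;
   on success *len receives the number of bytes read. */
char *read_all(FILE *in, size_t *len)
{
    size_t cap = 4096, used = 0;        /* starting capacity is arbitrary */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Always keep one byte free for the terminating NUL. */
        used += fread(buf + used, 1, cap - used - 1, in);
        if (used < cap - 1)
            break;                      /* short read: end of file or error */

        if (cap > SIZE_MAX / 2) {       /* doubling would overflow */
            free(buf);
            return NULL;
        }
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }

    if (ferror(in)) {
        free(buf);
        return NULL;
    }
    buf[used] = '\0';
    *len = used;
    return buf;
}
```

With the whole input in memory, a C program can look ahead by indexing, much as the Haskell patterns look ahead in the list. The caller is responsible for calling free() on the result.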
> Originally, I just wrote stripComments, but when I decided to
> include logical line processing (C's \ "continuation"
> mechanism) it was trivial to put it in. [...]
Line continuation sequences have to be processed if the original
problem statement is to be met. What was done here, with the
function 'logicalLines', is easy, but it changes the problem
being solved.
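One possible way to process splices without deleting them is to look past them only while deciding whether a comment starts, and to copy them through everywhere else. The following is a minimal sketch under the assumption that the input already sits in a NUL-terminated array; skip_splices and comment_opener are hypothetical names, and the comment body, the closing '*/', and literals would need similar care.

```c
#include <stddef.h>

/* Return the index of the first character at or after i that is not
   part of a backslash-newline pair.  `s` must be NUL-terminated. */
size_t skip_splices(const char *s, size_t i)
{
    while (s[i] == '\\' && s[i + 1] == '\n')
        i += 2;
    return i;
}

/* Classify what begins at s[i]:
     0 = not a comment, 1 = "//" comment, 2 = block comment.
   Splices between the two opener characters are looked through, so
   a '/' followed by backslash-newline and another '/' still counts.
   When a comment starts, *body is set to the index just past the opener. */
int comment_opener(const char *s, size_t i, size_t *body)
{
    if (s[i] != '/')
        return 0;

    size_t j = skip_splices(s, i + 1);
    if (s[j] == '/') { *body = j + 1; return 1; }
    if (s[j] == '*') { *body = j + 1; return 2; }
    return 0;
}
```

When comment_opener returns 0, the caller simply emits s[i] and moves on, so a backslash-newline outside a comment reaches the output untouched.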
> I am now tempted to try to write a C version that preserves as much
> of what I see as the clarity and simplicity of this solution. I am
> not as hopeful as I'd like to be about that!
I am a fan of Haskell as you know, and I enjoyed reading your
program here. However, what I think would be more in keeping
with the spirit of the requested problem is to write in Haskell
something that is more like what might be done in a C program.
Output might be done functionally, producing a list as a result,
but input should be one character at a time, like what would
happen in a normal C program running a state machine. Doing that
would make the program more accessible to those people who are
experienced C programmers but lack experience in Haskell or
functional programming. That strikes me as a better fit for a
comp.lang.c posting.
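For readers who want a picture of the one-character-at-a-time style referred to here, the sketch below shows the bare shape of such a state machine in C. It is an illustration only, with hypothetical state names, and it deliberately leaves out line-splice handling, so it is not a solution to the exercise as posed.

```c
#include <stdio.h>

enum state { CODE, SLASH, LINE_COMMENT, BLOCK_COMMENT, BLOCK_STAR,
             QUOTE, QUOTE_ESCAPE };

/* Copy `in` to `out` without comments, reading one character at a time.
   Simplification: backslash-newline splices are NOT handled. */
void strip_comments(FILE *in, FILE *out)
{
    enum state st = CODE;
    int c, quote = 0;

    while ((c = getc(in)) != EOF) {
        switch (st) {
        case CODE:
            if (c == '/') { st = SLASH; break; }    /* hold the '/' back */
            if (c == '"' || c == '\'') { quote = c; st = QUOTE; }
            putc(c, out);
            break;
        case SLASH:
            if (c == '/') {
                st = LINE_COMMENT;
            } else if (c == '*') {
                putc(' ', out);         /* a block comment becomes one space */
                st = BLOCK_COMMENT;
            } else {
                putc('/', out);         /* it was an ordinary '/' after all */
                putc(c, out);
                if (c == '"' || c == '\'') { quote = c; st = QUOTE; }
                else st = CODE;
            }
            break;
        case LINE_COMMENT:
            if (c == '\n') { putc(c, out); st = CODE; }
            break;
        case BLOCK_COMMENT:
            if (c == '*') st = BLOCK_STAR;
            break;
        case BLOCK_STAR:
            if (c == '/') st = CODE;
            else if (c != '*') st = BLOCK_COMMENT;  /* stay here on "**" */
            break;
        case QUOTE:
            putc(c, out);
            if (c == '\\') st = QUOTE_ESCAPE;
            else if (c == quote) st = CODE;
            break;
        case QUOTE_ESCAPE:
            putc(c, out);               /* escaped character, copied as-is */
            st = QUOTE;
            break;
        }
    }
    if (st == SLASH)
        putc('/', out);                 /* input ended right after a '/' */
}
```

A filter would call strip_comments(stdin, stdout). Unlike the Haskell version, this sketch does not report unterminated comments or literals.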
Of course, independent of that, I would also like to see your
attempt at a C program that solves the problem as originally posed.
So I hope we will see one or both of those programs from you
sometime soon.