From: Keith Thompson
Newsgroups: comp.lang.c
Subject: Re: // comments and \
Date: Wed, 23 Dec 2015 13:00:06 -0800
Organization: None to speak of

"Charles Richmond" writes:
> "David Thompson" wrote in message
> news:undi7b5ct823427h92k2epmh9j1ppshvfq@4ax.com...
>> On Tue, 01 Dec 2015 09:21:12 -0800, Keith Thompson
>> wrote:
>>> It's actually a bit worse than that.  Consider this program:
>>>
>>> #include <stdio.h>
>>> int main(void) {
>>>     // \ 
>>>     puts("This is a comment");
>>>     puts("This is not a comment");
>>> }
>>>
>>> You can't see it, but there's a space after the backslash on line 3
>>> (assuming it's not stripped by somebody's news software).
>>
>> C90 7.9.2 and successors (all) say that for a text stream, which is
>> inherently at runtime in a hosted implementation, whether trailing
>> spaces 'appear' (or are lost/stripped) is I-D.
>>
>> Nothing says C source files must be C text files. But their content is
>> remarkably* text-like, and except for crosscompiler or freestanding,
>> given that you must implement runtime text files anyway, using them
>> also for source files isn't obviously insane.
>
> ISTM that the C standard (maybe all of them) allowed certain control
> characters in C source files.  In the past, people I know have embedded
> control-L to cause a form feed when the source is printed out.  I think
> there are a few other control characters allowed.
>
> This might still be considered a text file I suppose.

The characters allowed in a C source file are the members of the
*source character set*, which includes the *basic source character set*
as a subset.  The basic source character set includes the horizontal
tab, vertical tab, and form feed control characters, all of which are
white space characters.  Members of the source character set outside
the basic source character set are implementation-defined, and might
include additional control characters.

In addition, physical source characters are mapped to the source
character set in an implementation-defined manner in translation
phase 1.

This is described in N1570 sections 5.2.1 and 5.1.1.2.

None of this is necessarily relevant to the issue being discussed.
Possibly translation phase 1 *might* remove trailing spaces, but it's
not entirely clear that it's allowed to do so.  The standard says that
physical source characters are "mapped" to the source character set;
I'm not sure what that mapping is permitted to do.

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
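
To make the open question concrete, here is a minimal sketch in C that
models the two phases on a string.  It is not how any particular
compiler works.  splice_lines follows the phase 2 description in N1570
5.1.1.2 (each backslash immediately followed by a new-line is deleted).
strip_trailing_blanks is a hypothetical phase 1 mapping; whether an
implementation is permitted to apply it is exactly what is unclear.
All function names and the MAX_SRC limit are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_SRC = 256 };   /* arbitrary buffer limit for this sketch */

/* HYPOTHETICAL phase 1 mapping (an assumption, not taken from the
   standard): drop spaces and tabs that immediately precede a new-line.
   The output is never longer than the input. */
static void strip_trailing_blanks(const char *in, char *out)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; in[i] != '\0'; i++) {
        if (in[i] == '\n') {
            while (n > 0 && (out[n - 1] == ' ' || out[n - 1] == '\t'))
                n--;
        }
        out[n++] = in[i];
    }
    out[n] = '\0';
}

/* Phase 2 as described in N1570 5.1.1.2: delete each backslash that is
   immediately followed by a new-line, splicing the two lines. */
static void splice_lines(const char *in, char *out)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; in[i] != '\0'; i++) {
        if (in[i] == '\\' && in[i + 1] == '\n') {
            i++;          /* skip the new-line as well */
            continue;
        }
        out[n++] = in[i];
    }
    out[n] = '\0';
}

/* Count new-line characters, i.e. the number of terminated lines. */
static int count_lines(const char *s)
{
    int lines = 0;
    for (; *s != '\0'; s++)
        if (*s == '\n')
            lines++;
    return lines;
}

/* Number of lines left after phase 2, with or without the hypothetical
   phase 1 stripping applied first. */
static int spliced_line_count(const char *src, int strip_first)
{
    char phase1[MAX_SRC];
    char phase2[MAX_SRC];

    assert(src != NULL && strlen(src) < MAX_SRC);
    if (strip_first)
        strip_trailing_blanks(src, phase1);
    else
        strcpy(phase1, src);
    splice_lines(phase1, phase2);
    return count_lines(phase2);
}
```

Take the three-line string "// \\ \nputs(\"A\");\nputs(\"B\");\n", which
has a space between the backslash and the first new-line.  For it,
spliced_line_count(src, 0) yields 3: the backslash is not immediately
followed by a new-line, so nothing is spliced and the first puts stays
on its own line.  spliced_line_count(src, 1) yields 2: once the space
is stripped, the first puts is pulled into the comment.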