Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.c > #383179

Re: Implicit String-Literal Concatenation

From Keith Thompson <Keith.S.Thompson+u@gmail.com>
Newsgroups comp.lang.c
Subject Re: Implicit String-Literal Concatenation
Date 2024-02-29 09:03 -0800
Organization None to speak of
Message-ID <87a5njtd72.fsf@nosuchdomain.example.com> (permalink)
References (3 earlier) <urlgfn$3d1ah$3@dont-email.me> <urlmo7$3eg2j$1@dont-email.me> <urn6sv$3s62i$2@dont-email.me> <877ciouua2.fsf@nosuchdomain.example.com> <urphlj$ejuu$1@dont-email.me>

Show all headers | View raw


David Brown <david.brown@hesbynett.no> writes:
> On 28/02/2024 22:57, Keith Thompson wrote:
>> David Brown <david.brown@hesbynett.no> writes:
>> [...]
>>> They won't use strings, they will use data blobs - binary data.  Then
>>> there is no issue with null bytes.  And yes, implementations will skip
>>> the token generation (unless you are doing something weird, such as
>>> using #embed to read the parameters to a function call).
>>>
>>> Tests with prototype implementations gave extremely fast results.
>> I'm not sure how that would work.  #embed is a preprocessor
>> directive,
>> and at least in the abstract model it has to expand to valid C code.
>> I would have expected that it would simply generate the list of
>> comma-separated integer constants described in the standard; later
>> phases would simply parse that list and generate code as if that
>> sequence had been written in the original source file.  Do you know of
>> an implementation that does something else?
>
> The key thing, as I understand it, is that the compiler gets to know
> that the integers in the list are all "nice".  And since the 
> preprocessor and the compiler are part of the same implementation
> (even if they are separate programs communicating with pipes or
> temporary files), the preprocessor could pass on the binary blob in a
> pre-parsed form.
[...]

Sure, an implementation *could* optimize #embed so it expands to some
implementation-defined nonstandard form that later phases can treat as
raw data.  But since it's defined as a preprocessor directive, it's
difficult to see how it could do so while covering all cases.

[...]

> The results of testing are that #embed is /massively/ faster and lower
> memory compared to external generators, especially for larger files. 
> And it gives you the data on-hand for optimisation purposes, unlike
> external direct linking of binary blobs.  (So you can get the size of 
> the array, or use values from it as compile-time known values.)

What testing?  The very latest versions of gcc and clang (I checked both
their git repos yesterday) do not yet implement #embed.

>> For example, say you have a file "foo.dat" containing 4 bytes with
>> values 0, 1, 2, and 3.  This would be perfectly valid:
>>      struct foo {
>>          unsigned char a;
>>          unsigned short b;
>>          unsigned int c;
>>          double d;
>>      };
>>      struct foo obj = {
>> #embed "foo.dat"
>>      };
>> #embed isn't defined to translate an input file to a sequence of
>> bytes.
>> It's defined to translate an input file to a sequence of integer
>> constant expressions.
>
> Yes.  But the prime speed (and memory usage) gains come in, are for
> large files, and that means array initialisers.  That does not
> conflict with using it for cases like yours.

So a compiler that does this would have to be able to handle

    struct foo obj = {
#blob
<binary data>
#endblob>
    };

and initialize a, b, c, and d to 0, 1, 2, and 3.0, respectively from
successive bytes of the binary data.  Either that, or the preprocessor
would have to use information it doesn't have to determine how to expand
#embed.

>> *Maybe* a compiler could optimize for the case where it knows that it's
>> being used to initialize an array of unsigned char, but (a) that would
>> require the preprocessor to have information that normally doesn't exist
>> until later phases, and (b) I'm not convinced it would be worth the
>> effort.
>
> Look at
> <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1040r6.html#design-practice-speed>.
>
> In those tests, for a 40 MB file gcc #embed is 200 times faster than
> "xxd -i" generated files, and takes about 2.5% of the memory.  It
> scales to 1 GB files.  And that's just a proof-of-concept
> implementation.

That's for std::embed, a proposed C++ feature that's *not* defined as a
preprocessor directive.  Sample usage from the paper:

    constexpr std::span<const std::byte> fxaa_binary = 
        std::embed( "fxaa.spirv" );

So the compiler knows the type of the object being initialized.

(Note that the author of that C++ paper is also the editor for the C
standard.)

I'm still skeptical that C's #embed will actually be implemented other
than as expanding to a sequence of integer constants.

On the other hand, C23 allows for additional implementation-defined
parameters to #embed (as well as the standard embed parameters limit,
prefix, suffix, and is_empty).  Such a parameter could specify how it's
expanded, perhaps to some implementation-defined blob format.  *If*
compilers optimize #embed to something other than a sequence of integer
constant expressions, that's probably how it would be done.  But since
neither gcc nor clang implements #embed at all, it may be too early to
speculate.

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */

Back to comp.lang.c | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-24 23:05 +0000
  Re: Implicit String-Literal Concatenation Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-02-25 17:38 +0100
    Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-25 20:43 +0000
      Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-25 21:20 +0000
  Re: Implicit String-Literal Concatenation Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> - 2024-02-25 16:45 +0000
  Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-25 20:25 +0000
  Re: Implicit String-Literal Concatenation Łukasz 'Maly' Ostrowski <l3vi4than@gmail.com> - 2024-02-26 21:12 +0100
    Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-26 20:31 +0000
      Re: Implicit String-Literal Concatenation Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-02-27 13:18 +0100
        Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-27 23:10 +0000
          Re: Implicit String-Literal Concatenation Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-02-28 00:50 +0100
  Re: Implicit String-Literal Concatenation Kaz Kylheku <433-929-6894@kylheku.com> - 2024-02-26 20:42 +0000
  Re: Implicit String-Literal Concatenation porkchop@invalid.foo (Mike Sanders) - 2024-02-26 22:03 +0000
    Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-26 23:17 +0000
      Re: Implicit String-Literal Concatenation porkchop@invalid.foo (Mike Sanders) - 2024-02-27 17:27 +0000
    Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-27 09:36 +0100
      Re: Implicit String-Literal Concatenation porkchop@invalid.foo (Mike Sanders) - 2024-02-27 17:31 +0000
      Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-27 18:56 +0000
        Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-27 23:21 +0100
          Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-27 22:52 +0000
            Re: Implicit String-Literal Concatenation Kaz Kylheku <433-929-6894@kylheku.com> - 2024-02-28 01:09 +0000
              Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-28 12:50 +0100
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-28 20:56 +0000
                Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-28 21:34 +0000
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-28 23:52 +0000
                Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-29 00:15 +0000
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-29 02:53 +0000
                Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-29 09:20 +0000
                Re: Implicit String-Literal Concatenation scott@slp53.sl.home (Scott Lurndal) - 2024-02-29 15:48 +0000
                Re: Implicit String-Literal Concatenation Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-02-29 17:03 +0100
                Re: Implicit String-Literal Concatenation scott@slp53.sl.home (Scott Lurndal) - 2024-02-29 16:17 +0000
                Re: Implicit String-Literal Concatenation Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-02-29 18:12 +0100
                Re: Implicit String-Literal Concatenation scott@slp53.sl.home (Scott Lurndal) - 2024-02-29 17:30 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 13:20 -0800
                Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-29 21:44 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 14:06 -0800
                Re: Implicit String-Literal Concatenation Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-03-01 18:09 +0100
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-03-01 10:49 -0800
                Re: Implicit String-Literal Concatenation Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-03-01 22:06 +0100
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 09:20 -0800
                Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-29 08:58 +0100
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-29 21:05 +0000
                Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-03-01 09:16 +0100
                Re: Implicit String-Literal Concatenation Kaz Kylheku <433-929-6894@kylheku.com> - 2024-03-01 16:55 +0000
                Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-03-01 18:28 +0100
      Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-27 20:25 +0000
        Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-27 12:35 -0800
          Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-27 23:03 +0000
        Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-27 22:12 +0000
          Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-28 12:54 +0100
            Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-28 13:13 +0000
              Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-28 15:08 +0100
              Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-28 13:36 -0800
                Re: Implicit String-Literal Concatenation Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-02-29 11:56 +0000
                Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-29 16:19 +0100
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-29 21:36 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 13:53 -0800
                Re: Implicit String-Literal Concatenation Richard Harnden <richard.nospam@gmail.invalid> - 2024-03-01 12:59 +0000
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-03-01 20:59 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 08:08 -0800
                Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-29 14:31 +0000
                Re: Implicit String-Literal Concatenation Richard Harnden <richard.nospam@gmail.invalid> - 2024-02-29 15:22 +0000
                Re: Implicit String-Literal Concatenation "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-02-29 13:10 -0800
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 13:45 -0800
                Re: Implicit String-Literal Concatenation "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-02-29 14:03 -0800
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 14:14 -0800
                Re: Implicit String-Literal Concatenation "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-03-02 13:48 -0800
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-03-05 04:48 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-03-04 20:55 -0800
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-03-07 21:08 +0000
                Re: Implicit String-Literal Concatenation Kaz Kylheku <433-929-6894@kylheku.com> - 2024-03-07 21:44 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-03-07 14:25 -0800
                Re: Implicit String-Literal Concatenation Kaz Kylheku <433-929-6894@kylheku.com> - 2024-03-07 23:00 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-03-07 15:46 -0800
                Re: Implicit String-Literal Concatenation "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-03-07 16:17 -0800
                Re: Implicit String-Literal Concatenation Richard Harnden <richard.nospam@gmail.invalid> - 2024-03-08 00:26 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-03-07 14:16 -0800
                Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-29 16:30 +0100
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 08:25 -0800
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 08:18 -0800
                Re: Implicit String-Literal Concatenation Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-02-29 18:17 +0100
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 09:22 -0800
                Re: Implicit String-Literal Concatenation Kaz Kylheku <433-929-6894@kylheku.com> - 2024-02-29 19:26 +0000
                Re: Implicit String-Literal Concatenation James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-02-29 14:45 -0500
              Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-28 13:41 -0800
            Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-28 13:57 -0800
              Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-28 23:01 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-28 15:31 -0800
                Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-29 00:47 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-28 17:12 -0800
                Re: Implicit String-Literal Concatenation tTh <tth@none.invalid> - 2024-02-29 16:29 +0100
                Re: Implicit String-Literal Concatenation scott@slp53.sl.home (Scott Lurndal) - 2024-02-29 16:15 +0000
                Re: Implicit String-Literal Concatenation scott@slp53.sl.home (Scott Lurndal) - 2024-02-29 15:53 +0000
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 09:06 -0800
                Re: Implicit String-Literal Concatenation scott@slp53.sl.home (Scott Lurndal) - 2024-02-29 17:28 +0000
                Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-29 18:58 +0100
                Re: Implicit String-Literal Concatenation scott@slp53.sl.home (Scott Lurndal) - 2024-02-29 18:05 +0000
                Re: Implicit String-Literal Concatenation scott@slp53.sl.home (Scott Lurndal) - 2024-02-29 18:09 +0000
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-02-29 21:27 +0000
                Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-03-01 11:52 +0000
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-03-05 04:47 +0000
                Re: Implicit String-Literal Concatenation scott@slp53.sl.home (Scott Lurndal) - 2024-03-05 15:09 +0000
                Re: Implicit String-Literal Concatenation Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-03-06 01:49 +0000
                Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-29 20:51 +0100
              Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-29 10:10 +0100
                Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-02-29 10:18 +0000
                Re: Implicit String-Literal Concatenation tTh <tth@none.invalid> - 2024-02-29 16:34 +0100
                Re: Implicit String-Literal Concatenation bart <bc@freeuk.com> - 2024-03-01 11:58 +0000
                Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-03-01 13:17 +0100
                Re: Implicit String-Literal Concatenation Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-02-29 09:03 -0800
                Re: Implicit String-Literal Concatenation David Brown <david.brown@hesbynett.no> - 2024-02-29 19:08 +0100

csiph-web