

Groups > comp.lang.c > #64771 > unrolled thread

printf format specifier changes

Started by: "Rick C. Hodgin" <rick.c.hodgin@gmail.com>
First post: 2015-07-06 09:04 -0700
Last post: 2015-07-09 15:25 -0700
Articles: 20 on this page of 94, 23 participants



Contents

  printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-06 09:04 -0700
    Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-06 17:50 +0100
      Re: printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-06 11:34 -0700
        Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-06 19:47 +0100
          Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-06 19:53 +0100
            Re: printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-06 11:57 -0700
        Re: printf format specifier changes "James Harris" <james.harris.1@gmail.com> - 2015-09-06 10:53 +0100
          Re: printf format specifier changes Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-09-06 04:21 -0700
          Re: printf format specifier changes Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-09-06 13:41 +0100
      Re: printf format specifier changes Richard Heathfield <rjh@cpax.org.uk> - 2015-07-06 19:44 +0100
        Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-06 22:14 +0100
          Re: printf format specifier changes Robert Wessel <robertwessel2@yahoo.com> - 2015-07-06 22:43 -0500
          Re: printf format specifier changes Keith Thompson <kst-u@mib.org> - 2015-07-06 23:01 -0700
            Re: printf format specifier changes glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-07-07 06:43 +0000
            Re: printf format specifier changes Phil Carmody <pc+usenet@asdf.org> - 2015-07-09 10:52 +0300
              Re: printf format specifier changes Richard Heathfield <rjh@cpax.org.uk> - 2015-07-09 09:26 +0100
                Re: printf format specifier changes Phil Carmody <pc+usenet@asdf.org> - 2015-07-12 12:56 +0300
                  Re: printf format specifier changes Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-07-12 06:12 -0700
          Re: printf format specifier changes Rosario19 <Ros@invalid.invalid> - 2015-07-07 10:22 +0200
            Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-07 10:10 +0100
        Re: printf format specifier changes Philip Lantz <prl@canterey.us> - 2015-07-07 00:54 -0700
      Re: printf format specifier changes Nobody <nobody@nowhere.invalid> - 2015-07-07 15:08 +0100
        Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-07 16:12 +0100
          Re: printf format specifier changes raltbos@xs4all.nl (Richard Bos) - 2015-07-09 10:34 +0000
            Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-09 12:25 +0100
              Re: printf format specifier changes Phil Carmody <pc+usenet@asdf.org> - 2015-07-12 13:02 +0300
                Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-12 11:23 +0100
      Re: printf format specifier changes <william@wilbur.25thandClement.com> - 2015-07-07 11:50 -0700
        Re: printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-07 12:14 -0700
      Re: printf format specifier changes Les Cargill <lcargill99@comcast.com> - 2015-07-07 21:52 -0500
        Re: printf format specifier changes Keith Thompson <kst-u@mib.org> - 2015-07-07 20:42 -0700
          Re: printf format specifier changes Phil Carmody <pc+usenet@asdf.org> - 2015-07-09 11:12 +0300
            Re: printf format specifier changes Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-07-09 05:03 -0700
              Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-09 13:10 -0500
            Re: printf format specifier changes James Kuyper <jameskuyper@verizon.net> - 2015-07-09 08:58 -0400
            Re: printf format specifier changes Keith Thompson <kst-u@mib.org> - 2015-07-09 08:10 -0700
              Re: printf format specifier changes Phil Carmody <pc+usenet@asdf.org> - 2015-07-12 12:58 +0300
                Re: printf format specifier changes James Kuyper <jameskuyper@verizon.net> - 2015-07-12 12:57 -0400
                Re: printf format specifier changes Keith Thompson <kst-u@mib.org> - 2015-07-12 11:14 -0700
                  Re: printf format specifier changes gazelle@shell.xmission.com (Kenny McCormack) - 2015-07-12 18:26 +0000
        Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-08 13:09 +0100
        Re: printf format specifier changes James Kuyper <jameskuyper@verizon.net> - 2015-07-08 08:10 -0400
          Re: printf format specifier changes Les Cargill <lcargill99@comcast.com> - 2015-07-08 08:12 -0500
            Re: printf format specifier changes Keith Thompson <kst-u@mib.org> - 2015-07-08 08:38 -0700
            Re: printf format specifier changes James Kuyper <jameskuyper@verizon.net> - 2015-07-08 12:00 -0400
              Re: printf format specifier changes Les Cargill <lcargill99@comcast.com> - 2015-07-08 17:20 -0500
        Re: printf format specifier changes Phil Carmody <pc+usenet@asdf.org> - 2015-07-09 11:08 +0300
      Re: printf format specifier changes Phil Carmody <pc+usenet@asdf.org> - 2015-07-08 20:30 +0300
        Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-08 18:56 +0100
          Re: printf format specifier changes gordonb.lmwiv@burditt.org (Gordon Burditt) - 2015-07-09 11:09 -0500
            Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-09 14:05 -0500
        Re: printf format specifier changes Keith Thompson <kst-u@mib.org> - 2015-07-08 11:27 -0700
          Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-08 15:07 -0500
    Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-06 23:39 -0500
      Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-07 10:19 +0100
        Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-07 07:57 -0500
          Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-07 14:15 +0100
            Re: printf format specifier changes Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-07-07 14:46 +0100
            Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-07 09:37 -0500
          Re: printf format specifier changes Phil Carmody <pc+usenet@asdf.org> - 2015-07-09 11:24 +0300
            Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-09 09:55 -0500
              Re: printf format specifier changes <william@wilbur.25thandClement.com> - 2015-07-09 11:53 -0700
                Re: printf format specifier changes <william@wilbur.25thandClement.com> - 2015-07-09 13:48 -0700
                Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-09 17:01 -0500
      Re: printf format specifier changes Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-07-07 10:59 +0100
        Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-07 07:48 -0500
          Re: printf format specifier changes Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-07-07 14:30 +0100
            Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-07 10:05 -0500
              Re: printf format specifier changes Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-07-07 23:16 +0100
                Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-07 20:29 -0500
                Re: printf format specifier changes Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-07-08 00:43 -0700
                Re: printf format specifier changes Robert Wessel <robertwessel2@yahoo.com> - 2015-07-09 02:29 -0500
                  Re: printf format specifier changes Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-07-09 10:34 +0100
                    Re: printf format specifier changes BGB <cr88192@hotmail.com> - 2015-07-09 09:35 -0500
            Re: printf format specifier changes Keith Thompson <kst-u@mib.org> - 2015-07-07 08:37 -0700
              Re: printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-07 08:45 -0700
            Re: printf format specifier changes <william@wilbur.25thandClement.com> - 2015-07-07 12:12 -0700
    Re: printf format specifier changes pooja deshpande <namitadeshpande25@gmail.com> - 2015-07-09 09:29 -0700
      Re: printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-09 09:34 -0700
        Re: printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-09 09:59 -0700
          Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-09 22:33 +0100
            Re: printf format specifier changes Keith Thompson <kst-u@mib.org> - 2015-07-09 15:17 -0700
              Re: printf format specifier changes glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-07-10 00:30 +0000
                Re: printf format specifier changes Reinhardt Behm <rbehm@hushmail.com> - 2015-07-10 09:17 +0800
        Re: printf format specifier changes David Kleinecke <dkleinecke@gmail.com> - 2015-07-09 10:02 -0700
          Re: printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-09 10:19 -0700
        Re: printf format specifier changes Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-07-09 21:50 +0100
          Re: printf format specifier changes Keith Thompson <kst-u@mib.org> - 2015-07-09 15:05 -0700
            Re: printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-09 15:09 -0700
            Re: printf format specifier changes Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-07-10 00:28 +0100
              Re: printf format specifier changes raltbos@xs4all.nl (Richard Bos) - 2015-07-13 20:39 +0000
          Re: printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-09 15:07 -0700
            Re: printf format specifier changes Bartc <bc@freeuk.com> - 2015-07-09 23:16 +0100
              Re: printf format specifier changes "Rick C. Hodgin" <rick.c.hodgin@gmail.com> - 2015-07-09 15:25 -0700



#64835

From: Philip Lantz <prl@canterey.us>
Date: 2015-07-07 00:54 -0700
Message-ID: <MPG.300525a1db923e094a@news.eternal-september.org>
In reply to: #64785
Richard Heathfield wrote:
> It would be possible to devise a new "language" for formatted printing, 
> one that could perhaps do a better job of working out types, but would 
> that really move us any further forward? C would then have to support 
> both mini-languages, and a very significant number of developers would 
> stick to the old way simply because that's the way they already know and 
> it's the way their maintenance programmers already know.

As I do in C++.

> ...
> There is prior art for an alternative system that allows precise format 
> control without requiring you to work /quite/ so hard on the types - the 
> iostream model used by C++. It may not be a good example, but it is at 
> least a horrible warning.

Ah, you anticipated me.



#64854

From: Nobody <nobody@nowhere.invalid>
Date: 2015-07-07 15:08 +0100
Message-ID: <pan.2015.07.07.14.08.36.229000@nowhere.invalid>
In reply to: #64778
On Mon, 06 Jul 2015 17:50:32 +0100, Bartc wrote:

> In 2015, the whole notion of having to tell a compiler what it already
> knows - the type of the values it's printing - is outdated.

But you're not telling the compiler, you're telling a library function
(specifically, telling it how to deserialise what is essentially a stream
of bytes stored on the stack).

So the issue is with the programmer having to tell the library function
the same thing they already told the compiler, as there's no simple way
to get the compiler to pass that information on.

The question is whether it's worth adding that feature for this specific
case.

It's not as if it gets you all that much. It would allow the library
function to detect incorrect argument types (although most compilers can
manage that much now), but you would still need to specify representation
(%u/%x, %f/%e/%g), field width, precision, justification (alignment), and
flags (#/0/+).

At best, having the type conveyed automatically would mean that you need
fewer distinct specifiers (e.g. %d and %u could be merged, as could %x and
%a) and you wouldn't need length modifiers. This could be useful if types
are implementation-dependent (e.g. off_t) or change according to
configuration options.
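
A minimal sketch of the implementation-dependent-type case mentioned above, assuming a POSIX system where off_t comes from <sys/types.h>. The helper name format_offset is made up for illustration. Since C99, a value of a signed integer type whose exact width is unknown can be converted to intmax_t and printed with the j length modifier, so the format string stays the same if the underlying type changes:

```c
#include <inttypes.h>   /* intmax_t */
#include <stdio.h>      /* snprintf, size_t */
#include <sys/types.h>  /* off_t (POSIX, not ISO C) */

/* Format a file offset without knowing whether off_t is int, long or
   long long. Returns what snprintf returns: the number of characters
   the full output needs, excluding the terminating null. */
int format_offset(char *buf, size_t size, off_t pos)
{
    /* %jd expects an intmax_t argument, hence the explicit cast. */
    return snprintf(buf, size, "%jd", (intmax_t)pos);
}
```

The cast still has to be written by hand, which is the duplication being discussed; the gain is only that it no longer depends on the exact type.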



#64860

From: Bartc <bc@freeuk.com>
Date: 2015-07-07 16:12 +0100
Message-ID: <mngq6v$sie$1@dont-email.me>
In reply to: #64854
On 07/07/2015 15:08, Nobody wrote:
> On Mon, 06 Jul 2015 17:50:32 +0100, Bartc wrote:
>
>> In 2015, the whole notion of having to tell a compiler what it already
>> knows - the type of the values it's printing - is outdated.
>
> But you're not telling the compiler, you're telling a library function
> (specifically, telling it how to deserialise what is essentially a stream
> of bytes stored on the stack).
>
> So the issue is with the programmer having to tell the library function
> the same thing they already told the compiler, as there's no simple way
> to get the compiler to pass that information on.

Yes, some information needs to be duplicated, which also means it needs 
to match. However the richness of the type system is not matched by the 
limited set of format codes. Example:

T a;

a is of type T (which will usually be something with more of a clue, 
but still, you will need to go and look up exactly what it is).

You need to temporarily print out a to debug something, but what exactly 
is the format spec needed? You don't know. But suppose it's some sort of 
int:

  printf("A is %d\n",a);

Then you comment out that bit of code, until you revisit it later on 
when it gives more problems. Except that T might now be a little 
different, and the printf doesn't work properly.

Or, you might want to use the same format string in several situations 
(by pasting the text or using a macro) but small differences in exact 
types make that impractical.

> The question is whether it's worth adding that feature for this specific
> case.
>
> It's not as if it gets you all that much. It would allow the library
> function to detect incorrect argument types (although most compilers can
> manage that much now), but you would still need to specify representation
> (%u/%x, %f/%e/%g), field width, precision, justification (alignment), and
> flags (#/0/+).

You don't always need them, but when you do, you won't need the type too!

(Most times I have to write a format string, it's for debugging 
purposes. Field length, precision and everything else rarely come into it.

But I have had to convert C code containing lines like this:

printf("ABCD = %s %s %d %x\n", a,b,c,d);

into a new language, and those lines need to be modified to match the 
language. The most satisfying aspect of doing that is getting rid of all 
those redundant formats! And the "\n". The result might be:

  println "ABCD =",a,b,c,d

You can argue as much as you like about how great keeping all that 
punctuation is, but you have to admit it's a lot easier without!

(Another feature I use a lot for debugging is this:

  println =a,=b,=c+d*e

where the "=" sign - it could be anything - displays the following 
expression as a caption just before the value, in upper case; For this 
example, it is the equivalent of C's:

  printf("A=%s B=%s C+D*E=%d\n", a,b,c+d*e);

However, the proposal isn't to eliminate formats completely (although it 
would be nice for those people who can't type), but to avoid needing to 
get that %s %s %d %x sequence just right. Eg. as %? %? %? %?. Or perhaps 
as %/? to just print the lot. ? isn't a good choice but I don't know 
what other letters are free.)

-- 
Bartc
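
For comparison, something close to the "=a" caption idea can be approximated in C11 today. This is only a sketch under stated assumptions: the names FMT_OF, SHOW and show_value are invented here, the caption is not upper-cased, and only the types listed in the _Generic selection are accepted (any other type is a compile-time error). The macro picks a default conversion from the static type of the expression and uses preprocessor stringification for the caption:

```c
#include <stdarg.h>
#include <stdio.h>

/* Choose a default printf conversion from the static type of x. */
#define FMT_OF(x) _Generic((x),                              \
        int: "%d", unsigned int: "%u",                       \
        long: "%ld", unsigned long: "%lu",                   \
        long long: "%lld", unsigned long long: "%llu",       \
        double: "%g",                                        \
        char *: "%s", const char *: "%s")

/* Write "name=" followed by the value formatted with fmt.
   Returns the length the full text needs, or a negative value on error. */
static int show_value(char *buf, size_t size, const char *name,
                      const char *fmt, ...)
{
    int n = snprintf(buf, size, "%s=", name);
    if (n < 0)
        return n;

    size_t used = (size_t)n;
    va_list ap;
    va_start(ap, fmt);
    /* If the caption already filled the buffer, only measure the value. */
    int m = vsnprintf(used < size ? buf + used : NULL,
                      used < size ? size - used : 0, fmt, ap);
    va_end(ap);
    return m < 0 ? m : n + m;
}

/* #expr turns the expression text into the caption; expr is evaluated once. */
#define SHOW(buf, size, expr) \
    show_value((buf), (size), #expr, FMT_OF(expr), (expr))
```

With int c = 2, d = 3, e = 4, the call SHOW(buf, sizeof buf, c+d*e) leaves "c+d*e=14" in buf. Representation choices such as hex still need an explicit specifier, as the other posters point out.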



#64978

From: raltbos@xs4all.nl (Richard Bos)
Date: 2015-07-09 10:34 +0000
Message-ID: <559e4d8e.2762968@news.xs4all.nl>
In reply to: #64860
Bartc <bc@freeuk.com> wrote:

> On 07/07/2015 15:08, Nobody wrote:

> > It's not as if it gets you all that much. It would allow the library
> > function to detect incorrect argument types (although most compilers can
> > manage that much now), but you would still need to specify representation
> > (%u/%x, %f/%e/%g), field width, precision, justification (alignment), and
> > flags (#/0/+).
> 
> You don't always need them, but when you do, you won't need the type too!
> 
> (Most times I have to write a format string, it's for debugging 
> purposes. Field length, precision and everything else rarely come into it.

That may be the case for you; it's not the case for everybody.

> But I have had to convert C code containing lines like this:
> 
> printf("ABCD = %s %s %d %x\n", a,b,c,d);
> 
> into a new language, and those lines need to be modified to match the 
> language. The most satisfying aspect of doing that is getting rid of all 
> those redundant formats! And the "\n". The result might be:
> 
>   println "ABCD =",a,b,c,d
> 
> You can argue as much as you like about how great keeping all that 
> punctuation is, but you have to admit it's a lot easier without!

It is indeed very easy if the compiler can automagically guess that you
want one number printed decimal and the other hexadecimal. And get this
right every time.

I'm not saying C's format specifiers are perfect. Far from it. But they
have their good points, and much of what they do would need to be
duplicated in their replacement.

Richard



#64983

From: Bartc <bc@freeuk.com>
Date: 2015-07-09 12:25 +0100
Message-ID: <mnlll0$45l$1@dont-email.me>
In reply to: #64978
On 09/07/2015 11:34, Richard Bos wrote:
> Bartc <bc@freeuk.com> wrote:

>> But I have had to convert C code containing lines like this:
>>
>> printf("ABCD = %s %s %d %x\n", a,b,c,d);

>> ...The most satisfying aspect of doing that is getting rid of all
>> those redundant formats!
>>
>>    println "ABCD =",a,b,c,d
>>
>> You can argue as much as you like about how great keeping all that
>> punctuation is, but you have to admit it's a lot easier without!
>
> It is indeed very easy if the compiler can automagically guess that you
> want one number printed decimal and the other hexadecimal.

BUT THAT'S NOT A TYPE SPECIFIER! How many times do I have to repeat it? 
It would be a format specifier. (And C's hex format doesn't work with 
signed integers anyway so is it even a viable choice?)

> And get this
> right every time.

It can get it right *most of the time*. Then you resort to a more 
detailed specifier if you don't want default behaviour. But now you're 
responsible for writing "%x", "%lx", or "%llx" which means 
double-checking the exact type. (And good luck with that if it depends 
on conditional code.)

(I use a scheme elsewhere where int types are normally printed in 
decimal, and pointer types - other than char* - are printed in hex. 
char* types are printed as strings. And it works beautifully.

My example above isn't made up. Here's one from real code:

printf("CMD: %d %s LINE:%d SPTR=%X\n", i,cmdnames[i],
         getlinenumber(), sptr);

This could be written elsewhere as:

println "CMD:",i,cmdnames[i],"LINE:",getlinenumber(),"SPTR=",sptr

Spot the format specifiers. Yet it gives the same output. Magic! 
Although as proposed for C, it might look like:

printf("CMD: %# %# LINE:%# SPTR=%#\n", i,cmdnames[i],
         getlinenumber(), sptr);

You just write %# (or %? etc) per /thing/.

One other thing: the %X above is problematical. It should be %p, but if 
%X is used, it can cause problems when run on machines with 64-bit 
pointers. This is what we're trying to get away from.)

> I'm not saying C's format specifiers are perfect. Far from it. But they
> have their good points, and much of what they do would need to be
> duplicated in their replacement.

They don't need to be replaced. One idea was to simply augment them by a 
special format specifier that freed you from having to write - and 
maintain - the right number of "l" modifiers, among other things.

What would be your objection to that?

-- 
Bartc



#65400

From: Phil Carmody <pc+usenet@asdf.org>
Date: 2015-07-12 13:02 +0300
Message-ID: <87egkdbsno.fsf@bazspaz.fatphil.org>
In reply to: #64983
Bartc <bc@freeuk.com> writes:
> On 09/07/2015 11:34, Richard Bos wrote:
> > Bartc <bc@freeuk.com> wrote:
> 
> >> But I have had to convert C code containing lines like this:
> >>
> >> printf("ABCD = %s %s %d %x\n", a,b,c,d);
> 
> >> ...The most satisfying aspect of doing that is getting rid of all
> >> those redundant formats!
> >>
> >>    println "ABCD =",a,b,c,d
> >>
> >> You can argue as much as you like about how great keeping all that
> >> punctuation is, but you have to admit it's a lot easier without!
> >
> > It is indeed very easy if the compiler can automagically guess that you
> > want one number printed decimal and the other hexadecimal.
> 
> BUT THAT'S NOT A TYPE SPECIFIER! How many times do I have to repeat
> it? It would be a format specifier.

That's Richard's point. And there's *no* format specifier in your
println line above. Therefore it's not an example of what you claim
it is. Since Richard pulled the trigger, it's instead a bullet in
your foot.

Phil
-- 
A well regulated militia, being necessary to the security of a free state,
the right of the people to keep and bear arms, shall be well regulated.



#65402

From: Bartc <bc@freeuk.com>
Date: 2015-07-12 11:23 +0100
Message-ID: <mntf58$fo8$1@dont-email.me>
In reply to: #65400
On 12/07/2015 11:02, Phil Carmody wrote:
> Bartc <bc@freeuk.com> writes:
>> On 09/07/2015 11:34, Richard Bos wrote:
>>> Bartc <bc@freeuk.com> wrote:
>>
>>>> But I have had to convert C code containing lines like this:
>>>>
>>>> printf("ABCD = %s %s %d %x\n", a,b,c,d);
>>
>>>> ...The most satisfying aspect of doing that is getting rid of all
>>>> those redundant formats!
>>>>
>>>>     println "ABCD =",a,b,c,d
>>>>
>>>> You can argue as much as you like about how great keeping all that
>>>> punctuation is, but you have to admit it's a lot easier without!
>>>
>>> It is indeed very easy if the compiler can automagically guess that you
>>> want one number printed decimal and the other hexadecimal.
>>
>> BUT THAT'S NOT A TYPE SPECIFIER! How many times do I have to repeat
>> it? It would be a format specifier.
>
> That's Richard's point. And there's *no* format specifier in your
> println line above.

That's the default mode; it needs neither type nor format info. That's 
good, yes?

If I wanted to override that, to print in base 16 or base 7 for example 
instead of 10, then a format specifier might be needed. (For base 16, it 
can also be achieved with casts, as pointers are output in hex.)

   print 343:"x7"

outputs "1000". "x7" is a format specifier, in this case saying to use 
base 7.

To summarise, the types of the expressions given to the print routine 
determine the default formatting applied (this is proposed for printf 
too). To override that, a format specifier is needed. Using casts in the 
expressions is also a crude way to control which default is used.

It's not exactly rocket science.

> Therefore it's not an example of what you claim
> it is. Since Richard pulled the trigger, it's instead a bullet in
> your foot.

Huh? If it means not spending half my life writing or maintaining format 
specifiers that are unnecessary, because the language is quite capable 
of deducing them, then by all means shoot me in the foot.

-- 
Bartc



#64868

From: <william@wilbur.25thandClement.com>
Date: 2015-07-07 11:50 -0700
Message-ID: <5kjt6c-38a.ln1@wilbur.25thandClement.com>
In reply to: #64868
Bartc <bc@freeuk.com> wrote:
> On 06/07/2015 17:04, Rick C. Hodgin wrote:
>> In 2015, what suggestions would you have for changing anything about
>> the format specifiers used in the printf functions?  Leave them as
>> they are?  Or change this or that about them?
> 
> In 2015, the whole notion of having to tell a compiler what it already 
> knows - the type of the values it's printing - is outdated.
> 
> But working out alternatives is not so easy either, as sometimes you do 
> need precise formatting control, but it is irrevocably tied, with the 
> current system, to the type specifiers.

Using the preprocessor, compound literals, and _Generic (or GCC's
__builtin_types_compatible_p) you can build a printf wrapper such that
arguments are passed as struct pointers with a type tag.

Of course, writing it to support a large number of variable arguments would
be quite verbose in the preprocessor. But that's a mostly aesthetic concern.

I've been thinking of doing this for my small json.c library[1]. It has an
API which permits manipulating the JSON tree using xpath-like notation. E.g.

	json_setnumber(J, 42.0, "foo.bar[#].$", 1, "baz");

will autovivify the node foo.bar[1].baz and initialize it to the value of
42. The problem I've run into is type safety. # expects an int, but often
I'm passing a value of type size_t, which requires explicit casting, and
thus requires that I remember to cast. It also implicates signed overflow
issues, although in practice I mostly use this in loops where the range is
safe (e.g. `for (int i = 0, n = json_count(J, "foo.bar"); i < n; i++)').

Because I've _almost_ forgotten to cast multiple times, and because I'm not
the only person using the library, I feel the API definitely needs to be
fixed. My options are a fancy replacement like I described above, or simply
reverting to a printf style that allows me to annotate the format argument
with __attribute__((printf ())). I'm leaning toward the latter out of
practicality, given that I've already burned myself in this instance.

[1] http://25thandclement.com/~william/projects/json.c.html
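
One possible shape for the wrapper described at the top of this post, as a rough sketch only. It is not code from json.c; every name here (struct arg, ARG, format_args, FORMAT2) is hypothetical. It assumes C11 _Generic and C99 compound literals, handles a fixed count of two arguments to sidestep the preprocessor verbosity mentioned above, and selects a constructor function rather than an expression so that each _Generic branch type-checks on its own:

```c
#include <stdio.h>

/* The tag records which type _Generic saw at the call site. */
enum arg_tag { ARG_INT, ARG_DOUBLE, ARG_STR };

struct arg {
    enum arg_tag tag;
    union { long long i; double d; const char *s; } u;
};

static struct arg arg_int(long long v)
{ return (struct arg){ .tag = ARG_INT, .u = { .i = v } }; }
static struct arg arg_dbl(double v)
{ return (struct arg){ .tag = ARG_DOUBLE, .u = { .d = v } }; }
static struct arg arg_str(const char *v)
{ return (struct arg){ .tag = ARG_STR, .u = { .s = v } }; }

/* Select a constructor from the static type of x, then call it.
   Types not listed here (size_t, for one) are rejected at compile time. */
#define ARG(x) _Generic((x),                                  \
        int: arg_int, long: arg_int, long long: arg_int,      \
        float: arg_dbl, double: arg_dbl,                      \
        char *: arg_str, const char *: arg_str)(x)

/* Print each tagged argument into buf, separated by single spaces.
   Returns the length the full text needs, or -1 on a formatting error. */
static int format_args(char *buf, size_t size,
                       const struct arg *args, size_t n)
{
    size_t used = 0;
    if (size > 0)
        buf[0] = '\0';
    for (size_t k = 0; k < n; k++) {
        char *dst = used < size ? buf + used : NULL;
        size_t room = used < size ? size - used : 0;
        const char *sep = k ? " " : "";
        int w = -1;
        switch (args[k].tag) {
        case ARG_INT:    w = snprintf(dst, room, "%s%lld", sep, args[k].u.i); break;
        case ARG_DOUBLE: w = snprintf(dst, room, "%s%g", sep, args[k].u.d);   break;
        case ARG_STR:    w = snprintf(dst, room, "%s%s", sep, args[k].u.s);   break;
        }
        if (w < 0)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}

/* The compound literal array is what gets passed as a struct pointer. */
#define FORMAT2(buf, size, a, b) \
    format_args((buf), (size), (struct arg[]){ ARG(a), ARG(b) }, 2)
```

Because the callee reads the tag instead of trusting a format string, a mismatched argument either gets handled correctly or fails to compile.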



#64869

From: "Rick C. Hodgin" <rick.c.hodgin@gmail.com>
Date: 2015-07-07 12:14 -0700
Message-ID: <41399403-ca55-437d-b440-0b4703205b40@googlegroups.com>
In reply to: #64868
On Tuesday, July 7, 2015 at 3:00:19 PM UTC-4, wil...@wilbur.25thandclement.com wrote:
> Bartc <bc@freeuk.com> wrote:
> > On 06/07/2015 17:04, Rick C. Hodgin wrote:
> >> In 2015, what suggestions would you have for changing anything about
> >> the format specifiers used in the printf functions?  Leave them as
> >> they are?  Or change this or that about them?
> > 
> > In 2015, the whole notion of having to tell a compiler what it already 
> > knows - the type of the values it's printing - is outdated.
> > 
> > But working out alternatives is not so easy either, as sometimes you do 
> > need precise formatting control, but it is irrevocably tied, with the 
> > current system, to the type specifiers.
> 
> Using the preprocessor, compound literals, and _Generic (or GCC's
> __builtin_types_compatible_p) you can build a printf wrapper such that
> arguments are passed as struct pointers with a type tag.
> 
> Of course, writing it to support a large number of variable arguments would
> be quite verbose in the preprocessor. But that's a mostly aesthetic concern.

I have considered doing this as well. I would like to move away from
hard-and-fast C/C++ code, and instead move into an editor platform
which generates code for you based on graphical constructs in the GUI.
In this case of printf, you need to print something, and you load a
little GUI builder which allows you to drag and drop and preview what
it would look like in real-time, converting it into proper C code for
you.  In that way, the C language uses its perplexities to handle the
various coding requirements, while human beings deal with things they
can manipulate on a touch-flat-screen monitor by dragging-and-dropping
nearby variable names.

I have lots of ideas like these. :-)

Best regards,
Rick C. Hodgin



#64890

From: Les Cargill <lcargill99@comcast.com>
Date: 2015-07-07 21:52 -0500
Message-ID: <mni35p$efa$2@dont-email.me>
In reply to: #64778
Bartc wrote:
> On 06/07/2015 17:04, Rick C. Hodgin wrote:
>> In 2015, what suggestions would you have for changing anything about
>> the format specifiers used in the printf functions?  Leave them as
>> they are?  Or change this or that about them?
>
> In 2015, the whole notion of having to tell a compiler what it already
> knows - the type of the values it's printing - is outdated.
>

No, it is not.

Example 1:

char *s = something();

printf(":%s: 0x%08x\n",s,(int)s);

Every. Single. Attempt. To improve on this. Fails. Repeatedly.

Serialization Is Hard.

> But working out alternatives is not so easy either, as sometimes you do
> need precise formatting control, but it is irrevocably tied, with the
> current system, to the type specifiers.
>

-- 
Les Cargill



#64893

From: Keith Thompson <kst-u@mib.org>
Date: 2015-07-07 20:42 -0700
Message-ID: <lnwpyb9uy4.fsf@kst-u.example.com>
In reply to: #64890
Les Cargill <lcargill99@comcast.com> writes:
> Bartc wrote:
>> On 06/07/2015 17:04, Rick C. Hodgin wrote:
>>> In 2015, what suggestions would you have for changing anything about
>>> the format specifiers used in the printf functions?  Leave them as
>>> they are?  Or change this or that about them?
>>
>> In 2015, the whole notion of having to tell a compiler what it already
>> knows - the type of the values it's printing - is outdated.
>>
>
> No, it is not.
>
> Example 1:
>
> char *s = something();
>
> printf(":%s: 0x%08x\n",s,(int)s);
>
> Every. Single. Attempt. To improve on this. Fails. Repeatedly.

%x requires an unsigned int, not an int.  But I would write that as:

    printf(":%s: %p\n", s, (void*)s);

(It can be argued that the cast is not needed in this case, but it's
easier to be consistent.)

[...]

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"



#64971

From: Phil Carmody <pc+usenet@asdf.org>
Date: 2015-07-09 11:12 +0300
Message-ID: <87a8v5da0m.fsf@bazspaz.fatphil.org>
In reply to: #64893
Keith Thompson <kst-u@mib.org> writes:
> Les Cargill <lcargill99@comcast.com> writes:
> > Bartc wrote:
> >> On 06/07/2015 17:04, Rick C. Hodgin wrote:
> >>> In 2015, what suggestions would you have for changing anything about
> >>> the format specifiers used in the printf functions?  Leave them as
> >>> they are?  Or change this or that about them?
> >>
> >> In 2015, the whole notion of having to tell a compiler what it already
> >> knows - the type of the values it's printing - is outdated.
> >>
> >
> > No, it is not.
> >
> > Example 1:
> >
> > char *s = something();
> >
> > printf(":%s: 0x%08x\n",s,(int)s);
> >
> > Every. Single. Attempt. To improve on this. Fails. Repeatedly.
> 
> %x requires an unsigned int, not an int.  But I would write that as:
> 
>     printf(":%s: %p\n", s, (void*)s);
> 
> (It can be argued that the cast is not needed in this case, but it's
> easier to be consistent.)
> 
> [...]

In that "..." was "Serialization Is Hard.".

%p is not particularly good serialisation, as its output is
implementation defined. Someone else parsing those logs may
not recognise the format being used and either only partially
parse the printed value (as it didn't expect so much), or even
refuse to parse it (as it expected more).

Phil
-- 
A well regulated militia, being necessary to the security of a free state,
the right of the people to keep and bear arms, shall be well regulated.
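
If the concern is that %p output varies between implementations, one workaround is to fix the textual layout yourself. A hedged sketch follows: format_pointer is an invented name, uintptr_t is an optional type in ISO C (assumed to exist here), and the numeric value produced by the pointer-to-integer conversion is still implementation-defined. Only the shape of the text, "0x" plus a fixed number of lower-case hex digits, becomes predictable for whoever parses the log:

```c
#include <inttypes.h>  /* uintptr_t, PRIxPTR */
#include <stdio.h>     /* snprintf, size_t */

/* Write a pointer as "0x" followed by zero-padded lower-case hex,
   two digits per byte of uintptr_t. Returns what snprintf returns. */
int format_pointer(char *buf, size_t size, const void *p)
{
    return snprintf(buf, size, "0x%0*" PRIxPTR,
                    (int)(2 * sizeof(uintptr_t)), (uintptr_t)p);
}
```

The output width still differs between 32-bit and 64-bit builds, so a reader of such logs has to allow for both.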



#64985

From: Malcolm McLean <malcolm.mclean5@btinternet.com>
Date: 2015-07-09 05:03 -0700
Message-ID: <7e6c272c-a71e-4b61-8217-a39d82b1781e@googlegroups.com>
In reply to: #64971
On Thursday, July 9, 2015 at 9:13:05 AM UTC+1, Phil Carmody wrote:
> 
> In that "..." was "Serialization Is Hard.".
> 
It's not.
It's just that the conventions aren't well established.
Somehow you've got to know the type of the information, that is, how
to interpret the bits, and the message or sub-message length. You've
also got to be able to express nesting and, ideally, arbitrary graph
connections.
There are lots of ways of doing this and it's not hard to devise your own
scheme. But no one system has caught on.
 



#65044

From: BGB <cr88192@hotmail.com>
Date: 2015-07-09 13:10 -0500
Message-ID: <mnmdmg$7ea$1@news.albasani.net>
In reply to: #64985
On 7/9/2015 7:03 AM, Malcolm McLean wrote:
> On Thursday, July 9, 2015 at 9:13:05 AM UTC+1, Phil Carmody wrote:
>>
>> In that "..." was "Serialization Is Hard.".
>>
> It's not.
> It's just that the conventions aren't well established.
> Somehow you've got to know the type of the information, that is, how
> to interpret the bits, and the message or sub-message length. You've
> also got to be able to express nesting and, ideally, arbitrary graph
> connections.
> There are lots of ways of doing this and it's not hard to devise your own
> scheme. But no one system has caught on.
>

yes.

could be categorized some:
   text based
     line-oriented formats
     tree-structured formats
     direct data-binding via one of the above
       IOW: flattening structures directly
     indirect data-binding via one of the above
       IOW: flattening structures in terms of another representation
         example: using XML in the case of SOAP
   binary based
     image-based formats (normally byte-based)
     ad-hoc serial formats (byte or bitstream)
     TLV structured formats (usually byte-based)
     ...
     direct data-binding via a specialized format
     indirect data-binding via a specialized format
     indirect data-binding via compression of a textual format
     ...

a distinction can be made here between file-formats and data-binding.

conventional file formats focus mostly on how the data is laid out in 
its serialized form, and how these rules are to be followed. if or how 
closely this matches with any structures internal to the application is 
mostly not an issue.

data-binding tends to focus mostly on data as it is seen in-program, 
with much less care given to its serialization. often as a consequence, 
these formats can't readily be moved between applications or frameworks, 
and often will break between versions of an application as things are 
changed internally (people add/remove fields, or change field types, now 
the loader refuses to accept it).


I personally mostly prefer more conventional file-format design, as it 
tends to have a much better track record on these fronts.


some various options:
textual one-off line oriented formats
   pros:
     very simple to implement
     good for simple formats
   cons:
     very use-case specific;
     typically bulky and unreadable for non-trivial data.

textual tree-structured formats:
   XML (via a DOM-like system):
     pros:
       fairly popular;
       reasonably flexible design.
     cons:
       inefficient with memory (high per-node cost);
       awkward and computationally expensive to work with;
       ...
   S-Expressions (mapped to a Common-Lisp like typesystem):
     pros:
       more efficient and cleaner/easier to work with than XML;
         (Lisp-style list stuff can be mapped pretty OK to C);
       generally much more compact in serialized form than XML.
     cons:
       virtually unknown to most people (vs XML);
       absent care, representations can become brittle;
       generally confused with Rivest S-Expressions, which are different;
       not well standardized.
   JSON (mapped to a dynamic type system):
     pros:
       can be more flexible than naive S-Expressions;
       allows for modestly efficient in-memory representations;
       serialized form is more compact than XML.
     cons:
       generally more awkward to work with in C than S-Expressions;
       unlike S-Exps, generally lacks support for cyclic structures.
         (granted, this is rarely particularly relevant).

binary formats
   image-based:
     pros:
       good for mapping into RAM and using in-place;
       very efficient for many use cases;
       may be combined with TLV for extensibility.
     cons:
       generally ad-hoc structure;
       storage efficiency may be reduced by internal fragmentation;
       typically involve some sort of internal memory-management system;
       random-access compressed image formats are non-trivial.
   TLV:
     pros:
       computationally efficient to parse or produce serially;
       used well, can be fairly space efficient.
     cons:
       tradeoffs may need to be made in terms of TLV tag representation;
       typically, leaves fall back to fixed-format binary.
         dumping contents requires knowing the format specifics;
         new data is unreadable, usually at best ignored.
       naive TLV is poor for random access.
         may be hybridized with an image-based strategy, such as in AVI.

compressed ASCII:
   pros:
     may be more space-competitive with binary formats;
     typically simpler to implement than more specialized formats.
   cons:
     typically slowest and most expensive to encode/decode
       first decompress into a buffer, and then parse.
       typically tokenizing/parsing is the main cost.

binary-serialized abstract model:
   pros:
     can be very space efficient in storage
       can be very compact with a specialized bitstream format.
     fast to parse and serialize (vs ASCII or compressed ASCII);
     can add cyclic structure support (why not?...);
     more consistent than TLV
       pretty much all the data can be decoded
     can be combined readily with TLV and image-based formats.
   cons:
     pretty much one-off;
     more complex and expensive to process than an analogous TLV format;
       may require allocating/building intermediate in-memory structures;
       may involve expenses such as context modeling (*)
     typically more complex than a compressed ASCII format.

*: earlier on, I had made some compressed message formats using 
in-stream modeling and prediction, but these were generally 
computationally and memory-intensive. I have thus largely abandoned this 
in favor of using a simpler binary serialization and a Deflate-style 
backend compressor, as this tends to be simpler and higher performance 
(even if the compression isn't as good as with a more specialized 
predictive context model).


general choices per use-case:
   compressed A/V bitstreams:
     typically a TLV format with specialized bitstreams for the payload;
     some use bit-packed byte-data with a Deflate-like entropy stage.
   network protocols:
     typically a message-based TLV format at the outer-level
       often compressed-ASCII or binary-abstract-model for message data.
       some messages are plain TLV.
       compression works well even over a LAN (CPU is faster than WiFi).
   VM bytecode data:
     often a mix of TLV and/or image-based representations.
     older formats were generally pure TLV.

or such...



#64993

From: James Kuyper <jameskuyper@verizon.net>
Date: 2015-07-09 08:58 -0400
Message-ID: <mnlr3j$na1$1@dont-email.me>
In reply to: #64971
On 07/09/2015 04:12 AM, Phil Carmody wrote:
> Keith Thompson <kst-u@mib.org> writes:
>> Les Cargill <lcargill99@comcast.com> writes:
>>> Bartc wrote:
>>>> On 06/07/2015 17:04, Rick C. Hodgin wrote:
>>>>> In 2015, what suggestions would you have for changing anything about
>>>>> the format specifiers used in the printf functions?  Leave them as
>>>>> they are?  Or change this or that about them?
>>>>
>>>> In 2015, the whole notion of having to tell a compiler what it already
>>>> knows - the type of the values it's printing - is outdated.
>>>>
>>>
>>> No, it is not.
>>>
>>> Example 1:
>>>
>>> char *s = something();
>>>
>>> printf(":%s: 0x%08x\n",s,(int)s);
>>>
>>> Every. Single. Attempt. To improve on this. Fails. Repeatedly.
>>
>> %x requires an unsigned int, not an int.  But I would write that as:
>>
>>     printf(":%s: %p\n", s, (void*)s);
>>
>> (It can be argued that the cast is not needed in this case, but it's
>> easier to be consistent.)
>>
>> [...]
> 
> In that "..." was "Serialization Is Hard.".
> 
> %p is not particularly good serialisation, as its output is
> implementation defined. ...

Is that really a problem? Its meaning is also implementation-specific -
the only context in which it can be used is in the same instance of the
same program from which it was written, and even then, only if the
lifetime of the object it pointed at has not yet ended.

For any purpose for which the implementation-specific format of %p
output is a problem, the proper way to serialize such data is not to
print out the pointer value, but to identify by some method the array
containing the pointed-at object (which might be a 1-element array) and
the index into that array at which the object resides.
-- 
James Kuyper



#65014

From: Keith Thompson <kst-u@mib.org>
Date: 2015-07-09 08:10 -0700
Message-ID: <ln4mld9xk1.fsf@kst-u.example.com>
In reply to: #64971
Phil Carmody <pc+usenet@asdf.org> writes:
[...]
> %p is not particularly good serialisation, as its output is
> implementation defined. Someone else parsing those logs may
> not recognise the format being used and either only partially
> parse the printed value (as it didn't expect so much), or even
> refuse to parse it (as it expected more).

The format produced by printf("%p", ptr) is required to be handled
correctly by scanf("%p", &ptr).

As for the fact that the format is implementation-defined, what
alternative would you suggest?  A reasonable representation for pointers
*has* to be implementation-defined.  A PDP-11 probably would have used
octal; a system with segments might use something like "1234:5678".
Human-readable pointer representations aren't intended for end users
anyway.

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"



#65399

From: Phil Carmody <pc+usenet@asdf.org>
Date: 2015-07-12 12:58 +0300
Message-ID: <87io9pbsty.fsf@bazspaz.fatphil.org>
In reply to: #65014
Keith Thompson <kst-u@mib.org> writes:
> Phil Carmody <pc+usenet@asdf.org> writes:
> [...]
> > %p is not particularly good serialisation, as its output is
> > implementation defined. Someone else parsing those logs may
> > not recognise the format being used and either only partially
> > parse the printed value (as it didn't expect so much), or even
> > refuse to parse it (as it expected more).
> 
> The format produced by printf("%p", ptr) is required to be handled
> correctly by scanf("%p", &ptr).

Utterly false. The scanf might be running on a different machine
with a different architecture from the printf.

Phil
-- 
A well regulated militia, being necessary to the security of a free state,
the right of the people to keep and bear arms, shall be well regulated.



#65422

From: James Kuyper <jameskuyper@verizon.net>
Date: 2015-07-12 12:57 -0400
Message-ID: <55A29C88.4060108@verizon.net>
In reply to: #65399
On 07/12/2015 05:58 AM, Phil Carmody wrote:
> Keith Thompson <kst-u@mib.org> writes:
>> Phil Carmody <pc+usenet@asdf.org> writes:
>> [...]
>>> %p is not particularly good serialisation, as its output is
>>> implementation defined. Someone else parsing those logs may
>>> not recognise the format being used and either only partially
>>> parse the printed value (as it didn't expect so much), or even
>>> refuse to parse it (as it expected more).
>>
>> The format produced by printf("%p", ptr) is required to be handled
>> correctly by scanf("%p", &ptr).
> 
> Utterly false. The scanf might be running on a different machine
> with a different architecture from the printf.

The relevant requirement is, in full:

"If the input item is a value converted earlier during the same program
execution, the pointer that results shall compare equal to that value;
otherwise the behavior of the %p conversion is undefined." (7.21.6.2p12)

So the requirement Keith described does exist. He didn't state the
preconditions of that requirement. However, if that was the point of
your objection, your objection was not well-worded - the way you wrote
it implies that running on a machine with the same architecture would be
sufficient to give the conversion well-defined behavior. That's not
sufficient: the printf() and scanf() must occur "during the same program
execution."

This makes sense if you think about it - the value of a pointer to an
object has meaning only inside the instance of the program that created
that object, and only during the lifetime of that object. This seems so
obvious to me that I can't see any justification for objecting to the
fact that Keith didn't mention those limitations.



#65433

From: Keith Thompson <kst-u@mib.org>
Date: 2015-07-12 11:14 -0700
Message-ID: <ln380tnsyy.fsf@kst-u.example.com>
In reply to: #65399
Phil Carmody <pc+usenet@asdf.org> writes:
> Keith Thompson <kst-u@mib.org> writes:
[...]
>> The format produced by printf("%p", ptr) is required to be handled
>> correctly by scanf("%p", &ptr).
>
> Utterly false. The scanf might be running on a different machine
> with a different architecture from the printf.

Not utterly false, just incomplete.

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"



#65436

From: gazelle@shell.xmission.com (Kenny McCormack)
Date: 2015-07-12 18:26 +0000
Message-ID: <mnubg5$tgu$1@news.xmission.com>
In reply to: #65433
In article <ln380tnsyy.fsf@kst-u.example.com>,
Keith Thompson  <kst-u@mib.org> wrote:
>Phil Carmody <pc+usenet@asdf.org> writes:
>> Keith Thompson <kst-u@mib.org> writes:
>[...]
>>> The format produced by printf("%p", ptr) is required to be handled
>>> correctly by scanf("%p", &ptr).
>>
>> Utterly false. The scanf might be running on a different machine
>> with a different architecture from the printf.
>
>Not utterly false, just incomplete.

That's being (very) charitable.

The reality is that only a complete moron (or a CLC reg, but I repeat
myself) would ever think that the requirement stated above didn't
implicitly require that both operations be done on the same implementation.

Note that "on different machines or architecture" is a red herring anyway,
since different implementations running on the same platform could generate
(use) different output formats.

-- 
"Insisting on perfect safety is for people who don't have the balls to live
in the real world."

    - Mary Shafer, NASA Ames Dryden -


