Groups > comp.lang.c > #26575 > unrolled thread

packed structs

Started by	JohnF <john@please.see.sig.for.email.com>
First post	2012-09-22 01:54 +0000
Last post	2012-10-16 09:13 -0700
Articles	20 on this page of 37 — 13 participants

  packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-22 01:54 +0000
    Re: packed structs Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-09-21 23:22 -0400
      Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-22 06:37 +0000
        Re: packed structs "BartC" <bc@freeuk.com> - 2012-09-22 13:47 +0100
          Re: packed structs Ben Bacarisse <ben.usenet@bsb.me.uk> - 2012-09-22 14:00 +0100
          Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-22 15:42 +0000
        Re: packed structs Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-09-22 09:13 -0400
        Re: packed structs Johann Klammer <klammerj@NOSPAM.a1.net> - 2012-09-23 03:10 +0200
          Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-23 02:10 +0000
        Re: packed structs Stephen Sprunk <stephen@sprunk.org> - 2012-09-23 11:44 -0500
          Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-23 23:23 +0000
            Re: packed structs Ben Bacarisse <ben.usenet@bsb.me.uk> - 2012-09-24 01:59 +0100
              Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-24 02:54 +0000
                Re: packed structs Ben Bacarisse <ben.usenet@bsb.me.uk> - 2012-09-24 04:38 +0100
                  Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-24 04:07 +0000
                    Re: packed structs Ben Bacarisse <ben.usenet@bsb.me.uk> - 2012-09-24 12:16 +0100
                      Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-24 11:45 +0000
                Re: packed structs "BartC" <bc@freeuk.com> - 2012-09-24 10:18 +0100
                  Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-24 11:04 +0000
                    Re: packed structs Stephen Sprunk <stephen@sprunk.org> - 2012-09-30 14:21 -0500
                      Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-10-01 07:34 +0000
                        Re: packed structs Stephen Sprunk <stephen@sprunk.org> - 2012-10-15 00:46 -0500
            Re: packed structs Stephen Sprunk <stephen@sprunk.org> - 2012-09-30 13:52 -0500
      Re: packed structs Nick Keighley <nick_keighley_nospam@hotmail.com> - 2012-09-22 01:31 -0700
        Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-22 08:53 +0000
          Re: packed structs Jorgen Grahn <grahn+nntp@snipabacken.se> - 2012-09-22 14:17 +0000
            Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-22 15:33 +0000
              Re: packed structs Jorgen Grahn <grahn+nntp@snipabacken.se> - 2012-09-22 20:43 +0000
              Re: packed structs "BartC" <bc@freeuk.com> - 2012-09-22 22:52 +0100
            Re: packed structs Keith Thompson <kst-u@mib.org> - 2012-09-22 13:47 -0700
              Re: packed structs JohnF <john@forkosh.com.com> - 2012-09-23 00:19 +0000
                Re: packed structs Ian Collins <ian-news@hotmail.com> - 2012-09-23 13:32 +1200
                  Re: packed structs JohnF <john@please.see.sig.for.email.com> - 2012-09-23 02:16 +0000
          Re: packed structs Ian Collins <ian-news@hotmail.com> - 2012-09-23 10:33 +1200
            Re: packed structs Nick Keighley <nick_keighley_nospam@hotmail.com> - 2012-09-23 01:38 -0700
    Re: packed structs The Great Firewall of China Blue <chine.bleu@yahoo.com> - 2012-09-21 21:29 -0700
    Re: packed structs W Karas <wkaras@yahoo.com> - 2012-10-16 09:13 -0700

#26575 — packed structs

From	JohnF <john@please.see.sig.for.email.com>
Date	2012-09-22 01:54 +0000
Subject	packed structs
Message-ID	<k3j5or$q37$1@reader1.panix.com>

Any >>portable<< way to accomplish that in c?
Don't want to use __attribute__((__packed__))
or #pragma pack, etc, nor #ifdef's to choose
among whatever alternatives I happen to know
about. It's a requirement that the code remain
portable.

In particular, I'm trying to write blocks that
conform to a binary file format (gif), and can
set up structs for them easily enough, but can't
fwrite(blockstruct,sizeof(blockstruct),1,fileptr),
or the like, due to blockstruct's inevitable
padding (which indeed occurs for gif format blocks).

At the moment, I just have a different func for
each block type that writes out the members of that
particular struct individually... b..o..r..i..n..g.
A generalization of that idea (if portable packing's
not possible) would also be fine: >>if<< there's some
way to reference the members of a struct, passed as
an argument but of unknown (to the func) type, in a
loop, i.e., for(i=0;i<nmembers;i++)thisstruct->member[i].
Then I could offsetof() and sizeof() each member, and
write it out, so just one (much less boring) func
could handle all the different block type structs.

But, afaik, I don't think that thisstruct->member[i]
thing is possible, nor portable packing. So is there
any "one size fits all" way to handle this problem?
I'm sure people must come across it frequently enough
that it's been thought about, and the best possible
approach (among, possibly, several bad alternatives)
has been identified.
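
For reference, the function-per-block approach described above can be sketched as follows. This is only an illustration under stated assumptions: the struct and function names (gif_image_desc, gif_image_desc_pack, put_u16le) are made up here, and the layout is the 10-byte GIF image descriptor (separator, left, top, width, height, packed flags, with 16-bit values stored low byte first). The struct may be padded however the compiler likes, because the bytes are emitted one member at a time.

```c
#include <assert.h>
#include <stddef.h>

/* In-memory form; the compiler is free to pad this. */
struct gif_image_desc {
    unsigned char separator;     /* 0x2C in a GIF file */
    unsigned int  left, top;     /* each stored as 2 bytes on disk */
    unsigned int  width, height; /* each stored as 2 bytes on disk */
    unsigned char packed;        /* flag bits */
};

enum { GIF_IMAGE_DESC_SIZE = 10 };  /* size of the block on disk */

/* Store the low 16 bits of v, low byte first; returns the next free byte. */
static unsigned char *put_u16le(unsigned char *p, unsigned int v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    return p + 2;
}

/* Serialize one descriptor into out[]; returns the bytes written (10). */
static size_t gif_image_desc_pack(const struct gif_image_desc *d,
                                  unsigned char out[GIF_IMAGE_DESC_SIZE])
{
    unsigned char *p = out;

    assert(d != NULL && out != NULL);
    *p++ = d->separator;
    p = put_u16le(p, d->left);
    p = put_u16le(p, d->top);
    p = put_u16le(p, d->width);
    p = put_u16le(p, d->height);
    *p++ = d->packed;
    return (size_t)(p - out);
}
```

The packed buffer, not the struct, is what would then be handed to fwrite(out, 1, GIF_IMAGE_DESC_SIZE, fileptr).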
-- 
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )

#26576

From	Eric Sosman <esosman@ieee-dot-org.invalid>
Date	2012-09-21 23:22 -0400
Message-ID	<k3jaun$8al$1@dont-email.me>
In reply to	#26575

On 9/21/2012 9:54 PM, JohnF wrote:
> Any >>portable<< way to accomplish that in c?

     No.

> Don't want to use __attribute__((__packed__))
> or #pragma pack, etc, nor #ifdef's to choose
> among whatever alternatives I happen to know
> about. It's a requirement that the code remain
> portable.
>
> In particular, I'm trying to write blocks that
> conform to a binary file format (gif), and can
> set up structs for them easily enough, but can't
> fwrite(blockstruct,sizeof(blockstruct),1,fileptr),
> or the like, due to blockstruct's inevitable
> padding (which indeed occurs for gif format blocks).

     Aha!  You don't need (or want) packed structs at all:
You want a way to pluck information from a plain vanilla
struct and write it in an externally-defined format.  That's
a horse of another kettle of colored fish (or something
along those lines).

> At the moment, I just have a different func for
> each block type that writes out the members of that
> particular struct individually... b..o..r..i..n..g.
> A generalization of that idea (if portable packing's
> not possible) would also be fine: >>if<< there's some
> way to reference the members of a struct, passed as
> an argument but of unknown (to the func) type, in a
> loop, i.e., for(i=0;i<nmembers;i++)thisstruct->member[i].
> Then I could offsetof() and sizeof() each member, and
> write it out, so just one (much less boring) func
> could handle all the different block type structs.

     One way is the function-per-struct-type approach, and
although it may be "b..o..r..i..n..g" it has advantages
that should not be dismissed lightly.  Consider that there
are (most likely) only a handful of structs and hence only
a handful of functions; writing them won't take enough
time to b..o..r..e anyone except an ADHD sufferer.

     Still, there's an alternative: You write one super-
function that accepts a void* struct pointer and a "struct
descriptor," usually an array of byte offsets and type codes:

	struct struct_descriptor {
	    size_t offset;
	    enum { BYTE, BYTEPAIR, BYTEQUAD, ..., STOP } type;
	};

	struct foo {
	    int this;  // to be written as four bytes
	    int that;  // to be written as one byte
	    ...
	};

	const struct struct_descriptor foo_description[] = {
	    { offsetof(struct foo, this), BYTEQUAD },
	    { offsetof(struct foo, that), BYTE },
	    ...
	    { 0, STOP } };

So: You set up one descriptor table per struct type, and you
pass a struct pointer and the matching descriptor to the
all-consuming writer function.  (Note that the writer might do
additional work with each field, like writing a multi-byte
quantity in a format-specific endianness -- something no amount
of portable or non-portable packing magic can manage.)
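
A minimal sketch of such a descriptor-driven writer, under several assumptions that are not part of the suggestion above: every described member is an int (as in struct foo), multi-byte quantities are emitted low byte first, and the output goes to a caller-supplied buffer rather than a stream. The name pack_fields is hypothetical, and the elided "..." entries are simply left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum field_type { BYTE, BYTEPAIR, BYTEQUAD, STOP };

struct struct_descriptor {
    size_t offset;         /* offsetof() the member within its struct */
    enum field_type type;  /* how many bytes to emit for it */
};

struct foo {
    int this;  /* to be written as four bytes */
    int that;  /* to be written as one byte */
};

static const struct struct_descriptor foo_description[] = {
    { offsetof(struct foo, this), BYTEQUAD },
    { offsetof(struct foo, that), BYTE },
    { 0, STOP }
};

/* Walk the descriptor table and emit each member into out[].
 * Assumes every described member is an int. Returns bytes written. */
static size_t pack_fields(unsigned char *out, const void *obj,
                          const struct struct_descriptor *desc)
{
    const unsigned char *base = obj;
    size_t n = 0;

    assert(out != NULL && obj != NULL && desc != NULL);
    for (; desc->type != STOP; desc++) {
        int member;
        unsigned long v;
        int nbytes = desc->type == BYTE ? 1
                   : desc->type == BYTEPAIR ? 2 : 4;

        /* Fetch the member by offset; memcpy avoids any aliasing doubts. */
        memcpy(&member, base + desc->offset, sizeof member);
        v = (unsigned long)member;
        while (nbytes-- > 0) {      /* low byte first */
            out[n++] = (unsigned char)(v & 0xFF);
            v >>= 8;
        }
    }
    return n;
}
```

A call such as pack_fields(buf, &some_foo, foo_description) fills buf with five bytes, which can then be written out in one go.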

     All very neat and nice, but it has a drawback: The compiler
doesn't know that foo_description[] and struct foo go together,
so it won't complain if you make a mistake like

	struct foo mumble = ...;
	struct bar grumble = ...;
	...
	writer(stream, &mumble, foo_description);
	writer(stream, &grumble, foo_description); // oops!

... and you will be stuck debugging the result.  That's an
area where the b..o..r..i..n..g approach has an advantage.  But,
hey: If you've got a hormonal insufficiency and need to give
your adrenal glands extra exercise, a little terror may help.

-- 
Eric Sosman
esosman@ieee-dot-org.invalid

#26578

From	JohnF <john@please.see.sig.for.email.com>
Date	2012-09-22 06:37 +0000
Message-ID	<k3jmbi$qgf$1@reader1.panix.com>
In reply to	#26576

Eric Sosman <esosman@ieee-dot-org.invalid> wrote:
> JohnF wrote:
>> Any >>portable<< way to accomplish that in c?
>     No.
>> In particular, I'm trying to write blocks that
>> conform to a binary file format (gif), and can
>> set up structs for them easily enough, but can't
>> fwrite(blockstruct,sizeof(blockstruct),1,fileptr),
>> or the like, due to blockstruct's inevitable
>> padding (which indeed occurs for gif format blocks).
> 
>     Aha!  You don't need (or want) packed structs at all:
> You want a way to pluck information from a plain vanilla
> struct and write it in an externally-defined format.
> ...
> You write one super-
> function that accepts a void* struct pointer and a "struct
> descriptor," usually an array of byte offsets and type codes:
>   struct struct_descriptor {
>       size_t offset;
>       enum { BYTE, BYTEPAIR, BYTEQUAD, ..., STOP } type; };

Thanks, Eric (and whoever the other guy is). Both your suggestions
are along the same lines, and, of course, already occurred to me.
And, as per the drawbacks you pointed out (plus being a pain in
the neck to maintain two structs per struct), already dismissed
by me.

I'd actually finished (what I think is) a more elegant solution,
which I called smemf(): it's like memcpy() but under format
control, including additional format specifiers for hex, for bits,
and for other stuff. The code actually works fine, but, though still
incomplete, it's already 723 lines (though that includes >>many<<
comments), which is somewhat of a tail-wagging-the-dog situation
that I also want to avoid.
   The point of smemf is that a single format string replaces
that entire extra struct. So it's still "extra work for mother",
but much less in-your-face when reading the program that uses it.
Since it's not done (and the copyright not registered) yet,
I'm not releasing it, but (since I can't copyright the idea,
anyway) below is its (still somewhat incomplete) main comment block
describing its usage and functional specs, in case anyone else wants
to take a stab at implementation. Also, some stuff is done but not
documented, e.g., little/big endian flag to control which way %d works,
the bit-field specifier, etc. But you'll get the idea. And after that,
the additional details are obvious to anyone who thinks about it.

/* ==========================================================================
 * Function:    smemf ( unsigned char *mem, char *format, ... )
 * Purpose:     Construct a formatted block of memory, typically containing
 *              binary data, e.g., network packets, gif images, etc.
 *              Behaves much like (a subset of) sprintf, but the intent
 *              to accommodate binary data requires a few significant
 *              exceptions, as explained in the Notes section below.
 * --------------------------------------------------------------------------
 * Arguments:   mem (O)         (unsigned char *) to memory block
 *                              to be formatted.
 *              format (I)      (char *) to null-terminated string containing
 *                              specifications for how the variable arg list
 *                              following it should be formatted in mem.
 *                              See Notes below.
 *              ...             as many value args as needed to satisfy
 *                              the format specification above
 * --------------------------------------------------------------------------
 * Returns:     ( int )         # bytes in returned mem,
 *                              0 for any error.
 * --------------------------------------------------------------------------
 * Notes:     o Like sprintf, mem must already be allocated by caller,
 *              and be large enough to accommodate the accompanying
 *              format specification argument.
 *            o format is sprintf-like, but with some significant exceptions
 *              and additions to facilitate smemf's different purpose.
 *              The one most significant similarity and difference is...
 *               * Like sprintf, conversion specifications are introduced by
 *                 the % character and terminated by a conversion specifier
 *                 (see list and discussions below).
 *               * But unlike sprintf, ordinary characters occurring
 *                 outside conversion specifications aren't immediately
 *                 copied into mem...
 *               * Instead, literals that you want formatted in mem
 *                 must always be followed by corresponding conversion
 *                 specifications.
 *               * For example, 123abc%s formats the next >>six bytes<<
 *                 of mem with that >>ascii character<< string.
 *                 But 123abc%x interprets that same string as
 *                 >>hex digits<<, formatting the next >>three bytes<<
 *                 of mem accordingly.
 *               * And that's why ordinary characters must be followed
 *                 by a corresponding conversion specification, i.e.,
 *                 because unlike sprintf, which always interprets ordinary
 *                 characters as ascii, smemf formats binary memory blocks,
 *                 too, and therefore needs a conversion specification
 *                 to interpret the intended meaning of literals.
 *               * Additional notes:
 *                  - When %s isn't preceded by a literal field,
 *                    then smemf interprets the next argument from your
 *                    argument list, in the usual way like sprintf.
 *                    But 123abc%s uses no arguments from your argument list.
 *                  - Field widths like 123abc%10s generate a 10-byte field,
 *                    left-justified with your 123abc literal, and
 *                    right-filled with four blanks. See the field width
 *                    discussion below for additional information about
 *                    optional right-justification, non-blank filler, etc.
 *                  - Leading/trailing whitespace is ignored, so
 *                    " 123abc  %s " is the same as "123abc%s", but any
 *                    embedded whitespace like "123 abc%s" is respected
 *                    (although "123 abc%x" would still be an error,
 *                    while " 123abc %x " becomes okay).
 *                  - Leading/trailing (or pure) whitespace is obtained
 *                    by surrounding the literal with its own quotes,
 *                    e.g., format = " \" 123abc \" %s " includes one blank
 *                    before and after 123abc.
 *                  - On the very rare occasion when you want a literal
 *                    quote character, escape it, i.e., smemf needs
 *                    to see \" in your format string, so you'd need to
 *                    write format = " 123\\\"abc %s " to actually format
 *                    the string 123"abc in mem. That's confusing, but
 *                    a straightforward application of the obvious rules.
 *            o The conversion specifiers recognized by smemf are
 *              s,S, x,X, d,D, all discussed in detail below.
 *              Note that x,X behave identically, as do d,D,
 *              but s,S have different behaviors, discussed in detail below.
 *              But first, some general remarks...
 *               * s,S,x,X are default left-justified, i.e., the first byte
 *                 (or first hex digit for x,X) from your literal or argument
 *                 goes into the next available byte (or hex digit) of mem.
 *                 But %+etc (i.e., a + flag following %) right-justifies
 *                 your literal or argument instead, e.g., 123abc%+10s
 *                 generates a 10-byte field, left-filled with four blanks,
 *                 then followed by your right-justified 123abc literal.
 *                 And 123abc%+10x generates a five-byte field, left-filled
 *                 with two leading 0 bytes, followed by three bytes
 *                 containing your 12,3A,BC.
 *            o The s conversion specifier...
 *               *
 *            o The S conversion specifier...
 *            o The x,X conversion specifier...
 *            o The d,D conversion specifier...
 *               * A literal 123%d is taken as the decimal integer 123,
 *                 or the argument for %d is taken as an int.
 *               * Justification flags following % are ignored.
 *                 The bits comprising your int are "right-justified" in
 *                 your specified field, i.e., low-order bit is rightmost.
 *               * A width %10d means 10 bits. But all fields are byte-sized,
 *                 and 10-bits is "promoted" to 2-bytes, with your int value
 *                 "right-justified" (explained above) in that 16-bit field.
 *                 However, if your int value is greater than 1023=2^10-1,
 *                 then your value is "truncated" to its low-order 10 bits,
 *                 even though that 16-bit field could accommodate more bits.
 *               * If % width.precision d is given, precision is ignored.
 *               * If width is not given, it defaults to 16(bits) if your
 *                 value is less than 65536, or to 32(bits) otherwise.
 *               * Example: 47%d generates two bytes containing 00,2F.
 *                 And 47%17d generates three bytes 00,00,2F.
 * ======================================================================= */
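
The general idea of a format string standing in for the descriptor table can be sketched in a much smaller form. The following is NOT smemf and implements none of its specifiers; it is a hypothetical miniature (the name packf and its format letters are invented for illustration) in which 'b' takes an int argument and stores one byte, 'h' takes an int and stores two bytes, and 'l' takes a long and stores four bytes, always low byte first. Like smemf, it returns the number of bytes stored, or 0 for an error.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical miniature of a format-driven packer; not smemf itself.
 *   'b' : int  argument -> 1 byte
 *   'h' : int  argument -> 2 bytes, low byte first
 *   'l' : long argument -> 4 bytes, low byte first
 * Returns the number of bytes stored, or 0 on an unknown letter. */
static size_t packf(unsigned char *mem, const char *fmt, ...)
{
    va_list ap;
    size_t n = 0;

    assert(mem != NULL && fmt != NULL);
    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        unsigned long v;
        int nbytes;

        switch (*fmt) {
        case 'b': v = (unsigned long)va_arg(ap, int);  nbytes = 1; break;
        case 'h': v = (unsigned long)va_arg(ap, int);  nbytes = 2; break;
        case 'l': v = (unsigned long)va_arg(ap, long); nbytes = 4; break;
        default:  va_end(ap); return 0;
        }
        while (nbytes-- > 0) {
            mem[n++] = (unsigned char)(v & 0xFF);
            v >>= 8;
        }
    }
    va_end(ap);
    return n;
}
```

For example, packf(buf, "bhhhhb", 0x2C, 0, 0, 640, 480, 0) stores a 10-byte block. As with any variadic interface, the compiler cannot check that the arguments match the format string.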

-- 
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )

#26582

From	"BartC" <bc@freeuk.com>
Date	2012-09-22 13:47 +0100
Message-ID	<k3kc56$tu5$1@dont-email.me>
In reply to	#26578

"JohnF" <john@please.see.sig.for.email.com> wrote in message 
news:k3jmbi$qgf$1@reader1.panix.com...

> * Notes:     o Like sprintf, mem must already be allocated by caller,
> *            o format is sprintf-like, but with some significant exceptions
> *               * For example, 123abc%s formats the next >>six bytes<<
> *                 But 123abc%x interprets that same string as

<snip complicated ways of avoiding pragma pack()>

This is similar to the situation of reading or writing a binary file 
containing variable-size values.

Then you might use functions such as inbyte() or outint() to read or build 
such data serially.

But, once you've used your smemf() to create the data representing a packed 
struct, how do you access a field in the middle?

(With my file methods, I might use some seek-function to get to a particular 
offset, but with in-memory structs you'd expect to do so in a more efficient 
manner.)
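
One way to reach a field in the middle of an already-packed block, sketched here under assumptions: the field offsets are known constants of the file format (the ones below are for the 10-byte GIF image descriptor), 16-bit values are stored low byte first, and the helper names get_u16le and set_u16le are invented for illustration. Access is by byte offset into the buffer, so no seeking is involved.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offsets within a packed 10-byte GIF image descriptor. */
enum {
    OFF_SEPARATOR = 0,
    OFF_LEFT      = 1,
    OFF_TOP       = 3,
    OFF_WIDTH     = 5,
    OFF_HEIGHT    = 7,
    OFF_PACKED    = 9
};

/* Read a 16-bit little-endian value at byte offset off. */
static unsigned int get_u16le(const unsigned char *buf, size_t off)
{
    assert(buf != NULL);
    return (unsigned int)buf[off] | ((unsigned int)buf[off + 1] << 8);
}

/* Overwrite a 16-bit little-endian value at byte offset off. */
static void set_u16le(unsigned char *buf, size_t off, unsigned int v)
{
    assert(buf != NULL);
    buf[off]     = (unsigned char)(v & 0xFF);
    buf[off + 1] = (unsigned char)((v >> 8) & 0xFF);
}
```

With these, get_u16le(block, OFF_WIDTH) reads the width directly, and set_u16le(block, OFF_HEIGHT, 200) patches the height in place.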

-- 
Bartc

#26583

From	Ben Bacarisse <ben.usenet@bsb.me.uk>
Date	2012-09-22 14:00 +0100
Message-ID	<0.6eb74176814304194575.20120922140021BST.87zk4i2owq.fsf@bsb.me.uk>
In reply to	#26582

"BartC" <bc@freeuk.com> writes:

> "JohnF" <john@please.see.sig.for.email.com> wrote in message
> news:k3jmbi$qgf$1@reader1.panix.com...
>
>> * Notes:     o Like sprintf, mem must already be allocated by caller,
>> *            o format is sprintf-like, but with some significant exceptions
>> *               * For example, 123abc%s formats the next >>six bytes<<
>> *                 But 123abc%x interprets that same string as
>
> <snip complicated ways of avoiding pragma pack()>

That misses half the problem.  Even if you could rely on generating the
right packed structure (particularly hard if there are bit-fields
involved), you can't rely on getting the right representation.  The most
obvious problem being byte ordering in integer fields.
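
The byte-ordering point can be illustrated with a short sketch (the function names are made up for this example). Copying an integer's object representation, which is what fwrite() of a struct does, produces whatever order the host uses; building the bytes with shifts produces the same order on every host.

```c
#include <assert.h>
#include <string.h>

/* Emit the low 32 bits of v in little-endian order, whatever the host uses. */
static void put_u32le(unsigned char out[4], unsigned long v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Emit the low 32 bits of v in big-endian order, whatever the host uses. */
static void put_u32be(unsigned char out[4], unsigned long v)
{
    assert(out != NULL);
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >> 8) & 0xFF);
    out[3] = (unsigned char)(v & 0xFF);
}

/* Returns 1 if this host stores the low-order byte of an unsigned int
 * first. This is the order a raw memcpy() or fwrite() would expose. */
static int host_is_little_endian(void)
{
    unsigned int one = 1;
    unsigned char first;

    memcpy(&first, &one, 1);
    return first == 1;
}
```

A file format fixes one of the two orders, so only the shift-based functions are safe to use for output; the host check is there just to show what a raw copy would depend on.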

<snip>
-- 
Ben.

#26588

From	JohnF <john@please.see.sig.for.email.com>
Date	2012-09-22 15:42 +0000
Message-ID	<k3km93$lq7$2@reader1.panix.com>
In reply to	#26582

BartC <bc@freeuk.com> wrote:
> But, once you've used your smemf() to create the data
> representing a packed struct, how do you access a field
> in the middle?

You don't. This is to format output, so you already know
what you're putting in, and don't need to re-read it.
I'm writing a gif encoder, forkosh.com/gifsave89.html
For a decoder, you would be reading those blocks,
and then your problem is indeed a problem. I'd have
to write the corresponding scan-like function to smemf()
to deal with that.
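
For the decoding direction, a scan-like function is not the only option. A rough sketch of a plain cursor-based reader follows, with assumptions labeled: the names (struct reader, read_u8, read_u16le, gif_screen_desc_unpack) are hypothetical, and the block decoded is the 7-byte GIF logical screen descriptor (width and height as 16-bit little-endian values, then packed flags, background color index, and pixel aspect ratio, one byte each).

```c
#include <assert.h>
#include <stddef.h>

struct reader {
    const unsigned char *p;  /* next unread byte */
    size_t left;             /* bytes remaining */
    int error;               /* set once a read runs past the end */
};

/* Read one byte, or return 0 and flag an error if none are left. */
static unsigned int read_u8(struct reader *r)
{
    assert(r != NULL);
    if (r->left < 1) {
        r->error = 1;
        return 0;
    }
    r->left--;
    return *r->p++;
}

/* Read a 16-bit value stored low byte first. */
static unsigned int read_u16le(struct reader *r)
{
    unsigned int lo = read_u8(r);
    unsigned int hi = read_u8(r);
    return lo | (hi << 8);
}

/* In-memory form of the logical screen descriptor. */
struct gif_screen_desc {
    unsigned int width, height;
    unsigned int packed, background, aspect;
};

/* Decode 7 bytes from buf; returns 0 on success, -1 if buf is too short. */
static int gif_screen_desc_unpack(struct gif_screen_desc *d,
                                  const unsigned char *buf, size_t len)
{
    struct reader r;

    assert(d != NULL && buf != NULL);
    r.p = buf;
    r.left = len;
    r.error = 0;
    d->width      = read_u16le(&r);
    d->height     = read_u16le(&r);
    d->packed     = read_u8(&r);
    d->background = read_u8(&r);
    d->aspect     = read_u8(&r);
    return r.error ? -1 : 0;
}
```

The error flag lets the caller check once at the end instead of after every field.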
-- 
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )

#26584

From	Eric Sosman <esosman@ieee-dot-org.invalid>
Date	2012-09-22 09:13 -0400
Message-ID	<k3kdhl$6a3$1@dont-email.me>
In reply to	#26578

On 9/22/2012 2:37 AM, JohnF wrote:
> Eric Sosman <esosman@ieee-dot-org.invalid> wrote:
>> ...
>> You write one super-
>> function that accepts a void* struct pointer and a "struct
>> descriptor," usually an array of byte offsets and type codes:
>>    struct struct_descriptor {
>>        size_t offset;
>>        enum { BYTE, BYTEPAIR, BYTEQUAD, ..., STOP } type; };
>[...]
> I'd actually finished (what I think is) a more elegant solution,
> that I called smemf() that's like memcpy() but under format
> control, including additional format specifiers for hex, for bits,
> and for other stuff.[...]

     (Shrug.)  Seems to me your approach could be feasible for
structs with only a few elements, but would likely become very
bulky with larger structs.

	/* My suggestion */
	writer(stream, &instance, descriptorArray);

	/* Your way */
	smemf(buffer, descriptorString,
	    instance.this, instance.that, instance.tother,
	    instance.x1, instance.x2, instance.y1, instance.y2,
	    instance.namelength, instance.namestring);

You've got the same opportunity for mismatch that I pointed out
in my suggestion, *plus* the chance to mix up the individual
fields.  (Did you spot the error in my example?  No?  Ah, well,
you see: It's supposed to be x1,y1,x2,y2, not x1,x2,y1,y2 --
wasn't that obvious?)

     But, hey: Whatever floats your boat.

-- 
Eric Sosman
esosman@ieee-dot-org.invalid

#26597

From	Johann Klammer <klammerj@NOSPAM.a1.net>
Date	2012-09-23 03:10 +0200
Message-ID	<505e6196$0$1570$91cee783@newsreader04.highway.telekom.at>
In reply to	#26578

JohnF wrote:
> Eric Sosman <esosman@ieee-dot-org.invalid> wrote:
>> JohnF wrote:
>>> Any >>portable<< way to accomplish that in c?
>>      No.
>>> In particular, I'm trying to write blocks that
>>> conform to a binary file format (gif), and can
>>> set up structs for them easily enough, but can't
>>> fwrite(blockstruct,sizeof(blockstruct),1,fileptr),
>>> or the like, due to blockstruct's inevitable
>>> padding (which indeed occurs for gif format blocks).
>>
>>      Aha!  You don't need (or want) packed structs at all:
>> You want a way to pluck information from a plain vanilla
>> struct and write it in an externally-defined format.
>> ...
>> You write one super-
>> function that accepts a void* struct pointer and a "struct
>> descriptor," usually an array of byte offsets and type codes:
>>    struct struct_descriptor {
>>        size_t offset;
>>        enum { BYTE, BYTEPAIR, BYTEQUAD, ..., STOP } type; };
>
> Thanks, Eric (and whoever the other guy is). Both your suggestions
> are along the same lines, and, of course, already occurred to me.
> And, as per the drawbacks you pointed out (plus being a pain in
> the neck to maintain two structs per struct), already dismissed
> by me.
>
> I'd actually finished (what I think is) a more elegant solution,
> that I called smemf() that's like memcpy() but under format
> control, including additional format specifiers for hex, for bits,
> and for other stuff. The code actually works fine, but still
> uncompleted is 723 lines (though that includes >>many<< comments),
> which is somewhat of a tail-wagging-dog situation which I also
> want to avoid.
>     The point of smemf is that a single format string replaces
> that entire extra struct. So it's still "extra work for mother",
> but much less in-your-face when reading the program that uses it.
> Since it's not done (and the copyright not registered) yet,
> I'm not releasing it, but (since I can't copyright the idea,
> anyway) below is its (still somewhat incomplete) main comment block
> describing its usage and functional specs, in case anyone else wants
> to take a stab at implementation. Also, some stuff is done but not
> documented, e.g., little/big endian flag to control which way %d works,
> the bit-field specifier, etc. But you'll get the idea. And after that,
> the additional details are obvious to anyone who thinks about it.
>
> <snip smemf() comment block, quoted in full from #26578>
>

There is already a library which does very similar stuff.
It is called libtpl.
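
For reference, the %d width rules in the quoted comment block (bit widths
promoted to whole bytes, value truncated to the stated width, low-order bit
rightmost) can be sketched in a few lines of plain C. This is only an
illustration of the documented behavior, not JohnF's code: put_dfield and
its parameters are made-up names, a width of 0 is assumed to mean "no width
given", and widths above 32 bits are simply clamped.

```c
#include <stddef.h>

/* Hypothetical helper (not part of smemf): write `value` into mem as a
 * %d-style field of `width_bits` bits and return the number of bytes used.
 * Assumption: width_bits == 0 stands for "no width given". */
size_t put_dfield(unsigned char *mem, unsigned long value, unsigned width_bits)
{
    size_t nbytes, i;

    if (width_bits == 0)                 /* default: 16 bits, else 32 */
        width_bits = (value < 65536UL) ? 16 : 32;
    if (width_bits > 32)                 /* clamp, for this sketch only */
        width_bits = 32;

    nbytes = (width_bits + 7) / 8;       /* "promote" bits to whole bytes */

    if (width_bits < 32)                 /* truncate to low-order bits */
        value &= (1UL << width_bits) - 1UL;
    else
        value &= 0xFFFFFFFFUL;

    for (i = 0; i < nbytes; i++)         /* low-order byte goes rightmost */
        mem[nbytes - 1 - i] = (unsigned char)((value >> (8 * i)) & 0xFFu);

    return nbytes;
}
```

With this sketch, put_dfield(mem, 47, 0) fills two bytes 00,2F and
put_dfield(mem, 47, 17) fills three bytes 00,00,2F, matching the examples
in the comment block.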


#26599

From	JohnF <john@please.see.sig.for.email.com>
Date	2012-09-23 02:10 +0000
Message-ID	<k3lr33$eub$1@reader1.panix.com>
In reply to	#26597

Johann Klammer <klammerj@nospam.a1.net> wrote:
> JohnF wrote:
>>
>> /* =======================================================================
>>   * Function:    smemf ( unsigned char *mem, char *format, ... )
>>   * Purpose:     Construct a formatted block of memory, typically containing
>>   *              binary data, e.g., network packets, gif images, etc.
>>   *              Behaves much like (a subset of) sprintf, but the intent
>>   *              to accommodate binary data requires a few significant
>>   *              exceptions, as explained in the Notes section below.
>>  <<snip>>
> 
> There is already a library which does very similar stuff.
> It is called libtpl.

Thanks for the pointer, Johann.
  http://tpl.sourceforge.net/userguide.html
Also looks interesting, though less similar than pack/unpack
for python and perl, pointed out previously. Nevertheless,
certainly valuable additional food for thought. I'll certainly
study them all, and then try to spec out (what seems to me like)
the best C variant.
-- 
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )


#26613

From	Stephen Sprunk <stephen@sprunk.org>
Date	2012-09-23 11:44 -0500
Message-ID	<k3ne96$bja$1@dont-email.me>
In reply to	#26578

On 22-Sep-12 01:37, JohnF wrote:
> I'd actually finished (what I think is) a more elegant solution,
> that I called smemf() that's like memcpy() but under format
> control, including additional format specifiers for hex, for bits,
> and for other stuff. The code actually works fine, but still
> uncompleted is 723 lines (though that includes >>many<< comments),
> which is somewhat of a tail-wagging-dog situation which I also
> want to avoid.

Your smemf() looks interesting, but I'm curious why you went with that
syntax (which, despite claiming to be similar to printf, doesn't seem to
end up looking much like it) rather than leveraging the syntax of an
existing system for the same purpose, eg. Perl's pack/unpack.

S

-- 
Stephen Sprunk         "God does not play dice."  --Albert Einstein
CCIE #3723         "God is an inveterate gambler, and He throws the
K5SSS        dice at every possible opportunity." --Stephen Hawking


#26625

From	JohnF <john@please.see.sig.for.email.com>
Date	2012-09-23 23:23 +0000
Message-ID	<k3o5mb$1ib$1@reader1.panix.com>
In reply to	#26613

Stephen Sprunk <stephen@sprunk.org> wrote:
> On 22-Sep-12 01:37, JohnF wrote:
>> I'd actually finished (what I think is) a more elegant solution,
>> that I called smemf() that's like memcpy() but under format
>> control, including additional format specifiers for hex, for bits,
>> and for other stuff. The code actually works fine, but still
>> uncompleted is 723 lines (though that includes >>many<< comments),
>> which is somewhat of a tail-wagging-dog situation which I also
>> want to avoid.
> 
> Your smemf() looks interesting, but I'm curious why you went with that
> syntax (which, despite claiming to be similar to printf, doesn't seem to
> end up looking much like it) rather than leveraging the syntax of an
> existing system for the same purpose, eg. Perl's pack/unpack.
> S

You didn't read all the followups -- I wasn't aware of perl's
(or python's) pack/unpack, but they were brought to my attention
(thanks again, guys). Then I indeed said I'd be reading up more
carefully about them all, and trying to re-spec smemf() (and maybe
rename it -- pack or spackf???) based on all that stuff,
as the best C variant I can come up with.
   By the way, compared with those pack/unpack formats, I think
smemf()'s format string syntax is a lot more C/sprintf-like
(could you be more specific?). That's, of course, the trick:
access all functionality from a format string syntax that's
immediately intuitively sensible to both (a)people already
familiar with perl and/or python pack/unpack and maybe C/sprintf,
as well as (b)people only familiar with C/sprintf. Since I'm a
b-type myself, and wasn't even aware of the a-types until
brought to my attention here, the C/sprintf look-alike was
my sole original goal (which I'd thought was pretty successful,
modulo the minimal unavoidable syntax differences due to
fundamental functional requirements differences).
-- 
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )


#26627

From	Ben Bacarisse <ben.usenet@bsb.me.uk>
Date	2012-09-24 01:59 +0100
Message-ID	<0.6562e3f0e92d8c6a2354.20120924015911BST.871uhs2q3k.fsf@bsb.me.uk>
In reply to	#26625

JohnF <john@please.see.sig.for.email.com> writes:
<snip>
>    By the way, compared with those pack/unpack formats, I think
> smemf()'s format string syntax is a lot more C/sprintf-like
> (could you be more specific?). That's, of course, the trick:
> access all functionality from a format string syntax that's
> immediately intuitively sensible to both (a)people already
> familiar with perl and/or python pack/unpack and maybe C/sprintf,
> as well as (b)people only familiar with C/sprintf. Since I'm a
> b-type myself, and wasn't even aware of the a-types until
> brought to my attention here, the C/sprintf look-alike was
> my sole original goal (which I'd thought was pretty successful,
> modulo the minimal unavoidable syntax differences due to
> fundamental functional requirements differences).

The main departure from sprintf is that literal characters are not
copied to the destination.  That's going to look very odd at first
glance.  The main departure from pack/unpack is the lack of support for
alternative byte orderings.  As a result, I'm not sure it's all that
close to either familiar "model".

-- 
Ben.


#26629

From	JohnF <john@please.see.sig.for.email.com>
Date	2012-09-24 02:54 +0000
Message-ID	<k3oi08$14i$1@reader1.panix.com>
In reply to	#26627

Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
> JohnF <john@please.see.sig.for.email.com> writes:
> <snip>
>>    By the way, compared with those pack/unpack formats, I think
>> smemf()'s format string syntax is a lot more C/sprintf-like
>> (could you be more specific?). That's, of course, the trick:
>> access all functionality from a format string syntax that's
>> immediately intuitively sensible to both (a)people already
>> familiar with perl and/or python pack/unpack and maybe C/sprintf,
>> as well as (b)people only familiar with C/sprintf. Since I'm a
>> b-type myself, and wasn't even aware of the a-types until
>> brought to my attention here, the C/sprintf look-alike was
>> my sole original goal (which I'd thought was pretty successful,
>> modulo the minimal unavoidable syntax differences due to
>> fundamental functional requirements differences).
> 
> The main departure from sprintf is that literal characters are not
> copied to the destination.  That's going to look very odd at first
> glance.

Impossible to avoid (I think): what are you supposed to do with
  smemf(mem,"deaf"); ?
Is that 4 ascii chars or two hex bytes? Somehow, the
user has to specify what he wants. My solution:
  smemf(mem,"deaf %s"); or smemf(mem,"deaf %x");
where a format specification preceded by a literal
is applied to that literal rather than eating the next arg.
If you've got a better idea, please follow up... consider
that both a request and a challenge. I tried and failed
to think of any better idea. But I'd be grateful for one.
I can (probably) code it, but I can't think of it.
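
To make the ambiguity concrete, here is a small sketch of the two readings
of the literal "deaf". It is not the smemf source: literal_as_chars and
literal_as_hex are hypothetical names, and the hex version assumes an even
number of digits (justification of odd-length literals is left out).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(int c)
{
    static const char digits[] = "0123456789abcdef";
    const char *p;

    if (c == '\0')
        return -1;
    p = strchr(digits, tolower((unsigned char)c));
    return p ? (int)(p - digits) : -1;
}

/* "deaf %s" reading: copy the characters themselves (4 bytes for "deaf"). */
size_t literal_as_chars(unsigned char *mem, const char *lit)
{
    size_t n = strlen(lit);
    memcpy(mem, lit, n);
    return n;
}

/* "deaf %x" reading: two hex digits per byte (2 bytes, DE,AF).
 * Returns 0 if the literal is empty, odd-length, or not all hex digits. */
size_t literal_as_hex(unsigned char *mem, const char *lit)
{
    size_t n = strlen(lit), i;

    if (n == 0 || n % 2 != 0)
        return 0;
    for (i = 0; i < n; i += 2) {
        int hi = hexval(lit[i]);
        int lo = hexval(lit[i + 1]);
        if (hi < 0 || lo < 0)
            return 0;
        mem[i / 2] = (unsigned char)((hi << 4) | lo);
    }
    return n / 2;
}
```

The same four characters produce either 4 bytes or 2 bytes, which is why
the format has to say which reading is wanted.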

> The main departure from pack/unpack is the lack of support for
> alternative byte orderings.  As a result, I'm not sure it's all that
> close to either familiar "model".

It does have little/big-endian support as an extra %d flag.
In the middle of the state machine parsing the format string...
  /* --- endian option flags (for %d) --- */
  case 'l': case 'L': endian = (-1); break; /* little-endian */
  case 'b': case 'B': endian = (+1); break; /* or big-endian */
  case 'e': case 'E': endian = ENDIAN; break; /*or whatever machine uses*/
I just didn't document it very well in the comments yet.
Sorry about that. I have a back-and-forth technique between
writing up the functional specs as comments in enough detail
to get going, then writing some code to see how it works,
then returning to the comments to fix them up or add some more
or whatever, etc, etc.
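
As a rough illustration of what those endian flags select (a sketch only,
not the actual smemf code), the byte-order handling for an integer field
could look like this. store_uint and host_endian are hypothetical names;
the -1/+1 convention is the one used in the case statements above.

```c
#include <stddef.h>

/* Store the low-order nbytes of value into mem.
 * endian: -1 = little-endian, +1 = big-endian; any other value is
 * treated as big-endian in this sketch. Returns nbytes. */
size_t store_uint(unsigned char *mem, unsigned long value,
                  size_t nbytes, int endian)
{
    size_t i;

    for (i = 0; i < nbytes; i++) {
        unsigned char byte = (unsigned char)(value & 0xFFu);
        if (endian == -1)
            mem[i] = byte;                 /* low-order byte first */
        else
            mem[nbytes - 1 - i] = byte;    /* low-order byte last */
        value >>= 8;
    }
    return nbytes;
}

/* What an 'e' flag would need: detect the byte order of the host. */
int host_endian(void)
{
    unsigned int probe = 1;
    return (*(unsigned char *)&probe == 1) ? -1 : +1;
}
```

So store_uint(mem, 0x1234, 2, -1) leaves 34,12 in mem, while the same call
with +1 leaves 12,34.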
   It was primarily smemf()'s general idea I wanted to get across
by posting that comment block. Obviously, any shortcomings
in the specific functional details can be corrected and coded.
Of course, when I originally posted those comments, I wasn't
yet aware of python/perl pack and unpack. In that case, I could
have gotten across the general idea just by mentioning those.
-- 
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )


#26630

From	Ben Bacarisse <ben.usenet@bsb.me.uk>
Date	2012-09-24 04:38 +0100
Message-ID	<0.c55a90e358b081ca4a3c.20120924043848BST.87mx0g1453.fsf@bsb.me.uk>
In reply to	#26629

JohnF <john@please.see.sig.for.email.com> writes:

> Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
>> JohnF <john@please.see.sig.for.email.com> writes:
>> <snip>
>>>    By the way, compared with those pack/unpack formats, I think
>>> smemf()'s format string syntax is a lot more C/sprintf-like
>>> (could you be more specific?). That's, of course, the trick:
>>> access all functionality from a format string syntax that's
>>> immediately intuitively sensible to both (a)people already
>>> familiar with perl and/or python pack/unpack and maybe C/sprintf,
>>> as well as (b)people only familiar with C/sprintf. Since I'm a
>>> b-type myself, and wasn't even aware of the a-types until
>>> brought to my attention here, the C/sprintf look-alike was
>>> my sole original goal (which I'd thought was pretty successful,
>>> modulo the minimal unavoidable syntax differences due to
>>> fundamental functional requirements differences).
>> 
>> The main departure from sprintf is that literal characters are not
>> copied to the destination.  That's going to look very odd at first
>> glance.
>
> Impossible to avoid (I think): what are you supposed to do with
>   smemf(mem,"deaf"); ?
> Is that 4 ascii chars or two hex bytes?

My point was that it's 4 chars (the encoding is whatever your C
implementation decides, it need not be ASCII) in sprintf and not doing
that causes a difference.  You said you'd aimed for (or achieved)
"minimal unavoidable syntax differences".

> Somehow, the
> user has to specify what he wants. My solution:
>   smemf(mem,"deaf %s"); or smemf(mem,"deaf %x");
> where a format specification preceded by a literal
> is applied to that literal rather than eating the next arg.

In sprintf et al. the number of required arguments is equal to the
number of format specifiers and everything else is literal bytes.  I am
not saying this is wrong, I am saying that it does not match my notion
of "minimal unavoidable syntax differences" from sprintf.

<snip>
-- 
Ben.


#26631

From	JohnF <john@please.see.sig.for.email.com>
Date	2012-09-24 04:07 +0000
Message-ID	<k3om9d$nam$1@reader1.panix.com>
In reply to	#26630

Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
> JohnF <john@please.see.sig.for.email.com> writes:
>> Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
>>> JohnF <john@please.see.sig.for.email.com> writes:
>>> <snip>
>>>>    By the way, compared with those pack/unpack formats, I think
>>>> smemf()'s format string syntax is a lot more C/sprintf-like
>>>> (could you be more specific?). That's, of course, the trick:
>>>> access all functionality from a format string syntax that's
>>>> immediately intuitively sensible to both (a)people already
>>>> familiar with perl and/or python pack/unpack and maybe C/sprintf,
>>>> as well as (b)people only familiar with C/sprintf. Since I'm a
>>>> b-type myself, and wasn't even aware of the a-types until
>>>> brought to my attention here, the C/sprintf look-alike was
>>>> my sole original goal (which I'd thought was pretty successful,
>>>> modulo the minimal unavoidable syntax differences due to
>>>> fundamental functional requirements differences).
>>> 
>>> The main departure from sprintf is that literal characters are not
>>> copied to the destination.  That's going to look very odd at first
>>> glance.
>>
>> Impossible to avoid (I think): what are you supposed to do with
>>   smemf(mem,"deaf"); ?
>> Is that 4 ascii chars or two hex bytes?
> 
> My point was that it's 4 chars (the encoding is whatever your C
> implementation decides, it need not be ASCII) in sprintf and not doing
> that causes a difference.  You said you'd aimed for (or achieved)
> "minimal unavoidable syntax differences".
> 
>> Somehow, the user has to specify what he wants.
>> My solution:
>>   smemf(mem,"deaf %s"); or smemf(mem,"deaf %x");
>> where a format specification preceded by a literal
>> is applied to that literal rather than eating the next arg.
> 
> In sprintf et al. the number of required arguments is equal to the
> number of format specifiers and everything else is literal bytes.  I am
> not saying this is wrong, I am saying that it does not match my notion
> of "minimal unavoidable syntax differences" from sprintf.
> <snip>

Well, part of what you snipped was my request for an alternative.
Lacking that, it matches the notion of "minimal unavoidable
syntax differences" because of the "unavoidable" part.
You have to somehow deal with "deaf". And, even worse, with "10",
which could be ascii/decimal/hex/bits, and which I deal with
in one consistent way, "10%s"/"10%d"/"10%x"/"10%b".
   For someone familiar and comfortable with sprintf formats,
that's the most intuitively sensible way I could think of to
deal with the problem. And it >>must<< be dealt with somehow.
If you've got a better idea, I'd be happy to code it.
Otherwise, it's an unavoidable difference.
-- 
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )


#26649

From	Ben Bacarisse <ben.usenet@bsb.me.uk>
Date	2012-09-24 12:16 +0100
Message-ID	<0.945be815097351071b79.20120924121613BST.87a9wf1xj6.fsf@bsb.me.uk>
In reply to	#26631

JohnF <john@please.see.sig.for.email.com> writes:

> Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
>> JohnF <john@please.see.sig.for.email.com> writes:
>>> Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
>>>> JohnF <john@please.see.sig.for.email.com> writes:
>>>> <snip>
>>>>>    By the way, compared with those pack/unpack formats, I think
>>>>> smemf()'s format string syntax is a lot more C/sprintf-like
>>>>> (could you be more specific?). That's, of course, the trick:
>>>>> access all functionality from a format string syntax that's
>>>>> immediately intuitively sensible to both (a)people already
>>>>> familiar with perl and/or python pack/unpack and maybe C/sprintf,
>>>>> as well as (b)people only familiar with C/sprintf. Since I'm a
>>>>> b-type myself, and wasn't even aware of the a-types until
>>>>> brought to my attention here, the C/sprintf look-alike was
>>>>> my sole original goal (which I'd thought was pretty successful,
>>>>> modulo the minimal unavoidable syntax differences due to
>>>>> fundamental functional requirements differences).
>>>> 
>>>> The main departure from sprintf is that literal characters are not
>>>> copied to the destination.  That's going to look very odd at first
>>>> glance.
>>>
>>> Impossible to avoid (I think): what are you supposed to do with
>>>   smemf(mem,"deaf"); ?
>>> Is that 4 ascii chars or two hex bytes?
>> 
>> My point was that it's 4 chars (the encoding is whatever your C
>> implementation decides, it need not be ASCII) in sprintf and not doing
>> that causes a difference.  You said you'd aimed for (or achieved)
>> "minimal unavoidable syntax differences".
>> 
>>> Somehow, the user has to specify what he wants.
>>> My solution:
>>>   smemf(mem,"deaf %s"); or smemf(mem,"deaf %x");
>>> where a format specification preceded by a literal
>>> is applied to that literal rather than eating the next arg.
>> 
>> In sprintf et al. the number of required arguments is equal to the
>> number of format specifiers and everything else is literal bytes.  I am
>> not saying this is wrong, I am saying that it does not match my notion
>> of "minimal unavoidable syntax differences" from sprintf.
>> <snip>
>
> Well, part of what you snipped was my request for an alternative.
> Lacking that, it matches the notion of "minimal unavoidable
> syntax differences" because of the "unavoidable" part.

I gave you my suggestion which is why I snipped the request -- literal
text just represents those characters.  That's what sprintf does and
therefore offers the least "surprise".

> You have to somehow deal with "deaf". And, even worse, with "10",
> which could be ascii/decimal/hex/bits, and which I deal with
> in one consistent way, "10%s"/"10%d"/"10%x"/"10%b".
>    For someone familiar and comfortable with sprintf formats,
> that's the most intuitively sensible way I could think of to
> deal with the problem. And it >>must<< be dealt with somehow.

Doing what sprintf does must surely be closer than doing something new.
This problem that you insist must be dealt with seems to me to be an
invented one: it's intuitive (to me at least) that "deaf" in a format
means copy a 'd' an 'e' an 'a' and then an 'f' to the destination.

> If you've got a better idea, I'd be happy to code it.
> Otherwise, it's an unavoidable difference.

-- 
Ben.


#26651

From	JohnF <john@please.see.sig.for.email.com>
Date	2012-09-24 11:45 +0000
Message-ID	<k3ph4v$lvv$1@reader1.panix.com>
In reply to	#26649

Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
> JohnF <john@please.see.sig.for.email.com> writes:
>>
>> what are you supposed to do with
>>   smemf(mem,"deaf"); ?
>> Is that 4 ascii chars or two hex bytes?
>
> it's intuitive (to me at least) that "deaf" in a format
> means copy a 'd' an 'e' an 'a' and then an 'f' to the destination.

Okay, good to know. Thanks so much for your thoughts.
-- 
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )


#26642

From	"BartC" <bc@freeuk.com>
Date	2012-09-24 10:18 +0100
Message-ID	<k3p8jm$vl5$1@dont-email.me>
In reply to	#26629

"JohnF" <john@please.see.sig.for.email.com> wrote in message
news:k3oi08$14i$1@reader1.panix.com...
> Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:

>> The main departure from pack/unpack is the lack of support for
>> alternative byte orderings.  As a result, I'm not sure it's all that
>> close to either familiar "model".

>   It was primarily smemf()'s general idea I wanted to get across
> by posting that comment block. Obviously, any shortcomings
> in the specific functional details can be corrected and coded.
> Of course, when I originally posted those comments, I wasn't
> yet aware of python/perl pack and unpack. In that case, I could
> have gotten across the general idea just by mentioning those.

Actually this seems a much better idea to me now, than it did at first ...
if you forget about the purpose of using it in place of C's packed structs.

It's just an alternative, possibly simpler way of writing to a binary stream 
than trying to use -printf() functions, or sequences of function calls. And 
presumably to read from them too.

But you didn't explain clearly how it would be used. In the GIF example,
presumably you'd have an *unpacked* struct representing the header
information, which allows the program to access fields, now properly
aligned, using the conventional forms of 'p.a' or 'p->a'.

smemf() (and its -scanf() counterpart) would simply be used to write to the
proper packed form, or to read from it.

So they are not a plug-in replacement for '#pragma pack()'.

-- 
Bartc


#26648

From	JohnF <john@please.see.sig.for.email.com>
Date	2012-09-24 11:04 +0000
Message-ID	<k3peo6$quh$1@reader1.panix.com>
In reply to	#26642

BartC <bc@freeuk.com> wrote:
> "JohnF" <john@please.see.sig.for.email.com> wrote
>> Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
> 
>>> The main departure from pack/unpack is the lack of support for
>>> alternative byte orderings.  As a result, I'm not sure it's all that
>>> close to either familiar "model".
> 
>>   It was primarily smemf()'s general idea I wanted to get across
>> by posting that comment block. Obviously, any shortcomings
>> in the specific functional details can be corrected and coded.
>> Of course, when I originally posted those comments, I wasn't
>> yet aware of python/perl pack and unpack. In that case, I could
>> have gotten across the general idea just by mentioning those.
> 
> Actually this seems a much better idea to me now, than it did at first ...
> if you forget about the purpose of using it in place of C's packed structs.
> 
> It's just an alternative, possibly simpler way of writing to a binary stream 
> than trying to use -printf() functions, or sequences of function calls. And 
> presumably to read from them too.
> 
> But you didn't explain clearly how it would be used. In the GIF example,
> presumably you'd have an *unpacked* struct representing the header
> information, which allows the program to access fields, now properly
> aligned, using the conventional forms of 'p.a' or 'p->a'.

Well, you'd use it however you liked. But for my gif situation,
I'd envisioned >>doing away with structs entirely<<, packed or not.
The smemf >>format string totally replaces<< the need for any struct.
And, of course, I prototyped that for myself, i.e., wrote some
pseudocode using the as-yet-uncompleted smemf, just to make sure
that idea seems to work.
    You can see exhaustively complete comments about the gif block
formats at forkosh.com/gifsave89.html by clicking the Listing link
along the left-hand side under "Related Pages", and scrolling down
to line#493 for the GIFIMAGEDESCRIP struct. My "pseudocode" for that
is just one smemf statement that totally replaces the struct,

nbitsinbuffer =  /* whitespace in smemf format string is ignored */
    smemf(buffer, " 2C %x  "   /* Image Descriptor identifier is hex 2C */
                  "    %2ld"   /* 2-byte little-endian word for X-pos */
                  "    %2ld"   /* 2-byte little-endian word for Top */
                  "    %2ld"   /* 2-byte little-endian word for Width */
                  "    %2ld"   /* 2-byte little-endian word for Height */
    /* following is the "Packed" Byte consisting of five bit fields */
                  "    %3b "   /* 3-bits #colorbits */
                  "  0 %2b "   /* 2-bits "reserved bits" */
                  "  0 %1b "   /* 1-bit  local colortable sort flag */
                  "  0 %1b "   /* 1-bit  interlace flag */
                  "    %1b ",  /* 1-bit  local colortable flag */
    col0,row0, ncols,nrows, ncolorbits, (colortable==NULL?0:1) );

And smemf() returns the size, in #bits, of the buffer it constructs.
That would usually be a multiple of 8, in which case you can just
fwrite(buffer,etc), or do whatever you want with it.
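
For comparison, the same ten-byte block can also be built with plain byte
stores, again without any struct. This is a sketch rather than code from
gifsave89: gif_image_descriptor and put_le16 are hypothetical names, the
interlace, sort and reserved bits are fixed at 0 as in the smemf call
above, and ncolorbits is stored as given (masked to 3 bits). It assumes
the GIF89a layout in which the local colortable flag is the high bit of
the packed byte and the 3-bit field occupies the low bits.

```c
#include <stddef.h>

/* Store a 16-bit value low-order byte first; returns bytes written. */
static size_t put_le16(unsigned char *p, unsigned v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
    return 2;
}

/* Hypothetical helper: fill buf (at least 10 bytes) with a GIF Image
 * Descriptor and return its size in bytes. */
size_t gif_image_descriptor(unsigned char *buf,
                            unsigned left, unsigned top,
                            unsigned width, unsigned height,
                            unsigned ncolorbits, int has_local_table)
{
    size_t n = 0;

    buf[n++] = 0x2C;                     /* Image Descriptor identifier */
    n += put_le16(buf + n, left);        /* X-pos  */
    n += put_le16(buf + n, top);         /* Top    */
    n += put_le16(buf + n, width);       /* Width  */
    n += put_le16(buf + n, height);      /* Height */

    /* packed byte: bit 7 = local colortable flag, bits 6..3 left at 0
     * (interlace, sort, reserved), bits 2..0 = #colorbits field */
    buf[n++] = (unsigned char)((has_local_table ? 0x80u : 0u)
                               | (ncolorbits & 0x07u));
    return n;
}
```

The result can be passed to fwrite() in the same way as the smemf buffer;
the difference is only that the layout lives in code instead of in a
format string.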

> smemf() (and its -scanf() counterpart) would simply be used to write to the
> proper packed form, or to read from it.
> So they are not a plug-in replacement for '#pragma pack()'.

-- 
John Forkosh  ( mailto:  j@f.com  where j=john and f=forkosh )


#26896

From	Stephen Sprunk <stephen@sprunk.org>
Date	2012-09-30 14:21 -0500
Message-ID	<k4a63c$11h$1@dont-email.me>
In reply to	#26648

On 24-Sep-12 06:04, JohnF wrote:
> Well, you'd use it however you liked. But for my gif situation,
> I'd envisioned >>doing away with structs entirely<<, packed or not.
> The smemf >>format string totally replaces<< the need for any struct.
> And, of course, I prototyped that for myself, i.e., wrote some
> pseudocode using the as-yet-uncompleted smemf, just to make sure
> that idea seems to work.
>     You can see exhaustively complete comments about the gif block
> formats at forkosh.com/gifsave89.html by clicking the Listing link
> along the left-hand side under "Related Pages", and scrolling down
> to line#493 for the GIFIMAGEDESCRIP struct. My "pseudocode" for that
> is just one smemf statement that totally replaces the struct,
> 
> nbitsinbuffer =  /* whitespace in smemf format string is ignored */
>   smemf(buffer, " 2C %x  "   /* Image Descriptor identifier is hex 2C */
>                 "    %2ld"   /* 2-byte little-endian word for X-pos */
>                 "    %2ld"   /* 2-byte little-endian word for Top */
>                 "    %2ld"   /* 2-byte little-endian word for Width */
>                 "    %2ld"   /* 2-byte little-endian word for Height */
>   /* following is the "Packed" Byte consisting of five bit fields */
>                 "    %3b "   /* 3-bits #colorbits */
>                 "  0 %2b "   /* 2-bits "reserved bits" */
>                 "  0 %1b "   /* 1-bit  local colortable sort flag */
>                 "  0 %1b "   /* 1-bit  interlace flag */
>                 "    %1b ",  /* 1-bit  local colortable flag */
>   col0,row0, ncols,nrows, ncolorbits, (colortable==NULL?0:1) );
> 
> And smemf() returns the size, in #bits, of the buffer it constructs.
> That would usually be a multiple of 8, in which case you can just
> fwrite(buffer,etc), or do whatever you want with it.

This example shows how far you have diverged from sprintf().  From such
a claim, I would have expected something more like this:

nbitsinbuffer = smemf(buffer,
    "%1lu"   /* Image Descriptor, 8-bit little-endian unsigned int */
    "%2lu"   /* X-Pos, 16-bit little-endian unsigned int */
    "%2lu"   /* Top, 16-bit little-endian unsigned int */
    "%2lu"   /* Width, 16-bit little-endian unsigned int */
    "%2lu"   /* Height, 16-bit little-endian unsigned int */
    /* following is the "Packed" Byte consisting of five bit fields */
    "%3b"   /* #colorbits, 3-bit field */
    "%2b"   /* reserved, 2-bit field */
    "%1b"   /* local colortable sort flag, 1-bit field */
    "%1b"   /* interlace flag, 1-bit field */
    "%1b",  /* local colortable flag, 1-bit field */
    0x2C, col0, row0, ncols, nrows, ncolorbits, 0, 0, 0,
    (colortable==NULL?0:1) );

Notice that whitespace is _not_ ignored and that there are ten arguments
corresponding to ten format specifiers.  I also used %u rather than %d
since I'm pretty sure those ints are supposed to be unsigned (but I'm
not sure that makes a difference here).

Of course, your use of "%ld" to mean "little-endian word" rather than
"long signed integer" is a major difference as well, though that's
forgivable since you obviously need more control over representation
than sprintf()'s specifiers offer.  When thinking it through, though,
that was the point at which I decided that trying to reuse those
specifiers was probably more trouble than it was worth.
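
To show what the "one argument per specifier, literals copied as-is" rule
looks like in practice, here is a deliberately tiny sketch. It is not
smemf and not a proposal for its final syntax: pack_le is a hypothetical
name, it understands only "%<n>lu" with a single-digit byte count n
(little-endian unsigned), and it leaves out the bit-field specifiers
entirely. Arguments are assumed to be passed as unsigned int (or as
non-negative int).

```c
#include <ctype.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical sketch: every "%<n>lu" consumes exactly one argument and
 * writes it as n little-endian bytes; every other character in fmt is
 * copied to mem unchanged. Returns the number of bytes written. */
size_t pack_le(unsigned char *mem, const char *fmt, ...)
{
    va_list ap;
    size_t n = 0;

    va_start(ap, fmt);
    while (*fmt != '\0') {
        if (fmt[0] == '%' && isdigit((unsigned char)fmt[1])
                && fmt[2] == 'l' && fmt[3] == 'u') {
            unsigned width = (unsigned)(fmt[1] - '0');  /* bytes */
            unsigned v = va_arg(ap, unsigned);
            unsigned i;
            for (i = 0; i < width; i++) {
                mem[n++] = (unsigned char)(v & 0xFFu);  /* low byte first */
                v >>= 8;
            }
            fmt += 4;
        } else {
            mem[n++] = (unsigned char)*fmt++;           /* literal byte */
        }
    }
    va_end(ap);
    return n;
}
```

For example, pack_le(buffer, "%1lu%2lu", 0x2Cu, 320u) writes the three
bytes 2C,40,01, and any literal text or whitespace in the format ends up
in the output, as it would with sprintf().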

S

-- 
Stephen Sprunk         "God does not play dice."  --Albert Einstein
CCIE #3723         "God is an inveterate gambler, and He throws the
K5SSS        dice at every possible opportunity." --Stephen Hawking
