Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.c > #234882 > unrolled thread
| Started by | Anton Shepelev <anton.txt@g{oogle}mail.com> |
|---|---|
| First post | 2023-10-12 11:11 +0300 |
| Last post | 2023-10-12 18:31 +0300 |
| Articles | 11 — 4 participants |
Back to article view | Back to comp.lang.c
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: portable way to get highest bit set? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2023-10-12 11:11 +0300
Re: portable way to get highest bit set? Michael S <already5chosen@yahoo.com> - 2023-10-12 02:27 -0700
Re: portable way to get highest bit set? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-12 02:41 -0700
Re: portable way to get highest bit set? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2023-10-12 14:17 +0300
Re: portable way to get highest bit set? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-12 07:01 -0700
Re: portable way to get highest bit set? scott@slp53.sl.home (Scott Lurndal) - 2023-10-12 14:18 +0000
Re: portable way to get highest bit set? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-12 07:58 -0700
Re: portable way to get highest bit set? Michael S <already5chosen@yahoo.com> - 2023-10-12 17:30 +0300
Re: portable way to get highest bit set? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-12 08:05 -0700
Re: portable way to get highest bit set? Michael S <already5chosen@yahoo.com> - 2023-10-12 19:16 +0300
Re: portable way to get highest bit set? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2023-10-12 18:31 +0300
| From | Anton Shepelev <anton.txt@g{oogle}mail.com> |
|---|---|
| Date | 2023-10-12 11:11 +0300 |
| Subject | Re: portable way to get highest bit set? |
| Message-ID | <20231012111100.272c96b3209baad26a150e55@g{oogle}mail.com> |
Tim Rentsch:
> I've coded up a dozen[*] different solutions.
> Here is my current avorite[**]:
>
> typedef union {
> long double d;
> struct { unsigned long long s; unsigned e:15;} b;
> } X;
>
> unsigned long long
> high_bit_set( unsigned long long u ){
> return 1ULL << (X){ .d = u }.b.e - 16383 & u;
> }
And mine is quite anticlimactic. It accepts a byte, because
I wrote it yesterday, before learning that the OP needs a
more generic solution:
unsigned char highest_bit( unsigned char byte )
{ int i ;
unsigned char mask = 1 << 7;
for( i = 0; i <= 8; i += 1 )
{ if( byte & mask )
{ break; }
mask >>= 1;
}
return mask;
}
It is the most straight-forward method that I could think
of, and extensible to larger sizes of the variable.
--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
[toc] | [next] | [standalone]
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2023-10-12 02:27 -0700 |
| Message-ID | <1d76464c-1f6d-423d-ab97-ea24af85558cn@googlegroups.com> |
| In reply to | #234882 |
On Thursday, October 12, 2023 at 11:11:16 AM UTC+3, Anton Shepelev wrote:
> Tim Rentsch:
> > I've coded up a dozen[*] different solutions.
> > Here is my current avorite[**]:
> >
> > typedef union {
> > long double d;
> > struct { unsigned long long s; unsigned e:15;} b;
> > } X;
> >
> > unsigned long long
> > high_bit_set( unsigned long long u ){
> > return 1ULL << (X){ .d = u }.b.e - 16383 & u;
> > }
> And mine is quite anticlimactic. It accepts a byte, because
> I wrote it yesterday, before learning that the OP needs a
> more generic solution:
>
> unsigned char highest_bit( unsigned char byte )
> { int i ;
> unsigned char mask = 1 << 7;
> for( i = 0; i <= 8; i += 1 )
> { if( byte & mask )
> { break; }
> mask >>= 1;
> }
> return mask;
> }
>
> It is the most straight-forward method that I could think
> of, and extensible to larger sizes of the variable.
> --
There exist more straight-forward method that you could think of
if you try to think a little longer. It is suggested by IanC near beginning
of this thread starting from "If it was me I'd do it in reverse".
> () ascii ribbon campaign -- against html e-mail
> /\ www.asciiribbon.org -- against proprietary attachments
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2023-10-12 02:41 -0700 |
| Message-ID | <86cyxkb2ka.fsf@linuxsc.com> |
| In reply to | #234882 |
Anton Shepelev <anton.txt@g{oogle}mail.com> writes:
> Tim Rentsch:
>
>> I've coded up a dozen[*] different solutions.
>> Here is my current avorite[**]:
>>
>> typedef union {
>> long double d;
>> struct { unsigned long long s; unsigned e:15;} b;
>> } X;
>>
>> unsigned long long
>> high_bit_set( unsigned long long u ){
>> return 1ULL << (X){ .d = u }.b.e - 16383 & u;
>> }
>
> And mine is quite anticlimactic. It accepts a byte, because
> I wrote it yesterday, before learning that the OP needs a
> more generic solution:
>
> unsigned char highest_bit( unsigned char byte )
> { int i ;
> unsigned char mask = 1 << 7;
> for( i = 0; i <= 8; i += 1 )
> { if( byte & mask )
> { break; }
> mask >>= 1;
> }
> return mask;
> }
>
> It is the most straight-forward method that I could think
> of, and extensible to larger sizes of the variable.
But the approach you are using requires knowing the size
of the type involved. The challenge is to write code
that works without having to know the size in advance.
So something like the following, with type T being an
unknown unsigned integer type:
T
highest_bit_set( T u ){
T r;
/* set r to the highest bit set in u */
return r;
}
I have written code for this form of the problem, but
won't be posting it before the weekend. Try it!
P.S. My earlier "solution" was just for fun, not a serious
attempt to solve the problem.
[toc] | [prev] | [next] | [standalone]
| From | Anton Shepelev <anton.txt@g{oogle}mail.com> |
|---|---|
| Date | 2023-10-12 14:17 +0300 |
| Message-ID | <20231012141719.99f5a10ec921db3ee6f7d948@g{oogle}mail.com> |
| In reply to | #236095 |
Tim Rentsch:
> But the approach you are using requires knowing the size
> of the type involved. The challenge is to write code that
> works without having to know the size in advance.
Well, I could use macros, or the largest integer type, or:
#include <stddef.h>
#include <limits.h>
char is_le()
{ const int v = 1;
const char* p = (unsigned char*)&v;
return *p == 0;
}
unsigned char highest_bit_1( unsigned char byte )
{ int i ;
unsigned char mask = 1 << (CHAR_BIT-1);
for( i = 0; i <= CHAR_BIT; i += 1 )
{ if( byte & mask )
{ break; }
mask >>= 1;
}
return mask;
}
unsigned long long highest_bit_2( void* val, size_t sz )
{ int i,f,inc;
unsigned char *p = val;
unsigned char c;
char le = is_le();
if( le )
{ inc = 1; } else
{ inc = -1; p += sz-1; }
f = sz;
for( i = 0; i < sz; i += 1 )
{ if( (c = *p) == 0 )
{ p += inc; }
else
{ f = i;
break;
}
}
return highest_bit_1( c ) << (CHAR_BIT*(sz-1-f));
}
--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2023-10-12 07:01 -0700 |
| Message-ID | <864jiwaqic.fsf@linuxsc.com> |
| In reply to | #236929 |
Anton Shepelev <anton.txt@g{oogle}mail.com> writes:
> Tim Rentsch:
>
>> But the approach you are using requires knowing the size
>> of the type involved. The challenge is to write code that
>> works without having to know the size in advance.
>
> Well, I could use macros, or the largest integer type, or:
Neither necessary nor sufficient. The code should work for
the type T being __uint128_t, which is larger than uintmax_t.
> #include <stddef.h>
> #include <limits.h>
>
> [..examine value character-by-character, including
> determining if little-endian encoding is used..]
You're making things more complicated than they need to be.
Looking at character values rather than just values of the type
involved is almost guaranteed to have potential representation
issues. For example, little endian and big endian are not the
only choices possible for endianness.
Below is a proposed solution that mostly works. What problems
does it have, and how can they be fixed? And can we find
something better?
/* The type T has been supplied somewhere upstream. It */
/* is an unsigned type, and possibly outside the realm */
/* of 'integer types' as defined by the C standard, as */
/* for example __uint128_t. */
T
highest_bit_set( T u ){
T r = 1;
while( u>>1 > r ) r <<= 1;
return r;
}
[toc] | [prev] | [next] | [standalone]
| From | scott@slp53.sl.home (Scott Lurndal) |
|---|---|
| Date | 2023-10-12 14:18 +0000 |
| Message-ID | <ffTVM.60121$8fO.5299@fx15.iad> |
| In reply to | #237611 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>Anton Shepelev <anton.txt@g{oogle}mail.com> writes:
>
>> Tim Rentsch:
>>
>>> But the approach you are using requires knowing the size
>>> of the type involved. The challenge is to write code that
>>> works without having to know the size in advance.
>>
>> Well, I could use macros, or the largest integer type, or:
>
>Neither necessary nor sufficient. The code should work for
>the type T being __uint128_t, which is larger than uintmax_t.
>
>> #include <stddef.h>
>> #include <limits.h>
>>
>> [..examine value character-by-character, including
>> determining if little-endian encoding is used..]
>
>You're making things more complicated than they need to be.
>Looking at character values rather than just values of the type
>involved is almost guaranteed to have potential representation
>issues. For example, little endian and big endian are not the
>only choices possible for endianness.
>
>Below is a proposed solution that mostly works. What problems
>does it have, and how can they be fixed? And can we find
>something better?
>
> /* The type T has been supplied somewhere upstream. It */
> /* is an unsigned type, and possibly outside the realm */
> /* of 'integer types' as defined by the C standard, as */
> /* for example __uint128_t. */
>
> T
> highest_bit_set( T u ){
> T r = 1;
> while( u>>1 > r ) r <<= 1;
> return r;
> }
Given a reasonable compiler (gcc or clang), I'd use
/**
* Return the 1-relative bit number of the
* highest set bit. If no bits are set,
* return zero.
*/
size_t
highest_bit_set(unsigned long long u)
{
return u ? 64ul - __builtin_clzll(u) : 0ul;
}
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2023-10-12 07:58 -0700 |
| Message-ID | <86zg0nanw5.fsf@linuxsc.com> |
| In reply to | #237682 |
scott@slp53.sl.home (Scott Lurndal) writes:
> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Anton Shepelev <anton.txt@g{oogle}mail.com> writes:
>>
>>> Tim Rentsch:
>>>
>>>> But the approach you are using requires knowing the size
>>>> of the type involved. The challenge is to write code that
>>>> works without having to know the size in advance.
>>>
>>> Well, I could use macros, or the largest integer type, or:
>>
>> Neither necessary nor sufficient. The code should work for
>> the type T being __uint128_t, which is larger than uintmax_t.
>>
>>> #include <stddef.h>
>>> #include <limits.h>
>>>
>>> [..examine value character-by-character, including
>>> determining if little-endian encoding is used..]
>>
>> You're making things more complicated than they need to be.
>> Looking at character values rather than just values of the type
>> involved is almost guaranteed to have potential representation
>> issues. For example, little endian and big endian are not the
>> only choices possible for endianness.
>>
>> Below is a proposed solution that mostly works. What problems
>> does it have, and how can they be fixed? And can we find
>> something better?
>>
>> /* The type T has been supplied somewhere upstream. It */
>> /* is an unsigned type, and possibly outside the realm */
>> /* of 'integer types' as defined by the C standard, as */
>> /* for example __uint128_t. */
>>
>> T
>> highest_bit_set( T u ){
>> T r = 1;
>> while( u>>1 > r ) r <<= 1;
>> return r;
>> }
>
> Given a reasonable compiler (gcc or clang), I'd use
>
> /**
> * Return the 1-relative bit number of the
> * highest set bit. If no bits are set,
> * return zero.
> */
> size_t
> highest_bit_set(unsigned long long u)
> {
> return u ? 64ul - __builtin_clzll(u) : 0ul;
> }
That gives a solution to a different problem.
When you have code that solves the problem being
discussed, I'm interested to see it.
[toc] | [prev] | [next] | [standalone]
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2023-10-12 17:30 +0300 |
| Message-ID | <20231012173021.0000149c@yahoo.com> |
| In reply to | #237611 |
On Thu, 12 Oct 2023 07:01:47 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> Anton Shepelev <anton.txt@g{oogle}mail.com> writes:
>
> > Tim Rentsch:
> >
> >> But the approach you are using requires knowing the size
> >> of the type involved. The challenge is to write code that
> >> works without having to know the size in advance.
> >
> > Well, I could use macros, or the largest integer type, or:
>
> Neither necessary nor sufficient. The code should work for
> the type T being __uint128_t, which is larger than uintmax_t.
>
> > #include <stddef.h>
> > #include <limits.h>
> >
> > [..examine value character-by-character, including
> > determining if little-endian encoding is used..]
>
> You're making things more complicated than they need to be.
> Looking at character values rather than just values of the type
> involved is almost guaranteed to have potential representation
> issues. For example, little endian and big endian are not the
> only choices possible for endianness.
>
> Below is a proposed solution that mostly works. What problems
> does it have, and how can they be fixed? And can we find
> something better?
>
> /* The type T has been supplied somewhere upstream. It */
> /* is an unsigned type, and possibly outside the realm */
> /* of 'integer types' as defined by the C standard, as */
> /* for example __uint128_t. */
>
> T
> highest_bit_set( T u ){
> T r = 1;
> while( u>>1 > r ) r <<= 1;
> return r;
> }
That code does not look like working.
That is:
T
highest_bit_set1( T u ){
T r = 0;
while (u > r) r = r*2+1;
return r ^ (r>>1);
}
T
highest_bit_set1( T u ){
T r = 0;
while( u > r) r = r*2+1;
return r ^ (r>>1);
}
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2023-10-12 08:05 -0700 |
| Message-ID | <86v8bbanjv.fsf@linuxsc.com> |
| In reply to | #237719 |
Michael S <already5chosen@yahoo.com> writes:
> On Thu, 12 Oct 2023 07:01:47 -0700
> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>
>> Anton Shepelev <anton.txt@g{oogle}mail.com> writes:
>>
>>> Tim Rentsch:
>>>
>>>> But the approach you are using requires knowing the size
>>>> of the type involved. The challenge is to write code that
>>>> works without having to know the size in advance.
>>>
>>> Well, I could use macros, or the largest integer type, or:
>>
>> Neither necessary nor sufficient. The code should work for
>> the type T being __uint128_t, which is larger than uintmax_t.
>>
>>> #include <stddef.h>
>>> #include <limits.h>
>>>
>>> [..examine value character-by-character, including
>>> determining if little-endian encoding is used..]
>>
>> You're making things more complicated than they need to be.
>> Looking at character values rather than just values of the type
>> involved is almost guaranteed to have potential representation
>> issues. For example, little endian and big endian are not the
>> only choices possible for endianness.
>>
>> Below is a proposed solution that mostly works. What problems
>> does it have, and how can they be fixed? And can we find
>> something better?
>>
>> /* The type T has been supplied somewhere upstream. It */
>> /* is an unsigned type, and possibly outside the realm */
>> /* of 'integer types' as defined by the C standard, as */
>> /* for example __uint128_t. */
>>
>> T
>> highest_bit_set( T u ){
>> T r = 1;
>> while( u>>1 > r ) r <<= 1;
>> return r;
>> }
>
> That code does not look like working.
Yes, deliberately so. I didn't want to give working
code yet, so other people could work on the problem.
> That is:
>
> T
> highest_bit_set1( T u ){
> T r = 0;
> while (u > r) r = r*2+1;
> return r ^ (r>>1);
> }
As expected, this works. I like the way it naturally
accommodates a zero input.
Can you find a different solution that works in
logarithmic time rather than linear time?
[toc] | [prev] | [next] | [standalone]
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2023-10-12 19:16 +0300 |
| Message-ID | <20231012191641.00007493@yahoo.com> |
| In reply to | #237875 |
On Thu, 12 Oct 2023 08:05:40 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> Michael S <already5chosen@yahoo.com> writes:
>
> > On Thu, 12 Oct 2023 07:01:47 -0700
> > Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> >
> >> Anton Shepelev <anton.txt@g{oogle}mail.com> writes:
> >>
> >>> Tim Rentsch:
> >>>
> >>>> But the approach you are using requires knowing the size
> >>>> of the type involved. The challenge is to write code that
> >>>> works without having to know the size in advance.
> >>>
> >>> Well, I could use macros, or the largest integer type, or:
> >>
> >> Neither necessary nor sufficient. The code should work for
> >> the type T being __uint128_t, which is larger than uintmax_t.
> >>
> >>> #include <stddef.h>
> >>> #include <limits.h>
> >>>
> >>> [..examine value character-by-character, including
> >>> determining if little-endian encoding is used..]
> >>
> >> You're making things more complicated than they need to be.
> >> Looking at character values rather than just values of the type
> >> involved is almost guaranteed to have potential representation
> >> issues. For example, little endian and big endian are not the
> >> only choices possible for endianness.
> >>
> >> Below is a proposed solution that mostly works. What problems
> >> does it have, and how can they be fixed? And can we find
> >> something better?
> >>
> >> /* The type T has been supplied somewhere upstream. It */
> >> /* is an unsigned type, and possibly outside the realm */
> >> /* of 'integer types' as defined by the C standard, as */
> >> /* for example __uint128_t. */
> >>
> >> T
> >> highest_bit_set( T u ){
> >> T r = 1;
> >> while( u>>1 > r ) r <<= 1;
> >> return r;
> >> }
> >
> > That code does not look like working.
>
> Yes, deliberately so. I didn't want to give working
> code yet, so other people could work on the problem.
>
> > That is:
> >
> > T
> > highest_bit_set1( T u ){
> > T r = 0;
> > while (u > r) r = r*2+1;
> > return r ^ (r>>1);
> > }
>
> As expected, this works. I like the way it naturally
> accommodates a zero input.
>
> Can you find a different solution that works in
> logarithmic time rather than linear time?
Naturally, I can. But the code is not as easy to comprehend as I would
like. And I am not sure that for typical distributions with prevalence
of small numbers it is faster than above variant.
T
highest_bit_set2( T u ){
T v;
for (int rshift = 0; (v = (u >> rshift) >> 1) != 0; rshift = rshift*2+1)
u |= v;
return u ^ (u>>1);
}
[toc] | [prev] | [next] | [standalone]
| From | Anton Shepelev <anton.txt@g{oogle}mail.com> |
|---|---|
| Date | 2023-10-12 18:31 +0300 |
| Message-ID | <20231012183135.a6faa3027ccac0820b24ad96@g{oogle}mail.com> |
| In reply to | #237611 |
Tim Rentsch:
> The code should work for the type T being __uint128_t,
> which is larger than uintmax_t.
I was in my C89 mode:
#include <stdio.h>
#include <stddef.h>
#include <limits.h>
char is_le()
{ const int v = 1;
const char* p = (unsigned char*)&v;
return *p == 0;
}
unsigned char highest_bit_1( unsigned char byte )
{ int i ;
unsigned char mask = 1 << (CHAR_BIT-1);
for( i = 0; i <= CHAR_BIT; i += 1 )
{ if( byte & mask )
{ break; }
mask >>= 1;
}
return mask;
}
/* sz must be the size of the value /and/ of the buffer: */
void highest_bit_2( void* val, void* buffer, size_t sz )
{ int i,inc;
unsigned char *pval = val;
unsigned char *pbuf = buffer;
unsigned char c ;
char le = is_le();
char set;
if( le )
{ inc = 1; } else
{ inc = -1; pval += sz-1; pbuf += sz-1; }
set = 0;
for( i = 0; i < sz; i += 1 )
{ c = *pval ;
if( !set && c != 0 )
{ *pbuf = highest_bit_1( c );
set = 1;
}
else *pbuf = 0;
pval += inc; pbuf += inc;
}
}
void test( long int val )
{ char buf[sizeof(long int)];
highest_bit_2(&val, &buf, sizeof( long int ));
printf("Value: %9x, High-bit mask: %9x\n", val, *(long unsigned int *)&buf);
}
int main(void)
{ test(0x0); test(0x1); test(0x2); test(0x3); test(0x4); test(0x5);
test((1<<17)-1); test((1<<17)); test((1<<17)+1);
}
...printf("%x"), anybody?
--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.c
csiph-web