Path: csiph.com!xmission!news.snarked.org!border2.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: "Rick C. Hodgin" Newsgroups: comp.compilers Subject: Bit swizzling Date: Sat, 5 Sep 2020 12:05:07 -0400 Organization: Liberty Software Foundation Lines: 52 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <20-09-014@comp.compilers> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="67025"; mail-complaints-to="abuse@iecc.com" Keywords: optimize, question Posted-Date: 05 Sep 2020 12:45:39 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Content-Language: en-US Xref: csiph.com comp.compilers:2585 Are there any algorithms which take a known-at-compile-time sequence of bitwise operations on an 8-bit to 64-bit quantity, and optimize them down to their minimal set of operations? For example, if I have an 8-bit byte and I want to swizzle the bits thusly: Input: 07 06 05 04 03 02 01 00 Output: 05 04 07 02 01 03 00 06 I can easily swizzle the data using a brute force method: v = get_value(); o = 0; swizzle1(o, v, 0, 6); swizzle1(o, v, 1, 0); swizzle1(o, v, 2, 3); swizzle1(o, v, 3, 1); swizzle1(o, v, 4, 2); swizzle1(o, v, 5, 7); swizzle1(o, v, 6, 4); swizzle1(o, v, 7, 5); // Untested, off the top of my head void swizzle(unsigned char& o, unsigned char v, int s, int d) { o |= (((v & (1 << s)) >> s) << d); } And, of course, that algorithm can be optimized based on relative values of s and d, and if s is 0, etc. However, there also exists a minimal set of steps to complete that swizzling because in the operation above, bits 5 and 4 are used in sequence. It could be done using a swizzle2() operation, for example and handle 2 bits at a time. In addition, given the bit operator abilities that exist on various CPUs there are potentially other combinations that exist behind an operation, such as bitswap, where the order of bits flips or mirrors across the center position. Are there any existing algorithms which examine the operations that must be conducted and then create an optimized / minimal sequence of mechanical steps to conduct it given a constrained set of features (such as those present on a given CPU)? Thank you in advance. -- Rick C. Hodgin