Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.compilers > #3726 > unrolled thread
| Started by | John R Levine <johnl@taugh.com> |
|---|---|
| First post | 2026-04-10 12:04 -0400 |
| Last post | 2026-04-19 14:30 +0000 |
| Articles | 4 — 3 participants |
Back to article view | Back to comp.compilers
Paper: dividing by seven John R Levine <johnl@taugh.com> - 2026-04-10 12:04 -0400
Re: Paper: dividing by seven anton@mips.complang.tuwien.ac.at - 2026-04-13 05:48 +0000
Re: Paper: dividing by seven robin51@dodo.com.au - 2026-04-19 00:19 +1000
Re: Paper: dividing by seven anton@mips.complang.tuwien.ac.at - 2026-04-19 14:30 +0000
| From | John R Levine <johnl@taugh.com> |
|---|---|
| Date | 2026-04-10 12:04 -0400 |
| Subject | Paper: dividing by seven |
| Message-ID | <26-04-002@comp.compilers> |
Optimization of 32-bit Unsigned Division by Constants on 64-bit Targets Shigeo Mitsunari, Takashi Hoshino Granlund and Montgomery proposed an optimization method for unsigned integer division by constants [3]. Their method (called the GM method in this paper) was further improved in part by works such as [1] and [7], and is now adopted by major compilers including GCC, Clang, Microsoft Compiler, and Apple Clang. However, for example, for x/7, the generated code is designed for 32-bit CPUs and therefore does not fully exploit 64-bit capabilities. This paper proposes an optimization method for 32-bit unsigned division by constants targeting 64-bit CPUs. We implemented patches for LLVM/GCC and achieved speedups of 1.67x on Intel Xeon w9-3495X (Sapphire Rapids) and 1.98x on Apple M4 (Apple M-series SoC) in the microbenchmark described later. The LLVM patch has already been merged into llvm:main [6], demonstrating the practical applicability of the proposed method. https://arxiv.org/abs/2604.07902 Regards, John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY Please consider the environment before reading this e-mail. https://jl.ly
[toc] | [next] | [standalone]
| From | anton@mips.complang.tuwien.ac.at |
|---|---|
| Date | 2026-04-13 05:48 +0000 |
| Message-ID | <26-04-003@comp.compilers> |
| In reply to | #3726 |
John R Levine <johnl@taugh.com> writes:
>Optimization of 32-bit Unsigned Division by Constants on 64-bit Targets
>
>Shigeo Mitsunari, Takashi Hoshino
>
>Granlund and Montgomery proposed an optimization method for unsigned
>integer division by constants [3]. Their method (called the GM method in
>this paper) was further improved in part by works such as [1] and [7], and
>is now adopted by major compilers including GCC, Clang, Microsoft
>Compiler, and Apple Clang. However, for example, for x/7, the generated
>code is designed for 32-bit CPUs and therefore does not fully exploit
>64-bit capabilities. This paper proposes an optimization method for 32-bit
>unsigned division by constants targeting 64-bit CPUs. We implemented
>patches for LLVM/GCC and achieved speedups of 1.67x on Intel Xeon w9-3495X
>(Sapphire Rapids) and 1.98x on Apple M4 (Apple M-series SoC) in the
>microbenchmark described later. The LLVM patch has already been merged
>into llvm:main [6], demonstrating the practical applicability of the
>proposed method.
>
>https://arxiv.org/abs/2604.07902
In a paper [ertl19kps] I looked at full-width division by multiplying
with the double-width (128 bits on 64-bit targets) reciprocal , and
that gave a small reduction in latency (over the conventional approach
of using a single-width multiplication with corrections) in some
unsigned cases (e.g., n/7). I did not look into half-width division,
because the programming language I am interested in does not have
half-width arithmetics, but of course one can apply the same idea in
this context by multiplying with the full-width reciprocal. It's
interesting that the benefits are much higher in this context.
@InProceedings{ertl19kps,
author = {M. Anton Ertl},
title = {Integer Division by Multiplying with the
Double-Width Reciprocal},
crossref = {kps19},
pages = {75--84},
url = {https://www.complang.tuwien.ac.at/papers/ertl19kps.pdf},
url-slides = {https://www.complang.tuwien.ac.at/papers/ertl19kps-slides.pdf},
abstract = {Earlier work on integer division by multiplying with
the reciprocal has focused on multiplying with a
single-width reciprocal, combined with a correction
and followed by a shift. The present work explores
using a double-width reciprocal to allow getting rid
of the correction and shift.}
}
@Proceedings{kps19,
title = {20. Kolloquium Programmiersprachen und Grundlagen
der Programmierung (KPS)},
booktitle = {20. Kolloquium Programmiersprachen und Grundlagen
der Programmierung (KPS)},
year = {2019},
key = {kps19},
editor = {Martin Pl\"umicke and Fayez Abu Alia},
url = {https://www.hb.dhbw-stuttgart.de/kps2019/kps2019_Tagungsband.pdf}
}
- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/
[toc] | [prev] | [next] | [standalone]
| From | robin51@dodo.com.au |
|---|---|
| Date | 2026-04-19 00:19 +1000 |
| Message-ID | <26-04-004@comp.compilers> |
| In reply to | #3726 |
Might be useful to look at Hacker's Delight (H. S. Warren, Jr). On 11/04/2026 02:04, John R Levine wrote: > Optimization of 32-bit Unsigned Division by Constants on 64-bit Targets > > Shigeo Mitsunari, Takashi Hoshino > ... > https://arxiv.org/abs/2604.07902 [It's an amazing book that belongs on every compiler developer's shelf. Here's a version of the first edition, of dubious legality, at the Internet archive: https://dn710003.ca.archive.org/0/items/random-papers9/Henry%20S.%20Warren%20-%20Hacker%27s%20delight-Addison-Wesley%20%282003%29.pdf -John]
[toc] | [prev] | [next] | [standalone]
| From | anton@mips.complang.tuwien.ac.at |
|---|---|
| Date | 2026-04-19 14:30 +0000 |
| Message-ID | <26-04-005@comp.compilers> |
| In reply to | #3728 |
robin51@dodo.com.au writes:
>Might be useful to look at Hacker's Delight (H. S. Warren, Jr).
I looked at the first edition of this book for my paper
<https://www.complang.tuwien.ac.at/papers/ertl19kps.pdf>, but its
Chapter 10 only got an honourable mention at the end of my
Related-Work section; there is a second edition of this book, however.
In any case, I wrote at the start of my related-work section:
|If you read only one paper about the topic, my recommendation is
|Robison���s [Rob05].
@InProceedings{robison05,
author = "Arch D. Robison",
title = "{$N$}-Bit Unsigned Division Via {$N$}-Bit
Multiply-Add",
OPTeditor = "Paolo Montuschi and Eric (Eric Mark) Schwarz",
booktitle = "{Proceedings of the 17th IEEE Symposium on Computer
Arithmetic (ARITH-17)}",
publisher = "IEEE Computer Society Press",
ISBN = "0-7695-2366-8",
ISBN-13 = "978-0-7695-2366-8",
year = "2005",
bibdate = "Wed Jun 22 07:02:55 2005",
bibsource = "http://www.math.utah.edu/pub/tex/bib/fparith.bib",
URL = "http://www.acsel-lab.com/arithmetic/arith17/papers/ARITH17_Robison.pdf",
abstract = "Integer division on modern processors is expensive
compared to multiplication. Previous algorithms for
performing unsigned division by an invariant divisor,
via reciprocal approximation, suffer in the worst case
from a common requirement for $ n + 1 $ bit
multiplication, which typically must be synthesized
from $n$-bit multiplication and extra arithmetic
operations. This paper presents, and proves, a hybrid
of previous algorithms that replaces $ n + 1 $ bit
multiplication with a single fused multiply-add
operation on $n$-bit operands, thus reducing any
$n$-bit unsigned division to the upper $n$ bits of a
multiply-add, followed by a single right shift. An
additional benefit is that the prerequisite
calculations are simple and fast. On the Itanium 2
processor, the technique is advantageous for as few as
two quotients that share a common run-time divisor.",
acknowledgement = "Nelson H. F. Beebe, University of Utah, Department
of Mathematics, 110 LCB, 155 S 1400 E RM 233, Salt Lake
City, UT 84112-0090, USA, Tel: +1 801 581 5254, FAX: +1
801 581 4148, e-mail: \path|beebe@math.utah.edu|,
\path|beebe@acm.org|, \path|beebe@computer.org|
(Internet), URL:
\path|http://www.math.utah.edu/~beebe/|",
keywords = "ARITH-17",
pagecount = "9",
}
- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/
[toc] | [prev] | [standalone]
Back to top | Article view | comp.compilers
csiph-web