| Path | csiph.com!weretis.net!feeder9.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end |
|---|---|
| From | John R Levine <johnl@taugh.com> |
| Newsgroups | comp.compilers |
| Subject | Paper: Improving Assembly Code Performance with Large Language Models via Reinforcement Learning |
| Date | Mon, 19 May 2025 12:54:22 -0400 |
| Organization | Compilers Central |
| Sender | johnl%iecc.com |
| Approved | comp.compilers@iecc.com |
| Message-ID | <25-05-013@comp.compilers> |
| MIME-Version | 1.0 |
| Content-Type | text/plain; charset="UTF-8" |
| Injection-Info | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="22877"; mail-complaints-to="abuse@iecc.com" |
| Keywords | optimize |
| Posted-Date | 19 May 2025 12:55:29 EDT |
| X-submission-address | compilers@iecc.com |
| X-moderator-address | compilers-request@iecc.com |
| X-FAQ-and-archives | http://compilers.iecc.com |
| Xref | csiph.com comp.compilers:3656 |
They prompted some LLMs with C programs and the GCC -O3 assembly, with feedback when the result was faster and still correct. It seems to me like asking for trouble, but they claim they got 47% speedup and 96% still correct code. The paper ends with a contrived example where the LLM figured out that a C routine could be collapsed into a POPCNT instruction.

Anjiang Wei, Tarun Suresh, Huanmi Tan, Yinglun Xu, Gagandeep Singh, Ke Wang, Alex Aiken

Abstract

Large language models (LLMs) have demonstrated strong performance across a wide range of programming tasks, yet their potential for code optimization remains underexplored. This work investigates whether LLMs can optimize the performance of assembly code, where fine-grained control over execution enables improvements that are difficult to express in high-level languages. We present a reinforcement learning framework that trains LLMs using Proximal Policy Optimization (PPO), guided by a reward function that considers both functional correctness, validated through test cases, and execution performance relative to the industry-standard compiler gcc -O3. To support this study, we introduce a benchmark of 8,072 real-world programs. Our model, Qwen2.5-Coder-7B-PPO, achieves 96.0% test pass rates and an average speedup of 1.47x over the gcc -O3 baseline, outperforming all 20 other models evaluated, including Claude-3.7-sonnet. These results indicate that reinforcement learning can unlock the potential of LLMs to serve as effective optimizers for assembly code performance.

https://arxiv.org/abs/2505.11480

Regards,
John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly
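
The abstract says the reward considers both test-case correctness and execution speed relative to gcc -O3, but neither the post nor the abstract gives the formula. As a rough sketch only, assuming the reward is zero unless every test passes and otherwise equals the speedup ratio, it might look like the following. The function name `reward`, its parameters, and the all-or-nothing correctness gate are illustrative assumptions, not the paper's definition.

```python
def reward(tests_passed: int, tests_total: int,
           baseline_seconds: float, candidate_seconds: float) -> float:
    """Hypothetical reward for one generated assembly program.

    Assumption: correctness acts as a hard gate, and speed is measured
    as a ratio against the gcc -O3 baseline runtime. The paper may
    combine the two signals differently.
    """
    if tests_total <= 0:
        raise ValueError("need at least one test case")
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("runtimes must be positive")

    # Any failing test means the candidate is treated as incorrect.
    if tests_passed < tests_total:
        return 0.0

    # Speedup over the baseline: values above 1.0 mean the candidate
    # ran faster than the gcc -O3 output.
    return baseline_seconds / candidate_seconds
```

Under these assumptions, a candidate that passes all tests and runs in 1.0 s against a 2.0 s baseline scores 2.0, while one that fails any test scores 0.0 no matter how fast it is.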