From: Stefan Monnier
Newsgroups: comp.arch
Subject: Register windows (was: The Third Wish)
Date: Thu, 17 Jul 2025 12:20:13 -0400

> The only good arguments I have heard wrt big architectural register
> files has to do with things like Register-Windows and/or optimizing
> CALL/RET interface.

But even there, it justifies only additional "second-class registers",
i.e. where the set of immediately addressable registers can still be the
same size as usual (e.g. 16 or 32), but you can quickly push some of
those to some kind of "stack" and then pull them back in.
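
As a rough illustration of that split, here is a toy Python model (not any
real ISA; the names RegisterFile, push and pull are made up for this
sketch, and the 16-register size is just the example figure from above).
Instructions can only name the small first-class set, while the save
stack is reachable solely through the bulk push/pull operations:

```python
class RegisterFile:
    """Toy model: a small addressable register set plus a hidden save stack."""

    def __init__(self, num_regs=16):
        self.regs = [0] * num_regs   # first-class: directly addressable
        self.save_stack = []         # second-class: only reachable via push/pull

    def push(self, first, count):
        """Copy regs[first:first+count] onto the save stack in one operation."""
        self.save_stack.append((first, self.regs[first:first + count]))

    def pull(self):
        """Restore the most recently pushed group into its original registers."""
        first, values = self.save_stack.pop()
        self.regs[first:first + len(values)] = values


def demo():
    rf = RegisterFile()
    rf.regs[8:12] = [10, 20, 30, 40]  # caller's live values in r8..r11
    rf.push(8, 4)                     # around CALL: save r8..r11
    rf.regs[8:12] = [1, 2, 3, 4]      # callee reuses the same register names
    callee_view = rf.regs[8:12]       # slice copy of what the callee sees
    rf.pull()                         # around RET: caller's values come back
    return callee_view, rf.regs[8:12]
```

The point of the sketch is only that the instruction encoding never has to
name more than num_regs registers, however deep the save stack gets.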

IIRC the Mill actually had 2 categories of "second-class registers": the
stack and the scratch registers.

I think you can get similar benefits with "cache-line sized" memory
operations that load/store several registers at a time (assuming you
have good enough store-to-load forwarding). Or even fold those loads and
stores into some kind of CALL/RET instructions, which can let you start
the control-flow part of the CALL before the stores, and similarly start
the loads before the control-flow part of the RET is done.


        Stefan