From: cross@spitfire.i.gajendra.net (Dan Cross)
Newsgroups: comp.programming
Subject: Re: Rust vs Hype (was Re: Informal discussion: comp.lang.rust?)
Date: Mon, 4 Aug 2025 22:33:36 -0000 (UTC)
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <106rcg0$etv$1@reader1.panix.com>
References: <106apsa$2nju3$1@dont-email.me> <106b3qn$2e7$1@reader1.panix.com> <106lbus$155i0$1@dont-email.me>

In article <106lbus$155i0$1@dont-email.me>, David Brown wrote:
>On 29/07/2025 20:27, Dan Cross wrote:
>> In article <106apsa$2nju3$1@dont-email.me>,
>> David Brown wrote:
>>> On 29/07/2025 14:16, Dan Cross wrote:
>>> [snip]
>>> I personally don't know enough Rust to make any reasonable comparison
>>> with other languages. I also think there is scope for all sorts of
>>> languages, and it seems perfectly reasonable to me for Rust to be
>>> "better" than C while also having C be "better" than Rust - different
>>> languages have their strengths and weaknesses.
>>
>> I agree with this: modern C certainly has its place, and it
>> would be foolish to think that the many billions of lines of C
>> (or C++, for that matter) in existence today are simply going to
>> vanish and be replaced with well-written, idiomatic Rust
>> tomorrow.
>>
>> But no one serious is suggesting that. What I think a number of
>> folks _are_ suggesting is that experience is proving that we get
>> better, less buggy results out of Rust than equivalent C.
>>
>
>Actually, I think many people /are/ suggesting that Rust be used instead
>of C as though it were a clear and complete "upgrade" and that those who
>write C today should switch to Rust and magically create "safer" code
>(for some value of "safer").

Oh, many people are saying it, and they are even saying it
seriously, but they are not serious people. By which I don't
mean people who are serious about replacing C with Rust, but
rather, people who are serious in the sense of being both
qualified and having the judgement to understand the tradeoffs.

Given those definitions, I maintain what I said: no one who is
serious is actually suggesting that we just dump all C code in
the world and replace it with Rust. Ok, maybe a few are, but I
think those people mean on an extremely long timeframe, measured
in decades, at a minimum.

>Now, I /do/ think that many people who write C code today could write
>code that is substantially better code (fewer bugs, more efficient
>development, or other metrics) if they used different languages. (I
>also think the software world could benefit if some people stopped
>programming altogether.) I don't see many areas for which C is the
>ideal choice of language for new code.
>
>But I don't think Rust is the magic bullet that a number of its
>advocates appear to believe. I think many of those C programmers would
>be better off switching to C++, Python, or various other languages (with
>Rust being included as one of those).

I agree with all of the above points. Well, I don't know about
encouraging folks to quit; some folks may realize it's not for
them and opt out, but I kind of look at someone who struggles to
become a competent programmer as a failure of that person's
mentors, teachers, managers, and so on. That's a topic for
another time, though.

>(To be clear, I am not saying that /you/ are claiming Rust is anything
>like that.)

No problem; I understood that.
:-)

>>> But one thing that bothers me is that Rust advocates almost invariably
>>> compare modern Rust, programmed by top-rank programmers interested in
>>> writing top-quality code, with ancient C written by people who may have
>>> very different abilities and motivations.
>>>
>>> Rust is the new, cool language - the programmers who use it are
>>> enthusiasts who are actively interested in programming, and talented
>>> enough to learn the language themselves and are keen to make the best of
>>> it. C, on the other hand, has been the staple language for workhorse
>>> tasks. The great majority of people programming in C over the decades
>>> do so because that's what they learned at university, and that's what
>>> their employers pay them to write. They write C code to earn a living,
>>> and while I am sure most take pride in their jobs, their task is not to
>>> write top-quality bug-free C code, but to balance the cost of writing
>>> code that is good enough with the costs and benefits to customers.
>>>
>>> So it is an artificial and unfair comparison to suggest, as many Rust
>>> enthusiasts do, that existing C code has lots of bugs that could be
>>> prevented by writing the code in Rust - the bugs could be prevented
>>> equally well by one of those Rust programmers re-writing the code in
>>> good, modern C using modern C development tools.
>>
>> You have a point that transcends any sort of Rust<->C debate.
>> Indeed, it is difficult to compare C'23 to C'89, let alone pre-
>> ANSI "typesetter" C, let alone the C that, say, 6th Edition Unix
>> was written in. Those are all very different languages.
>
>Yes, I agree with that. Languages have changed over the last few
>decades, even when within the confines of just a single nominal language
>(like "C", "C++", "Python", or any other living language). The way we
>use languages, and what we do with them, has also changed - again, even
>if you simply stick to a single language variant (such as C90).
>And the tools have changed hugely too.

Agreed.

>> That said, there are some things that are simply impossible to
>> represent in (correct, safe) Rust that are known to be
>> problematic, but that you cannot escape in C. The canonical
>> example in the memory-safety domain is Rust's non-nullable
>> reference types vs C pointers; the latter can be nil, the former
>> cannot. And while Rust _does_ have "raw" pointers (that can be
>> null), you have to use `unsafe` to dereference them. The upshot
>> is that in safe rust, you cannot dereference a NULL-pointer;
>> perhaps Andy Hoare's "billion dollar mistake" can be fixed.
>
>That is all true (other than Tony Hoare's name),

Oops! Thanks.

>and it is definitely
>one of Rust's many positive features. However, it is easy to exaggerate
>the importance of this for many reasons:
>
>1. As I understand it, "unsafe" Rust code is common, especially in
>low-level code.

I write a lot of low-level Rust code, and I don't think that's
quite accurate. Using unsafe at all? Sure, but canonically one
still tries to minimize it, and wrap it up in a safe interface.

Of course, in real world low-level code, we have to do things
like manipulate machine and device registers, manipulate address
spaces, write memory allocators, and so forth. Some amount of
`unsafe` code is generally going to be required. And this is
qualitatively different from what those writing (say) userspace
code face, but even there, we need to do things like invoke
system calls, write memory allocators, and deal with `mmap`.

But it doesn't follow that every other line is `unsafe`, or that
`unsafe` in low-level code need be significantly more common
than in high-level code.

Here's a somewhat trivial example; this is the code for writing
text to the CGA device from rxv64 (that is, printing to the
text-mode graphics interface):
https://github.com/dancrossnyc/rxv64/blob/main/kernel/src/cga.rs

This isn't a huge module; about 100 lines.
But it does do some slightly fiddly stuff to maintain the state
of the screen, position the cursor, scroll text, and so on, all
while actually writing to the (memory-mapped) display; some
other code has already mapped that into the kernel's virtual
address space.

There are precisely 4 uses of `unsafe`: one wraps an intrinsic
(`volatile_copy_memory`), and intrinsics are always unsafe.
Another is creating a `NonNull` object, which encapsulates a
pointer into a "new type", the existence of which asserts that
the pointer is non-null; the pointer in this case is being taken
from a constant that names the (fixed) virtual address of the
start of the CGA MMIO region. A third instance wraps unpacking
that pointer, and converting it into a reference to a mutable
slice of bytes. This is `unsafe` simply to ensure that the
programmer acknowledges that the rules for converting a pointer
to a (mutable) reference are being upheld, as the language has
no insight to do that at compile time itself. The fourth and
final wraps a series of `outb` calls, which set the cursor
location. `outb` is considered an "unsafe" function because it
is possible to (ab)use it to do things that could, in theory,
violate memory safety (i.e., reprogram DMA addresses on IO
devices and things like that).

All in all, 4 `unsafe` blocks covering a total of 8 statements
out of ~100 lines of code to drive a memory-mapped device does
not seem excessive, or especially common, to me. In the kernel
as a whole, out of about 7K lines of code, the `unsafe` keyword
appears exactly 369 times, probably covering 10% or less of
total code. I could probably make that substantially fewer, but
I choose not to, in part because it would diminish the
pedagogical value of the system.

>I think it is a good thing to have the separation of
>"safe" and "unsafe" code, and isolating riskier coding techniques is
>beneficial.
>But perhaps rather than dividing a Rust program into "safe"
>and "unsafe" parts, it would be better still to divide the program into
>low-level efficiency-critical code written in C, and write the majority
>of the code in a higher level managed language where you don't really
>have pointers, or at least where any bugs in pointer-style concepts
>would be caught immediately. Again, I am not saying Rust's philosophy
>is bad here, merely that there are alternatives that could be better and
>Rust's benefits are often over-sold.

Over the years, this has been done many times. A relatively
recent example may be Biscuit
(https://pdos.csail.mit.edu/projects/biscuit.html).

But again, I don't see why it follows that I'd write the
"low-level" parts in C. Indeed, one could structure a system
largely along the lines you described, writing the "low-level
efficiency-critical" parts in Rust. Tock did something like
this (https://tockos.org), limiting by policy the places where
`unsafe` can be used. More recently, ASTERINAS
(https://www.usenix.org/conference/atc25/presentation/peng-yuke)
enforces a similar division. Both are in Rust.

Managed languages are wonderful, but they do have downsides.
Rust tries to thread the needle in giving you precise control
over things like memory allocation and deallocation, without the
overhead, but with a measure of type- and memory-safety you
usually only find in managed languages.

>2. It is perfectly possible to write C code without having errors due to
>null pointer dereferences. I do not recall ever having had a null
>pointer bug in any C code I have written over the last 30+ years. I
>have occasionally had other pointer-related errors, typically as a
>result of typos and usually found very quickly.

While impressive, this is exceedingly rare.
Indeed, I was just debugging a NULL pointer dereference bug in
the Linux kernel earlier today:
https://lore.kernel.org/linux-hams/CAEoi9W4FGoEv+2FUKs7zc=XoLuwhhLY8f8t_xQ6MgTJyzQPxXA@mail.gmail.com/

>These kinds of mistakes
>are avoided with due care and attention to the coding, appropriate
>testing, sensible coding standards, and other good development
>practices. Good tools can help catch problems at compile time (such as
>compiler warnings about uninitialised variables, or extensions to mark
>function parameters as "non-null"). They can also catch bugs quickly at
>runtime, such as using sanitisers. I am a big fan of any language
>feature that can make it more difficult to put bugs in your code (the
>power of a programming language mainly comes not from what it allows
>you to write, but what it stops you from writing). But things like null
>pointer bugs are not a symptom of C programming, but of poor software
>development - and that is language independent.

I strongly disagree with this: it simply doesn't scale, and as
projects get bigger, with more engineers working on them
concurrently, we inevitably run into bugs.

Consider the null pointer deref bug in Linux mentioned above:
what's going on there? It's not that these people are dumb, or
that they are not using capable tools; indeed, the Linux kernel
lock annotation stuff has been a real advance. But this _is_ an
example of someone making a change that violated someone else's
assumptions, across a very large system, and manifested as a nil
pointer in a bit of code that is both obscure and also difficult
to test (to really test AX.25 you need an RF path, and for that,
you need a license and special equipment).

Could better programming practices have caught this? Maybe.
But a language that doesn't let you express the manifestation of
the bug in the first place obviates the need for that.

>3. Alternative existing and established languages already provide
>features that handle this - the prime example being C++.

People keep saying that, and yet available data suggests that
C++ is not nearly as effective as people claim in this area.
Microsoft may be much maligned, but they employ some damned good
engineers, and they're betting on Rust, not C++, to solve a lot
of these sorts of problems.

>>> I also see little in the way of comparisons between Rust and modern C++.
>>
>> Really? I see quite a lot of comparison between Rust and C++.
>> Most of the data out of Google and Microsoft, for instance, is
>> not comparing Rust and C, it's actually comparing Rust and C++.
>> Where Rust is compared to C most often is in the embedded space.
>
>I work with embedded development, so maybe that colours my reading!

Fair point. You may find something like this interesting:
https://security.googleblog.com/2024/09/deploying-rust-in-existing-firmware.html

>>> Many of the "typical C" bugs - dynamic memory leaks and bugs, buffer
>>> overflows in arrays and string handling, etc., - disappear entirely when
>>> you use C++ with smart pointers, std::vector<>, std::string<>, and the
>>> C++ Core Guidelines. (Again - I am not saying that C++ is "better" than
>>> Rust, or vice versa. Each language has its pros and cons.)
>>
>> And yet, experience has shown that, even in very good C++ code
>> bases, we find that programs routinely hit those sorts of
>> issues. Indeed, consider trying to embed a reference into a
>> `std::vector`, perhaps so that one can do dynamic dispatch thru
>> a vtable. How do you do it? This is an area where C++
>> basically forces you to use a pointer; even if you try to put a
>> `unique_ptr` into the vector, those can still own a `nullptr`.
>
>Sure, there are still occasions when you need raw pointers in C++. It
>is far from a "perfect" language, and suffers greatly from many of its
>good ideas being add-ons rather than in the original base language,
>meaning the older more "dangerous" styles being still available (and
>indeed often the default).
>But just as you should not need raw pointers
>in most of your Rust programming, you should not need raw pointers in
>most of your C++ programming.

Except that you do, if you want to do (say) dynamic method
dispatch across a collection of objects. If you don't use
inheritance you don't have to care, but that's a pretty
fundamental part of the language.

I'd say this is qualitatively different from the situation in
Rust, where most of the time you don't need the dangerous thing,
while it's baked into core language features in C++.

>>> So while I appreciate that comparing these two projects might be more
>>> useful than many vague "C vs. Rust" comparisons, it is still a
>>> comparison between a 10-20 year old C project and a modern Rust design.
>>>
>>> The most immediate first-impression difference between the projects is
>>> that the Rust version is sensibly organised in directories, while the C
>>> project jumbles OS code and user-land utilities together. That has,
>>> obviously, absolutely nothing to do with the languages involved. Like
>>> so often when a Rust re-implementation of existing C code gives nicer,
>>> safer, and more efficient results, the prime reason is that you have a
>>> re-design of the project in a modern style using modern tools with the
>>> experience of knowing the existing C code and its specifications (which
>>> have usually changed greatly during the lifetime of the C code). You'd
>>> get at least 90% of the benefits by doing the same re-write in modern C.
>>
>> Not really. rxv64 has a very specific history: we were working
>> on a new (type-1) hypervisor in Rust, and bringing new engineers
>> who were (usually) very good C and C++ programmers onto the
>> project. While generally experienced, these folks had very
>> little experience with kernel-level programming, and almost none
>> in Rust.
>> For the OS bits, we were pointing them at MIT's course
>> materials for the 6.828 course, but those were in C and for
>> 32-bit x86, so I rewrote it in Rust for x86_64.
>>
>> In doing so, I took care to stay as close as I reasonably could
>> to the original. Obviously, some things are different (most
>> system calls are implemented as methods on the `Proc` type, for
>> example, and error handling is generally more robust; there are
>> some instances where the C code will panic in response to user
>> action because it's awkward to return an error to the calling
>> process, but I can bubble thus back up through the kernel and
>> into user space using the `Result` type), but the structure is
>> largely the same; my only real conceit to structural change in
>> an effort to embrace modernity was the pseudo-slab allocator for
>> pipe objects. Indeed, there are some things where I think the
>> rewrite is _less_ elegant than the original (the doubly-linked
>> list for the double-ended queue of free buffers in the block
>> caching layer, for instance: this was a beautiful little idea in
>> early Unix, but its expression in Rust -- simulated using
>> indices into the fixed-size buffer cache -- is awkward).
>
>I can certainly see that some things are more inconvenient to write in
>C, even in modern C standards and styles - Rust's Result types are much
>nicer to use than a manual struct return in C. (C++ has std::expected<>
>that is very similar to Result, though Result has some convenient
>shortcuts and is much more pervasive in Rust.)

Yeah. I want to like `std::expected`, but its use is awkward
without the syntactic sugar of the `?` operator. There's the
old saw about syntactic sugar and cancer of the semicolon, but
in this case, I think they got it mostly right.

>> The Rust port did expose a few bugs in the original, which I
>> fixed and contributed back to MIT.
>> And while it's true that the
>> xv6 code was initially written in the mid 00's, it is still very
>> much used and maintained (though MIT has moved on to a variant
>> that targets RISC-V and sunsetted the x86 code). Also, xv6 has
>> formed the basis for several research projects, and provided a
>> research platform that has resulted in more than one
>> dissertation. To say that it is not representative of modern C
>> does not seem accurate; it was explicitly written as a modern
>> replacement for 6th Edition Unix, after all. And if it is not
>> considered modern, then what is?
>
>Modern C would, IMHO, imply C99 at the very least. It would be using
>C99 types, and it would be declaring local variables only when they can
>be properly initialised - and these would often be "const". It would be
>using enumerated types liberally, not a bunch of #define constants. It
>would not be declaring "extern" symbols in C files and other scattered
>repeated declarations - rather, it would be declaring exported
>identifiers once in an appropriately named header and importing these
>when needed so that you are sure you have a single consistent
>definition, checked by the compiler. It would not be using "int" as a
>boolean type, returning 0 or -1, or when an error enumeration type would
>be appropriate. It would be organised into directories, rather than
>mixing the OS code, build utilities, user-land programs, and other files
>in one directory. Header files - at a minimum, those that might be used
>by other code - would have guards and include any other headers needed.
>Files would have at least a small amount of documentation or
>information. All functions would be either static to their files, or
>declared "extern" in the appropriate header.

Much of this is, as you say, stylistic.
Enumerations for constants might be an improvement, I suppose,
but in C they confer few benefits beyond `#define`; they're
basically `int`s up through C18, and still default to `int`
unless defined with a fixed underlying type in C23.

That said, I mostly agree with you; certainly, using
appropriately sized types everywhere would already be a
considerable improvement.

>And for code like this that is already compiler-specific, I would also
>want to see use of compiler extensions for additional static error
>checking - at least, if it is reasonable to suppose a compiler that is
>modern enough for that.

This I must push back on. This code is not (or, rather, should
not be) _compiler specific_ so much as _ABI specific_. There
seem to be a handful of specific compiler attributes used here:
mostly in marking bits of code `__attribute__((noreturn))`
(though we have a standard way to write that now) and aligned
(we have a standard way to do that now, too). There's a small
amount of inline assembly, which _could_ just be in a `.S` file,
and a couple of calls to the `__sync_synchronize()` intrinsic,
which could be replaced with some sort of atomic barrier, I'd
imagine.

But herein lies another problem with this approach: once you
start using a lot of compiler-specific extensions, you're not
exactly programming in C anymore, but some dialect defined by
your compiler and enabled extensions. You limit portability,
and you create a future maintenance burden if you ever need to
switch to a different compiler.

As a counter-example, we did the Harvey OS a few years ago.
This was a "port" of the Plan 9 OS to mostly-ISO standard C11,
replacing the Plan 9 C dialect and Ken Thompson's compilers. It
worked, and we made it compile with several different versions
of several different compilers. This discipline around
portability actually revealed several bugs, which we could find
and fix.

>Now, I realise some of these are stylistic choices and not universal.
>And some (like excessive use of "int") may be limited by compatibility
>with Unix standards from the days of "everything is an int".
>
>None of this means the code is bad in any particular way, and I don't
>think it would make a big difference to the "code safety". But these
>things do add up in making code easier to follow, and that in turn makes
>it harder to make mistakes without noticing.

That's fair, but if we're talking about how the language affects
safety of the resulting program, mostly stylistic changes aren't
really all that helpful. Even using the sized types from e.g.
`stdint.h` only gets you so far: you've still got to deal with
C's abstruse implicit integer promotion rules, UB around signed
integer overflow, and so on.

For example, consider:

	uint16_t
	mul(uint16_t a, uint16_t b)
	{
		return a * b;
	}

Is that free of UB in all cases?

The equivalent Rust code is:

	pub fn mul(a: u16, b: u16) -> u16 {
	    a.wrapping_mul(b)
	}

There's plenty of crazy C code out there from the days of "all
the world's a VAX" that we can pick on; safety wise, xv6 is
actually pretty decent, though.

>> I hear this argument a lot, but it quickly turns into a "no true
>> Scotsman" fallacy.
>
>Agreed - that is definitely a risk. And it also risks becoming an
>argument about preferred styles regardless of the language.

+1e6

>> This is less frivolous than many of the
>> other arguments that are thrown out to just dismiss Rust (or any
>> other technology, honestly) that often boil down to, honestly,
>> emotion. But if the comparison doesn't feel like it's head to
>> head, then propose a _good_ C code base to compare to Rust.
>>
>
>I don't know of any appropriate C code for such a comparison.
>I think
>you'd be looking for something that can showcase the benefits of Rust -
>something that uses a lot of dynamic memory or other allocated
>resources, and where you have ugly C-style error handling where
>functions return an error code directly and the real return value via a
>pointer, and then you have goto's to handle freeing up resources that
>may or may not have been allocated successfully.
>
>Ideally (in my mind), you'd also compare that to C++ code that used
>smart pointers and/or containers to handle this, along with exceptions
>and/or std::expected<> or std::optional<>.

I feel like this has been done now, a few times over, with data
published from some of the usual FAANG suspects, as well as in a
growing number of academic conferences. Then you've got things
like https://ferrocene.dev/en/ as well.

>I am totally unconvinced by arguments about Rust being "safer" than C,
>or that it is a cure-all for buffer overflows, memory leaks, mixups
>about pointer ownership, race conditions, and the like - because I
>already write C code with minimal risk of these problems, at the cost of
>sometimes ugly and expansive code, or the use of compiler-specific code
>such as gcc's "cleanup" attribute. And I know I can write C++ code with
>even lower risk and significantly greater automation and convenience.
>If Rust lets me write code that is significantly neater than C++ here,
>then maybe it is worth considering using it for my work.

First, I think you are totally justified in being skeptical. By
all means, don't take my (or anyone else's) word about the
language and what we express as benefits. Indeed, I was
extremely skeptical when I started looking at Rust seriously,
and frankly, I'm really glad that I was. I started from a
default negative position about the claims, but I did try to
give the thing an honest shot, and I can say with confidence
that it surprised me with just how much it really _can_ deliver.

That said, I am still not an unapologetic fan boy.
There are parts of the language that I think are awkward or
ill-designed; it does not fix _all_ bugs, or seek to. I just
think that, for the application domain that I work in most often
(kernel-level code on bare metal), Rust is the best available
language at this time. It is not my child, though: if another
language that I felt served that area _better_ suddenly showed
up, I'd advocate switching.

But the flip side of skepticism is that one has to be open to
evidence that disconfirms one's preconceptions about a thing.
So it's fine to be skeptical, but my suggestion (if you are so
inclined) is to pick a problem of some size, and go through the
exercise of learning enough Rust to write an idiomatic solution
to that problem, then maybe seek out some folks with more
experience in the language for critique. That is, really dig
into a problem with it and see how it feels _then_. I'd
honestly be a bit surprised if you came away from that exercise
with the same level of skepticism.

>First, however, the language and tools need to reach some level of
>maturity. C++ has a new version of the language and library every 3
>years, and that's arguably too fast for a lot of serious development
>groups to keep up. Rust, as far as I can see, comes out with new
>language features every 6 weeks. That may seem reasonable to people
>used to patching their Windows systems every week, but not for people
>who expect their code to run for years without pause.

That is not quite accurate. A new "edition" of the language is
published about once every three years (we're currently on Rust
2024, for instance; the one before that was 2021, then 2018 and
2015). A new stable version of the compiler comes out every six
weeks, with each release spending the preceding six weeks as a
beta, I believe (I don't really follow the beta series, so I may
be wrong about the timeline there), and a nightly compiler
approximately every day.
Usually each beta/stable compiler promotes an interface that
_was_ experimental to being "stable", but usually that's
something in a library; you can only use the experimental
("unstable") interfaces from a nightly compiler.

Interfaces give you a means for long-term support. We have code
that uses the 2021 and 2018 editions, and the language hasn't
changed so much that it breaks.

The tooling is very good, and I would argue superior to C++'s in
many ways (error messages, especially). There is robust and
wide support in debuggers, profilers, various inspection and
instrumentation tools, etc; part of that is due to the wise
decision to (mostly) use C++-compatible name mangling for
symbols, and build on the LLVM infrastructure.

There are a number of very good references for the language; the
official book is available online, gratis:
https://doc.rust-lang.org/book/
(Disclaimer: Steve Klabnik is one of my colleagues.)

I'd say, give it a whirl. You may feel the same, but you may
also be pleasantly (or perhaps unpleasantly!) surprised. In any
event, I'd like to know how it turns out.

	- Dan C
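
P.S. On the `mul` example earlier in this post: with a 32-bit
`int`, C's integer promotions convert both `uint16_t` operands
to signed `int` before multiplying, so 65535 * 65535 overflows a
signed type, which is undefined behavior. In Rust the
overflow policy has to be spelled out at the call site. What
follows is only a rough sketch of the options; the three
function names are made up for illustration, while
`wrapping_mul`, `checked_mul` and `saturating_mul` are the
standard library methods on `u16`.

```rust
/// Hypothetical helper: overflow wraps modulo 2^16, with no UB.
pub fn mul_wrapping(a: u16, b: u16) -> u16 {
    a.wrapping_mul(b)
}

/// Hypothetical helper: overflow is reported to the caller as `None`.
pub fn mul_checked(a: u16, b: u16) -> Option<u16> {
    a.checked_mul(b)
}

/// Hypothetical helper: overflow clamps to `u16::MAX`.
pub fn mul_saturating(a: u16, b: u16) -> u16 {
    a.saturating_mul(b)
}

fn main() {
    // 65535 * 65535 = 4294836225 = 0xFFFE0001, which fits neither
    // in 16 bits nor in a 32-bit signed int.
    let (a, b) = (u16::MAX, u16::MAX);
    println!("wrapping:   {}", mul_wrapping(a, b)); // low 16 bits: 1
    println!("checked:    {:?}", mul_checked(a, b)); // None
    println!("saturating: {}", mul_saturating(a, b)); // 65535
}
```

Which of the three is "right" depends on what the caller
wants; the point is only that the choice is visible in the
source rather than left to the promotion rules.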