Ubuntu, uutils and the elusive C replacement
Ubuntu just announced that they are considering using uutils as their coreutils instead of GNU. I am a big Rust fan (even though there are a bunch of things I dislike in it), and I would consider it my main programming language. Why use Rust here?
Why
Security is number one, and the reasons are obvious, so I won’t explain the benefits of a memory-safe language over a language such as C: that much is clear. But other implementations of coreutils have been around for longer and have been thoroughly battle-tested and audited (including GNU). Still, I do agree that using a memory-safe language is better, but memory-safety is not the only good reason.
The tooling around C is pretty archaic at this point and has pretty bad UX compared to Rust tooling. Rust in general is friendlier than C in all aspects, and this makes the codebase more approachable to new maintainers and people looking to poke into the source code. GNU source code is also notoriously complex, and complexity is also bad in terms of security (the more complex a system is, the harder it is to make secure). Part of this is because GNU supports everything under the sun (and adds extra non-POSIX functionality), while Rust doesn’t (limited by LLVM), but as a modern replacement, especially if catered to Ubuntu specifically, it makes total sense and is a great choice.
Except…
Except none of this is true. Well, some parts are, but not all. Let’s look at a very simple example in
coreutils, the yes utility.
This little guy’s only job is to print y or the first argument that it receives to stdout in an
infinite loop. Super simple.
If we look at the GNU implementation we’ll see that it takes roughly 100 LOC. Huh? For such a simple loop? Not only that, but it also includes some extra headers, three to be exact, that now make it harder to follow:
 #include "system.h"
 #include "full-write.h"
 #include "long-options.h"
Also, why on earth is an IFDEF needed in such a simple program? And what is a CHERI?
 #if defined __CHERI__
   /* Cheri capability bounds do not allow for this.  */
   reuse_operand_strings = false;
 #endif
Maybe it’s this. Still. Why have custom
code for it? IT’S A SIMPLE FOR LOOP. Why architect a simple for loop in such a way that it
needs an IFDEF because of one obscure target? Madness.
The rest of the code is needlessly complex. All of this because of “optimizations” and “performance”.
For the yes utility.
At my day-job I work on performance critical applications and spend quite a bit of time looking at assembly. I like performance work. But this is completely unnecessary. It is harder to understand, more complex and therefore harder to work with. It will be easier for bugs (including security bugs) to creep in and will throw off most people trying to read the code and contribute to it. Plus, the coding style. Yuck.
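For context, the “optimization” in question boils down to filling a large buffer with repeated copies of the line and then writing that whole buffer in a loop, instead of writing one line at a time. Here is a rough sketch of the idea in Rust. It is my own illustration, not a translation of the GNU code: the fill_buffer helper and the 8 KiB size are assumptions made up for this example.

```rust
use std::io::{self, Write};

/// Builds a buffer holding as many whole copies of `line` plus a newline as
/// fit in `capacity` bytes, and always at least one copy.
/// (Hypothetical helper, named for this sketch only.)
fn fill_buffer(line: &[u8], capacity: usize) -> Vec<u8> {
    let mut record = line.to_vec();
    record.push(b'\n');
    let copies = (capacity / record.len()).max(1);
    record.repeat(copies)
}

fn main() -> io::Result<()> {
    // Note: env::args panics if an argument is not valid Unicode.
    let text = std::env::args().nth(1).unwrap_or_else(|| String::from("y"));
    // 8 KiB is an arbitrary size picked for this sketch.
    let buffer = fill_buffer(text.as_bytes(), 8192);
    let mut out = io::stdout().lock();
    // Write the pre-filled buffer until the writer fails (e.g. a closed pipe).
    loop {
        out.write_all(&buffer)?;
    }
}
```

Even in this stripped-down form, the buffered variant already needs a helper and a size decision that the plain loop does not.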
The ©Rust™® Version
Surely, everything is better over here in Rustland. The build system is better, right? No more
horrible Makefiles, just cargo build. Let’s look at the source code.
 
Well, I’ll be damned. Not only do we have one of those super-not-trendy Makefiles, we have two! And
a GNU one at that! Plus cargo, obviously. And a flake, of course. But well, let’s find yes.
After looking for it for a while, I find its source code. It is split in 3 files. Three. Files. For
an infinite loop. This does not bode well at all. One of them is called main.rs. Our entrypoint,
great. Let’s look at it.
Oh no.
 uucore::bin!(uu_yes);
This is the whole code in this file. And it’s a fucking macro. Sigh. The other two files combined
are even longer than the GNU version and are even harder to follow. A sad day indeed. Part of why
it is hard to follow is that it uses a bunch of crates (Rust libraries) for handling all sorts
of things, so the reader must already be familiar with the ecosystem to follow the code.
So: more complex, more dependencies, more code…
Wait, dependencies? What? For a for loop?
Oh yeah. And not just one. Behold:
 [workspace.dependencies]
 ansi-width = "0.1.0"
 bigdecimal = "0.4"
 binary-heap-plus = "0.5.0"
 bstr = "1.9.1"
 bytecount = "0.6.8"
 byteorder = "1.5.0"
 chrono = { version = "0.4.38", default-features = false, features = [
   "std",
   "alloc",
   "clock",
 ] }
 chrono-tz = "0.10.0"
 clap = { version = "4.5", features = ["wrap_help", "cargo"] }
 clap_complete = "4.4"
 clap_mangen = "0.2"
 compare = "0.1.0"
 coz = { version = "0.1.3" }
 crossterm = "0.28.1"
 ctrlc = { version = "3.4.4", features = ["termination"] }
 dns-lookup = { version = "2.0.4" }
 exacl = "0.12.0"
 file_diff = "1.0.0"
 filetime = "0.2.23"
 fnv = "1.0.7"
 fs_extra = "1.3.0"
 # Remove the "=" once we moved to Rust edition 2024
 fts-sys = "=0.2.14"
 fundu = "2.0.0"
 gcd = "2.3"
 glob = "0.3.1"
 half = "2.4.1"
 hostname = "0.4"
 iana-time-zone = "0.1.57"
 indicatif = "0.17.8"
 itertools = "0.14.0"
 libc = "0.2.153"
 linux-raw-sys = "0.9"
 lscolors = { version = "0.20.0", default-features = false, features = [
   "gnu_legacy",
 ] }
 memchr = "2.7.2"
 memmap2 = "0.9.4"
 nix = { version = "0.29", default-features = false }
 nom = "8.0.0"
 notify = { version = "=8.0.0", features = ["macos_kqueue"] }
 num-bigint = "0.4.4"
 num-prime = "0.4.4"
 num-traits = "0.2.19"
 number_prefix = "0.4"
 onig = { version = "~6.4", default-features = false }
 parse_datetime = "0.8.0"
 phf = "0.11.2"
 phf_codegen = "0.11.2"
 platform-info = "2.0.3"
 quick-error = "2.0.1"
 rand = { version = "0.9.0", features = ["small_rng"] }
 rand_core = "0.9.0"
 rayon = "1.10"
 regex = "1.10.4"
 rstest = "0.25.0"
 rust-ini = "0.21.0"
 same-file = "1.0.6"
 self_cell = "1.0.4"
 # Remove the "=" once we moved to Rust edition 2024
 selinux = "= 0.5.0"
 selinux-sys = "= 0.6.13"
 signal-hook = "0.3.17"
 smallvec = { version = "1.13.2", features = ["union"] }
 tempfile = "3.15.0"
 terminal_size = "0.4.0"
 textwrap = { version = "0.16.1", features = ["terminal_size"] }
 thiserror = "2.0.3"
 time = { version = "0.3.36" }
 unicode-segmentation = "1.11.0"
 unicode-width = "0.2.0"
 utf-8 = "0.7.6"
 utmp-classic = "0.1.6"
 uutils_term_grid = "0.6"
 walkdir = "2.5"
 winapi-util = "0.1.8"
 windows-sys = { version = "0.59.0", default-features = false }
 xattr = "1.3.1"
 zip = { version = "2.2.2", default-features = false, features = ["deflate"] }
 
 hex = "0.4.3"
 md-5 = "0.10.6"
 sha1 = "0.10.6"
 sha2 = "0.10.8"
 sha3 = "0.10.8"
 blake2b_simd = "1.0.2"
 blake3 = "1.5.1"
 sm3 = "0.4.2"
 crc32fast = "1.4.2"
 digest = "0.10.7"
 
 uucore = { version = "0.0.30", package = "uucore", path = "src/uucore" }
 uucore_procs = { version = "0.0.30", package = "uucore_procs", path = "src/uucore_procs" }
 uu_ls = { version = "0.0.30", path = "src/uu/ls" }
 uu_base32 = { version = "0.0.30", path = "src/uu/base32" }
These are the dependencies used throughout the project (the yes utility only uses two), excluding
the optional dependencies needed to run tests and build examples. Yikes.
If the whole point of using Rust is the safety, how is using this many dependencies more secure? Especially when they are used in such a sensitive part of a system, the coreutils of all things. This is a huge supply-chain attack waiting to happen. The language design encourages using many libraries because of how easy it is to add dependencies and manage them. Again, the friendliness of Rust and the convenience it offers for developers becomes a double-edged sword.
Are we doomed?
So, to recap: we have a Rust version that is harder to read, harder to understand, has complex
tooling around it, is not as efficient as the C version and is not as portable. You also need a
more powerful machine to build Rust projects and run Rust related tooling like rust-analyzer, which
brings my 16GB of RAM, 20-core CPU laptop to its knees when indexing huge Rust codebases, gating access
to such a fundamental part of any free operating system
to people that have the means to afford expensive hardware.
But it does not have to be this way. Look at the OpenBSD version:
 #include <err.h>
 #include <stdio.h>
 #include <unistd.h>
 
 int
 main(int argc, char *argv[])
 {
 	if (pledge("stdio", NULL) == -1)
 		err(1, "pledge");
 
 	if (argc > 1)
 		for (;;)
 			puts(argv[1]);
 	else
 		for (;;)
 			puts("y");
 }
Delightful. It does exactly what you would expect it to do. It can’t get better than this. The extra
line at the beginning uses pledge, an OpenBSD syscall that ensures
that the program only ever uses stdio and nothing else (I am oversimplifying, but you get the idea).
This makes it even more secure. How great is that! There are other nice C implementations out there (like
busybox), but I really like how OpenBSD does things.
The C replacement
But C still sucks a bit. It still needs replacing.
So, is Rust a good C replacement? I don’t really think so. It’s a good C++ replacement, and I love Rust, I really do, but it can’t truly replace C. There are no other good competing implementations of Rust (it does not even have a specification yet, and even if you wanted to make one, it would be terribly hard because of how complex Rust is as a language). For something as critical as coreutils, we want something that is small, simple and portable. If not now, in the future, and Rust will never be as portable as C because of how big it is.
It can definitely replace C in some areas, but I don’t think this is a good example. To be fair, there is nothing stopping you from writing a Rust version as simple as the OpenBSD version, but it’s a matter of culture.
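To make that concrete, here is a minimal sketch of what such a Rust version could look like, using only the standard library. This is my own illustration, not code from uutils: repeat_line is a made-up name, there is no pledge equivalent here, and treating a closed pipe as a normal exit is an assumption about the desired behaviour.

```rust
use std::env;
use std::io::{self, Write};
use std::process;

/// Writes `line` followed by a newline forever.
/// Only returns when the writer reports an error (for example a closed pipe).
/// (Hypothetical helper, named for this sketch only.)
fn repeat_line<W: Write>(out: &mut W, line: &[u8]) -> io::Result<()> {
    loop {
        out.write_all(line)?;
        out.write_all(b"\n")?;
    }
}

fn main() {
    // First argument if present, otherwise "y".
    // Note: env::args panics if an argument is not valid Unicode.
    let text = env::args().nth(1).unwrap_or_else(|| String::from("y"));
    let mut out = io::stdout().lock();
    if let Err(e) = repeat_line(&mut out, text.as_bytes()) {
        // Assumption: a closed pipe (e.g. `yes | head`) is not worth reporting.
        if e.kind() != io::ErrorKind::BrokenPipe {
            eprintln!("yes: {e}");
            process::exit(1);
        }
    }
}
```

No crates, no macros, one file. It is a few lines longer than the OpenBSD one mostly because the error handling is explicit.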
Rustaceans love to over-complicate code. They love their type system and building abstractions.
They don’t mind using 200 libraries because they have amazing APIs that are hard to misuse, have incredible performance
and are super high quality: so why wouldn’t you? But it tends to be overly complex and brittle.
And breaking changes are common between crate releases. Most libraries are not even at version 1.0, so
they can break stuff without feeling guilty. Only a few crates, like tokio or serde, have had the guts to
say “we are stable”. Even rand has not shipped a 1.0 version!
You can still write simple Rust code, but it is not as idiomatic. I appreciate the power that Rust
gives you, I really do, and I will probably make some posts in the future singing its praises (and others
criticizing it), but how the language is used in practice is also “part” of the language in a way.
As an example, I don’t really dislike Java as a language, and think that the JVM is a phenomenal piece of
software, but I dislike Java because of how the Java community writes Java: all the OOP attached to it,
the “clean” AbstractFactoryVisitorSqueezer everywhere and how most of their programmers wear suits.
Other modern “C replacements” focus mostly on performance, like Zig and Odin, but are not simple at all (despite how much they love to say how simple they are). Which is OK! They are definitely simpler than Rust or C++ (although that is a very low bar), but not as simple as one would like. They seem to be very popular in the Windows game dev crowd, and for good reason, as video games really benefit from a language that puts performance front and center, but they are not very UNIX-y. Zig does not even list any of the BSDs on its download page.
Go is the only one that carries the torch, but too far. First off, it is not “simple”. It has a pretty complex runtime with a garbage collector. So, not really a good C replacement as far as I’m concerned.
Down the rabbit hole
The Hare programming language is my only hope so far. It is made by people that really love UNIX, that live and breathe it and fully understand its values and want to enhance the good parts (while making the bad parts better). It has a specification and a very simple compiler (the backend is not LLVM, but QBE, how amazing is that?). While still in development, it shows great promise and I can’t wait for it to become more stable. The only thing so far that I don’t like is how it uses slices. I think Go slices are a very bad abstraction, and I hope Hare can mark a slice as immutable so that it is clear that it is only a view into data (and hopefully the compiler can make sure at compile time that you can’t append to it, for example). Other than that, I am extremely optimistic. Just take a look at the website and see the rest for yourself. You will quickly either love it or hate it. I’m in the love-it camp.
Just to highlight how much the Hare team gets it, they even have a repo of coreutils
ported to Hare as a language showcase! And look at their yes implementation:
 use fmt;
 use getopt;
 use main;
 use os;
 
 export fn utilmain() (main::error | void) = {
 	const help: []getopt::help = [
 		"output a string repeatedly until killed",
 		"[STRING]",
 	];
 	const cmd = getopt::parse(os::args, help...);
 	defer getopt::finish(&cmd);
 
 	const s = if (len(cmd.args) == 1) {
 		yield cmd.args[0];
 	} else {
 		yield "y";
 	};
 
 	for (true) {
 		fmt::println(s)!;
 	};
 };
Perfection!
