Rust Target Names Aren’t Passed to LLVM

TL;DR: Rust’s i686-unknown-linux-gnu target requires SSE2 and, therefore, does not mean the same as GCC’s -march=i686. It is the responsibility of Linux distributions to use a target configuration that matches what they intend to support.

From time to time, claims that Rust is “not portable” flare up. “Not portable” generally means “LLVM does not support my retrocomputing hobby target.” This is mostly about dead ISAs like DEC Alpha. There is a side track about x86, though: the complaint that Rust’s default 32-bit x86 (glibc) Linux target does not support all x86 CPUs that are still supported by a given Linux distribution.

Upstream Rust ships with two preconfigured 32-bit x86 glibc Linux targets: The primary one has the kind of floating-point math that other ISAs have and requires SSE2. “Primary” here means that the Rust project considers this “guaranteed to work” (Tier 1 in Rust’s platform support terminology). The secondary one does not require SSE2 and, therefore, works on even older CPUs but has floating-point math that differs from other ISAs. “Secondary” here means that the Rust project considers this only “guaranteed to build” (Tier 2). Conceptually, this is simple: x86 with SSE2 and x86 without SSE2. Pick the former if you can and the latter if you must.

The problem is that the x86 with SSE2 target is called i686-unknown-linux-gnu, but i686 is supposed to mean the P6 microarchitecture introduced in Pentium Pro. Pentium Pro didn’t have SSE2. Rust uses i686 as the first component of a target name to mean “default (for Rust) 32-bit x86”, not the P6 microarchitecture specifically. Notably, in the cases of macOS and Android, the baseline CPU is even higher despite the target name starting with i686, because those systems were introduced to x86 later. On the other hand, the x86 without SSE2 glibc Linux target is called i586-unknown-linux-gnu and it does target Pentium.

The key thing here is that i686-unknown-linux-gnu is not trying to capture the concept of i686 but is trying to capture the concept of x86 with SSE2 or x86 with the kind of floating-point math other ISAs have. If the target name started with e.g. i786, it would still not be the target that Linux distributions that target non-SSE2 x86 CPUs should use.
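One way to see which of the two concepts a given build captures is to ask the compiler whether SSE2 is part of the statically enabled feature set. The following is a minimal sketch, not anything from the Rust project itself: classify and current_target are hypothetical helper names, and the only real language features used are the cfg! macro with the target_arch and target_feature keys.

```rust
/// Hypothetical helper: turns the two facts we care about into a label.
pub fn classify(is_x86_32: bool, has_sse2: bool) -> &'static str {
    match (is_x86_32, has_sse2) {
        (true, true) => "32-bit x86 with SSE2",
        (true, false) => "32-bit x86 without SSE2",
        (false, _) => "not 32-bit x86",
    }
}

/// Describes the target this code was compiled for.
/// `cfg!` is evaluated at compile time, so the answer reflects the
/// target configuration, not the CPU the binary happens to run on.
pub fn current_target() -> &'static str {
    classify(cfg!(target_arch = "x86"), cfg!(target_feature = "sse2"))
}
```

Printing current_target() from a binary built with default flags for i686-unknown-linux-gnu should report the SSE2 variant, and one built for i586-unknown-linux-gnu the non-SSE2 variant, assuming no extra -C target-cpu or -C target-feature options are in play.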

The second important thing to note is that the target name is an opaque string and the first component is not parsed and passed to LLVM. Instead, you need to look inside the target specification file to see what is passed to LLVM. There we see base.cpu = "pentium4".to_string(); for i686-unknown-linux-gnu. Notably, the target name existed first and the base CPU was revised subsequently in 2015.
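The difference between parsing the name and looking it up can be made concrete with a toy model. This is only a sketch under stated assumptions: TargetSpec, lookup, and first_component are hypothetical names invented for illustration, and the real rustc target specification has many more fields than a CPU string. The two CPU values are the ones discussed in this post.

```rust
/// Hypothetical, heavily simplified stand-in for a target specification.
#[derive(Debug, Clone, PartialEq)]
pub struct TargetSpec {
    /// Base CPU name that would be handed to LLVM.
    pub cpu: &'static str,
}

/// Looks up a specification using the whole target name as an opaque key.
/// Nothing in the name is interpreted; unknown names simply have no entry.
pub fn lookup(target: &str) -> Option<TargetSpec> {
    match target {
        "i686-unknown-linux-gnu" => Some(TargetSpec { cpu: "pentium4" }),
        "i586-unknown-linux-gnu" => Some(TargetSpec { cpu: "pentium" }),
        _ => None,
    }
}

/// What one might wrongly assume is passed to LLVM: the first name component.
pub fn first_component(target: &str) -> &str {
    target.split('-').next().unwrap_or(target)
}
```

In this model, first_component("i686-unknown-linux-gnu") yields "i686", while lookup on the same name yields a base CPU of "pentium4". The two answers are unrelated by construction, which is the point.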

But isn’t Rust still bad and misleading Linux distributions into believing that i686-unknown-linux-gnu targets pentiumpro instead of pentium4? Perhaps, but if distributions did minimal smoke testing of their binaries on the actual intended baseline hardware, it would be apparent that i686-unknown-linux-gnu is not the Rust target you want if you intend to support non-SSE2 CPUs. If no one bothers to test non-SSE2 CPUs as part of the release process, might that be a sign that it is time to stop pretending to support non-SSE2 CPUs?

What if a distribution is unhappy with both pentium4 and pentium and really wants pentiumpro? Is it appropriate to blame it on Rust? No. A distribution has the freedom to mint e.g. a target like i686-distro-linux-gnu by copying and pasting the i586-unknown-linux-gnu target specification and changing the base CPU to pentiumpro and then build all its own packages with that target. What if that still generates an unwanted instruction somewhere? Chances are that this would be an LLVM bug, in which case it should be up to the retrocomputing community to fix it. At least it’s likely a smaller task than adding an LLVM back end for an entire ISA.
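The copy-and-change step described above can be sketched as follows. This models the idea only: TargetSpec, i586_unknown_linux_gnu, and derive_distro_target are hypothetical names, the struct is a toy with two fields, and an actual distribution would edit a real target specification rather than call a function like this.

```rust
/// Hypothetical, heavily simplified stand-in for a target specification.
#[derive(Debug, Clone, PartialEq)]
pub struct TargetSpec {
    pub name: String,
    /// Base CPU name that would be handed to LLVM.
    pub cpu: String,
}

/// Toy version of the upstream non-SSE2 target, which targets Pentium.
pub fn i586_unknown_linux_gnu() -> TargetSpec {
    TargetSpec {
        name: "i586-unknown-linux-gnu".to_string(),
        cpu: "pentium".to_string(),
    }
}

/// Copies an existing specification and changes only the name and base CPU.
/// Everything else (nothing, in this toy) would be inherited unchanged.
pub fn derive_distro_target(base: &TargetSpec, name: &str, cpu: &str) -> TargetSpec {
    let mut spec = base.clone();
    spec.name = name.to_string();
    spec.cpu = cpu.to_string();
    spec
}
```

Calling derive_distro_target(&i586_unknown_linux_gnu(), "i686-distro-linux-gnu", "pentiumpro") gives a specification whose name says i686 and whose base CPU really is pentiumpro, leaving the upstream one untouched.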

Should the upstream i686-unknown-linux-gnu target be renamed to something like i786-unknown-linux-gnu? Probably not. There are probably enough build configurations out there depending on it meaning what it means now that renaming would be disruptive. Also, baking the ISA extension level into the preconfigured target is not that nice in general, and a better solution is moving towards building the Rust standard library with the same compiler options as everything else, which makes the concept of preconfigured targets less necessary.