That's a "great on the surface, bad when you think about it" thing if I've ever ...

xmprt · on Jan 21, 2021

I have solved the halting problem. Every program you write compiles and executes as HALT; and therefore it's possible to determine if a program halts! It might not be correct but it sure is fast!

mrybczyn · on Jan 21, 2021

https://en.wikipedia.org/wiki/Full_employment_theorem

You've broken the full employment theorem for compiler writers.

jcranmer · on Jan 21, 2021

The practical effect isn't what you think it is.

Many compiler writers are employed by hardware companies, and these companies are employing them to sell their hardware chips. For these companies, even a 1% boost in performance on $BENCHMARK can drive measurable impact on sales. Even if you're not worried about driving hardware sales, performance tends to be the major benchmark that gets mentioned in comparisons--note how the "Benchmarks Game" (what used to be called the Programming Language Shootout) compares languages solely on the basis of performance.

In other words, if you look at actual customer behavior, performance tends to be one of the most dominant factors in choosing one compiler over another (language compliance is probably more dominant, however). Surety of correctness is not a major factor at all.

Now, in order to actually report benchmarks correctly, you have to implement them correctly, so to some degree you have a basic guarantee that the compiler will usually work. But someone pointing out a soundness issue using a hypothetical example (as opposed to real code) is going to end up low priority because the evidence is that it doesn't matter in practice for the real, large applications that customers care about (as no one else has complained about it for the past several years).

In summary, practical performance will win out over theoretical correctness, but not practical correctness.

igouy · on Jan 21, 2021

> …"Benchmarks Game" … compares languages solely on the basis of performance.

Not true.

1) source code size is compared

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

2) correctness is checked

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

3) time "to complete and log all make actions" (build the program) is also logged

jcranmer · on Jan 21, 2021

> 1) source code size is compared

This is new to me (the Benchmarks Game has been around for well over a decade).

> 2) correctness is checked

This is par for the course of benchmarks. A benchmark that is incorrectly compiled provides no useful information whatsoever about performance. It's not really what people mean when they complain about correctness issues in the compiler (which is generally referred to as soundness in literature).

> 3) time "to complete and log all make actions" (build the program) is also logged

Performance in a compiler is usually taken to be a combination of runtime, code size, and compile time, although all three components are not weighted equally.

igouy · on Jan 21, 2021

Back on May 23 2006, a sortable column of GZip source code size measurements:

https://web.archive.org/web/20060523195137/http://shootout.a...

https://web.archive.org/web/20060519213743/http://shootout.a...

> Performance in a compiler is usually taken to be a combination of runtime, code size, and compile time

The benchmarks game shows source code size, not compiled code size.

qznc · on Jan 21, 2021

If people cared about correctness, they would use CompCert instead of GCC or Clang.

pjmlp · on Jan 21, 2021

Or create OSes where applications are all written in managed languages, and the use of GCC and clang is gate keeped to the OS vendor and native libraries, just an idea.

MaxBarraclough · on Jan 21, 2021

Bit of a nitpick, but the real value is in safe languages, rather than managed languages, no? The distinction is a little blurry, admittedly, but as an example, I figure that verified SPARK counts as safe but unmanaged.

pjmlp · on Jan 21, 2021

Because as you might have guessed, I was speaking of iOS, Android and ChromeOS, while SPARK as much as I like it, is a very niche technology.

MaxBarraclough · on Jan 21, 2021

I should have gone with a less niche example: the safe subset of Rust. D also has a safe subset.

pjmlp · on Jan 21, 2021

Still niche, Rust is not up to the stuff that app developers expect in terms of tooling and D still has quite a few @safe bugs to sort out, among its current adoption issues.

Unfortunely regardless of their own reports regarding CVEs, Microsoft, Apple and Google keep turning to C and C++ for their bottom layers + managed languages, instead of going full speed in some of those alternatives.

So it appears to be the long term solution to mitigate security issues, while keeping compatibility with existing code and tooling.

eru · on Jan 21, 2021

Why use managed languages when you can prove your code correct?

pjmlp · on Jan 21, 2021

With what, Frama-C? Good luck getting it adopted outside domains that enforce its use for code certification.

eru · on Jan 27, 2021

Frama-C is a specific compiler. It has not much to do with proving your own code. (And C ain't a good language for writing code you want to prove correct in.)

You are right that after having proven your code correct, you might want to use an implementation (compiler or interpreter), that is also proven correct. But those are still two different things.

edwcross · on Jan 28, 2021

Frama-C is a framework for code analysis. It can help proving code with the aid of ACSL (JML-like notation for C contracts).

But I agree that C is not the best language for writing critical code.

tobylane · on Jan 21, 2021

CompCert doesn’t guarantee these two functions will work. Thanks, didn’t know about that project.

Cthulhu_ · on Jan 21, 2021

I do; low level, OS and library code should be both. It's not an either or; you start with the correct code (and code to verify), then if needs be you optimize.

wongarsu · on Jan 21, 2021

So you're saying maximum priority is correctness, and performance second place? Because if you don't have correctness you can't "start with correct code"

0xTJ · on Jan 21, 2021

If you have a broken compiler, you can have broken generated code, and not even know it. This isn't about whether or not your C code is correct, this is about whether the assembly that's generated matches the behavior in your code.

ndr · on Jan 21, 2021

How is this not confirming the point you're replying to?

lmm · on Jan 21, 2021

Anyone who's decided to use C has already made that choice, so it makes sense that C compiler writers would follow their priorities.