2024 Feb 2

CBMC × (LTO – assertions)

My sampler project is not just a way for me to explore music tech. It’s also a research vehicle to explore better ways of engineering embedded software.

The code is written in style similar to design by contract. Functions check inputs and outputs by calling assert(). There are a lot of assertions, because I avoid using the && operator with conditions. Instead of writing

assert(fabsf(x) <= 1 && other_condition);

I write

assert(x <= 1);
assert(-1 <= x);
assert(other_condition);

This form provides higher resolution when an assertion fails: it lets me know precisely what condition happened. But it results in more branches, which reduces speed.

I keep an eye on the speed of the MCU code by toggling a GPIO around the audio processing. I have always used size optimization (-Os), and am currently exploring link-time optimization (LTO), which Meson makes super easy. Although I have read opinions that embedded applications should not use LTO and should not disable assertions, I am using LTO and disabling assertions for target builds. At least for now. It’s not a big commitment!

This approach seems to work and results in a 10× speedup compared to no optimization and keeping the assertions. That’s the difference between having CPU time for 30 voices instead of two or three.

This isn’t the only approach and might not be the best one. When you write code with the intent to disable assertions, you might want to take care that the semantics of the program without assertions are still robust. In some places, I do this by assuming that internal data can be corrupted, such as enum variables.

For example, the sequencer has two states: playing and stopped. But the code is written so that any value that isn’t SEQUENCER_STATE_PLAYING means the same thing as SEQUENCER_STATE_STOPPED.

if (state != SEQUENCER_STATE_PLAYING) {
	assert(state == SEQUENCER_STATE_STOPPED);
	// stopped
} else {
	// playing
}

Another thing I do to verify my code is run it through CBMC, a bounded model checker (BMC). I’ve been exploring this tool for a few months. I don’t know everything about it, but I get the basics.

CBMC checks the assertions in my code by simulating execution. It is not able to simulate indefinite execution (that’s what the development board is for) but it can simulate execution in response to finite inputs. That’s what “bounded” means. This suits the sampler, whose code is pretty finite. However, I’m not using CBMC to prove that the entire renderer is correct. I tried that, and it was too slow.

Instead, I prove properties about leaf functions by writing harnesses. The harness allows you to add assumptions and other assertions that wouldn’t be possible in the contracts. For example, you can prove that a function obeys certain mathematical properties, such as monotonicity or injectivity. It’s impossible to write contracts that express these ideas (they’d make the function infinitely recursive when built with assertions), but you can do it by writing a harness.

Being able to prove mathematical properties is really cool. It’s cool to know that a type conversion function is injective, because it means that it loses no data. CBMC can prove that, sometimes very quickly!

One of my guidelines for using CBMC is that it always runs as part of a test suite that has to finish in under five seconds. This rules out slow proofs. The proofs that remain are still interesting, and I continue to search for useful properties that CBMC can prove quickly.

There’s also an ordinary test suite that targets POSIX. With assertions enabled, it runs the entire application for a finite amount of time. This serves as a smoke and integration test. It has no boilerplate and is very effective.

I used to have more unit tests, but I’ve been replacing them with contracts and proofs. This was inspired by James Coplien’s rant against unit testing. I spent years as table-driven unit testing advocate. James finally changed my mind. Tests should check overall properties—not a large set of specific points—and make sure everything works together. I feel that I’ve found an approach that does that.

Verdict

Is CBMC × (LTO – assertions) a good approach?

On the one hand, the shipped product is aggressively optimized and has no assertions, so there is definitely reduced robustness. My effort to code robustly in the absence of assertions was too thin. Memory corruption, although it’s not likely, could wreak havoc. If an object pointer is clobbered by wayward code, invalid dereferences can happen. However, due to the heavy optimization, there might be fewer actual dereferences. To understand specific failure modes, I need to read the disassembly.

On the other hand, the test suite gives me high confidence that my code is correct. This assumes no data corruption. One area to explore here is CBMC’s ability to inject nondeterministic values. This probably can be used to find robustness issues arising from data corruption.

Regarding the use of assertions, there’s a subtle distinction between input validation and internal invariants. Both can be coded with assert(), but they mean different things. Code that validates untrusted inputs shouldn’t be removed at compile time! But, if memory corruption is a concern, then neither should the invariants! Defense-in-depth seems to require that the code can’t trust itself. There seems to be a tradeoff between being featureful and being robust.

One area where I know the code has a bug is related to concurrency. I use C11 lock-free atomics to deal with interrupts. It was interesting to learn about how lock-free atomics are implemented, and I think they’re the correct solution for setting flags in interrupt handlers. Unfortunately, there is still a TOCTOU bug in the main loop. I have not observed it causing a problem, but it’s incorrect.

How robust does a sampler need to be? A violated contract could result in an audio signal above the threshold of pain during a high-volume live performance. Performers need their gear to keep working. The system should restore itself to a good state if an unexpected state is detected.

CBMC × (LTO – assertions) focuses more on correctness and efficiency than robustness.

Feedback

Discuss this page by emailing my public inbox. Please note the etiquette guidelines.