2023 Jan 1

On the job again

In late November, I started an embedded software engineering job. My first task was to troubleshoot a network module whose creators were no longer at the company. The module uses a 32-bit microcontroller running a real-time operating system (RTOS). One of my colleagues had observed via serial logs that the system was running out of memory. The software design uses multiple heap allocators. He suspected a memory leak and asked me to investigate the RTOS’s heap allocations.

In my first two weeks on the job, I had access to source code and build artifacts, but not the hardware. I analyzed the heap statically using Radare. Later, I wrote a script to automate this, producing a CSV file.

The script tracked down a subset of the firmware’s heap allocations by finding every call site of the allocator function and looking for the values loaded into its argument registers. It extracts only immediate values encoded in instructions, not values held in variables, so it misses variable-sized heap allocations. Even so, the information would still be useful for controlling heap usage.
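To make the approach concrete, here is a minimal sketch in Python. It is not the script I wrote: it works on a made-up text disassembly rather than on Radare output, and the addresses, the allocator name malloc, and the assumption that the size argument arrives in r0 are all illustrative.

```python
import re

# Hypothetical disassembly excerpt. The addresses, the `malloc` symbol name,
# and the text layout are assumptions made for illustration only.
DISASM = """\
0x08000100  movs r0, #64
0x08000102  bl malloc
0x08000106  mov r0, r4
0x08000108  bl malloc
0x0800010c  mov.w r0, #0x200
0x08000110  ldr r1, [sp, #4]
0x08000112  bl malloc
"""

LINE = re.compile(r"^(0x[0-9a-f]+)\s+(\S+)\s*(.*)$")
IMM_TO_R0 = re.compile(r"^r0,\s*#(0x[0-9a-f]+|\d+)$")


def find_static_allocs(disasm, allocator="malloc", window=4):
    """Return (call site, size) pairs for calls whose size is an immediate in r0."""
    insns = [m.groups() for m in map(LINE.match, disasm.splitlines()) if m]
    results = []
    for i, (addr, mnem, ops) in enumerate(insns):
        if mnem not in ("bl", "blx") or ops != allocator:
            continue
        # Walk backwards over a few instructions to the latest one that
        # names r0 as its first operand (a conservative "write to r0" test).
        for _, pmnem, pops in reversed(insns[max(0, i - window):i]):
            if pmnem in ("bl", "blx"):
                break  # an earlier call clobbers r0; give up
            if pops.startswith("r0,"):
                m = IMM_TO_R0.match(pops) if pmnem.startswith("mov") else None
                if m:
                    results.append((addr, int(m.group(1), 0)))
                break  # r0 came from a register or memory: variable-sized
    return results
```

The second call in the sample takes its size from r4, so the sketch skips it, which is the same blind spot described above.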

My colleague performed dynamic heap analysis to get the overall picture. He eventually narrowed the cause down to an incorrect memory configuration.

I suspect the bug is fixed, although there’s always room for doubt. Testing can reduce that doubt. However, firmware is hard to test compared to other software. It is usually tied to memory-mapped hardware peripherals, which make for a very specific execution environment.

Dependence on specific hardware tends to lengthen the development cycle, which pushes teams toward a linear development model. According to Universal Principles of Design, linear development is preferred only if requirements change little and the cost of iteration is high. In the case of firmware design, testing is one of the main costs of iteration because it is not readily automated.

New firmware might be designed to confine the strong hardware coupling to a low layer, one that can also be implemented in a more generic, high-level computing environment such as POSIX.
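As a rough illustration of that layering, here is a small Python sketch standing in for what would be C in real firmware. Every name in it (Uart, FakeUart, send_frame) and the length-prefixed frame format are invented for the example. The point is only that the logic above the low layer never touches hardware, so it can run and be tested on an ordinary host.

```python
from typing import Protocol


class Uart(Protocol):
    """Hypothetical low layer: the only code that would touch hardware."""

    def write(self, data: bytes) -> None: ...


class FakeUart:
    """Host-side stand-in that records bytes instead of driving registers."""

    def __init__(self) -> None:
        self.sent = bytearray()

    def write(self, data: bytes) -> None:
        self.sent += data


def send_frame(uart: Uart, payload: bytes) -> None:
    """Hardware-independent logic: length-prefix a payload and send it."""
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length prefix")
    uart.write(bytes([len(payload)]) + payload)
```

A host-side test would pass a FakeUart to send_frame and inspect its sent buffer, while the target build would supply an implementation backed by the real peripheral.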

But a lot of firmware that needs to be tested isn’t going to be redesigned. This is partly because firmware development cycles can be long, due to the dependence on hardware.

Emulation is a post-hoc way to reduce the cost of iteration. Emulating hardware amounts to implementing hardware interfaces in software. How is that possible? Well, that’s precisely what the emulation platform provides. Emulation has a wide variety of uses outside of testing, so it may provide other benefits too, such as porting mature functionality to more advanced hardware.

I had never developed an emulator, so I set out to size up the task. Research turned up two promising open source emulators for embedded systems: Qemu and Renode. I’ll explain how I chose Qemu.

Renode is a newer project that specifically targets resource-constrained hardware. Its main contributor is a company called Antmicro. Although the project is released under an open source license, it requires contributors to assign their copyright to Antmicro. This gives Antmicro the sole right to relicense the software under arbitrary terms, including proprietary ones. In effect, Antmicro’s open source project is simply a channel for their proprietary product development rather than a community-maintained project. The project is written in C#, which I’ve never used, and also uses a custom declarative configuration syntax for hardware support. It’s not really my style, in either philosophy or tech.

Qemu, in contrast, is not dominated by any one company. All contributors retain their copyright, so that ownership of the intellectual property is dispersed. It’s operated like the Linux kernel, with multiple prominent organizations opting to enrich the software commons against their otherwise strong incentive to hoard ownership and control. That’s not only more in line with my personal philosophy, but it reflects a wider and deeper role in the software ecosystem. Qemu is built with classic technologies: GNU C11, Kconfig, and Python. As such, it can enjoy a wide range of contributors.

After a few weeks of research and tinkering, I managed to get a stub Qemu machine running for the network module’s firmware. It’s a stub in the sense that it can run armv7m code but doesn’t support peripherals. Even so, because Qemu can log accesses to unimplemented memory regions, I effectively turned this stub emulator into a hardware interface detector.
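Here is a sketch of how such a log might be boiled down into a list of registers the firmware touches. The log line shape is an assumption modeled on the messages Qemu prints for unimplemented devices when run with -d unimp, so check it against your Qemu version. The device names syscfg and flashctl are hypothetical.

```python
import re
from collections import Counter

# Assumed log shape; the device names and offsets below are made up.
SAMPLE_LOG = """\
syscfg: unimplemented device read  (size 4, offset 0x00000000)
syscfg: unimplemented device write (size 4, offset 0x00000004, value 0x00000001)
flashctl: unimplemented device read  (size 4, offset 0x00000010)
syscfg: unimplemented device read  (size 4, offset 0x00000000)
"""

ACCESS = re.compile(
    r"^(?P<dev>[^:]+): unimplemented device (?P<kind>read|write)\s+"
    r"\(size (?P<size>\d+), offset (?P<offset>0x[0-9a-fA-F]+)"
)


def summarize_accesses(log):
    """Count accesses per (device, register offset, read/write)."""
    counts = Counter()
    for line in log.splitlines():
        m = ACCESS.match(line)
        if m:  # ignore anything that is not an unimplemented-device message
            counts[(m["dev"], int(m["offset"], 16), m["kind"])] += 1
    return counts
```

Sorting the result by count gives a rough priority list: the registers the firmware hits most often are the ones to model first.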

Development of the emulator can proceed through iteration. The first task was to set up the memory regions correctly, such as SRAM, ROM, and flash. Once the resulting crashes were fixed, I discovered that we needed to create the system configuration device. With that stubbed out, the firmware started touching the flash controller. Development can continue this way until we’ve implemented enough hardware to test our firmware.

I estimate that Qemu support for our hardware would take 3.3 kLoC. This was calculated by examining a subset of the peripherals our firmware is known to use and looking up the average size of such implementations in Qemu. I’m not sure how to estimate development time, because there’s simply too much that I don’t know. But Qemu is full of good examples, from which I’ve already learned a lot.
