My “official” summer project for Mozilla was to implement a linear scan register allocator for IonMonkey based on the algorithms published by Christian Wimmer. While that was certainly where most of my time went, I had a couple of related side projects that I worked on that require a little bit of background:
Implementing an algorithm from a collection of papers efficiently and correctly is a software engineering challenge. LinearScan.cpp as of the end of my internship was about 1500 lines long, not to mention about 600 lines of support in LinearScan.h. Most lines of LinearScan.cpp are algorithmically significant, and keeping track of all of the required invariants can be daunting. To make life more exciting, a violated invariant can lead to a number of symptoms, including:
- An assertion is tripped somewhere in the allocator
- An assertion is tripped in a later stage of the compiler, like the code generator
- No assertions are tripped but the generated code is wrong
- Correctness remains intact, but the output decreases in some measure of quality
Of these, tripping an assertion is the allocator itself is obviously preferred: the violated invariant is clear and the assertion probably tripped close to where everything went wrong. Tripping an assertion later means you know what happened, but you might have to dig a little bit to figure out how it got that way. If we generate wrong code (and notice!), it is a lengthy process to identify the erroneous instructions before you can begin to figure out how they got that way. Sub-optimality tends to go unnoticed entirely unless one specifically looks for it.
The Problem with IonMonkey
I recognized some of these issues early in the summer and produced some small ad-hoc python scripts to generate large random test files with lots of nested branches and loops to try to break the bytecode-to-IR construction algorithms. They found about five bugs collectively, but once the issues associated with the code pattern each one produced had been ironed out they were of limited utility.
Anion has automatically found over thirty bugs in IonMonkey since it was written, and continues to be a valuable bug-finding tool today. It is fast enough to catch some regressions before patches even land and covers a broad enough state space to encounter edge cases we didn’t know were possible.
Of course, Anion only finds correctness issues. Identifying performance issues is tricky: we don’t have any good, realistic long-running benchmarks to time yet. However, we can predict performance by counting various occurrences in the compiler such as the emission of a move instruction. Code containing a lot of unnecessary move instructions will likely run more slowly, so if we start emitting a lot of these it probably won’t be good for performance on real code later. To try to catch these performance issues, I wrote Cation.
Cation is a fuzzer similar to Anion, but much simpler. It doesn’t concern itself with crashes or triage, but instead with finding test cases that cause some metric to be worse than some baseline. The output is a pile of test cases that exposed performance issues along with a scorecard of how bad the issue was. Some examples of the sort of things Cation can find test cases for are:
- Linear scan register allocation producing more move instructions than the greedy allocator
- Global value numbering increasing register pressure causing the register allocator to emit lots of moves
- Loop invariant code motion resulting in an increase of the number of instructions in a loop
- Version A of the compiler producing fewer expensive instructions than later version B on the same function
Cation makes it very easy to try any of these: simply add print calls in the code wherever the event you want to count occurs. It has already helped me locate several places in the linear scan register allocator where heuristics could be improved, and hopefully it will continue to be helpful for improving performance elsewhere.
Both tools are open source and under active development, and can be checked out from my repository on GitHub. Hopefully they can be of some use to you, too!