The Xen of Static Checking, Part 1: bug-free code without the effort

OK, maybe the title of this post is a slight exaggeration but it’s good to have goals for the future!

It’s a goal which many would argue will be unreachable without the genesis of Strong AI. It’s also a goal where we can achieve very useful results just by trying to get there. I’m going to write a series of articles about my current work on static checking the Xen codebase. The goal here is to find errors before they occur, spot bugs that aren’t caught by human reviewers and improve the overall quality of codebase. Unfortunately, global harmony and toast which doesn’t fall butter-side-down are probably still outside the scope of this work – sorry.

This first article gives an overview of the historical background of static code checking. Future articles in this series will describe what I’m doing to apply static checking to the Xen codebase and the possibilities for Xen in the future.

There’s currently a brief placeholder on this topic on the Xen Wiki at http://wiki.xensource.com/xenwiki/StaticChecking. This has been there for such a long time that many may have thought that the effort was dead. I’m here to tell you that it’s not – it’s just been pining for the fjords for some time. I’ll hopefully be fleshing out that wiki page with a little more information in due course; I’d also like to get a working mq tree up on Xenbits so that people can clone the repository and see the current state of things.

Without further ado, lets get to the subject of this article: the history of static checking.

What is static checking?

Static checking refers to a class of techniques for finding bugs in code without running it, hence the “static”-ness. Static checkers perform various high-level checks on the meaning of the code in order to find bugs. They may do this with or without the assistance of “annotations”, often embedded in comments, which describe the expected behaviour of the code to be tested.

An early example of a static checker was the lint checker for the C language. Lint performed a scan of the code looking for suspicious-looking things, which it could then flag to the developer. This enabled the developer to direct his attention to potential bugs he might have missed when reading the code and that might also have been missed by the compiler. Thus lint was used a supplement to existing checks on code quality, to save time and spot problems that might otherwise have gone unnoticed.

Since then, much academic and commercial effort has been expended on static checking for a wide variety of languages. One of the most influential academic static checkers was the Stanford Checker project. This has evolved into possibly the best known commercial static checker: a product of Coverity. Coverity run an Open Source Scanning effort which applies their proprietary checker to find bugs in important open codebases, including the Linux Kernel.

In the Open Source space, there is the successor to the original lint: Splint. Splint performs checks for potential bugs and security holes and attempts to encourage good coding style. It can optionally use annotations to improve its understanding of the code and facilitate more powerful checking. There is also the Sparse project, which was originally developed by Linus Torvalds for the purpose of checking Linux kernel code. This performs some analyses similar to those in the original Stanford Checker and is able to take information from annotations in addition to the details it can infer directly from the code. Sparse has found numerous bugs in the Linux kernel – in areas which are hard for human reviewers to deal with. Sparse is integrated into the kernel build system, allowing straightforward checking of the Linux tree whilst compiling it.

Another class of tools that I’m including in my static checking effort for Xen is that of style checkers. These are purely intended to detect and report variations from recommended good coding style. They are arguably outside the normal definition of static checking, however I feel they’re relevant to the overall effort and so I’m including them here. Of the tools I’ve already described, all of them will probably complain if your code contains actual syntax errors. However, of those only Splint and the original lint have an explicit goal of detecting style problems as well as correctness issues. Linux includes a patch style checker called checkpatch.pl which is designed to verify that patches conform to the recommended kernel coding style. An automated style checker saves reviewer effort, makes it easier for contributors to check their coding style themselves and may spot problems (like trailing whitespace) that are harder to spot visually. Making coding style more regular doesn’t necessarily fix bugs but it can make them easier to spot. Stylechecking is a controversial topic, however Linux Kernel Developer Ingo Molnar delivers a strong account of the benefits of checkpatch.pl.

Other static checkers

It should be noted that there are many other open source, academic and commercial static checking projects which I have not had time to investigate or write about. The fact that I have not included them here should not be interpreted as a slight against their featureset or quality. It is simply a case of “too many projects, not enough time”. I’ve written about the projects that I’m familiar with and that I’ve investigated whilst trying to apply static checking to Xen. I would be interested to hear of other projects that readers would recommend or believe I should have mentioned.

Static Checking in Xen

This leads us to the real topic of this series of posts: static checking in Xen. It may surprise readers to learn that, at the time of writing, mainline Xen is arguably already ahead of many projects in its use of static checking: the hypervisor is built using the

-Werror

flag to gcc. This forces all compilation warnings to behave like compilation errors. It’s impossible to just ignore compiler warnings – they automatically break the build. Xen also switches on a lot of gcc warnings which are not active by default. When this small, initially inconvenient change was introduced, real bugs began to be found, caught and corrected. Sometimes gcc warnings are harmless but sometimes they are symptomatic of real problems: forcing people to pay attention to them produced real benefits in code quality.

Of course, this isn’t what most people would think of as truly being “static checking”. However, compiler checks are a brilliant example of how automated code checking is already useful to the majority of developers. It’s also something which many projects don’t even get round to making full use of – see how many warnings scroll past next time you do a full build of software package! Xen is arguably already ahead of the curve in this respect and I believe it will go even further. There are many other ways in which more sophisticated static checking could be usefully introduced to the project.

What I’m doing in this area will be the subject of the next article in this series.

Acknowledgements

I’d like to give particular praise to Linux Weekly News for keeping me up-to-date on Linux kernel news and the applications of static checking within the kernel. I’d also like to mention the, sadly missed, Kernel Traffic for helping to bring the use of Sparse on the Linux Kernel to my attention.

Any mistakes are, of course, all my own work. If you see anything I’ve missed out, got wrong or misrepresented, please don’t hesitate to leave a comment and let me know.

Read more

Xen Project Announces Performance and Security Advancements with Release of 4.19
Aug 05 2024

New release marks significant enhancements in performance, security, and versatility across various architectures.  SAN FRANCISCO – July 31st, 2024 – The Xen Project, an open source project under the Linux Foundation, is proud to announce the release of Xen Project 4.19. This release marks a significant milestone in enhancing performance, security,

Upcoming Closure of Xen Project Colo Facility
Jul 10 2024

Dear Xen Community, We regret to inform you that the Xen Project is currently experiencing unexpected changes due to the sudden shutdown of our colocated (colo) data center facility by Synoptek. This incident is beyond our control and will impact the continuity of OSSTest (the gating Xen Project CI loop)

Xen Summit Talks Now Live on YouTube!
Jun 18 2024

Hello Xen Community! We have some thrilling news to share with you all. The highly anticipated talks from this year’s Xen Summit are now live on YouTube! Whether you attended the summit in person or couldn’t make it this time, you can now access all the insightful presentations

Get ready for Xen Summit 2024!
May 24 2024

With less than 2 weeks to go, are you ready? The Xen Project is gearing up for a summit full of discussions, collaboration and innovation. If you haven’t already done so – get involved by submitting a design session topic. Don’t worry if you can’t attend in person,