This is a guest blog post by Tamas K. Lengyel, a long-time open source enthusiast and Xen contributor. Tamas works as a Senior Security Researcher at Novetta, while finishing his PhD on the topic of malware analysis and virtualization security at the University of Connecticut.
The recent disclosure of the VENOM bug affecting major open-source hypervisors, such as KVM and Xen, has many reevaluating their risks when using cloud infrastructures. That is a very good thing indeed. Complex software systems are prone to bugs and errors. Â Virtualization and the cloud have been erroneously considered by many to be a silver bullet against intrusions and malware. The fact is the cloud is anything but a safe place. There are inherent risks in the cloud and it’s very important to put those risks in their proper context.
VENOM is one of many vulnerabilities involving hypervisors (e.g. [Qemu], [ESX], [KVM], [Xen]). And, while Venom is indeed a serious bug and can result in a VM escape, which in turn can compromise all VMs on the host, it doesn’t have to be. In fact, we’ve known about VENOM-style attacks for a long time. Yet, there are easy-to-deploy counter-measures to mitigate the risk of such exploits natively available in both Xen and KVM (see RedHat’s blog post on the same topic).
What is the root cause of VENOM? Emulation
While modern systems come with a plethora of virtualization extensions, many components are still being emulated, usually devices such as your network card, graphics card and your hard drive. While Linux comes with paravirtual (virtualization-aware) drivers to create such devices, emulation is often the only solution to run operating systems that do not have kernel drivers. This has been traditionally the case with Windows (note that Citrix recently open-sourced its Windows PV drivers so this is no longer the case). This emulation layer has been implemented in QEMU, which caused VENOM and a handful of other VM-escape bugs in recent years. However, QEMU is not the only place where emulation is used within Xen. For a variety of interrupts and timers, such as RTC, PIT, HPET, PMTimer, PIC, IOAPIC and LAPIC, there is a baked-in emulator in Xen itself for performance and scalability reasons. As with QEMU, this emulator has been just as exploitable. This is simply because programming emulators properly is really complex and error-prone.
Available Mitigations
We’ve known how complicated emulation is for some time. In 2011, this Black Hat presentation on Virtunoid demonstrated a VM escape attack against KVM through QEMU. The author of Virtunoid even cautioned there would be many bugs of that nature found by researchers. Fast-forward four years and we now have VENOM.
The good news is it’s easy to mitigate all present and future QEMU bugs, which the recent Xen Security Advisory emphasized as well. Stubdomains can nip the whole class of vulnerabilities exposed by QEMU in the bud by moving QEMU into a de-privileged domain of its own. Instead of having QEMU run as root in dom0, a stubdomain has access only to the VM it is providing emulation for. Thus, an escape through QEMU will only land an attacker in a stubdomain, without access to critical resources. Furthermore, QEMU in a stubdomain runs on MiniOS, so an attacker would only have a very limited environment to run code in (as in return-to-libc/ROP-style), having exactly the same level of privilege as in the domain where the attack started. Nothing is to be gained for a lot of work, effectively making the system as secure as it would be if only PV drivers were used.
While it might sound complex, it’s actually quite simple to take advantage of this protection. Simply add the following line to your Xen domain configuration, and you gain immediate protection against VENOM and the whole class of vulnerabilities brought to you by QEMU:
device_model_stubdomain_override = 1
However, as with most security systems, it comes at a cost. Running stubdomains requires a bit of extra memory as compared to the plain QEMU process in dom0. This in turn can limit the number of VMs a cloud provider may be able to sell, so they have little incentive to enable this feature by default on their end. However, this protection works best if all VMs on the server run with stubdomains. Otherwise you may end up protecting your neighbors against an escape attack from your domain, while not being protected from an escape attack from theirs. It’s no surprise that some users would not want to pay for this extra protection, even if it was available. But for private clouds and exclusive servers there is no reason why it shouldn’t be your default setting.