This blog was written by Zhongze Liu. Zhongze Liu is a student studying information security at Huazhong University of Science and Technology in Wuhan, China. He recently took part in GSoC 2017, where he worked closely with the Xen Project community on “Allowing Sharing Memory Regions between VMs from xl Config.” His interests are low-level hacking and system security (especially cloud and virtualization security).
I got to know the Xen Project about one year ago when I was working on a virtualization security project in a system security lab. It was the very first time that I received hands-on experience with a Type-I hypervisor. I was very interested in its internals and wanted to explore more of it by reading its code and performing some hacking on it. This is also what I was able to do this summer while I worked as a GSoC student with the Xen Project community. My specific focus was on setting up shared memory regions among VMs from a new xl config entry.
The purpose of this GSoC project is to let simple guests that don’t have grant table support communicate via one or more shared memory regions. Such guests are not uncommon in the embedded world, and this project makes it possible for these poor guests to communicate with their friends.
This project involves many components of Xen, from the xl utility and xenstore all the way down to the hypervisor itself. The implementation plan is quite straightforward: (1) during domain creation: parse the config -> map the pages -> write down in the xenstore fs what has been done; (2) during domain destruction: read the current status from xenstore -> unmap the pages -> clean up the related xenstore entries. More details can be found in my proposal posted on the xen-devel mailing list. The tangible outcome is a patch set adding changes to xl, libxl, libxc, xsm and flask.
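To make the two phases of the plan concrete, here is a toy Python model of them. It is only a sketch: the `ToyHost` class, the `id=...,pages=...` entry format and the `/toy/shm/...` paths are all made up for illustration, and none of them correspond to the real xl syntax, the xenstore layout or the libxl code in the patch set.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHost:
    """In-memory stand-in for the hypervisor mappings and xenstore (hypothetical)."""
    mappings: set = field(default_factory=set)    # (domid, region_id) pairs currently mapped
    xenstore: dict = field(default_factory=dict)  # path -> number of pages recorded


def parse_config(entries):
    """Parse made-up 'id=...,pages=...' strings into dicts."""
    regions = []
    for entry in entries:
        kv = dict(item.split("=", 1) for item in entry.split(","))
        regions.append({"id": kv["id"], "pages": int(kv["pages"])})
    return regions


def create_domain(host, domid, entries):
    """Creation: parse the config -> map the pages -> record it in xenstore."""
    for region in parse_config(entries):
        host.mappings.add((domid, region["id"]))
        host.xenstore[f"/toy/shm/{domid}/{region['id']}"] = region["pages"]


def destroy_domain(host, domid):
    """Destruction: read the status from xenstore -> unmap -> clean up entries."""
    prefix = f"/toy/shm/{domid}/"
    # Copy the matching paths first so the dict is not modified while iterating.
    for path in [p for p in host.xenstore if p.startswith(prefix)]:
        region_id = path[len(prefix):]
        host.mappings.discard((domid, region_id))
        del host.xenstore[path]
```

The point of recording every mapping at creation time is that destruction never has to re-read the config: it only walks what was actually written down, so a partially created domain can still be cleaned up.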
I met quite a few challenges during the project. The first and biggest one turned out to be designing an appropriate syntax for the new config entry. The syntax has to be flexible and friendly to users. The hardest part was how to control the stage-2 page permissions and cache attributes: we currently don’t have a hypercall to control the stage-2 page attributes, but the clients are asking for control over them. I read a lot of documents about stage-2 page attributes on both x86 and ARM, and wrote a proposal for a new hypercall that would solve this issue.
After I made this proposal, I discovered that it would take up too much time to discuss the details in my proposal, not to mention implementing it. After discussing this challenge with my mentors, we decided to leave this as a TODO (see the “Future Directions” section in the project proposal), and only support the default attributes in the very first version of this project.
Next challenge: the “map the pages” step in the plan is easier said than done. After implementing the toolstack side, I moved on to testing my code, but kept getting errors when mapping the pages. By putting many printks along the whole code path, I found something blocking the way: on x86, adding foreign pages from one DomU to another by modifying p2m entries is not allowed.
Why? (1) The default xsm policy doesn’t allow this; (2) p2m tear-down is not implemented, so mapping pages this way would screw up their refcounts.
Fixing reason (2) is not a trivial task, but luckily, p2m tear-down is already implemented on the ARM side. So I decided to mark this new config entry as unsupported on x86, and continue to implement the ARM side. The fix to (1) turned out to be some changes to the xsm interface for xsm_map_gmfn_foreign, the dummy xsm policy, and the corresponding flask hook.
The last challenge that I’m going to talk about is testing. To test my ARM code, I followed the instructions on the Xen wiki to set up an emulator, and another set of instructions on cross-compiling Xen and the toolstack for ARM. I’m not using Debian, so some of the handy tools provided by Debian were not available to me. I had to find alternative solutions to some of the critical steps, and during my experiments I found that Docker is the most distribution-independent solution that, at the same time, doesn’t bring too much performance overhead. I created a Debian-based Docker image with all the tools and dependencies required to build Xen, and every time I wanted to launch a build, I just needed to run ‘docker run -v local/xen/path:docker/xen/path -it image-name build-script-name.sh’. I’m planning to post my Dockerfile to the Xen wiki so that others can build their own cross-building environment with a simple ‘docker build’.
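If you launch builds like this often, the command can be wrapped in a few lines of Python. This is just a sketch: the function name is hypothetical, and the image name, paths and script name are the same placeholders as in the command above. The one real detail it handles is that Docker expects an absolute host path for a bind mount, so the helper resolves the source directory first.

```python
import os
import shlex


def docker_build_command(host_src, container_src, image, script):
    """Build the argument list for a one-off containerized build (hypothetical helper)."""
    # Resolve the host path: a relative name would not be treated as a bind mount.
    host_abs = os.path.abspath(host_src)
    return [
        "docker", "run",
        "-v", f"{host_abs}:{container_src}",  # mount the source tree into the container
        "-it", image,
        script,
    ]


# Print the command for inspection; pass the list to subprocess.run() to execute it.
print(shlex.join(docker_build_command(".", "/build/xen", "image-name", "build-script-name.sh")))
```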
I’ve really learned a lot during the process, from my mentors, and from the Xen Project community. Additionally:
- I’ve improved my coding skills
- I’ve learned more about the Xen Project and its internals
- I’ve learned many efficient git tricks, which will be very useful in my future projects
- I’ve read about memory management on ARM and x86
- I’ve learned how to set up a rootfs, emulator, kernel, drivers and cross-compiling environment to test ARM programs
- And most importantly, I’ve learned how to work with an open source community.
And no words are strong enough to express my many thanks to all the people in the community who have helped me so far, especially Stefano, Julien, and Wei. They’ve been very supportive and responsive from the very beginning, giving me valuable suggestions and answering my sometimes stupid questions.
I’m very glad that I was invited to the Xen Project Summit in Budapest. It was really a great experience to meet so many interesting people there. And many thanks to Lars and Mary, who helped me in getting my visa to the event and offered me two T-shirts to help me through the hard times when my luggage was delayed.
The GSoC internship is coming to an end, and it’s just my first step toward contributing to the Xen Project. I like this community, and I am looking forward to contributing more to it and learning more from it.