The virtues of OS-level virtualization
13th of November 2007
After deciding that virtualization is the way to improve efficiency in server infrastructure, one faces a vast number of technology options for realizing that improvement. There is hardware-assisted virtualization, paravirtualization, and operating-system-level virtualization, each with a number of concrete implementations from various vendors and open source communities.
Hardware-assisted virtualization and paravirtualization have in recent years been slowly converging, because optimization techniques (e.g. for direct I/O from VMs to physical hardware) are still needed even when newer hardware no longer requires complete emulation of VM execution. This approach has much potential for the future, but today the overhead of paravirtualization with even partial emulation is still significant relative to native performance. Eventually this problem will shrink to insignificance, but only after a number of substantial computer architecture improvements that further accommodate virtualization on x86 systems become commercially available, something that may take well beyond 2010. A research paper compares VMs running Xen paravirtualization with those running OpenVZ OS-level virtualization: while OpenVZ stays close to native performance, Xen lags far behind. In the particular benchmark application the researchers used, it would in fact take roughly double the number of machines to achieve the same throughput with Xen as with native or OpenVZ setups, something clearly prohibitive from a TCO perspective.
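To make the "double the number of machines" argument concrete, here is a minimal sketch of the arithmetic in Python. The function name and all numbers are hypothetical and chosen only for illustration; the 0.5 efficiency for Xen merely restates the "roughly double" observation above and is not a figure taken from the paper.

```python
import math

def machines_needed(target_throughput, native_per_machine, relative_efficiency):
    """Return how many hosts are needed to reach target_throughput.

    relative_efficiency is the fraction of native throughput that one
    host still delivers under a given virtualization technology.
    """
    effective_per_machine = native_per_machine * relative_efficiency
    return math.ceil(target_throughput / effective_per_machine)

# Hypothetical workload: 1000 requests/s in total, 100 requests/s per native host.
native_hosts = machines_needed(1000, 100, 1.0)  # native, or close-to-native OS-level setup
xen_hosts = machines_needed(1000, 100, 0.5)     # assumed: half of native throughput

print(native_hosts, xen_hosts)
```

Since hardware, power and rack space scale with the host count, halving the per-host throughput feeds more or less directly into TCO.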
Low administrative overhead
Linux-Vserver and OpenVZ have shown in benchmarks that their authors succeeded in keeping the original advantage of running all processes from all VMs on essentially one Linux kernel, which itself is well equipped to schedule the 1,000 or more processes that this may involve. Resource management is likewise comparable to that required to run just one very busy system, and the overhead that these OS-level virtualization patches to the Linux kernel add to system calls remains within single-digit percentages. By design, this is the lowest administrative overhead possible for a virtualization technology.
High density deployment
OS-level virtualization shares idle resources on the machine among the VMs by default, which yields the highest deployment density achievable on a given piece of hardware, without the chaos of providing all those services, possibly for different business clients, from a single OS installation. Virtuozzo does even better: its VZFS on-disk and in-memory sharing allows grouping hundreds of VM instances on one physical host, while its open-source sibling, OpenVZ, does not fare much worse. At that level of deployment density, it is mostly high-quality resource management that keeps the VMs from starving each other out when consuming kernel resources. This scenario is exactly what OS-level virtualization is built for, and hence it handles it at least as well as any competing virtualization approach, and probably much better.
High quality resource management
OpenVZ in particular excels here: over 20 crucial resource levels can be set live, while the VM is running, and optionally saved so that the new value is applied again after a reboot. Linux-Vserver, on the other hand, shines in flexible VM isolation via POSIX capabilities, which are removed from a VM by default and can be selectively re-added to it. OS-level virtualization technologies apply less unneeded abstraction and additional complexity to partitioning the available physical resources, and do so in a sufficiently solid and certainly more flexible manner than full-blown para- or hardware virtualization.
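As an illustration of setting resource levels live, the sketch below only assembles a `vzctl set` command line in Python; it does not run it. The helper name, the container ID 101 and the limit values are assumptions made up for this example. `numproc` and `kmemsize` are two of the OpenVZ resource parameters, each given as a barrier:limit pair, and `--save` is the option that keeps the new value across a reboot.

```python
import shlex

def vzctl_set_command(ctid, limits, save=True):
    """Build the argument list for a `vzctl set` call (hypothetical helper).

    limits maps a resource parameter name to a (barrier, limit) pair.
    The command is only constructed here, never executed.
    """
    args = ["vzctl", "set", str(ctid)]
    for name, (barrier, limit) in sorted(limits.items()):
        args += ["--" + name, "%d:%d" % (barrier, limit)]
    if save:
        # Persist the setting so it is applied again after a reboot.
        args.append("--save")
    return args

# Illustrative values only; sensible limits depend on the workload.
cmd = vzctl_set_command(101, {
    "numproc": (120, 120),
    "kmemsize": (11055923, 11377049),
})
print(" ".join(shlex.quote(a) for a in cmd))
```

Passing `save=False` would correspond to a change that affects only the running VM and is gone after the next restart.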
The conclusion is that as long as the Linux kernel itself manages to scale well on typical server hardware in current use (which is very likely), there will be no need to partition the system with hardware-assisted virtualization first, and the optimal choice will remain running only one layer of OS-level virtualization with very good isolation and resource management. Should the kernel fail to scale, or should even the best resource management and isolation features of current OS-level virtualization technologies fail to separate VMs to the required extent (neither of which I believe is very likely), it would be time to wrap OS-level virtualized groups of VMs in hardware-virtualized VM shells. For instance, both Linux-Vserver and OpenVZ can run under Xen for exactly this purpose.
So far, one VM layer, via OS-level virtualization, has proved in production to be the best solution. Which technology we prefer for delivering this setup I will explore in a future post.