Virtualization Principles with Paravirtualized IO

This isn’t really a proper post, more just some notes I can point people to (which I guess technically is a blog post, ah well!).

So, virtualisation. I have used most of the VMware products and Xen in production, with VirtualBox as my desktop virtualisation product of choice for testing on my local machine for some years now. But times change, and my current view is this:

– Xen is disappearing from many distros (including the ones I mainly use in production) and being replaced with KVM.
– VMware vCenter/ESXi is a bit overkill for my test/home/local machine VMs.
– VirtualBox is good, but it annoys me that I need to install and update extra kernel modules (even if it’s done through DKMS) when my kernel supports KVM anyway!

So I’ve moved a lot of my home/test/local machine VMs to KVM.

I’m not going into performance (many, many better testers have spent more time looking at this than I have), but I do want to clear up a couple of things. The reason distros have moved away from Xen Dom0/DomU support in favor of KVM is that “KVM is Linux, Xen isn’t”.

By this, we mean:
– KVM is a hypervisor built from components of your Linux kernel. It is the Linux kernel of your Linux install, placed on the bare metal, that runs in the privileged mode of your processor and provides hardware-assisted virtualization to guests on CPUs that support VT-x/AMD-V. You’ll notice that in this install you can still see the virtualisation CPU extensions (the vmx flag below; it would be svm on AMD) in ‘/proc/cpuinfo’, as it’s this OS that IS the hypervisor:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping : 10
cpu MHz : 2666.583
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dts tpr_shadow vnmi flexpriority
bogomips : 5333.16
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

– Xen, on the other hand, is not Linux. It’s a separate hypervisor that runs its own non-Linux microkernel and then boots one instance of Linux as its first guest, called Dom0. This first guest has the privileges to control the underlying Xen hypervisor, but it’s still a guest. You won’t see the virtualisation CPU flags even in your Dom0, because it’s not Linux running directly on the hardware, it’s Xen. You can also see this in GRUB: notice how your kernel and initrd are both passed to Xen as modules, because it’s Xen that GRUB is actually booting.
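
The ‘/proc/cpuinfo’ check from the KVM point above can be sketched in a few lines of Python. This is only a rough illustration: the function name and the flag-to-name table are my own choices, and it assumes the usual "key : value" layout shown in the output above.

```python
# Sketch: look for hardware virtualization flags in /proc/cpuinfo-style text.
# VIRT_FLAGS and virt_extensions() are illustrative names, not part of any tool.
VIRT_FLAGS = {"vmx": "Intel VT-x", "svm": "AMD-V"}


def virt_extensions(cpuinfo_text):
    """Return a sorted list of virtualization technologies found in 'flags' lines."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() != "flags":
            continue  # only the flags line lists CPU feature bits
        for flag in value.split():
            if flag in VIRT_FLAGS:
                found.add(VIRT_FLAGS[flag])
    return sorted(found)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            print(virt_extensions(fh.read()) or "no vmx/svm flag visible")
    except OSError:
        print("could not read /proc/cpuinfo")
```

Run on a KVM host with the CPU above, this should report the Intel flag; inside a Xen Dom0 the list would be expected to come back empty, for the reasons just described.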

Hardware Virtualization – Peripherals (virtual)
So with hardware virtualization, we no longer have to translate/emulate guest instructions in software (or the VMs’ memory page mappings, with the newer EPT stuff); that’s handled in hardware, removing a massive overhead. However, even though the code is running ‘native’ on the processor, we still have to present hardware to our guest OSes.
This was previously done by emulating very common (read: old) hardware in software to present to guests, so that guests will already have the drivers for that hardware.

Implementing old hardware does not necessarily limit the virtual devices to the performance limits of the original hardware (for example, the motherboard models implemented in KVM/VMware/Xen etc. can support way more PCI/PCIe ‘devices’ than the original board had slots for).

KVM uses QEMU to present the device hardware to each virtual machine, and there is some performance degradation from having the following peripherals emulated in software:

– Network Cards
– Storage controllers for HDD/SSD storage

VirtIO / Paravirtualized Hardware
To get around this, the idea of paravirtualized hardware was created. This changes the virtualization model somewhat, from:

The guest OS doesn’t know it’s being virtualized; it runs on what it thinks is standard hardware, using drivers the OS already has.

to:

We don’t really care that the guest OS knows it’s on a virtualization host, so why not improve throughput to the host’s hardware by giving the guest drivers that interact better with the host/hypervisor layer when passing I/O for disks, network cards etc.

This of course means the guest OS needs special drivers that depend on whichever hypervisor we are using underneath, but it dispenses with the idea that it’s ‘real’ hardware. These paravirtualized guest drivers implement a more direct way of getting I/O to and from the hypervisor, without having to emulate a fake piece of hardware in software.

VMware has the VMXNET (1/2/3) network cards, for which you need drivers from the ‘VMware Tools’ installer. This is a paravirtualized NIC which is not based on any real network card, and it gives better performance than the emulated e1000e offered as another option in VMware.

Xen had the xenblk and xennet drivers, which did the same thing for storage controllers and NICs. VMware has paravirt storage controller drivers too, I just can’t remember their names 🙂

KVM (and now VirtualBox too) uses something called ‘VirtIO’.

What is VirtIO?
VirtIO is exactly the same principle as the above offerings from VMware and Xen, only it’s an effort to standardize paravirtualized peripheral drivers for guest operating systems across multiple virtualization platforms, instead of each hypervisor development team implementing its own guest-side drivers.

So for KVM, if you want better performance out of your network/disk I/O within a virtual machine, you’ll want to use ‘VirtIO’ devices instead of emulated hardware devices.
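
As a rough illustration of that choice, here is a small Python sketch that builds a QEMU command line with either VirtIO or emulated devices. The helper name, the default memory size and the ‘guest.qcow2’ image path are made up for the example; the QEMU options themselves (-enable-kvm, -drive with if=virtio or if=ide, -netdev, -device virtio-net-pci or e1000) are standard ones, but check them against your QEMU version before relying on this.

```python
# Sketch: build a QEMU argument list with paravirtualized or emulated devices.
# qemu_args() and its defaults are hypothetical, for illustration only.
def qemu_args(disk_image, paravirtual=True, memory_mb=1024):
    """Return an argv list for one KVM guest with a single disk and NIC."""
    if paravirtual:
        # VirtIO block device and VirtIO network card
        disk_if, nic_model = "virtio", "virtio-net-pci"
    else:
        # Emulated IDE controller and emulated Intel e1000 NIC
        disk_if, nic_model = "ide", "e1000"
    return [
        "qemu-system-x86_64",
        "-enable-kvm",
        "-m", str(memory_mb),
        "-drive", f"file={disk_image},if={disk_if}",
        "-netdev", "user,id=net0",
        "-device", f"{nic_model},netdev=net0",
    ]


if __name__ == "__main__":
    # Print both variants so the difference is easy to compare.
    print(" ".join(qemu_args("guest.qcow2")))
    print(" ".join(qemu_args("guest.qcow2", paravirtual=False)))
```

The only difference between the two command lines is the disk interface and the NIC model, which is the point: the guest needs the matching VirtIO drivers, but nothing else about the VM changes.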

More information can be found here;

VirtIO also includes a driver to allow memory ballooning, much like the balloon driver in the VMware guest tools.
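
To see which of these drivers a Linux guest has actually loaded, something like the sketch below could be used. It assumes the drivers are built as modules with the usual kernel module names (virtio_net, virtio_blk, virtio_scsi, virtio_balloon); drivers compiled directly into the kernel will not show up in ‘/proc/modules’, so an empty result does not prove VirtIO is absent. The function name and the description table are mine.

```python
# Sketch: list VirtIO drivers present in /proc/modules-style text.
# Module names are the usual Linux ones; descriptions are informal labels.
VIRTIO_MODULES = {
    "virtio_net": "network",
    "virtio_blk": "block storage",
    "virtio_scsi": "SCSI storage",
    "virtio_balloon": "memory balloon",
}


def loaded_virtio_drivers(proc_modules_text):
    """Map each loaded VirtIO module name to a short description."""
    result = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first field of each /proc/modules line is the module name.
        if fields and fields[0] in VIRTIO_MODULES:
            result[fields[0]] = VIRTIO_MODULES[fields[0]]
    return result


if __name__ == "__main__":
    try:
        with open("/proc/modules") as fh:
            print(loaded_virtio_drivers(fh.read()))
    except OSError:
        print("could not read /proc/modules")
```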

It is worth mentioning here that all of this is NOT the same as Intel VT-d or Single Root I/O Virtualization (SR-IOV). These are also related to virtual machines/guest OSes and how they interface with hardware, but in a very different way:

– VT-d allows a physical device (such as a NIC or graphics card in a physical PCIe slot on the host machine) to be passed directly to a guest. The guest uses the drivers for that particular device and speaks natively to and from the hardware. This requires VT-d support on the host platform (CPU, chipset and firmware) and a hypervisor capable of using it.

– SR-IOV allows multiple virtual machines to share one physical hardware device, yet each still accesses the raw hardware natively, just as one guest could with VT-d above. For example, 10 guests could each get 1Gb/s of a 10Gb/s physical network card, using the physical network card’s drivers directly in the guest (to support all of that network card’s features and remove the need for the hypervisor to be involved in the I/O), without the need for emulated hardware or paravirtualized drivers. The hardware device (such as a NIC) needs to be designed to support SR-IOV, and so far only a handful of hardware components have been.
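
On a Linux host, whether a device was designed for SR-IOV can usually be read from sysfs. The sketch below assumes the standard ‘sriov_totalvfs’ and ‘sriov_numvfs’ attributes in the PCI device directory; the function name and the ‘eth0’ interface are placeholders for the example.

```python
# Sketch: read SR-IOV virtual function counts from a PCI device's sysfs dir.
# sriov_status() is an illustrative helper, not part of any library.
from pathlib import Path


def sriov_status(device_dir):
    """Return (enabled_vfs, total_vfs), or None if SR-IOV is not exposed."""
    device_dir = Path(device_dir)
    try:
        total = int((device_dir / "sriov_totalvfs").read_text().strip())
        enabled = int((device_dir / "sriov_numvfs").read_text().strip())
    except (OSError, ValueError):
        # Missing or unreadable attributes: treat as "no SR-IOV here".
        return None
    return enabled, total


if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the interface you care about.
    print(sriov_status("/sys/class/net/eth0/device"))
```

A result of None means the device (or its driver) does not expose SR-IOV at all, while a pair such as zero enabled out of some total means the hardware supports it but no virtual functions have been created yet.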