October 19, 2011

300 mission critical VMs per host?


In the keynote this morning (Wednesday) there were a few predictions about the future. One of them was that by 2014 we would all run an average of 300 VMs per host. The prediction didn't say mission critical, but a significant share of VMs normally are. The exception is VDI, since VDI typically carries a lower SLA than server workloads. There's no technical limitation that prevents you from running 300 VMs per host even today, as long as your environment can support it.
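Just as a sanity check on what 300 VMs per host would mean in practice, here's a minimal back-of-envelope sketch in Python. The per-VM memory, vCPU and consolidation figures are assumptions I've picked for illustration, not numbers from the keynote.

```python
# Back-of-envelope sizing: what would 300 VMs on one host demand?
# All per-VM figures below are illustrative assumptions, not measurements.

VM_COUNT = 300
AVG_VRAM_GB = 4          # assumed average memory per VM
AVG_VCPUS = 2            # assumed average vCPU count per VM
CONSOLIDATION_RATIO = 4  # assumed vCPU:pCPU overcommit the workload tolerates

total_vram_gb = VM_COUNT * AVG_VRAM_GB
total_vcpus = VM_COUNT * AVG_VCPUS
physical_cores_needed = total_vcpus / CONSOLIDATION_RATIO

print(f"Memory footprint before overcommit/page sharing: {total_vram_gb} GB")
print(f"{total_vcpus} vCPUs, roughly {physical_cores_needed:.0f} physical cores "
      f"at a {CONSOLIDATION_RATIO}:1 consolidation ratio")
```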

Traditionally it has been considered quite dangerous to put that many eggs in one basket, so I think we would need some changes before running such dense systems becomes reasonable. It's possible even today to run that many VMs per host, but we're not seeing a lot of customers doing it, because too many VMs would fail simultaneously if the host fails. And if VMware HA has to restart your 300 VMs at the same time, you'll also see a boot storm à la VDI, except that now it's server VMs trying to boot and bring up a lot of services at once.
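HA restart priorities go some way towards this already, but the basic idea of spreading restarts out to soften the boot storm can be sketched in a few lines. This is purely a hypothetical illustration: the power_on helper is a stand-in for whatever API call would actually power on a VM, and the batch size and pause are arbitrary.

```python
import time

def power_on(vm_name: str) -> None:
    """Hypothetical stand-in for the real power-on call against the hypervisor."""
    print(f"Powering on {vm_name}")

def staggered_restart(vms, batch_size=20, pause_seconds=120):
    """Power on VMs in small batches instead of all at once,
    so storage and CPU aren't hit by hundreds of simultaneous boots."""
    for i in range(0, len(vms), batch_size):
        for vm in vms[i:i + batch_size]:
            power_on(vm)
        time.sleep(pause_seconds)  # let the batch settle before starting the next

staggered_restart([f"vm-{n:03d}" for n in range(300)])
```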

If we look back at where virtualization originated, we can take a quick look at what IBM has done. They've been in this business since 1972 and should have a bit of experience. They now run VMs on their mainframe System z platform (the successor to the S/390), which is said to have a Mean Time Between Failures (MTBF) of 40 years. I don't know what the MTBF of an average x86 server is. I only know that if a CPU fails, the server fails. So how did IBM achieve such a high MTBF? By building custom hardware that is fully redundant in every part. You can replace memory modules, CPUs and other parts without shutting down the system. Some of these features have eventually found their way down to the x86 architecture. Chipkill (advanced ECC) and NUMA are examples of technologies that existed on mainframe architectures quite a few years before they showed up on x86. If such a monster of a server appeared in the x86 world, where every part could easily be replaced without any downtime, I think this goal would be feasible. But I doubt we'll see it happen, at least not that soon. Bringing hardware functionality from one platform to another is something we see all the time, but there still seems to be quite some way to go until we have mainframe functionality on x86 servers.
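To put the 40-year figure in perspective, here is some rough MTBF arithmetic. The 5-year MTBF I've used for a commodity x86 server is purely an assumption, since, as mentioned, I have no real numbers for it.

```python
# Rough MTBF arithmetic: expected host failures per year for a small cluster.
# The x86 MTBF figure is an assumption made only for illustration.

def expected_failures_per_year(mtbf_years: float, host_count: int) -> float:
    """Expected host failures per year, assuming a constant failure rate."""
    return host_count / mtbf_years

HOSTS = 10
VMS_PER_HOST = 300

for mtbf_years, label in [(40, "mainframe-class, 40-year MTBF"),
                          (5, "assumed commodity x86, 5-year MTBF")]:
    failures = expected_failures_per_year(mtbf_years, HOSTS)
    print(f"{HOSTS} hosts ({label}): ~{failures:.2f} failures/year, "
          f"~{failures * VMS_PER_HOST:.0f} VM outages/year at {VMS_PER_HOST} VMs/host")
```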


Are there other methods we could use to make such dense systems reasonable? Yes, we have for some time had a feature called Fault Tolerance (FT), which runs a single VM on two hosts at the same time. But the coolness factor of this feature is much higher than its usefulness. FT is crippled by limitations that prevent most customers from using it today, such as the single-vCPU restriction and the lack of snapshot support. If these limitations were removed and we could use FT on all the VMs we wanted without losing functionality, I guess it would be a highly appreciated feature. Now that we have 10GbE (soon 40GbE, and 100GbE later on) with LBT and NetIOC, we could do it without redesigning the network infrastructure. At VMworld they already showed off FT with vSMP, and if they manage to remove the other limitations, putting 300 VMs in a basket wouldn't be a problem, since they wouldn't live in a single basket. The 300 eggs would live in several baskets at the same time. If one of the hosts dies, the 300 VMs live on as if nothing had happened, and they automatically duplicate themselves to the remaining hosts so that each VM is always running on two hosts.
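Whether the network could carry that is another question. Here is a quick sketch of how many FT-protected VMs a single 10GbE uplink might support for FT logging traffic. Both the NetIOC share and the per-VM logging bandwidth are assumptions of mine; the real numbers depend entirely on the workloads.

```python
# How many FT-protected VMs could one 10GbE uplink carry for FT logging traffic?
# Both parameters below are assumptions for illustration only.

LINK_GBPS = 10.0
NETIOC_SHARE_FOR_FT = 0.3   # assume NetIOC gives ~30% of the link to FT logging
AVG_FT_LOGGING_MBPS = 20.0  # assumed average FT logging traffic per VM

available_mbps = LINK_GBPS * 1000 * NETIOC_SHARE_FOR_FT
vms_per_uplink = available_mbps / AVG_FT_LOGGING_MBPS
print(f"~{vms_per_uplink:.0f} FT-protected VMs per 10GbE uplink under these assumptions")
```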


Will this happen by 2014? I don't know; we'll just have to wait and see.
