In my work as a consultant I see many different VMware environments on a daily basis, and from time to time I'm called out to troubleshoot performance issues. For troubleshooting such issues I've compiled a list of things to be aware of, published as an article on vmfaq.com.
This article has been the most popular article on that site for quite some time. The list is not meant to be a final solution to all problems, but rather a quick checklist that rules out the most common errors. Of the 10 steps listed there, one has turned out to be the root cause of many of the issues I've seen lately. The problem normally involves one or more servers with bad performance, and even after the local VMware admin has tried to tune the systems with more RAM and vCPUs, the performance is still bad.
Take a look at this screenshot:
After removing the memory limit we could see the balloon deflate quite quickly:
This step helped a lot for this VM's performance. How about other VMs on this system? Were other VMs affected by the same problem? Yes, there sure were:
These VMs often also have CPU limits set, and those should be removed as well. The CPU limits I have seen have, however, affected performance to a much lesser degree than the memory limits. This is mainly because the clock speeds (GHz) of today's CPUs are quite similar to those of CPUs five years ago.
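To hunt for such leftover limits across many VMs, a small script can help. What follows is only a minimal sketch in Python, not a finished tool: it works on plain records you would have to fill in yourself, and the names `VmAllocation` and `find_limited_vms` are made up for this illustration. It assumes that "no limit" is reported as -1, which is how the vSphere API reports unlimited in `config.memoryAllocation.limit` and `config.cpuAllocation.limit`; how you export those values from your environment is up to you.

```python
from dataclasses import dataclass
from typing import List

# Assumption: a limit of -1 means "unlimited", as the vSphere API reports it.
UNLIMITED = -1


@dataclass
class VmAllocation:
    """Hypothetical record holding the resource settings of a single VM."""
    name: str
    configured_memory_mb: int  # memory the guest believes it has
    memory_limit_mb: int       # -1 when no memory limit is set
    cpu_limit_mhz: int         # -1 when no CPU limit is set


def find_limited_vms(vms: List[VmAllocation]) -> List[str]:
    """Return one human-readable finding per limit worth reviewing."""
    findings = []
    for vm in vms:
        # A memory limit below the configured memory is the problematic case:
        # the guest sees more memory than the host will back with physical RAM.
        if (vm.memory_limit_mb != UNLIMITED
                and vm.memory_limit_mb < vm.configured_memory_mb):
            shortfall = vm.configured_memory_mb - vm.memory_limit_mb
            findings.append(
                f"{vm.name}: memory limit {vm.memory_limit_mb} MB is "
                f"{shortfall} MB below the configured {vm.configured_memory_mb} MB"
            )
        # Any CPU limit is reported so that it can be reviewed.
        if vm.cpu_limit_mhz != UNLIMITED:
            findings.append(f"{vm.name}: CPU limit of {vm.cpu_limit_mhz} MHz is set")
    return findings


if __name__ == "__main__":
    # Made-up inventory, for illustration only.
    inventory = [
        VmAllocation("db01", 4096, 1024, UNLIMITED),
        VmAllocation("web01", 2048, UNLIMITED, UNLIMITED),
        VmAllocation("app01", 2048, 2048, 1000),
    ]
    for finding in find_limited_vms(inventory):
        print(finding)
```

A memory limit equal to the configured memory is deliberately not flagged here, since it does not force any ballooning by itself. CPU limits are always listed, because they are cheap to review and are often forgotten.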
The topic of this posting is "Limits are evil", and I think they really are evil when you don't know they are there. I can certainly see the usefulness of limits in many situations, but there's a huge difference between applying a setting that is thought through and planned, and inheriting a setting that you weren't aware of.