November 21, 2015

Replacing a vSAN caching disk

Replacing disks in vSAN can be a bit less smooth than on some traditional storage arrays. For normal disks used for storage it's quite easy, but for disks used for caching it can be a slightly different story. If you get a dead caching disk, you should remove it from the config before removing it physically from the server. Otherwise you will get the problems described in this posting.

Once the disk has been physically replaced you will be unable to delete the disk or the disk group from either the vSphere Web Client or RVC. The deletion fails because vSAN can't find the original disk. The disk will show up with a status of "Dead or Error" or "Absent" (depending on where you look).

"esxcli vsan storage list" will show all the other disks belonging to vsan on that server, but not the missing SSD disk.

Listing the disks in RVC with the command vsan.host_info shows that the disk is in an Absent status.

Trying to use RVC with "vsan.host_wipe_vsan_disks -f" to remove the disk also fails.

A solution that did work in the end was to use partedUtil to remove the partitions of all spinning disks of this disk group. partedUtil is a very dangerous tool so if you have multiple disk groups on your host (like we had) you must make sure you're working with the correct disks. We found it best to locate the naa IDs of the failed disk group from the web client.

After removing both partitions of all the disks belonging to this disk group, the disk group was gone and we could create a new one where we were able to use our new SSD disk and all the spinning ones.
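
For reference, here is a minimal dry-run sketch of what the partition removal could look like from the ESXi shell. It only prints the partedUtil commands so they can be reviewed before anything is run. The function name and the naa IDs in the example are made up for illustration, and partition numbers 1 and 2 are an assumption based on the two partitions we saw per disk. Check the output of partedUtil getptbl for each disk before deleting anything.

```shell
#!/bin/sh
# Dry-run helper: prints the partedUtil commands for each disk instead of
# running them. Nothing is deleted by this function.
# Assumption: each disk carries partitions 1 and 2; verify this with the
# printed "getptbl" command before running any "delete" line.
print_wipe_commands() {
    for id in "$@"; do
        dev="/vmfs/devices/disks/$id"
        # First show the current partition table for a sanity check
        echo "partedUtil getptbl $dev"
        # Then one delete command per partition
        for part in 1 2; do
            echo "partedUtil delete $dev $part"
        done
    done
}

# Example with hypothetical IDs (take the real ones from the web client):
# print_wipe_commands naa.5000000000000001 naa.5000000000000002
```

Copy the printed lines one at a time only after double checking that every naa ID belongs to the failed disk group.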

The official way to solve this problem is to remove the disk from the pool while it's still present in the server. In our case that was not possible. The SSD disk had for some unknown reason entered "Foreign mode", which is a Dell disk controller feature. We had to enter the PERC controller BIOS settings (from POST), clear the Foreign Config and also configure the disk in the controller config in order to use it again. Because of this the disk came up with a new naa ID even though we didn't really have a failed disk.

March 23, 2015

vSphere AutoDeploy and Trend Micro Deep Security

When researching online documentation to see if we could get Trend Micro Deep Security implemented in our VMware vSphere AutoDeploy environment, the only references we could find were a Japanese blog posting and a Japanese white paper. My language abilities are a bit limited, but I still found the screenshots valuable.

To get Deep Security working there are several components that need to be set up in a given order:
  1. Manually load vShield Endpoint driver on one of the ESXi hosts
  2. Update Host Profile based on ESXi host with vShield Endpoint driver
  3. Edit Host profile in order to get it working
  4. Create new ESXi image with Image builder that includes the vShield Endpoint driver and Trend Micro Filter driver
  5. Boot ESXi hosts from new ESXi Image
  6. Remediate new Host Profile for these hosts
  7. Deploy DSVA per ESXi host
1. You need to use vShield Manager to install the vShield Endpoint driver. Note that the ESXi host should not be in maintenance mode when doing this. This may sound strange, but you'll get an error message after installing it if the host was in maintenance mode.

2. Go to Host Profiles and either create a new Host Profile based on the host, or update an existing Host Profile from the host you installed the driver on.
3. You need to edit the Host Profile. In addition to the other tasks that need to be done when a Host Profile has been updated from a host config, you now also need to make the new vShield Endpoint network work automatically. There are basically three things that need to be done: unselect the vShield Connection ID field, make sure you don't get asked for a MAC address, and set a static IP address. This address is always the same and sits on an internal (host-only) network on each host.

4. The following needs to be added to the VMware vSphere Image Builder script:

Add-EsxSoftwareDepot -DepotUrl "e:\vmware\drivers\"
Add-EsxSoftwareDepot -DepotUrl "e:\vmware\drivers\"

Add-EsxSoftwarePackage -ImageProfile $imageprofile -SoftwarePackage epsec-mux
Add-EsxSoftwarePackage -ImageProfile $imageprofile -SoftwarePackage dvfilter-dsa
5. Activate the new image using the cmdlet Repair-DeployRuleSetCompliance
6. Remediate the host with the new Host Profile.
7. You can now see that the ESXi host has a prepared status and you can now start deploying DSVAs.

March 22, 2015

vSphere AutoDeploy and Apex 2800 cards

When reading through the Teradici documentation you can't find a single reference to either AutoDeploy or Image Builder. The good news is that it does indeed work out of the box. All you need is to add a few lines to the Image Builder config:
Add-EsxSoftwareDepot -DepotUrl "e:\vmware\drivers\"
Add-EsxSoftwarePackage -ImageProfile $imageprofile -SoftwarePackage pcoip-ctrl
Add-EsxSoftwarePackage -ImageProfile $imageprofile -SoftwarePackage tera2
You can now build the image like you normally do and the driver will load if there's an APEX card in the server.

January 26, 2015

Bulk registering vSAN disks for controllers not supporting pass-through mode

When configuring vSAN the amount of initial setup time is highly dependent on the type of disk controller you're using. Some controllers support pass-through mode and will not need the additional configuration described in this posting.

If you are however using a controller such as the Dell PERC H710, you will first need to set up each disk in the RAID controller's BIOS, with every disk in its own disk group where you enable write through, disable read ahead and select initialize.

After doing this you will see the individual disks within VMware vCenter under the esx host / manage / storage / storage controller / devices. The disks are however not detected correctly as the controller gives no information about the type of disks shared in these RAID 0s.

In order for vSAN to make sense of these disks you will need to create rules that specify what type of disk each one is.

Spinning disk command:
esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL --device <device id> --option "enable_local"

SSD disk command: 
esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL --device <device id> --option "enable_local enable_ssd"

The device id in question here is the naa lun id. Some suggest using the command esxcli storage core device list, but in a system with many disks I've found it easier to run fdisk -l and identify the disk types by looking at the disk sizes.
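
If you would rather stay with esxcli storage core device list, its output can be boiled down to one line per disk with a small awk filter. This is only a sketch. It assumes that each device block starts with the naa id in the first column followed by an indented "Size:" line (which I believe is in MB), and that awk is available in your ESXi shell. The function name is made up for this example.

```shell
#!/bin/sh
# Reads "esxcli storage core device list" style output on stdin and prints
# "<naa id> <size>" for every naa device found.
# Assumed layout: the device id sits in column one, attributes are indented.
list_disk_sizes() {
    awk '
        /^naa\./             { id = $1 }         # remember the current device
        /^ +Size:/ && id != "" { print id, $2 }  # emit id and size for it
    '
}

# On the host (sorted by size so the SSDs and spinning disks group together):
# esxcli storage core device list | list_disk_sizes | sort -k2 -n
```

The sorted list makes it fairly easy to copy out the ids for each disk type.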

You can compile the list of naa lun ids for a given disk type and run the following commands:
for i in <paste list of spinning disk naa lun ids here>; do
  esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL --device "$i" --option "enable_local"
done

for i in <paste list of ssd disk naa lun ids here>; do
  esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL --device "$i" --option "enable_local enable_ssd"
done

You will now need to reboot the host for the new config to become active. Repeat these steps for all of your vSAN hosts and you'll soon be able to start configuring vSAN.

November 22, 2014

vSAN and HP 5400 switches

While setting up vSAN we found several guides for Cisco switches, but none for HP. Even the HP vSAN reference architecture was using Cisco Nexus switches.

We initially saw the error message "Host cannot communicate with all other nodes in the VSAN enabled cluster" even though all vSAN enabled vmkernel interfaces could ping each other. vSAN has some special multicast requirements that need to be taken care of.

We were trying to get HP 5400 series 10GbE switches to work with vSAN.

After playing around for a bit with the switch config we came up with the following working config:
vlan 53
   name "vSAN network 1"
   tagged C1-C8
   ip address
   ip igmp
Within a few minutes the error messages were gone, status went to Normal with a green icon and vSAN started working nicely.
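
One way to check from the ESXi side whether multicast is getting through is to capture on the vSAN vmkernel port. The sketch below only builds the tcpdump-uw command line so you can review it before pasting it into the ESXi shell. The function name is made up, vmk2 is just an example interface, and ports 12345 and 23451 are the vSAN multicast defaults as far as I know. esxcli vsan network list should show the values actually in use on your hosts.

```shell
#!/bin/sh
# Prints a tcpdump-uw command that captures 20 packets of vSAN multicast
# traffic on the given vmkernel interface. Nothing is captured by this
# function itself.
# Assumption: the default vSAN multicast ports (12345 and 23451) are in use.
vsan_mcast_capture_cmd() {
    vmk="${1:?usage: vsan_mcast_capture_cmd <vmkN>}"
    echo "tcpdump-uw -i $vmk -n -c 20 udp port 12345 or udp port 23451"
}

# Example with a hypothetical interface name:
# vsan_mcast_capture_cmd vmk2
```

If the capture only ever shows packets from the local host, the switch is most likely still not forwarding the multicast traffic.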

Since we had 2x 10GbE nics dedicated to vSAN we also set up a secondary vlan for vSAN and bound each of the vlans to different nics in order to get maximum performance.

November 18, 2014

Accessing the GK Cloud Labs from Linux

Last week I attended vSAN training in Stockholm. The requirement for attending this class was that you needed to bring your own laptop with RDP capabilities.
When attending the class I discovered that there was a bit more to this requirement. According to the class manual you had to install an ActiveX component in Internet Explorer in order to get this working.

As I'm a Linux user they did of course not provide any info on how to do it, but that's part of the game I guess. In case I couldn't figure things out I could always start a Windows VM from within VMware Workstation. They did however provide info for Apple Macintosh users. By reading through the Mac docs I found what was really going on behind the scenes. The RDP session required a proxy config and encryption.

The standard Ubuntu RDP client didn't provide support for an RDP proxy, but I found an alternate client, called FreeRDP that I installed by following this HowTo.

I could now access the labs by using the info from the login info sheet we had been provided, with the following command:
xfreerdp  / /d:gklabs /u:Wxxxx-Studentx-x /p:PassWord / /w:1920 /h:1080 -nego
The connection now worked perfectly, even though it spent some time setting up the initial connection. Looks like it was trying to verify the certificate, even with the -nego switch that is supposed to tell it to ignore the certificate. Well, it does in fact ignore it in the sense you're not warned about a self signed certificate, but it still waits for it to time out before starting the connection.

All in all the training was a great experience, giving a better insight into vSAN than the HOL lab.

August 23, 2014

Making the XtremIO GUI Simulator work under Linux

While attending XtremIO training this week there was a bit of talk about a GUI simulator for XtremIO. While not as good as the real thing, it can be useful for getting to know the GUI and maybe for showing customers/colleagues how to admin the XtremIO. Although XtremIO was bought by EMC, they still seem to operate outside of EMC and their GUI is not integrated into Unisphere.

The GUI Simulator is available for download and exists in two flavors: Mac and Windows.

I downloaded the Windows version and initially planned to try to run it in Wine, but I discovered that it really was a Java application, so I just needed to extract the correct files and install the required version of Java.

I use Ubuntu 13.04 and did the following steps:

Install Java runtime 1.8:

$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java8-installer
$ java -version

Install Wine from Software Center if you haven't already. We will be using Wine to unpack the files inside the .exe file by installing it into a Wine container. Locate the XtremIO GUI Simulator exe file (which is an installer) and right click it.
Choose Open with Wine Windows Program Launcher.

Choose to install the application.

After a bit the install will finish and all the files are extracted.
You will need to make the Simulator.jar file executable.
$ cd .wine/drive_c/users/lars/Local\ Settings/Application\ Data/XtremIO\ GUI\ Simulator/app/
$ chmod +x Simulator.jar

Navigate to the app folder using the file browser
Right click Simulator.jar and choose Open with Oracle Java 8 Runtime
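
If you prefer the terminal over the file browser, the jar can presumably be started with java -jar as well. The sketch below is a small helper that looks for Simulator.jar under a Wine prefix, which is handy since the install path contains spaces and the user name. The function name is made up, and ~/.wine as the default prefix is an assumption about a standard Wine setup.

```shell
#!/bin/sh
# Prints the full path of the first Simulator.jar found under the given
# Wine prefix (default: ~/.wine). Prints nothing if no jar is found.
find_simulator_jar() {
    prefix="${1:-$HOME/.wine}"
    find "$prefix/drive_c" -type f -name 'Simulator.jar' 2>/dev/null | head -n 1
}

# Example: change into the app folder and start the simulator from there.
# jar="$(find_simulator_jar)"
# [ -n "$jar" ] && cd "$(dirname "$jar")" && java -jar Simulator.jar
```

Starting it from inside the app folder keeps the working directory the same as when launching it from the file browser.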

Pick your choice, any choice.

Login with default credentials

And you're free to use the XtremIO GUI Simulator.
Note that while the GUI Simulator is good for training, it is not 100% equal to the real XtremIO GUI, as the simulator seems to have a few bugs that are not present in the real GUI. It still gives a fairly good idea of how things work.

The GUI Simulator requires quite a bit of resources in order to run well, so it will not work very well on a slow PC without much free RAM.