I could write page after page about the ins and outs of VSAN, but fortunately several very respected individuals have already done so. For starters, Duncan Epping at yellow-bricks.com is not only a massive contributor to the cause, but has also put together a nice list of VSAN resources from around the web that is a must-see. But let's face it, if you're tracking VSAN you've probably already been there, done that :-) So for this post, I'm going to focus instead on my VSAN home lab build and experiences thus far. I've shared several preliminary stats on Twitter (here, here, and here) ahead of any tweaking and will be sure to post additional results as I play with things a bit more.
EZLAB ("EZ" after El-Zein in case you were wondering) has been through somewhat of an overhaul. My original lab was mostly whitebox and was everything I needed at the time, but to play in the home lab big leagues I needed to make some modest investments.
Here's a logical / solutions overview of the current state of "EZLAB"...
|EZLAB Logical Architecture|
2 x Dell PowerEdge R710 2U
2 x Dell PowerEdge R610 1U
System Configuration (per host):
- 2 x 4-core Intel E5620 @ 2.4GHz
- 64GB Memory
- PERC 6/i RAID Controller, no JBOD :-(
- 1 x Samsung 840 Pro 256GB SSD
- 3 x WD Black 7.2k RPM, 750GB 2.5" SATA HD
- 1 x 4PT Broadcom 1Gbps NIC
- 1 x 4PT Intel 1Gbps NIC
- 1 x 2PT InfiniBand 4X DDR HCA @ 10Gbps*
* not yet implemented
Brocade FCX 24pt 1Gbps + 2pt 10Gbps Switch
Qlogic SilverStorm 24pt Infiniband Edge (9024-CU24-ST2)*
* not yet implemented
Synology 1511+, 5 x Crucial 128GB SSD + 5 x WD Black 1TB SATA
VMware Virtual SAN (!!)
The Synology 1511+ has been the primary storage solution for a couple years now...and it met all my needs for a small environment. The 5 x SSD bay definitely contributed to that. However, with the recent VSAN upgrades, the Synology has taken a back seat of sorts, now providing NFS-based datastores for the nested vESXi hosts in the cloud cluster (at least until I move VSAN into there as well). It is also utilized as primary backup, media/file server, VPN, OpenLDAP, etc. I love this unit, so I don't think I'll be retiring it anytime soon. But needless to say, VSAN has taken over as my primary storage fabric. Speaking of VSAN...
As I'm sure you've heard/witnessed by now, VSAN is a breeze to configure. Once your disks are online and visible to vSphere, you enable VSAN traffic on the appropriate vmk interface, then enable VSAN in the cluster settings. Again, there are so many resources out there that will step you through getting started, the configuration, in-depth details, design considerations, FAQs, other deployment options, and troubleshooting with VSAN Observer...so I'll spare you those details. This post is not intended to be a how-to guide.
Configuring VSAN in my lab was incredibly straightforward. There are currently 3 hosts participating in my VSAN cluster...the 4th (R610) is down for maintenance at the moment, but I will be adding it to the mix very shortly.
UPDATE: 4th host was added in a follow-up post, Scaling VSAN: Adding a New VSAN Host
Each of my ESXi hosts is configured with a dedicated vSwitch, a dedicated storage vmk (enabled for VSAN), and 2 physical 1Gbps uplinks (active/standby) for all storage traffic. This will soon be replaced with the InfiniBand fabric, which will add dual 10Gbps HCAs per host. Although VSAN supports both 1Gbps and 10Gbps networks, 10Gbps is highly recommended for scalability and performance. I opted for InfiniBand to keep costs down after reading Erik Bussink's InfiniBand post.
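For anyone who wants to double-check the vmk tagging from the command line rather than the Web Client, something like the following should do it. This is only a minimal sketch: the `check_vsan_network` wrapper is a hypothetical helper name, and the guard is there so the snippet fails politely when it is run anywhere other than an ESXi shell.

```shell
# Minimal sketch: list the vmkernel interface(s) tagged for VSAN traffic.
# check_vsan_network is a hypothetical helper name, not part of esxcli.
check_vsan_network() {
  if command -v esxcli >/dev/null 2>&1; then
    # Shows each vmk interface currently enabled for VSAN on this host.
    esxcli vsan network list
  else
    echo "esxcli not found; run this from an ESXi shell"
  fi
}

check_vsan_network
```

If the dedicated storage vmk does not show up in the list, VSAN traffic has not been enabled on it yet.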
With the cluster selected, browse to Manage tab --> Settings --> General to enable and configure VSAN. I used VSAN's manual configuration option to give me full control of which disks are used for the cluster. You can also opt to use the "Automatic" option, which automatically claims and consumes all empty/available local disks for VSAN. A minimum of 1 SSD and at least 1 mechanical disk are required per disk group. In my case I used all the host's available disks (1 SSD, 3 SATA) in a single disk group per host.
Once VSAN is enabled, the Disk Management section allows you to configure and manage new or existing Disk Groups by claiming available physical disks. Since I opted for the Manual configuration option, this is where I created each Disk Group (1 per host).
|VSAN Disk Management|
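To put rough numbers on that layout, here is a back-of-the-envelope capacity sketch. The host and disk counts come from the hardware list above; the failures-to-tolerate value of 1 is an assumption (it is VSAN's default policy setting, but the policies actually in use here may differ). The SSD is left out on purpose, since in a hybrid disk group it serves as cache rather than capacity. The result ignores witness components and on-disk overhead, so treat it as an upper bound.

```shell
# Rough capacity estimate for the current 3-host cluster.
hosts=3          # hosts currently participating in VSAN
hdd_per_host=3   # WD Black drives in each host's disk group
hdd_gb=750       # capacity per drive, in GB
ftt=1            # ASSUMPTION: failures to tolerate (VSAN default)

# Only the mechanical disks count toward datastore capacity.
raw_gb=$((hosts * hdd_per_host * hdd_gb))

# Each object is kept as (ftt + 1) full replicas.
usable_gb=$((raw_gb / (ftt + 1)))

echo "raw: ${raw_gb} GB, approx. usable at FTT=${ftt}: ${usable_gb} GB"
```

Bumping `hosts` to 4 gives a feel for what the fourth node adds once it joins.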
The other caveat with using RAID0 is SSD presentation -- vSphere will not recognize the SSD drive in a RAID0 as a local SSD. The work-around is to 'fool' vSphere into thinking the SSD's RAID0 volume is actually a native SSD drive. To do this, I had to SSH into each host and execute the following esxcli commands:
esxcli storage nmp satp rule add -s VMW_SATP_LOCAL -d naa.6842b2b006600b001a6b7e5a0582e09a -o enable_ssd
esxcli storage core claiming reclaim -d naa.6842b2b006600b001a6b7e5a0582e09a
The device "naa.6842b2b006600b001a6b7e5a0582e09a" is the device name that corresponds with the SSD drive (on host ezlab-esx05 in this example). These commands were run on each host for the appropriate SSD drives. Once completed, the Drive Type will properly show SSD (no reboot necessary)...and only then can a VSAN Disk Group be created. In case you're wondering...yes, this procedure can be done on a non-SSD drive for the sake of testing VSAN...but definitely not a recommended practice. The take-away here -- go with JBOD.
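Since the same pair of commands has to be repeated per host with a different device ID each time, a tiny helper can save some typing. The sketch below is only that, a sketch: `ssd_tag_cmds` is a hypothetical function name, and it deliberately just prints the commands so they can be reviewed before being pasted (or piped) into an ESXi shell. The third command is an addition for verification; its output includes an "Is SSD" field for the device.

```shell
# Hypothetical helper: print the SSD-tagging commands for one device ID.
# Nothing is executed here; review the output, then run it on the host.
ssd_tag_cmds() {
  dev="$1"
  # Add a claim rule that flags the device as SSD.
  echo "esxcli storage nmp satp rule add -s VMW_SATP_LOCAL -d $dev -o enable_ssd"
  # Reclaim the device so the new rule takes effect without a reboot.
  echo "esxcli storage core claiming reclaim -d $dev"
  # Verification step: check the 'Is SSD' field in the device details.
  echo "esxcli storage core device list -d $dev"
}

# Example using the device ID from ezlab-esx05; substitute your own.
ssd_tag_cmds naa.6842b2b006600b001a6b7e5a0582e09a
```

Printing instead of executing is a deliberate choice: tagging the wrong device as SSD is easy to do and annoying to undo.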
|VSAN Completed Configuration and Status|
|Monitoring VSAN's Disk Groups|
Storage Policy-Based Management (SPBM)
VSAN enforces various settings and policies by using the SPBM engine on a per-VM basis (or globally if desired). Cormac Hogan covers SPBM in detail as part of his VSAN series, which is a must-read. All VMs that live on a VSAN datastore are assigned either the default policy or a user-specified one. It is recommended to create a policy of your own, even if it simply applies all the default settings. Once a policy is created, it is applied to the VM (to some or all of its attached disks/files) using the vSphere Web Client.
|Applying a VM Storage Policy|
The "High IO Apps" policy is sort of ridiculous at the moment with the read cache reservation set to 100%. This is experimental as I'm gauging the impact of this setting on some high-IO apps.
|EZLAB Storage Policy "High IO Apps"|
|EZLAB Storage Policy "Non-Critical Apps"|
You can monitor policy status and VM disk/component placement in the vSphere Web Client by selecting the Cluster or VM in the left pane and selecting "Virtual SAN" from the "Monitor" tab.
|Monitoring VSAN Virtual Disks|
Early testing has yielded very impressive results that compete with the performance I was getting out of the all-SSD array, but at a tiny fraction of the cost for 10X the capacity. I will continue to play with VSAN and SPBM as I fine-tune the lab (especially after the 10Gb fabric is installed!)...and I'll be sure to share the results.
|VSAN Iometer Test - 4k size, 100% Read, 100% Random, 2GB object on a single VM.|
Be sure to follow me on Twitter to see more test results as I get them out there.