Designing the Virtual Data Center - Part 1

Posted by Jason Langone   |   Monday, October 05, 2009

Welcome to a new virtualization.info column focused on designing the virtual data center.

The job of a Virtualization Architect may be one of the most complex within an overall IT department. To be successful, a Virtualization Architect must possess the skills of a storage guru, network engineer and server administrator, and must also be proficient in advanced automation techniques. The ability to think conceptually at a high level and understand how separate components relate to one another is truly what separates senior engineers from legitimate architects.

As I discussed at Virtualization Congress 2009, building a virtual infrastructure (VI) involves leveraging a broad spectrum of technologies. When building an overall VI, one-stop vendor shopping is often the easiest method of procurement but will likely yield a less than ideal solution. This column will focus on the connections between the critical components of the VI and how Virtualization Architects construct the best platform for both technical and financial success.

It is important to understand that no matter how great the hypervisor, the least common denominator in virtualized solutions is the Input/Output (I/O) layer, which consists primarily of storage and network connectivity. The I/O layer is what physically provides the pipeline to storage, network resources, end users and clientele. The physical server platform is generally either a compact blade solution (e.g. HP C Class blade series) or a collection of robust 2-4U servers (e.g. Dell R900). Either way, the servers connect to their storage via Fibre Channel, InfiniBand or Ethernet. That storage connectivity is of paramount importance because the storage is where the actual virtual machines reside.

In the early days of this current era of virtualization (5-6 years ago), virtual machine performance would often be inferior to that of its physical counterparts simply because the server administrator (who artificially matured into the virtual infrastructure administrator) had little understanding of how the underlying storage platforms worked or of the considerations that needed to be made when requesting provisions from the storage team.

A post on my own blog about “SCSI delayed writes,” an error registered inside Windows virtual machines when storage access latency becomes excessive, is still one of the most viewed articles I have ever written. This illustrates that a default configuration is not the path to great performance or great success. The popularity of the article also highlights that many server and virtual infrastructure administrators are not clear on what constitutes acceptable performance metrics.

Today, storage vendors are making life a bit easier for the growing number of virtualization administrators by, for example, offering “VMware” as a server host-type from within the array configuration. Also, an increasing number of storage vendors are grasping the importance of integrating their solution stack into the VI and as such have opened up their application programming interface (API). API access allows seasoned developers to associate VI tasks with the underlying storage platform, which may help realize significant time savings in large environments.

From my experience, I would offer a few recommendations to both virtualization architects at their current companies and consultants walking into an unknown environment. They are as follows:

  1. Metrics. Using your storage stress test tool of choice (e.g. IOMeter), run a battery of tests inside a virtual machine using several different block sizes and queue depths. The goal is to test what a typical virtual machine will do in your environment when running under load. From here, monitor the performance not only from within the hypervisor’s management console (e.g. XenCenter), but also from the switch and array consoles. I once walked into a 4Gb/s FC environment that was seeing a maximum of 4MB/sec throughput from physical host to the storage array. Needless to say, we spent significant time rectifying the storage issue before even embarking on the virtualization initiative.
  2. Host Port Balancing. In the vast majority of FC environments I have walked into, the storage arrays typically hold 4 to 8 host ports (2 to 4 per storage processor). Multiple host ports provide obvious redundancy when cabled correctly and also provide additional bandwidth.
    By default, VMware hosts will typically use vmhba1 (often associated with Fabric A) and happily connect down their first path (which is often the first host port on the array). This configuration works well enough and provides the connectivity needed to continue the build out, but it has two potential downsides. The first is that if the virtual infrastructure contains scores of physical hosts and each host is using the same HBA going down the same fabric (A) to the same host port on the array (#1), then there is potential for a bottleneck. The second is that money was spent on additional host ports that now sit idle.
    Balancing storage traffic across all of the available host ports is easy enough to do from within the management console or with a script. Architects and administrators should check that the path configuration is still in effect after a host reboot.
  3. Monitoring. Many organizations run their critical production systems on a virtualized infrastructure, leveraging the many benefits of server virtualization. However, many organizations also fail to correctly monitor their VI in the same holistic fashion as they have historically done with their physical environment and its hosted services. Integrating the VI into an enterprise solution from BMC, HP, et cetera, is certainly one viable option. Additionally, there are plenty of third-party solutions that work well in a network operations center (NOC) to have trained eyes monitor the VI (e.g. the visual representation provided through Vizioncore’s vFoglight or Veeam Monitor).
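
As a rough sketch of recommendations 1 and 2 above, the Python below first expresses an observed throughput as a fraction of a link's nominal rate, then spreads hosts across the available paths in simple round-robin order. Everything named here is an assumption made for illustration: the host names, HBA names, array port labels and the roughly 400 MB/s figure commonly quoted for a 4Gb/s FC link are not taken from any particular environment, and the script only produces a plan. It does not talk to a hypervisor or an array.

```python
from collections import Counter
from itertools import cycle

# Assumption: ~400 MB/s is the commonly quoted usable rate of a 4Gb/s FC link
# in one direction; substitute the figure that applies to your own fabric.
FC_4G_NOMINAL_MB_S = 400.0


def link_utilization(observed_mb_s, nominal_mb_s=FC_4G_NOMINAL_MB_S):
    """Return observed throughput as a fraction of the link's nominal rate."""
    if nominal_mb_s <= 0:
        raise ValueError("nominal_mb_s must be positive")
    return observed_mb_s / nominal_mb_s


def assign_preferred_paths(hosts, paths):
    """Hand out (HBA, array host port) pairs to hosts in round-robin order."""
    if not paths:
        raise ValueError("at least one path is required")
    rotation = cycle(paths)
    return {host: next(rotation) for host in hosts}


# Hypothetical inventory: eight hosts, two fabrics, four array host ports.
hosts = [f"esx{n:02d}" for n in range(1, 9)]
paths = [
    ("vmhba1", "SPA-port0"),  # fabric A
    ("vmhba2", "SPB-port0"),  # fabric B
    ("vmhba1", "SPB-port1"),  # fabric A
    ("vmhba2", "SPA-port1"),  # fabric B
]

plan = assign_preferred_paths(hosts, paths)
hosts_per_path = Counter(plan.values())

print(f"4 MB/s on 4Gb/s FC is {link_utilization(4.0):.1%} of nominal")
for path, count in sorted(hosts_per_path.items()):
    print(path, "preferred by", count, "hosts")
```

Round-robin is only one possible policy. It evens out the number of hosts per path, not the actual load, so the stress-test figures from recommendation 1 are still needed to confirm that the resulting layout behaves as expected.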

Furthermore, new technologies and server platforms are increasing the benefits of server consolidation but placing greater strain on the I/O channels. The advent of Intel’s mighty Nehalem is one such technology that will allow a significant increase in the number of VMs that can be adequately hosted per physical server. The more VMs per physical server, the greater the I/O demands. Many current VI implementations use solutions such as blade servers but then rely on a handful of GbE connections to provide the bandwidth. This will prove to be a limiting component if not addressed with the new (and more powerful) breed of server hardware.
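
To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python that compares aggregate VM demand with the raw capacity of a host's uplinks. The VM counts and the 10 MB/s average per VM are made-up illustration values, not measurements, and the capacity figure ignores protocol overhead, so treat the result as a ceiling check only.

```python
def uplink_demand_ratio(vm_count, avg_mb_s_per_vm, link_count, link_gbps=1.0):
    """Return aggregate VM demand divided by raw uplink capacity.

    A result above 1.0 means the VMs ask for more than the links can carry.
    Capacity is the raw line rate (1 Gb/s = 125 MB/s), with no allowance
    for protocol overhead.
    """
    if link_count <= 0 or link_gbps <= 0:
        raise ValueError("link_count and link_gbps must be positive")
    capacity_mb_s = link_count * link_gbps * 1000 / 8
    return (vm_count * avg_mb_s_per_vm) / capacity_mb_s


# Hypothetical host: four 1GbE uplinks, VMs averaging 10 MB/s each.
older_host = uplink_demand_ratio(vm_count=20, avg_mb_s_per_vm=10, link_count=4)
denser_host = uplink_demand_ratio(vm_count=60, avg_mb_s_per_vm=10, link_count=4)

# The same dense host with two hypothetical 10GbE uplinks instead.
converged_host = uplink_demand_ratio(
    vm_count=60, avg_mb_s_per_vm=10, link_count=2, link_gbps=10.0
)

for label, ratio in [
    ("20 VMs on 4 x 1GbE", older_host),
    ("60 VMs on 4 x 1GbE", denser_host),
    ("60 VMs on 2 x 10GbE", converged_host),
]:
    print(f"{label}: demand is {ratio:.0%} of raw capacity")
```

Under these assumed numbers, tripling the VM count on the same four GbE links pushes demand past raw capacity, which is the limiting effect described above.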

Companies like Cisco and Xsigo are directly addressing these I/O demands. Xsigo offers I/O modules that can be placed inside physical servers, which can then connect to the Xsigo I/O Director via InfiniBand. IP network connectivity, IP storage and Fibre Channel storage are then carried over a pair of very large pipes (10 to 20Gb/s InfiniBand) to the Xsigo Director, which in turn connects into the proper Ethernet or Fibre Channel networks. By removing the often limiting edge switches and tying the VI directly into 10GbE+ networks, virtualization can continue to proliferate at an exponential rate (thanks to even more favorable consolidation ratios), which may ultimately have an impact on how hypervisors are licensed. The concept of converged networks is not new, but such networks are quickly becoming a necessity in the enterprise to sustain the increasing density of virtual infrastructures.

While technologies such as InfiniBand and 10GbE have not become the standard for server connectivity in the mainstream, their adoption rate has been significant in high-performance computing (HPC) environments, where latency and throughput are of paramount importance. As near-term I/O demand exceeds what the usual methods of connectivity in mainstream environments (4Gb/s FC and n x 1GbE) can deliver, InfiniBand or 10GbE will need to be implemented. Their implementation will provide large I/O channels that carry both network and storage connectivity over the same pipe, in effect a Virtual Infrastructure-over-Ethernet. This will allow Virtualization Architects to continue focusing on big picture solution delivery and not be limited by the underlying connectivity framework.
