Whitepaper: Performance Troubleshooting for VMware vSphere 4 and ESX 4.0

Posted by Alessandro Perilli   |   Monday, October 19, 2009   |  

In July VMware released a 51-page paper that is definitely worth a read: Performance Troubleshooting for VMware vSphere 4 and ESX 4.0.

The document, which is continuously updated, doesn’t just describe all the aspects of the product (CPU, memory, storage and network) that should be checked to troubleshoot performance. It also provides a much-needed troubleshooting methodology:

[Figure: VMware performance troubleshooting methodology]

Benchmarks: Exchange 2007 on VMware vSphere 4.0 with FC, iSCSI and NFS storage compared

Posted by Alessandro Perilli   |   Friday, August 07, 2009   |  

A couple of weeks ago the VMware Performance Team released an interesting new paper about a virtual deployment of Microsoft Exchange Server 2007 on vSphere 4.0 Release Candidate (build 140815).

The 16,000-mailbox environment was distributed across 8 virtual machines (Windows Server 2003 R2, 14GB RAM, 2 vCPUs, 20GB vHD and 2,000 users each) served by an HP ProLiant DL580 G5 server with 4 quad-core Intel Xeon X7350 CPUs @ 2.93GHz and 128GB RAM.

The backend storage was served by a NetApp FAS6030 array with 114 disks split into four aggregates (the data aggregate was made up of 40 disks).

VMware tested the same environment with Microsoft Exchange Load Generator (8-hour workday simulation) using a 4Gb Fibre Channel connection, a 1GbE iSCSI connection and a 1GbE NFS connection.
Despite this major difference in available bandwidth, the performance of the three protocols is very similar:

[Figures: Exchange 2007 performance with FC, iSCSI and NFS]
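
As a quick sanity check on the setup described above, the following sketch totals the resources assigned to the eight virtual machines and compares the nominal link speeds. The figures come from the post (reading the Fibre Channel link as 4 Gbit/s is an assumption), the variable names are only illustrative, and nominal link speed says nothing about the throughput actually measured in the paper.

```python
# Figures taken from the post; all names are illustrative only.
VMS = 8
USERS_PER_VM = 2000
RAM_PER_VM_GB = 14
VCPUS_PER_VM = 2

HOST_CORES = 4 * 4      # four quad-core Xeon X7350 CPUs
HOST_RAM_GB = 128

total_users = VMS * USERS_PER_VM
total_vcpus = VMS * VCPUS_PER_VM
total_vram_gb = VMS * RAM_PER_VM_GB

# Nominal link speeds in Gbit/s (assumption: the FC link is 4 Gbit/s).
links = {"FC": 4.0, "iSCSI": 1.0, "NFS": 1.0}
fc_advantage = links["FC"] / links["iSCSI"]

print(f"mailboxes: {total_users}")
print(f"vCPUs: {total_vcpus} on {HOST_CORES} physical cores")
print(f"vRAM: {total_vram_gb} GB of {HOST_RAM_GB} GB host RAM")
print(f"FC nominal bandwidth vs 1GbE: {fc_advantage:.0f}x")
```

Under these numbers the host is not overcommitted on CPU or memory, which is worth keeping in mind when reading the protocol comparison.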

Benchmark: SQL Server 2008 performance on VMware vSphere 4.0

Posted by Alessandro Perilli   |   Tuesday, July 07, 2009   |  

A couple of weeks ago VMware published a benchmark analysis titled Performance and Scalability of Microsoft SQL Server on VMware vSphere 4.

It assesses the performance of a SQL Server 2008 OLTP database hosted by a vSphere 4.0 virtual machine with 8 vCPUs and 58GB vRAM, and compares it against the performance of a physical system:

[Figures: SQL Server OLTP performance on vSphere 4.0 compared with a physical system]

Of course it would be much nicer to see how an OLAP database performs, to understand the reliability of virtualization in extreme (and real-world) conditions, but the chances of getting that seem pretty low.
Nonetheless, the paper above is extremely interesting and really worth a read.

Release: VMware VMmark 1.1.1

Posted by Alessandro Perilli   |   Tuesday, June 02, 2009   |  

While the virtualization community continues to debate the real-world value of the VMmark benchmark platform, VMware continues to update it.

VMmark 1.1.1 was released at the end of April with the following updates:

  • The VMmark harness has been updated to include automation of hypervisor and workload virtual machine reporting for disclosure
  • The reporting script has been updated to support a future version of VMware ESX
    (is this referring to vSphere?)
  • A new VMmarkConfigChecker script is included to help confirm that the workload virtual machines comply with the Run and Reporting Rules for VMmark
  • The disclosure.html template has been updated

Starting June 1st, VMware only accepts benchmark submissions scored with this new version of VMmark.

Anandtech challenges VMware with its own independent multi-hypervisor benchmark tool

Posted by Alessandro Perilli   |   Monday, May 25, 2009   |  

There’s no doubt that the virtualization industry needs a standard benchmarking platform. The only two options we have today are either simply ignored (Intel vConsolidate) or not recognized by all vendors (VMware VMmark).

Now even the specialized press, specifically Anandtech, is questioning the value of these platforms, suggesting that they may not use real-world workloads to test the hypervisors:

There are only two consolidation benchmarks out there: Intel's vConsolidate and VMware's VMmark. Both are cumbersome to set up and both are based on industry benchmarks (SPECJbb2005) that are only somewhat or even hardly representative of real-world applications. The result is that VMmark, despite the fact that it is a valuable benchmark, has turned into yet another OEM benchmark(et)ing tool. The only goal of the OEMs seems to be to produce scores as high as possible; that is understandable from their point of view, but not very useful for the IT professional. Without an analysis of where the extra performance comes from, the scores give a quick first impression but nothing more.

To prove its point, Anandtech has developed its own benchmark, vApus Mark I (developed by the academic group Sizing Server Lab), which is useful for comparing the performance of different CPUs in a virtual infrastructure running Windows guest operating systems. This is a fine first step, as most of the virtual machines deployed in the world run Windows, but in case some customers are not satisfied, the group is already developing a new version that features both Windows and Linux virtual machines.
The beauty of this work, assuming it has no flaws, is that it can be used with any hypervisor, including VMware ESX, Citrix XenServer and Microsoft Hyper-V.

The benchmark measures the performance of four virtual machines (each equipped with 4 vCPUs and 4GB vRAM) with four enterprise grade applications:

  • 1 x OLAP database, based on SQL Server 2008 x64 running on Windows 2008 64-bit, which runs Nieuws.be (over 100GB data organized in hundreds of tables)
  • 2 x MCS eFMS portals running PHP, IIS on Windows 2003 R2, described here.
  • 1 x OLTP database, based on Oracle 10G Calling Circle benchmark of Dominic Giles.

Because of this selection, the Sizing Server Lab firmly states that vApus Mark I is not meant to replace VMmark: VMmark mimics average virtual data center usage, while their own platform only replicates resource-intensive services.

The results are extremely interesting, as they highlight a performance difference between CPUs that does not match the numbers obtained with VMmark at all:

[Figure: vApus Mark I results]

The rest of the long article includes key information about the performance impact of CPU characteristics such as dual- and quad-core architectures, cache, memory bandwidth and clock speed.

The conclusions are surprising: if you run VMware ESX 3.5 Update 4 (ESX 4.0 will be used in future tests), then the Xeon Nehalem is without a doubt the fastest platform, but the latest quad-core Opteron is not far behind.

The entire Anandtech benchmark analysis is definitely worth a read while waiting for the likely outraged reaction from VMware.

Benchmarks: ESX vs Hyper-V vs XenServer

Posted by Alessandro Perilli   |   Friday, March 13, 2009   |  

No matter how hard you look, you are very unlikely to find a performance comparison that involves Citrix XenServer, Microsoft Hyper-V and VMware ESX.
The VMware End User License Agreement (EULA) specifically says that the company won’t recognize any 3rd party performance testing before it has the chance to review and approve the adopted methodology.

(before June 2006 the situation was even worse as VMware simply didn’t allow the publishing of any benchmark comparison)

Under these conditions, the chances that you’ll see an independent benchmark where VMware is outperformed by its competitors are zero.

Despite that, last week a group of brave reporters at Virtualization Review challenged VMware and published an independent analysis without asking for permission.

To ensure the validity of our test results and testing environment, Virtualization Review enlisted the help of Stuart Yarost to formulate and validate the test plan. Yarost is an ASQ Certified Software Quality Engineer and Certified Quality Engineer with more than 22 years' experience in the software and quality fields. Yarost currently holds the position of Vice Chair of Programs for the ASQ Software Division.

The results are more than interesting:

  • In our tests, Hyper-V did well in all categories-it's a real, viable competitor for the competition.
  • XenServer's test results are impressive, but are they enough to justify a replacement of your current hypervisor? For environments with virtualized systems that have a large number of CPU- and memory-intensive workloads, it may be a good choice. The caution is that those high I/O workloads flirt with not being good virtualization candidates, so some administrators might instinctively place these workloads on physical systems. Make no mistake, however: XenServer did extremely well, posting excellent performance numbers.
  • For the first two tests of heavy workloads, VMware underperformed both XenServer and Hyper-V. For the lighter workloads on the third test, the results were almost indistinguishable across the platforms, but ESX had the best results in three of the four categories.

Unsurprisingly, VMware is not happy and yesterday severely criticized Virtualization Review on its corporate blog, Virtual Reality, in the post A big step backwards for virtualization benchmarking.

The list of objections is long:

    • The fact that ESX is completing so many more CPU, memory, and disk operations than Hyper-V obviously means that cycles were being used on those components as opposed to SQL Server.  Which is the right place for the hypervisor to schedule resources?  It’s not possible to tell from the scarce details in the results.
    • All resource-intensive SQL Servers in virtual and native environments have large pages enabled.  ESX supports this behavior but no other hypervisor does.  This test didn’t use that key application and OS feature.
    • The effects of data placement with respect to partition alignment were not planned for.  VMware has documented the impact of this oversight to be very significant in some cases.
    • The disk tests are based on Passmark’s load generation, which uses a test file in the guest operating system.  But the placement of that file, and its alignment with respect to the disk system, was not controlled in this test.
    • The SQL Server workload was custom built and has not been investigated, characterized, or understood by anyone in the industry. As a result, its sensitivity to memory, CPU, network and storage changes is totally unknown, and not documented by the author.  There are plenty of industry standard benchmarks to use with hypervisors and the days of ad hoc benchmark tests have passed.  Virtual machines are fully capable of running the common benchmarks that users know and understand like TPC, SPECweb and SPECjbb.  An even better test is VMmark, a well-rounded test of hypervisor performance that has been adopted by all major server vendors as the standard measurement of virtualization platforms or the related SPECvirt benchmark under development by SPEC.
    • With ESX’s highest recorded storage throughput already measured at over 100,000 IOPS on hundreds of disks, this experiment’s use of an undocumented, but presumably very small, number of spindles would obviously result in a storage system bottleneck. Yet storage performance results vary by tremendous amounts. Clearly there's an inconsistency in the configuration.

VMware stresses that this analysis was not reviewed and approved, and that this kind of work is exactly why it does not remove the EULA restriction.
And to be absolutely sure that everybody knows about the flaws of this benchmark, this morning the company sent out an alert to its entire channel.

How did the other two vendors react?

Citrix has not commented so far, while Microsoft validated the study by linking to it from its corporate blog.
Now, if they want to defend the Hyper-V score in this benchmark, they had better publish a counter-analysis explaining why VMware is wrong.

Benchmarks: VMware ESX 3.5 Update 3 supports almost 70,000 concurrent ecommerce transactions

Posted by Alessandro Perilli   |   Wednesday, February 18, 2009   |  

In the endless war for the best performance, VMware today released a new, interesting analysis.

The company ran the SPECweb2005 benchmark on a single HP ProLiant DL585 G5 system with 16 cores and ESX 3.5 Update 3.

The industry standard platform simulates three typical workloads:

  • a number of customers accessing accounts at a given time via HTTPS
  • a number of customers accessing an e-commerce retail store via HTTP and HTTPS
  • a number of users acquiring patches and downloads from a support website via HTTP

In the first scenario ESX 3.5 could support as many as 80,000 concurrent accesses (equal to 143,000 HTTP operations per second), in the second almost 70,000 and in the last 33,000.

The aggregated and normalized metric is equal to 44,000, which is the highest score ever recorded with a 16-core system.

Benchmarks: Citrix XenDesktop 2.1 vs VMware View 3.0

Posted by Alessandro Perilli   |   Tuesday, February 17, 2009   |  

For the fourth time in a few days, benchmarks of Citrix and VMware desktop virtualization solutions (VDI, presentation virtualization and application virtualization) take center stage.
Is it a sign that somebody is getting nervous?

The first (non-sponsored) analysis came from two independent enterprise architects, Ruben Spruijt and Jeroen van de Kamp, who evaluated how XenServer, ESX and Hyper-V perform in VDI scenarios.

After an immediate reaction from VMware, a XenDesktop 2.1 Scalability Analysis popped up from Citrix (to be fair this document was released on Jan 12, days before the Spruijt/van de Kamp work, and further updated on Jan 27).

Then an independent performance comparison (commissioned by VMware) between Microsoft App-V, Symantec/Altiris SVS, VMware ThinApp and Citrix XenApp was released by the Exo Performance Network team.

The latest episode of this saga came out last week from the Tolly Group.

The test lab carried out an independent performance comparison (once again commissioned by VMware) between Citrix XenDesktop 2.1 Enterprise and VMware View 3.0 Premier.

As with any sponsored analysis, the results are easy to guess:

The VMware View 3 VDI solution deployed more simply and more rapidly than Citrix XenDesktop 2.1. VMware provided more comprehensive, efficient image and storage management of virtual desktops. It provides end-users with a quality of experience on the LAN that matches or exceeds that offered by the Citrix solution.

Citrix promptly answered on its corporate blog, dismissing the analysis because it covers unrealistic scenarios and evaluates an old product (XenDesktop 3.0 was released just two weeks ago):

There's a prominent sidebar that in the report that states that Citrix declined to participate in the testing - this is true, and I was the one that actually made that call and discussed it with Tolly Group. To their credit, Tolly Group did call us prior to beginning the testing and informed us of the project and shared the statement of work prepared for VMware. We asked some questions and provided some feedback about the testing methodology. I had serious concerns that the proposed tests did not reflect true customer use cases. For example, the user experience testing was only for a few productivity applications in a LAN environment - that was all that was planned, and it didn't seem to realistic based on what we've seen in real customer environments. Tolly took note of our concerns and asked VMware as the sponsor of the paper whether they would alter their approach.  Later we learned that VMware (not surprisingly) had rejected our suggestions and was not open to changing the proposed tests. At that point, it was clear that it made no sense to participate because…

Benchmarks: App-V vs SVS vs ThinApp vs XenApp

Posted by Alessandro Perilli   |   Wednesday, February 11, 2009   |  

While the virtualization community is still intensely discussing the benchmarks of XenServer, ESX and Hyper-V in VDI scenarios, provided by Ruben Spruijt / Jeroen van de Kamp and disputed by VMware, a new study surfaces.

This performance analysis, commissioned by VMware, shifts the focus from VDI to application virtualization, comparing Citrix XenApp 5.0, Microsoft App-V 4.5, Symantec SVS Pro 2.1 and VMware ThinApp 4.0.1.

The measurements were performed using the Devil Mountain Software (DMS) Clarity Suite: the Clarity Tracker Agent is deployed on the benchmarked Windows machines, the Clarity Studio produces workload simulation, and the results are uploaded for further analysis to the Exo Performance Network.

The conclusions are rather interesting:

  • Application virtualization solutions that use an embedded virtualization model (ThinApp) deliver the best application throughput. Only ThinApp delivers the combination of excellent raw performance plus low overall CPU utilization, making it the better solution for organizations seeking to minimize the performance “hit” typically associated with virtualization technology.
  • By contrast, solutions that employ a kernel-mode driver or service (App-V, SVS, XenApp) introduce additional layers of software complexity – including significantly higher kernel-mode activity – which translate into runtime overhead that slows the application and/or places an additional burden on the CPU. These agents also consume a considerable amount of memory, both directly – as part of the agent’s process – and indirectly, through expansion of the application’s working set.
  • Agent-based solutions also introduce a new and potentially catastrophic single point of failure (kernel mode execution) that IT organizations must factor into the testing and certification of their desktop computing stacks. Functional limitations, such as the lack of support for locked-down environments and/or inability to run on specific Windows versions (x64), further complicate the application virtualization equation, forcing IT shops to invest additional resources into designing infrastructure around these planning and deployment hurdles.

Read the whole document here.

Benchmarks: Citrix XenDesktop 2.1 Scalability Analysis

Posted by Alessandro Perilli   |   Monday, February 09, 2009   |  

Last week's discussion about XenServer vs ESX (vs Hyper-V) for VDI scenarios, ignited by Ruben Spruijt / Jeroen van de Kamp and followed up by VMware, is still hot.
So maybe it’s worth discussing the topic further by highlighting a recent paper published by Citrix: XenDesktop 2.1 Scalability Analysis.

The first part of the 29-page document describes how a Citrix XenDesktop infrastructure (including XenServer, XenApp and the Desktop Delivery Controller connection broker) was tested with Provisioning Server for Desktops (to deliver new virtual desktops) and EdgeSight (to simulate application workloads) to measure its scaling capability.

The analysis was summarized in the following XenDesktop Environment Sizing Guide:

[Figure: XenDesktop Environment Sizing Guide]

VMware reacts to the Virtual Reality Check benchmarks

Posted by Alessandro Perilli   |   Tuesday, February 03, 2009   |  

Just yesterday virtualization.info covered the amazing work of Ruben Spruijt (Solution Architect and CTO at PQR) and Jeroen van de Kamp (Enterprise Architect and CTO at Login Consultants), a couple of well-known and respected virtualization experts who lead two separate Citrix and VMware solution providers.

Their Virtual Reality Check project is a performance analysis of the leading hypervisors (VMware ESX, Citrix XenServer and Microsoft Hyper-V) when running typical Microsoft Terminal Services/Citrix XenApp workloads: a Windows XP virtual desktop loaded with Outlook 2007 and Acrobat Reader 8.

Unsurprisingly, the post achieved one of the highest page view scores in the history of virtualization.info, even though other prominent influencers had already covered the project the previous week.

The non-sponsored results published by Spruijt and van de Kamp generated a lot of reactions, as their conclusions on Citrix XenServer are:

Not having the ability to overcommit virtual machine memory is a clear disadvantage when virtualizing desktops. Such a feature allows much more VM’s to be run than physical memory normally would allow, which makes a virtual desktop solution much more economical.

XenServer is clearly optimized for Terminal Server and XenApp workloads, achieving near bare metal performance and even higher user densities than bare-metal configurations. This is possible because 32-bit 2003 terminal server with 4GB memory is relatively very efficient in comparison to other Windows operating systems.

While Microsoft didn’t comment (it has no interest in doing so), VMware reacted immediately: the company’s performance team published a new benchmark just a few days (Jan 30) after the Virtual Reality Check project was announced (Jan 26).

The VMware performance study compares XenServer 5.0 and ESX 3.5.0 Update 3 performance when running Citrix XenApp workloads and highlights some odd results compared to what Virtual Reality Check exposed:

ESX supports about 13% more users than XenServer at a given latency while using less CPU.

Why are the benchmarks so different?

Stats and polls can be read in several different ways and manipulated as needed.
Simon Crosby, the CTO of the Virtualization and Management division at Citrix, offers a possible reading:

the VMware "study" is not a thorough exploration of a valid set of parameters for the Terminal Services / XenApp workload.  Instead, it is a narrow look at a particular set of configurations which are not reasonable in practice:

  • No test of 32 bit workloads - the primary candidates for server consolidation for this workload because a 32 bit OS exhausts its memory at 4 GB and a modern server can pack hundreds of GB and many cores.  Our work in this area has shown a compelling benefit to virtualizing TS/XenApp 32 bit workloads on XenServer, and an equally compelling set of reasons not to use ESX for this purpose.
  • Unrealistic configuration - The server used in the tests is certainly punchy - the machine had 64 GB RAM and 4 processors--each with 4 cores (16 total processor cores).  Anyone familiar with 64b TS/XenApp knows this machine could easily  support hundreds of XenApp sessions.  But the "scientists" at VMware don't.  They instead chose to run exactly one VM (with only 2 vCPU's and using only 25% of the available memory) and XenApp at minimal levels of concurrency (i.e. 10-40 users).  No multi-VM scenarios, no tests at useful user-counts.  Based on their measurements they appear to gleefully extrapolate deeper into the realm of fiction to proudly pronounce their horse the winner.

At this point we would like an additional comment from Ruben Spruijt and Jeroen van de Kamp as their work is somewhat questioned by the new VMware study.

Benchmarks: ESX vs XenServer vs Hyper-V for Terminal Services and VDI workloads

Posted by Alessandro Perilli   |   Monday, February 02, 2009   |  

Last week a couple of well-known and respected virtualization experts, Ruben Spruijt (Solution Architect and CTO at PQR) and Jeroen van de Kamp (Enterprise Architect and CTO at Login Consultants), launched a remarkable project called Virtual Reality Check.

The non-sponsored joint effort produced a set of valuable benchmark comparisons between VMware ESX, Citrix XenServer and Microsoft Hyper-V when running Windows XP and Vista virtual machines for Terminal Services and VDI environments.

To measure the hypervisors' performance they used the recently released, free-of-charge Login Virtual Session Indexer (VSI) and performed over 150 tests.

The best part of the documents released so far is that they carefully analyze the impact of several configuration changes for each hypervisor, suggesting which setup performs best.

If you are planning a VDI infrastructure, the performance analysis these virtualization professionals wrote is mandatory reading.


By the way: Ruben Spruijt and Jeroen van de Kamp will speak at virtualization.info’s Virtualization Congress 2009, in Las Vegas.

On stage the two will discuss the results, unveil additional details that were not published and preview the upcoming new tests.
Additionally, they will teach how to set up a benchmarking facility using Login VSI.

Be sure to check their presentation abstract here.

VMware forms a panel to review the VMmark benchmarks

Posted by Alessandro Perilli   |   Monday, December 29, 2008   |  

A year and a half after its launch, the benchmarking platform that VMware called VMmark has gained some serious traction among OEMs.

The results page shows more than 30 analyses submitted by the biggest OEMs, including Dell, HP, IBM, Sun and Unisys.

Predictably, VMmark got zero acceptance from the other virtualization vendors, making the tool only partially useful.
Despite that, VMware's competitors have done nothing to seriously develop a common standard, or at least to adopt the only alternative available today: Intel vConsolidate.
Their only action in the last 18 months has been to join the SPEC virtualization benchmarking group. It’s unclear what progress the project has made so far.

While waiting for SPEC, VMware is trying to further involve the industry players by forming a review panel.
In theory the panel should guarantee a more transparent evaluation of the submitted benchmarks, creating the conditions for wide adoption of VMmark.
In practice the panel membership is by invitation only.

The founding members of the new VMmark review panel are AMD, Dell and HP.
Unless VMware changes the admission rules and at least Citrix and Microsoft join in, the effort will not change much.

Running SQL Server in a virtual machine for OLTP workloads

Posted by Alessandro Perilli   |   Monday, December 01, 2008   |  

Recently both Microsoft and VMware released papers about running SQL Server in a virtual machine for Online Transaction Processing (OLTP) workloads.

Microsoft, which is supposed to know its own product better than anybody else, concludes:

From a performance perspective, Hyper-V is a viable option for SQL Server consolidation scenarios. The overall performance of SQL Server running in a Hyper-V virtualized environment is reasonable compared with the equivalent native Windows Server 2008 environment.

With proper I/O capacity and configuration, the I/O overhead is minimal. For best performance, you should have enough physical processors to support number of virtual processors configured on the server to avoid overcommit CPU resources. The CPU overhead increases significantly when the CPU resources are overcommitted. It is important to test each application thoroughly before you deploy it to a Hyper-V environment in production.

Unfortunately, whoever authored the VMware paper took great care to hide the version of SQL Server used for the benchmarks.

Nonetheless, the two papers make for a very interesting (if probably unfair) comparison.
Readers may also want to look at older benchmarks measuring SQL Server 2005 performance in VMware environments.

Benchmarks: Hyper-V performance on Dell R900 with Quad-Core and Six-Core Intel Xeon

Posted by Alessandro Perilli   |   Tuesday, October 21, 2008   |  

Recently Dell published a very interesting benchmark measuring Microsoft Hyper-V performance on its new R900 server (with Intel Quad-Core and Six-Core Xeon) against an HP ProLiant DL585 G2 (with Intel Quad-Core).

The top R900 system features 4 x E7450 Xeon @ 2.4GHz and 128GB RAM.

This system handled 40 Hyper-V virtual machines (1 vCPU and 2GB vRAM each), each running a Windows Server 2008 64-bit guest OS with SQL Server 2005 64-bit.
The overall CPU utilization with this configuration reached 80%, serving 74,084 orders per minute.

Compared with the other two systems, this R900 performed 27% better than the HP machine (which can serve no more than 26 virtual machines) and 8% better than the other Dell machine with Quad-Core CPUs (serving no more than 30 virtual machines).
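
The percentages above can be turned into rough absolute numbers. The sketch below assumes that "X% better" means the top R900 delivered X% more orders per minute than the other system. The post does not state this, so the derived throughputs are estimates rather than figures from the Dell paper, and the names used are purely illustrative.

```python
TOP_OPM = 74_084      # orders per minute on the six-core R900 (from the post)
TOP_VMS = 40          # virtual machines hosted by the six-core R900

# (assumed throughput advantage of the top R900, max VMs) per other system
others = {
    "HP ProLiant": (0.27, 26),
    "Dell R900 quad-core": (0.08, 30),
}


def implied_opm(top_opm: float, advantage: float) -> float:
    """Throughput of a system that the top one beats by `advantage`."""
    return top_opm / (1.0 + advantage)


print(f"top R900: {TOP_OPM / TOP_VMS:.0f} orders per minute per VM")
for name, (advantage, vms) in others.items():
    opm = implied_opm(TOP_OPM, advantage)
    ratio = TOP_VMS / vms
    print(f"{name}: ~{opm:.0f} OPM, top R900 hosts {ratio:.2f}x as many VMs")
```

Note that the gain in virtual machine count (40 against 26 or 30) is larger than the gain in throughput under this reading, so the per-VM load was presumably not identical across the three systems.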

This is one of the first performance studies for Hyper-V and it’s worth a full read.
