Debunking the Blue Pill myth

Friday, August 11, 2006   |   14 Comments

For months the security and virtualization industries have been discussing a new security threat: Blue Pill.

Blue Pill is the prototype resulting from a security study by Joanna Rutkowska, which took advantage of the new virtualization capabilities of AMD processors (known as SVM and previously as Pacifica) to inject a rootkit into a running Vista operating system (check the related Black Hat 2006 presentation).

The world press has given this work a great deal of attention, often reporting misleading information, both because the scenario involved the upcoming Microsoft operating system and because Ms. Rutkowska claimed that malware using this method is undetectable.

Assuming every reader has already discovered, by reading the follow-up to the original post or the analyses of other security professionals, that this method does not exploit any flaw in the operating system, the claim of undetectability still stands.

virtualization.info met Anthony Liguori, Software Engineer at the IBM Linux Technology Center and, above all, one of the developers behind the Xen hypervisor, to finally debunk the Blue Pill myth.


virtualization.info: Anthony, thank you for agreeing to spend some time with virtualization.info and its readers. First of all, would you explain your exact involvement in the Xen project and how long you have been working in this role?

Anthony Liguori: I've been working on the Xen project for approximately 2 years. Like most Xen hackers, I do a little bit of everything. I'm on an extended vacation right now but normally I'm fortunate enough to be a full-time Xen hacker.

My primary interest in Xen is desktop virtualization. Recently, I've been involved in a number of desktop-related features including graphics virtualization and HVM VNC support. I'm also working on a number of interesting features that are in progress, such as HTTP block device support and a high performance graphics component.

In the past, I've been involved in Xen's web services interface and the libvirt API.

I'll also take this opportunity to point out that my comments here are my own opinions and do not represent the opinions of IBM or the Xen project :-)


VI: Let's talk about the Blue Pill announcement: can you explain to us in simple terms the scenario considered by the researcher?

AL: Rutkowska claims to have created a 100% undetectable piece of malware.

The basic idea behind her claim is that one could create a piece of malware that also was a Virtual Machine Monitor. If the VMM could take over the host Operating System (imagine if you could launch Xen on a running copy of Windows and instantly have the previous Windows system be a virtual machine), then it could potentially hide a virus from that virtual machine by remaining within the VMM.

Having a VMM take over a host operating system would be very difficult. It's not outside of the realm of possibility but it would take a huge engineering effort.

However, for this malware to be successful, it would not only need to be able to take over the host Operating System, but it would also need to prevent that operating system from being able to detect that it was now a virtual machine.

While the former is at least possible (albeit tremendously difficult), the latter is not possible, which means that anti-malware software will always be able to detect this sort of attack.


VI: Where are the risks in this scenario?

AL: If a virus cannot be removed or detected, it's pretty much a worst-case scenario for corporate security. Once there was an outbreak, you couldn't trust any of your systems at all. I'm not sure how one could even mitigate such a threat--perhaps do frequent reinstalls of every system on your network?

It's really a doomsday scenario which is why it's gotten so much press.

Malware at the VMM level could potentially install keyloggers, provide remote access to disk, sniff passwords from a VM's memory, pretty much any evil thing that can be imagined.


VI: In your blog, Tales of a Code Monkey, you said Blue Pill claims are unfounded and malware is always detectable. Why?

AL: It's been a fair bit of time since the post you are referring to. Since then, more details have come out about Rutkowska's prototype. I should mention that this prototype is very detectable.

All it does is turn on SVM, and set up a small piece of memory that is called periodically. It makes no attempt, currently, to hide that memory from the operating system so one could simply search all of physical memory.

However, even if she builds a full VMM with proper memory protection (which is no small task), there would still be a way to detect it.

Hardware virtualization requires a technique known as "trap and emulation". The idea is that the hardware traps certain instructions and the VMM emulates those instructions in such a way as to make the software believe it is running directly on the hardware rather than in a virtual machine.

Software emulation implies that these instructions take much longer to complete when executed under a VMM than on normal hardware. This fact is what can be used to detect the presence of a VMM.
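
The detection idea described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything from Blue Pill or Xen: the names `time_loop` and `looks_virtualized`, the `threshold` value, and the placeholder operation are all hypothetical. Python cannot issue a trapping instruction itself, and a real detector would need a time source the VMM cannot adjust; `time.perf_counter` is only a local stand-in.

```python
import time
from typing import Callable


def time_loop(op: Callable[[], None], iterations: int,
              clock: Callable[[], float] = time.perf_counter) -> float:
    """Return the seconds taken to call op() `iterations` times.

    `op` stands in for an instruction that a VMM would have to trap and
    emulate; `clock` is injectable so a trusted time source can be used.
    """
    start = clock()
    for _ in range(iterations):
        op()
    return clock() - start


def looks_virtualized(measured_seconds: float, baseline_seconds: float,
                      threshold: float = 10.0) -> bool:
    """Flag a run whose duration exceeds the native baseline by `threshold`x.

    `baseline_seconds` is assumed to have been calibrated on hardware known
    to be running without a VMM. The threshold of 10 is an arbitrary choice.
    """
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return measured_seconds / baseline_seconds >= threshold
```

The comparison is on the ratio of the two durations rather than their difference, so a coarse clock is good enough as long as the loop runs for long enough.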

I approached Rutkowska about this and she attempted to address it in her prototype by adjusting one of the processor's clocks on every exit. However, there is nothing that she can do about external time sources and she's admitted to this on her blog.

She refers to this as a theoretical weakness in her system but I assure you that it is quite practical to exploit.

Keep in mind too that this level of sophistication is not even necessary with the current Blue Pill prototype. Before it became necessary, she would have to get Blue Pill to the point where it was as good a VMM as Xen or VMware ESX. That's no small task!

This general approach can be used to detect any VMM--Xen, VMware, Virtual PC, z/VM, etc. In fact, this is a prediction from one of the earliest papers on virtualization (the Popek/Goldberg paper).


VI: Ms. Rutkowska stated several times that Blue Pill doesn't exploit any bug at the operating system or hardware level. While this is true, do you think there is something vendors could do to prevent this kind of risk?

AL: I know that ThinkPads come with a BIOS setting to enable or disable virtualization technology. In fact, it is disabled by default.

I assume most vendors provide a BIOS setting to enable or disable VT/SVM. If a problem were found, vendors could simply disable the extension until AMD or Intel fixed the problem.

With that said, I strongly doubt this will ever be necessary.


VI: Ms. Rutkowska developed her prototype to work on machines where AMD SVM is available. Could this approach also work with Intel Virtualization Technology? If not, why?

AL: Well, if this approach were valid (which it's not), it would be equally applicable to VT. The two technologies, in their current forms, are almost completely identical except for some minor differences in performance characteristics.


VI: Looking at Ms. Rutkowska's demonstration, Austin Wilson, Director of the Windows Client Group at Microsoft, said the company will try to prevent such a scenario in the upcoming Vista operating system. What do you think can really be done at the operating system level to mitigate the risk? Is this something the Linux, BSD and Solaris communities should also look at?

AL: I wouldn't lose a bit of sleep over this particular threat. I don't feel there is any new risk here at all.

There is some interesting security research on the horizon though and much of it has a huge intersection with virtualization.

A particularly interesting topic is attestation. Briefly, attestation is the ability to validate that the only software running at a moment in time is the software that is supposed to be there.

Currently, anti-malware software has to look specifically for known threats. Attestation lets you do something much stronger. Attestation allows you to validate that there are no unknown threats.

Imagine anti-virus software that doesn't need to be updated--ever. With attestation, there is no such thing as zero-day threats.
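
As a rough illustration of the idea, the sketch below folds a hash of each software component into a running register and compares the result with a known-good value. It is a simplified model, loosely patterned on how a TPM register is extended, and not an actual TPM or Xen interface; the function names, the use of SHA-256, and the component list are assumptions made for the example. Real attestation also relies on hardware-protected registers and signed reports, which are omitted here.

```python
import hashlib
import hmac
from typing import Iterable


def measure(component: bytes) -> bytes:
    """Return the SHA-256 digest of one software component."""
    return hashlib.sha256(component).digest()


def extend(register: bytes, measurement: bytes) -> bytes:
    """Fold a measurement into the register: H(register || measurement)."""
    return hashlib.sha256(register + measurement).digest()


def measure_chain(components: Iterable[bytes]) -> bytes:
    """Measure components in load order, starting from an all-zero register."""
    register = bytes(32)
    for component in components:
        register = extend(register, measure(component))
    return register


def attest(components: Iterable[bytes], expected: bytes) -> bool:
    """True only if exactly the expected components were loaded, in order."""
    return hmac.compare_digest(measure_chain(components), expected)
```

Because every loaded component changes the final register value, an unexpected extra component makes the check fail even if nobody has seen that component before, which is the contrast with signature-based scanning drawn above.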

It is somewhat ironic that Rutkowska chose SVM, as the 'S' in SVM stands for "secure", specifically because AMD introduced special processor extensions for dynamic attestation alongside the virtualization extensions. Attestation depends heavily on the existence of a TPM chip, and I should mention that Xen is, I believe, the first VMM to provide TPM virtualization, which ought to enable all sorts of interesting security research to be done in Xen in the future.

Virtualization is particularly important for attestation because it provides a much smaller trusted computing base than a traditional operating system. In reality, virtualization provides a much stronger security platform than a traditional operating system would.



Update: Keith Adams, Virtual Machine Monitor (VMM) Engineer at VMware, is back on the topic on his personal blog. When the Blue Pill research was published he labelled it quasi-illiterate gibberish, and now he reports:

Well, first of all, SVM and VT make possible nothing that was not already possible before; VMware's software-only products are an existence proof. The BluePill-istas don't claim that SVM/VT make new exploits possible per se; rather, the claim is that SVM/VT make it possible to cloak the presence of a VMM rootkit completely.

Allow me to go on record: this claim is pure fantasy. In practice, it is always possible to detect the presence of a VMM, via timing attacks...


Second Update: After almost a year, Blue Pill is again a popular topic. And Keith Adams, Virtual Machine Monitor (VMM) Engineer at VMware, slams Joanna Rutkowska's claims once again:

...I've seen zero evidence that Rutkowska has pondered resource-based detection attempts like this, or indeed, any attacks more sophisticated than a "go-slow" loop between reads of the PIT. It is hard for me to imagine a "hypervisor" worthy of the name that doesn't leave noticeable traces in resource usage. In fact, to the degree that a hypervisor goes to heroic lengths to prevent such a detection technique, e.g., by running a hardware-accurate cache simulator on every guest memory access, it will only open up wider timing discrepancies for the guest's HV-detector to exploit.

I can only conclude that in 2006 Rutkowska was either naive about the possibilities of such attacks, or that she consciously chose to make an outrageous and indefensible claim ("undetectable malware!!!!") in order to draw attention to herself and her company. Given the peripheral evidence of Rutkowska's competence, I think the evidence favors the latter, but I'd simply love to hear from her on the subject...

Comments

Detecting it is possible, as discussed here (through timing skew analysis):

http://www.openrce.org/forums/posts/198#697

By Blogger Peter Teoh & Magdalen Tan, at Friday, August 11, 2006 9:55:00 AM 

Wow, the amount of misinformation by this man Liguori is amazing. I attended both Joanna's and Dino Dai Zovi's presentations on hypervisor rootkits. Many of Liguori's claims were completely incorrect, as solutions were presented during the presentations.

It amazes me that a Xen developer can spread so much false information. Mr. Liguori please stick to what you know, not security, and try...just try to get all the facts on the matter before retorting with uninformed knowledge.

-ErikC

By Anonymous Anonymous, at Saturday, August 12, 2006 4:21:00 PM 

ErikC,

By request, my post was not overly technical. However, if you review my older blog post (or even the link Peter posted), you'll see that the TSC is not the main issue (which is what I believe Joanna discussed).

The real problem is an external time source. Thanks to Rice's Theorem, a VMM cannot, in general, detect that the guest is using an external time source. There is no way to mitigate this.

If you have specific points, please post them and I'd be happy to address them.

By Blogger Anthony Liguori, at Saturday, August 12, 2006 4:39:00 PM 

Liguori,

You really should watch the video of Joanna and Dino's talks given at this past Black Hat. You are obviously going on second-hand information and do not know what you are talking about.

Yes, the time-skew calculations are doable, but we are talking about very small amounts of 'actual' time. That amount of time difference is too small to be accurately and repeatedly tested against an outside time source (NTP, etc).

Yes, it is likely possible to detect that your OS is running in a VM. But you make it sound trivial when in most cases it would be QUITE difficult.

Please, please don't spread FUD without providing tangible backing to your claims.

-irby

By Anonymous Irby, at Saturday, August 12, 2006 7:09:00 PM 

Besides,
What's to stop the VMM from modifying the external clock data once it comes into the system?

-irby

By Anonymous Irby, at Saturday, August 12, 2006 7:19:00 PM 

The "blue pill" is overrated:
People could just run Vista or whatever in a VMM right at the beginning.

Call that sort of thing a white pill if you wish.

The white pill could then intercept or scan for naughty stuff.

Once the blue pill is already in "the Matrix" from the very start, it's trapped - it can relaunch Vista in its own Matrix, but that can still be detected by the "white pill".

By Anonymous Anonymous, at Saturday, August 12, 2006 8:31:00 PM 

This blog has been featured on Slashdot: http://it.slashdot.org/article.pl?sid=06/08/12/130204

Lots of comments have been made there.

By Anonymous Anonymous, at Saturday, August 12, 2006 8:54:00 PM 

irby,

I actually discussed this with Joanna on her blog prior to her giving any talks.

The absolute time difference is not important here; it's the ratio of the two times that's important. A hypervisor exit takes thousands of cycles, so the difference is orders of magnitude.

One simply has to disable interrupts, and then repeat a trapping instruction a large number of times. Really, what you want is to run the instruction enough times so that you know it would take a certain number of seconds on native hardware.

Here's a concrete example. On normal hardware, say the rdmsr instruction takes 10 cycles. Under Blue Pill, it would take at least 1000 cycles (which is a difference of two orders of magnitude).

Let's then say that there is a 0.5 second uncertainty with an NTP server on average. This means the time you get from the NTP server may be off by +0.5 or -0.5 seconds by the time you get around to processing it.

If you knew that an rdmsr loop would last for 2 seconds if you ran it N times (imagine the OS calibrates this loop upon installation), then if you ran it on native and used NTP to time it, you'd get anywhere from 1.5 to 2.5 seconds as the time it took.

Under Blue Pill however, when you ran the same loop, you'd observe that it took anywhere from 199.5 to 200.5 seconds. Clearly, such an extreme difference is an indicator that something is wrong.
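
The arithmetic in this example can be checked with a short Python sketch. The cycle counts, the 2-second calibration, and the 0.5-second clock uncertainty are the figures quoted above; the function names are hypothetical, and the model assumes the loop time is dominated entirely by the trapped instruction.

```python
def observed_window(native_seconds: float, slowdown: float,
                    clock_uncertainty: float) -> tuple[float, float]:
    """Range of durations an external clock could report for the loop."""
    true_duration = native_seconds * slowdown
    return (true_duration - clock_uncertainty,
            true_duration + clock_uncertainty)


def windows_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


NATIVE_CYCLES = 10      # rdmsr on bare hardware, per the example
TRAPPED_CYCLES = 1000   # rdmsr under a VMM, per the example
CALIBRATED_SECONDS = 2.0
NTP_UNCERTAINTY = 0.5

native = observed_window(CALIBRATED_SECONDS, 1, NTP_UNCERTAINTY)
trapped = observed_window(CALIBRATED_SECONDS,
                          TRAPPED_CYCLES / NATIVE_CYCLES, NTP_UNCERTAINTY)

print(native)    # (1.5, 2.5)
print(trapped)   # (199.5, 200.5)
print(windows_overlap(native, trapped))  # False
```

Since the two windows do not overlap, even a clock that is off by half a second in either direction is enough to tell the two cases apart.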

As for Blue Pill modifying the external clock, first Blue Pill would have to recognize that the Operating System is attempting to determine that it is in a VM using an external time source.

Rice's Theorem implies that teaching Blue Pill how to do that reliably is impossible. It's equivalent to determining whether a program has a bug, or determining whether a program will loop forever. You can build heuristics, but you can't solve the problem in general.

By Blogger Anthony Liguori, at Saturday, August 12, 2006 8:55:00 PM 

Regarding spoofing NTP: Surely you can use an external clock (e.g. another computer, radio, even your wristwatch) provided the number of instructions is sufficiently large.

The OS may be unable to see beyond the system within which it is embedded but you certainly can.

The proviso that the number of instructions be large is not restrictive, since it is assumed the user can execute any arbitrary instruction any number of times.

By Anonymous Anonymous, at Saturday, August 12, 2006 10:23:00 PM 

I posted these suggestions on JR's site earlier today, but they haven't appeared yet, due, I imagine, to the blog author's approval delay!

Could some way be engineered to introduce a highly accurate real-time clock (RTC), e.g. an atomic clock, into the equation somewhere for timing purposes? This could consist of not one but, say, two RTCs, internal and/or external to the PC, or just one RTC external to the PC with its timing pulses also fed into the PC. A comparison could then be made to reveal any timing anomalies.

Some ideas might not be feasible for all sorts of reasons, but at least they get discussed and thought about. That's often how progress is made: by working through ideas that maybe won't work as they are, but either have some merit or potential, and/or give other people ideas and spur them on to produce something that will work.

Spanner

SpannerITWks - hereNthere.com

By Anonymous SpannerITWks, at Sunday, August 13, 2006 12:30:00 AM 

As I stated back in 2003: Microsoft needs a Virtual Server for backward compatibility for its NGSCB ( Next Generation Secure Computing Base ) DRM ( Denial of Rights Mechanism ) platform.

However, "Remote Attestation" does little to protect the end user from the inevitable remotely exploitable vulnerabilities in media viewers and, along with OS-level DRM encryption, offers malware the ability to store content and, without the keys to decode that content, keep it hidden from any forensic analysis.

See "Remote Attestation" and content access monopolies

By Blogger David, at Sunday, August 13, 2006 3:39:00 AM 

Easy question: what if the malware code were inserted into a branch of Xen and had access to all of Xen's features?

By Anonymous rdircio, at Tuesday, August 28, 2007 5:36:00 PM 
