Oculus Rift: A Bug with Windows Power Plan Configuration
I pre-ordered the Oculus Rift, a virtual reality headset, on January 6th of this year. After following the Rift from its Kickstarter campaign to trying it in person at PAX 2014, I've been waiting for the promise of virtual reality for years. Last week, my dream finally came true-- I received one of the very first Oculus Rift "CV1" headsets. After plugging it in and launching demos that cannot be explained with words, I noticed something wrong with my PC. The CPU fan was going crazy, and the sensors were reporting abnormally high temperatures.
TLDR: EVE: Valkyrie-- a game for the Oculus Rift-- has a problem where it may occasionally change your power management settings at the wrong time, resulting in your processor not throttling down and potentially causing a lot of heat and noise. This was exaggerated by my old, power hungry processor, but might still affect others. You can jump right to the fix if you're not interested in my thought process to diagnose the issue.
My computer is relatively old-- in fact, it's based on an OEM HP PC. My motherboard is some obscure brand I can only assume sells exclusively to HP, and my processor is a nearly-6-year-old Intel 970. Though it has 6 true cores, it often can show its age when I try to run modern software. The PC originally came with an NVIDIA GTX 580-- a top of the line GPU of its time-- but I've swapped it out with a newer NVIDIA GTX 970. The old OEM HP case was also beginning to fall apart, with the metal warped and screws rattling inside, presumably from one of the many times I've opened the computer up. I've since switched to a Fractal Design r5, which completely hides that this computer is still based on an OEM model from 2010.
Both Oculus and HTC/Valve have a recommended specification for virtual reality headset users. On the GPU front, I match the recommended specification with my NVIDIA GTX 970. But due to the "Constellation" head tracking system that Oculus uses, they have a recommended CPU model that is many years newer, more powerful, and more efficient than my aging Intel 970. This initially concerned me, but I decided to defer the potentially messy motherboard-CPU-RAM upgrade until some later time. My confidence increased once Valve released a virtual reality benchmark suite, which put my computer as a whole squarely into the "Good for VR" category. Though Valve themselves indicated that the benchmark suite didn't really represent CPU readiness, the fact that my computer could run the benchmark was still reassuring.
An Ominous Whir
I received my Oculus Rift on Thursday, March 31st, and excitedly set it up for the first time. I had pre-loaded the software onto my computer a couple of days earlier during the "launch day"1, so I was almost ready to go. Though my impressions of the Oculus Rift and the arrival of consumer-ready virtual reality are enough for a post in themselves, it's safe to say that I was incredibly impressed. The "CV1" was a remarkable improvement on the older development kits, and I was extremely satisfied with my purchase.
Though I was blind to it when I used the headset myself (thanks to the audio, which I admittedly was impressed with), when I demoed the headset to others I noticed a fan inside of my case kicking into high gear. I originally was unconcerned, since I realized my aging processor likely struggled with this new, demanding technology.
After an exciting day of demoing the new technology to my family and having some time to play with it myself2, I shut down my computer for the day. When I started it back up the next day, however, I noticed something was extremely wrong: without any software running and immediately after my computer booted up, that same fan I noticed earlier was running at full force. After some investigation (read: putting my ear to my computer to find the source of the sound), I figured out that the sound was from my CPU fan running at 100%.
Diagnosing the Problem
The first thing to check was obviously the built in Task Manager. My initial thought was that there was some rogue process running my CPU into the ground, and though Task Manager indicated my CPU was at 100% load, there weren't any processes that were obvious culprits. Chrome had occasional spikes into the 50%-usage territory, but it wasn't anything out of the ordinary.
When I switched to the "performance" tab, I noticed something odd. The CPU graph indicated 100% usage, and my processor was running at the full 3.3 GHz without throttling down.
One thing to note with recent Intel processors is their ability to throttle down under low loads. This feature-- called EIST, or SpeedStep-- changes the CPU multiplier to reduce the clock speed, and as a result the power usage and heat generated. This is a good thing under normal circumstances since less heat is generated, and therefore the life of your processor is increased.
On a normal PC, if you look at the "performance" tab of Task Manager you will see the "Speed" number fluctuate. At high loads it will likely reach or exceed your processor's rated clock speed3, but when sitting in Chrome or checking your email, you will likely see this number drop to a fraction of the normal speed. For example, my Intel 970 runs at 3.2 GHz, though I regularly see it drop to 1.57 GHz.
However, this throttling didn't occur, and the utilization remained at a constant "100%". I looked at the "details" tab of Task Manager-- which shows all processes, including the System Idle Process-- and everything looked normal as well. In fact, if I excluded the Idle process and added up the CPU usage numbers, I arrived at a total of only 2-5% CPU usage. This was nothing like what the other sections of the Task Manager reported.
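The sanity check I did here (adding up every process except the Idle process) is easy to sketch in Python. The process list below is made up for illustration and is not my actual readings, and `non_idle_cpu_percent` is just a name I picked for this example.

```python
# Hypothetical per-process CPU percentages, shaped like the "Details" tab.
# These figures are illustrative only, not measurements from my machine.
SAMPLE_PROCESSES = [
    ("System Idle Process", 97.0),
    ("chrome.exe", 1.5),
    ("explorer.exe", 0.5),
    ("dwm.exe", 1.0),
]

# Names to leave out of the total; the Idle process is not real work.
IDLE_NAMES = {"System Idle Process"}


def non_idle_cpu_percent(processes):
    """Sum CPU usage for every process except the Idle process."""
    return sum(pct for name, pct in processes if name not in IDLE_NAMES)


if __name__ == "__main__":
    total = non_idle_cpu_percent(SAMPLE_PROCESSES)
    print(f"Non-idle CPU usage: {total:.1f}%")
```

If that total is only a few percent while the overall graph sits at 100%, the "load" is almost certainly coming from somewhere other than a normal process.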
I also used the excellent "HWMonitor" program to check my temperatures. My CPU ran at approximately 80 degrees Celsius, which was the obvious cause of the fans.
Resource Monitor also reported 100% CPU usage across all 6 cores, though like the "details" tab in Task Manager, there were no processes that used more than 1-2% CPU. I thought that this might be some sort of bug, so I checked a third party task manager called "Process Explorer". It too reported that all processes used less than 1-2% total CPU usage, despite the high temperatures and obvious problem.
As a last resort, I performed a full scan of my computer with both Windows Defender and MalwareBytes, hoping that it wasn't the result of a virus or rootkit. Scans revealed nothing, to my relief.
At this point, I was dumbfounded. Something prevented my CPU from throttling down, yet nothing consumed more than a single percentage point of my CPU with the exception of the System Idle Process, which was normal. I thought back to when the problem started and tried to remember what I installed: nothing. Remember, I installed the Oculus software much earlier in the week. The only significant change was that I plugged in the Rift and tracking camera, but I already had unplugged those to see if their mere presence caused the CPU issue to occur. I was partially in denial-- my new toy was the only thing that had changed, yet I didn't want to accept that it was the culprit. There wasn't really any evidence to suggest this, anyhow.
As a last ditch effort, I booted into Safe Mode. This-- supposedly-- would let me know if it was third party software responsible. As my computer turned back on, I was crossing my fingers. My heart sank as I heard the fans spin back up at full force. I checked the Task Manager once again, and my fears were confirmed-- 100% CPU usage.
Out of ideas and desperate to get back into the virtual realm, I simply reformatted my computer and hoped this was the end of it.
As I reinstalled my software, I had doubts in the back of my mind. I knew that the last major change to my computer was related to the Rift, so I worried that installing the software again would bring the issue back and I would be stuck with the hard choice of letting my CPU run hot or purchasing a new and more efficient motherboard and CPU. Some hours later, after the software, my drivers, and everything else I needed were back on my computer, I once again tensed up as I launched a Rift game.
Though the fans spun up while I played Lucky's Tale, shutting down the game and pulling the headset off resulted in the fans once again lowering to a low hum. It wasn't my Rift.
Or, so I thought.
The Problem Returns
The very next day, the problem relapsed. I was heartbroken-- the Oculus software and Rift games were literally the only pieces of software that I had installed, and had to be the culprit. I once again went through the steps: Task Manager, Resource Monitor, Process Explorer, and HWMonitor to verify temperatures. All of the same symptoms had returned.
I eventually checked the power management settings built into Windows. I knew that there was a setting for "Maximum Processor State" and "Minimum Processor State", and there was a possibility those were changed somehow. They were normal, though-- 5% minimum, and 100% maximum. To my surprise, when I changed the "Maximum" value to 50%, the processor speed also dropped to about half in the Task Manager (3.2 GHz to about 1.57 GHz), and the fans spun down. The "Utilization" percentage of my CPU, however, was now locked to 50%. I suppose it made sense-- I capped the processor's speed to 50%, so now whatever was using up my entire CPU was just using all of the 50%.
Instead of this being an issue with Windows, maybe it was a problem with Intel's EIST-- that, too, could have been disabled. There are a couple of programs that can show the status of Intel EIST, including "RealTemp". However, to my dismay, EIST was enabled as far as I could tell.
Once again, I had to hunt for any leads I could. At this point, I was doubtful that there was any process on my machine that was actually using up 100% of my CPU, so I focused on power management and the throttling issue.
The Discovery and Fix
Eventually, I came upon the documentation for the Windows power management configuration. Though Windows exposes a great deal of power configuration options in the control panel, there are many others that are not exposed in the GUI, and that can be accessed through the command line tool PowerCfg4. There's one setting in particular that was of interest, nicknamed "IDLEDISABLE". The description of IDLEDISABLE was a massive clue:
IDLEDISABLE specifies if the processor idle states are disabled on the system. If disabled, the kernel spins in a loop when there is no code to execute on a processor instead of invoking a transition to an idle state.
With the fans still running at full, I queried the power management configuration of my computer:
PowerCfg /QH
Sure enough, the IDLEDISABLE setting was set to 0x00000001 for AC power, which is the only setting that matters for a desktop PC:
Power Setting GUID: 5d76a2ca-e8c0-402f-a133-2158492d58ad (Processor idle disable)
GUID Alias: IDLEDISABLE
Possible Setting Index: 000
Possible Setting Friendly Name: Enable idle
Possible Setting Index: 001
Possible Setting Friendly Name: Disable idle
Current AC Power Setting Index: 0x00000001
Current DC Power Setting Index: 0x00000000
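The full PowerCfg /QH dump is long, so picking this one setting out by eye is tedious. Here is a minimal Python sketch that pulls the AC and DC values for a given alias out of text in the format shown above. The `read_setting` helper is my own invention for this example, and it assumes English-language output laid out exactly like the excerpt.

```python
import re

# Excerpt of the PowerCfg /QH output shown above.
SAMPLE_OUTPUT = """\
Power Setting GUID: 5d76a2ca-e8c0-402f-a133-2158492d58ad (Processor idle disable)
GUID Alias: IDLEDISABLE
Possible Setting Index: 000
Possible Setting Friendly Name: Enable idle
Possible Setting Index: 001
Possible Setting Friendly Name: Disable idle
Current AC Power Setting Index: 0x00000001
Current DC Power Setting Index: 0x00000000
"""

CURRENT_INDEX = re.compile(
    r"Current (AC|DC) Power Setting Index:\s*(0x[0-9a-fA-F]+)"
)


def read_setting(output, alias):
    """Return {'AC': int, 'DC': int} for the setting with the given alias."""
    values = {}
    in_block = False
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line.startswith("Power Setting GUID:"):
            # A new setting starts; wait for its alias line before matching.
            in_block = False
        elif line.startswith("GUID Alias:"):
            in_block = line.split(":", 1)[1].strip() == alias
        elif in_block:
            match = CURRENT_INDEX.match(line)
            if match:
                values[match.group(1)] = int(match.group(2), 16)
    return values


if __name__ == "__main__":
    print(read_setting(SAMPLE_OUTPUT, "IDLEDISABLE"))
```

On a real machine you could feed it the captured output of the command instead of the sample string; an AC value of 1 is the symptom described in this post.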
Flipping this back to "0" is actually fairly easy. From an administrative command prompt, you need to type two commands:
C:\WINDOWS\system32>PowerCfg /SETACVALUEINDEX SCHEME_CURRENT SUB_PROCESSOR IDLEDISABLE 000
C:\WINDOWS\system32>PowerCfg /SETACTIVE SCHEME_CURRENT
This does two things-- it sets the IDLEDISABLE property on the current, active power management scheme to a value of "0" (i.e. off), and then re-sets the power management configuration so that it loads the new values. Once I performed those two steps, my fans immediately spun back down and the Task Manager reported that everything was back to normal.
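If you would rather script the fix than type it, the same two commands can be wrapped in a small Python helper. This is only a sketch: the function names are mine, it defaults to a dry run that prints the commands, and actually running them still needs Windows and an administrative prompt.

```python
import subprocess
import sys


def build_reset_commands(value=0):
    """Build the two PowerCfg invocations used above, as argument lists."""
    return [
        ["powercfg", "/SETACVALUEINDEX", "SCHEME_CURRENT", "SUB_PROCESSOR",
         "IDLEDISABLE", f"{value:03d}"],
        ["powercfg", "/SETACTIVE", "SCHEME_CURRENT"],
    ]


def reset_idle_disable(dry_run=True):
    """Print the commands, or run them when dry_run is False on Windows."""
    for command in build_reset_commands():
        if dry_run or sys.platform != "win32":
            print("Would run:", " ".join(command))
        else:
            # check=True raises CalledProcessError if PowerCfg reports failure.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    reset_idle_disable(dry_run=True)
```

Keeping the command construction separate from the execution makes it easy to review exactly what will be run before flipping `dry_run` off.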
Post-Mortem
This simple, two line fix was the culmination of many hours of diagnosis, Google searches, and software updates. But, why did this even occur in the first place?
System Idle Process
First, you will need a little background on the System Idle Process. At a basic level, a CPU must always be doing something. However, CPUs run a lot of different processes that are triggered at different times, and often there will be a time where there are no user processes that are running. So, what is your CPU supposed to do? This is where the "System Idle Process" comes in on Windows. The "System Idle Process" is the Windows version of a piece of code that runs when there's nothing else to do5. Essentially, the Idle process is just taking up extra CPU cycles.
This, of course, is why the Idle process is of no concern under normal circumstances. Originally, if you recall, I dismissed the Idle process in the Task Manager's "Details" tab and in Process Explorer. Though it took up 99% of the CPU time, this was normal: the Idle process would yield cycles to other user processes if needed, and it does not normally prevent a CPU from throttling down. In fact, though the Idle process was originally analogous to a while(1){} loop, modern versions of Windows actually run instructions that enable the power saving features of modern processors.
IDLEDISABLE
Of course, the above is true only if the system is allowed to idle. As you may have guessed, the IDLEDISABLE setting actively disables the processor's idle states, causing the Idle process to revert back to a basic while loop. In Microsoft's words, "the kernel spins in a loop when there is no code to execute on a processor instead of invoking a transition to an idle state".
This, of course, explains why my system never actually felt sluggish, despite the Task Manager reporting 100% CPU usage. Though the Idle process was consuming 100% of my CPU by being in a loop, it yields to user processes, meaning that other programs on my PC could still operate at a normal level.
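The difference between spinning and truly idling can be felt even in user space. The Python sketch below is only an analogy, not what the Windows kernel does: it waits the same amount of wall-clock time twice, once by busy-waiting and once by sleeping, and compares how much CPU time each approach burned. All function names are made up for this example.

```python
import time


def spin(duration):
    """Busy-wait: keep checking the clock until the deadline passes."""
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        pass


def block(duration):
    """Sleep: hand the CPU back to the OS until the deadline passes."""
    time.sleep(duration)


def cpu_seconds_while(wait, duration):
    """Return the CPU time this process consumed while waiting."""
    start = time.process_time()
    wait(duration)
    return time.process_time() - start


if __name__ == "__main__":
    print(f"spinning used {cpu_seconds_while(spin, 0.5):.3f} s of CPU time")
    print(f"sleeping used {cpu_seconds_while(block, 0.5):.3f} s of CPU time")
```

Both waits take the same time on the wall clock, but the spinning version keeps a core busy the whole time, which is roughly the situation my processor was stuck in on every core.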
Oculus Rift and Latency
But, how does this all relate to the Oculus Rift?
The Rift, and virtual reality in general, is very latency sensitive. The device runs at 90 frames per second to reduce motion sickness for the user, meaning that the system only has about 11 ms to actually render a frame. That's a crazy small amount of time for a complete scene in a video game to have its physics calculated by the CPU, rendered by the GPU, and sent to the headset. Dipping below this 90 fps target also is extremely bad for motion sickness, so Oculus implements a ton of different features (such as asynchronous time warp) to help reduce latency as much as possible.
The Rift also implements both rotational and positional tracking, meaning that you can tilt your head in space to have your character in a game do the same, or you can even lean forwards, backwards, or to the side. The Rift has an inertial measurement unit (IMU) package inside of it, with an accelerometer, gyroscope, and magnetometer to determine the movement of the headset. This package runs at a much higher frequency than the similar sensors in your phone-- up to 1000 Hz (one thousand measurements per second), in fact. The Rift also sends these data points every 2 ms over the USB cable.
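To put the timing figures above in perspective, here is the arithmetic as a tiny Python sketch. It only restates the numbers already mentioned (90 fps, 1000 Hz, a report every 2 ms); the helper names are mine, and the samples-per-report figure is simple division, not a claim about how the Rift actually packs its USB reports.

```python
def frame_budget_ms(frames_per_second):
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / frames_per_second


def samples_per_report(sample_rate_hz, report_interval_ms):
    """How many sensor samples accumulate between two reports."""
    return sample_rate_hz * report_interval_ms / 1000.0


if __name__ == "__main__":
    print(f"Frame budget at 90 fps: {frame_budget_ms(90):.2f} ms")
    print(f"IMU samples per 2 ms report: {samples_per_report(1000, 2):.0f}")
```

For comparison, the same function gives roughly 16.7 ms per frame at 60 fps, so a 90 fps headset leaves about a third less time per frame than a typical monitor.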
All of these innovations are needed to reduce latency as much as possible, but there's one thing that they are doing that they haven't talked about to my knowledge-- changing the power plan of your computer to "High Performance". In Windows, there are a couple of default "Power Plans". These plans contain the settings-- such as the amount of idle time before your monitor turns off and your computer goes to sleep-- that govern the performance and power usage of your computer. By default, your desktop computer is set to "Balanced", which contains a mix of options that allow your computer to perform well when needed, but still save electricity. There's also a "High Performance" plan that changes characteristics of your computer to help it perform at its maximum at all times.
What you may not notice is when you actually put your headset on6, the Oculus software changes the power plan of your computer to "High Performance". This is immediately toggled back when you take your headset off. The GIF below shows me putting my finger over the sensor-- you can see the Oculus software automatically switching the power plan to "High Performance".
However, the default "High Performance" power plan in Windows actually has IDLEDISABLE set to "0", meaning that the idling features are always enabled out of the box for Windows desktops. On top of that, my issue was that the fans were kicking in because IDLEDISABLE was actually set to "1" on the default "Balanced" power plan.
It turns out, CCP (the developers of EVE: Valkyrie) intentionally set IDLEDISABLE to "1". This isn't exactly far fetched-- Intel actually has a presentation on low latency computing where they suggest that developers do just that to reduce latency as much as possible.
Though they may intend to set the flag to "1" when the game is launched and back to "0" when it is closed, if the game is closed incorrectly (such as if it crashes or is forced to close via the Task Manager, etc.), there is a potential for the IDLEDISABLE flag to remain at "1", thus causing additional heat and fan noise. This can be compounded if EVE: Valkyrie changes the IDLEDISABLE flag to "1" when the power plan is still set to "Balanced", as this is the plan that most computers will switch back to once the Oculus Rift headset is removed.
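To make that failure mode concrete, here is a Python sketch of the "set on launch, restore on exit" pattern. I have no idea how the game actually toggles the flag, so this is purely illustrative: the dictionary stands in for the real power plan, and every name here is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def idle_disabled(plan):
    """Set IDLEDISABLE to 1 for the duration of a block, then restore it."""
    previous = plan["IDLEDISABLE"]
    plan["IDLEDISABLE"] = 1
    try:
        yield
    finally:
        # Runs on a normal exit or an exception, but NOT if the process is
        # killed outright (for example, "End task" in Task Manager).
        plan["IDLEDISABLE"] = previous


def play_session(plan, crash=False):
    """A session that exits cleanly, or fails with a catchable error."""
    try:
        with idle_disabled(plan):
            if crash:
                raise RuntimeError("simulated game error")
    except RuntimeError:
        pass
    return plan["IDLEDISABLE"]


def killed_session(plan):
    """Simulate a hard kill: the flag is set and no cleanup code ever runs."""
    plan["IDLEDISABLE"] = 1
    return plan["IDLEDISABLE"]


if __name__ == "__main__":
    fake_plan = {"IDLEDISABLE": 0}  # stand-in for the real power plan
    print("clean exit:", play_session(fake_plan))
    print("handled error:", play_session(fake_plan, crash=True))
    print("hard kill:", killed_session(fake_plan))
```

Because the real setting lives in the system's power plan rather than in the game's memory, a value left behind by a killed process would persist until something explicitly sets it back.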
The Perfect Storm: My Old Hardware
Other than the fact that there are only a couple thousand Oculus Rift devices in the wild, why is it that no one has discovered this before?
My hardware might be an explanation: it's old, more power hungry than newer processors, has a stock CPU fan that definitely needs cleaning, and the thermal paste is probably all dried up by now. It was already running warmer than it should be, but with it running at 3.3 GHz and with IDLEDISABLE set to "1", it heated up enough to cause my loud fan to kick in. With a newer, more power efficient processor or a better CPU cooler, I might not have even noticed the issue. After all, nothing felt sluggish7.
I hope this gives some insight into the process I went through to debug this problem, and saves someone time. If you're seeing the Task Manager show 100% CPU usage and your CPU clock at the maximum but without any processes that are obviously the issue, you might want to check the power management configuration of your system to see if IDLEDISABLE is set to "1" when it should be reset back to the default of "0".
This issue has been reported to EVE: Valkyrie and I will update this article as I hear back from them.
- April 4th, 2016 @ 5:40 PM PST - Reported the issue to EVE: Valkyrie support
Special thanks to Reddit user /u/GodLikeVelociraptor, who seems to have pinpointed that it is in fact EVE: Valkyrie causing the issue, not the Oculus software itself. I have updated the article to reflect this fact.
- Considering only a couple hundred to maybe one thousand Oculus Rifts-- and only to the original Kickstarter backers-- arrived on the launch day, some may hesitate to actually consider it anything but a soft launch. By the end of that week, Oculus CEO Brendan Iribe informed customers that there was a component shortage that resulted in a delay. ↩
- Lucky's Tale, though not a game I would play on a traditional monitor, is actually quite impressive in virtual reality. You can lean in and look at everything like it's a miniature model, and you can even headbutt things in the game world thanks to the positional tracking features of the headset. ↩
- Technologies like Intel's TurboBoost will actually let your processor run above its "normal" clock speed when it's able to within thermal limits. ↩
- You must use an administrative command line prompt to change the PowerCfg settings. ↩
- A better explanation of this is provided by Gustavo Duarte. ↩
- There is a sensor on the Rift that lets it know when the headset is put on. This allows for the Oculus Home software to launch and the OLED screens to turn on, saving power and the longevity of the screens themselves by not having them be always on. ↩
- Remember, the Idle process was taking up the majority of my CPU in what was essentially a "while" loop. Though it let other software take over CPU resources when needed, it still generates heat if the processor is not allowed to throttle down and idle. ↩