Thursday, January 03, 2013

The saga of multicore CPUs, core parking, and game performance on Windows 7

There is a major performance issue with games (possibly affecting DirectX 9 games more) and with CPU-intensive applications generally on Windows 7 / Server 2008 running on modern quad-core CPUs, caused by an obscure power-saving feature called CPU core parking. The symptoms are a dramatic drop in performance: unstable, jittery in-game FPS and noticeably laggy, unresponsive gameplay despite -- and this is the key point -- a high-end graphics card and CPU more than capable of running the game at its max settings, and a low server-to-client ping (~100ms) in multiplayer games. The performance problems caused by this issue do not resemble those of typical underpowered hardware. People have described them as micro-stuttering -- fractional pauses when rendering a frame that are hard to pin down precisely but are definitely noticeable and very detrimental to game enjoyment. For single-player games like RAGE, what was most noticeable to me was the display lag when first rendering new scenes. RAGE uses a unique graphics system that requires constant shoveling of texture data from disk to memory to the GPU. The performance problem I had wasn't exactly a low FPS but a noticeable delay in loading texture data that caused my FPS to drop for a second or two as the camera moved through a new scene for the first time.

For multiplayer games that suffer from this issue -- in my case Team Fortress 2 and Planetside 2 -- it appears to all the world (and more importantly to the other players on your server) that you're running a very old, obsolete CPU. Gameplay is laggy and jittery, especially with many other players in your field of view, despite your very capable GPU, CPU, and network connection. This issue has NOTHING to do with how well your game or app is optimized or threaded, or whether or not it is utilizing all of your CPU cores while it runs. It is a semi-common performance issue that affects all CPU-intensive programs on quad-core computers under Windows 7 and possibly Vista, even those games and apps which are not heavily multithreaded. I think I encountered this issue even on a game from 2004 -- Return to Castle Wolfenstein -- which had a constant and very noticeable drop in FPS to around 40 for no apparent reason despite running on a high-end Nvidia discrete GPU.

Most games are not multithreaded enough to take full advantage of 4 or more CPU cores, and the performance gains from modern multicore CPUs come from the ability of the OS to delegate other tasks to other cores rather than from the game itself efficiently distributing its workload across multiple cores. However, this particular CPU power-management feature -- core parking -- is seriously bugged on recent editions of Windows and affects all games regardless of how well 'optimized' -- as gamers like to say -- the game engine is. In Team Fortress 2 the performance slowdown did seem to become noticeable right at the Halloween TF2 update in 2012. TF2 updates always seem to put more and more performance demands on your PC, which isn't really a good thing, but the slowdown should not have been as pronounced as what I observed. I don't know if a recent Windows update has exacerbated the core-parking issue with games. In TF2 the symptoms are very laggy and unresponsive gameplay, despite a low server ping and the game itself reporting high framerates. The internal FPS counter in TF2 reported a constant 60 FPS, but I suspect this counter strictly measures the ability of the GPU to put frames on the screen. I'm pretty sure my overall framerate in TF2 was nowhere near 60.

According to one explanation which makes sense to me, newer versions of Windows will power down cores to save power when the OS thinks the current processor workload can be handled on only one or two cores. But it takes time for a core to move from this low-power, inactive 'parked' state to an active state where it can handle threads, and this delay can be significant for near-real-time workloads like games. Unfortunately it seems that the algorithm Windows uses for this park/wake-up/distribute process isn't balanced or doesn't identify heavy realtime workloads properly, and cores may be getting parked, woken up, handed threads, and then parked again over and over, resulting in a serious performance degradation. In computer programming a few ms is like a lifetime as far as performance goes.
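
To put a rough number on "a few ms": at 60 FPS a game has about 16.7 ms to produce each frame, so any stall while a parked core wakes up comes straight out of that budget. The little Python sketch below is only arithmetic. The stall durations in it are hypothetical values I picked for illustration, not measurements of actual core-parking wake-up times, and the function names are my own.

```python
# Toy arithmetic: what share of one frame's time budget a stall consumes.
# The stall durations used below are hypothetical, not measured wake-up times.

def frame_budget_ms(target_fps: float) -> float:
    """Time available to produce a single frame, in milliseconds."""
    return 1000.0 / target_fps


def budget_fraction_lost(target_fps: float, stall_ms: float) -> float:
    """Fraction of one frame's budget consumed by a stall of stall_ms."""
    return stall_ms / frame_budget_ms(target_fps)


if __name__ == "__main__":
    for stall in (1.0, 3.0, 5.0):  # hypothetical stall lengths in ms
        lost = budget_fraction_lost(60, stall)
        print(f"{stall:.0f} ms stall = {lost:.0%} of a 60 FPS frame budget")
```

Even a single stall of a few milliseconds is a sizeable slice of one frame, and if it happens repeatedly as cores are parked and unparked, it would plausibly show up as the kind of stutter described above.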

The solution is simple -- if you want to play games on Windows and you have a Core 2 Quad, i5, or i7, disable core parking on your desktop. Multiple methods and programs for doing this are documented on the web, like this: This way uses the built-in power-management interface to manage the core-parking feature and is probably a better approach than editing the registry directly.
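
As a rough illustration of going through the power-management interface rather than the registry, here is a minimal Python sketch that drives the built-in powercfg tool. This is not necessarily the method the guide above describes. It assumes the two GUIDs below are the processor power subgroup and the "core parking min cores" setting; I am quoting them from memory, so check them against the output of `powercfg -q` on your own machine first. The function names and the `--apply` flag are made up for this example. By default the script only prints the commands; actually applying them may require an elevated prompt.

```python
import subprocess
import sys
from typing import List

# ASSUMPTION: GUIDs quoted from memory; verify with `powercfg -q` before use.
SUB_PROCESSOR = "54533251-82be-4824-96c1-47b60b740d00"           # processor power subgroup
CORE_PARKING_MIN_CORES = "0cc5b647-c1df-4637-891a-dec35c318583"  # min % of unparked cores


def build_unpark_commands(min_cores_percent: int = 100) -> List[List[str]]:
    """Return the powercfg invocations that keep this share of cores unparked."""
    if not 0 <= min_cores_percent <= 100:
        raise ValueError("min_cores_percent must be between 0 and 100")
    return [
        # Set the AC (plugged-in) value on the currently active power scheme.
        ["powercfg", "-setacvalueindex", "scheme_current",
         SUB_PROCESSOR, CORE_PARKING_MIN_CORES, str(min_cores_percent)],
        # Re-activate the scheme so the new value takes effect.
        ["powercfg", "-setactive", "scheme_current"],
    ]


def apply(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; run it only when dry_run is False (Windows only)."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical flag: pass --apply to really change the setting.
    apply(build_unpark_commands(100), dry_run="--apply" not in sys.argv)
```

Setting the minimum to 100 means no cores are allowed to park. Running the first command again with the old value should restore the previous behavior, which is one reason to prefer this over deleting anything.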

Programs like this one: modify the registry for you. Some programs and guides take the step of actually deleting the relevant power-management keys. I would not recommend actually deleting the registry keys or values since there is no automatic way to recreate them -- renaming them is a better approach.

The end result of disabling core parking for my quad-core PC on TF2 is that on servers with < 100ms ping and 32/32 players my gameplay is smooth as butter with 60 fps most of the time, only dipping with very many players in close proximity. My average CPU usage hovers around 50% and load times for levels are also significantly decreased.

On Planetside 2 using a RenderDistance of 1500, GPUPhysics 1, Terrain and Flora 0 and shadows off with all other graphics settings maxed, I get 30-35 fps in heavy firefights with plenty of air and vehicle action. This isn't great but it is playable, and my fps does not drop below 30 under any circumstances. I get 35-40 in smaller fights and 45+ in most friendly-only areas with many friendlies in view; a solid 60 fps at warpgates. It's possible that Planetside 2's notoriety for extremely low frame rates even on current-gen hardware like Nvidia 6xx GPUs and Intel i5s and i7s may be due largely to this issue with core parking. Whatever minimal power savings you would get from this CPU core-parking feature are not worth turning your sweet gaming rig into a crippled laptop.