NVIDIA "threaded optimization" oxymoron

9/4/2009 2:27:33 PM

By RetroRalph

So I talked a bit about NVIDIA's horrible OpenGL VSYNC performance (CPU usage) in a previous blog post, but I have now found a solution to it. I noticed it while debugging RetroCopy v0.300B on my NVIDIA machine.

There were a few issues with the fragment shader on NVIDIA hardware which I fixed, but whilst fixing them I noticed absolutely horrible performance from time to time on my Core 2 machine. It has two cores, and the video thread should only be using a single core, leaving one fully free to do the emulation. After a little investigation I saw that RetroCopy was using both cores at 100%, which confused me because I wasn't running any emulator threads. It should have been at 50% at worst.
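To spell out the arithmetic behind that 50% figure, here is a minimal Python sketch. The function name `total_cpu_percent` and its inputs are made up for illustration and are not RetroCopy code. It assumes CPU usage is reported as a percentage of all cores combined, the way Task Manager and Process Explorer show it.

```python
def total_cpu_percent(thread_loads, core_count):
    """Return process CPU usage as a percentage of the whole machine.

    thread_loads: fraction of one core each thread keeps busy (0.0 to 1.0).
    core_count: number of cores in the machine.
    Both parameters are hypothetical inputs for this illustration.
    """
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    # Threads cannot keep more cores busy than the machine has.
    busy_cores = min(sum(thread_loads), core_count)
    return 100.0 * busy_cores / core_count


# What I expected: one video thread pinned on VSYNC, dual-core machine.
expected = total_cpu_percent([1.0], core_count=2)

# What I saw: a second thread also pinned at a full core.
observed = total_cpu_percent([1.0, 1.0], core_count=2)

print(expected, observed)
```

One fully busy thread on two cores works out to 50%, and a second fully busy thread takes it to 100%, which leaves nothing for the emulation.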

So I pulled out Process Explorer and looked at the threads, and lo and behold, there was an OpenGL thread using a whole core on its own. I looked at the stack trace and saw this thread was doing the VSYNC. This confused me, because another thread was apparently also doing this. Why do you need two threads to do the same thing, especially something so costly? Well, as it turns out, NVIDIA calls this an optimization, "threaded optimization", and guess what, it's enabled by default.

So in conclusion, if you want better performance on your multicore NVIDIA-based machine, disable this "optimization". ATI doesn't suffer from the same issue, so if you're on ATI you're safe.

4 responses to NVIDIA "threaded optimization" oxymoron

elratauru wrote:

9/5/2009 1:55:00 AM

It's nice that I have a lovely HD4670, so if you ever need to test something on a Radeon, I'll help you out =P

By the way, I'm patiently waiting for new WIPs =P

RetroRalph wrote:

9/5/2009 2:16:29 AM

I also now have an ATI 4870 on my main machine, so I'm lucky I guess :) I'm always looking for new testers though, so if you can show you are good at it with the public builds then you will likely get asked to be a private tester. The new public beta should be out later tonight or tomorrow. It's running quite well without any major bugs now.

MarshMellow wrote:

9/5/2009 9:43:50 PM

Wow nice find, I've disabled it and RetroCopy v0.200B runs a lot better on my computer. Does this only apply to RetroCopy or all 3D games?

RetroRalph wrote:

9/6/2009 12:04:43 AM

I'm not sure how many games this will apply to. It may only be OpenGL related. I remember NVIDIA under DirectX10 was fairly good with VSYNC; it only used about 12% of one CPU. DirectX9 might be another story though.

It appears NVIDIA tried to offload VSYNC to another thread but did it very poorly. Because RetroCopy doesn't do all its game processing in one thread and has multiple emulator threads, it is more severely affected by this than a single-threaded game or application would be.
