NVIDIA GPU Boost 2.0: How to Use



Just curious, how much does GPU Boost 2.0 actually boost for you? I was running AC:U (everything at maximum settings) and used GPU-Z to test. At the main menu the clock was 1050 MHz. Shortly after running about the map and through crowds, I checked back and the maximum clock it had reached was 1280 MHz. That seems like a massive bump!

GPU Boost 2.0: Overclocking & Overclocking Your Monitor. The first half of the GPU Boost 2 story is of course the fact that with 2.0 NVIDIA is switching from power-based controls to temperature-based controls.

NVIDIA's GA107 GPU uses the Ampere architecture and is made using an 8 nm production process at Samsung. GA107 supports DirectX 12 Ultimate (Feature Level 12_2). For GPU compute applications, OpenCL version 2.0 and CUDA compute capability 8.6 can be used. Additionally, the DirectX 12 Ultimate capability guarantees support for hardware raytracing, variable-rate shading and more in upcoming video games. It features 3072 shading units, 96 texture mapping units and 48 ROPs. Also included are 96 tensor cores, which help improve the speed of machine learning applications. The GPU also contains 24 raytracing acceleration cores.

Graphics Processor

GPU Name: GA107
Codename: NV177
Architecture: Ampere
Foundry: Samsung
Process Size: 8 nm
Transistors: unknown
Die Size: unknown
Released: unknown

Graphics Features

DirectX: 12 Ultimate (12_2)
OpenGL: 4.6
OpenCL: 2.0
Vulkan: 1.2
CUDA: 8.6
Shader Model: 6.5
PureVideo HD: VP11
VDPAU: Feature Set k

Render Config

Shading Units: 3072
TMUs: 96
ROPs: 48
SM Count: 24
FP16 Units: 3072
FP64 Units: 64
INT32 Units: 1536
Tensor Cores: 96
RT Cores: 24
SFUs: 384
TPCs: 12
GPCs: 3
Tex L1 Cache: 64 KB per SM
L1 Cache: 128 KB per SM
L2 Cache: 2048 KB
Max. TDP: 90 W

All Ampere GPUs

  • NVIDIA GA107

NVIDIA GPU Architecture History

  • 1998-2000 Fahrenheit
  • 1999-2005 Celsius
  • 2001-2003 Kelvin
  • 2003-2005 Rankine
  • 2003-2013 Curie
  • 2006-2010 Tesla
  • 2007-2013 Tesla 2.0
  • 2010-2016 Fermi
  • 2010-2013 VLIW Vec4
  • 2010-2016 Fermi 2.0
  • 2012-2018 Kepler
  • 2013-2015 Kepler 2.0
  • 2014-2017 Maxwell
  • 2014-2019 Maxwell 2.0
  • 2016-2020 Pascal
  • 2017-2020 Volta
  • 2018-2020 Turing
  • 2020-2021 Ampere

Graphics cards using the NVIDIA GA107 GPU

Name | Chip | Memory | Shaders | TMUs | ROPs | Base Clock | Boost Clock | Memory Clock
NVIDIA GeForce RTX 3050 | GA107-300-A1 | 4 GB | 2304 | 72 | 40 | 1545 MHz | 1740 MHz | 1750 MHz

GA107 GPU Notes

Speculation

Authors: Louis Bavoil and Iain Cantlay

With all modern graphics APIs (D3D11, D3D12, GL4 and Vulkan), an application can query the elapsed GPU time for any given range of render calls by using timestamp queries. Most game engines today use this mechanism to measure the GPU time spent on a whole frame and per pass. This blog post includes full source code for a simple D3D12 application (SetStablePowerState.exe) that can be run to disable and restore GPU Boost at any time, for all graphics applications running on the system. Disabling GPU Boost helps to get more deterministic GPU times from timestamp queries. And because the clocks are changed at the system level, you can run SetStablePowerState.exe even if your game uses a graphics API other than D3D12. The only requirement is that you use Windows 10 and have the Windows 10 SDK installed.
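To make the timestamp-query measurement concrete, here is a minimal sketch of the conversion from two raw timestamp values to elapsed milliseconds. The function name ElapsedGpuMs is our own for illustration. It assumes you have already read back the begin and end timestamps and obtained the timestamp frequency from the API (in D3D12 this comes from ID3D12CommandQueue::GetTimestampFrequency).

```cpp
#include <cassert>
#include <cstdint>

// Converts a pair of GPU timestamp values into elapsed milliseconds.
// ticksPerSecond is the timestamp frequency reported by the graphics API.
// ElapsedGpuMs is a hypothetical helper name, not part of any API.
double ElapsedGpuMs(std::uint64_t beginTicks,
                    std::uint64_t endTicks,
                    std::uint64_t ticksPerSecond)
{
    assert(ticksPerSecond > 0);      // a zero frequency would divide by zero
    assert(endTicks >= beginTicks);  // timestamps must be in submission order
    const double deltaTicks = static_cast<double>(endTicks - beginTicks);
    return 1000.0 * deltaTicks / static_cast<double>(ticksPerSecond);
}
```

For example, a begin timestamp of 1000 and an end timestamp of 3000 at a frequency of 1,000,000 ticks per second correspond to 2 ms of GPU time.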

Motivation

On some occasions, we have found ourselves confused by the fact that the measured GPU time for a given pass we were working on would change over time, even if we did not make any change to that pass. The GPU times would be stable within a run, but would sometimes vary slightly from run to run. Later on, we learned that this can happen as a side effect of the GPU having a variable Core Clock frequency, depending on the current GPU temperature and possibly other factors such as power consumption. This can happen with all GPUs that have variable frequencies, and can happen with all NVIDIA GPUs that include a version of GPU Boost, more specifically all GPUs based on the Kepler, Maxwell and Pascal architectures, and beyond.

SetStablePowerState.exe

All NVIDIA GPUs that have GPU Boost have a well-defined Base Clock frequency associated with them. That is the value of the GPU Core Clock frequency that the GPU should be able to sustain while staying within the reference power usage and temperature targets. For the record, for each GeForce GPU, the Base Clock is specified in the associated Specification page on GeForce.com.

Using D3D12, there is an easy way for an application to request the NVIDIA driver to lock the GPU Core Clock frequency to its Base Clock value: by using the ID3D12Device::SetStablePowerState method. When calling SetStablePowerState(TRUE), a system-wide change of GPU power-management policy happens for the NVIDIA GPU associated with the current D3D12 device, and the current GPU Core Clock gets locked to the reference Base Clock recorded in the VBIOS for that GPU, unless thermal events happen. If the GPU detects that it’s overheating, it will then down-clock itself even if SetStablePowerState(TRUE) was called. But in practice, that should never happen if the GPU is in a properly cooled case and its fan is working properly. The result is that the GPU Core Clock frequency is then stable at Base Clock once any D3D12 application calls SetStablePowerState(TRUE) in the system. In other words, GPU Boost gets disabled. And our driver takes care of restoring the previous GPU power-management state when the locking D3D12 device gets released.
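The call sequence described above can be sketched roughly as follows. This is not the original SetStablePowerState.cpp listing. StablePowerLock and LockResult are hypothetical names, and the sketch assumes the default adapter is the NVIDIA GPU you want to lock. On non-Windows platforms it compiles to a stub that reports the platform as unsupported.

```cpp
// Sketch only: requires Windows 10, the Windows 10 SDK and, on recent
// Windows 10 versions, developer mode. Other platforms get a stub.
#ifdef _WIN32
#include <d3d12.h>
#include <wrl/client.h>
#pragma comment(lib, "d3d12.lib")
#endif

enum class LockResult { Locked, NoDevice, CallFailed, UnsupportedPlatform };

// Hypothetical helper (not from the original post). The lock lasts only
// while the owning device is alive, so the device is kept as a member.
class StablePowerLock {
public:
    LockResult Acquire()
    {
#ifdef _WIN32
        // Create a D3D12 device on the default adapter (an assumption).
        if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0,
                                     IID_PPV_ARGS(&device_))))
            return LockResult::NoDevice;
        // Ask the driver to lock the GPU Core Clock to Base Clock.
        if (FAILED(device_->SetStablePowerState(TRUE))) {
            device_.Reset();
            return LockResult::CallFailed;
        }
        return LockResult::Locked;
#else
        return LockResult::UnsupportedPlatform;
#endif
    }

    // Releasing the device lets the driver restore the previous state.
    void Release()
    {
#ifdef _WIN32
        device_.Reset();
#endif
    }

private:
#ifdef _WIN32
    Microsoft::WRL::ComPtr<ID3D12Device> device_;
#endif
};
```

A standalone tool built on this would call Acquire(), wait (for instance for a key press) while profiling takes place, and then call Release() to restore GPU Boost.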

Knowing all that, we have written a simple standalone D3D12 application (SetStablePowerState.exe) that can lock and unlock the current GPU Core Clock frequency for any NVIDIA GPU with GPU Boost. The GPU Core Clock frequency gets instantly locked when launching this app, so it can be launched anytime you want to start/stop profiling GPU times. You can monitor your current GPU Core Clock frequency by using NVAPI (see Appendix) or by using an external GPU monitoring tool such as GPU-Z.

Using this standalone SetStablePowerState.exe application to lock the clocks before and after profiling GPU times makes it unnecessary to ever call ID3D12Device::SetStablePowerState from a game engine directly. We actually recommend never calling this D3D12 method from engine code, especially for applications that have both D3D11 and D3D12 paths, to avoid any confusion when comparing GPU profiling results on D3D12 vs D3D11.

Gotchas

Using SetStablePowerState modifies only the GPU Core Clock frequency; it does not modify the GPU Memory Clock frequency. So if an application sees a 1:1 ratio between GPU Core Clock and GPU Memory Clock on a normal run, SetStablePowerState can shift that ratio to as low as 0.8:1. This is worth knowing because the relative performance limiters will shift slightly. When GPU Boost is disabled, a pass that is both math-throughput and memory-bandwidth limited may become more math limited or, put differently, relatively less memory limited.
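The ratio shift is simple arithmetic, sketched below. The helper name and the clock values in the usage note are illustrative assumptions, not measurements. The boosted run is treated as the 1:1 reference, and the memory clock is assumed to stay unchanged when the core clock is locked.

```cpp
#include <cassert>

// Relative core-to-memory clock ratio after locking to Base Clock,
// normalized so that the boosted run counts as 1:1. Assumes the memory
// clock is identical in both runs. Hypothetical helper for illustration.
double RelativeCoreToMemoryRatio(double baseClockMHz, double boostedClockMHz)
{
    assert(baseClockMHz > 0.0 && boostedClockMHz > 0.0);
    return baseClockMHz / boostedClockMHz;
}
```

With a hypothetical Base Clock of 1000 MHz and an observed boosted clock of 1250 MHz, the function returns 0.8, which matches the 0.8:1 case mentioned above.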

Finally, for the SetStablePowerState call to succeed, you need to have the Windows 10 SDK installed. With Windows 10 up to Version 1511, that’s all you need. But with more recent versions of Windows 10 (starting from the Anniversary Update), you also need to enable “developer mode” in the OS settings, otherwise the call to SetStablePowerState will cause a D3D12 device removal.

Afterword: Some History and How Our Advice Evolved

If you have been following our DX12 Do's And Don'ts blog, you may have noticed that the advice on SetStablePowerState has changed. That could use some explanation…

In the first wave of DX12 games, we saw a couple of beta pre-releases that always called SetStablePowerState(TRUE) by default. As we discussed above, this API call significantly lowers the Core Clock frequency on NVIDIA GPUs and does not represent the end-user experience accurately. It is therefore quite inappropriate to call it by default in a shipping product, or even a beta.

We have also seen confusion result from the use of SetStablePowerState because it only works when the D3D12 debug layer is present on a system. We have seen multiple cases where development engineers and QA departments get different performance results because SetStablePowerState fails on some systems and the failure was quietly ignored.

Hence, our recommendation was to avoid SetStablePowerState or use it very thoughtfully and carefully.

For the Windows 10 Anniversary Update (aka Redstone), Microsoft changed the implementation: “SetStablePowerState now requires developer mode be enabled; otherwise, device removal will now occur.” (http://forums.directxtech.com/index.php?topic=5734.new). So any call to SetStablePowerState will fail on end-user systems and on most QA systems. This is a change for the better and makes much of our previous advice irrelevant.


We are still left with the question of whether or not to test with SetStablePowerState. Do you test with reduced performance and more stable results? Do you test end-user performance and accept some variability? Do you monitor clocks and show a warning when variability exceeds a threshold? To be perfectly honest, we have changed our minds more than once at NVIDIA DevTech. This is for good reasons because there is no one true answer. The answer depends on exactly what you are trying to achieve and what matters most to you. We have done all three. We have largely settled on stabilizing the clocks for our in-depth, precise analyses.


Appendix: SetStablePowerState.cpp


Appendix: Monitoring the GPU Core Clock using NVAPI


If you want to monitor your NVIDIA GPU Core Clock frequency without an external tool, you can use the NvAPI_GPU_GetAllClockFrequencies function from NVAPI. We recommend not calling this function every frame, to avoid introducing a significant performance hit. Instead, we recommend calling it at the beginning and end of a given time interval (for instance before and after a GPU profiling session, or before and after playing a level), and displaying a warning if the GPU Core Clock frequency has changed during that interval.
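The begin/end check recommended here can be sketched independently of NVAPI. In the sketch below, ClockDriftMonitor, ClockDriftReport and the readCoreClockMHz callback are hypothetical names. In a real tool the callback would wrap the NVAPI query, which is not shown, and the tolerance parameter is an assumption added so that tiny fluctuations need not trigger a warning.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>

// Result of comparing the core clock at the start and end of an interval.
struct ClockDriftReport {
    double beginMHz;
    double endMHz;
    bool   drifted;  // true if the clock changed by more than the tolerance
};

// Samples the core clock only twice per interval instead of every frame.
// readCoreClockMHz is a hypothetical callback standing in for an NVAPI query.
class ClockDriftMonitor {
public:
    explicit ClockDriftMonitor(std::function<double()> readCoreClockMHz,
                               double toleranceMHz = 1.0)
        : read_(std::move(readCoreClockMHz)), toleranceMHz_(toleranceMHz)
    {
        assert(toleranceMHz_ >= 0.0);
    }

    // Call once when the profiling session or level starts.
    void Begin() { beginMHz_ = read_(); }

    // Call once when the interval ends; compares against the Begin() sample.
    ClockDriftReport End() const
    {
        const double endMHz = read_();
        const bool drifted = std::fabs(endMHz - beginMHz_) > toleranceMHz_;
        return ClockDriftReport{beginMHz_, endMHz, drifted};
    }

private:
    std::function<double()> read_;
    double toleranceMHz_;
    double beginMHz_ = 0.0;
};
```

A profiling harness would call Begin() before the session and End() after it, and display a warning whenever the returned report has drifted set to true.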