Configuring hybrid GPU laptop to only use dGPU when requested

_Undercover · 31 July 2021 17:58

Hi there,

I have a laptop with hybrid graphics: an Intel iGPU and an NVIDIA dGPU (1050 Ti).

Less than a year ago I needed CUDA for classes. I had been running Manjaro on the iGPU alone since installing it, so I followed a guide to install the proprietary NVIDIA drivers (CUDA only works with them) and, after some hiccups, got CUDA up and running.
One thing I could never fix, however, is that my dGPU is now always on because of Xorg:

$ nvidia-smi 
Sat Jul 31 18:40:08 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   40C    P8    N/A /  N/A |      4MiB /  4042MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       672      G   /usr/lib/Xorg                       4MiB |
+-----------------------------------------------------------------------------+

I’ve had issues with many things so far, but the only one that really caused a major headache was the NVIDIA-related business.

My goal is quite simple: to have a setup where the dGPU only activates when requested (which is the default on Windows).

Although this seems simple at first, I’ve seen multiple people doing it in multiple ways, and I never succeeded when I tried a year ago. It also seems that Manjaro does things differently from Arch, which means that I can’t rely on the Arch Wiki, and the Manjaro Wiki page has a warning that says it’s outdated - link.

Now that I finally have some time to work on this issue, it seems that I’m left more or less alone with trial and error until I eventually succeed, or find a post somewhere that states that this is impossible.

My question is, in order to avoid wasting my time, where should I start looking to properly configure my hybrid GPU laptop?

And if it’s impossible to only activate my dGPU to run CUDA, without having it running in the background, what’s the recommended way of using the iGPU exclusively? (without disabling it on the BIOS, I’m on a dual boot with Windows)

Thanks in advance.

P.S.: I know there’s a whole load of threads in several forums about this, with solutions that range from installing FOSS drivers (which won’t work, since CUDA doesn’t run on them) to installing additional programs (which can be outdated and confusing) or that target Arch Linux or Ubuntu, and from what I’ve found out, Manjaro does things differently.

michaldybczak · 31 July 2021 20:07

On Linux the dGPU never activates automatically like on Windows, because hybrid technology was designed for Windows. The only automatic launch happens when you have a hybrid setup and launch games or apps through Proton or directly through Vulkan; Nvidia is then used for those apps/games by default. In other cases, you have to prefix the command with prime-run (a wrapper that sets the needed environment variables) to launch it with Nvidia.

So for example, if you want to launch obs with Nvidia, you simply add to the launcher:

prime-run obs
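
For reference, prime-run is a small wrapper rather than a variable itself: it runs a command with NVIDIA's PRIME render offload environment variables set. A minimal sketch of the same idea follows. The name offload_run is hypothetical, the three variables are the ones NVIDIA documents for render offload, and the prime-run script shipped on a given system may differ.

```shell
#!/bin/sh
# Hypothetical stand-in for prime-run (offload_run is not a real command).
# It sets NVIDIA's PRIME render offload variables for one command only,
# so the rest of the session keeps rendering on the iGPU.
offload_run() {
    __NV_PRIME_RENDER_OFFLOAD=1 \
    __VK_LAYER_NV_optimus=NVIDIA_only \
    __GLX_VENDOR_LIBRARY_NAME=nvidia \
    "$@"
}
```

With this defined, offload_run obs would be roughly equivalent to prime-run obs. Because the variables are set per command, nothing else in the session is affected.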

The hybrid setup has a drawback, though: Nvidia is never fully powered down when not in use, so it uses slightly more power than an Intel-only setup, but still less than full Nvidia mode.

How much power depends on the dGPU model: newer models draw very little, and some can even power down completely, whereas older Nvidia cards use more power, so with those it’s better to switch between modes manually.

Anyway, on Manjaro, if the system detects that you have a hybrid GPU, it automatically installs the hybrid setup, and all you need to do is prefix the command with prime-run to launch apps/games with Nvidia. However, if you want more granular control and better power savings on a laptop, you may want to try optimus-manager.

I don’t know how your dGPU model behaves under the hybrid setup or how much power it uses. It’s for you to decide whether you are OK with the default hybrid setup (which is easy: you don’t have to do anything aside from adding prime-run to the launchers where you need it), or whether you want to maximize power savings and switch modes on demand.

daowanijo · 31 July 2021 20:16

FWIW I’m in the same boat.

For Steam I have been setting my game launch options to
prime-run %command% to make the game use the dGPU. %command% will be autofilled by Steam.

For other programs (e.g. Unity Editor) I have to find out how Unity Hub launches the Editor and then prime-run the Editor directly.

It’s not great but it has worked for me so far. My main issues are related to external displays / TVs, and to PowerMizer not respecting performance modes when you unplug from AC and plug back in. These aren’t exclusive to Manjaro, though; I confirmed they happen on Pop and Mint as well.

You can confirm you have the same issue if you open NVIDIA X Server Settings (GUI app) and check PowerMizer. Set it to maximum performance while plugged in, then unplug and plug back in. It will not be able to hit the max power mode again until you log out. More on this here: Set Nvidia GPU Performance Level possible? - #4 by Yochanan

As for wanting to have it completely deactivated until requested: IIRC that is not possible without switching to Bumblebee. PRIME will always keep it activated at the lowest power state (3W ~ 6W). Otherwise, if you lack the driver, it will always draw far more (25W or so!), since the dGPU is initialized at system boot.

_Undercover · 31 July 2021 21:14

michaldybczak wrote: “Anyway, on Manjaro, if the system detects that you have a hybrid GPU, it automatically installs the hybrid setup, and all you need to do is prefix the command with prime-run to launch apps/games with Nvidia.”

I probably should have mentioned that I installed Manjaro from the Architect installer and then built my way up from there.

That’s why I was looking for a guide, so that everybody running Manjaro, regardless of their starting point, would be able to properly configure and run a hybrid setup without getting a massive headache.

michaldybczak wrote: “The hybrid setup has a drawback, though: Nvidia is never fully powered down when not in use, so it uses slightly more power than an Intel-only setup, but still less than full Nvidia mode.”

This is interesting; I was under the impression that it could be fully powered down. But as you said, that might only be available on Windows. However, I feel that the power it consumes isn’t that minimal. I say “feel” because I don’t have actual benchmarks to prove it: the fan is always on regardless of the workload (on Windows it shuts off completely under minimal load), and battery life sucks. Those two might also be due to poor battery configuration rather than the dGPU, which is actually headache nº2 in my Linux journey so far.

So it seems that I should just give up hope of being able to program on my dGPU without having it turned on all the time, even at minimal consumption (I don’t really care about using it for anything else)?

And how should one proceed to remove the hybrid setup and use Intel only?

Also, thanks for the attached guide to configuring a hybrid GPU, whose size further strengthens my belief that there’s something wrong with hybrid setups on Linux-based OSs. I suppose NVIDIA really deserved the finger.

_Undercover · 31 July 2021 21:44

daowanijo wrote: “As for wanting to have it completely deactivated until requested: IIRC that is not possible without switching to Bumblebee. PRIME will always keep it activated at the lowest power state (3W ~ 6W). Otherwise, if you lack the driver, it will always draw far more (25W or so!), since the dGPU is initialized at system boot.”

This is good to know. Do you know if it would be possible to wake up the dGPU with Bumblebee to run a CUDA program on it, and not be bothered with it the rest of the time? After browsing around, it seems that Bumblebee is compatible with CUDA; however, the Arch Wiki describes it as having performance issues.

There’s also some confusion in my head, related to the previous quote; hopefully some of you can clear it up. If anything below is wrong, feel free to correct me:

There are two drivers for the NVIDIA GPU: Nouveau (open source) and NVIDIA (proprietary).
Then, after installing one of the two, there are several ways to switch between the integrated and dedicated GPU. Bumblebee is just one of them, and the official way supported by NVIDIA is PRIME offloading. You can also use other projects such as optimus-manager and nvidia-xrun to control what runs on the dGPU.

So it seems that there is no “good” way of doing things, and unfortunately having the dGPU off while not using it isn’t that straightforward.

daowanijo · 31 July 2021 21:51

Yes, your understanding is correct. I cannot confirm the Bumblebee-for-CUDA scenario, as I only used it before PRIME existed.

You should know that Windows handles this the same way as PRIME: the dGPU is on but at very low power. The only difference is that Windows can auto-switch to the dGPU, or be configured via Graphics Settings → High Performance for specific apps, which is similar to prime-run.

NVIDIA proprietary is your best bet for performance.

ralm · 31 July 2021 23:24

For a quick and easy solution I think the Optimus Manager service is the way to go – along with Optimus Manager QT for easy switching from the panel. Somewhere in its power management documentation there’s a solution for you. Depending on your GPU chipsets, you may have to settle for an idle power state for the Nvidia GPU instead of powering down completely. But it’s not shabby either. On average I doubt battery drain will be as bad as you might think. For games and other heavy loads I would think AC power is preferable anyway.

XRaTiX · 1 August 2021 02:51

Actually, with the 470 drivers, the Nvidia card now automatically powers off when not in use in integrated mode with optimus-manager, and I can still use the Nvidia card there (in integrated mode), meaning hybrid mode is kind of deprecated now (only kind of, because integrated mode doesn’t work with the HDMI port yet; you need hybrid or nvidia mode for that). I have a 1050 Ti for reference.

To answer @_Undercover, you just need to install optimus-manager and switch from hybrid (the mode you are in right now) to integrated mode.

Now when you run nvidia-smi you should see something like this:

❯ nvidia-smi
Sat Jul 31 21:43:33 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   39C    P0    N/A /  N/A |      0MiB /  4040MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I don’t use CUDA, but I know that Vulkan applications (such as games) detect and use the Nvidia card without the prime-run command, so maybe CUDA behaves the same. If not, you can use prime-run as usual to force it.

Here is an example of a Vulkan application without prime-run

Edit:

Does OBS use CUDA? If so, that means CUDA applications also work, because my OBS also detects the Nvidia card without prime-run.

_Undercover · 1 August 2021 10:29

Alright that kicks ass! I’m surely going to look into optimus-manager, install it and check it out for myself, thanks!

(I also have a laptop with a 1050 Ti, so if it works for you, it has to work for me!)

michaldybczak · 1 August 2021 10:53

I’m not sure what you are saying. In integrated (Intel) mode, Nvidia is powered down completely anyway, at least when we are talking about optimus-manager. However, I did hear that with some newer cards and newer drivers, the dGPU can be powered down completely in hybrid mode. I don’t think that applies to my old dGPU. I recall reading which cards support it, and mine was way too old (GTX 970M).

It doesn’t work for me. I’m not sure whether that again depends on the dGPU model or on the switching method. I use Nouveau to switch GPUs, which has the advantage that in integrated mode I can still drive the Nvidia card with the Nouveau driver.

So

It doesn’t work for me. I just get a message that the Nvidia card cannot be found. Or maybe you have some kind of special config and your integrated (Intel) mode works like the hybrid one?

It works in integrated mode when you use the modesetting Intel driver. Maybe that is the reason why I can’t access the dGPU? I will experiment with that, but I need multi-monitor setups, so modesetting is a must for me.

michaldybczak · 1 August 2021 11:13

Yes and no. It’s more complicated than that. The Nouveau driver is in the kernel, so you always have it. It has two abilities: 1) it is an open source driver for Nvidia cards, 2) it can manage the Nvidia card’s power state.
That is why, when you have optimus-manager, you can choose Nouveau as the way to wake up and power down (if possible) the dGPU. In fact, in integrated (Intel) mode, when you use the modesetting Intel driver (the community, open source driver for Intel) and Nouveau as the switching method, you can wake up the Nvidia card and use it with the Nouveau driver. So basically, in integrated mode the whole system runs on Intel, but you can trigger some app to use Nvidia with the Nouveau driver, so it is an all open source solution.

The Nvidia driver consists of two parts: the driver and the kernel module. The second one is needed because hardware is run by the kernel, so it must be part of the kernel. The solution is to have compatible additional parts (modules) that you can install into the current kernel to expand its capabilities. The rest, which is not part of the kernel, goes into a general package (files outside the kernel).

So when you install the nvidia kernel module and nvidia drivers, you have everything you need, but you still have to configure how it will be used. See the Manjaro Wiki.

So you can use mhwd, a script that manages the files and configuration for your graphics cards. In fact, that is the preferred way on Manjaro, because most people can’t configure it properly by hand: there are too many dependencies and config locations, so mhwd does all the work for us and ensures it works.

You can use mhwd either in terminal via command or through graphical interface.

If you do:

sudo mhwd -a pci nonfree 0300

then you install the Nvidia drivers and kernel module, and hybrid mode is set up by default. If this is what you used in Architect, then you are already in hybrid mode. If not, you can use the command above or the GUI to install it. Keep in mind that if you have conflicting packages, it will error out and tell you to uninstall them manually.

So make a backup (with Timeshift), then make all the changes until everything is installed, and only then reboot. If you fail to install and configure the needed parts, the system won’t boot into the graphical interface.

Bumblebee is not compatible with optimus-manager, so it won’t work. I have no idea if Bumblebee will work with standard hybrid mode.

Anyway, power management improves with newer kernels and Nvidia drivers, provided that your dGPU is also compatible with them. So your experience may vary widely, from a great experience with no additional energy loss in hybrid mode to a huge loss. You need to test your setup yourself.

XRaTiX · 1 August 2021 13:38

The only configuration I did, and it is not related to optimus-manager at all, was to set SOUND_POWER_SAVE_ON_AC to 1 in TLP. Otherwise, if I invoked the Nvidia card in integrated mode, for some reason the card would not power off until I disconnected the charger or rebooted the machine. Setting it to 1 was called for by the TLP 1.4 release (it talks about Nouveau, but it applies to the proprietary drivers as well, as in my case).

In optimus-manager I have these settings

Yeah, that is with Turing cards, but I have a Pascal card; maybe cards older than Pascal don’t work?

daowanijo · 3 August 2021 03:51

Hey XRaTiX,

It seems that when I use optimus-manager rather than the default Manjaro PRIME solution, nvidia-smi reports my NVIDIA GPU running at 24W in Integrated mode, where PRIME would be at 3W/6W when not in use. However, no processes are running on the dGPU. Additionally, my hardware LED shows that Integrated is being used, not the dGPU… so I’m a bit confused.

When I run nvidia-smi the dGPU hardware LED does light up. So I’m not sure how to verify the real power pulled other than just experimenting with battery life while in Integrated mode.

| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   37C    P3    24W /  N/A |      0MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Additionally, it seems that plugging in while in Integrated mode always switches my hardware LED to dGPU mode and pumps the draw up to 29W-30W, but again there are no running processes when I check with nvidia-smi.

XRaTiX · 3 August 2021 06:36

I’m not sure what’s happening with the wattage; my card doesn’t report that, it only says N/A. My laptop also doesn’t have an LED indicator for the dGPU, but there is a way to know whether the card is really powered on or off.

If you execute this command (assuming your card is at 01:00.0; if not, check lspci):

watch cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status

It can be active or suspended: active means the card is powered on; otherwise it says suspended.
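
For use in scripts rather than under watch, the same check can be wrapped in a function. This is a minimal sketch: the function names are hypothetical, and the default path is simply the address used in this thread, so it is an assumption for any other machine.

```shell
#!/bin/sh
# Print the kernel's runtime power state for a PCI device.
# $1: sysfs device directory. The default is the address from this thread
#     and is an assumption elsewhere (check lspci for the real one).
gpu_runtime_status() {
    dev_dir="${1:-/sys/bus/pci/devices/0000:01:00.0}"
    status_file="$dev_dir/power/runtime_status"
    if [ ! -r "$status_file" ]; then
        echo "unknown"
        return 1
    fi
    cat "$status_file"
}

# Exit status 0 only when the kernel reports the device as suspended.
gpu_is_suspended() {
    [ "$(gpu_runtime_status "$1")" = "suspended" ]
}
```

A call such as gpu_is_suspended && echo "dGPU is suspended" then reads the file once instead of polling it.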

If it says active in integrated mode, the only thing I can think of is TLP preventing the dGPU from powering off, as I mentioned above. Go to /etc/tlp.conf and change

SOUND_POWER_SAVE_ON_AC from 0 to 1, save and reboot; now the dGPU should say suspended.

You can also verify the power usage of the laptop with TLP; you just have to run

watch sudo tlp-stat -b

and check the line for /sys/class/power_supply/BAT0/current_now, which shows the current draw while on battery or charging.

In my case, integrated mode on battery draws 380-450 mA; when the dGPU is activated it draws 750-900 mA.
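
Figures in mA are hard to compare with the watt figures quoted elsewhere in this thread, because the result depends on battery voltage. The sketch below converts the sysfs readings to watts. It assumes the kernel's power_supply units (microamps, microvolts, microwatts) and the BAT0 name used above; the function name is hypothetical, and not every battery exposes every file.

```shell
#!/bin/sh
# Print the battery's instantaneous draw in watts.
# $1: power_supply directory. BAT0 is the name used in this thread and
#     may differ on other machines (an assumption).
battery_watts() {
    bat_dir="${1:-/sys/class/power_supply/BAT0}"
    if [ -r "$bat_dir/power_now" ]; then
        # Some batteries report power directly, in microwatts.
        awk '{ printf "%.2f\n", $1 / 1e6 }' "$bat_dir/power_now"
    elif [ -r "$bat_dir/current_now" ] && [ -r "$bat_dir/voltage_now" ]; then
        # Others report microamps and microvolts: P = I * V.
        awk 'NR == FNR { ua = $1; next }
             { printf "%.2f\n", (ua / 1e6) * ($1 / 1e6) }' \
            "$bat_dir/current_now" "$bat_dir/voltage_now"
    else
        echo "no usable battery readings in $bat_dir" >&2
        return 1
    fi
}
```

The number is the draw of the whole laptop, not of the dGPU alone, so it is mainly useful for comparing the same workload with the card suspended and active.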

Manfrago · 3 August 2021 09:43

I think that will be due to your configuration. For me, nvidia-smi in integrated mode shows the same as what @XRaTiX posted, without my having tinkered with anything. Incidentally, it also works for me with the nvidia 390.xx driver. I have also read here that the Nvidia card is completely switched off in integrated mode, and I wonder why its temperature is then reported, or how mhwd is supposed to recognize the card.

daowanijo · 3 August 2021 15:44

Thank you for the info! It is very helpful for a Linux newcomer.

I confirmed that the runtime status is suspended when I reboot into Integrated without AC power.
dGPU light is off.

tlp-stat shows
At 5% brightness I am getting 540mA - 720mA, usually averaging around 650mA.

When plugged in AC, still on Integrated with no reboot etc, dGPU LED is on now.
At 5% brightness I am getting consistent 1233mA.

runtime status also confirms active instead of suspended now.

Unplugging AC makes dGPU LED turn off, runtime back to suspended, and power back to 540mA - 720mA.

I will try the TLP config change and update this post in a few minutes.

Update: I looked into tlp config and seems all of it is commented out. Including TLP_ENABLE.

#TLP_ENABLE=1
...
#SOUND_POWER_SAVE_ON_AC=0
#SOUND_POWER_SAVE_ON_BAT=1

Even with TLP_ENABLE=1 uncommented and SOUND_POWER_SAVE_ON_AC set to 1 and uncommented, there is no change. It always becomes active on AC power. Not a huge deal, though: in most cases I just want to use Integrated for power saving while not plugged in. When I plug in, I will just change to nvidia-only mode, which I use for graphics-intensive development or gaming anyway.
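
Since the shipped tlp.conf is entirely commented out, it can be hard to see at a glance which values the file actually sets. The sketch below prints the value for one key, or unset when every occurrence is commented out, in which case TLP falls back to its built-in default. The function name is hypothetical, and the sketch makes two assumptions: that the last uncommented occurrence wins, and that only the one file matters (drop-ins under /etc/tlp.d/ are ignored here).

```shell
#!/bin/sh
# Print the value a TLP-style config file sets for a key, or "unset".
# $1: key name   $2: config file (defaults to /etc/tlp.conf)
# Simplification: drop-in files under /etc/tlp.d/ are not read.
tlp_setting() {
    key="$1"
    conf="${2:-/etc/tlp.conf}"
    # Keep only uncommented KEY=value lines; assume the last one wins.
    value=$(sed -n "s/^[[:space:]]*$key=//p" "$conf" 2>/dev/null | tail -n 1)
    if [ -n "$value" ]; then
        # Strip optional surrounding double quotes.
        echo "$value" | tr -d '"'
    else
        echo "unset"
    fi
}
```

For example, tlp_setting SOUND_POWER_SAVE_ON_AC prints unset for the file quoted above. TLP's own tlp-stat -c remains the more complete check, since it prints the configuration TLP actually uses.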

michaldybczak · 3 August 2021 15:45

I checked the average power usage in hybrid mode, and it was between 19-30W, 26W on average, while in Integrated mode it was 17W on average (just browsing, watching YT). So I get 30% power saving while in Integrated mode. Since I need Nvidia only for gaming (and not for all games, older ones work fine on Intel) and I rarely game, Integrated mode without Nvidia is fine for me. If needed, I switch to hybrid mode. Full Nvidia mode is useless for me.

P.S. If you have Plasma, there is a nice battery utility in system settings, so you can monitor the battery state, recharge, discharge over time, etc.

daowanijo · 3 August 2021 16:14

Good to know about the battery utility.
I’m curious what your stats were before you used optimus-manager? Either way, glad you found a solution that works for you. I regularly plug my laptop into a TV or HDMI display, but this causes stutter (a known NVIDIA bug) if I’m not using optimus-manager, which is what led me to this thread.

XRaTiX · 3 August 2021 18:52

I forgot that I tweaked TLP a lot, and maybe you need to do this too: in tlp.conf, set
RUNTIME_PM_ON_AC from on to auto, then restart TLP from the terminal with sudo tlp start and check runtime_status; it should now say suspended when plugged into AC.

You can use the Nvidia card in integrated mode, at least in my case; the only thing that doesn’t work yet is the HDMI port in integrated mode.

_Undercover · 6 August 2021 12:55

I tried it today: installing optimus-manager and rebooting the machine automatically switched it to integrated mode. After the reboot, nvidia-smi printed the following:

$ nvidia-smi
Fri Aug  6 13:43:47 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   33C    P0    N/A /  N/A |      0MiB /  4042MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

And several other commands confirmed that the iGPU was in fact the one in use:

$ optimus-manager --status
Optimus Manager (Client) version 1.4

Current GPU mode : integrated
GPU mode requested for next login : no change
GPU at startup : integrated
Temporary config path: no

$ optimus-manager --print-mode
Current GPU mode : integrated

$ optimus-manager --print-next-mode
GPU mode requested for next login : no change

$ optimus-manager --print-startup  
GPU at startup : integrated
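
For scripting, for instance to tag battery measurements with the mode they were taken in, the mode name can be pulled out of that output. This is a minimal sketch that assumes the "Current GPU mode : ..." line format shown above stays the same across optimus-manager versions; both function names are hypothetical.

```shell
#!/bin/sh
# Extract the mode name from a line such as "Current GPU mode : integrated".
# Reads stdin, so it works on the real command's output or on a saved sample.
parse_gpu_mode() {
    sed -n 's/^Current GPU mode[[:space:]]*:[[:space:]]*//p' | head -n 1
}

# Exit status 0 only if optimus-manager reports the expected mode ($1).
# Fails (non-zero) when optimus-manager is missing or prints something else.
require_gpu_mode() {
    expected="$1"
    actual=$(optimus-manager --print-mode 2>/dev/null | parse_gpu_mode)
    [ "$actual" = "$expected" ]
}
```

A measurement script could then start with require_gpu_mode integrated || exit 1 so that numbers from different modes are never mixed.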

Compiling and running a program that is meant to use CUDA works in integrated mode:

$ nvcc fw.cu 

$ ./a.out &> /dev/null &
[1] 2717

$ nvidia-smi
Fri Aug  6 13:48:11 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   42C    P0    N/A /  N/A |     49MiB /  4042MiB |     99%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2140      C   ./a.out                            47MiB |
+-----------------------------------------------------------------------------+

I haven’t tested HDMI on integrated mode yet, but I will very likely not be needing it anyways.

Is there any way that, without additional hardware, I can test the power consumption of my dGPU accurately, to ensure that it is in fact powered off?
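
One indirect, software-only check is the kernel's runtime power-management accounting, which records how long a device has spent active and suspended. It is not a wattmeter: nvidia-smi reports N/A for power on this card, and querying nvidia-smi can itself wake the dGPU, as noted earlier in the thread. The sketch below assumes the standard runtime_active_time and runtime_suspended_time sysfs attributes (in milliseconds) and the PCI address used in this thread; the function name is hypothetical.

```shell
#!/bin/sh
# Print the percentage of tracked time a device has spent runtime-suspended.
# $1: sysfs device directory. The default is the address from this thread
#     and is an assumption for any other machine.
gpu_suspended_percent() {
    pm_dir="${1:-/sys/bus/pci/devices/0000:01:00.0}/power"
    [ -r "$pm_dir/runtime_active_time" ] &&
        [ -r "$pm_dir/runtime_suspended_time" ] || return 1
    active_ms=$(cat "$pm_dir/runtime_active_time")
    suspended_ms=$(cat "$pm_dir/runtime_suspended_time")
    total_ms=$((active_ms + suspended_ms))
    [ "$total_ms" -gt 0 ] || return 1
    # Integer percentage, rounded down.
    echo $((100 * suspended_ms / total_ms))
}
```

A value close to 100 after a session without CUDA work suggests the card stayed suspended. Whether suspended means fully powered off still depends on the hardware and driver, so comparing whole-laptop battery draw between the two states remains the closest available estimate without extra hardware.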