Strange: rolling back with Timeshift doesn't fix a broken Nvidia update

Hello all,
After a lot of struggle, I finally had a perfectly working Nvidia laptop, with performance equal to or sometimes even better than on Windows. Then a system update came, which replaced my 495 Nvidia driver with the 510 version, and performance dropped a lot. Now it is again as slow as before my struggles (the Nvidia GPU now produces half the FPS of the Intel integrated GPU).

Fortunately I had a Timeshift backup of my system from just before the update, so I rolled back, and to my surprise, this did not fix the problem!

How is this possible? Did the driver update process modify some GPU firmware?
I have done precise measurements before and after, and the difference is huge. All conditions were the same, even air temperature.
After booting into Windows, the Nvidia GPU works fine as always, so the hardware didn’t break.

Technical stuff (before update):

  • Host: Dell Latitude 5401
  • Kernel: Linux 5.15.16-1-MANJARO
  • CPU: Intel i5-9400H (8)
  • GPU1: Intel UHD Graphics 630
  • GPU2: NVIDIA GeForce MX150

I had optimus-manager properly installed, and all testing was done in Hybrid mode with the following commands:

# Linux Benchmarks:
gamemoderun prime-run vkmark --present-mode immediate  
gamemoderun prime-run glmark2 immediate

# Steam games and benchmarks:
gamemoderun prime-run mangohud --dlsym %command%

Did you roll back the kernel too?

Did you try another kernel?

Like 5.10 or 5.15?

How to increase your chances of solving your issue:

Please provide Information:

Are you sure you didn’t do something before taking your snapshot and updating the system? What you describe is very unlikely: if you really brought your system back to where it was, you should have the same performance, aside from the usual need for shaders to be recompiled for your Steam games.

No

What about not using this external software? Manjaro Prime setup is working as is.

Did you roll back the kernel too?

Yes, I rolled back the whole system (excluding only the home and root directories).
So if something changed, it was probably in home, which Timeshift excluded from the backup.

Did you try another kernel?

Before my success with the driver, I tested many kernel and driver version pairs. The driver always loaded properly (glxinfo and vulkaninfo confirmed that), but performance was good only with the following package: linux515-nvidia (version 495.46-10), which matched kernel version 5.15.16-1-MANJARO. So I kept this configuration.
None of the other versions provided good performance.
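
For anyone comparing kernel and driver pairs like this, here is a minimal sketch of how the matching package name can be derived from a kernel release string. It assumes the `linuxXY-nvidia` naming seen in the package quoted above; the helper names `kernel_series` and `nvidia_pkg_for` are made up for illustration.

```shell
#!/bin/sh
# Derive the kernel-series prefix (e.g. "linux515") from a kernel release
# string such as the one printed by `uname -r`.
kernel_series() {
    major=${1%%.*}      # "5.15.16-1-MANJARO" -> "5"
    rest=${1#*.}        # -> "15.16-1-MANJARO"
    minor=${rest%%.*}   # -> "15"
    printf 'linux%s%s\n' "$major" "$minor"
}

# Name of the matching NVIDIA module package, following the
# "linux515-nvidia" pattern (assumed to hold for other series too).
nvidia_pkg_for() {
    printf '%s-nvidia\n' "$(kernel_series "$1")"
}

# Possible use on a live system (not executed here):
#   pacman -Q "$(nvidia_pkg_for "$(uname -r)")"
#   cat /proc/driver/nvidia/version
```

Recording the output of those two commands before and after an update or rollback gives a quick way to confirm that the kernel and the loaded driver module really are the versions you expect.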

Are you sure you didn’t do something before taking your snapshot and updating the system?

I am 99.9% sure.

What about not using this external software? Manjaro Prime setup is working as is.

Believe me, I have tested this and a lot of other things, even on different Linux distros (PopOS, Ubuntu). It seemed strange to me, but it only worked with Manjaro + optimus-manager. This setup survived 2 months on Manjaro, and now it is broken.

Perhaps there is some error, which is random in its behavior, and it is just coincidence that it happened just now.

If you are using OM in ‘hybrid’ mode … there is absolutely no reason to be using it at all.

It creates issues … I can’t force you to use certain software on your own system.
But … this is well documented and addressed over many other threads.

Not to mention that, besides its own hacky construction, it is also an AUR package … so both things require extra attention.
optimus-manager even has special steps in its github for manually augmenting the display manager so that OM will work.
And, as an AUR package … you are responsible for rebuilding it when needed.

What I would suggest as you can’t avoid updating the system anyway, is to update Manjaro, remove Optimus Manager, and start from there.

So if you did a rollback including /boot and kernel and /usr …
and your system does not behave like before,

Then your snapshot was

  • not complete or
  • already not functioning as described when taken

This is why I take automatic snapshots every hour/day/week with snapper. So if I have to roll back, I have 20-30 snapshots from which to select the right one.

Done, I have updated the system, removed the optimus-manager, and all nvidia related stuff, then followed the official Manjaro Wiki guide to install the driver.
…And no surprise there. The driver is loading and working, but the performance is terrible. It is twice as bad as the integrated Intel GPU.

Any ideas, what could be slowing it down? Where to look for the cause?
What stuff should I check?

Check whether prime-run actually uses the Nvidia card.

prime-run glxinfo | grep OpenGL
glxinfo | grep OpenGL

Also, as requested earlier in the thread, provide proper system information.

inxi --admin --verbosity=7 --filter --width
mhwd -l
mhwd -li

As you can see, the nvidia driver is loaded correctly:

prime-run glxinfo | grep OpenGL                                                          
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: NVIDIA GeForce MX150/PCIe/SSE2
OpenGL core profile version string: 4.6.0 NVIDIA 510.47.03
OpenGL core profile shading language version string: 4.60 NVIDIA
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.6.0 NVIDIA 510.47.03
OpenGL shading language version string: 4.60 NVIDIA
OpenGL context flags: (none)
OpenGL profile mask: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.2 NVIDIA 510.47.03
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
OpenGL ES profile extensions:

…as well as intel (without prime-run):

glxinfo | grep OpenGL                                                                        
OpenGL vendor string: Intel
OpenGL renderer string: Mesa Intel(R) UHD Graphics 630 (CFL GT2)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 21.3.5
OpenGL core profile shading language version string: 4.60
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.6 (Compatibility Profile) Mesa 21.3.5
OpenGL shading language version string: 4.60
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 21.3.5
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
OpenGL ES profile extensions:
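
The comparison above can also be scripted. This is only a sketch: `renderer_of` and `gpu_vendor` are hypothetical helper names, and the matching relies on the renderer strings looking like the ones in the two outputs above.

```shell
#!/bin/sh
# Print the value of the "OpenGL renderer string" line from glxinfo
# output read on stdin.
renderer_of() {
    sed -n 's/^OpenGL renderer string: //p'
}

# Classify a renderer string as "nvidia", "intel", or "other".
gpu_vendor() {
    case "$1" in
        *NVIDIA*) echo nvidia ;;
        *Intel*)  echo intel ;;
        *)        echo other ;;
    esac
}

# Possible use on a live system (not executed here):
#   gpu_vendor "$(prime-run glxinfo | renderer_of)"   # should say nvidia
#   gpu_vendor "$(glxinfo | renderer_of)"             # should say intel
```

This only confirms which GPU renders; it says nothing about how fast it runs, which is the actual problem in this thread.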

System info:

inxi --admin --verbosity=7 --filter --width                                  
System:
  Kernel: 5.15.19-1-MANJARO x86_64 bits: 64 compiler: gcc v: 11.1.0
    parameters: BOOT_IMAGE=/boot/vmlinuz-5.15-x86_64
    root=UUID=bb32a804-ef1c-4700-adec-d34d873f1cd0 rw quiet apparmor=1
    security=apparmor udev.log_priority=3
  Desktop: KDE Plasma 5.23.5 tk: Qt 5.15.2 wm: kwin_x11 vt: 1 dm: SDDM
    Distro: Manjaro Linux base: Arch Linux
Machine:
  Type: Laptop System: Dell product: Latitude 5401 v: N/A
    serial: <superuser required> Chassis: type: 10 serial: <superuser required>
  Mobo: Dell model: 0D2RTT v: A00 serial: <superuser required> UEFI: Dell
    v: 1.15.1 date: 10/06/2021
Battery:
  ID-1: BAT0 charge: 60.1 Wh (100.0%) condition: 60.1/68.0 Wh (88.3%)
    volts: 17.0 min: 15.2 model: LGC-LGC4.47 DELL 10X1J99 type: Li-ion
    serial: <filter> status: Full
Memory:
  RAM: total: 31.14 GiB used: 4.86 GiB (15.6%)
  RAM Report:
    permissions: Unable to run dmidecode. Root privileges required.
CPU:
  Info: model: Intel Core i5-9400H bits: 64 type: MT MCP arch: Coffee Lake
    family: 6 model-id: 0x9E (158) stepping: 0xD (13) microcode: 0xEA
  Topology: cpus: 1x cores: 4 tpc: 2 threads: 8 smt: enabled cache:
    L1: 256 KiB desc: d-4x32 KiB; i-4x32 KiB L2: 1024 KiB desc: 4x256 KiB
    L3: 8 MiB desc: 1x8 MiB
  Speed (MHz): avg: 1006 high: 1029 min/max: 800/2500 scaling:
    driver: intel_pstate governor: powersave cores: 1: 1029 2: 1020 3: 1001
    4: 1000 5: 1000 6: 1000 7: 1000 8: 1000 bogomips: 40009
  Flags: 3dnowprefetch abm acpi adx aes aperfmperf apic arat
    arch_capabilities arch_perfmon art avx avx2 bmi1 bmi2 bts clflush clflushopt
    cmov constant_tsc cpuid cpuid_fault cx16 cx8 de ds_cpl dtes64 dtherm dts
    epb ept ept_ad erms est f16c flexpriority flush_l1d fma fpu fsgsbase fxsr
    ht hwp hwp_act_window hwp_epp hwp_notify ibpb ibrs ibrs_enhanced intel_pt
    invpcid invpcid_single lahf_lm lm mca mce md_clear mmx monitor movbe mpx
    msr mtrr nonstop_tsc nopl nx pae pat pbe pcid pclmulqdq pdcm pdpe1gb pebs
    pge pln pni popcnt pse pse36 pts rdrand rdseed rdtscp rep_good sdbg sep
    smap smep smx ss ssbd sse sse2 sse4_1 sse4_2 ssse3 stibp syscall tm tm2
    tpr_shadow tsc tsc_adjust tsc_deadline_timer vme vmx vnmi vpid x2apic
    xgetbv1 xsave xsavec xsaveopt xsaves xtopology xtpr
  Vulnerabilities:
  Type: itlb_multihit status: KVM: VMX disabled
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: spec_store_bypass
    mitigation: Speculative Store Bypass disabled via prctl and seccomp
  Type: spectre_v1
    mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: Enhanced IBRS, IBPB: conditional, RSB filling
  Type: srbds mitigation: TSX disabled
  Type: tsx_async_abort mitigation: TSX disabled
Graphics:
  Device-1: Intel CoffeeLake-H GT2 [UHD Graphics 630] vendor: Dell
    driver: i915 v: kernel bus-ID: 00:02.0 chip-ID: 8086:3e9b class-ID: 0300
  Device-2: NVIDIA GP108M [GeForce MX150] driver: nvidia v: 510.47.03
    alternate: nouveau,nvidia_drm bus-ID: 02:00.0 chip-ID: 10de:1d10
    class-ID: 0302
  Device-3: Realtek Integrated_Webcam_HD type: USB driver: uvcvideo
    bus-ID: 1-11:3 chip-ID: 0bda:5532 class-ID: 0e02 serial: <filter>
  Display: x11 server: X.org 1.21.1.3 compositor: kwin_x11 driver:
    loaded: modesetting,nvidia unloaded: nouveau alternate: fbdev,nv,vesa
    resolution: <missing: xdpyinfo>
  OpenGL: renderer: Mesa Intel UHD Graphics 630 (CFL GT2) v: 4.6 Mesa 21.3.5
    direct render: Yes
Audio:
  Device-1: Intel Cannon Lake PCH cAVS vendor: Dell driver: snd_hda_intel
    v: kernel alternate: snd_soc_skl,snd_sof_pci_intel_cnl bus-ID: 00:1f.3
    chip-ID: 8086:a348 class-ID: 0403
  Sound Server-1: ALSA v: k5.15.19-1-MANJARO running: yes
  Sound Server-2: PulseAudio v: 15.0 running: no
  Sound Server-3: PipeWire v: 0.3.45 running: yes
Network:
  Device-1: Intel Cannon Lake PCH CNVi WiFi driver: iwlwifi v: kernel
    bus-ID: 00:14.3 chip-ID: 8086:a370 class-ID: 0280
  IF: wlo1 state: up mac: <filter>
  IP v4: <filter> type: dynamic noprefixroute scope: global
    broadcast: <filter>
  IP v6: <filter> type: noprefixroute scope: link
  Device-2: Intel Ethernet I219-LM vendor: Dell driver: e1000e v: kernel
    port: N/A bus-ID: 00:1f.6 chip-ID: 8086:15bb class-ID: 0200
  IF: eno2 state: down mac: <filter>
  WAN IP: <filter>
Bluetooth:
  Message: No bluetooth data found.
Logical:
  Message: No logical block device data found.
RAID:
  Message: No RAID data found.
Drives:
  Local Storage: total: 931.51 GiB used: 699.03 GiB (75.0%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: # THIS IS IRRELEVANT
    size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s
    lanes: 4 type: SSD serial: <filter> rev: S5Z44106 temp: 32.9 C scheme: GPT
  Message: No optical or floppy data found.
Partition:
# THIS IS IRRELEVANT
Swap:
  Alert: No swap data was found.
Unmounted:
# THIS IS IRRELEVANT
USB:
  Hub-1: 1-0:1 info: Hi-speed hub with single TT ports: 16 rev: 2.0
    speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Device-1: 1-3:2
    info: SHARKOON 2.4GHz Wireless rechargeable vertical mouse [More&Better]
    type: Mouse driver: hid-generic,usbhid interfaces: 1 rev: 1.1
    speed: 12 Mb/s power: 100mA chip-ID: 1ea7:0064 class-ID: 0301
  Device-2: 1-11:3 info: Realtek Integrated_Webcam_HD type: Video
    driver: uvcvideo interfaces: 4 rev: 2.0 speed: 480 Mb/s power: 500mA
    chip-ID: 0bda:5532 class-ID: 0e02 serial: <filter>
  Hub-2: 2-0:1 info: Super-speed hub ports: 10 rev: 3.1 speed: 10 Gb/s
    chip-ID: 1d6b:0003 class-ID: 0900
  Hub-3: 3-0:1 info: Hi-speed hub with single TT ports: 2 rev: 2.0
    speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Hub-4: 4-0:1 info: Super-speed hub ports: 2 rev: 3.1 speed: 10 Gb/s
    chip-ID: 1d6b:0003 class-ID: 0900
Sensors:
  System Temperatures: cpu: 61.0 C pch: 63.0 C mobo: N/A
  Fan Speeds (RPM): cpu: 3610
Info:
  Processes: 286 Uptime: 2h 50m wakeups: 54592 Init: systemd v: 250
  tool: systemctl Compilers: gcc: 11.1.0 clang: 13.0.0 Packages: 1395
  pacman: 1388 lib: 359 flatpak: 0 snap: 7 Shell: Zsh v: 5.8 default: Bash
  v: 5.1.16 running-in: konsole inxi: 3.3.12

Available drivers:

mhwd -l                                                                  
> 0000:02:00.0 (0302:10de:1d10) Display controller nVidia Corporation:
--------------------------------------------------------------------------------
                  NAME               VERSION          FREEDRIVER           TYPE
--------------------------------------------------------------------------------
video-hybrid-intel-nvidia-prime            2021.12.18               false            PCI
video-hybrid-intel-nvidia-470xx-prime            2021.12.18               false            PCI
video-hybrid-intel-nvidia-390xx-bumblebee            2021.12.18               false            PCI
          video-nvidia            2021.12.18               false            PCI
    video-nvidia-470xx            2021.12.18               false            PCI
    video-nvidia-390xx            2021.12.18               false            PCI
           video-linux            2018.05.04                true            PCI


> 0000:00:02.0 (0300:8086:3e9b) Display controller Intel Corporation:
--------------------------------------------------------------------------------
                  NAME               VERSION          FREEDRIVER           TYPE
--------------------------------------------------------------------------------
video-hybrid-intel-nvidia-prime            2021.12.18               false            PCI
video-hybrid-intel-nvidia-470xx-prime            2021.12.18               false            PCI
video-hybrid-intel-nvidia-390xx-bumblebee            2021.12.18               false            PCI
           video-linux            2018.05.04                true            PCI
     video-modesetting            2020.01.13                true            PCI
            video-vesa            2017.03.12                true            PCI

Confirmation that the drivers are installed:

mhwd -li                                                                  
> Installed PCI configs:
--------------------------------------------------------------------------------
                  NAME               VERSION          FREEDRIVER           TYPE
--------------------------------------------------------------------------------
           video-linux            2018.05.04                true            PCI
     video-modesetting            2020.01.13                true            PCI
video-hybrid-intel-nvidia-prime            2021.12.18               false            PCI


Warning: No installed USB configs!

Indeed, everything seems to be in order, aside from your BIOS, which is not on the latest version.

Use nvidia-smi to check whether the Nvidia card is locked in a low power state; I have already seen that in dual-boot situations.

Give the output of nvidia-smi or prime-run nvidia-smi while you are using the video card in a Steam game with prime-run %command%

The GPU performance state APIs are used to get and set various performance levels on a per-GPU basis. P-States are GPU active/executing performance capability and power consumption states.

P-States range from P0 to P15, with P0 being the highest performance/power state, and P15 being the lowest performance/power state. Each P-State maps to a performance level. Not all P-States are available on a given system. The definition of each P-States are currently as follows:

  • P0/P1 - Maximum 3D performance
  • P2/P3 - Balanced 3D performance-power
  • P8 - Basic HD video playback
  • P10 - DVD playback
  • P12 - Minimum idle power consumption
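
As a rough sketch, the list above can be turned into a small lookup so the Perf column of nvidia-smi is easier to read. The function name `pstate_meaning` is made up, and the descriptions are copied from the list, nothing more.

```shell
#!/bin/sh
# Map a P-State label to the description given in the list above.
pstate_meaning() {
    case "$1" in
        P0|P1)          echo "Maximum 3D performance" ;;
        P2|P3)          echo "Balanced 3D performance-power" ;;
        P8)             echo "Basic HD video playback" ;;
        P10)            echo "DVD playback" ;;
        P12)            echo "Minimum idle power consumption" ;;
        P[0-9]|P1[0-5]) echo "Valid P-State, no description in the list" ;;
        *)              echo "Not a P-State" ;;
    esac
}

# Possible use on a live system (not executed here):
#   pstate_meaning "$(nvidia-smi --query-gpu=pstate --format=csv,noheader)"
```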

I think I found something (thanks to @omano).
When the GPU is not in use, it drops to a low performance level as expected (the GPU clock sits at 34 MHz).

…but when the GPU is under heavy load (rendering complex shaders where I get at most 10 FPS, for example some from www.shadertoy.com), its clock does not reach the maximum speed. It goes up to 433 MHz and never exceeds that value, while the memory transfer rate is at its maximum.

Result of nvidia-smi and prime-run nvidia-smi (both were exactly the same):
When idle:

nvidia-smi                                                                                   
Mon Feb  7 20:39:34 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| N/A   52C    P8    N/A /  N/A |      5MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       624      G   /usr/lib/Xorg                       4MiB |
+-----------------------------------------------------------------------------+

When under load:

nvidia-smi                                                        
Mon Feb  7 20:52:51 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| N/A   59C    P0    N/A /  N/A |     78MiB /  2048MiB |     95%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       624      G   /usr/lib/Xorg                      34MiB |
|    0   N/A  N/A     16718      G   /usr/lib/firefox/firefox            6MiB |
|    0   N/A  N/A     16788      G   /usr/lib/firefox/firefox           36MiB |
+-----------------------------------------------------------------------------+

I have also tested this with vkmark, glmark2, and a Steam benchmarking demo “game”; the result is always the same.

Does this indicate that the system may be using an inefficient method of copying framebuffers from the Nvidia GPU into RAM, or something similar?

(Laptop screens are usually connected to the integrated GPU.)

It seems it goes to performance state P0 when in use, so this is not the issue I was thinking of (being locked in the P8 state all the time). But if it never gets near the max clock, there may indeed be some other issue. No idea for now.
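
To put a number on "never reaches near the max clock", one option is to ask nvidia-smi for the current and maximum graphics clocks and compare them. A minimal sketch follows; `summarize_clock` is a hypothetical helper, and the 1000 MHz maximum in the usage note is a made-up round number for illustration, not the MX150's real limit.

```shell
#!/bin/sh
# Read one CSV line "pstate, current_MHz, max_MHz" on stdin and print
# "<pstate> <current>/<max> MHz (<percent>%)".
# Lines with a missing or zero maximum are skipped to avoid dividing by zero.
summarize_clock() {
    awk -F', *' '$3 > 0 {
        printf "%s %d/%d MHz (%d%%)\n", $1, $2, $3, $2 * 100 / $3
    }'
}

# Possible use on a live system while a benchmark is running (not executed here):
#   nvidia-smi --query-gpu=pstate,clocks.gr,clocks.max.gr \
#              --format=csv,noheader,nounits | summarize_clock
#
# The reasons the driver reports for limiting clocks can be listed with:
#   nvidia-smi -q -d PERFORMANCE
```

For example, feeding it `P0, 433, 1000` prints `P0 433/1000 MHz (43%)`. A low percentage while in P0 under full load would match what is described above, and the throttle reasons from the second command may hint at whether power, thermal, or some other cap is involved.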