Nvidia-settings and nvidia-smi hangs + other related issues on Manjaro KDE

Hi
My laptop has an Nvidia GT 750M alongside 4th-gen integrated Intel graphics and runs Manjaro-KDE-minimal. I have video-hybrid-intel-nvidia-390xx-bumblebee installed. The problem is that if I launch nvidia-settings from the application launcher, it just doesn’t open. If I do it from a terminal, I don’t even get any output. Same behaviour with nvidia-smi.

As a side effect, shutdown or restart takes extremely long and displays Waiting for process: nvidia-settings, Xorg

What’s even weirder is that this does not happen every time! Roughly once every 20 reboots, everything just works fine: I can launch nvidia-settings and the shutdown/reboot is also smooth. Please help me with this. :pleading_face: :pray:

I am not very experienced with this, but I still tried to poke around, try solutions and investigate to the best of my knowledge, as detailed below:

  1. If I run pgrep nvidia-settings to get the PID and try to kill it with kill -9 pid, it just does not get killed!

  2. Adding nouveau.modeset=0 to GRUB_CMDLINE_LINUX_DEFAULT=" " does nothing for me, so I reverted it.

  3. I have tried ditching bumblebee and using optimus-manager, but if I switch to nvidia with optimus-manager --switch nvidia, the screen goes blank after the login screen.

  4. Tried running nvidia-smi -pm 1 at boot as a systemd service. This also did nothing for me, but there is an interesting observation. If I check the status of the service I created, it mostly just reports success and nothing special. Remember I mentioned that once every few reboots everything just works fine? Whenever that happens, the service status additionally shows output saying persistence mode is already enabled.

  5. If I comment out blacklist nvidia-drm in /lib/modprobe.d/bumblebee.conf, the issue is resolved: nvidia-settings and nvidia-smi work fine. But now some apps won’t launch. E.g., if I launch kitty, I get:

$ kitty
[286 13:33:58.376165] [glfw error 65542]: GLX: GLX extension not found
[286 13:33:58.376211] Failed to create GLFW temp window! This usually happens because of old/broken OpenGL drivers. kitty requires working OpenGL 3.3 drivers.

But if I execute optirun kitty, it just works!!
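A possible way to look into item 1: a process that survives kill -9 is typically stuck in uninterruptible sleep (state D), blocked inside a kernel call, which here could be a call into the nvidia driver. The sketch below assumes the procps versions of pgrep and ps; proc_state is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# Print PID, state and kernel wait channel for every process with the given name.
# A STAT value starting with "D" means uninterruptible sleep: the process is
# blocked inside the kernel, and signals (including SIGKILL) are only acted on
# once that kernel call returns.
proc_state() {
    # Join all matching PIDs into a comma-separated list for ps -p.
    pids=$(pgrep -x "$1" | paste -sd, - || true)
    if [ -z "$pids" ]; then
        echo "no process named $1"
        return 0
    fi
    ps -o pid,stat,wchan:32,comm -p "$pids"
}

proc_state nvidia-settings
proc_state Xorg
```

A STAT column starting with D for nvidia-settings would be consistent with the shutdown message Waiting for process: nvidia-settings, Xorg, and the WCHAN column may hint at which kernel function it is waiting in.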

I recently switched to Manjaro KDE from Manjaro GNOME as some GNOME extensions were giving me stability issues, but now this is giving me a whole new headache. Btw, this issue was non-existent in GNOME. I have spent a lot of time and effort on this and I don’t want to switch back to GNOME because I love this.

I am a student and want to get started with learning CUDA, so this is kinda necessary. The hardware is getting somewhat old and I have started saving for a new device, but this one gets most of the job done and I think it is wiser to keep it going as long as I can, as GPU prices are yet to become reasonable.

Here is the output of inxi --admin --verbosity=7 --filter --no-host --width:

System:
  Kernel: 5.4.150-1-MANJARO x86_64 bits: 64 compiler: gcc v: 11.1.0
  parameters: BOOT_IMAGE=/boot/vmlinuz-5.4-x86_64
  root=UUID=861afff4-8652-4a6a-b469-c5a1836e69d0 rw udev.log_priority=3
  Desktop: KDE Plasma 5.22.5 tk: Qt 5.15.2 wm: kwin_x11 vt: 1 dm: SDDM
  Distro: Manjaro Linux base: Arch Linux
Machine:
  Type: Portable System: Dell product: Inspiron 7737 v: N/A serial: <filter>
  Chassis: type: 8 v: 0.1 serial: <filter>
  Mobo: Dell model: 00V878 v: A00 serial: <filter> UEFI: Dell v: A16
  date: 05/24/2018
Memory:
  RAM: total: 7.67 GiB used: 5.18 GiB (67.5%)
  RAM Report: permissions: Unable to run dmidecode. Root privileges required.
CPU:
  Info: Dual Core model: Intel Core i7-4510U bits: 64 type: MT MCP
  arch: Haswell family: 6 model-id: 45 (69) stepping: 1 microcode: 26 cache:
  L2: 4 MiB bogomips: 15968
  Speed: 2239 MHz min/max: 800/3100 MHz Core speeds (MHz): 1: 2239 2: 2133
  3: 2229 4: 1771
  Flags: abm acpi aes aperfmperf apic arat arch_perfmon avx avx2 bmi1 bmi2 bts
  clflush cmov constant_tsc cpuid cpuid_fault cx16 cx8 de ds_cpl dtes64 dtherm
  dts epb ept ept_ad erms est f16c flexpriority flush_l1d fma fpu fsgsbase
  fxsr ht ibpb ibrs ida invpcid invpcid_single lahf_lm lm mca mce md_clear mmx
  monitor movbe msr mtrr nonstop_tsc nopl nx pae pat pbe pcid pclmulqdq pdcm
  pdpe1gb pebs pge pln pni popcnt pse pse36 pti pts rdrand rdtscp rep_good
  sdbg sep smep ss ssbd sse sse2 sse4_1 sse4_2 ssse3 stibp syscall tm tm2
  tpr_shadow tsc tsc_adjust tsc_deadline_timer vme vmx vnmi vpid xsave
  xsaveopt xtopology xtpr
  Vulnerabilities: Type: itlb_multihit status: KVM: Split huge pages
  Type: l1tf
  mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
  Type: mds mitigation: Clear CPU buffers; SMT vulnerable
  Type: meltdown mitigation: PTI
  Type: spec_store_bypass
  mitigation: Speculative Store Bypass disabled via prctl and seccomp
  Type: spectre_v1
  mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: Full generic retpoline, IBPB: conditional,
  IBRS_FW, STIBP: conditional, RSB filling
  Type: srbds mitigation: Microcode
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: Intel Haswell-ULT Integrated Graphics vendor: Dell driver: i915
  v: kernel bus-ID: 00:02.0 chip-ID: 8086:0a16 class-ID: 0300
  Device-2: NVIDIA GK107M [GeForce GT 750M] vendor: Dell driver: nvidia
  v: 390.144 alternate: nouveau,nvidia_drm bus-ID: 04:00.0 chip-ID: 10de:0fe4
  class-ID: 0302
  Device-3: Logitech Webcam C270 type: USB driver: snd-usb-audio,uvcvideo
  bus-ID: 2-2:2 chip-ID: 046d:0825 class-ID: 0102 serial: <filter>
  Device-4: Microdia Integrated HD Webcam type: USB driver: uvcvideo
  bus-ID: 2-5:8 chip-ID: 0c45:6705 class-ID: 0e02
  Display: x11 server: X.Org 1.20.13 compositor: kwin_x11 driver:
  loaded: intel display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1080 s-dpi: 96 s-size: 508x285mm (20.0x11.2")
  s-diag: 582mm (22.9")
  OpenGL: renderer: N/A v: N/A direct render: N/A
Audio:
  Device-1: Intel Haswell-ULT HD Audio vendor: Dell driver: snd_hda_intel
  v: kernel bus-ID: 00:03.0 chip-ID: 8086:0a0c class-ID: 0403
  Device-2: Intel 8 Series HD Audio vendor: Dell driver: snd_hda_intel
  v: kernel bus-ID: 00:1b.0 chip-ID: 8086:9c20 class-ID: 0403
  Device-3: Logitech Webcam C270 type: USB driver: snd-usb-audio,uvcvideo
  bus-ID: 2-2:2 chip-ID: 046d:0825 class-ID: 0102 serial: <filter>
  Sound Server-1: ALSA v: k5.4.150-1-MANJARO running: yes
  Sound Server-2: JACK v: 1.9.19 running: no
  Sound Server-3: PulseAudio v: 15.0 running: yes
  Sound Server-4: PipeWire v: 0.3.38 running: yes
Network:
  Device-1: Intel Wireless 7260 driver: iwlwifi v: kernel bus-ID: 02:00.0
  chip-ID: 8086:08b1 class-ID: 0280
  IF: wlp2s0 state: up mac: <filter>
  IP v4: <filter> type: dynamic noprefixroute scope: global
  broadcast: <filter>
  IP v6: <filter> type: noprefixroute scope: link
  Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
  vendor: Dell driver: r8169 v: kernel port: 4000 bus-ID: 03:00.1
  chip-ID: 10ec:8168 class-ID: 0200
  IF: enp3s0f1 state: down mac: <filter>
  WAN IP: <filter>
Bluetooth:
  Device-1: Intel Bluetooth wireless interface type: USB driver: btusb v: 0.8
  bus-ID: 2-6:5 chip-ID: 8087:07dc class-ID: e001
  Report: rfkill ID: hci0 rfk-id: 2 state: up address: see --recommends
Logical:
  Message: No logical block device data found.
RAID:
  Message: No RAID data found.
Drives:
  Local Storage: total: 1.35 TiB used: 727.92 GiB (52.8%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/sda maj-min: 8:0 vendor: Seagate model: ST1000LM014-1EJ164
  size: 931.51 GiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
  type: HDD rpm: 5400 serial: <filter> rev: DEMA scheme: GPT
  ID-2: /dev/sdb maj-min: 8:16 vendor: Kingston model: SA400S37480G
  size: 447.13 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
  type: SSD serial: <filter> rev: 0102 scheme: GPT
  Message: No optical or floppy data found.
Partition:
  ID-1: / raw-size: 108 GiB size: 105.75 GiB (97.91%) used: 13.28 GiB (12.6%)
  fs: ext4 dev: /dev/sdb6 maj-min: 8:22 label: N/A
  uuid: 861afff4-8652-4a6a-b469-c5a1836e69d0
  ID-2: /boot/efi raw-size: 529 MiB size: 527.9 MiB (99.80%)
  used: 288 KiB (0.1%) fs: vfat dev: /dev/sdb1 maj-min: 8:17 label: N/A
  uuid: 83C7-6256
  ID-3: /mnt/Data raw-size: 931.51 GiB size: 931.51 GiB (100.00%)
  used: 714.64 GiB (76.7%) fs: ntfs dev: /dev/sda1 maj-min: 8:1 label: Data
  uuid: FAFAA162FAA11BBF
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default)
  ID-1: swap-1 type: partition size: 8 GiB used: 0 KiB (0.0%) priority: -2
  dev: /dev/sdb5 maj-min: 8:21 label: N/A
  uuid: 9c590295-046a-4331-b659-74d8e8417800
Unmounted:
  ID-1: /dev/sdb2 maj-min: 8:18 size: 100 MiB fs: vfat label: N/A
  uuid: AE96-81B1
  ID-2: /dev/sdb3 maj-min: 8:19 size: 16 MiB fs: <superuser required>
  label: N/A uuid: N/A
  ID-3: /dev/sdb4 maj-min: 8:20 size: 330.5 GiB fs: ntfs label: N/A
  uuid: E2B0A168B0A1443F
USB:
  Hub-1: 1-0:1 info: Full speed (or root) Hub ports: 3 rev: 2.0
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Hub-2: 1-1:2 info: Intel Integrated Rate Matching Hub ports: 8 rev: 2.0
  speed: 480 Mb/s chip-ID: 8087:8000 class-ID: 0900
  Hub-3: 2-0:1 info: Full speed (or root) Hub ports: 9 rev: 2.0
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Device-1: 2-2:2 info: Logitech Webcam C270 type: Video,Audio
  driver: snd-usb-audio,uvcvideo interfaces: 4 rev: 2.0 speed: 480 Mb/s
  power: 500mA chip-ID: 046d:0825 class-ID: 0102 serial: <filter>
  Device-2: 2-4:3 info: MosArt Wireless Keyboard/Mouse type: Keyboard,Mouse
  driver: hid-generic,usbhid interfaces: 2 rev: 1.1 speed: 12 Mb/s
  power: 100mA chip-ID: 062a:4101 class-ID: 0301
  Device-3: 2-5:8 info: Microdia Integrated HD Webcam type: Video
  driver: uvcvideo interfaces: 2 rev: 2.0 speed: 480 Mb/s power: 500mA
  chip-ID: 0c45:6705 class-ID: 0e02
  Device-4: 2-6:5 info: Intel Bluetooth wireless interface type: Bluetooth
  driver: btusb interfaces: 2 rev: 2.0 speed: 12 Mb/s power: 100mA
  chip-ID: 8087:07dc class-ID: e001
  Device-5: 2-7:6 info: Elan Micro Touchscreen type: HID
  driver: hid-multitouch,usbhid interfaces: 1 rev: 2.0 speed: 12 Mb/s
  power: 100mA chip-ID: 04f3:031d class-ID: 0300
  Hub-4: 3-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.0 speed: 5 Gb/s
  chip-ID: 1d6b:0003 class-ID: 0900
Sensors:
  System Temperatures: cpu: 42.0 C mobo: 37.0 C
  Fan Speeds (RPM): cpu: 0
Info:
  Processes: 296 Uptime: 2h 10m wakeups: 2785 Init: systemd v: 249
  tool: systemctl Compilers: gcc: 11.1.0 Packages: pacman: 1203 lib: 347
  Shell: Zsh v: 5.8 default: Bash v: 5.1.8 running-in: konsole inxi: 3.3.07

Here is the output of sudo dmesg | grep -C 10 bbswitch :

[    2.749726] RAPL PMU: hw unit of domain pp0-core 2^-14 Joules
[    2.749727] RAPL PMU: hw unit of domain package 2^-14 Joules
[    2.749728] RAPL PMU: hw unit of domain dram 2^-14 Joules
[    2.749728] RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
[    2.779226] audit: type=1130 audit(1634085908.573:10): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-tmpfiles-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    2.792649] proc: Bad value for 'hidepid'
[    2.830593] dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.3)
[    2.846260] cryptd: max_cpu_qlen set to 1000
[    2.953393] AVX2 version of gcm_enc/dec engaged.
[    2.953394] AES CTR mode by8 optimization enabled
[    2.958346] bbswitch: loading out-of-tree module taints kernel.
[    2.959163] bbswitch: module verification failed: signature and/or required key missing - tainting kernel
[    2.962128] bbswitch: version 0.8
[    2.962135] bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.GFX0
[    2.962142] bbswitch: Found discrete VGA device 0000:04:00.0: \_SB_.PCI0.RP05.PEGP
[    2.962154] ACPI Warning: \_SB.PCI0.RP05.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20190816/nsarguments-59)
[    2.962429] bbswitch: detected an Optimus _DSM function
[    2.962442] pci 0000:04:00.0: enabling device (0006 -> 0007)
[    2.962520] bbswitch: disabling discrete graphics
[    2.983424] proc: Bad value for 'hidepid'
[    3.013511] psmouse serio1: synaptics: queried max coordinates: x [..5660], y [..4642]
[    3.046408] psmouse serio1: synaptics: queried min coordinates: x [1366..], y [1298..]
[    3.046416] psmouse serio1: synaptics: Your touchpad (PNP: DLL05fb SYN0600 SYN0002 PNP0f13) says it can support a different bus. If i2c-hid and hid-rmi are not used, you might want to try setting psmouse.synaptics_intertouch to 1 and report this to linux-input@vger.kernel.org.
[    3.056324] input: Dell WMI hotkeys as /devices/platform/PNP0C14:00/wmi_bus/wmi_bus-PNP0C14:00/PNP0C14:00-9DBB5994-A997-11DA-B012-B622A1EF5492/input/input8
[    3.080124] iTCO_vendor_support: vendor-support=0
[    3.085029] iwlwifi 0000:02:00.0: Detected Intel(R) Dual Band Wireless N 7260, REV=0x144
[    3.103487] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
[    3.103618] iTCO_wdt: Found a Lynx Point_LP TCO device (Version=2, TCOBASE=0x1860)
[    3.104200] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
--
[    3.197995] snd_hda_codec_realtek hdaudioC1D0:      Internal Mic=0x12
[    3.246835] r8169 0000:03:00.1 enp3s0f1: renamed from eth0
[    3.297560] checking generic (c0000000 7f0000) vs hw (c0000000 10000000)
[    3.297562] fb0: switching to inteldrmfb from EFI VGA
[    3.297983] Console: switching to colour dummy device 80x25
[    3.298027] i915 0000:00:02.0: vgaarb: deactivate vga console
[    3.298915] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    3.298916] [drm] Driver supports precise vblank timestamp query.
[    3.299928] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    3.307578] intel-spi intel-spi: mx25l6405d (8192 Kbytes)
[    3.319076] bbswitch: Succesfully loaded. Discrete card 0000:04:00.0 is off
[    3.331414] ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
[    3.337800] Creating 1 MTD partitions on "intel-spi":
[    3.337805] 0x000000000000-0x000000800000 : "BIOS"
[    3.345785] [drm] Initialized i915 1.6.0 20190822 for 0000:00:02.0 on minor 0
[    3.347104] ACPI: Video Device [PEGP] (multi-head: no  rom: yes  post: no)
[    3.347172] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:31/LNXVIDEO:00/input/input9
[    3.356122] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
[    3.361153] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:01/input/input10
[    3.363576] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    3.363716] acpi device:28: Cannot transition to power state D3hot for parent in (unknown)
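
The dmesg output shows bbswitch powering the discrete card off at boot. To see the card's current power state while nvidia-smi hangs, something like the following could be used. It assumes bbswitch reports its status in /proc/acpi/bbswitch as a single line of the form "0000:04:00.0 OFF"; bbswitch_state is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether bbswitch currently has the discrete GPU powered on,
# and which nvidia kernel modules are loaded.

# Extract the ON/OFF field from a status line such as "0000:04:00.0 OFF".
bbswitch_state() {
    printf '%s\n' "$1" | awk '{print $2}'
}

if [ -r /proc/acpi/bbswitch ]; then
    status=$(cat /proc/acpi/bbswitch)
    echo "discrete GPU power state: $(bbswitch_state "$status")"
else
    echo "bbswitch is not loaded (no /proc/acpi/bbswitch)"
fi

# Modules such as nvidia, nvidia_modeset, nvidia_drm or nvidia_uvm
# are listed here if they are currently loaded.
lsmod 2>/dev/null | awk '$1 ~ /^nvidia/ {print "loaded module:", $1}' || true
```

Comparing this output between a boot where nvidia-settings hangs and one of the rare boots where it works might show whether the difference lies in the card's power state or in which modules got loaded.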

TIA

Hello,

Here you have a typo, it’s nouveau. :slight_smile: But that’s beside the point.

Am… Have you tried installing only nvidia-390xx (through mhwd) and using only the nVidia card?
The package name is mhwd-nvidia-390xx, I think, and it’s in the core repo.
If not, try that and see what happens.

Thanks for the reply. Fixed the typo.

Yes, I have tried using only NVIDIA.

It was a fresh install when I first encountered the issue. The first thing I did then was remove everything else and installed NVIDIA-Linux-x86_64-418.113 downloaded from the NVIDIA website. Things got choppy and not very responsive. I had no idea how to revert so ended up nuking everything and did a fresh install again.

I tried and I get this,

$ sudo mhwd -i pci mhwd-nvidia-390xx    
Error: config 'mhwd-nvidia-390xx' does not exist!

There seems to be a mhwd-nvidia-390xx in AUR. Should I be trying that, perhaps? Need I remove stuff before installing that?

No.

Yes.

I don’t know the exact command order, but this page will tell you what you should do. Just read through it and it will give you the commands and all.

https://wiki.manjaro.org/index.php/Configure_Graphics_Cards

EDIT

  1. Go to the remove section and remove the current driver you have
  2. sudo mhwd -a pci nonfree 0300
  3. The package name is nvidia-390xx confirmed through pamac

It did not go well. Just posting what I did here for future reference:

  1. Removed current drivers with
sudo mhwd -r pci video-hybrid-intel-nvidia-390xx-bumblebee
  2. As suggested,
sudo mhwd -a pci nonfree 0300

This brought back everything that had just been uninstalled, not only NVIDIA. Rebooted a couple of times to confirm that the original issue still persists.

  3. Installed nvidia-390xx with pamac,
pamac install nvidia-390xx

This was actually from AUR. It errored out during install and things got busted. Couldn’t even get to a tty on the next boot. Had to chroot from live media and let mhwd reinstall video-hybrid-intel-nvidia-390xx-bumblebee.

Don’t want to leave KDE. So, I guess I’ll have to live with this now :disappointed_relieved:

Hello, don’t give up.

You shouldn’t have installed from AUR. Maybe it didn’t work because you forgot to uninstall the previous driver from step 2.

This is what I got from pacman

$ pacman -Ss nvidia-390xx

core/mhwd-nvidia-390xx 390.144-1 [installed]  # this is installed with mhwd-db
    MHWD module-ids for nvidia 390.144
extra/linux54-nvidia-390xx 390.144-17 (linux54-extramodules)
    NVIDIA drivers for linux.
extra/nvidia-390xx-utils 390.144-1
    NVIDIA drivers utilities
extra/opencl-nvidia-390xx 390.144-1
    OpenCL implemention for NVIDIA
multilib/lib32-nvidia-390xx-utils 390.144-1
    NVIDIA drivers utilities (32-bit)
multilib/lib32-opencl-nvidia-390xx 390.144-1
    OpenCL implemention for NVIDIA (32-bit)

Also try to run mhwd -l, or mhwd -l -d for a detailed list. There must be a name for the nvidia-390xx driver.

All graphics card drivers will have the prefix (video-) in their name.

This is from the Manjaro wiki page that I posted in my previous post.
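
To find the exact config name, the mhwd listings can be filtered down to the video- entries. This is only a sketch: it assumes the config name is the first column of the mhwd -l and mhwd -li output, and video_configs is a hypothetical helper name.

```shell
#!/bin/sh
# Print only the config names that start with "video-" from an mhwd listing
# read on standard input (the name is assumed to be the first column).
video_configs() {
    awk '$1 ~ /^video-/ {print $1}'
}

if command -v mhwd >/dev/null 2>&1; then
    echo "Installed configs:"
    mhwd -li | video_configs
    echo "Configs available for this hardware:"
    mhwd -l | video_configs
else
    echo "mhwd not found (not a Manjaro system?)"
fi
```

Whatever name shows up under the available configs is the one to pass to sudo mhwd -i pci, and the name under the installed configs is the one to pass to sudo mhwd -r pci.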

So, I removed with,
sudo mhwd -r pci video-hybrid-intel-nvidia-390xx-bumblebee

And then installed with,
sudo mhwd -i pci video-nvidia-390xx

At next boot stuck at,

[ OK ] Reached target Graphical Interface.
           Starting TLP system startup/shutdown...

Would not show me the login screen or let me get to tty.
So, again ended up installing video-hybrid-intel-nvidia-390xx-bumblebee through chroot.

This is totally strange.
I googled around and couldn’t find much useful info.

Maybe you could try the same as above.

But before you install nvidia-390xx, remove these files if they exist → check first!

sudo rm /etc/X11/xorg.conf.d/90-mhwd.conf
sudo rm /etc/modprobe.d/mhwd-gpu.conf
sudo rm /etc/modules-load.d/mhwd-gpu.conf
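
For the "check first" part, a small sketch that only lists which of those three files are actually present and deletes nothing; present_files is a hypothetical helper name.

```shell
#!/bin/sh
# Print each of the given paths that exists; nothing is modified or removed.
present_files() {
    for f in "$@"; do
        [ -e "$f" ] && echo "$f"
    done
    return 0
}

present_files \
    /etc/X11/xorg.conf.d/90-mhwd.conf \
    /etc/modprobe.d/mhwd-gpu.conf \
    /etc/modules-load.d/mhwd-gpu.conf
```

Only the paths it prints need to be removed with sudo rm; it may be worth copying them somewhere first so they can be restored from a chroot if the next boot fails.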

If it doesn’t work, idk sorry. Maybe someone with more knowledge can help.

Okay, lemme try this as well.

I think I will ask in the KDE forums. I suspect it’s more likely to be something to do with KDE-NVIDIA.

Thank you very much for your time!!