Primus could not load gpu driver

bumblebee
kernel
nvidia

#13

Yeah, sorry about that. Brainfart. I meant: did you use mhwd to change the kernel?
Because that installs all the necessary modules for NVIDIA graphics as well.


#14

Couldn’t it be this “simple” TLP error, which prevents nvidia from loading once the laptop has run on battery (even once)? If so, there is an easy fix for that:

Find the output of: $ lspci | grep "NVIDIA" | cut -b -8

Open: sudo nano /etc/default/tlp

Put the output of the first command into RUNTIME_PM_BLACKLIST so it looks like:

RUNTIME_PM_BLACKLIST="01:00.0"

Reboot.
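
If it helps, here is a minimal sketch of the same steps as a small shell helper. The function names nvidia_pci_addr and tlp_blacklist_line are made up for illustration (they are not part of TLP or pciutils), and the sketch assumes lspci prints the PCI address as the first field of each line. It takes the whole first field, so no trailing space ends up in the value.

```shell
#!/bin/sh
# Hypothetical helper: print the PCI address (first field) of every
# NVIDIA device found in lspci-style text read from stdin.
nvidia_pci_addr() {
    awk '/NVIDIA/ { print $1 }'
}

# Hypothetical helper: build the TLP setting from those addresses
# (space-separated if there are several).
tlp_blacklist_line() {
    addrs=$(nvidia_pci_addr | tr '\n' ' ' | sed 's/ *$//')
    printf 'RUNTIME_PM_BLACKLIST="%s"\n' "$addrs"
}

# Usage on a real machine (assumes lspci is installed):
#   lspci | tlp_blacklist_line
```

The printed line still has to be copied into /etc/default/tlp by hand; the sketch does not edit the file.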


#15

Hi again guys,

Here is some news:

@michaldybczak,
I tried setting RUNTIME_PM_BLACKLIST to 01:00.0 (that was the output of your command).
Unfortunately, no change. So I have commented out the whole line again (as it was set before).
In any case, thanks for the tip :slight_smile:

@Strit,
I didn’t use mhwd-kernel to change the kernel, but the GUI (Settings/Manjaro Settings Manager/Kernel).
It installed some additional stuff, like ndiswrapper and bbswitch.
I suppose that’s good. :slight_smile:

@FadeMind,
I have always had this PCIe bus error and I never understood what it means. Could you give some details about it? :slight_smile:
Indeed, the new dmesg is clean of this message.
http://pastebin.com/k1xtMpS4
Unfortunately, bumblebeed doesn’t seem to work any better. :confused:

In the dmesg log, I found another error. Could the issue be related to it?
[ 0.956952] ACPI Error: No handler for Region [EC__] (ffff8802760eb5a0) [EmbeddedControl] (20160930/evregion-166)
[ 0.956959] ACPI Error: Region EmbeddedControl (ID=3) has no handler (20160930/exfldio-299)
[ 0.956963] ACPI Error: Method parse/execution failed [\_SB.PCI0.LPCB.EC._REG] (Node ffff8802760ec438), AE_NOT_EXIST (20160930/psparse-543)

@All,
Have a good weekend morning!

Cheers :slight_smile:


#16
[   89.335414] bbswitch: loading out-of-tree module taints kernel.
[   89.335844] bbswitch: version 0.8
[   89.335848] bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.GFX0
[   89.335852] bbswitch: Found discrete VGA device 0000:01:00.0: \_SB_.PCI0.PEG0.PEGP
[   89.335859] ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160930/nsarguments-95)
[   89.335941] bbswitch: detected an Optimus _DSM function
[   89.336034] bbswitch: disabling discrete graphics
[   89.336037] ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160930/nsarguments-95)
[   89.541315] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is off
[   99.320502] bbswitch: enabling discrete graphics

Hmmm

IMO, because you have a major issue in the ACPI DSDT table (ACPI errors for \_SB.PCI0.LPCB.EC._REG), bbswitch can switch the NVIDIA card on, but the kernel cannot handle it at all.

Why do you have duplicated parameters in:

Command line: BOOT_IMAGE=/vmlinuz-4.10-x86_64 root=UUID=29914461-0fc4-413a-bcb0-eff0d38d2445 rw pci=nomsi quiet splash pci=nomsi quiet splash resume=UUID=c423e44a-2435-4081-b93d-cdabc971148e

Paste grub conf:

cat /etc/default/grub

Check command:

optirun nvidia-smi

too.
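
To see what bbswitch itself reports for the card, you can also read /proc/acpi/bbswitch, which prints the PCI address followed by ON or OFF. A small sketch for pulling out just the state (bbswitch_state is a hypothetical helper name, and it assumes the bbswitch module is loaded):

```shell
#!/bin/sh
# Hypothetical helper: read bbswitch-style status text such as
# "0000:01:00.0 ON" from stdin and print only the power state.
bbswitch_state() {
    awk '{ print $2; exit }'
}

# Usage on a real machine:
#   bbswitch_state < /proc/acpi/bbswitch
```

If this prints ON while optirun still fails, the card is powered and the problem is more likely in loading the driver.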


#17

@FadeMind,
The duplicated parameters come from having the same values in both GRUB_CMDLINE_LINUX_DEFAULT and GRUB_CMDLINE_LINUX.
Here is the head of the grub file:

GRUB_DEFAULT=saved
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Manjaro"
GRUB_CMDLINE_LINUX_DEFAULT="pci=nomsi quiet splash resume=UUID=c423e44a-2435-4081-b93d-cdabc971148e"
GRUB_CMDLINE_LINUX="pci=nomsi quiet splash"

(I did this duplication myself. Shouldn’t I have?)

And below, the output of optirun nvidia-smi:

[ 407.104220] [ERROR]Cannot access secondary GPU - error: Could not load GPU driver
[ 407.104287] [ERROR]Aborting because fallback start is disabled.

Cheers.


#18

Change from

GRUB_CMDLINE_LINUX="pci=nomsi quiet splash"

to

GRUB_CMDLINE_LINUX=""

Update grub:

sudo update-grub

Leave GRUB_CMDLINE_LINUX_DEFAULT unchanged.

Try an OLDER kernel (the 4.1 series, if 4.4 and newer have the issue).
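
After rebooting, one way to confirm the duplicates are gone is to look for repeated parameters in /proc/cmdline. A rough sketch, where dup_kernel_params is just an illustrative helper name and parameters are assumed to be separated by spaces (a quoted value containing spaces would be split incorrectly):

```shell
#!/bin/sh
# Hypothetical helper: print every kernel parameter that occurs more
# than once in the given command line string (one per line, sorted).
dup_kernel_params() {
    printf '%s\n' "$1" | tr -s ' ' '\n' | sort | uniq -d
}

# Usage on a running system:
#   dup_kernel_params "$(cat /proc/cmdline)"
```

No output means every parameter appears only once.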


#19

@FadeMind,
Oops, where’s my head today. Sorry :confused:
GRUB_CMDLINE_LINUX cleaned.
Here is the output of dmesg while running 4.1.
http://pastebin.com/EeBckdjG

Unfortunately, I’m not able to reach the GUI with this kernel. (I used tty2 to get this output)
And with kernel 4.4, I didn’t notice any errors. But primusrun/optirun still cannot load the secondary GPU.

Cheers.


#20

Please use Linux 4.10 (the latest) for now.


#21

dmesg while running 4.10.
http://pastebin.com/yesqFwC2


#22

WARNING: CPU: 5 PID: 406 at kernel/workqueue.c:2424 check_flush_dependency+0x122/0x130
workqueue: WQ_MEM_RECLAIM hci0:hci_power_off [bluetooth] is flushing !WQ_MEM_RECLAIM events:btusb_work [btusb]

New bug reported upstream: https://bugzilla.redhat.com/show_bug.cgi?id=1417144

OMG

MSI made crappy firmware :scream: (ACPI errors) and is using crappy sub-chipsets :scream:


#23

Kernel 4.10 only works with Nvidia 378.13 which isn’t available yet to us.

https://devtalk.nvidia.com/default/topic/995636/linux/-patches-378-13-4-10-and-4-11-rc1/

When I boot into 4.10 currently, it crashes with traces:

Mar 12 11:20:47 xmg kernel: BUG: scheduling while atomic: kworker/u16:4/363/0x00000002
Mar 12 11:20:47 xmg kernel: Modules linked in: bnep fuse hid_generic usbhid 8812au(O) cfg80211 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media joydev mousedev ath3k btusb btrtl btbcm btintel bluetooth rfkill msr intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass snd_hda_codec_realtek crct10dif_pclmul snd_hda_codec_hdmi crc32_pclmul snd_hda_codec_generic crc32c_intel iTCO_wdt ghash_clmulni_intel iTCO_vendor_support pcbc snd_hda_intel rtsx_pci_ms mxm_wmi r8168(O) memstick aesni_intel aes_x86_64 crypto_simd glue_helper cryptd snd_hda_codec snd_soc_rt5640 mei_me intel_cstate ie31200_edac snd_soc_rl6231 snd_hda_core intel_rapl_perf snd_hwdep mei shpchp psmouse edac_core lpc_ich i2c_i801 input_leds thermal snd_soc_core snd_compress snd_pcm_dmaengine
Mar 12 11:20:47 xmg kernel:  snd_pcm snd_timer snd soundcore snd_soc_sst_acpi ac97_bus i2c_hid snd_soc_sst_match elan_i2c hid wmi fjes evdev i2c_designware_platform spi_pxa2xx_platform battery 8250_dw mac_hid i2c_designware_core tpm_crb ac tpm_tis tpm_tis_core tpm sch_fq_codel uinput ip_tables x_tables ext4 crc16 jbd2 fscrypto mbcache sd_mod serio_raw atkbd libps2 rtsx_pci_sdmmc xhci_pci ahci xhci_hcd libahci libata usbcore scsi_mod rtsx_pci usb_common i8042 sdhci_acpi serio sdhci led_class mmc_core nvidia_drm(PO) nvidia_uvm(PO) nvidia_modeset(PO) nvidia(PO) i915 video button intel_gtt i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm
Mar 12 11:20:47 xmg kernel: CPU: 2 PID: 363 Comm: kworker/u16:4 Tainted: P           O    4.10.1-1-MANJARO #1
Mar 12 11:20:47 xmg kernel: Hardware name: Clevo P65_P67SG                       /powered by premamod.com, BIOS 1.03.07PM v2 07/09/2015
Mar 12 11:20:47 xmg kernel: Workqueue: events_unbound intel_mmio_flip_work_func [i915]
Mar 12 11:20:47 xmg kernel: Call Trace:
Mar 12 11:20:47 xmg kernel:  dump_stack+0x63/0x83
Mar 12 11:20:47 xmg kernel:  __schedule_bug+0x54/0x70
Mar 12 11:20:47 xmg kernel:  __schedule+0x588/0x700
Mar 12 11:20:47 xmg kernel:  schedule+0x3d/0x90
Mar 12 11:20:47 xmg kernel:  schedule_preempt_disabled+0x15/0x20
Mar 12 11:20:47 xmg kernel:  __mutex_lock_slowpath+0x19b/0x2d0
Mar 12 11:20:47 xmg kernel:  mutex_lock+0x23/0x30
Mar 12 11:20:47 xmg kernel:  nvidia_drm_gem_prime_fence_op_signaled+0x2f/0x80 [nvidia_drm]
Mar 12 11:20:47 xmg kernel:  nvidia_drm_gem_prime_fence_op_enable_signaling+0x39/0x130 [nvidia_drm]
Mar 12 11:20:47 xmg kernel:  dma_fence_default_wait+0xb3/0x260
Mar 12 11:20:47 xmg kernel:  ? sched_clock_cpu+0xa9/0xc0
Mar 12 11:20:47 xmg kernel:  nvidia_drm_gem_prime_fence_op_wait+0x28/0x30 [nvidia_drm]
Mar 12 11:20:47 xmg kernel:  dma_fence_wait_timeout+0x39/0x120
Mar 12 11:20:47 xmg kernel:  ? dma_fence_free+0x20/0x20
Mar 12 11:20:47 xmg kernel:  i915_gem_object_wait_fence+0x3c/0x190 [i915]
Mar 12 11:20:47 xmg kernel:  i915_gem_object_wait_reservation+0xc3/0x2d0 [i915]
Mar 12 11:20:47 xmg kernel:  ? put_prev_entity+0x31/0xa10
Mar 12 11:20:47 xmg kernel:  ? pick_next_task_fair+0x300/0x4c0
Mar 12 11:20:47 xmg kernel:  i915_gem_object_wait+0x15/0x30 [i915]
Mar 12 11:20:47 xmg kernel:  intel_mmio_flip_work_func+0x50/0x2a0 [i915]
Mar 12 11:20:47 xmg kernel:  process_one_work+0x1e5/0x470
Mar 12 11:20:47 xmg kernel:  worker_thread+0x48/0x4e0
Mar 12 11:20:47 xmg kernel:  kthread+0x101/0x140
Mar 12 11:20:47 xmg kernel:  ? process_one_work+0x470/0x470
Mar 12 11:20:47 xmg kernel:  ? kthread_create_on_node+0x60/0x60
Mar 12 11:20:47 xmg kernel:  ret_from_fork+0x2c/0x40

#24

Sorry for the delay in answering. I took some time to step back and think about this issue.
I’ve done some work, and now it’s quite a bit better (I think), but not fully functional. :slight_smile:

Here is the report of those actions:

  1. I booted into a live environment to use mhwd-chroot. (I don’t know why, but mhwd froze when simply running Manjaro.)
  2. After mounting the partition, I fully uninstalled and then reinstalled video-hybrid-intel-nvidia-bumblebee, following step 1 of the Manjaro wiki. (You’re right @ryanmusante, it automatically installed Nvidia 375.39.)
  3. Following the logs, I ran the following command: mhwd-gpu --setgl nvidia
  4. Once more following the logs, I created a new file, /etc/X11/xorg.conf.d/20-intel.conf, with the following content: http://pastebin.com/8U0C053Y
  5. /!\ IMPORTANT: I disabled bumblebeed with sudo systemctl disable bumblebeed, then exited the mhwd-chroot console and rebooted into Manjaro with kernel 4.4.

@FadeMind
Now I’m able to run

optirun nvidia-smi

And here is the output:
http://pastebin.com/dPpUQwWc

Unfortunately, the final test isn’t OK yet :confused:
Indeed, after running glxspheres64 I got the following output:

Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
Xlib: extension "GLX" missing on display ":0.0".
ERROR in line 620:

Cheers.


#25

Hi Yonulya, my laptop has a similar setup (mine is a GE62), and I’m having exactly the same issue as you. I have updated the BIOS to the latest version given by MSI (112 instead of 118 like yours), but the ACPI error persists… but nvm. How do you disable bumblebeed.service? Is it doable through GRUB?
Thank you


#26
sudo systemctl disable bumblebeed

To only stop the service

sudo systemctl stop bumblebeed

#27

So what I should do is boot into a live environment, then chroot and enter these commands? Since I can’t boot into the system. Nice pic btw


#28

Oh how I miss the GUI. Thank you for the help. I’m living with my DE now lol


#29

@Yonulya and @Da_Toast , take a look at this thread.

Your problem could be a TLP/bumblebee conflict; if so, the fix is in the thread.


#30

Hi again guys :slight_smile:

First of all, a short summary of my (unsuccessful) work:

  • I tried video-nvidia ==> Worst idea ever. I wasn’t able to reach the GUI.

  • I tried video-hybrid-intel-nouveau-bumblebee ==> Not so bad, but it only works with kernel 4.9 and is not compatible with the GTX960, so it used only the Intel graphics. (Enough to play some light games while waiting to find a better solution.)

@sueridgepipe
Thanks for reminding me of this.
@michaldybczak suggested this solution above, and I haven’t retried it since.
I will go back to the ‘4 days ago’ configuration and try again. :slight_smile:

@Da_Toast
Indeed, when you enable bumblebeed.service, it creates a symlink.
I don’t really know why, but it seems this symlink froze our boot.

[quote=“Da_Toast, post:27, topic:19211, full:true”]
So what I should do is boot into a live environment, then chroot and enter these commands? Since I can’t boot into the system. Nice pic btw[/quote]
That’s it. If you have difficulties using mhwd-chroot, this tutorial by Heart-of-Lion explains very well how to use it.
Note: inside this tool, you can only use the disable option of systemctl. Other options aren’t usable while in mhwd-chroot. (Fortunately, that’s enough to remove the symlink and reboot without trouble.)

I’ll be back soon with fresh news about the TLP issue.

Cheers :slight_smile:


#31

Hi,

That’s weird. I finally found a kind of workaround, but I don’t really understand what’s happening.
Running glxgears or primusrun glxgears returns the following messages:

Xlib: extension "GLX" missing on display ":0.0".
Error: couldn't get an RGB, Double-buffered visual

But running optirun glxgears works fine.

[ 4763.107218] [INFO]Configured driver: nvidia
[ 4763.108743] [INFO]Response: Yes. X is active.
[ 4763.108769] [INFO]Running application using virtualgl.

Downgrading the bumblebee package to version 3.2.1-16 doesn’t seem to have any effect.
Same for setting RUNTIME_PM_BLACKLIST to the value returned by lspci | grep "NVIDIA" | cut -b -8

What’s the difference between optirun and primusrun?
man says one runs a program with the discrete video card and the other with the discrete NVIDIA video card.
So which card am I using?

Cheers.

