Nvidia drivers boot into black screen (after upgrade on Desktop system with Intel iGPU)

So once you are back in your system, open the ksystemlog app and check whether you get log spam there, meaning the same messages repeating over and over.

Still in live snapshot, will restore now. Checked ksystemlog and there is no sign of log spam. But the error I posted, starting with pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0 and ending with pcieport 0000:00:01.0: [ 0] RxErr (First), was in the journalctl log hundreds of times, probably until I shut down the PC.

I am going to restore now and come back.

Back into my system again, restored the snapshot. I opened the ksystemlog but there were still no signs of log spam. I can still look for something in the journalctl, are you looking for something specific?

But the error I posted, starting with pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0 and ending with pcieport 0000:00:01.0: [ 0] RxErr (First), was in the journalctl log hundreds of times

This is what I meant: you had log spam, and it looks like it is related to nvidia, since you don't have it now.
Check with this:
journalctl -b-0 | grep pcieport
journalctl -b-0 | grep AER
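To put a number on the spam instead of eyeballing the grep output, something like the following could help. It is only a sketch: count_aer_corrected is a made-up helper name, and it assumes the log lines look like the pcieport messages quoted in this thread.

```shell
# count_aer_corrected is a hypothetical helper, not part of any package.
# It reads kernel log lines on stdin and prints "<count> <device>" for every
# PCIe port that logged "AER: Corrected error received" (the
# "Multiple Corrected error received" variant is counted as well).
count_aer_corrected() {
    sed -nE 's/.*pcieport ([0-9a-f:.]+): AER: (Multiple )?Corrected error received.*/\1/p' \
        | sort | uniq -c | awk '{ print $1, $2 }'
}

# Typical use (current boot, kernel messages only):
#   journalctl -k -b 0 | count_aer_corrected
```

No output means no corrected AER errors were logged in that boot. A count in the hundreds for one port would match the spam described above.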

journalctl -b-0 | grep pcieport:

May 28 17:39:28 JPC1 kernel: pcieport 0000:00:01.0: AER: enabled with IRQ 122
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:01.0: DPC: enabled with IRQ 122
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:01.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:06.0: AER: enabled with IRQ 123
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:06.0: DPC: enabled with IRQ 123
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:06.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:1c.5: AER: enabled with IRQ 125
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:1c.5: DPC: enabled with IRQ 125
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:1c.5: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+

journalctl -b-0 | grep AER :

May 28 17:39:27 JPC1 kernel: acpi PNP0A08:00: _OSC: OS now controls [AER PCIeCapability LTR DPC]
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:01.0: AER: enabled with IRQ 122
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:06.0: AER: enabled with IRQ 123
May 28 17:39:28 JPC1 kernel: pcieport 0000:00:1c.5: AER: enabled with IRQ 125

No sign of these specific errors:
pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
They appeared only when you had the nvidia drivers installed. I had the same errors, but mine were related to wifi. So we can try to fix the errors first, and then you install the nvidia drivers again and test whether it worked.
Open the file /etc/default/grub and edit this line:
GRUB_CMDLINE_LINUX_DEFAULT
Inside the “” quotes add these parameters: pci=nomsi pci=nommconf. Don't remove anything from the quotes, just add them there, and to be sure it is correct, copy the line here.

Hello @_HAZE :wink:

That is a sign that there was a version mismatch of the driver. Kernel and driver must match. That can happen with precompiled drivers. Let's say you installed the latest kernel version, 5.17.11, but the driver was compiled against kernel 5.17.10 → verification fails.

Use a different kernel or use dkms packages.
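One way to check for such a mismatch by hand is to compare the running kernel release with the vermagic string stored in the module. This is a rough sketch: vermagic_matches is a hypothetical helper, and it assumes the first word of the vermagic string is the kernel release the module was built for.

```shell
# vermagic_matches is a hypothetical helper.
# $1: running kernel release, e.g. the output of `uname -r`
# $2: vermagic string, e.g. the output of `modinfo -F vermagic nvidia`
# Succeeds (exit 0) when the module was built for exactly that release.
vermagic_matches() {
    local running="$1" vermagic="$2"
    # The part before the first space is taken as the target kernel release.
    [ "${vermagic%% *}" = "$running" ]
}

# Typical use:
#   vermagic_matches "$(uname -r)" "$(modinfo -F vermagic nvidia)" \
#       || echo "nvidia module was built for a different kernel"
```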

And always make sure the system is updated to the latest state.

That happens because you run xorg as user. You must add the user to specific groups.

Restarting xorg is normally done like this:

sudo systemctl restart display-manager

Hello @megavolt, thanks for joining in! :grinning: Regarding dkms packages, I did not try those yet. Are those recommended or are they “last resort” territory?
Manjaro is fully up-to-date.

@brahma I see, this is what you meant. Yes, indeed there are no such errors when I don’t have the drivers installed. I researched a bit regarding what the flags pci=nomsi pci=nommconf do. Do you know of pcie_aspm=off? Is one better than the other? I will try out the flags you listed soon, though, but I am a bit worried because last time the grub config did not save my changes when I deleted the quiet parameter to enable logging.

Yes, there are 4 parameters to fix the AER errors, and pcie_aspm=off is one of them. I have no idea which of them will work for you, so just to be sure you can add all of these:
pci=nomsi pci=nommconf pcie_aspm=off
After you have edited the grub file, you have to run sudo update-grub to save those changes and make them permanent.
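If editing the line by hand feels risky, the change could also be scripted. The following is only a sketch under assumptions: add_kernel_params is a made-up helper, it expects the line in the usual GRUB_CMDLINE_LINUX_DEFAULT="..." form with nothing after the closing quote, and it uses GNU sed's -i option.

```shell
# add_kernel_params is a hypothetical helper.
# $1: path to a grub defaults file (try it on a copy first)
# remaining arguments: kernel parameters to append if not already present
add_kernel_params() {
    local file="$1"; shift
    local line current param
    line=$(grep -m1 '^GRUB_CMDLINE_LINUX_DEFAULT="' "$file") || return 1
    # Strip the variable name and the surrounding double quotes.
    current=${line#GRUB_CMDLINE_LINUX_DEFAULT=\"}
    current=${current%\"}
    for param in "$@"; do
        case " $current " in
            *" $param "*) ;;                              # already there, keep as is
            *) current="${current:+$current }$param" ;;   # append, keep existing values
        esac
    done
    # Assumes the parameters contain no '|', '&' or backslash characters.
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\".*\"|GRUB_CMDLINE_LINUX_DEFAULT=\"$current\"|" "$file"
}

# Typical use on a copy:
#   cp /etc/default/grub /tmp/grub.test
#   add_kernel_params /tmp/grub.test pci=nomsi pci=nommconf pcie_aspm=off
#   grep GRUB_CMDLINE_LINUX_DEFAULT /tmp/grub.test
```

Nothing that is already inside the quotes gets removed, and running it twice does not duplicate parameters. The change to the real file only takes effect after sudo update-grub.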

No, it is just a question of choice.

mhwd uses precompiled drivers, which speeds up the install process, but on the other hand a version mismatch can happen. mhwd also preconfigures xorg and has hardware detection.

However… a dkms package compiles the module on every kernel upgrade. Slower, but in my opinion safer. Also there is no detection of hardware or preconfigured xorg configs. You have to do this yourself.

It is not really clear, can you please explain in short:

  1. What are you trying to accomplish?
  2. What kind of setup do you expect?

Also about this: that is just debug information. Everything is good there, since it reports:

severity=Corrected

Message from the Maintainer:

Short story: the AER driver receives the corrected error notification but fails to clear it. Nobody has stepped up to fix the bug yet. You can probably work around it by disabling AER completely by booting with “pci=noaer”.

Re: 4.4.x kernel (only) gives pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4 - Bjorn Helgaas
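To see whether every reported PCIe bus error really has that severity, the severity= fields can be tallied. A small sketch, assuming the log lines carry a severity=<word> field as in the messages quoted in this thread; tally_aer_severity is a made-up name.

```shell
# tally_aer_severity is a hypothetical helper.
# It reads kernel log lines on stdin and prints one line per severity value
# in the form "severity=<word> <count>". Only the first word after
# "severity=" is used.
tally_aer_severity() {
    grep -oE 'severity=[A-Za-z]+' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Typical use:
#   journalctl -k -b 0 | tally_aer_severity
```

If the only line printed is severity=Corrected, the messages are the harmless kind described above, and the pci=noaer workaround from the quoted mail would just silence them. Any other severity value probably deserves a closer look.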

Okay, thank you for this information on the dkms package. :slightly_smiling_face:

I guess I will try that out as well then. Because to be honest, I was not completely happy with my current configuration either: whenever I booted my system with the 2 monitors plugged in (main monitor in the MB's DP and second monitor in the MB's HDMI), the system would also boot into a black screen, as far as I remember. That's why I always had to plug in the second monitor after booting into Plasma, and then it worked.

Yes, I did not make it very clear what my goal is now. At first my goal was to get the Nvidia hybrid drivers running on my Desktop: using the Intel iGPU for Desktop and Browser, and only using the dGPU for running Games or Blender. But then I read that there is a lot of configuration to do for this to work on Desktop, and so I figured maybe it is better to only run on the dGPU. But in both cases I get a black screen, so now I have to figure out what is causing the issue… Actually, the hybrid variant would always be nicer to have, but in the end, as long as it works, at this point anything will be fine. Just don’t want to brick my system in the process :sweat_smile:

What about

May 28 16:48:10 JPC1 kernel: NVRM: GPU at PCI:0000:01:00: GPU-b6d5e999-de6c-852b-bea5-4379861a0dbc
May 28 16:48:10 JPC1 kernel: NVRM: Xid (PCI:0000:01:00): 79, pid=848, GPU has fallen off the bus.
May 28 16:48:10 JPC1 kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.

though?
@brahma I will try the custom grub flags tomorrow.

To both of you, thank you so much for your help so far! I feel we are getting somewhere, at least! :smiley:

Check with lspci what you have under 0000:01:00. It is probably the nvidia card… it is the PCI port that is faulty…
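To pull the PCI address and the Xid number out of the NVRM lines without reading the whole journal, a filter like this might do. It is a sketch: xid_events is a made-up name, and it assumes the messages have the same "NVRM: Xid (PCI:<address>): <number>," shape as the ones posted above.

```shell
# xid_events is a hypothetical helper.
# It reads kernel log lines on stdin and prints "<pci-address> <xid-number>"
# for every NVRM Xid message it finds.
xid_events() {
    sed -nE 's/.*NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): ([0-9]+),.*/\1 \2/p'
}

# Typical use:
#   journalctl -k -b 0 | xid_events
# and then, with the address that was printed:
#   lspci -nnk -s 0000:01:00
```

The lspci call with -nnk also shows which kernel driver is bound to the device at that address.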

Yeah that is the default. Just for understanding:

  1. It boots up with the intel gpu; the nvidia gpu outputs are not accessible.
  2. Starting an app with the nvidia gpu is as simple as that: prime-run blender
  3. In the end it just switches GLX to nvidia.

prime-run blender is a lot? :thinking:
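For understanding what happens under the hood: as far as I know, the prime-run script from the nvidia-prime package does little more than set three environment variables before starting the program. The function below is a sketch of that idea and not a copy of the packaged script; offload_run is a made-up name.

```shell
# offload_run is a hypothetical stand-in that illustrates the render offload
# environment; the variable names are the ones NVIDIA documents for PRIME
# render offload.
offload_run() {
    __NV_PRIME_RENDER_OFFLOAD=1 \
    __VK_LAYER_NV_optimus=NVIDIA_only \
    __GLX_VENDOR_LIBRARY_NAME=nvidia \
    "$@"
}

# Typical use:
#   offload_run blender
```

The variables only apply to the started program, so everything else keeps running on the Intel gpu. For Steam games, the usual form of the launch option is prime-run %command%.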

As said above:

So my advice here is to remove the nvidia driver with mhwd:

sudo mhwd -r pci video-nvidia-470xx

or whatever you have currently installed:

mhwd -li

Install the headers for each kernel, for example for linux 5.17:

pamac install linux517-headers

(needed to compile the module)
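If you are unsure which headers package belongs to the running kernel, the name can be derived from the kernel release. A sketch, assuming the Manjaro naming scheme linux<major><minor>-headers seen above (linux517-headers for 5.17); headers_pkg is a made-up helper.

```shell
# headers_pkg is a hypothetical helper.
# $1: kernel release, e.g. the output of `uname -r`
# Prints the headers package name, assuming the linux<major><minor>-headers
# naming scheme.
headers_pkg() {
    # Split on dots: "5.17.9-1-MANJARO" gives major "5" and minor "17".
    echo "$1" | awk -F. '{ printf "linux%s%s-headers\n", $1, $2 }'
}

# Typical use:
#   pamac install "$(headers_pkg "$(uname -r)")"
```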

Then install the dkms package:

pamac install nvidia-dkms nvidia-prime mesa-utils

That will just install the NVIDIA driver and utils.

Blacklist nouveau

echo -e "blacklist nouveau\nblacklist ttm\nblacklist drm_kms_helper\nblacklist drm" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf

Load nvidia

echo -e "nvidia\nnvidia-drm" | sudo tee /etc/modules-load.d/nvidia.conf
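After the next reboot you can verify that both files had the intended effect by looking at the loaded modules. A small sketch that works on lsmod output; module_loaded is a made-up name. Note that lsmod prints module names with underscores, so nvidia-drm shows up as nvidia_drm.

```shell
# module_loaded is a hypothetical helper.
# It reads `lsmod` output on stdin and succeeds (exit 0) when the module
# named in $1 is listed. The first line of lsmod output is a header and is
# skipped.
module_loaded() {
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# Typical use:
#   lsmod | module_loaded nouveau    && echo "nouveau is still loaded"
#   lsmod | module_loaded nvidia_drm || echo "nvidia_drm is not loaded"
```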

Enable Dynamic PowerManagement on Hybrid Setup

Open an editor and put the rules in it:

sudo nano /etc/udev/rules.d/90-nvidia-prime-powermanagement.rules

Content:

# Remove NVIDIA USB xHCI Host Controller devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", ATTR{remove}="1"

# Remove NVIDIA USB Type-C UCSI devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", ATTR{remove}="1"

# Remove NVIDIA Audio devices, if present (enable it for kernels lower than 5.5)
#ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", ATTR{remove}="1"

# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"

# Disable runtime PM for NVIDIA VGA/3D controller devices on driver unbind
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"

and

echo -e "options nvidia \"NVreg_DynamicPowerManagement=0x02\"" | sudo tee /etc/modprobe.d/nvidia-prime-powermanagement.conf
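Whether the runtime power management actually kicks in can be read from sysfs after a reboot. A sketch, assuming the card sits at 0000:01:00.0; runtime_pm_state is a made-up helper that only prints two standard sysfs power attributes.

```shell
# runtime_pm_state is a hypothetical helper.
# $1: sysfs directory of a PCI device, e.g. /sys/bus/pci/devices/0000:01:00.0
# Prints the runtime PM policy (power/control) and the current state
# (power/runtime_status) on one line.
runtime_pm_state() {
    local dev="$1"
    printf 'control=%s status=%s\n' \
        "$(cat "$dev/power/control")" \
        "$(cat "$dev/power/runtime_status")"
}

# Typical use:
#   runtime_pm_state /sys/bus/pci/devices/0000:01:00.0
```

With the udev rules in place I would expect control=auto. The status should switch to suspended once nothing is using the nvidia gpu, but that depends on the hardware generation and driver, so take it as a hint and not as proof.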

No more configuration needed here. It boots with the Intel drivers anyway, NVIDIA needs to be explicitly targeted.

Check that there is no *.conf file in /etc/X11/xorg.conf.d/ that targets nvidia.
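A quick way to do that check is to search the config snippets for the word nvidia. A sketch; find_nvidia_confs is a made-up name, and a hit only means the file mentions nvidia, so have a look at it before deleting anything.

```shell
# find_nvidia_confs is a hypothetical helper.
# $1: directory to search (defaults to /etc/X11/xorg.conf.d)
# Prints the path of every *.conf file whose content mentions "nvidia"
# (case-insensitive). Prints nothing when there is no such file.
find_nvidia_confs() {
    local dir="${1:-/etc/X11/xorg.conf.d}"
    grep -lis 'nvidia' "$dir"/*.conf
}

# Typical use:
#   find_nvidia_confs
```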

After a reboot, you should be able to test it:

glxinfo | grep "OpenGL renderer"
prime-run glxinfo | grep "OpenGL renderer"
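The first command should report the Intel gpu and the second one the nvidia gpu. If you want to script that comparison, a sketch could look like this. renderer_vendor is a made-up helper, and the matching on the words NVIDIA and Intel is an assumption about how the renderer strings usually look.

```shell
# renderer_vendor is a hypothetical helper.
# $1: an "OpenGL renderer string: ..." line as printed by glxinfo
# Prints nvidia, intel or unknown depending on which vendor name appears.
renderer_vendor() {
    case "$1" in
        *NVIDIA*) echo nvidia ;;
        *Intel*)  echo intel ;;
        *)        echo unknown ;;
    esac
}

# Typical use:
#   renderer_vendor "$(glxinfo | grep "OpenGL renderer")"
#   renderer_vendor "$(prime-run glxinfo | grep "OpenGL renderer")"
```

On a working hybrid setup the first call should print intel and the second one nvidia.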

:notebook: I hope I have mentioned every step now :smiley:

Output of the line using lspci:

01:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3060] (rev a1)

Seems to be OK?

Thank you very much for your detailed guide. I was so close, or so I thought:

The posts I read about the hybrid drivers on Desktop were mostly 2 years old, so maybe there was more configuration to be done. prime-run is obviously totally acceptable! :grinning:

So an hour ago I followed your guide, step by step. I installed the headers (pamac auto-installed the headers for my other kernels 5.15 and 5.10 as well). I rebooted the system so that the changes could take effect. All good; after the reboot I installed the packages using the command pamac install nvidia-dkms nvidia-prime mesa-utils. I blacklisted nouveau and loaded nvidia. I also checked that there was no config file targeting nvidia in /etc/X11/xorg.conf.d/. So far so good.

Everything was done, so I wanted to reboot. Press the reboot button, screen goes black. Waiting for minutes on end, nothing happens, PC still running. TTY not accessible. So my only option is the power button. PC shuts down. Restarting the PC, I notice my encryption dialog is now center-aligned, and not left-aligned like always. Decrypting my PC anyway, black screen for 20 seconds. Then, finally, the splash screen! Hyped up, I check if nvidia-settings can be run. Yes! I see some GUI, but do not check any further. I try out the commands above:

glxinfo | grep "OpenGL renderer"
prime-run glxinfo | grep "OpenGL renderer"

Nice! First line prints out iGPU, second line prints out dGPU. Hyped up, tried out prime-run blender. Performance is there, but shadows are throwing artifacts and the 3D viewport feels laggy when using GPU render. Trying out a game in Steam. Steam does not seem to use the dGPU, even with the prime-run command in the run options of the game.
Now skeptical, tried running nvidia-smi. The GPU is correctly listed, and blender was still in the process list, but Xorg too (although with only 93MB). But! Temperature was now at nearly 70 degrees and I did not notice the Fan spinning up once. And indeed, there was an “ERR!” under the fan entry. Opened nvidia-settings to see if I can manually set the fan (only to test things out). Result: Failed to set fan speed!

So everything seemed a bit too fishy, and to not risk any crash due to overheating I shut down the PC. Shutdown this time worked flawlessly, instantly shut down without any warning or error. But 10 minutes after the shutdown I restart again only to see a blackscreen, in all the Kernels (5.17, 5.15, 5.10). Cannot access TTY, deleting quiet out of the boot config I can see that the “Simple Desktop Manager” never finishes loading.

Now using @brahma's tips: with pcie_aspm=off I can boot into the 5.17 kernel once, only to see that nvidia-settings gives me critical errors telling me the modules did not load, and I could not open the settings. nvidia-smi gives the same output. Trying out pci=nomsi and pci=nommconf results in a black screen as well. Using all 3 flags results in a black screen, no TTY. Using no flags at all in all kernels = black screen, no TTY.

So I am currently living in yesterday's Timeshift snapshot again to type this. Currently looking at the journalctl output. I am pretty sure this is the end of the log from when I installed the driver and tried rebooting, resulting in a black screen:

May 29 10:37:59 JPC1 sddm[755]: Display server stopped.
May 29 10:37:59 JPC1 sddm[755]: Running display stop script  "/usr/share/sddm/scripts/Xstop"
May 29 10:37:59 JPC1 sddm[755]: Removing display ":0" ...
May 29 10:37:59 JPC1 sddm[755]: Adding new display on vt 1 ...
May 29 10:37:59 JPC1 sddm[755]: Loading theme configuration from ""
May 29 10:37:59 JPC1 sddm[755]: Display server starting...
May 29 10:37:59 JPC1 sddm[755]: Adding cookie to "/var/run/sddm/{31ff9a1a-453d-4eb0-944b-405d99145603}"
May 29 10:37:59 JPC1 sddm[755]: Running: /usr/bin/X -nolisten tcp -background none -seat seat0 vt1 -auth /var/run/sddm/{31ff9a1a-453d-4eb0-944b-405d99145603} -noreset -displayfd 18
May 29 10:37:59 JPC1 kernel: nvidia: loading out-of-tree module taints kernel.
May 29 10:37:59 JPC1 kernel: nvidia: module license 'NVIDIA' taints kernel.
May 29 10:37:59 JPC1 kernel: Disabling lock debugging due to kernel taint
May 29 10:37:59 JPC1 kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
May 29 10:37:59 JPC1 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 508
May 29 10:37:59 JPC1 kernel: 
May 29 10:37:59 JPC1 kernel: nvidia 0000:01:00.0: enabling device (0000 -> 0003)
May 29 10:37:59 JPC1 kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
May 29 10:37:59 JPC1 kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.73.05  Sat May  7 05:30:26 UTC 2022
May 29 10:38:00 JPC1 kernel: NVRM: GPU at PCI:0000:01:00: GPU-b6d5e999-de6c-852b-bea5-4379861a0dbc
May 29 10:38:00 JPC1 kernel: NVRM: Xid (PCI:0000:01:00): 79, pid=25217, GPU has fallen off the bus.
May 29 10:38:00 JPC1 kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
May 29 10:38:00 JPC1 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000088
May 29 10:38:00 JPC1 kernel: #PF: supervisor write access in kernel mode
May 29 10:38:00 JPC1 kernel: #PF: error_code(0x0002) - not-present page
May 29 10:38:00 JPC1 kernel: PGD 0 P4D 0 
May 29 10:38:00 JPC1 kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
May 29 10:38:00 JPC1 kernel: CPU: 0 PID: 25191 Comm: Xorg Tainted: P           OE     5.17.9-1-MANJARO #1 7fd1fa212587ceb9c10eebada1d251e7facbe5ca
May 29 10:38:00 JPC1 kernel: Hardware name: ASUS System Product Name/ROG STRIX B560-I GAMING WIFI, BIOS 0904 05/24/2021
May 29 10:38:00 JPC1 kernel: RIP: 0010:_nv033473rm+0xac/0x130 [nvidia]
May 29 10:38:00 JPC1 kernel: Code: 44 89 e0 5b 41 5c c3 0f 1f 80 00 00 00 00 48 c1 e1 06 48 03 8c fe e0 23 00 00 45 84 c0 8b 50 08 44 8b 48 0c 74 78 85 db 74 3c <48> 83 41 08 01 0f b6 10 83 e2 03 80 fa 03 75 bf 45 84 c0 75 ba 0f
May 29 10:38:00 JPC1 kernel: RSP: 0018:ffffb593c0247728 EFLAGS: 00010206
May 29 10:38:00 JPC1 kernel: RAX: ffff8c2ed6d95ce0 RBX: 0000000000000055 RCX: 0000000000000080
May 29 10:38:00 JPC1 kernel: RDX: 00000000000000f6 RSI: ffff8c2f71450008 RDI: 0000000000000014
May 29 10:38:00 JPC1 kernel: RBP: ffff8c2ed6d95c90 R08: 0000000000000001 R09: 0000000000000000
May 29 10:38:00 JPC1 kernel: R10: 0000000000013540 R11: 0000000000000000 R12: 0000000000000055
May 29 10:38:00 JPC1 kernel: R13: ffff8c2f71450008 R14: 0000000000000014 R15: ffff8c2e26de4008
May 29 10:38:00 JPC1 kernel: FS:  00007fb292e16100(0000) GS:ffff8c353f400000(0000) knlGS:0000000000000000
May 29 10:38:00 JPC1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 29 10:38:00 JPC1 kernel: CR2: 0000000000000088 CR3: 000000011f3e2002 CR4: 0000000000770ef0
May 29 10:38:00 JPC1 kernel: PKRU: 55555554
May 29 10:38:00 JPC1 kernel: Call Trace:
May 29 10:38:00 JPC1 kernel:  <TASK>
May 29 10:38:00 JPC1 kernel:  ? _nv033470rm+0x162/0x2f0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv037675rm+0x70/0xb0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv037675rm+0x3f/0xb0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv011742rm+0x37/0x60 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv011059rm+0x7c/0x170 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv032311rm+0xc6/0x1f0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv010914rm+0x4e/0xc0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv009158rm+0x115/0x170 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv011987rm+0x2b5/0x4c0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv012032rm+0x25d/0x310 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv014242rm+0x3a/0x100 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv015179rm+0x16e/0x3c0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv021904rm+0x91/0x1e0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv021905rm+0x21/0x40 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv000696rm+0x1aa/0x2f0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? _nv000643rm+0x49c/0x20b0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? rm_init_adapter+0xc5/0xe0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? nv_open_device+0x2dc/0x8c0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? nvidia_open+0x2f3/0x600 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? kobj_lookup+0xf1/0x170
May 29 10:38:00 JPC1 kernel:  ? nvidia_frontend_open+0x50/0xa0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:38:00 JPC1 kernel:  ? chrdev_open+0xc1/0x250
May 29 10:38:00 JPC1 kernel:  ? cdev_device_add+0x90/0x90
May 29 10:38:00 JPC1 kernel:  ? do_dentry_open+0x1cf/0x3a0
May 29 10:38:00 JPC1 kernel:  ? path_openat+0xd94/0x1280
May 29 10:38:00 JPC1 kernel:  ? notify_change+0x49e/0x550
May 29 10:38:00 JPC1 kernel:  ? do_filp_open+0xaf/0x160
May 29 10:38:00 JPC1 kernel:  ? do_sys_openat2+0xb9/0x170
May 29 10:38:00 JPC1 kernel:  ? __x64_sys_openat+0x6a/0xa0
May 29 10:38:00 JPC1 kernel:  ? do_syscall_64+0x58/0x90
May 29 10:38:00 JPC1 kernel:  ? syscall_exit_to_user_mode+0x23/0x50
May 29 10:38:00 JPC1 kernel:  ? do_syscall_64+0x67/0x90
May 29 10:38:00 JPC1 kernel:  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
May 29 10:38:00 JPC1 kernel:  </TASK>
May 29 10:38:00 JPC1 kernel: Modules linked in: nvidia(POE) xt_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_mark iptable_mangle xt_addrtype iptable_raw xt_tcpudp tun ip6table_filter ip6_tables ccm iptable_filter xt_comment rfcomm cmac algif_hash algif_skcipher af_alg qrtr b>
May 29 10:38:00 JPC1 kernel:  snd_hda_codec_hdmi mei intel_lpss_pci intel_lpss btusb idma64 snd_hda_intel btrtl snd_intel_dspcfg btbcm snd_intel_sdw_acpi btintel snd_hda_codec btmtk i915 mousedev snd_hda_core bluetooth snd_hwdep snd_pcm snd_timer snd joydev ecdh_generic ttm rfkill cr>
May 29 10:38:00 JPC1 kernel: CR2: 0000000000000088
May 29 10:38:00 JPC1 kernel: ---[ end trace 0000000000000000 ]---
May 29 10:38:00 JPC1 kernel: RIP: 0010:_nv033473rm+0xac/0x130 [nvidia]
May 29 10:38:00 JPC1 kernel: Code: 44 89 e0 5b 41 5c c3 0f 1f 80 00 00 00 00 48 c1 e1 06 48 03 8c fe e0 23 00 00 45 84 c0 8b 50 08 44 8b 48 0c 74 78 85 db 74 3c <48> 83 41 08 01 0f b6 10 83 e2 03 80 fa 03 75 bf 45 84 c0 75 ba 0f
May 29 10:38:00 JPC1 kernel: RSP: 0018:ffffb593c0247728 EFLAGS: 00010206
May 29 10:38:00 JPC1 kernel: RAX: ffff8c2ed6d95ce0 RBX: 0000000000000055 RCX: 0000000000000080
May 29 10:38:00 JPC1 kernel: RDX: 00000000000000f6 RSI: ffff8c2f71450008 RDI: 0000000000000014
May 29 10:38:00 JPC1 kernel: RBP: ffff8c2ed6d95c90 R08: 0000000000000001 R09: 0000000000000000
May 29 10:38:00 JPC1 kernel: R10: 0000000000013540 R11: 0000000000000000 R12: 0000000000000055
May 29 10:38:00 JPC1 kernel: R13: ffff8c2f71450008 R14: 0000000000000014 R15: ffff8c2e26de4008
May 29 10:38:00 JPC1 kernel: FS:  00007fb292e16100(0000) GS:ffff8c353f400000(0000) knlGS:0000000000000000
May 29 10:38:00 JPC1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 29 10:38:00 JPC1 kernel: CR2: 0000000000000088 CR3: 000000011f3e2002 CR4: 0000000000770ef0
May 29 10:38:00 JPC1 kernel: PKRU: 55555554
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00002001/00002000
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:00 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Transmitter ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00003101/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 8] Rollover              
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [12] Timeout               
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: can't find device of ID0008
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Transmitter ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00003101/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 8] Rollover              
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [12] Timeout               
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: can't find device of ID0008
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: can't find device of ID0008
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00001100/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 8] Rollover              
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [12] Timeout               
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00003100/00002000
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0:    [ 8] Rollover              
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0:    [12] Timeout               
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
May 29 10:38:02 JPC1 kernel: pcieport 0000:00:01.0: AER: can't find device of ID0008
May 29 10:39:28 JPC1 systemd[1]: sddm.service: State 'stop-sigterm' timed out. Killing.
May 29 10:39:28 JPC1 systemd[1]: sddm.service: Killing process 755 (sddm) with signal SIGKILL.
May 29 10:39:28 JPC1 systemd[1]: sddm.service: Killing process 25191 (Xorg) with signal SIGKILL.
May 29 10:39:28 JPC1 systemd[1]: sddm.service: Killing process 766 (QDBusConnection) with signal SIGKILL.
May 29 10:39:28 JPC1 systemd[1]: sddm.service: Killing process 25192 (n/a) with signal SIGKILL.
May 29 10:39:28 JPC1 systemd[1]: sddm.service: Killing process 25201 (Xorg:sh8) with signal SIGKILL.
May 29 10:39:28 JPC1 systemd[1]: sddm.service: Main process exited, code=killed, status=9/KILL
May 29 10:39:28 JPC1 systemd[1]: sddm.service: Killing process 25201 (Xorg:sh8) with signal SIGKILL.
May 29 10:39:28 JPC1 kernel: BUG: unable to handle page fault for address: 0000000200000002
May 29 10:39:28 JPC1 kernel: #PF: supervisor read access in kernel mode
May 29 10:39:28 JPC1 kernel: #PF: error_code(0x0000) - not-present page
May 29 10:39:28 JPC1 kernel: PGD 0 P4D 0 
May 29 10:39:28 JPC1 kernel: Oops: 0000 [#2] PREEMPT SMP NOPTI
May 29 10:39:28 JPC1 kernel: CPU: 1 PID: 25201 Comm: Xorg:sh8 Tainted: P      D    OE     5.17.9-1-MANJARO #1 7fd1fa212587ceb9c10eebada1d251e7facbe5ca
May 29 10:39:28 JPC1 kernel: Hardware name: ASUS System Product Name/ROG STRIX B560-I GAMING WIFI, BIOS 0904 05/24/2021
May 29 10:39:28 JPC1 kernel: RIP: 0010:_nv010247rm+0x3c/0x340 [nvidia]
May 29 10:39:28 JPC1 kernel: Code: 07 0f 1f 44 00 00 31 d2 48 8b 07 48 85 c0 75 1a e9 a1 02 00 00 66 0f 1f 84 00 00 00 00 00 48 8b 48 10 48 85 c9 74 17 48 89 c8 <48> 39 30 77 ef 0f 83 29 02 00 00 48 8b 48 18 48 85 c9 75 e9 48 89
May 29 10:39:28 JPC1 kernel: RSP: 0018:ffffb593c07efad8 EFLAGS: 00010002
May 29 10:39:28 JPC1 kernel: RAX: 0000000200000002 RBX: ffffb593c07efb20 RCX: 0000000200000002
May 29 10:39:28 JPC1 kernel: RDX: ffffb593c07efb78 RSI: 0000000000006271 RDI: ffffffffc37d72d8
May 29 10:39:28 JPC1 kernel: RBP: ffff8c2e2546e000 R08: 0000000000000000 R09: 0000000000000000
May 29 10:39:28 JPC1 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc1d94d72
May 29 10:39:28 JPC1 kernel: R13: ffffffffc37d89a0 R14: ffff8c2e01644800 R15: ffffffffc37d4a20
May 29 10:39:28 JPC1 kernel: FS:  0000000000000000(0000) GS:ffff8c353f440000(0000) knlGS:0000000000000000
May 29 10:39:28 JPC1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 29 10:39:28 JPC1 kernel: CR2: 0000000200000002 CR3: 00000006e7e10002 CR4: 0000000000770ee0
May 29 10:39:28 JPC1 kernel: PKRU: 55555554
May 29 10:39:28 JPC1 kernel: Call Trace:
May 29 10:39:28 JPC1 kernel:  <TASK>
May 29 10:39:28 JPC1 kernel:  ? _nv037919rm+0xb0/0x1c0 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:39:28 JPC1 kernel:  ? rm_cleanup_file_private+0x42/0x170 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:39:28 JPC1 kernel:  ? nvidia_close+0x15e/0x310 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:39:28 JPC1 kernel:  ? nvidia_frontend_close+0x27/0x50 [nvidia a0d810184fda5cd2c24153a3e391921059a1c2a4]
May 29 10:39:28 JPC1 kernel:  ? __fput+0x86/0x240
May 29 10:39:28 JPC1 kernel:  ? task_work_run+0x59/0x90
May 29 10:39:28 JPC1 kernel:  ? do_exit+0x33a/0xab0
May 29 10:39:28 JPC1 kernel:  ? plist_del+0x5f/0xc0
May 29 10:39:28 JPC1 kernel:  ? do_group_exit+0x2d/0x90
May 29 10:39:28 JPC1 kernel:  ? get_signal+0x149/0x9c0
May 29 10:39:28 JPC1 kernel:  ? arch_do_signal_or_restart+0xd9/0x740
May 29 10:39:28 JPC1 kernel:  ? exit_to_user_mode_prepare+0xfd/0x190
May 29 10:39:28 JPC1 kernel:  ? syscall_exit_to_user_mode+0x23/0x50
May 29 10:39:28 JPC1 kernel:  ? do_syscall_64+0x67/0x90
May 29 10:39:28 JPC1 kernel:  ? syscall_exit_to_user_mode+0x23/0x50
May 29 10:39:28 JPC1 kernel:  ? do_syscall_64+0x67/0x90
May 29 10:39:28 JPC1 kernel:  ? exc_page_fault+0x71/0x170
May 29 10:39:28 JPC1 kernel:  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
May 29 10:39:28 JPC1 kernel:  </TASK>
May 29 10:39:28 JPC1 kernel: Modules linked in: nvidia(POE) xt_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_mark iptable_mangle xt_addrtype iptable_raw xt_tcpudp tun ip6table_filter ip6_tables ccm iptable_filter xt_comment rfcomm cmac algif_hash algif_skcipher af_alg qrtr b>
May 29 10:39:28 JPC1 kernel:  snd_hda_codec_hdmi mei intel_lpss_pci intel_lpss btusb idma64 snd_hda_intel btrtl snd_intel_dspcfg btbcm snd_intel_sdw_acpi btintel snd_hda_codec btmtk i915 mousedev snd_hda_core bluetooth snd_hwdep snd_pcm snd_timer snd joydev ecdh_generic ttm rfkill cr>
May 29 10:39:28 JPC1 kernel: CR2: 0000000200000002
May 29 10:39:28 JPC1 kernel: ---[ end trace 0000000000000000 ]---
May 29 10:39:28 JPC1 kernel: RIP: 0010:_nv033473rm+0xac/0x130 [nvidia]
May 29 10:39:28 JPC1 kernel: Code: 44 89 e0 5b 41 5c c3 0f 1f 80 00 00 00 00 48 c1 e1 06 48 03 8c fe e0 23 00 00 45 84 c0 8b 50 08 44 8b 48 0c 74 78 85 db 74 3c <48> 83 41 08 01 0f b6 10 83 e2 03 80 fa 03 75 bf 45 84 c0 75 ba 0f
May 29 10:39:28 JPC1 kernel: RSP: 0018:ffffb593c0247728 EFLAGS: 00010206
May 29 10:39:28 JPC1 kernel: RAX: ffff8c2ed6d95ce0 RBX: 0000000000000055 RCX: 0000000000000080
May 29 10:39:28 JPC1 kernel: RDX: 00000000000000f6 RSI: ffff8c2f71450008 RDI: 0000000000000014
May 29 10:39:28 JPC1 kernel: RBP: ffff8c2ed6d95c90 R08: 0000000000000001 R09: 0000000000000000
May 29 10:39:28 JPC1 kernel: R10: 0000000000013540 R11: 0000000000000000 R12: 0000000000000055
May 29 10:39:28 JPC1 kernel: R13: ffff8c2f71450008 R14: 0000000000000014 R15: ffff8c2e26de4008
May 29 10:39:28 JPC1 kernel: FS:  0000000000000000(0000) GS:ffff8c353f440000(0000) knlGS:0000000000000000
May 29 10:39:28 JPC1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 29 10:39:28 JPC1 kernel: CR2: 0000000200000002 CR3: 00000006e7e10002 CR4: 0000000000770ee0
May 29 10:39:28 JPC1 kernel: PKRU: 55555554
May 29 10:39:28 JPC1 kernel: note: Xorg:sh8[25201] exited with preempt_count 1
May 29 10:39:28 JPC1 kernel: Fixing recursive fault but reboot is needed!
May 29 10:39:28 JPC1 kernel: BUG: scheduling while atomic: Xorg:sh8/25201/0x00000000
May 29 10:39:28 JPC1 kernel: Modules linked in: nvidia(POE) xt_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_mark iptable_mangle xt_addrtype iptable_raw xt_tcpudp tun ip6table_filter ip6_tables ccm iptable_filter xt_comment rfcomm cmac algif_hash algif_skcipher af_alg qrtr b>
May 29 10:39:28 JPC1 kernel:  snd_hda_codec_hdmi mei intel_lpss_pci intel_lpss btusb idma64 snd_hda_intel btrtl snd_intel_dspcfg btbcm snd_intel_sdw_acpi btintel snd_hda_codec btmtk i915 mousedev snd_hda_core bluetooth snd_hwdep snd_pcm snd_timer snd joydev ecdh_generic ttm rfkill cr>
May 29 10:39:28 JPC1 kernel: CPU: 1 PID: 25201 Comm: Xorg:sh8 Tainted: P      D    OE     5.17.9-1-MANJARO #1 7fd1fa212587ceb9c10eebada1d251e7facbe5ca
May 29 10:39:28 JPC1 kernel: Hardware name: ASUS System Product Name/ROG STRIX B560-I GAMING WIFI, BIOS 0904 05/24/2021
May 29 10:39:28 JPC1 kernel: Call Trace:
May 29 10:39:28 JPC1 kernel:  <TASK>
May 29 10:39:28 JPC1 kernel:  dump_stack_lvl+0x47/0x64
May 29 10:39:28 JPC1 kernel:  __schedule_bug.cold+0x4c/0x58
May 29 10:39:28 JPC1 kernel:  __schedule+0xde8/0x11e0
May 29 10:39:28 JPC1 kernel:  do_task_dead+0x3f/0x50
May 29 10:39:28 JPC1 kernel:  make_task_dead.cold+0x51/0xab
May 29 10:39:28 JPC1 kernel:  rewind_stack_and_make_dead+0x17/0x17
May 29 10:39:28 JPC1 kernel: RIP: 0033:0x7fb29370e119
May 29 10:39:28 JPC1 kernel: Code: Unable to access opcode bytes at RIP 0x7fb29370e0ef.
May 29 10:39:28 JPC1 kernel: RSP: 002b:00007fb27f7fd9d0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
May 29 10:39:28 JPC1 kernel: RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007fb29370e119
May 29 10:39:28 JPC1 kernel: RDX: 0000000000000000 RSI: 0000000000000189 RDI: 000055fa7ef926e0
May 29 10:39:28 JPC1 kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000ffffffff
May 29 10:39:28 JPC1 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 000055fa7ef92690
May 29 10:39:28 JPC1 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 000055fa7ef926e0
May 29 10:39:28 JPC1 kernel:  </TASK>
May 29 10:40:58 JPC1 systemd[1]: sddm.service: Processes still around after final SIGKILL. Entering failed mode.
May 29 10:40:58 JPC1 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=sddm comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
May 29 10:40:58 JPC1 kernel: kauditd_printk_skb: 14 callbacks suppressed
May 29 10:40:58 JPC1 kernel: audit: type=1131 audit(1653813658.929:224): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=sddm comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
May 29 10:40:58 JPC1 systemd[1]: sddm.service: Failed with result 'timeout'.
May 29 10:40:58 JPC1 systemd[1]: Stopped Simple Desktop Display Manager.
May 29 10:40:58 JPC1 systemd[1]: sddm.service: Consumed 51.163s CPU time.
May 29 10:40:58 JPC1 systemd[1]: Stopping User Login Management...
May 29 10:40:58 JPC1 systemd[1]: Stopping Permit User Sessions...
May 29 10:40:58 JPC1 systemd[1]: systemd-user-sessions.service: Deactivated successfully.
May 29 10:40:58 JPC1 systemd[1]: Stopped Permit User Sessions.
May 29 10:40:58 JPC1 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-user-sessions comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
May 29 10:40:58 JPC1 systemd[1]: Stopped target Network.
May 29 10:40:58 JPC1 systemd[1]: Stopped target Remote File Systems.
May 29 10:40:58 JPC1 systemd[1]: Stopping Network Manager...
May 29 10:40:58 JPC1 systemd[1]: Stopping Network Name Resolution...
May 29 10:40:58 JPC1 systemd[1]: Stopping WPA supplicant...
May 29 10:40:58 JPC1 wpa_supplicant[830]: p2p-dev-wlo1: CTRL-EVENT-DSCP-POLICY clear_all
May 29 10:40:58 JPC1 wpa_supplicant[830]: p2p-dev-wlo1: CTRL-EVENT-DSCP-POLICY clear_all
May 29 10:40:58 JPC1 wpa_supplicant[830]: nl80211: deinit ifname=p2p-dev-wlo1 disabled_11b_rates=0
May 29 10:40:58 JPC1 NetworkManager[695]: <info>  [1653813658.9417] caught SIGTERM, shutting down normally.
May 29 10:40:58 JPC1 kernel: audit: type=1131 audit(1653813658.939:225): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-user-sessions comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
May 29 10:40:58 JPC1 systemd[1]: systemd-resolved.service: Deactivated successfully.
May 29 10:40:58 JPC1 systemd[1]: Stopped Network Name Resolution.
May 29 10:40:58 JPC1 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-resolved comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
May 29 10:40:58 JPC1 kernel: audit: type=1131 audit(1653813658.942:226): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-resolved comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
May 29 10:40:58 JPC1 audit: BPF prog-id=0 op=UNLOAD
May 29 10:40:59 JPC1 kernel: audit: type=1334 audit(1653813658.996:227): prog-id=0 op=UNLOAD
May 29 10:40:59 JPC1 wpa_supplicant[830]: p2p-dev-wlo1: CTRL-EVENT-TERMINATING
May 29 10:40:59 JPC1 wpa_supplicant[830]: wlo1: CTRL-EVENT-DSCP-POLICY clear_all
May 29 10:40:59 JPC1 NetworkManager[695]: <info>  [1653813659.0051] device (wlo1): state change: disconnected -> unmanaged (reason 'unmanaged', sys-iface-state: 'managed')
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 10:38:01 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
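
For anyone reading along, the `error status/mask=` field in the AER lines above can be decoded by hand: each set bit in the status word that is not set in the mask corresponds to one of the bracketed lines the kernel prints underneath. Here is a rough Python sketch of that arithmetic. The names `COR_BITS` and `decode_aer` are made up for this illustration. The labels for bits 0, 8 and 12 are taken from the log itself; the remaining labels are an assumption based on the PCIe AER Correctable Error Status register layout.

```python
# Bit labels: 0, 8 and 12 appear in the log above (RxErr, Rollover, Timeout).
# The other entries are assumptions based on the PCIe AER Correctable Error
# Status register layout and are not confirmed by this thread.
COR_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "NonFatalErr",
    14: "CorrIntErr",
    15: "HeaderOF",
}


def decode_aer(field):
    """Decode a 'status/mask' pair such as '00003101/00002000'.

    Returns the labels of all status bits that are set and not masked.
    """
    status_hex, mask_hex = field.split("/")
    status = int(status_hex, 16)
    mask = int(mask_hex, 16)
    reported = status & ~mask  # drop bits that the mask hides
    return [
        f"[{bit:2d}] {COR_BITS.get(bit, 'unknown')}"
        for bit in range(32)
        if reported & (1 << bit)
    ]


if __name__ == "__main__":
    # The three status/mask pairs that appear in the log above.
    for pair in ("00000001/00002000", "00003101/00002000", "00001100/00002000"):
        print(pair, "->", decode_aer(pair))
```

With `00003101/00002000`, bit 13 is set in the status word but hidden by the mask, which is consistent with the log listing only RxErr, Rollover and Timeout for that entry.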

The first boot was successful, but the fans did not spin and performance was laggy. Funnily enough, I had not disabled ASPM myself at that point; I only did that later:

May 29 10:45:39 JPC1 kernel: r8169 0000:04:00.0: can't disable ASPM; OS doesn't have ASPM control
May 29 10:45:39 JPC1 kernel: nvidia: loading out-of-tree module taints kernel.
May 29 10:45:39 JPC1 kernel: nvidia: module license 'NVIDIA' taints kernel.
May 29 10:45:39 JPC1 kernel: Disabling lock debugging due to kernel taint
May 29 10:45:39 JPC1 kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
May 29 10:45:39 JPC1 kernel: snd_hda_codec_hdmi hdaudioC0D2: Monitor plugged-in, Failed to power up codec ret=[-13]
...
May 29 10:45:39 JPC1 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 508
May 29 10:45:39 JPC1 kernel: 
May 29 10:45:39 JPC1 kernel: nvidia 0000:01:00.0: enabling device (0000 -> 0003)
May 29 10:45:39 JPC1 kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
...
May 29 10:45:39 JPC1 kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.73.05  Sat May  7 05:30:26 UTC 2022
...
May 29 10:45:40 JPC1 kernel: nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
...
May 29 10:45:40 JPC1 kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  510.73.05  Sat May  7 05:21:20 UTC 2022
...
May 29 10:45:40 JPC1 kernel: nvidia-uvm: Loaded the UVM driver, major device number 506.
May 29 10:45:40 JPC1 kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
May 29 10:45:40 JPC1 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
...
May 29 10:45:40 JPC1 kernel: ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\RMTW.SHWM) (20211217/utaddress-204)
May 29 10:45:40 JPC1 kernel: ACPI: OSL: Resource conflict; ACPI support missing from driver?
...
May 29 10:45:48 JPC1 kernel: NVRM: GPU at PCI:0000:01:00: GPU-b6d5e999-de6c-852b-bea5-4379861a0dbc
May 29 10:45:48 JPC1 kernel: NVRM: Xid (PCI:0000:01:00): 62, pid=859, 0000(0000) 00000000 00000000
...
May 29 10:45:52 JPC1 kernel: NVRM: Xid (PCI:0000:01:00): 122, pid=859, SPI read failure at address 272! (0x00000065)
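
The `NVRM: Xid` lines are the ones worth pulling out of a long journal, since they carry the driver's own error code. A small Python sketch for extracting them is below. It assumes the line layout seen in the excerpt above (`Xid (PCI:...): <code>, pid=<pid>, <detail>`); `XID_RE` and `extract_xids` are hypothetical names used only for this example, and other driver versions may format these lines differently.

```python
import re

# Matches lines shaped like the ones in this thread, e.g.
#   NVRM: Xid (PCI:0000:01:00): 62, pid=859, 0000(0000) 00000000 00000000
# The layout is taken from the log excerpt; it is not a guaranteed format.
XID_RE = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+), pid=([^,]+), (.*)")


def extract_xids(lines):
    """Return a list of (device, xid_code, pid, detail) for every Xid line."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            device, code, pid, detail = match.groups()
            events.append((device, int(code), pid.strip(), detail.strip()))
    return events


# Lines copied from the first-boot excerpt above.
SAMPLE = [
    "May 29 10:45:48 JPC1 kernel: NVRM: GPU at PCI:0000:01:00: GPU-b6d5e999-de6c-852b-bea5-4379861a0dbc",
    "May 29 10:45:48 JPC1 kernel: NVRM: Xid (PCI:0000:01:00): 62, pid=859, 0000(0000) 00000000 00000000",
    "May 29 10:45:52 JPC1 kernel: NVRM: Xid (PCI:0000:01:00): 122, pid=859, SPI read failure at address 272! (0x00000065)",
]

if __name__ == "__main__":
    for device, code, pid, detail in extract_xids(SAMPLE):
        print(f"{device}  Xid {code}  pid={pid}  {detail}")
```

To run it on a real journal, the lines could come from the saved output of something like `journalctl -k -b -1` instead of `SAMPLE`.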

Near shutdown:

May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 
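
Since the same messages repeat many times, it can help to collapse them into counts before posting. A minimal Python sketch follows, assuming the syslog-style prefix seen in this thread (`May 29 11:20:29 JPC1 kernel: `); `PREFIX_RE` and `tally_messages` are names invented for this example.

```python
import re
from collections import Counter

# Strips a prefix shaped like "May 29 11:20:29 JPC1 kernel: ".
# This layout is an assumption based on the excerpts in this thread.
PREFIX_RE = re.compile(r"^\w{3}\s+\d+\s+\d\d:\d\d:\d\d\s+\S+\s+[^:\s]+:\s?")


def tally_messages(lines):
    """Count identical messages, ignoring timestamp, host and source."""
    counts = Counter()
    for line in lines:
        message = PREFIX_RE.sub("", line.rstrip("\n"))
        if message:
            counts[message] += 1
    return counts


# Two pairs of lines copied from the shutdown excerpt above.
SAMPLE = [
    "May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0xffff:1401)",
    "May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0",
    "May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x23:0xffff:1401)",
    "May 29 11:20:29 JPC1 kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0",
]

if __name__ == "__main__":
    for message, count in tally_messages(SAMPLE).most_common():
        print(f"{count:5d}x  {message}")
```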

And this is a selection from the logs of the 10 attempts to boot into the OS again:

May 29 11:33:56 JPC1 kernel: nvidia: loading out-of-tree module taints kernel.
May 29 11:33:56 JPC1 kernel: nvidia: module license 'NVIDIA' taints kernel.
May 29 11:33:56 JPC1 kernel: Disabling lock debugging due to kernel taint
May 29 11:33:56 JPC1 kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
May 29 11:33:56 JPC1 kernel: r8169 0000:04:00.0: can't disable ASPM; OS doesn't have ASPM control
...
May 29 11:33:56 JPC1 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 508
...
May 29 11:33:56 JPC1 kernel: thermal thermal_zone2: failed to read out thermal zone (-61)
May 29 11:33:56 JPC1 kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.73.05  Sat May  7 05:30:26 UTC 2022
May 29 11:33:56 JPC1 kernel: snd_hda_codec_hdmi hdaudioC0D2: Monitor plugged-in, Failed to power up codec ret=[-13]
...
May 29 11:33:57 JPC1 kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  510.73.05  Sat May  7 05:21:20 UTC 2022
May 29 11:33:57 JPC1 systemd-modules-load[400]: Inserted module 'nvidia_drm'
May 29 11:33:57 JPC1 kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
May 29 11:33:57 JPC1 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
...
May 29 11:33:57 JPC1 kernel: ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\RMTW.SHWM) (20210730/utaddress-204)
...
May 29 11:33:59 JPC1 kernel: NVRM: GPU at PCI:0000:01:00: GPU-b6d5e999-de6c-852b-bea5-4379861a0dbc
May 29 11:33:59 JPC1 kernel: NVRM: Xid (PCI:0000:01:00): 79, pid=862, GPU has fallen off the bus.
May 29 11:33:59 JPC1 kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
May 29 11:33:59 JPC1 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000088
May 29 11:33:59 JPC1 kernel: #PF: supervisor write access in kernel mode
May 29 11:33:59 JPC1 kernel: #PF: error_code(0x0002) - not-present page
May 29 11:33:59 JPC1 kernel: PGD 0 P4D 0 
May 29 11:33:59 JPC1 kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
May 29 11:33:59 JPC1 kernel: CPU: 9 PID: 845 Comm: Xorg Tainted: P           OE     5.15.41-1-MANJARO #1 990090cf7d29f1c59853dd9ffd4aab655306c4a0
May 29 11:33:59 JPC1 kernel: Hardware name: ASUS System Product Name/ROG STRIX B560-I GAMING WIFI, BIOS 0904 05/24/2021
May 29 11:33:59 JPC1 kernel: RIP: 0010:_nv033473rm+0xac/0x130 [nvidia]
May 29 11:33:59 JPC1 kernel: Code: 44 89 e0 5b 41 5c c3 0f 1f 80 00 00 00 00 48 c1 e1 06 48 03 8c fe e0 23 00 00 45 84 c0 8b 50 08 44 8b 48 0c 74 78 85 db 74 3c <48> 83 41 08 01 0f b6 10 83 e2 03 80 fa 03 75 bf 45 84 c0 75 ba 0f
May 29 11:33:59 JPC1 kernel: RSP: 0018:ffffb0c2821836c8 EFLAGS: 00010206
May 29 11:33:59 JPC1 kernel: RAX: ffff9a3f26722ce0 RBX: 0000000000000055 RCX: 0000000000000080
May 29 11:33:59 JPC1 kernel: RDX: 000000000000014c RSI: ffff9a3f30720008 RDI: 0000000000000014
May 29 11:33:59 JPC1 kernel: RBP: ffff9a3f26722c90 R08: 0000000000000001 R09: 0000000000000000
May 29 11:33:59 JPC1 kernel: R10: 0000000000015d5c R11: 0000000000000000 R12: 0000000000000055
May 29 11:33:59 JPC1 kernel: R13: ffff9a3f30720008 R14: 0000000000000014 R15: ffff9a3f26610008
May 29 11:33:59 JPC1 kernel: FS:  00007fbf515de100(0000) GS:ffff9a463f640000(0000) knlGS:0000000000000000
May 29 11:33:59 JPC1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 29 11:33:59 JPC1 kernel: CR2: 0000000000000088 CR3: 000000011bbbe002 CR4: 0000000000770ee0
May 29 11:33:59 JPC1 kernel: PKRU: 55555554
May 29 11:33:59 JPC1 kernel: Call Trace:
May 29 11:33:59 JPC1 kernel:  <TASK>
May 29 11:33:59 JPC1 kernel:  ? _nv033470rm+0x162/0x2f0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv037675rm+0x70/0xb0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv037675rm+0x3f/0xb0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv011742rm+0x37/0x60 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv011059rm+0x7c/0x170 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv032311rm+0xc6/0x1f0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv010914rm+0x4e/0xc0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv009158rm+0x115/0x170 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv011987rm+0x2b5/0x4c0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv012032rm+0x25d/0x310 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv033853rm+0x109/0x220 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv014242rm+0x3a/0x100 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv015179rm+0x16e/0x3c0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv021904rm+0x91/0x1e0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv021905rm+0x21/0x40 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv000696rm+0x1aa/0x2f0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? _nv000643rm+0x49c/0x20b0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? rm_init_adapter+0xc5/0xe0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? try_to_wake_up+0x210/0x540
May 29 11:33:59 JPC1 kernel:  ? nv_open_device+0x2dc/0x8c0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? nvidia_open+0x2f3/0x600 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? kobj_lookup+0xf1/0x170
May 29 11:33:59 JPC1 kernel:  ? nvidia_frontend_open+0x50/0xa0 [nvidia bbad269b7dba803ab240b94e00fa76e6df7f8cbe]
May 29 11:33:59 JPC1 kernel:  ? chrdev_open+0xc9/0x250
May 29 11:33:59 JPC1 kernel:  ? cdev_device_add+0x90/0x90
May 29 11:33:59 JPC1 kernel:  ? do_dentry_open+0x1cf/0x3a0
May 29 11:33:59 JPC1 kernel:  ? path_openat+0xcac/0x10d0
May 29 11:33:59 JPC1 kernel:  ? filename_lookup+0xc3/0x1e0
May 29 11:33:59 JPC1 kernel:  ? do_filp_open+0xa7/0x160
May 29 11:33:59 JPC1 kernel:  ? do_sys_openat2+0xb9/0x170
May 29 11:33:59 JPC1 kernel:  ? __x64_sys_openat+0x6a/0xa0
May 29 11:33:59 JPC1 kernel:  ? do_syscall_64+0x58/0x90
May 29 11:33:59 JPC1 kernel:  ? syscall_exit_to_user_mode+0x23/0x50
May 29 11:33:59 JPC1 kernel:  ? do_syscall_64+0x67/0x90
May 29 11:33:59 JPC1 kernel:  ? syscall_exit_to_user_mode+0x23/0x50
May 29 11:33:59 JPC1 kernel:  ? do_syscall_64+0x67/0x90
May 29 11:33:59 JPC1 kernel:  ? exc_page_fault+0x71/0x170
May 29 11:33:59 JPC1 kernel:  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
May 29 11:33:59 JPC1 kernel:  </TASK>
...
May 29 11:33:59 JPC1 kernel: Modules linked in: pcc_cpufreq(-) cmac algif_hash algif_skcipher af_alg qrtr ns bnep nct6775 hwmon_vid snd_sof_pci_intel_tgl snd_sof_intel_hda_common uinput nvidia_drm(POE) intel_rapl_msr nvidia_modeset(POE) soundwire_intel asus_nb_wmi soundwire_generic_a>
May 29 11:33:59 JPC1 kernel:  snd_intel_sdw_acpi btrtl btbcm snd_hda_codec btintel mousedev i915 joydev snd_hda_core bluetooth cfg80211 snd_hwdep snd_pcm snd_timer ecdh_generic snd crc16 intel_lpss_pci ttm intel_lpss rfkill idma64 soundcore intel_gtt wmi mac_hid video acpi_tad acpi_p>
May 29 11:33:59 JPC1 kernel: CR2: 0000000000000088
May 29 11:33:59 JPC1 kernel: ---[ end trace 10b0dc64e5ae1785 ]---
May 29 11:33:59 JPC1 kernel: RIP: 0010:_nv033473rm+0xac/0x130 [nvidia]
May 29 11:33:59 JPC1 kernel: Code: 44 89 e0 5b 41 5c c3 0f 1f 80 00 00 00 00 48 c1 e1 06 48 03 8c fe e0 23 00 00 45 84 c0 8b 50 08 44 8b 48 0c 74 78 85 db 74 3c <48> 83 41 08 01 0f b6 10 83 e2 03 80 fa 03 75 bf 45 84 c0 75 ba 0f
May 29 11:33:59 JPC1 kernel: RSP: 0018:ffffb0c2821836c8 EFLAGS: 00010206
May 29 11:33:59 JPC1 kernel: RAX: ffff9a3f26722ce0 RBX: 0000000000000055 RCX: 0000000000000080
May 29 11:33:59 JPC1 kernel: RDX: 000000000000014c RSI: ffff9a3f30720008 RDI: 0000000000000014
May 29 11:33:59 JPC1 kernel: RBP: ffff9a3f26722c90 R08: 0000000000000001 R09: 0000000000000000
May 29 11:33:59 JPC1 kernel: R10: 0000000000015d5c R11: 0000000000000000 R12: 0000000000000055
May 29 11:33:59 JPC1 kernel: R13: ffff9a3f30720008 R14: 0000000000000014 R15: ffff9a3f26610008
May 29 11:33:59 JPC1 kernel: FS:  00007fbf515de100(0000) GS:ffff9a463f640000(0000) knlGS:0000000000000000
May 29 11:33:59 JPC1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 29 11:33:59 JPC1 kernel: CR2: 0000000000000088 CR3: 000000011bbbe002 CR4: 0000000000770ee0
May 29 11:33:59 JPC1 kernel: PKRU: 55555554
May 29 11:33:59 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 11:33:59 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 11:33:59 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00002001/00002000
May 29 11:33:59 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)
May 29 11:33:59 JPC1 kernel: pcieport 0000:00:01.0: AER: Corrected error received: 0000:00:01.0
May 29 11:33:59 JPC1 kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
May 29 11:33:59 JPC1 kernel: pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00000001/00002000
May 29 11:33:59 JPC1 kernel: pcieport 0000:00:01.0:    [ 0] RxErr                  (First)

And so on. The same error kept repeating until shutdown.
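
To put a number on that kind of log spam, one option is to count the "Corrected error received" lines per PCI address. This is only a sketch: the helper name `count_aer_by_device` is made up for illustration, and it assumes the kernel log lines have the same shape as the ones quoted above.

```shell
# Hypothetical helper: reads kernel log lines on stdin and prints how many
# "AER: Corrected error received" messages each PCIe port reported.
count_aer_by_device() {
    grep 'AER: Corrected error received' \
        | sed -E 's/.*pcieport ([0-9a-f:.]+): AER.*/\1/' \
        | sort | uniq -c | sort -rn
}

# Usage (kernel messages of the current boot):
#   journalctl -k -b 0 | count_aer_by_device
```

A count in the hundreds for a single port, as described above, would confirm that the spam comes from one link rather than from several devices.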

The grub config flags seemed to work in other boots, but the error is always the same: GPU has fallen off the bus. So the flags I used do not seem to resolve the issue.

Please keep in mind that I am only posting journalctl entries that are marked yellow or red, or that seem to have something to do with nvidia… I do not know whether some of them are unrelated to the problem.

One more random thing that came to my mind: I am using a GPU riser cable. But if that were the problem, why did it work the first time, apart from the fans?
Another random thought: I have a 450W SFX PSU. This should be plenty for my setup, especially on an mITX build. But just to be sure: Is there a way to limit power usage before boot to see if there is a problem with that?

Thank you so much already for your time reading this, you guys are doing me such a big favor already. I hope we can find this issue soon. I feel like we are getting closer, especially when the driver initially worked at least once…

Edit: Typos

Oh well… one step was missed. I run a GTX 1050, which needs no special power management rules, but mhwd mentions them: pci/graphic_drivers/hybrid-intel-nvidia-prime/MHWDCONFIG · master · Applications / mhwd-db · GitLab

So I guess they must be set for the RTX 3060. To do this manually, you have to create udev rules:

File: /etc/udev/rules.d/90-nvidia-prime-powermanagement.rules

# Remove NVIDIA USB xHCI Host Controller devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", ATTR{remove}="1"

# Remove NVIDIA USB Type-C UCSI devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", ATTR{remove}="1"

# Remove NVIDIA Audio devices, if present (enable it for kernels lower than 5.5)
#ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", ATTR{remove}="1"

# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"

# Disable runtime PM for NVIDIA VGA/3D controller devices on driver unbind
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"

and

File: /etc/modprobe.d/nvidia-prime-powermanagement.conf

options nvidia "NVreg_DynamicPowerManagement=0x02"

These config files should enable power management on a hybrid setup. I will add them to the steps above.
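
After a reboot, the value the udev rules write to `power/control` can be read back from sysfs. A minimal sketch, assuming the usual sysfs layout under /sys/bus/pci/devices; the function name `nvidia_pm_state` is hypothetical and only used here for illustration.

```shell
# Hypothetical helper: lists every NVIDIA PCI device (vendor 0x10de) found
# under a sysfs root, with its class and runtime PM control setting.
nvidia_pm_state() {
    local root="${1:-/sys/bus/pci/devices}"
    local dev
    for dev in "$root"/*; do
        # Skip anything that is not a readable PCI device entry
        [ -r "$dev/vendor" ] || continue
        [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
        printf '%s class=%s control=%s\n' \
            "$(basename "$dev")" \
            "$(cat "$dev/class")" \
            "$(cat "$dev/power/control" 2>/dev/null || echo n/a)"
    done
}

# Usage:
#   nvidia_pm_state
```

If the bind rules were applied, the VGA controller (class 0x030000) should report `control=auto`. A value of `on` would suggest the rules were not picked up.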

Thanks so much for your fast reply! I guess the best I can do then is restore the snapshot and redo all the installation from scratch, using the steps above including the power management rules?

Are those rules already present or will I need to recreate the file? Is this:

# Remove NVIDIA USB xHCI Host Controller devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", ATTR{remove}="1"

# Remove NVIDIA USB Type-C UCSI devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", ATTR{remove}="1"

# Remove NVIDIA Audio devices, if present (enable it for kernels lower than 5.5)
#ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", ATTR{remove}="1"

# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"

# Disable runtime PM for NVIDIA VGA/3D controller devices on driver unbind
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"

enough for it to work, or do I need more in the file? It would be fantastic if this would work! :crossed_fingers:

No idea. I have no crystal ball to see what is on your system.

Custom rules are here: /etc/udev/rules.d/ (will not be overwritten on update)
System rules are here: /usr/lib/udev/rules.d/ (will be overwritten on every update)

That should be enough. In fact, I just copied and pasted what mhwd does.
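
To see whether a rules file is already present, both directories can be checked in one go. This is just a sketch: `find_udev_rule` is a made-up helper name, and the file name in the usage line is the one suggested earlier in this thread, which may differ from whatever mhwd installs.

```shell
# Hypothetical helper: prints every location where the named rules file
# exists and returns non-zero if it is found nowhere.
find_udev_rule() {
    local name="$1"
    shift
    # Default to the custom and system rule directories named above
    [ "$#" -gt 0 ] || set -- /etc/udev/rules.d /usr/lib/udev/rules.d
    local dir status=1
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            status=0
        fi
    done
    return "$status"
}

# Usage:
#   find_udev_rule 90-nvidia-prime-powermanagement.rules \
#       || echo "not present, the file has to be created"
```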