OK, I have been having this issue for a while now with my Nvidia GTX 970 losing the signal to the display. I don’t know if this is an issue with the closed-source drivers or if the video card itself is the cause.
This always happens in games, although not in all of them. At first it only happened with Windows games using Proton; now it also happens with some native Linux ones.
I’m thinking of either switching to the open-source Nouveau drivers, switching to the iGPU, or simply buying this year’s AMD Radeon dGPU.
Has anyone else been having issues with the Nvidia proprietary drivers? If so, how did you deal with them?
The display could be starting to fail in some way. Generally that would be a failure of the display’s power supply. To test this, turn down the brightness of the screen in question.
Is the graphics card overheating at all? If the temperature gets too high, some cards that are already close to their stability limit in terms of voltage can fail and cause problems.
To determine the most likely causes first, we need to know exactly how the failure behaves in your case. For example, does the picture just freeze, or does the display turn off and on again?
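One way to check for overheating is to log the temperature while a game is running and look at the last readings before the signal drops. The following is only a rough sketch: it assumes a single GPU and that nvidia-smi from the proprietary driver is installed. The function names and the threshold of 83 are made up for illustration and are not a documented limit for the GTX 970.

```shell
#!/bin/sh
# Temperature (in degrees C) at which a reading is flagged. 83 is a
# hypothetical default for illustration; override it via the environment.
THRESHOLD="${THRESHOLD:-83}"

# classify_temp TEMP: label a single temperature reading.
classify_temp() {
    if [ "$1" -ge "$THRESHOLD" ]; then
        echo "HOT $1"
    else
        echo "ok $1"
    fi
}

# log_gpu_temp FILE: append one timestamped reading to FILE every 2 seconds
# until interrupted or until nvidia-smi fails. Assumes a single GPU, so that
# nvidia-smi prints exactly one number.
log_gpu_temp() {
    while temp=$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits); do
        echo "$(date '+%H:%M:%S') $(classify_temp "$temp")" >> "$1"
        sleep 2
    done
}
```

Start it in a terminal with something like `log_gpu_temp ~/gpu-temp.log`, launch the game, and read the end of the file after the next signal loss.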
So you have to turn off the display along with the complete system to make it work again? That is a very odd mode of failure. I have actually never seen something like this.
Could you install GreenWithEnvy (gwe) and underclock your card as much as possible, both core and memory, to test whether the card is failing?
Coolbits is a value that tells the nvidia driver how much control you get over your card, for example controlling the fan speed or over/underclocking it. https://wiki.archlinux.org/index.php/NVIDIA/Tips_and_tricks#Enabling_overclocking
For what we want to do here, you will want to enable Coolbits value 8 with nvidia-xconfig --cool-bits=8 .
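Coolbits is a bitmask, so values can be combined: as far as I know, 4 enables manual fan control and 8 enables the clock offsets, so 12 would enable both. After running nvidia-xconfig and restarting X, you can check that the option actually ended up in the X configuration. This is a small sketch; the helper names are made up, and it assumes the option is spelled "Coolbits" the way nvidia-xconfig writes it.

```shell
#!/bin/sh
# coolbits_value FILE: print the first Coolbits value found in an X config file.
coolbits_value() {
    sed -n 's/^[[:space:]]*Option[[:space:]]*"Coolbits"[[:space:]]*"\([0-9]*\)".*/\1/p' "$1" | head -n 1
}

# has_clock_offsets VALUE: succeed when bit 3 (value 8) is set in VALUE.
has_clock_offsets() {
    [ $(( $1 & 8 )) -ne 0 ]
}
```

For example, `has_clock_offsets "$(coolbits_value /etc/X11/xorg.conf)" && echo enabled`, assuming the configuration was written to /etc/X11/xorg.conf.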
When you have done that, you can create a profile in the lower left of gwe and decrease both the GPU and memory offsets as much as possible.
You actually set the offsets way too high and crashed your system, congratulations? As I said earlier, and as it says in the program, these are offsets, not actual clock values. The offsets also go into the negative, and that is where you should set them, as far down as you are able, to test whether the graphics card is the problem. After that, do some of the things that would normally cause the behaviour you described before.
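To make the difference between an offset and a clock value concrete: the offset is added to whatever the card would normally run at, so a negative offset underclocks. The sketch below also prints what I believe is the matching nvidia-settings call. That is not necessarily how gwe applies a profile. The performance level index 3 and the 1050 MHz figure in the usage note are assumptions for illustration and may differ on your card, and both helper names are made up.

```shell
#!/bin/sh
# effective_clock BASE OFFSET: clock in MHz that results from applying OFFSET.
effective_clock() {
    echo $(( $1 + $2 ))
}

# offset_command GPU_OFFSET MEM_OFFSET: print (do not run) an nvidia-settings
# call that would set both offsets. PERF_LEVEL defaults to 3, which is an
# assumption; the valid index depends on the card.
offset_command() {
    level="${PERF_LEVEL:-3}"
    printf "nvidia-settings -a '[gpu:0]/GPUGraphicsClockOffset[%s]=%s' -a '[gpu:0]/GPUMemoryTransferRateOffset[%s]=%s'\n" \
        "$level" "$1" "$level" "$2"
}
```

So `effective_clock 1050 -200` gives 850, whereas typing 1050 into the offset field would ask for roughly double the normal clock, which is the kind of setting that crashes the system.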
That would be easiest, yes.
The config database would be in ~/.config/gwe/
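If you would rather not delete the profile data outright, moving the directory aside has the same effect and keeps a copy. A minimal sketch, with a made-up helper name; it assumes gwe is closed while you do this and that it recreates its configuration on the next start.

```shell
#!/bin/sh
# backup_dir DIR: rename DIR to DIR.bak.<unix time> and print the new name.
backup_dir() {
    [ -d "$1" ] || { echo "no such directory: $1" >&2; return 1; }
    dest="$1.bak.$(date +%s)"
    mv -- "$1" "$dest" && echo "$dest"
}
```

Usage would be `backup_dir "$HOME/.config/gwe"`.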
Don’t worry too much about this though, as it is just software overclocking. You would have much bigger issues if you had modified the firmware in that manner.
You generally cannot set the clocks too low. At least now we should be able to say your graphics card is fine.
The main thing that concerns me is that you said it developed over time, since that generally would not point to a software problem.
Maybe the logs will give some clue as to what exactly is happening. Could you post the output of journalctl -x -p3 -b1 ?
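A note on the flags: -x adds explanatory text to messages, -p3 limits the output to priority "err" and worse, and -b picks the boot. A positive number counts from the oldest boot in the journal, so -b1 is the very first recorded boot, while -b -1 is the boot before the current one, which is usually the one you want after a crash. The sketch below uses -b -1 and narrows the output to lines that mention the nvidia driver. The function names are made up and the search pattern is only a guess at what is relevant.

```shell
#!/bin/sh
# gpu_lines: keep only lines from stdin that mention the nvidia driver
# (NVRM and Xid are prefixes the driver uses in kernel messages).
gpu_lines() {
    grep -iE 'nvrm|xid|nvidia'
}

# previous_boot_errors: errors and worse from the boot before the current one.
previous_boot_errors() {
    journalctl -x -p 3 -b -1 --no-pager
}
```

`journalctl --list-boots` shows which boots are available, and `previous_boot_errors | gpu_lines` prints the filtered result.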
-- Logs begin at Sat 2020-04-11 12:31:05 CDT, end at Sun 2020-08-23 07:05:34 CDT. --
Apr 11 12:31:05 william-pc kernel: [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0x22 (or later)
Apr 11 12:31:05 william-pc systemd: Failed to start Load Kernel Modules.
Apr 11 12:31:05 william-pc systemd: Failed to start CLI Netfilter Manager.
Apr 11 12:31:05 william-pc systemd-modules-load: Failed to lookup module alias 'crypto_user': Function not implemented
Apr 11 12:31:05 william-pc systemd-modules-load: Failed to lookup module alias 'nvidia': Function not implemented
Apr 11 12:31:05 william-pc systemd-modules-load: Failed to lookup module alias 'nvidia-drm': Function not implemented
Apr 11 12:31:05 william-pc systemd-modules-load: Failed to lookup module alias 'uinput': Function not implemented
Apr 11 12:31:09 william-pc systemd: Failed to start Light Display Manager.
-- Subject: A start job for unit lightdm.service has failed
-- Defined-By: systemd
-- Support: https://forum.manjaro.org/c/technical-issues-and-assistance
--
-- A start job for unit lightdm.service has finished with a failure.
--
-- The job identifier is 843 and the job result is failed.