My laptop keeps on freezing randomly
I tried switching kernels and it still freezes, so I took a look at the journal.
Here is what I have found so far.
Nov 07 01:53:37 euki kernel: DMAR: DRHD: handling fault status reg 3
Nov 07 01:53:37 euki kernel: DMAR: [DMA Read NO_PASID] Request device [00:02.0] fault addr 0x3cdd6000 [fault reason 0x06] PTE Read access is not set
Nov 07 01:53:37 euki kernel: x86/cpu: SGX disabled by BIOS.
Nov 07 01:53:37 euki kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.TDGC], AE_NOT_FOUND (20220331/psargs-330)
Nov 07 01:53:37 euki kernel: ACPI Error: Aborting method \_SB.PCI0.RP05.PC01._ON due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Nov 07 01:53:37 euki kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.IPPF._STA.POS1], AE_NOT_FOUND (20220331/psargs-330)
Nov 07 01:53:37 euki kernel: ACPI Error: Aborting method \_SB.IPPF._STA due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Nov 07 01:53:37 euki kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.IPPF._STA.POS1], AE_NOT_FOUND (20220331/psargs-330)
Nov 07 01:53:37 euki kernel: ACPI Error: Aborting method \_SB.IPPF._STA due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Nov 07 01:53:37 euki kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.IPPF._STA.POS1], AE_NOT_FOUND (20220331/psargs-330)
Nov 07 01:53:37 euki kernel: ACPI Error: Aborting method \_SB.IPPF._STA due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Nov 07 01:53:37 euki kernel: intel-spi 0000:00:1f.5: invalid resource
Nov 07 01:53:38 euki systemd-udevd[289]: could not read from '/sys/module/pcc_cpufreq/initstate': No such device
Nov 07 01:53:38 euki kernel: platform idma64.0: failed to claim resource 0: [mem 0x00000800-0x00000fff]
Nov 07 01:53:38 euki kernel: platform idma64.0: failed to claim resource 0: [mem 0x00000800-0x00000fff]
Nov 07 01:53:38 euki kernel: platform idma64.0: failed to claim resource 0: [mem 0x00000800-0x00000fff]
Nov 07 01:53:38 euki kernel: platform idma64.0: failed to claim resource 0: [mem 0x00000800-0x00000fff]
Nov 07 01:53:43 euki lightdm[911]: gkr-pam: unable to locate daemon control file
Nov 07 01:53:51 euki systemd-coredump[1651]: Failed to connect to coredump service: Connection refused
This happened before a random freeze yesterday. What can I do??
Thank You Very Much!
The DMAR thing usually seems related to the closed, binary “nvidia” driver. That more or less means that, unless you happen to run into someone with the same issue, “the community” as such has little chance of advising anything other than switching to the open source “nouveau” driver or, on a dual-graphics system, to the onboard graphics, even if only to see whether “nvidia” is indeed the issue.
Which it may not be. The rest of the log excerpt you posted does not indicate problems, but freezes are also a fairly common issue with early Ryzen, for example. In any case, hardware information is needed to possibly say anything; the output of
sudo inxi -Fxz
could in that sense be useful.
[EDIT] I was writing the above while you in fact posted the inxi output…
So no “nvidia” at least. There’s a fair number of reports of similar freezes out there for your CPU, but unfortunately none that seem to definitively diagnose this. The Ryzen thing I noted is a C-state issue though, and this possibly looks like the same.
You’d want to try to boot with the kernel parameter
intel_idle.max_cstate=0
to see if that helps. To do so, edit /etc/default/grub as root, e.g.,
sudo nano /etc/default/grub
and add the above to the GRUB_CMDLINE_LINUX_DEFAULT="…" string, inside the quotes. Save and run
sudo update-grub
and reboot to see if things are then stable.
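If you would rather not edit the line by hand, the same change can be sketched as a small shell function. This is only an illustration: the name add_kernel_param is made up here, and it assumes GNU sed and a double-quoted GRUB_CMDLINE_LINUX_DEFAULT value on a single line. Try it on a copy of the file first.

```shell
# add_kernel_param FILE PARAM (hypothetical helper, not part of GRUB or Manjaro)
# Appends PARAM inside the quotes of GRUB_CMDLINE_LINUX_DEFAULT in FILE,
# unless PARAM is already present on that line.
add_kernel_param() {
    file=$1
    param=$2
    # Nothing to do if the parameter is already on the line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # Insert " PARAM" just before the closing double quote (GNU sed, in place).
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file"
}
```

For example, copy the file with cp /etc/default/grub /tmp/grub.test, run add_kernel_param /tmp/grub.test intel_idle.max_cstate=0, and look at the result before doing the same to the real file as root and running sudo update-grub.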
If not, another suggestion specifically related to the DMAR thing could be the kernel parameter
intel_iommu=igfx_off
but that has less chance of helping (i.e., if I’m not mistaken, a default Arch/Manjaro kernel should not enable it by default anyway).
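To judge whether intel_iommu=igfx_off changes anything, it may help to compare the DMAR fault lines from a boot without it and a boot with it. A small sketch; the helper name dmar_faults is hypothetical, and the pattern is based only on the two DMAR lines quoted in the journal excerpt above.

```shell
# dmar_faults (hypothetical helper): read kernel log text on stdin and
# print only the lines that report a DMAR fault.
dmar_faults() {
    grep -E 'DMAR: .*fault'
}
```

Usage could look like journalctl -k -b | dmar_faults for the current boot, or journalctl -k -b -1 | dmar_faults for the previous one. The faulting device in your log is 00:02.0, which lspci -s 00:02.0 should identify.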
The first suggestion stands a fair chance, but note that you may eat more battery even if it does help. For now just treat it as a debugging step though; if it works you may be able to adjust things “better” through your BIOS setup rather than the kernel parameter.
You’ve now completely disabled the intel_idle driver, which has the kernel fall back to the ACPI driver. That might in fact be good enough, but you may also have some C-state configuration available in your BIOS setup, and perhaps tweaking something there could cost you less battery than the possibly suboptimal ACPI driver (I’ve noticed that on a laptop you have less chance of that than on a desktop, though).
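If you want to confirm which idle driver the kernel actually ended up with, sysfs reports it. A minimal sketch; the function name cpuidle_driver is made up, and the optional path argument exists only so it can be pointed at a test file.

```shell
# cpuidle_driver [PATH] (hypothetical helper): print the active cpuidle driver.
# Defaults to the standard sysfs location; PATH is only for testing.
cpuidle_driver() {
    cat "${1:-/sys/devices/system/cpu/cpuidle/current_driver}"
}
```

Before the change I would expect this to print intel_idle, and something else (presumably acpi_idle) afterwards, but check what your system actually reports.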
In any case, yes, let’s first try and see if it does anything at all…
First just don’t do anything; I may be barking up a completely wrong tree here.
It’s otherwise not related to Manjaro; you’d enter your BIOS-setup by tapping e.g. the Del key, or F2 or F9, or … when the system starts up and then look around for anything suspicious. But for now just wait and see.
Because maybe you have to take care of the other things I mentioned too.
When both audio servers run, in some instances, on some systems things can start to choke. Happened on one of my installs. Here is how to avoid that, if that could be part of your issue.
Same as explained from here:
but you use the intel_pstate=active kernel parameter instead.
On the GRUB boot menu, press the e key, then on the linux kernel line add intel_pstate=active right after quiet, and press F10 to continue booting with it. That is not a permanent entry.
Or make it a permanent entry, by editing the file
but honestly, there is no point to repeat that …
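Since an entry added from the GRUB menu is gone on the next boot, it can be worth checking that the running kernel really got the parameter. A rough sketch; has_kernel_param is a made-up name, and the optional second argument is only there for testing against a file other than /proc/cmdline.

```shell
# has_kernel_param PARAM [CMDLINE_FILE] (hypothetical helper)
# Succeeds if PARAM appears as a whole word on the kernel command line.
has_kernel_param() {
    # Put one parameter per line, then look for an exact full-line match.
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example: has_kernel_param intel_pstate=active && echo "parameter is active". On systems where the intel_pstate driver is loaded, cat /sys/devices/system/cpu/intel_pstate/status shows its current mode as well.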
You don’t. is required, but in order not to keep it active you simply install manjaro-pipewire, which will do everything that is required for you. Just reboot afterwards for it to take effect.