Confusing Crash Mayhem

Hi all,

My system has developed some serious issues that have been escalating over the last half year, and frankly, I’m out of ideas. I don’t know what to test anymore, or how, and to be honest I’m not the most advanced user Linux has ever seen either, so I need some help.
I’m aware that this issue might not be Manjaro’s fault at all, but on the other hand someone who knows how Manjaro behaves with hardware issues might spot what’s wrong. I suspect a hardware issue, at least. I might be totally wrong there, though.

I will try to list everything that’s not perfectly normal behavior, even if it seems unrelated, in case I missed some freak connection there.

In a nutshell, some programs randomly crash without giving much information about it, or worse, giving contradictory information, and the crashes escalate until the programs don’t start at all.

The first issue I recall ever having with this system was trouble booting up. Regularly (maybe every third boot?), the motherboard logo would show up and stay indefinitely without giving the option to go into UEFI. A restart solved it. This barely happens now; it fixed itself, apparently? Weird stuff, but hey, I won’t complain.
I started to monitor GPU and CPU simply by always having htop and stuff open, just to get a feel for any problems. I didn’t spot any weird behavior there, no spikes or anything, and I kind of forgot about it.

I switched to a browser called Vivaldi and was quite happy with it until I got tab crashes. At first they happened only sometimes and were easily fixed by reloading the tab, but that soon escalated to the point where I couldn’t really use the browser at all. I switched back to Firefox, which worked for some time but soon developed similar problems, until both Vivaldi and Firefox crashed completely and kept crashing on reload. The console output can be seen here:

https://ibb.co/Jc56YPw (First part vivaldi tab crash)
https://ibb.co/kyFRG2r (Second part vivaldi tab crash)
https://ibb.co/1spnMk4 (Firefox crash; as you can see, it tried to restart itself multiple times)
(That’s too much to write out, I’m sorry. Also, the vaapi issue seen here was fixed with a simple vaapi update, but that didn’t change the main issue.)

This development stretched out over a month, maybe two. No other program had any issues at all, just browsers. Eventually Vivaldi refused to open at all, reporting “Speicherzugriffsfehler”, which is German for memory access violation (a segmentation fault), I guess.

$ LIBVA_MESSAGING_LEVEL=2 vivaldi-stable
[0329/163813.292535:ERROR:elf_dynamic_array_reader.h(64)] tag not found
[0329/163813.292793:ERROR:elf_dynamic_array_reader.h(64)] tag not found
Speicherzugriffsfehler (Speicherabzug geschrieben)

$
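
“Speicherabzug geschrieben” is the German rendering of “core dumped”, so the kernel log will usually hold a matching segfault line for each crash. Below is a minimal sketch that counts segfaults per program. It assumes a systemd journal and the usual kernel message shape `name[pid]: segfault at ...`; the function name segfaults_by_program is made up for illustration. Crashes spread across unrelated programs would point more towards hardware than towards one broken package.

```shell
# segfaults_by_program: read kernel log messages on stdin and print how many
# segfault lines each program produced, most frequent first.
# Assumes lines shaped like "vivaldi-bin[2931]: segfault at 10 ip ... error 4 in ..."
segfaults_by_program() {
    grep -E '\[[0-9]+\]: segfault at ' \
        | sed -E 's/^(.*)\[[0-9]+\]: segfault at .*$/\1/' \
        | sort | uniq -c | sort -rn
}

# Possible usage on the affected machine (kernel messages of the current boot,
# message text only):
# journalctl -k -b -o cat | segfaults_by_program
```

If systemd-coredump is active, `coredumpctl list` should show the same crashes with timestamps.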

At this point I ran some extensive tests of all disks and RAM; they did not, and still do not, report any issues at all. Specifically, I used memtester to test the RAM, and smartctl -t long -a and badblocks -v to test every disk I have (feel free to suggest other methods, that’s just what 20 minutes of Google spat out for me). Eventually, I had to reset the system and, naively thinking that a reset should fix anything like this, switched from Xfce to KDE on a whim.
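
For the SATA drives, the SMART attribute table can be skimmed for a few commonly watched raw counters. This is a minimal sketch, assuming the usual ATA attribute table printed by `smartctl -A` (attribute name in column 2, raw value in column 10). smart_red_flags is a made-up helper name, and NVMe drives report a different format that this does not cover.

```shell
# smart_red_flags: read `smartctl -A` output for an ATA/SATA drive on stdin and
# print any of three commonly watched attributes whose raw value is not 0.
smart_red_flags() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 != 0 {
        print $2 "=" $10
    }'
}

# Possible usage (needs root; the device name is only an example):
# sudo smartctl -A /dev/sdb | smart_red_flags
```

No output means none of the three counters is above zero, which says nothing about RAM or other components.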

So, happy as a clam, I go about my day, browsing and gaming and stuff, until: Vivaldi tab crashes. Uh…

$ /usr/bin/vivaldi-stable %U
[2931:2931:0402/210019.399092:ERROR:CONSOLE(1)] "syncDetachedTabInformation: The message port closed before a response was received.", source: chrome-extension://mpognobbkildjkofajifpdfhcoklimli/bundle.js (1)
[2931:2931:0402/210019.399233:ERROR:CONSOLE(1)] "syncDetachedTabInformation: The message port closed before a response was received.", source: chrome-extension://mpognobbkildjkofajifpdfhcoklimli/bundle.js (1)
[2931:2931:0402/210019.792670:ERROR:CONSOLE(1)] "syncDetachedTabInformation: The message port closed before a response was received.", source: chrome-extension://mpognobbkildjkofajifpdfhcoklimli/bundle.js (1)
[2931:2931:0402/210020.145513:ERROR:sharing_service.cc(222)] Device registration failed with fatal error
[2931:3023:0402/210020.650013:ERROR:chrome_browser_main_extra_parts_metrics.cc(227)] START: ReportBluetoothAvailability(). If you dont see the END: message, this is crbug.com/1216328.
[2931:3023:0402/210020.650035:ERROR:chrome_browser_main_extra_parts_metrics.cc(230)] END: ReportBluetoothAvailability()

(No full program crash here, just a tab crash. Also, I use Manjaro and Vivaldi on my laptop and I get these chrome-extension errors there, too, but do not have any issues.)

Firefox is acting up again, too, just like the first time: slowly escalating.

I noticed today that Thunderbird does not start without issues. I had to try about three times, and with how things escalated for Vivaldi and Firefox… I am scared, help :D.

$ /usr/lib/thunderbird/thunderbird %u
[calBackendLoader] Using Thunderbirds libical backend
[LDAPModuleLoader] Using LDAPDirectory.jsm
[MsgSendModuleLoader] Using MessageSend.jsm
[SmtpModuleLoader] Using SmtpService.jsm
JavaScript error: resource://modules/MessengerContentHandler.jsm, line 76: NS_ERROR_FAILURE:

$

I’m starting to suspect Steam has issues as well, but I don’t want to commit to that statement yet. Gaming on Linux is inherently unstable and this might just be a normal “I don’t want to, go beg me” issue that will fix itself with time.

Some info on the system:
The disks are two years old or younger; the CPU (Ryzen 7 3800X), GPU (GTX 1660 Ti) and motherboard (MSI MPG X570 Gaming Plus) are about a year old, as are two of the RAM sticks. The other two are about a month old, bought after this issue started. The power supply is the only thing that’s really old, I guess. I use two HDDs, an SSD and an M.2 SSD; the M.2 holds the Manjaro partition, but it was on the slower SSD before the reset.
Edit: See the post below for more.

I am totally aware that this issue is not easy to pin down, but I will gladly test anything you can suggest or even think of at this point, including but not limited to shaman rituals and blood sacrifices :D.

At the very least,
thanks for reading through <3
Nandalee

Hello,

Please use this [HowTo] Provide System Information and share the system information that way.
From this we can’t conclude what the issue would be. That is a configuration that is probably twice as powerful as the one I have, yet my system has been performing really well for 3 years with the same install.

Oh I should have done that on my own, I guess, sorry. Here you go:

System:
  Kernel: 5.15.28-1-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 11.2.0
    parameters: BOOT_IMAGE=/boot/vmlinuz-5.15-x86_64
    root=UUID=853cb94c-d24f-4ff4-8d2d-879c67a526e9 rw quiet apparmor=1
    security=apparmor resume=UUID=6d02a5a6-ebf4-4b9a-a72f-8c75e0b6170e
    udev.log_priority=3
  Desktop: KDE Plasma v: 5.24.3 tk: Qt v: 5.15.3 wm: kwin_x11 vt: 1 dm: SDDM
    Distro: Manjaro Linux base: Arch Linux
Machine:
  Type: Desktop Mobo: Micro-Star model: MPG X570 GAMING PLUS (MS-7C37) v: 2.0
    serial: <superuser required> UEFI: American Megatrends v: A.B0
    date: 10/29/2020
Battery:
  Message: No system battery data found. Is one present?
Memory:
  RAM: total: 31.33 GiB used: 3.51 GiB (11.2%)
  RAM Report:
    permissions: Unable to run dmidecode. Root privileges required.
CPU:
  Info: model: AMD Ryzen 7 3800X bits: 64 type: MT MCP arch: Zen 2
    family: 0x17 (23) model-id: 0x71 (113) stepping: 0 microcode: 0x8701021
  Topology: cpus: 1x cores: 8 tpc: 2 threads: 16 smt: enabled cache:
    L1: 512 KiB desc: d-8x32 KiB; i-8x32 KiB L2: 4 MiB desc: 8x512 KiB
    L3: 32 MiB desc: 2x16 MiB
  Speed (MHz): avg: 2942 high: 4234 min/max: 2200/4559 boost: enabled
    scaling: driver: acpi-cpufreq governor: schedutil cores: 1: 3642 2: 2073
    3: 2931 4: 2808 5: 4222 6: 2090 7: 2276 8: 3152 9: 4193 10: 2032 11: 2326
    12: 3140 13: 3280 14: 2150 15: 4234 16: 2532 bogomips: 124863
  Flags: 3dnowprefetch abm adx aes aperfmperf apic arat avic avx avx2 bmi1
    bmi2 bpext cat_l3 cdp_l3 clflush clflushopt clwb clzero cmov cmp_legacy
    constant_tsc cpb cpuid cqm cqm_llc cqm_mbm_local cqm_mbm_total
    cqm_occup_llc cr8_legacy cx16 cx8 de decodeassists extapic extd_apicid
    f16c flushbyasid fma fpu fsgsbase fxsr fxsr_opt ht hw_pstate ibpb ibs
    irperf lahf_lm lbrv lm mba mca mce misalignsse mmx mmxext monitor movbe
    msr mtrr mwaitx nonstop_tsc nopl npt nrip_save nx osvw overflow_recov pae
    pat pausefilter pclmulqdq pdpe1gb perfctr_core perfctr_llc perfctr_nb
    pfthreshold pge pni popcnt pse pse36 rapl rdpid rdpru rdrand rdseed rdt_a
    rdtscp rep_good sep sev sev_es sha_ni skinit smap smca sme smep ssbd sse
    sse2 sse4_1 sse4_2 sse4a ssse3 stibp succor svm svm_lock syscall tce
    topoext tsc tsc_scale umip v_spec_ctrl v_vmsave_vmload vgif vmcb_clean vme
    vmmcall wbnoinvd wdt xgetbv1 xsave xsavec xsaveerptr xsaveopt xsaves
  Vulnerabilities:
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: spec_store_bypass
    mitigation: Speculative Store Bypass disabled via prctl and seccomp
  Type: spectre_v1
    mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2
    mitigation: Retpolines, IBPB: conditional, STIBP: conditional, RSB filling
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: NVIDIA TU116 [GeForce GTX 1660 Ti] vendor: ZOTAC driver: nvidia
    v: 510.54 alternate: nouveau,nvidia_drm pcie: gen: 3 speed: 8 GT/s lanes: 16
    bus-ID: 2d:00.0 chip-ID: 10de:2182 class-ID: 0300
  Display: x11 server: X.Org v: 1.21.1.3 with: Xwayland v: 22.1.0
    compositor: kwin_x11 driver: X: loaded: nvidia gpu: nvidia display-ID: :0
    screens: 1
  Screen-1: 0 s-res: 3840x1080 s-dpi: 81 s-size: 1204x343mm (47.40x13.50")
    s-diag: 1252mm (49.29")
  Monitor-1: DP-3 pos: primary,left res: 1920x1080 dpi: 81
    size: 600x340mm (23.62x13.39") diag: 690mm (27.15") modes: N/A
  Monitor-2: HDMI-0 pos: primary,right res: 1920x1080 hz: 60 dpi: 82
    size: 598x336mm (23.54x13.23") diag: 686mm (27.01") modes: N/A
  OpenGL: renderer: NVIDIA GeForce GTX 1660 Ti/PCIe/SSE2
    v: 4.6.0 NVIDIA 510.54 direct render: Yes
Audio:
  Device-1: NVIDIA TU116 High Definition Audio vendor: ZOTAC
    driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16
    bus-ID: 2d:00.1 chip-ID: 10de:1aeb class-ID: 0403
  Device-2: AMD Starship/Matisse HD Audio vendor: Micro-Star MSI
    driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16
    bus-ID: 2f:00.4 chip-ID: 1022:1487 class-ID: 0403
  Sound Server-1: ALSA v: k5.15.28-1-MANJARO running: yes
  Sound Server-2: JACK v: 1.9.20 running: no
  Sound Server-3: PulseAudio v: 15.0 running: yes
  Sound Server-4: PipeWire v: 0.3.48 running: yes
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: Micro-Star MSI X570-A PRO driver: r8169 v: kernel pcie: gen: 1
    speed: 2.5 GT/s lanes: 1 port: d000 bus-ID: 27:00.0 chip-ID: 10ec:8168
    class-ID: 0200
  IF: enp39s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  IP v4: <filter> type: dynamic noprefixroute scope: global
    broadcast: <filter>
  IP v6: <filter> type: dynamic noprefixroute scope: global
  IP v6: <filter> type: noprefixroute scope: link
  WAN IP: <filter>
Bluetooth:
  Message: No bluetooth data found.
Logical:
  Message: No logical block device data found.
RAID:
  Message: No RAID data found.
Drives:
  Local Storage: total: 3.75 TiB used: 323.23 GiB (8.4%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung
    model: SSD 970 EVO Plus 1TB size: 931.51 GiB block-size: physical: 512 B
    logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
    rev: 2B2QEXM7 temp: 37.9 C scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 vendor: Samsung model: SSD 840 Series
    size: 111.79 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    type: SSD serial: <filter> rev: 6B0Q scheme: GPT
  ID-3: /dev/sdb maj-min: 8:16 vendor: Toshiba model: DT01ACA200
    size: 1.82 TiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
    type: HDD rpm: 7200 serial: <filter> rev: ABB0 scheme: MBR
  ID-4: /dev/sdc maj-min: 8:32 vendor: Western Digital
    model: WD10EZEX-08M2NA0 size: 931.51 GiB block-size: physical: 4096 B
    logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 7200 serial: <filter>
    rev: 1A01 scheme: MBR
  Message: No optical or floppy data found.
Partition:
  ID-1: / raw-size: 896.75 GiB size: 881.6 GiB (98.31%)
    used: 130.08 GiB (14.8%) fs: ext4 dev: /dev/nvme0n1p2 maj-min: 259:2
    label: N/A uuid: 853cb94c-d24f-4ff4-8d2d-879c67a526e9
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
    used: 288 KiB (0.1%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
    label: NO_LABEL uuid: 1789-C51C
  ID-3: /hdd1 raw-size: 1.82 TiB size: 1.79 TiB (98.38%)
    used: 23.31 GiB (1.3%) fs: ext4 dev: /dev/sdb1 maj-min: 8:17 label: N/A
    uuid: caf7f19b-cea9-4c88-b3dc-c46cd8fb55ec
  ID-4: /hdd2 raw-size: 931.51 GiB size: 915.89 GiB (98.32%)
    used: 96.34 GiB (10.5%) fs: ext4 dev: /dev/sdc1 maj-min: 8:33 label: N/A
    uuid: 2c1f95dc-603c-40a4-b396-b1e07828bbeb
  ID-5: /ssd raw-size: 111.79 GiB size: 109.47 GiB (97.93%)
    used: 73.5 GiB (67.1%) fs: ext4 dev: /dev/sda1 maj-min: 8:1 label: N/A
    uuid: 4fe51f2b-1dcd-4bf7-8aad-ab23084f4506
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default)
  ID-1: swap-1 type: partition size: 34.47 GiB used: 0 KiB (0.0%)
    priority: -2 dev: /dev/nvme0n1p3 maj-min: 259:3 label: swap
    uuid: 6d02a5a6-ebf4-4b9a-a72f-8c75e0b6170e
Unmounted:
  Message: No unmounted partitions found.
USB:
  Hub-1: 1-0:1 info: Hi-speed hub with single TT ports: 6 rev: 2.0
    speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Device-1: 1-5:2 info: Logitech Gaming Mouse G300 type: Mouse,Keyboard
    driver: hid-generic,usbhid interfaces: 2 rev: 2.0 speed: 12 Mb/s
    power: 200mA chip-ID: 046d:c246 class-ID: 0300
  Device-2: 1-6:3 info: Trust Keyboard [GXT 830] type: Keyboard,HID
    driver: hid-generic,usbhid interfaces: 2 rev: 1.1 speed: 1.5 Mb/s
    power: 100mA chip-ID: 145f:01e5 class-ID: 0300
  Hub-2: 2-0:1 info: Super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s
    chip-ID: 1d6b:0003 class-ID: 0900
  Hub-3: 3-0:1 info: Hi-speed hub with single TT ports: 6 rev: 2.0
    speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Device-1: 3-5:2 info: Micro Star MYSTIC LIGHT type: HID
    driver: hid-generic,usbhid interfaces: 1 rev: 1.1 speed: 12 Mb/s
    power: 500mA chip-ID: 1462:7c37 class-ID: 0300 serial: <filter>
  Hub-4: 3-6:3 info: Genesys Logic Hub ports: 4 rev: 2.0 speed: 480 Mb/s
    power: 100mA chip-ID: 05e3:0608 class-ID: 0900
  Hub-5: 4-0:1 info: Super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s
    chip-ID: 1d6b:0003 class-ID: 0900
  Hub-6: 5-0:1 info: Hi-speed hub with single TT ports: 2 rev: 2.0
    speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Hub-7: 6-0:1 info: Super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s
    chip-ID: 1d6b:0003 class-ID: 0900
  Hub-8: 7-0:1 info: Hi-speed hub with single TT ports: 4 rev: 2.0
    speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Hub-9: 8-0:1 info: Super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s
    chip-ID: 1d6b:0003 class-ID: 0900
Sensors:
  System Temperatures: cpu: N/A mobo: N/A gpu: nvidia temp: 36 C
  Fan Speeds (RPM): N/A gpu: nvidia fan: 31%
Info:
  Processes: 382 Uptime: 4m wakeups: 0 Init: systemd v: 250 tool: systemctl
  Compilers: gcc: 11.2.0 clang: 13.0.1 Packages: pacman: 1310 lib: 354
  flatpak: 0 Shell: Zsh v: 5.8.1 default: Bash v: 5.1.16 running-in: konsole
  inxi: 3.3.15

(On that note, I also found a post about how to share links, so I will try to add the console pics I gathered in a somewhat readable fashion, too.)

Manjaro did perfectly well on this system before the upgrades last year, too, which is why I don’t really think it is a Manjaro-exclusive issue. Well, it was essentially another system with these drives, anyway. I don’t think it’s a performance issue either, that would be ridiculous :) “Manjaro monster OS eats Ryzen for breakfast”.
Is there something specific I could test that you would try first?

Well, this forum member apparently had the same issue as you, with the same mainboard model and an older BIOS.

Does the issue persist even if you update it?
https://www.msi.com/Motherboard/MPG-X570-GAMING-PLUS/support

I upgraded it and so far nothing has crashed, but that doesn’t mean anything yet. There are usually days when it doesn’t happen at all, so I will report back on this in a few days.

@bogdancovaciu Out of curiosity, how did you jump from that other issue you posted to motherboard drivers? It wasn’t mentioned there, unless I missed something? Or was that purely because you noticed the same model? Also, although it might well be a related issue, it doesn’t sound remotely alike at first glance. I don’t suffer any graphical bugs at all, inside or outside a browser (much to my delight, I might add, these look annoying af), and neither Xfce nor KDE has crashed before, otherwise I would have strongly suspected my graphics card. My tabs simply shut down, clearly visible at that :D.
(Also: It’s really convenient and thoughtful of you to provide a link to MSI like that, thank you ^^)

Well, past experience has taught me that sometimes things manifest a bit differently while the root cause is the same. That is why I always start with a BIOS update, then check the cables of each device inside the PC, then the drivers … until I reach a conclusion. My “method” is quite empirical and sometimes I can be off.

If you usually put the PC to sleep or hibernate it, do so only once per session.

Nope, it happened again today. This time the console wouldn’t start either, which is unfortunate… Any other ideas? I checked my cables, and the drivers should be up to date, too.

I just shut down normally, no sleep or anything. Just off. Systems generally tend to be more stable for me without sleep mode or hibernation, so I avoid them.

Having run out of other ideas, I borrowed a similar system from a friend and swapped components one by one to test the hardware further. I concluded that two of the four RAM sticks are probably the culprit, since the issue has not appeared for over a week now, even though every extensive memory test came back without issues.
I do wonder about how memory tests are done, though, if such an issue goes unnoticed. I will step up the paranoid backup game another notch, I guess.
Anyways, thank you all for helping out, even if it was such a banality in the end <3
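
On the question of how memory tests can miss a bad stick: testers such as memtester write known patterns to RAM and read them back, so a fault that only appears occasionally (for example under particular temperature, timing or load) can slip through a single clean pass. Below is a rough sketch of repeating a test several times and counting the failing runs. It assumes memtester is installed and run as root; repeat_test is a hypothetical helper, and the size and run count are placeholders.

```shell
# repeat_test N CMD...: run CMD N times, discard its output, and print how
# many of the runs exited with a non-zero status.
repeat_test() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Possible usage (needs root; 1G and 5 runs are placeholder values):
# repeat_test 5 memtester 1G 1
```

Even many clean runs only lower the odds of a fault; swapping the sticks out, as done here, remains the more direct check.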
