Hardware Error Microcode a20102b Ryzen 9 5950X

inxi --full --admin --filter --width
System:
  Kernel: 6.6.44-1-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 14.1.1
    clocksource: tsc avail: hpet,acpi_pm
    parameters: BOOT_IMAGE=/boot/vmlinuz-6.6-x86_64
    root=UUID=9f38c2ae-21ca-4215-822a-0e478276b220 ro quiet splash apparmor=1
    security=apparmor udev.log_priority=3
  Desktop: GNOME v: 46.3.1 tk: GTK v: 3.24.43 wm: gnome-shell
    tools: gsd-screensaver-proxy dm: GDM v: 46.2 Distro: Manjaro base: Arch Linux
Machine:
  Type: Desktop Mobo: ASUSTeK model: TUF GAMING X570-PLUS (WI-FI) v: Rev X.0x
    serial: <superuser required> part-nu: SKU uuid: <superuser required>
    UEFI: American Megatrends v: 5013 date: 03/22/2024
CPU:
  Info: model: AMD Ryzen 9 5950X bits: 64 type: MT MCP arch: Zen 3+ gen: 4
    level: v3 note: check built: 2022 process: TSMC n6 (7nm) family: 0x19 (25)
    model-id: 0x21 (33) stepping: 0 microcode: 0xA20102B
  Topology: cpus: 1x cores: 16 tpc: 2 threads: 32 smt: enabled cache:
    L1: 1024 KiB desc: d-16x32 KiB; i-16x32 KiB L2: 8 MiB desc: 16x512 KiB
    L3: 64 MiB desc: 2x32 MiB
  Speed (MHz): avg: 2535 high: 3599 min/max: 2200/5083 boost: enabled
    scaling: driver: acpi-cpufreq governor: schedutil cores: 1: 3400 2: 2200
    3: 2200 4: 2200 5: 2200 6: 2200 7: 2200 8: 2200 9: 3598 10: 3599 11: 2879
    12: 3597 13: 2200 14: 2200 15: 2200 16: 2200 17: 3596 18: 2200 19: 2876
    20: 2929 21: 2200 22: 2875 23: 2200 24: 2200 25: 2200 26: 2200 27: 2200
    28: 2200 29: 2200 30: 2200 31: 2200 32: 3400 bogomips: 217721
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: reg_file_data_sampling status: Not affected
  Type: retbleed status: Not affected
  Type: spec_rstack_overflow mitigation: Safe RET
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Retpolines; IBPB: conditional; IBRS_FW;
    STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not
    affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: AMD Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] driver: amdgpu
    v: kernel arch: RDNA-2 code: Navi-2x process: TSMC n7 (7nm) built: 2020-22
    pcie: gen: 4 speed: 16 GT/s lanes: 16 ports: active: DP-1
    empty: DP-2,DP-3,HDMI-A-1 bus-ID: 0c:00.0 chip-ID: 1002:73bf class-ID: 0300
  Display: x11 server: X.Org v: 21.1.13 with: Xwayland v: 24.1.1
    compositor: gnome-shell driver: X: loaded: amdgpu
    unloaded: modesetting,radeon alternate: fbdev,vesa dri: radeonsi
    gpu: amdgpu display-ID: :0 screens: 1
  Screen-1: 0 s-res: 2560x1440 s-dpi: 96 s-size: 677x381mm (26.65x15.00")
    s-diag: 777mm (30.58")
  Monitor-1: DP-1 mapped: DisplayPort-0 model: Sceptre Y27 serial: <filter>
    built: 2020 res: 2560x1440 hz: 165 dpi: 109 gamma: 1.2
    size: 597x336mm (23.5x13.23") diag: 685mm (27") ratio: 16:9 modes:
    max: 2560x1440 min: 720x400
  API: EGL v: 1.5 hw: drv: amd radeonsi platforms: device: 0 drv: radeonsi
    device: 1 drv: swrast gbm: drv: kms_swrast surfaceless: drv: radeonsi x11:
    drv: radeonsi inactive: wayland
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 24.1.5-manjaro1.1
    glx-v: 1.4 direct-render: yes renderer: AMD Radeon RX 6900 XT (radeonsi
    navi21 LLVM 18.1.8 DRM 3.54 6.6.44-1-MANJARO) device-ID: 1002:73bf
    memory: 15.62 GiB unified: no
  API: Vulkan v: 1.3.279 layers: 8 device: 0 type: discrete-gpu name: AMD
    Radeon RX 6900 XT (RADV NAVI21) driver: mesa radv v: 24.1.5-manjaro1.1
    device-ID: 1002:73bf surfaces: xcb,xlib
Audio:
  Device-1: AMD Navi 21/23 HDMI/DP Audio driver: snd_hda_intel v: kernel pcie:
    gen: 4 speed: 16 GT/s lanes: 16 bus-ID: 0c:00.1 chip-ID: 1002:ab28
    class-ID: 0403
  Device-2: AMD Starship/Matisse HD Audio vendor: ASUSTeK
    driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16
    bus-ID: 0e:00.4 chip-ID: 1022:1487 class-ID: 0403
  Device-3: Best Buy INSIGNIA NS-CBM19
    driver: hid-generic,snd-usb-audio,usbhid type: USB rev: 2.0 speed: 12 Mb/s
    lanes: 1 mode: 1.1 bus-ID: 7-2:2 chip-ID: 2034:0105 class-ID: 0300
    serial: <filter>
  API: ALSA v: k6.6.44-1-MANJARO status: kernel-api with: aoss
    type: oss-emulator tools: alsactl,alsamixer,amixer
  Server-1: JACK v: 1.9.22 status: off tools: N/A
  Server-2: PipeWire v: 1.2.2 status: off with: pipewire-media-session
    status: active tools: pw-cli
  Server-3: PulseAudio v: 17.0 status: active with: 1: pulseaudio-alsa
    type: plugin 2: pulseaudio-jack type: module tools: pacat,pactl
Network:
  Device-1: Intel Wi-Fi 5 Wireless-AC 9x6x [Thunder Peak] driver: iwlwifi
    v: kernel pcie: gen: 2 speed: 5 GT/s lanes: 1 bus-ID: 05:00.0
    chip-ID: 8086:2526 class-ID: 0280
  IF: wlp5s0 state: up mac: <filter>
  Device-2: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet
    vendor: ASUSTeK driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1
    port: d000 bus-ID: 06:00.0 chip-ID: 10ec:8168 class-ID: 0200
  IF: enp6s0 state: down mac: <filter>
  Info: services: NetworkManager, systemd-timesyncd, wpa_supplicant
Bluetooth:
  Device-1: Intel Wireless-AC 9260 Bluetooth Adapter driver: btusb v: 0.8
    type: USB rev: 2.0 speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 3-5:2
    chip-ID: 8087:0025 class-ID: e001
  Report: rfkill ID: hci0 rfk-id: 1 state: up address: see --recommends
Drives:
  Local Storage: total: 4.09 TiB used: 1.18 TiB (28.9%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Sabrent model: Rocket 4.0 1TB
    size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 63.2 Gb/s
    lanes: 4 tech: SSD serial: <filter> fw-rev: RKT401.2 temp: 45.9 C
    scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 vendor: Western Digital model: WD20EZAZ-00GGJB0
    size: 1.82 TiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
    tech: HDD rpm: 5400 serial: <filter> fw-rev: 0A80 scheme: GPT
  ID-3: /dev/sdb maj-min: 8:16 vendor: Crucial model: CT1000MX500SSD1
    size: 931.51 GiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
    tech: SSD serial: <filter> fw-rev: 033 scheme: GPT
  ID-4: /dev/sdc maj-min: 8:32 vendor: SanDisk model: Ultra size: 460.27 GiB
    block-size: physical: 512 B logical: 512 B type: USB rev: 3.2 spd: 5 Gb/s
    lanes: 1 mode: 3.2 gen-1x1 tech: N/A serial: <filter> fw-rev: 1.00
    scheme: MBR
Partition:
  ID-1: / raw-size: 488.28 GiB size: 479.55 GiB (98.21%)
    used: 369.02 GiB (77.0%) fs: ext4 dev: /dev/nvme0n1p6 maj-min: 259:6
  ID-2: /boot/efi raw-size: 100 MiB size: 99.2 MiB (99.21%)
    used: 31.9 MiB (32.1%) fs: vfat dev: /dev/nvme0n1p3 maj-min: 259:3
Swap:
  Alert: No swap data was found.
Sensors:
  System Temperatures: cpu: 51.9 C mobo: N/A gpu: amdgpu temp: 59.0 C
    mem: 56.0 C
  Fan Speeds (rpm): N/A gpu: amdgpu fan: 0
Info:
  Memory: total: 32 GiB available: 31.25 GiB used: 4.51 GiB (14.4%)
  Processes: 624 Power: uptime: 5m states: freeze,mem,disk suspend: deep
    avail: s2idle wakeups: 0 hibernate: platform avail: shutdown, reboot,
    suspend, test_resume image: 12.48 GiB services: gsd-power,
    power-profiles-daemon, upowerd Init: systemd v: 256 default: graphical
    tool: systemctl
  Packages: 2009 pm: pacman pkgs: 1990 libs: 568 tools: gnome-software,pamac
    pm: appimage pkgs: 0 pm: flatpak pkgs: 0 pm: snap pkgs: 19 Compilers:
    clang: 18.1.8 gcc: 14.1.1 Shell: Zsh v: 5.9 running-in: gnome-terminal
    inxi: 3.3.35

uname -r
6.6.44-1-MANJARO

Hello, I am having trouble with random restarts while doing development in Unity. I would usually also have Brave, Discord, Blender, etc. open; sometimes just Unity and Brave. The system restarts and presents me with a hardware error message. The two screenshots I have are from two separate instances. I believe I installed the CPU while running the 5.9 kernel. I’ve used 5.10, 5.14, 5.15, 6.1, and 6.6 since, and they all have the same issue. The only workarounds are to:

  1. Go into my ASUS TUF Gaming WiFi X570 BIOS and switch from Normal to ASUS Optimal in easy mode (or manually set the core ratio to 40, which has the same effect)
  2. OR use a custom High-Performance profile I made in CoreCtrl (never mind, this had no effect)

I want to get the full speed out of my CPU, as all 32 threads do in fact get used when compiling and rendering (I don’t have Cycles GPU support yet).

I found this thread

so I tried adding processor.max_cstate=5 amd_iommu=on rcu_nocbs=0_11 to my kernel parameters as suggested. I just pressed ‘e’ in the GRUB menu and added them at the bottom. This did not help; I had another crash yesterday after doing that. It’s been years since I’ve used Windows on my main machine, and I don’t want to switch now, or use Ubuntu (gross)… What do I do?
Thanks in advance. :slight_smile:
Edit2: I’m going to try doing this manually in /etc/default/grub, the correct way. I don’t know what I was thinking by doing it in the GRUB menu. It’s been a few years since I’ve added params. The GRUB file is at the bottom.
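
One thing worth checking about the parameters quoted above: the kernel documents CPU lists as comma-separated numbers and N-M ranges written with a hyphen, so `rcu_nocbs=0_11` as typed may not be read as the range 0 to 11. The sketch below is a simplified checker for that basic syntax only (the kernel accepts further forms that it does not handle), and `parse_cpulist` is a hypothetical helper name, not a kernel or library function.

```python
import re


def parse_cpulist(spec: str) -> list[int]:
    """Expand a simple kernel-style CPU list such as "0-3,8" into [0, 1, 2, 3, 8].

    Only plain numbers and N-M ranges are handled (a simplification).
    """
    cpus: set[int] = set()
    for part in spec.split(","):
        match = re.fullmatch(r"(\d+)(?:-(\d+))?", part.strip())
        if match is None:
            raise ValueError(f"not a CPU number or N-M range: {part!r}")
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) is not None else first
        if last < first:
            raise ValueError(f"range is reversed: {part!r}")
        cpus.update(range(first, last + 1))
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpulist("0-11"))      # the hyphenated form expands to CPUs 0..11
    try:
        parse_cpulist("0_11")         # the form quoted in the post
    except ValueError as err:
        print("rejected:", err)
```

If the intent was to cover every thread, the range on this 32-thread CPU would presumably be 0-31 rather than 0-11; that is an assumption about the intent, not something the post states.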

Edit: I forgot the images :man_facepalming:


GRUB_DEFAULT=saved
GRUB_TIMEOUT=5
GRUB_TIMEOUT_STYLE=hidden
GRUB_DISTRIBUTOR="Manjaro"
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash apparmor=1 security=apparmor udev.log_priority=3 processor.max_cstate=5 amd_iommu=on rcu_nocbs=0_11"
GRUB_CMDLINE_LINUX=""

# If you want to enable the save default function, uncomment the following
# line, and set GRUB_DEFAULT to saved.
GRUB_SAVEDEFAULT=true

# Preload both GPT and MBR modules so that they are not missed
GRUB_PRELOAD_MODULES="part_gpt part_msdos"

# Uncomment to enable booting from LUKS encrypted devices
#GRUB_ENABLE_CRYPTODISK=y

# Uncomment to use basic console
GRUB_TERMINAL_INPUT=console

# Uncomment to disable graphical terminal
#GRUB_TERMINAL_OUTPUT=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command 'videoinfo'
GRUB_GFXMODE=auto

# Uncomment to allow the kernel use the same resolution used by grub
GRUB_GFXPAYLOAD_LINUX=keep

# Uncomment if you want GRUB to pass to the Linux kernel the old parameter
# format "root=/dev/xxx" instead of "root=/dev/disk/by-uuid/xxx"
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
GRUB_DISABLE_RECOVERY=true

# Uncomment and set to the desired menu colors.  Used by normal and wallpaper
# modes only.  Entries specified as foreground/background.
GRUB_COLOR_NORMAL="light-gray/black"
GRUB_COLOR_HIGHLIGHT="green/black"

# Uncomment one of them for the gfx desired, a image background or a gfxtheme
#GRUB_BACKGROUND="/usr/share/grub/background.png"
GRUB_THEME="/usr/share/grub/themes/manjaro/theme.txt"

# Uncomment to get a beep at GRUB start
#GRUB_INIT_TUNE="480 440 1"

# Uncomment to ensure that the root filesystem is mounted read-only so that
# systemd-fsck can run the check automatically
GRUB_ROOT_FS_RO=true

# Uncomment this option to enable os-prober execution in the grub-mkconfig command
GRUB_DISABLE_OS_PROBER=false
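
A quick way to confirm that the parameters in GRUB_CMDLINE_LINUX_DEFAULT actually reached the running kernel is to compare them with the kernel command line. The sketch below uses the two strings exactly as they appear in this post (the `parameters:` field of the inxi output and the GRUB file above); `missing_kernel_params` is a hypothetical helper name. On a live system the second string would come from reading /proc/cmdline.

```python
import shlex


def missing_kernel_params(wanted: str, running: str) -> list[str]:
    """Return the parameters from `wanted` that do not appear in `running`."""
    running_params = set(shlex.split(running))
    return [param for param in shlex.split(wanted) if param not in running_params]


# Value of GRUB_CMDLINE_LINUX_DEFAULT from the file above.
grub_default = (
    "quiet splash apparmor=1 security=apparmor udev.log_priority=3 "
    "processor.max_cstate=5 amd_iommu=on rcu_nocbs=0_11"
)

# Kernel parameters as reported by inxi at the top of the post.
running = (
    "BOOT_IMAGE=/boot/vmlinuz-6.6-x86_64 "
    "root=UUID=9f38c2ae-21ca-4215-822a-0e478276b220 ro quiet splash "
    "apparmor=1 security=apparmor udev.log_priority=3"
)

if __name__ == "__main__":
    print(missing_kernel_params(grub_default, running))
    # ['processor.max_cstate=5', 'amd_iommu=on', 'rcu_nocbs=0_11']
```

The three added parameters are missing from the inxi output, which may simply mean it was captured before the file was edited. Changes to /etc/default/grub only take effect after the GRUB configuration is regenerated (on Manjaro this is normally done with `sudo update-grub`) and the machine is rebooted.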

The CPU fails under load, and it looks like a hardware problem. So it should be either a defective CPU, bad contact (just reassemble the whole stack: motherboard socket, CPU, thermal compound, heatsink), or a bad overclock (which is what your current remedy addresses, by applying a profile that prevents full speed).

This CPU reaches max boost under full load just fine in Windows 10. I’ve done burn-in stress tests on Windows as well as a full suite of hardware tests over several days. It’s been reseated multiple times. I think calling it a bad overclock is a bit unfair; it’s more of a static underclock if anything. Instead of getting the full range of 3.4 GHz to 4.6 GHz, it’s static at 4.0 GHz. I really don’t think it’s hardware, and I’m not the only one with this problem.

This sounds like a problem that has been reported in several places.

Basically, the voltages are wrong and Windows offsets them, but Linux does not by default.

See this section of the archwiki and the one directly after it:

https://wiki.archlinux.org/title/Ryzen#Random_reboots

(emphasis added by me)

Ah! Thank you! It is still an active kernel bug (at least in LTS)! Thank you for the resource! I will test and come back here for future documentation.

Also seems to be pretty similar to this: Full system freeze on new install after some time - #19 by aFriendlyTurtle

The wiki page you linked points to a post on another forum. There, they talk about how a motherboard vendor recommended increasing RAM voltage by 0.050 V, which is also mentioned in this Manjaro forum post; they also increase CPU core voltage by 0.050 V. I’ll see which solution is best: GRUB param, voltage bump, or your suggested PBO fix.

So I did some more testing. Editing the GRUB file and adding those parameters did not help on its own. I tried increasing core voltage by an offset of +0.02500 V, which gave me a little more stability. I discovered one particular YouTube video that would lock up my system every time. I think it has to do with the codec that was used to make the video (avc1.64002a (299) / opus (251)).

Once I applied the offset, Brave would lock up and give me a SIGBUS (7) error instead of the whole system locking up. I undid the voltage offset, enabled PBO, and used the Curve Optimizer to set a positive offset of 4. Then I tried watching the video again. Upon refreshing the page and starting playback, that specific video seems to make the CPU jump from 55 °C to 60-70 °C.

I’ll come back once more testing is done. Maybe I can lower the curve offset and test between iterations to find the lowest stable value. This could help with temps and power consumption.
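
The "lower the offset and retest" loop described above can be written down as a simple step-down search. This is only a sketch: `lowest_stable_offset` and `is_stable` are hypothetical names, and `is_stable` stands in for the manual step of setting the Curve Optimizer value in the BIOS, rebooting, and running the workload that usually triggers the crash. It also assumes that once a value fails, all lower values fail too.

```python
from typing import Callable


def lowest_stable_offset(start: int, floor: int,
                         is_stable: Callable[[int], bool]) -> int:
    """Step down from a known-stable offset one unit at a time.

    Returns the lowest offset that still passed the stability check,
    stopping at the first failure or at `floor`.
    """
    best = start
    for candidate in range(start - 1, floor - 1, -1):
        if not is_stable(candidate):
            break              # first failure: keep the previous value
        best = candidate
    return best


if __name__ == "__main__":
    # Made-up outcome for illustration: pretend offsets of 2 and above are stable.
    print(lowest_stable_offset(start=4, floor=0, is_stable=lambda off: off >= 2))
```

Stepping one unit at a time is slow but keeps each test close to a known-good setting, which matters when every failed test means a hard reboot.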

Video if anyone wants to test (at your own risk of course): https://www.youtube.com/watch?v=vbaQlG2Ol6Q

You could also try playing with the PBO power limits which will definitely help with temps and power consumption at the cost of a tiny bit of performance.

I have a 5800X, so the actual numbers may be different for you. When I upgraded to it from a 5700G, I was very surprised at how quickly the temps spiked to around 70 just from simple browsing. Setting PPT=120W TDC=80A EDC=120A (defaults 142/95/140) fixed that for only about a 2% performance loss (and then I gained that back and more because I was able to set a -10 all-core offset).
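
To put those limits in perspective, here is the arithmetic as a small sketch. The numbers are the ones quoted above for the 5800X; the dictionary and function names are only illustrative.

```python
# Default and reduced PBO limits as quoted in this post (5800X).
defaults = {"PPT (W)": 142, "TDC (A)": 95, "EDC (A)": 140}
reduced = {"PPT (W)": 120, "TDC (A)": 80, "EDC (A)": 120}


def percent_reduction(default: float, new: float) -> float:
    """Return how far `new` sits below `default`, as a percentage of `default`."""
    return (default - new) / default * 100


if __name__ == "__main__":
    for name, default in defaults.items():
        new = reduced[name]
        print(f"{name}: {default} -> {new} "
              f"({percent_reduction(default, new):.1f}% lower)")
    # PPT (W): 142 -> 120 (15.5% lower)
    # TDC (A): 95 -> 80 (15.8% lower)
    # EDC (A): 140 -> 120 (14.3% lower)
```

So each limit is cut by roughly 14 to 16 percent, against the reported performance loss of about 2%.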

Depending on your motherboard there may also be some presets like ECO_MODE_95W which is close to the manual settings I used iirc.