Random crashes on HP Envy x360 with AMD Ryzen 5 2500U

Following this, as I have a laptop with the Ryzen 7 2700U that does the same.

Hm, it’s different now. The “CPU#x stuck for 22s!” messages are gone (which could be a coincidence), but it still freezes, with different messages:

Mai 02 23:01:01 xenon-m2 CROND[5171]: (root) CMD (run-parts /etc/cron.hourly)
Mai 02 23:01:21 xenon-m2 kernel: pcieport 0000:00:01.7: AER: Multiple Corrected error received: id=0008
Mai 02 23:01:21 xenon-m2 kernel: pcieport 0000:00:01.7: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000f(Transmitter ID)
Mai 02 23:01:21 xenon-m2 kernel: pcieport 0000:00:01.7:   device [1022:15d3] error status/mask=00001000/00006000
Mai 02 23:01:21 xenon-m2 kernel: pcieport 0000:00:01.7:    [12] Replay Timer Timeout  
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768)
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0:   at page 0x000000010c609000 from 27
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768)
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0:   at page 0x000000010c607000 from 27
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768)
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0:   at page 0x000000010c627000 from 27
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768)
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0:   at page 0x000000010c61f000 from 27
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768)
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0:   at page 0x000000010c61d000 from 27
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768)
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0:   at page 0x000000010c601000 from 27
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768)
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0:   at page 0x000000010c60c000 from 27
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768)
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0:   at page 0x000000010c60b000 from 27
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768)
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0:   at page 0x000000010c610000 from 27
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768)
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0:   at page 0x000000010c605000 from 27
Mai 02 23:02:38 xenon-m2 kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Mai 02 23:02:48 xenon-m2 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=4721865, last emitted seq=4721867
Mai 02 23:02:48 xenon-m2 kernel: [drm] GPU recovery disabled.
Mai 02 23:04:53 xenon-m2 kernel: pcieport 0000:00:01.7: AER: Corrected error received: id=0008
Mai 02 23:04:53 xenon-m2 kernel: pcieport 0000:00:01.7: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000f(Transmitter ID)
Mai 02 23:04:53 xenon-m2 kernel: pcieport 0000:00:01.7:   device [1022:15d3] error status/mask=00001000/00006000
Mai 02 23:04:53 xenon-m2 kernel: pcieport 0000:00:01.7:    [12] Replay Timer Timeout  
Mai 02 23:04:53 xenon-m2 kernel: pcieport 0000:00:01.7: AER: Corrected error received: id=0008
Mai 02 23:04:53 xenon-m2 kernel: pcieport 0000:00:01.7: can't find device of ID0008
Mai 02 23:05:03 xenon-m2 kernel: INFO: task amdgpu_cs:0:754 blocked for more than 120 seconds.
Mai 02 23:05:03 xenon-m2 kernel:       Tainted: G         C       4.16.0-rc7-d64547a1cfa8 #1
Mai 02 23:05:03 xenon-m2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 02 23:05:03 xenon-m2 kernel: amdgpu_cs:0     D    0   754    563 0x00000000
Mai 02 23:05:03 xenon-m2 kernel: Call Trace:
Mai 02 23:05:03 xenon-m2 kernel:  ? __schedule+0x299/0x8a0
Mai 02 23:05:03 xenon-m2 kernel:  schedule+0x2f/0x90
Mai 02 23:05:03 xenon-m2 kernel:  schedule_timeout+0x205/0x3b0
Mai 02 23:05:03 xenon-m2 kernel:  ? amdgpu_ttm_alloc_gart+0x8f/0x2b0 [amdgpu]
Mai 02 23:05:03 xenon-m2 kernel:  dma_fence_default_wait+0x1cd/0x270
Mai 02 23:05:03 xenon-m2 kernel:  ? dma_fence_release+0xa0/0xa0
Mai 02 23:05:03 xenon-m2 kernel:  dma_fence_wait_timeout+0x39/0x110
Mai 02 23:05:03 xenon-m2 kernel:  amdgpu_ctx_wait_prev_fence+0x46/0x80 [amdgpu]
Mai 02 23:05:03 xenon-m2 kernel:  amdgpu_cs_ioctl+0x98/0x1a90 [amdgpu]
Mai 02 23:05:03 xenon-m2 kernel:  ? dequeue_entity+0xd9/0x430
Mai 02 23:05:03 xenon-m2 kernel:  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
Mai 02 23:05:03 xenon-m2 kernel:  drm_ioctl_kernel+0x5b/0xb0 [drm]
Mai 02 23:05:03 xenon-m2 kernel:  drm_ioctl+0x2c3/0x360 [drm]
Mai 02 23:05:03 xenon-m2 kernel:  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
Mai 02 23:05:03 xenon-m2 kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Mai 02 23:05:03 xenon-m2 kernel:  do_vfs_ioctl+0xa4/0x630
Mai 02 23:05:03 xenon-m2 kernel:  ? SyS_futex+0x12d/0x180
Mai 02 23:05:03 xenon-m2 kernel:  SyS_ioctl+0x74/0x80
Mai 02 23:05:03 xenon-m2 kernel:  do_syscall_64+0x67/0x120
Mai 02 23:05:03 xenon-m2 kernel:  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Mai 02 23:05:03 xenon-m2 kernel: RIP: 0033:0x7f5181842d87
Mai 02 23:05:03 xenon-m2 kernel: RSP: 002b:00007f51750b9ae8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mai 02 23:05:03 xenon-m2 kernel: RAX: ffffffffffffffda RBX: 00007f51750b9bd8 RCX: 00007f5181842d87
Mai 02 23:05:03 xenon-m2 kernel: RDX: 00007f51750b9b50 RSI: 00000000c0186444 RDI: 0000000000000010
Mai 02 23:05:03 xenon-m2 kernel: RBP: 00007f51750b9b50 R08: 00007f51750b9c00 R09: 00007f51750b9b30
Mai 02 23:05:03 xenon-m2 kernel: R10: 00007f51750b9c00 R11: 0000000000000246 R12: 00000000c0186444
Mai 02 23:05:03 xenon-m2 kernel: R13: 0000000000000010 R14: 0000558aca6dd188 R15: 0000000000000000

Things to note:
The on-screen clock was frozen at 23:02 IIRC, so the kernel kept producing logs after the freeze.
I don’t know if it’s a coincidence (2 out of 2 samples), but the freeze came shortly after (root) CMD (run-parts /etc/cron.hourly).
Different logs but the same symptoms (cursor still moves, audio still plays, …).
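
For anyone reading the pcieport lines above: the "error status/mask" pair is a bit field, and the "[12] Replay Timer Timeout" line is simply bit 12 of the status value. Here is a minimal Python sketch that decodes such a pair, assuming the standard PCIe AER Correctable Error Status bit layout. The function name decode_aer_corrected is made up for illustration, and only bit 12 is confirmed by the log itself.

```python
# Bit positions of the PCIe AER Correctable Error Status register.
# Assumption: standard PCIe layout; only bit 12 is confirmed by the log above.
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_aer_corrected(status_mask: str) -> dict:
    """Decode a 'status/mask' string such as '00001000/00006000'."""
    status_hex, mask_hex = status_mask.split("/")
    status, mask = int(status_hex, 16), int(mask_hex, 16)

    def names(value: int) -> list:
        # Collect the names of all known bits that are set in `value`.
        return [name for bit, name in sorted(CORRECTABLE_BITS.items())
                if value & (1 << bit)]

    return {"reported": names(status), "masked": names(mask)}


# Value copied from the pcieport line in the log above.
print(decode_aer_corrected("00001000/00006000"))
```

So the port reports a replay timer timeout on the link, while the two masked bits are error types the port is configured not to report. All of this is "Corrected" severity, meaning the link layer retried successfully.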

This is also what I am getting in my log. And there is a bug open at Ubuntu for it:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1758545

They mention it could be related to the Wi-Fi chip, so I’m going to run a test today: disable Wi-Fi and then try to make it freeze.

Got another crash. This time the last pcieport error was ~30 min before the freeze and the last cron.hourly run ~20 min before it, so both seem unrelated. Nonetheless I’ll try pcie_aspm=off, as I saw it mentioned somewhere in connection with the PCIe errors.
But Wi-Fi could be worth a shot too. This time I got wlp3s0: WPA: Group rekeying completed with f4:f2:6d:63:de:25 [GTK=CCMP] 2 min after the freeze.
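
When testing a boot parameter like this, it helps to confirm it really reached the kernel after the reboot. A small Python sketch of that check follows; kernel_param is a hypothetical helper, and the sample string is made up. On a live system the string would come from /proc/cmdline, which holds the command line of the running kernel.

```python
import shlex


def kernel_param(cmdline: str, name: str):
    """Look up `name` on a kernel command line.

    Returns the value string for `name=value`, True for a bare flag
    such as `quiet`, and None if the parameter is absent.
    """
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        if key == name:
            return value if sep else True
    return None


# Made-up sample; on a real machine use open("/proc/cmdline").read() instead.
sample = "root=/dev/sda2 rw quiet pcie_aspm=off"
print("pcie_aspm:", kernel_param(sample, "pcie_aspm"))
print("idle:", kernel_param(sample, "idle"))
```

If the lookup returns None after a reboot, the bootloader configuration was probably not regenerated.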

Let me know how it works out. I can’t test for the next 6 hours or so.

I won’t have time at the weekend, but it seems promising so far: no pcieport errors (I got them every ~10 min while watching YouTube without pcie_aspm=off) and no freeze so far.

EDIT: Nope, the PCIe errors are gone but it still freezes. I guess I have to wait for the kernel driver fix.

Mai 04 12:33:06 xenon-m2 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, last signaled seq=2339499, last emitted seq=2339502
Mai 04 12:33:06 xenon-m2 kernel: [drm] GPU recovery disabled.
Mai 04 12:36:16 xenon-m2 kernel: INFO: task amdgpu_cs:0:785 blocked for more than 120 seconds.
Mai 04 12:36:16 xenon-m2 kernel:       Tainted: G         C O     4.16.0-rc7-d64547a1cfa8 #1
Mai 04 12:36:16 xenon-m2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 04 12:36:16 xenon-m2 kernel: amdgpu_cs:0     D    0   785    589 0x00000000
Mai 04 12:36:16 xenon-m2 kernel: Call Trace:
Mai 04 12:36:16 xenon-m2 kernel:  ? __schedule+0x299/0x8a0
Mai 04 12:36:16 xenon-m2 kernel:  schedule+0x2f/0x90
Mai 04 12:36:16 xenon-m2 kernel:  schedule_timeout+0x205/0x3b0
Mai 04 12:36:16 xenon-m2 kernel:  ? amdgpu_job_alloc+0x37/0xb0 [amdgpu]
Mai 04 12:36:16 xenon-m2 kernel:  ? amdgpu_ttm_alloc_gart+0x8f/0x2b0 [amdgpu]
Mai 04 12:36:16 xenon-m2 kernel:  dma_fence_default_wait+0x1cd/0x270
Mai 04 12:36:16 xenon-m2 kernel:  ? dma_fence_release+0xa0/0xa0
Mai 04 12:36:16 xenon-m2 kernel:  dma_fence_wait_timeout+0x39/0x110
Mai 04 12:36:16 xenon-m2 kernel:  amdgpu_ctx_wait_prev_fence+0x46/0x80 [amdgpu]
Mai 04 12:36:16 xenon-m2 kernel:  amdgpu_cs_ioctl+0x98/0x1a90 [amdgpu]
Mai 04 12:36:16 xenon-m2 kernel:  ? dequeue_entity+0xd9/0x430
Mai 04 12:36:16 xenon-m2 kernel:  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
Mai 04 12:36:16 xenon-m2 kernel:  drm_ioctl_kernel+0x5b/0xb0 [drm]
Mai 04 12:36:16 xenon-m2 kernel:  drm_ioctl+0x2c3/0x360 [drm]
Mai 04 12:36:16 xenon-m2 kernel:  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
Mai 04 12:36:16 xenon-m2 kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Mai 04 12:36:16 xenon-m2 kernel:  do_vfs_ioctl+0xa4/0x630
Mai 04 12:36:16 xenon-m2 kernel:  ? SyS_futex+0x12d/0x180
Mai 04 12:36:16 xenon-m2 kernel:  SyS_ioctl+0x74/0x80
Mai 04 12:36:16 xenon-m2 kernel:  do_syscall_64+0x67/0x120
Mai 04 12:36:16 xenon-m2 kernel:  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Mai 04 12:36:16 xenon-m2 kernel: RIP: 0033:0x7faee3507d87
Mai 04 12:36:16 xenon-m2 kernel: RSP: 002b:00007faed6d7eae8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mai 04 12:36:16 xenon-m2 kernel: RAX: ffffffffffffffda RBX: 00007faed6d7ebd8 RCX: 00007faee3507d87
Mai 04 12:36:16 xenon-m2 kernel: RDX: 00007faed6d7eb50 RSI: 00000000c0186444 RDI: 0000000000000010
Mai 04 12:36:16 xenon-m2 kernel: RBP: 00007faed6d7eb50 R08: 00007faed6d7ec00 R09: 00007faed6d7eb30
Mai 04 12:36:16 xenon-m2 kernel: R10: 00007faed6d7ec00 R11: 0000000000000246 R12: 00000000c0186444
Mai 04 12:36:16 xenon-m2 kernel: R13: 0000000000000010 R14: 00005591e62a50c8 R15: 0000000000000000
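
The two freezes in this thread hit different rings (gfx on the 2nd, sdma0 on the 4th), which is easy to miss when scrolling through call traces. A minimal Python sketch for pulling the ring name and fence sequence numbers out of saved journal lines is below. The helper name ring_timeouts is made up, and the regular expression only assumes the message format seen in the two log lines quoted in this thread.

```python
import re

# Matches the amdgpu_job_timedout message format seen in this thread.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"last signaled seq=(?P<signaled>\d+), last emitted seq=(?P<emitted>\d+)"
)


def ring_timeouts(lines):
    """Yield (ring, signaled, emitted, pending) for each ring timeout line."""
    for line in lines:
        m = RING_TIMEOUT.search(line)
        if m:
            signaled, emitted = int(m["signaled"]), int(m["emitted"])
            yield m["ring"], signaled, emitted, emitted - signaled


# The two timeout lines copied from the logs in this thread.
log = [
    "Mai 02 23:02:48 xenon-m2 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=4721865, last emitted seq=4721867",
    "Mai 04 12:33:06 xenon-m2 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, last signaled seq=2339499, last emitted seq=2339502",
]

for ring, signaled, emitted, pending in ring_timeouts(log):
    print(f"{ring}: {pending} fence(s) emitted but not signaled")
```

The difference between the two sequence numbers is only a rough indicator of how much work was stuck on the ring when the timeout fired.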

I recently got an Acer Aspire 3, powered by the same APU, and it kept freezing as well. After meddling with it for 2 days, I think the old UEFI or EC firmware is at fault here. I’m not sure what UEFI HP uses, but my Acer was running very old Insyde firmware, from June 2017 I think (v1.02 for anyone curious). After updating to v1.08 from February this year, the PCIe errors are gone and so far it’s been stable for about 5 hours, while before it wouldn’t last 40 minutes. Unfortunately the update required a Windows install (which crashed 4 times during first boot, so I guess this issue is not Linux-specific).

Kernel 4.17.5-ck-zen from repo-ck with Mesa 18.1 on Arch Linux.

Hm, it still crashes for me. The UEFI is the latest from HP: F.17 Rev.A, 11 Apr. 2018. According to CPU-Z: F.17 - AMD AGESA RavenPI-FP5-AM4 1.0.0.0.
Doesn’t crash on Windows.

It crashes on 4.16 through 4.18, but the message in dmesg is different. Mesa is 18.1.3-1.
Random crashes are slightly rarer on 4.18 but still there. Closing the lid is an (almost) guaranteed crash.

Can you please shed some light on:
Which UEFI/AGESA is Acer/you using?
Any specific kernel parameter?
What about c6-state?
Stable with standard kernel, too? Or only the ck-zen patches?
Standby/Hibernate stable, too?

This situation is really annoying. I’m forced to use Windows which is… suboptimal

It’s stable for me except for hibernation (I’ll go into detail below), with these package versions (I’ve updated since my last post):

$ pacman -Q mesa linux linux-ck-zen linux-firmware systemd
mesa 18.1.4-1
linux 4.17.6-1
linux-ck-zen 4.17.8-1
linux-firmware 20180606.d114732-1
systemd 239.0-2

Which UEFI/AGESA is Acer/you using?

According to dmidecode

# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 3.1.1 present.
Table at 0x8C4E4000.

Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
        Vendor: Insyde Corp.
        Version: V1.08
        Release Date: 05/23/2018
        Address: 0xE0000
        Runtime Size: 128 kB
        ROM Size: 4608 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Japanese floppy for NEC 9800 1.2 MB is supported (int 13h)
                Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                8042 keyboard services are supported (int 9h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 1.8
        Firmware Revision: 1.8

Any specific kernel parameter?

Only ‘resume=UUID=…’ and ‘quiet’

What about c6-state?

I didn’t change anything related to c-states.

Stable with standard kernel, too? Or only the ck-zen patches?

Both stock Arch and -ck kernels run fine.

Standby/Hibernate stable, too?

Standby works fine without any configuration. Hibernation makes the screen blink once, then the laptop shuts down as if it were hibernating properly, but on resume the system hangs right after the screen comes back up. Interestingly, I’m now able to do a clean reboot using the SysRq key sequence.

I would wait for kernel 4.18 and mesa 18.2, as they seem to contain some patches targeting Raven Ridge and then check again.

Just thought I’d finally share my experience in configuring my Ryzen 7 2700U.

Here’s the preliminary information:

System:    Host: HOSTNAME Kernel: 4.18.5-1-MANJARO x86_64 bits: 64 compiler: gcc v: 8.2.0 
           Desktop: MATE 1.20.1 info: mate-panel wm: marco 1.20.2 dm: lightdm 1.26.0 Distro: Manjaro Linux 
Machine:   Type: Laptop System: Acer product: Aspire A315-41 v: V1.01 serial:  
           Mobo: RR model: Metapod_RR v: V1.01 serial:  UEFI: Insyde v: 1.01 date: 01/19/2018 
Battery:   ID-1: BAT1 charge: 4.2 Wh condition: 33.6/37.0 Wh (91%) volts: 7.2/7.7 model: COMPAL PABAS0241231 
           type: Li-ion serial: 41167 status: Discharging 
CPU:       Topology: Quad Core model: AMD Ryzen 7 2700U with Radeon Vega Mobile Gfx bits: 64 type: MT MCP 
           arch: Zen L2 cache: 2048 KiB 
           flags: lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 35149 
           Speed: 1526 MHz max: 1600 MHz Core speeds (MHz): 1: 1480 2: 1398 3: 1513 4: 1414 5: 1440 6: 1459 
           7: 1414 8: 1388 
Graphics:  Device-1: AMD Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] 
           vendor: Acer Incorporated ALI driver: amdgpu v: kernel bus ID: 03:00.0 chip ID: 1002:15dd 
           Display: x11 server: X.Org 1.20.1 driver: amdgpu,ati unloaded: modesetting alternate: fbdev,vesa 
           compositor: marco resolution: 1920x1080~60Hz 
           OpenGL: renderer: AMD RAVEN (DRM 3.26.0 4.18.5-1-MANJARO LLVM 6.0.1) v: 4.5 Mesa 18.1.7 
           compat-v: 3.1 direct render: Yes 
Audio:     Device-1: Advanced Micro Devices [AMD/ATI] vendor: Acer Incorporated ALI driver: snd_hda_intel 
           v: kernel bus ID: 03:00.1 chip ID: 1002:15de 
           Device-2: Advanced Micro Devices [AMD] vendor: Acer Incorporated ALI driver: snd_hda_intel v: kernel 
           bus ID: 03:00.6 chip ID: 1022:15e3 
           Sound Server: ALSA v: k4.18.5-1-MANJARO 
Network:   Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet driver: r8169 v: 2.3LK-NAPI 
           port: 2000 bus ID: 01:00.1 chip ID: 10ec:8168 
           IF: enp1s0f1 state: down mac: ##:##:##:##:##:## 
           Device-2: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter driver: ath10k_pci v: kernel 
           bus ID: 02:00 chip ID: 168c:0042 
           IF: wlp2s0 state: up mac: ##:##:##:##:##:## 
Drives:    Local Storage: total: 238.47 GiB used: 39.63 GiB (16.6%) 
           ID-1: /dev/sda vendor: SK Hynix model: HFS256G39TND-N210A size: 238.47 GiB speed: 6.0 Gb/s 
           serial: ########### rev: 1P10 scheme: GPT 
Partition: ID-1: / size: 223.39 GiB used: 39.63 GiB (17.7%) fs: ext4 dev: /dev/sda2 
           ID-2: swap-1 size: 10.01 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/sda4 
Sensors:   System Temperatures: cpu: 41.1 C mobo: N/A gpu: amdgpu temp: 41 C 
           Fan Speeds (RPM): N/A 
Info:      Processes: 219 Uptime: 17m Memory: 6.77 GiB used: 1.06 GiB (15.7%) Init: systemd v: 239 Compilers: 
           gcc: 8.2.0 clang: 6.0.1 Shell: zsh v: 5.5.1 running in: mate-terminal inxi: 3.0.21

I’ve been using this as my primary laptop since May, and it has been stable enough to game natively, through Wine, and now through Proton. So far I’ve played Bioshock Infinite on maximum settings, Dishonored through Wine with dxvk on maximum settings, Throne of Darkness through Wine, Starcraft 1 Remastered through Wine, Alan Wake through Proton on maximum settings, and Skyrim: Special Edition through Proton on low settings (the character speech issue was present and is documented on GitHub). Everything mentioned ran at acceptable framerates (30 fps and above). Retroarch with Vulkan shaders has worked nicely for my SNES ROM archive, with Compton providing a tear-free experience on Xfce and LXQt.

Lately (after a Vulkan update, I believe), the system consistently becomes unresponsive just from opening Retroarch, because I configured its video driver to be Vulkan. The gl driver doesn’t cause the freeze. By unresponsive, I mean the screen stops and often turns black, all input becomes ineffective except the display on/off key (F6), and holding down the power button is necessary. Update: this Vulkan issue went away after another update. Remember to update your Retroarch Slang shaders after every Vulkan update!

My default web browser is Pale Moon, and I have never frozen my laptop using the web browser. I haven’t installed Flash Player (and probably won’t).

I too encountered this in my journal logs:

pcieport 0000:00:01.7: device [1022:15d3] error status 

I still haven’t found a solution for ^this^ yet, but it brought me here.

Since others seem to be having difficulties getting their Raven Ridge APUs to behave, allow me to share my kernel boot parameters:

GRUB_CMDLINE_LINUX_DEFAULT="quiet ivrs_ioapic[4]=00:14.0 ivrs_ioapic[5]=00:00.2 scsi_mod.use_blk_mq=1 resume=UUID=#################"

The ivrs_ioapic entries are necessary to boot in most cases, especially on older (e.g., 4.13-4.14) kernel series. The numbering of these devices on your machine will almost certainly be different; the need for them stems from improper BIOS implementations. These are the errors the boot parameters address:

Sep 02 16:07:54 HOSTNAME kernel: [Firmware Bug]: AMD-Vi: IOAPIC[4] not in IVRS table
Sep 02 16:07:54 HOSTNAME kernel: [Firmware Bug]: AMD-Vi: IOAPIC[5] not in IVRS table
Sep 02 16:07:54 HOSTNAME kernel: [Firmware Bug]: AMD-Vi: No southbridge IOAPIC found
Sep 02 16:07:54 HOSTNAME kernel: AMD-Vi: Disabling interrupt remapping

Since BIOS updates rarely come, if ever, it becomes necessary to enumerate these devices manually. I’d assume that if one were to disable IOMMU in the BIOS, these parameters wouldn’t be necessary. Since some AMD hardware features require IOMMU to be enabled, I opted to add them to my /etc/default/grub (in my case one device is the SMBus controller, so it happened to be extremely important). The numbering has to be discovered in the logs (e.g., journalctl -b -1 or dmesg). It’s been a while since I read the solution. I’ll update this with a link if I can find it. Update: Here’s the meat of the matter
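
To make the "discover the numbering in the logs" step concrete, here is a rough Python sketch that scans saved dmesg or journalctl lines for the firmware-bug message and prints one ivrs_ioapic template per missing IOAPIC ID. The helper names are made up for illustration. Note that the log only gives the IOAPIC IDs; the bus:dev.fn address on the right-hand side is not in these messages and still has to be worked out for your own machine, so the sketch leaves a placeholder there.

```python
import re

# Message format as it appears in the kernel log lines quoted above.
IOAPIC_MISSING = re.compile(r"AMD-Vi: IOAPIC\[(\d+)\] not in IVRS table")


def missing_ioapic_ids(lines):
    """Return the sorted, de-duplicated IOAPIC IDs reported as missing."""
    ids = set()
    for line in lines:
        m = IOAPIC_MISSING.search(line)
        if m:
            ids.add(int(m.group(1)))
    return sorted(ids)


def ivrs_templates(ids, placeholder="<bus:dev.fn>"):
    """Build parameter templates; the device address must be filled in by hand."""
    return [f"ivrs_ioapic[{i}]={placeholder}" for i in ids]


# Lines copied from the log excerpt above.
log = [
    "Sep 02 16:07:54 HOSTNAME kernel: [Firmware Bug]: AMD-Vi: IOAPIC[4] not in IVRS table",
    "Sep 02 16:07:54 HOSTNAME kernel: [Firmware Bug]: AMD-Vi: IOAPIC[5] not in IVRS table",
    "Sep 02 16:07:54 HOSTNAME kernel: [Firmware Bug]: AMD-Vi: No southbridge IOAPIC found",
    "Sep 02 16:07:54 HOSTNAME kernel: AMD-Vi: Disabling interrupt remapping",
]

for template in ivrs_templates(missing_ioapic_ids(log)):
    print(template)
```

On my machine the placeholders end up as 00:14.0 and 00:00.2, as in the GRUB line above, but do not copy those values blindly.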

The blk_mq parameter is unnecessary; most users probably shouldn’t add it unless they’ve read the ArchWiki page.

Initially, I ran Manjaro unstable until July; unstable wasn’t necessary for acceptable stability after, but I’d still recommend it for performance/better support. The wifi interface can become unresponsive, but works 85%-90% of the time. Bluetooth works great with the Steam Controller (on KDE 5 (not 4) there are unresolvable issues with Alt+Tab/SteamBtn+Start). For those wondering, hibernate and resume works for me with one drawback: the i2c-designware touchpad fails to power on after hibernate and a reboot is required to restore functionality. Suspend/resume work flawlessly 95% of the time. If anyone wants or needs more information, I’ll supply it.

I hope some of this information proves helpful and may even foster a Ryzen Mobile GNU/Linux community. Here’s my contribution!

Hi,
nice post, thanks for sharing!
Please try (if you haven’t) to add the kernel boot parameter: idle=nomwait to see if it solves those locks.
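
In case it helps anyone scripting this, below is a minimal Python sketch of adding a parameter to the GRUB_CMDLINE_LINUX_DEFAULT line without duplicating it. The function name add_kernel_param is made up, and the sketch only transforms a single line of text; it does not touch /etc/default/grub itself.

```python
import re


def add_kernel_param(grub_line: str, param: str) -> str:
    """Append `param` to a GRUB_CMDLINE_LINUX_DEFAULT="..." line if missing."""
    m = re.fullmatch(r'(GRUB_CMDLINE_LINUX_DEFAULT=)"(.*)"', grub_line.strip())
    if m is None:
        raise ValueError("not a GRUB_CMDLINE_LINUX_DEFAULT line")
    prefix, value = m.groups()
    params = value.split()
    if param not in params:
        # Only append when the exact parameter is not already present.
        params.append(param)
    return prefix + '"' + " ".join(params) + '"'


line = 'GRUB_CMDLINE_LINUX_DEFAULT="quiet"'
line = add_kernel_param(line, "idle=nomwait")
line = add_kernel_param(line, "idle=nomwait")  # second call changes nothing
print(line)
```

After editing /etc/default/grub, the GRUB configuration still has to be regenerated (on most setups with grub-mkconfig -o /boot/grub/grub.cfg) before the parameter takes effect on the next boot.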

I started a general Ryzen problems and fixes tutorial thread:

feel free to post there or try the other options to see if it helps!

I reinstalled Windows 10 to update the Insyde UEFI from the factory version (v1.01) to 1.09. Unfortunately, Acer only provides BIOS updates for Windows 10, and this was the only way I felt safe updating. After a fresh Manjaro Architect installation and some recent Vulkan, Mesa, and Linux updates, I no longer experience the system freezes in Retroarch using the Vulkan video driver. Remember to update the Retroarch Slang shaders after every Vulkan or Mesa update.

I didn’t want to mess with C-states and risk degrading power efficiency (I’m running an Acer Aspire 3 laptop with a Ryzen 7 2700U and don’t want to waste any power). For this reason, I avoided tampering with the polling idle loop.

TPM should be disabled unless you are concerned about someone physically stealing your hardware. Most distributions have stability and suspend/resume issues with TPM enabled. TPM being enabled works best on Fedora, but even that isn’t without the semi-frequent lockups/freezing. Often with TPM enabled, my machine would fail to boot regardless of the distribution chosen. Ditto for Secure Boot unless you’re certain the distribution of choice supports it (e.g., Ubuntu, Fedora, SUSE).

When I first started testing this laptop, the only GNU/Linux distribution to successfully boot without modifying kernel boot parameters was PCLinuxOS with Linux 4.14. PCLinuxOS recently removed Steam from the repositories because others had issues with it, so it was soon realized PCLinuxOS was unsuitable for my purposes, namely gaming.

I do not have permission to post in the other forum section because I’m new here.

Here’s my current system information:

System:
  Host: HOSTNAME Kernel: 4.19.0-1-MANJARO x86_64 bits: 64 compiler: gcc 
  v: 8.2.1 Desktop: Xfce 4.13.2git-UNKNOWN tk: Gtk 3.22.30 info: xfce4-panel 
  wm: xfwm4 dm: lightdm 1.26.0 Distro: Manjaro Linux 
Machine:
  Type: Laptop System: Acer product: Aspire A315-41 v: V1.09 
  serial:  
  Mobo: RR model: Metapod_RR v: V1.09 serial:  UEFI: Insyde 
  v: 1.09 date: 07/27/2018 
Battery:
  ID-1: BAT1 charge: 11.9 Wh condition: 36.3/37.0 Wh (98%) volts: 7.4/7.7 
  model: PANASONIC 41,50,31,36,4D,35,4A type: Li-ion serial: 0006 
  status: Discharging 
CPU:
  Topology: Quad Core model: AMD Ryzen 7 2700U with Radeon Vega Mobile Gfx 
  bits: 64 type: MT MCP arch: Zen L2 cache: 2048 KiB 
  flags: lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm 
  bogomips: 35144 
  Speed: 1366 MHz max: 1600 MHz Core speeds (MHz): 1: 1437 2: 1479 3: 1369 
  4: 1516 5: 1368 6: 1368 7: 1368 8: 1347 
Graphics:
  Device-1: AMD Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] 
  vendor: Acer Incorporated ALI driver: amdgpu v: kernel bus ID: 03:00.0 
  chip ID: 1002:15dd 
  Display: x11 server: X.Org 1.20.1 driver: amdgpu,ati unloaded: modesetting 
  alternate: fbdev,vesa resolution: 1920x1080~60Hz 
  OpenGL: renderer: AMD RAVEN (DRM 3.27.0 4.19.0-1-MANJARO LLVM 6.0.1) 
  v: 4.5 Mesa 18.1.8 compat-v: 3.1 direct render: Yes 
Audio:
  Device-1: AMD vendor: Acer Incorporated ALI driver: snd_hda_intel 
  v: kernel bus ID: 03:00.1 chip ID: 1002:15de 
  Device-2: AMD vendor: Acer Incorporated ALI driver: snd_hda_intel 
  v: kernel bus ID: 03:00.6 chip ID: 1022:15e3 
  Sound Server: ALSA v: k4.19.0-1-MANJARO 
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet 
  driver: r8169 v: kernel port: 2000 bus ID: 01:00.1 chip ID: 10ec:8168 
  IF: enp1s0f1 state: down mac: ##:##:##:##:##:##
  Device-2: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter 
  driver: ath10k_pci v: kernel bus ID: 02:00 chip ID: 168c:0042 
  IF: wlp2s0 state: up mac: ##:##:##:##:##:## 
Drives:
  Local Storage: total: 238.47 GiB used: 9.73 GiB (4.1%) 
  ID-1: /dev/sda vendor: SK Hynix model: HFS256G39TND-N210A size: 238.47 GiB 
  speed: 6.0 Gb/s serial: ES7CN44721150C75U rev: 1P10 scheme: GPT 
Partition:
  ID-1: / size: 224.40 GiB used: 9.73 GiB (4.3%) fs: ext4 dev: /dev/sda3 
  ID-2: swap-1 size: 9.37 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/sda4 
Sensors:
  System Temperatures: cpu: 40.9 C mobo: N/A gpu: amdgpu temp: 40 C 
  Fan Speeds (RPM): N/A 
Info:
  Processes: 262 Uptime: 1d 2h 33m Memory: 6.77 GiB used: 1.60 GiB (23.6%) 
  Init: systemd v: 239 Compilers: gcc: 8.2.1 clang: 6.0.1 Shell: zsh 
  v: 5.5.1 running in: xfce4-terminal inxi: 3.0.21 

Has anything changed? Is the Ryzen 5 2500U stable with Manjaro with latest kernel?

I’m on 4.19 with the kernel parameter idle=nomwait and have had 1 freeze within the last 2 or 3 months, so it seems nearly stable in regular usage. Sadly, going to sleep by closing the lid is a guaranteed crash, at least on my machine/configuration.

Edit:
Oh, and the touchscreen doesn’t work. There is a patch floating around somewhere on the kernel Bugzilla, but that means compiling your own kernel.

That has nothing to do with the processor or graphics chip. It is solely related to the touch panel used in your specific model.

Yeah, I know, but I mentioned it anyway since this thread is specifically about that model.

Fortunately, I’m used to turning the laptop off instead of putting it to sleep, so it won’t be an issue for me.

Thanks for the idle=nomwait tip; I’ll try it when I get the laptop.

By the way, what were you doing when the laptop froze? Something resource intensive?

Nothing in particular. It just froze at random, unrelated to workload. It was so unspectacular that I can’t even remember what I did on the laptop that day.

Oh okay, thanks for the information!
