DE froze with graphic glitches... lots of kernel, drm, and amdgpu entries in journal

I stumbled across this thread [PATCH 5.15 000/917] 5.15.3-rc1 review - Greg Kroah-Hartman and found one interesting thing… assuming encoders are tied in with hardware acceleration…

… but even if they are unrelated, I think all the amdgpu and drm related fixes just in that one 5.15.3-rc1 list sort of prove that the body of kernel work is always in motion, and things are always getting better.

Now I would love to make sure my (and @Zesko 's) issue is known and is being worked on… but I’m a little green, and don’t fully understand everything I read in the kernel notes (or here sometimes) and how it might relate to my issues. For example, is it really a problem when some “depends” are absent from one kernel version to another? Or was it planned?

If I can take the information I posted here and repost it upstream… where would I do that? I figure in the worst case scenario, someone might close it as a duplicate, and I’m okay with that. Or maybe the worst case scenario would be them telling me to redirect the issue to VLC?

But there is also a part of me that thinks that with the new Test build released that stable may be seeing an update soon… and a 5.15.5+ kernel will land to retest the script and VLC with hardware acceleration.

In fact, what I might do in preparation for that is to try to duplicate the problem first. Twice I have been able to play my 91 minute video playlist without issue using VLC with hardware acceleration disabled. So, I think I’ll switch that setting back to auto and see if I can trigger the corruption and errors at least one more time.

If I can’t duplicate the issue… I’m not sure what the next steps are. But if I can duplicate it, then I have a definite test case for all future kernels past 5.15.2… and I think upstream will take anything I post more seriously if I can provide repeatable steps.
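For a repeatable test case, it may help to capture the same kind of journal lines after every attempt. This is only a minimal sketch, assuming the relevant messages mention `amdgpu` or `drm` as the errors in this thread do; the function name `gpu_log_lines` is made up for illustration.

```shell
# Keep only log lines that mention amdgpu or drm (case-insensitive).
# Reads log text on stdin, so it works with journalctl or a saved log file.
gpu_log_lines() {
    awk 'tolower($0) ~ /amdgpu|drm/'
}

# Example: kernel messages from the current boot, tagged with the running kernel.
# Guarded so the sketch does nothing on systems without journalctl.
if command -v journalctl >/dev/null 2>&1; then
    echo "kernel: $(uname -r)"
    # An empty or unreadable journal is not treated as an error here.
    journalctl -k -b --no-pager 2>/dev/null | gpu_log_lines || true
fi
```

Saving that output next to the kernel version and the player settings used gives a small record to attach to an upstream report.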

EDIT: And I guess I need to remember it was mentioned the “fix” was in kernel 5.16… so I might be waiting a while to run the final test case :wink:

I do not know either. A definite answer would need an investigation. I am only suggesting a comparison.

We need to know what the bug is related to. If you boot a 5.15 kernel and use another player instead of VLC (Manjaro - Branch Compare or another), do you still see the bug?

Before filing a bug report, it is much better to be on the latest possible version within your kernel series. For 5.15 that is currently 5.15.6 (https://www.kernel.org/), which is in the unstable and testing branches only (Manjaro - Branch Compare).

If it is VLC related, then you will be asked to test a nightly build (git version) first, or even VLC 4, which had not been released as of a few days ago, and to reset your settings (Report bugs - VideoLAN Wiki):

Bug reports on older versions of VLC are likely to be ignored, because changes in newer versions of VLC may have already fixed the issue.
Before you file a new bug, please try a preview development build of our next version on our nightly build website. The bug may already be fixed in those builds.
Old preferences and/or incorrect settings are common causes of problems.

okay, I was trying to learn more at Linux Kernel Module Management 101 - Linux.com and How to load or unload a Linux kernel module | Opensource.com … and I think I got a little bit smarter; enough to stumble on a new trick, which then revealed another head scratcher.

What I think I learned…

  1. amdgpu is actually not inside the kernel, but is loaded as a kernel module (driver), which for my current 5.15.2 kernel is in the path…
$ find /lib/modules/$(uname -r) -type f | grep amdgpu
/lib/modules/5.15.2-2-MANJARO/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.xz
  2. But amdgpu has some dependency modules as well that get loaded, as found by the script, and seconded by…
$ lsmod | grep amdgpu
amdgpu               8613888  59
gpu_sched              53248  1 amdgpu
drm_ttm_helper         16384  1 amdgpu
ttm                    86016  2 amdgpu,drm_ttm_helper
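The `lsmod` output above can also be read programmatically: the last column ("Used by") names the loaded modules that use each module. A small sketch (the function name `loaded_deps_of` is hypothetical) that prints the loaded modules a given module is currently using:

```shell
# Print the names of loaded modules whose "Used by" column contains the target.
# Reads lsmod-style text on stdin. Usage: lsmod | loaded_deps_of amdgpu
loaded_deps_of() {
    awk -v t="$1" '
        # Skip the header row and modules with an empty "Used by" column.
        $1 != "Module" && NF >= 4 {
            n = split($4, users, ",")
            for (i = 1; i <= n; i++) {
                if (users[i] == t) { print $1; break }
            }
        }
    '
}

# Guarded so the sketch does nothing where lsmod is unavailable.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | loaded_deps_of amdgpu || true
fi
```

Note that `lsmod` only reports loadable modules that are currently loaded; anything compiled into the kernel image itself will not show up in this list.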

And the trick I think I learned was how to find other installed module dependencies/differences without having to boot the other kernels…

  1. So, taking the find command above, I thought I’d try removing the running kernel’s uname part of the path and instead grep for one of the “missing” modules… drm.ko (ko = kernel object)… which basically confirmed it was absent (even on the drive) for the 5.15 kernel.
$ find /lib/modules/ -type f | grep /drm.ko
/lib/modules/5.10.79-1-MANJARO/kernel/drivers/gpu/drm/drm.ko.xz
/lib/modules/5.14.18-1-MANJARO/kernel/drivers/gpu/drm/drm.ko.xz
/lib/modules/5.13.19-2-MANJARO/kernel/drivers/gpu/drm/drm.ko.xz
/lib/modules/5.14.10-1-MANJARO/kernel/drivers/gpu/drm/drm.ko.xz
  2. While thinking about how I would get the entire amdgpu depends list for the other kernels, and feeling a bit more comfortable with the paths and extensions/file-types, I recalled the “ugly” $ cat /lib/modules/$(uname -r)/modules.dep | grep amdgpu command I’d found earlier, and built a command to find all the modules.dep files…
$ find /lib/modules -name "modules.dep"
/lib/modules/5.10.79-1-MANJARO/modules.dep
/lib/modules/5.14.18-1-MANJARO/modules.dep
/lib/modules/5.15.2-2-MANJARO/modules.dep
/lib/modules/5.13.19-2-MANJARO/modules.dep
/lib/modules/5.14.10-1-MANJARO/modules.dep
  3. Now that I had all the paths, I could check for the amdgpu module dependencies for … say kernel 5.14 (which happened to pull the same list as the 5.13 file)…
$ cat /lib/modules/5.14.18-1-MANJARO/modules.dep | grep amdgpu
kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.xz: kernel/drivers/gpu/drm/scheduler/gpu-sched.ko.xz kernel/drivers/i2c/algos/i2c-algo-bit.ko.xz kernel/drivers/gpu/drm/drm_ttm_helper.ko.xz kernel/drivers/gpu/drm/ttm/ttm.ko.xz kernel/drivers/gpu/drm/drm_kms_helper.ko.xz kernel/drivers/media/cec/core/cec.ko.xz kernel/drivers/gpu/drm/drm.ko.xz kernel/drivers/char/agp/agpgart.ko.xz kernel/drivers/video/fbdev/core/syscopyarea.ko.xz kernel/drivers/video/fbdev/core/sysfillrect.ko.xz kernel/drivers/video/fbdev/core/sysimgblt.ko.xz kernel/drivers/video/fbdev/core/fb_sys_fops.ko.xz

But the results from this file (5.13 and 5.14) introduce another head scratcher, as it lists a few more depends than the script found… I’ll apply tr to the command to make it easier to read…

$ cat /lib/modules/5.14.18-1-MANJARO/modules.dep | grep amdgpu | tr ' ' '\012'
kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.xz:
kernel/drivers/gpu/drm/scheduler/gpu-sched.ko.xz
kernel/drivers/i2c/algos/i2c-algo-bit.ko.xz
kernel/drivers/gpu/drm/drm_ttm_helper.ko.xz
kernel/drivers/gpu/drm/ttm/ttm.ko.xz
kernel/drivers/gpu/drm/drm_kms_helper.ko.xz
kernel/drivers/media/cec/core/cec.ko.xz
kernel/drivers/gpu/drm/drm.ko.xz
kernel/drivers/char/agp/agpgart.ko.xz
kernel/drivers/video/fbdev/core/syscopyarea.ko.xz
kernel/drivers/video/fbdev/core/sysfillrect.ko.xz
kernel/drivers/video/fbdev/core/sysimgblt.ko.xz
kernel/drivers/video/fbdev/core/fb_sys_fops.ko.xz

That’s 12 depends (13 lines counting amdgpu itself), not 8 like the script found. Maybe the .dep file just lists the dependencies… but whether or not they are used may be another matter? If that’s the case, my trick may be a red herring and I really do have to boot the kernel to know what was actually loaded :thinking:
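To compare the recorded dependencies across every installed kernel without rebooting, the `modules.dep` lookup can be wrapped in a function. The sketch below also checks `modules.builtin`, the file in the same directory that lists modules compiled into the kernel image. One possible explanation for the missing `drm.ko` on 5.15, not confirmed anywhere in this thread, is that drm is built in on that kernel; this is an assumption the check can test. The function names `module_deps` and `is_builtin` are made up for illustration.

```shell
# Print the dependencies recorded for one module in a modules.dep file, one per line.
# Usage: module_deps <path-to-modules.dep> <module-name>
module_deps() {
    dep_file=$1
    module=$2
    awk -F': ' -v m="$module" '
        {
            # The key is the path before the colon; reduce it to the bare module name.
            n = split($1, parts, "/")
            base = parts[n]
            sub(/\.ko.*$/, "", base)
        }
        base == m {
            k = split($2, deps, " ")
            for (i = 1; i <= k; i++) print deps[i]
        }
    ' "$dep_file"
}

# Succeed if the module is listed in a modules.builtin file.
# Usage: is_builtin <path-to-modules.builtin> <module-name>
is_builtin() {
    awk -v m="$2" '
        { n = split($0, parts, "/"); base = parts[n]; sub(/\.ko$/, "", base) }
        base == m { found = 1 }
        END { exit !found }
    ' "$1"
}

# Walk every installed kernel; directories without a modules.dep are skipped.
for dir in /lib/modules/*/; do
    [ -f "${dir}modules.dep" ] || continue
    echo "== ${dir}"
    module_deps "${dir}modules.dep" amdgpu
    if [ -f "${dir}modules.builtin" ] && is_builtin "${dir}modules.builtin" drm; then
        echo "(drm is listed as built in for this kernel)"
    fi
done
```

As far as I understand it, `modules.dep` records what a module needs in order to load, including indirect dependencies, while `lsmod` shows what is loaded right now, so the two lists do not have to match line for line.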

Daniel, by publishing these pages of research you may have lost almost all of the audience for this thread: it is hard to read and understand it all. Maybe we are now the only ones still reading the new posts, and I do not know how I can be useful in your case.
Imagine the reaction of people who find this thread by searching for key excerpts from the logs. It is already about 10% of the size of a book :slight_smile: Now the only cure for later readers who are also searching for a solution is for you to mark the post that solved it for you. But I guess that will be some later post, not the next one.

About manually loading a kernel module from the 5.13 kernel into 5.15 with modprobe: the 5.15 kernel would have to accept the module as valid. Maybe 5.15 has changed, and even if modules from 5.13 could be loaded, there may be no trigger to switch graphics processing over to them, or it may not be easy to do. So it is a possible idea to try, but not a very promising one.

If I were you, I would first find out whether another media player shows the bug on the latest 5.15 kernel or whether it is only VLC related, and if it is VLC, then report it to the VLC team.
If another player also fails on the latest 5.15, then report it to the kernel team. Otherwise we could guess forever.
Or just wait for report results posted in DE froze with graphic glitches... lots of kernel, drm, and amdgpu entries in journal - #7 by Zesko

you may be right… but I will never be accused of (1) not providing enough information or (2) having no desire to learn :wink:

I decided to look past the kernel module learning session, and decided to focus on the first 4 lines of the original error message I posted… and I just now noticed the reference to “process: vlc” (line 4), and decided to focus on DDG searches for…

  1. “drm vcn0” (line 1) led to an email about a patch @ RE: [PATCH] drm/amdgpu/sriov/vcn: skip ip revision check case to ip init for SIENNA_CICHLID specific for my GPU
  2. and “amdgpu vcn_dec_0 timeout” (line 3) took me to the source code for amdgpu_vcn.c @ linux/amdgpu_vcn.c at master · torvalds/linux · GitHub which also had a link to the same patch from #1 for my GPU @ drm/amdgpu/sriov/vcn: add new vcn ip revision check case for SIENNA_C… · torvalds/linux@da3b36a · GitHub… and when I clicked the parent for the commit of that patch, I found that it would be released in 5.16-rc3 @ Linux 5.16-rc3 · torvalds/linux@d58071a · GitHub

Maybe I’m wrong, but I somewhat expect a fix like this (committed only 3 days ago) to find its way back into 5.15 as well (the latest LTS)… and if not, I know where the fix is and will wait for it unless I find disabling hardware acceleration is no longer a workaround.

But … also … like I have mentioned before …
What about, assuming you have HWAccel properly implemented in your system, setting VLC to use it explicitly, instead of relying on ‘automatic’ (which would indicate a bug in VLC instead)?

You have an amdgpu like me, so it can do VAAPI or VDPAU … though I consider vdpau more ‘native’.

VLC Preferences > Input/Codecs > Hardware-accelerated decoding
Selection: VDPAU



VLC Preferences > Video > Output
Selection: VDPAU

(again this is hardware/configuration dependent … many users, especially without acceleration for example, may get best results from OpenGL or X11 output)

Thanks @cscs !

I was aware the Input/Codecs option defaulted to automatic, but I was unaware there was another automatic default under Video as well… seems I hadn’t followed the link yet you’d shared. Doh!

This information is perfect… I was thinking to try duplicate the issue again, and these VDPAU settings will be what I’ll try use first.

Also, you are going to want to verify you have these packages …

lib32-libva-vdpau-driver
lib32-libvdpau
lib32-mesa-vdpau
libva-vdpau-driver
libvdpau
mesa-vdpau
vdpauinfo

And you can run a vdpau check:

vdpauinfo
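The package check above can be scripted. This is a sketch, assuming a pacman-based system such as Manjaro, where `pacman -Qq` prints one installed package name per line; `missing_packages` is a made-up helper name.

```shell
# Print every requested package that is NOT in the installed list.
# The installed list is read from stdin, one package name per line.
missing_packages() {
    awk -v want="$*" '
        { have[$1] = 1 }
        END {
            n = split(want, pkgs, " ")
            for (i = 1; i <= n; i++) {
                if (!(pkgs[i] in have)) print pkgs[i]
            }
        }
    '
}

# Guarded so the sketch does nothing on systems without pacman.
if command -v pacman >/dev/null 2>&1; then
    pacman -Qq | missing_packages \
        lib32-libva-vdpau-driver lib32-libvdpau lib32-mesa-vdpau \
        libva-vdpau-driver libvdpau mesa-vdpau vdpauinfo || true
fi
```

No output means every package in the list is installed; any name printed is one still to install.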

PS … you know what. Your logs include DRM messages right?
I realized VLC has a HWAccel option “VAAPI via DRM” … maybe that’s what automatic was trying to do…:thinking:

Well, I had successfully played a 5 minute video using the VDPAU (x2) settings, so that was a good initial run… and I just confirmed in the pamac GUI that I had all the core packages you listed installed.

Now that I’ve installed vdpauinfo, it actually shows a lot of information about what’s supported and what’s not…

$ vdpauinfo
display: :0   screen: 0
API version: 1
Information string: G3DVL VDPAU Driver Shared Library version 1.0

Video surface:

name   width height types
-------------------------------------------
420    16384 16384  NV12 YV12 
422    16384 16384  UYVY YUYV 
444    16384 16384  Y8U8V8A8 V8U8Y8A8 
420_16 16384 16384  
422_16 16384 16384  
444_16 16384 16384  

Decoder capabilities:

name                        level macbs width height
----------------------------------------------------
MPEG1                          --- not supported ---
MPEG2_SIMPLE                    3 78336  4096  4906
MPEG2_MAIN                      3 78336  4096  4906
H264_BASELINE                  52 78336  4096  4906
H264_MAIN                      52 78336  4096  4906
H264_HIGH                      52 78336  4096  4906
VC1_SIMPLE                      1 78336  4096  4906
VC1_MAIN                        2 78336  4096  4906
VC1_ADVANCED                    4 78336  4096  4906
MPEG4_PART2_SP                  3 78336  4096  4906
MPEG4_PART2_ASP                 5 78336  4096  4906
DIVX4_QMOBILE                  --- not supported ---
DIVX4_MOBILE                   --- not supported ---
DIVX4_HOME_THEATER             --- not supported ---
DIVX4_HD_1080P                 --- not supported ---
DIVX5_QMOBILE                  --- not supported ---
DIVX5_MOBILE                   --- not supported ---
DIVX5_HOME_THEATER             --- not supported ---
DIVX5_HD_1080P                 --- not supported ---
H264_CONSTRAINED_BASELINE       0 78336  4096  4906
H264_EXTENDED                  --- not supported ---
H264_PROGRESSIVE_HIGH          --- not supported ---
H264_CONSTRAINED_HIGH          --- not supported ---
H264_HIGH_444_PREDICTIVE       --- not supported ---
VP9_PROFILE_0                  --- not supported ---
VP9_PROFILE_1                  --- not supported ---
VP9_PROFILE_2                  --- not supported ---
VP9_PROFILE_3                  --- not supported ---
HEVC_MAIN                      186 139264  8192  4352
HEVC_MAIN_10                   186 139264  8192  4352
HEVC_MAIN_STILL                --- not supported ---
HEVC_MAIN_12                   --- not supported ---
HEVC_MAIN_444                  --- not supported ---
HEVC_MAIN_444_10               --- not supported ---
HEVC_MAIN_444_12               --- not supported ---

Output surface:

name              width height nat types
----------------------------------------------------
B8G8R8A8         16384 16384    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 P010 P016 A8I8 I8A8 
R8G8B8A8         16384 16384    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 P010 P016 A8I8 I8A8 
R10G10B10A2      16384 16384    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 P010 P016 A8I8 I8A8 
B10G10R10A2      16384 16384    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 P010 P016 A8I8 I8A8 

Bitmap surface:

name              width height
------------------------------
B8G8R8A8         16384 16384
R8G8B8A8         16384 16384
R10G10B10A2      16384 16384
B10G10R10A2      16384 16384
A8               16384 16384

Video mixer:

feature name                    sup
------------------------------------
DEINTERLACE_TEMPORAL             y
DEINTERLACE_TEMPORAL_SPATIAL     -
INVERSE_TELECINE                 -
NOISE_REDUCTION                  y
SHARPNESS                        y
LUMA_KEY                         y
HIGH QUALITY SCALING - L1        y
HIGH QUALITY SCALING - L2        -
HIGH QUALITY SCALING - L3        -
HIGH QUALITY SCALING - L4        -
HIGH QUALITY SCALING - L5        -
HIGH QUALITY SCALING - L6        -
HIGH QUALITY SCALING - L7        -
HIGH QUALITY SCALING - L8        -
HIGH QUALITY SCALING - L9        -

parameter name                  sup      min      max
-----------------------------------------------------
VIDEO_SURFACE_WIDTH              y        48     4096
VIDEO_SURFACE_HEIGHT             y        48     4096
CHROMA_TYPE                      y  
LAYERS                           y         0        4

attribute name                  sup      min      max
-----------------------------------------------------
BACKGROUND_COLOR                 y  
CSC_MATRIX                       y  
NOISE_REDUCTION_LEVEL            y      0.00     1.00
SHARPNESS_LEVEL                  y     -1.00     1.00
LUMA_KEY_MIN_LUMA                y  
LUMA_KEY_MAX_LUMA                y  

What does that mean if I have a file that’s not encoded in one of the supported ways?

  1. Will the video play, just not hardware accelerated?
  2. Will the video not play/error?
    • Sounds like an opportunity to use handbrake?
    • Or an opportunity to hunt down and install more codec?

Without delving into it, I am tempted to pick the first one (it plays, just without hardware acceleration), as I haven’t really run into the issue of something being unplayable.
Granted, I do use a superior player (smplayer) :stuck_out_tongue: , but still, I can’t remember a video file that wouldn’t work.

EDIT:
OK … one quick ffmpeg -i file.mp4 file.mpeg later and … yeah it plays.
Looks terrible of course. But it definitely plays.

Hmm, maybe it’s time to try something new. So it looks like smplayer defaults to loading mpv, but also includes mplayer as an alternate multimedia engine… did you stick with mpv?

I’ve used both.
I was about to respond ‘mplayer’ … but in fact its actually set to ‘other’ with path /bin/mpv for some reason. Ha.

Go ahead and configure it. But one of the nice things is its automatics ‘just work’ … or at least well enough to not crash in funkadelic style.

Also make sure to get the skins/themes … makes it a lot nicer.
I use interface : GUI=Basic Icon=PapirusDark Style=Breeze
(packages smplayer-themes, smplayer-skins)

I decided to install all its optional dependencies; there weren’t that many and youtube-dl I already had installed.

So, taking some notes from our VLC discussion, I’ve made the following changes in SMPlayer as well:

  1. Changed the video output driver to VDPAU…
    Screenshot_20211204_150551
    hmmm, I’m not in a Wayland session, should I uncheck it?
  2. Changed Hardware decoding to VDPAU…
    Screenshot_20211204_150909
    hmm, increase the threads, or keep it at 1?

Your skin/theme recommendation is great!

Did you make any adjustments under Advanced for mpv/mplayer?

I actually have the video output driver set to GPU, with the decoding being VDPAU. Not sure if it matters for us.

Wayland: I have it unchecked.

Threads: always been OK with 1.

Advanced for mpv/mplayer: nope, nothing there.
I do have the ‘Repaint…’ option checked in the previous tab, Advanced>Advanced.

Thanks again @cscs . I think I’m good to take SMPlayer for a spin.

Wayland unchecked, video output is GPU, leaving threads at 1, and ‘Repaint…’ was already checked by default.

Great. You are welcome. Now lets see if you can crash it (either one actually).

~whisper~ another cool feature is integrated subtitle grabbing ~whisper~

Challenge accepted :crazy_face:

Ok, I “spoiled” my initial plan a little bit by installing a BIOS update related to another issue… I thought I already had AGESA 1.2.0.2, but found I was a version back @ AGESA 1.2.0.0.

So after my RAID scrubbing and updating my BIOS, I decided to run VLC (VDPAU/VDPAU) and SMPlayer (VDPAU/GPU) in parallel (one muted) to save time and try to force the issue by throwing more at the hardware accelerators… and had zero issues. No monitor sleeping, no corruption, and no errors… on kernel 5.15.2-2 with acceleration properly configured for AMD.

I’m going to select “setting Hardware Acceleration up properly for the hardware” as the solution… but also noting the importance of BIOS/AGESA updates.

Thank you @cscs , @alven , @Zesko , and @megavolt for all your valuable input! You’ve taught me a lot.

aha!
Dumb VLC.
Not to beat it to death … but yeah … this ‘automatic’ thing on VLC has been an issue that bites people for years. It’s actually what made me switch oh so many moons ago.

In any case I am glad everything is OK and you dont need to do a kernel bisect :wink:

I’m going to take your word for it and resist asking what a kernel bisect is :wink:
