Cannot boot realtime kernel 6.x with Arc GPU

Hello,
I need some help troubleshooting and resolving this, please.

Problem Statement:
Cannot boot realtime kernels 6.x
Regular kernels 6.x boot normally

Symptoms:
The boot process hangs during kernel initialization/service startup. The last lines shown are:

mei_gsc i915.mei-gsc.768: waiting for mei start failed
mei_gsc i915.mei-gsc.768: HBM haven't started
mei_gsc i915.mei-gsc.768: link layer initialization failed.

Steps to reproduce:

  1. Install Arc A380 GPU (requires kernel 6.1 or newer)
  2. Install realtime kernel 6.x
  3. Boot

Note that I was previously running a realtime kernel on this same machine, and the boot failure started after I switched to:

  • Arc GPU
  • kernel 6.x

Unfortunately, I cannot separate the two, as the Arc GPU requires kernel 6.1 or later.

Possible Causes:

  1. realtime kernels 6.x itself (doubt it)
  2. combination of realtime kernel and Arc A380 (suspected)
  3. combination of realtime kernel and Arc A380 and particular monitor/resolution (suspected but doubtful)

Reason for possible cause #3:
I previously had trouble getting full resolution with my ultrawide monitor, and it was resolved with this patch:
https://patchwork.freedesktop.org/patch/523769/?series=114278&rev=1

However, I think it was merged in 6.2 or 6.3, so I don’t expect this to be the issue.

System info:

inxi
CPU: quad core Intel Core i5-2500K (-MCP-) speed/min/max: 2103/1600/3700 MHz
Kernel: 6.4.16-5-MANJARO x86_64 Up: 35m Mem: 2.84/14.6 GiB (19.4%)
Storage: 577.55 GiB (56.2% used) Procs: 264 Shell: Zsh inxi: 3.3.30

inxi -G
Graphics:
  Device-1: Intel 2nd Generation Core Processor Family Integrated Graphics
    driver: i915 v: kernel
  Device-2: Intel DG2 [Arc A380] driver: i915 v: kernel
  Display: x11 server: X.Org v: 21.1.8 driver: X: loaded: modesetting
    dri: iris,crocus gpu: i915 resolution: 3440x1440~60Hz
  API: EGL v: 1.5 drivers: crocus,iris,swrast
    platforms: gbm,x11,surfaceless,device
  API: OpenGL v: 4.6 compat-v: 3.3 vendor: intel mesa v: 23.1.9-manjaro1.1
    renderer: Mesa Intel Arc A380 Graphics (DG2)
  API: Vulkan v: 1.3.264 drivers: intel surfaces: xcb,xlib

Logs:
Successful boot (6.4 kernel):

journalctl -b 0 | grep -E 'i915|mei|HBM' > 64ok_i915.log
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:00:02.0: enabling device (0006 -> 0007)
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:00:02.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment)
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:00:02.0: [drm] Failed to find VBIOS tables (VBT)
Oct 09 09:18:40 manjaro-daw kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: vgaarb: deactivate vga console
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] Can't resize LMEM BAR - platform support is missing
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] Local memory IO size: 0x0000000010000000
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] Local memory available: 0x000000017c800000
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] Using a reduced BAR size of 256MiB. Consider enabling 'Resizable BAR' or similar, if available in the BIOS.
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] Finished loading DMC firmware i915/dg2_dmc_ver2_08.bin (v2.8)
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: GuC firmware i915/dg2_guc_70.bin version 70.8.0
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: HuC firmware i915/dg2_huc_gsc.bin version 7.10.3
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: GUC: submission enabled
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: GUC: SLPC enabled
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: GUC: RC enabled
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Oct 09 09:18:40 manjaro-daw kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:03:00.0 on minor 1
Oct 09 09:18:40 manjaro-daw kernel: fbcon: i915drmfb (fb0) is primary device
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:03:00.0: [drm] fb0: i915drmfb frame buffer device
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Oct 09 09:18:40 manjaro-daw kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Oct 09 09:18:41 manjaro-daw kernel: mei_gsc i915.mei-gsc.768: FW not ready: resetting: dev_state = 2 pxp = 2
Oct 09 09:18:41 manjaro-daw kernel: mei_gsc i915.mei-gsc.768: unexpected reset: dev_state = ENABLED fw status = 00000345 84670000 00000000 00000000 E0020002 00000000
Oct 09 09:18:41 manjaro-daw kernel: snd_hda_intel 0000:04:00.0: bound 0000:03:00.0 (ops i915_audio_component_bind_ops [i915])
Oct 09 09:18:41 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: HuC: authenticated!
Oct 09 09:18:41 manjaro-daw kernel: mei_pxp i915.mei-gsc.768-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:03:00.0 (ops i915_pxp_tee_component_ops [i915])
Oct 09 09:33:21 manjaro-daw kernel: i915 0000:03:00.0: [drm] *ERROR* Atomic update failure on pipe A (start=54133 end=54134) time 153 us, min 1431, max 1439, scanline start 1426, end 1440

Failed boot (6.5-rt):

journalctl -b -1 | grep -E 'i915|mei|HBM' > 65rtng_i915.log
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:00:02.0: enabling device (0006 -> 0007)
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:00:02.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment)
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:00:02.0: [drm] Failed to find VBIOS tables (VBT)
Oct 09 09:16:17 manjaro-daw kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 1
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: vgaarb: deactivate vga console
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] Can't resize LMEM BAR - platform support is missing
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] Local memory IO size: 0x0000000010000000
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] Local memory available: 0x000000017c800000
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] Using a reduced BAR size of 256MiB. Consider enabling 'Resizable BAR' or similar, if available in the BIOS.
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] Finished loading DMC firmware i915/dg2_dmc_ver2_08.bin (v2.8)
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: GuC firmware i915/dg2_guc_70.bin version 70.8.0
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: HuC firmware i915/dg2_huc_gsc.bin version 7.10.3
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: GUC: submission enabled
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: GUC: SLPC enabled
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] GT0: GUC: RC enabled
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Oct 09 09:16:17 manjaro-daw kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:03:00.0 on minor 2
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Oct 09 09:16:17 manjaro-daw kernel: fbcon: i915drmfb (fb0) is primary device
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
Oct 09 09:16:17 manjaro-daw kernel: i915 0000:03:00.0: [drm] fb0: i915drmfb frame buffer device

The only thing that stands out to me is the difference in the DRM minor numbers:
OK (6.4):

Oct 09 09:18:40 manjaro-daw kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
...
Oct 09 09:18:40 manjaro-daw kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:03:00.0 on minor 1

NG (6.5-rt):

Oct 09 09:16:17 manjaro-daw kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 1
...
Oct 09 09:16:17 manjaro-daw kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:03:00.0 on minor 2
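
One way to compare the two saved logs more systematically is to strip the timestamp prefix and list the messages that appear only in the successful boot. The following is a minimal sketch, not a definitive tool: the function names strip_prefix and only_in_first are made up for illustration, and the prefix pattern assumes the "Oct 09 09:18:40 manjaro-daw kernel: " layout seen in the excerpts above.

```shell
# Remove the "Mon DD HH:MM:SS host kernel: " prefix so only the message remains.
# Assumes the journalctl short format shown in the logs above.
strip_prefix() {
    sed -E 's/^[A-Za-z]{3} [0-9 ]{1,2} [0-9:]{8} [^ ]+ kernel: //' "$1"
}

# Print messages that occur in the first log but not in the second.
# Lines are compared as whole fixed strings, so "[drm]" is not treated as a regex.
only_in_first() {
    tmp=$(mktemp) || return 1
    strip_prefix "$2" > "$tmp"
    strip_prefix "$1" | grep -vxF -f "$tmp"
    rm -f "$tmp"
}

# Usage (file names as saved by the journalctl commands above):
#   only_in_first 64ok_i915.log 65rtng_i915.log
```

Because the minor numbers differ, the two "Initialized i915" lines are expected to show up in the output alongside any messages that never appeared in the failed boot.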

I’ve also looked through other journalctl logs, but I’m not seeing anything useful in them.
I would be happy to provide more if anyone wants to review them.

Any ideas what I should do to continue troubleshooting this?

I know this is not the advice you want to hear, but do you actually need the RT version? From my understanding, most of the RT changes are already included in the regular kernel anyway.

I’ve tried everything else to get rid of my xruns.

Edit: Even if I set my buffer size to 2048 samples (46.4 ms latency), I still get xruns.
I concede that a realtime kernel may not be the solution, but this same machine used to handle 512 samples (11.6 ms) reliably and 256 samples (5.8 ms) usually.
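
For reference, the latency figures quoted here match one buffer period at a 44.1 kHz sample rate. The sample rate is not stated in the post, so treat it as an assumption inferred from the numbers. A small sketch of the arithmetic (the function name buffer_latency_ms is made up for illustration):

```shell
# Latency of one buffer period in milliseconds: samples / sample_rate * 1000.
# The 44100 Hz default is an assumption inferred from the figures in the post.
buffer_latency_ms() {
    awk -v samples="$1" -v rate="${2:-44100}" \
        'BEGIN { printf "%.1f\n", samples / rate * 1000 }'
}

for samples in 256 512 2048; do
    printf '%s samples: %s ms\n' "$samples" "$(buffer_latency_ms "$samples")"
done
```

A second argument overrides the sample rate, for example buffer_latency_ms 512 48000.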

I’ve also gone through and addressed all the issues/recommendations from rtcqs, just in case.

Okay, you seem to know what you’re doing. Unfortunately, I can’t be of any more help :frowning: Good luck, though.

Thanks for taking a look, and replying.
Appreciate it.

Troubleshooting xruns is the worst; I would really like to try the realtime kernel to see if it helps.
Hopefully someone can help me out. :sos:

Might not be related, but Resizable BAR being enabled seems to be a requirement for supported hardware configurations.

Also https://game.intel.com/story/intel-arc-graphics-resizable-bar/

We’ve been transparent that Resizable BAR (Base Address Register) is required to get a good experience with Intel® Arc™ hardware

Complete guess, but I wonder if the combination of that plus a 12-year-old CPU just can’t keep up with gfx + audio memory transfers.

No idea why the RT kernel doesn’t work though.

Thanks, appreciate the input.
I suppose it could be. I was pretty disappointed to find out about the resizable BAR issue, but it drives my Ultrawide monitor fine, which is all I wanted it for. It is an audio workstation and desktop, not a gaming rig or anything.

Even so, it sure seems like the kernel should still boot.
It feels like the rt kernel is just missing something, since pre-6.1 kernels wouldn’t boot with Arc either.

That’s possible; they are built with different configs. The way to investigate this further, if you have the technical knowledge, would be to clone both package repositories:

git clone https://gitlab.manjaro.org/packages/core/linux65.git
git clone https://gitlab.manjaro.org/packages/core/linux65-rt.git

then look at differences between linux65/config and linux65-rt/config.rt (and also the PKGBUILDs).
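
A rough sketch of that comparison, assuming both repositories were cloned into the current directory as above. The helper names config_options and config_diff are made up for illustration, and the keyword filter in the usage comment is only a guess at which options might be relevant here.

```shell
# Print the options that are set (CONFIG_FOO=...) in a kernel config file, sorted.
# Lines such as "# CONFIG_FOO is not set" are skipped, so an option that is unset
# in one config shows up as present only in the other.
config_options() {
    grep -E '^CONFIG_[A-Za-z0-9_]+=' "$1" | sort
}

# Show options that differ between two configs.
# Like diff itself, this returns 1 when the configs differ.
config_diff() {
    a=$(mktemp) && b=$(mktemp) || return 2
    config_options "$1" > "$a"
    config_options "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Usage:
#   config_diff linux65/config linux65-rt/config.rt | grep -Ei 'mei|i915|drm|preempt'
```

Lines starting with "<" come from the regular config and lines starting with ">" from the RT config.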

Update: I have replaced the motherboard and CPU.
I still can’t boot the RT kernel.

But the hardware upgrade has vanquished the xruns, so I no longer need the RT kernel.
Still, I wanted to share that the issue occurs even with:

  • CPU only 3 years old
  • Resizable BAR supported and enabled
    (so they are not the issue)

inxi
CPU: 10-core Intel Core i9-10900KF (-MCP-) speed/min/max: 4899/800/5300 MHz
Kernel: 6.4.16-5-MANJARO x86_64 Up: 11m Mem: 3.05/15.49 GiB (19.7%)
Storage: 1.47 TiB (28.2% used) Procs: 461 Shell: Zsh inxi: 3.3.30

inxi -G
Graphics:
  Device-1: Intel DG2 [Arc A380] driver: i915 v: kernel
  Display: x11 server: X.Org v: 21.1.8 driver: X: loaded: modesetting
    dri: iris gpu: i915 resolution: 3440x1440~60Hz
  API: EGL v: 1.5 drivers: iris,swrast platforms: gbm,x11,surfaceless,device
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: intel mesa v: 23.1.9-manjaro1.1
    renderer: Mesa Intel Arc A380 Graphics (DG2)
  API: Vulkan v: 1.3.264 drivers: intel surfaces: xcb,xlib