i915 GPU hang [SOLVED]

The update is causing constant graphics problems. The log shows these messages:

28/12/17 11:22		[drm] GPU HANG: ecode 6:0:0x87e8fffd, in kwin_x11 [2180], reason: Hang on rcs0, action: reset
28/12/17 11:22		[drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
28/12/17 11:22		[drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
28/12/17 11:22		[drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
28/12/17 11:22		[drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
28/12/17 11:22		[drm] GPU crash dump saved to /sys/class/drm/card0/error
28/12/17 11:22		i915 0000:00:02.0: Resetting chip after gpu hang

I have been looking for a solution, but I have not gotten anywhere yet.
Does anyone have any ideas?


Solved! Adding these parameters to the kernel command line fixed it:

i915.modeset=1 i915.enable_rc6=1 i915.enable_fbc=1 i915.enable_guc_loading=1 i915.enable_guc_submission=1 i915.enable_huc=1 i915.enable_psr=1 i915.disable_power_well=0 i915.semaphores=1

It works on kernels 4.14 and 4.15.

A little info about your system would probably help someone help you.
Please post the output of inxi -Fxz.

Here is my inxi -Fxz output:

Resuming in non X mode: xrandr not found. For package install advice run: inxi --recommends
System:    Host: Portatil-Kuu Kernel: 4.14.9-2-MANJARO x86_64 bits: 64 gcc: 7.2.1
           Desktop: KDE Plasma 5.11.4 (Qt 5.10.0) Distro: Manjaro Linux
Machine:   Device: portable System: Dell product: Dell System XPS L502X serial: N/A
           Mobo: Dell model: 0YR8NN v: A00 serial: N/A UEFI [Legacy]: Dell v: A11 date: 05/29/2012
Battery    BAT0: charge: 35.7 Wh 100.0% condition: 35.7/57.7 Wh (62%) model: SIMPLO Dell status: Full
           hidpp__0: charge: 5% condition: NA/NA Wh model: Logitech M570 status: Discharging
CPU:       Quad core Intel Core i7-2630QM (-MT-MCP-) arch: Sandy Bridge rev.7 cache: 6144 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 15970
           clock speeds: max: 2900 MHz 1: 1995 MHz 2: 1995 MHz 3: 1995 MHz 4: 1995 MHz 5: 1995 MHz 6: 1995 MHz
           7: 1995 MHz 8: 1995 MHz
Graphics:  Card-1: Intel 2nd Generation Core Processor Family Integrated Graphics Controller bus-ID: 00:02.0
           Card-2: NVIDIA GF108M [GeForce GT 525M] bus-ID: 01:00.0
           Display Server: N/A driver: intel tty size: 130x42
Audio:     Card Intel 6 Series/C200 Series Family High Definition Audio Controller
           driver: snd_hda_intel bus-ID: 00:1b.0
           Sound: Advanced Linux Sound Architecture v: k4.14.9-2-MANJARO
Network:   Card-1: Intel Centrino Wireless-N 1000 [Condor Peak] driver: iwlwifi bus-ID: 03:00.0
           IF: wlp3s0 state: down mac: <filter>
           Card-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           driver: r8168 v: 8.044.02-NAPI port: 2000 bus-ID: 06:00.0
           IF: enp6s0 state: up speed: 100 Mbps duplex: full mac: <filter>
Drives:    HDD Total Size: 1000.2GB (80.0% used)
           ID-1: /dev/sda model: WDC_WD10JPVX size: 1000.2GB
Partition: ID-1: / size: 20G used: 9.9G (54%) fs: ext4 dev: /dev/sda1
           ID-2: /home size: 591G used: 491G (84%) fs: ext4 dev: /dev/sda5
           ID-3: swap-1 size: 2.15GB used: 0.00GB (0%) fs: swap dev: /dev/sda3
Sensors:   System Temperatures: cpu: 54.0C mobo: 54.0C
           Fan Speeds (in rpm): cpu: N/A
Info:      Processes: 211 Uptime: 12 min Memory: 2429.6/5861.6MB Init: systemd Gcc sys: 7.2.1
           Client: Shell (bash 4.4.121) inxi: 2.3.53

Have you downgraded the xorg-server packages yet? sudo pacman -Syyuu

Yes, I have downgraded xorg-server and the problem persists.

After a series of tests, I have come to the conclusion that the problem may be in kernel 4.14.9-2. Versions 4.14.9-1 and 4.14.8 did not show the problem, but version 4.14.5 also fails.
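One way to make this kind of per-kernel comparison less subjective is to count the hang messages logged during each boot. The following is only a sketch: `count_gpu_hangs` is a hypothetical helper name, and it assumes the kernel log text is piped in on stdin and that every hang produces a line containing "GPU HANG", as in the log above.

```shell
# Hypothetical helper: count kernel log lines that report a GPU hang.
# Reads log text from stdin and prints the number of matching lines.
count_gpu_hangs() {
    # grep -c prints the count; "|| true" keeps the exit status at 0
    # when there are no matches, so the function is safe under "set -e".
    grep -c 'GPU HANG' || true
}
```

On a systemd system such as this one, it could be fed with `journalctl -k -b 0 | count_gpu_hangs` for the current boot, with `uname -r` noted alongside to record which kernel was running.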

Makes sense, @Kuu. One of the first things we suggest around here is changing kernels when problems hit. Rolling nature…

With v4.14.9-2 I added the following pre-patch. Those patches can also be reviewed here. I doubt that this patch causes a regression on your end, but who knows. v4.14.10 will be available later tomorrow. Then we will see …

Thanks for the reply.
I do not know enough about programming to understand what those patches do, so I’ll wait for version 4.14.10.

In case it helps, now this error has appeared:
asynchronous wait on fence i915:[global]:1c32fa timed out

The graphics problem continues with kernel 4.14.10-2:

|31/12/17 9:08||[drm] GPU HANG: ecode 6:0:0x87e8effd, in plasmashell [1101], reason: Hang on rcs0, action: reset|
|---|---|---|
|31/12/17 9:08||[drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.|
|31/12/17 9:08||[drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel|
|31/12/17 9:08||[drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.|
|31/12/17 9:08||[drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.|
|31/12/17 9:08||[drm] GPU crash dump saved to /sys/class/drm/card0/error|
|31/12/17 9:08||i915 0000:00:02.0: Resetting chip after gpu hang|

But it does not happen with kernel 4.9.73-1.
Do you need more information to fix it?

You have all the info you need to file a bug with Intel. Simply follow the instructions already printed in your log.
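Since the log asks for the crash dump to be attached, it may help to save a copy before rebooting. This is a minimal sketch: `save_gpu_dump` and the default output file name are made up for illustration, the source path is the one printed in the log, and reading it will usually require root.

```shell
# Hypothetical helper: save a compressed copy of the i915 error state.
#   $1  source file (defaults to the path printed in the kernel log)
#   $2  destination file (the default name is only an assumption)
save_gpu_dump() {
    src=${1:-/sys/class/drm/card0/error}
    dest=${2:-gpu-crash-dump.txt.gz}
    # Read the dump and write it gzip-compressed to the destination.
    gzip -c < "$src" > "$dest"
}
```

The resulting file can then be attached to the bug report on bugs.freedesktop.org.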

You may also try linux415 to see if it is already fixed upstream.

Solved! Adding these parameters to the kernel command line fixed it:

i915.modeset=1 i915.enable_rc6=1 i915.enable_fbc=1 i915.enable_guc_loading=1 i915.enable_guc_submission=1 i915.enable_huc=1 i915.enable_psr=1 i915.disable_power_well=0 i915.semaphores=1

It works on kernels 4.14 and 4.15.
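For anyone wondering how to add such parameters: with GRUB they normally go into the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub. The sketch below assumes GRUB is the bootloader, that the variable sits on a single double-quoted line, and that GNU sed is available; `add_kernel_params` is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: append kernel parameters to GRUB_CMDLINE_LINUX_DEFAULT.
#   $1  path to the GRUB defaults file, e.g. /etc/default/grub
#   $2  space-separated parameters to append
# Assumes the variable is defined on one line with a double-quoted value.
add_kernel_params() {
    file=$1
    params=$2
    # Keep the existing value and insert the new parameters
    # just before the closing quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $params\"|" "$file"
}
```

After backing up /etc/default/grub, the function would be run as root against that file, followed by `sudo update-grub` and a reboot. `cat /proc/cmdline` then shows whether the parameters were picked up. The i915 parameter names differ between kernel versions, so `modinfo -p i915` is worth checking to see which ones the installed module actually accepts.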
