Intel-virtual-output - High temperature causes Linux crash

When I activate a monitor on the HDMI output (wired to the NVIDIA Optimus chip) of my notebook using the commands below, the notebook starts to get very hot, especially if I am playing a video (web streaming, for example) on this monitor. The setup itself works, but the high temperature is making my Linux system crash.

How can I solve this problem of high temperature?

NOTE: When I use NVIDIA graphics acceleration chip (primusrun) I have no temperature problems.

intel-virtual-output

xrandr -q &>/dev/null
xrandr --newmode "1392x892_60.00"  102.00  1392 1472 1616 1840  892 895 905 926 -hsync +vsync

xrandr --addmode VIRTUAL2 1392x892_60.00
xrandr --output VIRTUAL2 --mode 1392x892_60.00 --right-of LVDS1

My NVIDIA Drivers Installation Process:

SOME REF: https://wiki.archlinux.org/index.php/bumblebee#Outputs_wired_to_the_Intel_chip

The best way to reduce temperatures is to give your fans, heatsink fins, and vents a clean out.

If your GPU and CPU are active then they will be producing heat. If the laptop’s cooling isn’t sufficient to prevent overheating (in normal operation) then that’s a design flaw.

I have no overheating issues when I use the same hardware in the same way with another operating system, so I believe there must be another solution. Is there no way to use the Nouveau driver, for example? Any other suggestions? Thanks a lot! =D

That’s nice and vague. If it’s the same setup then there wouldn’t be a problem here either, which makes me think it’s not being used in the same way.

For example, is the “other OS” using render offload instead of display offload?

Don’t discount an obvious answer just because it’s not what you want to hear (i.e. don’t expect “Do this one weird trick to solve your problem!”).

Optimus support under Linux is a mess. It’s known to be a mess, and it’s not going to change unless NVIDIA devote some effort to it. Unfortunately, Linux makes up a tiny share of the NVIDIA market (but presents a greater number of issues) hence they’re going to spend their time on areas with a larger market share.

Of course, you could try switching to the nouveau drivers (as you mentioned) and see if that helps.

Any clues on how I could use nouveau? I would be very grateful! =D

Complementing/confirming @jonathon's answer…

Transposed from https://unix.stackexchange.com/a/446609/61742

The problem is a physical one: the processor + GPU are producing more heat than the laptop can dissipate, and so it’s overheating and ultimately crashing. (Does the crash look like the notebook is hitting an overheat shutdown, or is it actually crashing because of overheating-induced data corruption? In other words, does it just power completely off once it gets too hot, or do you see graphical glitches or any other strange behavior?)
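One way to look into that question is to search the kernel log of the boot that ended in the crash for thermal messages. The following is only a sketch: the helper name filter_thermal and its keyword list are made up for illustration, and the journalctl call assumes a systemd system with a persistent journal, which may not be the case on every install.

```shell
#!/bin/sh
# Keep only the log lines from stdin that mention thermal events.
# The keyword list is a guess, not an exhaustive set of kernel messages.
filter_thermal() {
    grep -iE 'thermal|temperature|overheat|throttl' || true
}

# Usage (left commented out so that sourcing this file has no side effects):
#   journalctl -k -b -1 --no-pager | filter_thermal
# -k selects kernel messages, -b -1 selects the previous boot.
```

If nothing shows up, that does not rule out overheating: a hard power-off can happen before anything is written to disk.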

The only software workarounds for that would be restricting the heat production, which would mean restricting the system performance. For example, you could use cpufreq-set or cpupower frequency-set with appropriate options to limit the maximum clock frequency of your CPU, and use nvidia-settings to set the GPU to a lower performance level.
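As a rough illustration of the CPU side of this, the sketch below caps the maximum clock with cpupower. The 1.8GHz value, the MAX_FREQ variable, the cap_cpu helper, and the --apply argument are all placeholders invented for this example; the right limit depends on the CPU and should sit inside the range that `cpupower frequency-info` reports.

```shell
#!/bin/sh
# MAX_FREQ is a placeholder; pick a value inside the hardware limits
# reported by `cpupower frequency-info`.
MAX_FREQ="${MAX_FREQ:-1.8GHz}"

# Cap the maximum CPU clock. Must run as root.
cap_cpu() {
    cpupower frequency-set --max "$1"
}

# Apply only when the script is called with --apply, so that sourcing
# it changes nothing on its own.
if [ "${1:-}" = "--apply" ]; then
    cap_cpu "$MAX_FREQ"
fi
```

The cap is not persistent, so it would have to be reapplied after a reboot, and a power-management daemon may overwrite it.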

A real fix would probably be a physical one: to start with, make sure nothing is blocking the slots used for cooling air, and clean them if they seem to be clogged with dust. Since this is a notebook, opening it up and cleaning the heat sinks more thoroughly is not as easy as with desktop systems, and would probably void the warranty if one is still in effect.

If the notebook is still under warranty, I would recommend contacting the vendor’s support and describing the problem. It could be that the heat sink is not in good thermal contact with the processor and/or GPU, and the notebook would need to be opened up to have the heat sink(s) reattached properly.

If you start thinking about opening up the notebook yourself, be very careful and try to find as much information about the task as you can on the internet first, as the insides of a notebook are quite a bit more fragile than the corresponding parts of a regular desktop. A YouTube video that shows the steps and techniques needed for your specific model would be a great find. A service manual from the hardware vendor would also be good; however, not all vendors make service manuals freely downloadable.

Originally answered by telcoM ( https://unix.stackexchange.com/users/258991/telcom ).

At the moment I’m thinking of something to limit the GPU’s performance when the temperature reaches high levels. Apparently the NVIDIA GPU does not process “anything”; it just copies a memory area (generated by the Intel GPU?). So I think that for HDMI output the NVIDIA GPU could run at lower speeds with no problems.

Almost a solution!

According to this documentation…

https://wiki.archlinux.org/index.php/fan_speed_control#Asus_laptops

… it is possible to control the processor’s fan power/rpm with the following commands…

echo 255 > /sys/devices/platform/asus-nb-wmi/hwmon/hwmon[[:print:]]*/pwm1           # Full fan speed (Value: 255)
echo 0 > /sys/devices/platform/asus-nb-wmi/hwmon/hwmon[[:print:]]*/pwm1             # Fan is stopped (Value: 0)
echo 2 > /sys/devices/platform/asus-nb-wmi/hwmon/hwmon[[:print:]]*/pwm1_enable      # Change fan mode to automatic
echo 1 > /sys/devices/platform/asus-nb-wmi/hwmon/hwmon[[:print:]]*/pwm1_enable      # Change fan mode to manual

These commands apply to ASUS notebooks (my case).

With this in mind, we activate the HDMI output (as explained in this thread) and then run the command…

echo 255 > /sys/devices/platform/asus-nb-wmi/hwmon/hwmon[[:print:]]*/pwm1           # Full fan speed (Value: 255)

… that puts the fan at full power/rpm.

Result: Temperature under control! No system crash occurs! =D
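The manual step above could be wrapped in two small helpers, one to force full speed and one to hand control back to the firmware afterwards. This is a minimal sketch, not a tested tool: the names HWMON_GLOB, find_hwmon, fan_full and fan_auto are invented here, and selecting manual mode before writing pwm1 is an assumption about how the driver behaves. It must run as root (note that `sudo echo 255 > file` does not work, because the redirection is done by the unprivileged shell).

```shell
#!/bin/sh
# Directory pattern of the ASUS hwmon node; overridable for testing.
HWMON_GLOB="${HWMON_GLOB:-/sys/devices/platform/asus-nb-wmi/hwmon/hwmon*}"

# Print the first directory matching the pattern, or fail if none exists.
find_hwmon() {
    for dir in $HWMON_GLOB; do
        [ -d "$dir" ] && { printf '%s\n' "$dir"; return 0; }
    done
    return 1
}

# Switch to manual mode (1), then write the maximum PWM value (255).
fan_full() {
    dir=$(find_hwmon) || return 1
    echo 1 > "$dir/pwm1_enable" && echo 255 > "$dir/pwm1"
}

# Switch back to automatic mode (2).
fan_auto() {
    dir=$(find_hwmon) || return 1
    echo 2 > "$dir/pwm1_enable"
}
```

A typical use would be to call fan_full right after the xrandr commands and fan_auto once the external monitor is switched off again.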

Given this, I propose one of the following solutions:

  • Put the fan on full power/rpm when you turn on the HDMI;
  • Make the fan’s temperature response more “aggressive” (preferred).

NOTE I: There are other ways to control the fan’s power/rpm (and temperature response). fancontrol, for example, is one of them.
NOTE II: There are components that may be influencing the fan’s power/rpm such as “thermald” and “tlp”. Both are installed by default in Manjaro.
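For the second option, a more “aggressive” response could be approximated with a small polling script that maps the current temperature to a PWM value. Everything specific in this sketch is an assumption: the thresholds are illustrative and not tuned for any model, the names temp_to_pwm and fan_step are invented, and thermal_zone0 may not be the CPU sensor on a given machine. It would need to run as root, and thermald or tlp may interfere with it.

```shell
#!/bin/sh
# Map a temperature in millidegrees Celsius to a PWM value (0-255).
# The thresholds are illustrative guesses.
temp_to_pwm() {
    t=$1
    if   [ "$t" -ge 75000 ]; then echo 255
    elif [ "$t" -ge 65000 ]; then echo 200
    elif [ "$t" -ge 55000 ]; then echo 150
    else echo 100
    fi
}

# One control step: read the temperature file, write the PWM file.
fan_step() {
    temp_file=$1
    pwm_file=$2
    temp_to_pwm "$(cat "$temp_file")" > "$pwm_file"
}

# Example loop (commented out; both paths are assumptions):
#   while :; do
#       fan_step /sys/class/thermal/thermal_zone0/temp \
#                /sys/devices/platform/asus-nb-wmi/hwmon/hwmon[[:print:]]*/pwm1
#       sleep 5
#   done
```

A step function like this can make the fan oscillate around a threshold; a real setup would probably want some hysteresis, which is what tools such as fancontrol provide.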


PLUS: I would like your opinions and suggestions on how to increase the fan’s power/rpm more intelligently.

There are a couple of experts around here :wink:
Check these if you have some free time for reading:
Power Management from @FadeMind
TLP from @jonathon
TLP tagged

@eduardolucioac, after exhausting everything else–and you have done an excellent job of researching and posting your findings, by the way–an alternative that served me well several years ago in a similar nVidia/Laptop/ext. monitor situation, yada, yada, was an inexpensive, wire-frame chill-stand with 3 small fans. There are any number of them available for purchase.

Sometimes there is only so much that hackery can help.

Good job!

regards

@c00ter I understood! You talk about it here…

In fact … I’ve tried this and in my case it helps (just a little bit), but it does not solve the problem. Thanks for the suggestion! :grinning::grinning::grinning:
