ASUS laptop: discrete AMD GPU overheats and crashes the system

The laptop now stabilizes at 105 degrees and no longer crashes.
Is 105 degrees going to damage any part of the laptop, or is it fine?
Update: it crashed.

If it is 105°F, no worries, but if it is 105°C then definitely yes! It should stay below 90°C, with normal usage at about 30°C to 60°C. The hotter the GPU runs, the shorter its lifetime.
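
To put the two readings side by side, here is a small sketch of the Fahrenheit to Celsius conversion. The function name f_to_c is made up for illustration:

```shell
# Convert degrees Fahrenheit to Celsius: C = (F - 32) * 5 / 9
f_to_c() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale
    LC_ALL=C awk -v f="$1" 'BEGIN { printf "%.1f\n", (f - 32) * 5 / 9 }'
}

f_to_c 105   # prints 40.6
```

So 105°F would be a perfectly ordinary idle temperature, while 105°C is well past the 90°C limit mentioned above.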

Maybe related to this.

Also, does this happen in an X11 session too?

And did you try the parameters?

Yes, the parameters are applied.

SI and CIK support are enabled,
and the problem exists in both Wayland and X11 sessions.
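
For anyone who wants to double-check the same thing, a rough sketch of how to confirm the parameters reached the running kernel. The thread does not list the exact parameters, so the four names below are an assumption (they are the usual radeon/amdgpu switches for SI and CIK cards), and has_param is just a helper name made up here:

```shell
# has_param CMDLINE PARAM: succeed if PARAM appears as a whole word in CMDLINE
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: these are the parameters meant in the thread; it does not list them.
wanted="radeon.si_support=0 amdgpu.si_support=1 radeon.cik_support=0 amdgpu.cik_support=1"

if [ -r /proc/cmdline ]; then
    cmdline=$(cat /proc/cmdline)
    for p in $wanted; do
        if has_param "$cmdline" "$p"; then
            echo "set:     $p"
        else
            echo "missing: $p"
        fi
    done
fi
```

Running lspci -k also shows a "Kernel driver in use" line for the card, which tells you whether radeon or amdgpu actually picked it up.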

So open /etc/mkinitcpio.conf and edit the MODULES line to look like this:
MODULES=(amdgpu)
If there are already some modules listed, add amdgpu to them.
Then run this:
sudo mkinitcpio -P
and reboot.
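
A minimal sketch for checking the result of the steps above. It only reads files and changes nothing; the helper name modules_has_amdgpu is made up for illustration:

```shell
# modules_has_amdgpu FILE: succeed if FILE has an uncommented MODULES=(...) line listing amdgpu
modules_has_amdgpu() {
    grep -Eq '^[[:space:]]*MODULES=\(([^)]*[[:space:]])?amdgpu([[:space:]][^)]*)?\)' "$1"
}

conf=/etc/mkinitcpio.conf
if [ -r "$conf" ]; then
    if modules_has_amdgpu "$conf"; then
        echo "amdgpu is listed in MODULES"
    else
        echo "amdgpu is NOT listed in MODULES"
    fi
fi

# After the reboot: loaded modules show up as directories under /sys/module
if [ -d /sys/module/amdgpu ]; then
    echo "amdgpu module is loaded"
else
    echo "amdgpu module is not loaded"
fi
```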

The only reasonable question then: is the cooling system of the laptop clean?
Just a bit of dust “in the right place” and everything goes … hot.

The laptop board was cleaned of dust and the thermal paste was replaced about 3 months ago,
so it's fairly safe to assume that the cooling is working well.

Well … then my last question: Does this overheating happen on another OS?

In Windows the problem does not seem to exist.
While doing anything resource-intensive in Windows (gaming, for example) the fan runs at maximum speed, but the system won't crash.

Done and tested on 5.15. Still crashing.
I will test other kernel versions with the changes we’ve made so far and I will update the discussion.

What temperature does it reach?
The Linux kernel, through its thermal protection, will shut the machine down if the system reaches its critical temperature or it detects other serious problems, to prevent further damage.
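
To answer the temperature question without extra tools, here is a rough sketch that reads the GPU sensor from the kernel's hwmon interface. It assumes the driver registers its sensor under the name amdgpu or radeon and exposes temp1_input in millidegrees Celsius; the helper name millideg_to_c is made up:

```shell
# millideg_to_c VALUE: hwmon reports millidegrees Celsius; convert to whole degrees
millideg_to_c() {
    echo $(( $1 / 1000 ))
}

for d in /sys/class/hwmon/hwmon*; do
    [ -r "$d/name" ] || continue
    name=$(cat "$d/name")
    case "$name" in
        amdgpu|radeon) ;;
        *) continue ;;
    esac
    # Reading can fail while the dGPU is powered down, so tolerate errors
    raw=$(cat "$d/temp1_input" 2>/dev/null) || continue
    [ -n "$raw" ] || continue
    echo "$name: $(millideg_to_c "$raw") C"
done
```

If the same directory has a temp1_crit file, it holds the critical limit in the same unit. Not every card and driver combination provides it.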

So you could try flashing a non-Arch ISO like Linux Mint onto a USB stick and check in the live session whether you have the same issues.

An alternative is to install nbfc and create your own thermal profile. A search on the internet shows that this model has thermal problems even under MS Windows. It seems the design of the cooling system in this machine is borderline.

On Windows the temperature reaches around 95 degrees while using resource-intensive apps running on the dGPU (which is not good, but it stays stable at that temperature).

Linux Mint with kernel version 5.4 and the radeon driver for the dGPU became stable at 100 degrees Celsius.

By the way, I have booted into kernel version 5.17 and tested the dGPU with the settings we made on Manjaro.

The system reaches approximately 110 degrees C, which is the temperature that causes the instant shutdown, but at about 108 degrees the dGPU starts lowering its clock rates in order to stay at 108 degrees,
which is just a second away from an instant shutdown.
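
The clock throttling described here can be watched directly, at least when the amdgpu driver exposes pp_dpm_sclk for the card (the radeon driver does not, and older chips may lack it). A minimal sketch, assuming the dGPU is card0; on a hybrid laptop it may well be card1. The helper name current_level is made up:

```shell
# current_level: read pp_dpm_sclk-style text on stdin and print the clock marked with "*"
current_level() {
    awk '/[*]/ { print $2 }'
}

# Assumption: the dGPU is card0; adjust the index if it is not
sclk=/sys/class/drm/card0/device/pp_dpm_sclk
if [ -r "$sclk" ]; then
    echo "current shader clock: $(current_level < "$sclk")"
fi
```

Running it repeatedly under load should show the marked level dropping once the card gets close to its limit.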

Here is a post on Reddit with the same issue. You could check it out, but there is really no solution in it, except to not use Linux until another stable update and then check whether it is better.

My issue is almost the same as the Reddit one (the only difference is the laptop model, which differs by only 2 letters). I had already done everything the post mentions and faced the same problems afterwards (a missing pp_od_clk_voltage, for example).

Since I don't really want to leave Linux behind, can I move all my games (and other resource-intensive stuff) to Windows and keep using Linux as a daily driver? Thanks to CoreCtrl the system temperature does not go beyond 70 degrees C there, and it's very quiet.

And another thing I want to ask:
I have a little experience with C++ (I’ve been using it for about 9 years, but I'm only as experienced as someone who’s been using it for 2 years), so I think I could maybe help fix this hardware bug.
My question is: should I try this sometime, or let the gearheads do their thing?

That’s what I already replied: even with Windows the thermal regulation is borderline, and if you research it you’ll see that the thermal design of this model isn’t a superior piece of engineering. I doubt that you’ll get better values with this system, and indeed 95°C will bake your thermal compound very quickly. Try the alternative with nbfc and create your own thermal profile.