Hello everyone!
Pretty much since the last update, I have been getting crashes and seeing hardware errors in the log.
Granted, my machine is 7 years old and scheduled for replacement, but I still need it to work for another 6 months. It has been running perfectly for the last 16 months, so this is a new thing.
The last crash happened at night while the computer was idle.
Here is the error:
22.01.21 11:53	kernel	smpboot: CPU0: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz (family: 0x6, model: 0x3c, stepping: 0x3)
22.01.21 11:53	kernel	mce: [Hardware Error]: Machine check events logged
22.01.21 11:53	kernel	mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 3: f200000000800400
22.01.21 11:53	kernel	mce: [Hardware Error]: TSC 0 
22.01.21 11:53	kernel	mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 0 microcode 28
22.01.21 11:53	kernel	Performance Events: PEBS fmt2+, Haswell events, 16-deep LBR, full-width counters, Intel PMU driver.
22.01.21 11:53	kernel	... version:                3
22.01.21 11:53	kernel	... bit width:              48
22.01.21 11:53	kernel	... generic registers:      4
22.01.21 11:53	kernel	... value mask:             0000ffffffffffff
22.01.21 11:53	kernel	... max period:             00007fffffffffff
22.01.21 11:53	kernel	... fixed-purpose events:   3
22.01.21 11:53	kernel	... event mask:             000000070000000f
22.01.21 11:53	kernel	rcu: Hierarchical SRCU implementation.
22.01.21 11:53	kernel	NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
22.01.21 11:53	kernel	smp: Bringing up secondary CPUs ...
22.01.21 11:53	kernel	x86: Booting SMP configuration:
22.01.21 11:53	kernel	.... node  #0, CPUs:      #1
22.01.21 11:53	kernel	mce: [Hardware Error]: Machine check events logged
22.01.21 11:53	kernel	mce: [Hardware Error]: CPU 1: Machine Check: 0 Bank 3: f200000000800400
22.01.21 11:53	kernel	mce: [Hardware Error]: TSC 0 
22.01.21 11:53	kernel	mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 2 microcode 28
22.01.21 11:53	kernel	 #2
22.01.21 11:53	kernel	mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 3: fe00000000800400
22.01.21 11:53	kernel	mce: [Hardware Error]: TSC 0 ADDR ffffffffc1c2525e MISC ffffffffc1c2525e 
22.01.21 11:53	kernel	mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 4 microcode 28
22.01.21 11:53	kernel	 #3
22.01.21 11:53	kernel	mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 3: fe00000000800400
22.01.21 11:53	kernel	mce: [Hardware Error]: TSC 0 ADDR ffffffff8ff948a1 MISC ffffffff8ff948a1 
22.01.21 11:53	kernel	mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 6 microcode 28
22.01.21 11:53	kernel	 #4
22.01.21 11:53	kernel	MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
22.01.21 11:53	kernel	 #5 #6 #7
22.01.21 11:53	kernel	smp: Brought up 1 node, 8 CPUs
22.01.21 11:53	kernel	smpboot: Max logical packages: 1
22.01.21 11:53	kernel	smpboot: Total of 8 processors activated (56019.80 BogoMIPS)
22.01.21 11:53	kernel	devtmpfs: initialized
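To check what the TIME field in those records means, I put together a small sketch. It assumes TIME is a Unix timestamp in seconds and that my local zone is CET (UTC+1 in January); the helper name `mce_time` is my own, not something from the kernel.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the machine runs on CET, which is UTC+1 in January (no DST).
CET = timezone(timedelta(hours=1), "CET")


def mce_time(epoch_seconds: int) -> datetime:
    """Convert the TIME field of an mce log line to a local datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=CET)


# TIME value copied from the log above.
print(mce_time(1611312778).isoformat())  # 2021-01-22T11:52:58+01:00
```

That lands a couple of seconds before the 11:53 log lines, so the records seem to carry the time of this boot.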
I did a bit of digging, and the errors started the day after my last update, which I ran around midnight before Jan 20th.
journalctl -p emerg    
-- Journal begins at Fri 2020-12-25 05:04:21 CET, ends at Fri 2021-01-22 13:01:49 CET. --
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: be00000000800400
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff9992b212 MISC ffffffff9992b212 
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611146408 SOCKET 0 APIC 0 microcode 28
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 3: be00000000800400
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff9992b212 MISC ffffffff9992b212 
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611146408 SOCKET 0 APIC 6 microcode 28
-- Boot 5d00180b84de4a24a54fd1d1a038f2a8 --
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: be00000000800400
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff90a957a0 MISC ffffffff90a957a0 
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611153763 SOCKET 0 APIC 0 microcode 28
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 3: be00000000800400
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff90a957a0 MISC ffffffff90a957a0 
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611153763 SOCKET 0 APIC 6 microcode 28
-- Boot dc7caea2c28946349ec4d476e0aa63c9 --
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 3: f200000000800400
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: TSC 0 
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 0 microcode 28
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: CPU 1: Machine Check: 0 Bank 3: f200000000800400
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: TSC 0 
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 2 microcode 28
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 3: fe00000000800400
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffffc1c2525e MISC ffffffffc1c2525e 
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 4 microcode 28
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 3: fe00000000800400
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff8ff948a1 MISC ffffffff8ff948a1 
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 6 microcode 28
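In case it helps, here is a rough sketch of how I tried to decode the status words from the lines above. It assumes the architectural IA32_MCi_STATUS layout as I understand it from the Intel SDM (flag bits 63 to 57, model-specific code in bits 31:16, MCA error code in bits 15:0). The names `FLAGS` and `decode_status` are my own, and I may be misreading the manual.

```python
# Assumed layout of the upper flag bits, per my reading of the Intel SDM.
FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register is valid
    58: "ADDRV",  # ADDR register is valid
    57: "PCC",    # processor context corrupt
}


def decode_status(status_hex: str) -> dict:
    """Split a machine check status word into flags and error codes."""
    status = int(status_hex, 16)
    return {
        "flags": [name for bit, name in FLAGS.items() if (status >> bit) & 1],
        "model_code": (status >> 16) & 0xFFFF,
        "mca_code": status & 0xFFFF,
    }


# Status values copied from the journal output above.
for value in ("f200000000800400", "fe00000000800400", "be00000000800400"):
    info = decode_status(value)
    print(value, info["flags"], hex(info["model_code"]), hex(info["mca_code"]))
```

All three values share the MCA error code 0x400, which the SDM lists as an internal timer error if I read the table correctly, and all have UC and PCC set. The only differences are the OVER, MISCV and ADDRV flags.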
Could anyone give me tips on how to tackle this issue?
Thank you,
Beer