Hello everyone!
Ever since the last update I have been getting crashes and have started seeing hardware errors.
Granted, my machine is 7 years old and scheduled for replacement, but I still need it working for another 6 months. It has been running perfectly for the last 16 months, so this is new.
The last crash happened at night while the computer was idle.
Here is the error:
22.01.21 11:53 kernel smpboot: CPU0: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz (family: 0x6, model: 0x3c, stepping: 0x3)
22.01.21 11:53 kernel mce: [Hardware Error]: Machine check events logged
22.01.21 11:53 kernel mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 3: f200000000800400
22.01.21 11:53 kernel mce: [Hardware Error]: TSC 0
22.01.21 11:53 kernel mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 0 microcode 28
22.01.21 11:53 kernel Performance Events: PEBS fmt2+, Haswell events, 16-deep LBR, full-width counters, Intel PMU driver.
22.01.21 11:53 kernel ... version: 3
22.01.21 11:53 kernel ... bit width: 48
22.01.21 11:53 kernel ... generic registers: 4
22.01.21 11:53 kernel ... value mask: 0000ffffffffffff
22.01.21 11:53 kernel ... max period: 00007fffffffffff
22.01.21 11:53 kernel ... fixed-purpose events: 3
22.01.21 11:53 kernel ... event mask: 000000070000000f
22.01.21 11:53 kernel rcu: Hierarchical SRCU implementation.
22.01.21 11:53 kernel NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
22.01.21 11:53 kernel smp: Bringing up secondary CPUs ...
22.01.21 11:53 kernel x86: Booting SMP configuration:
22.01.21 11:53 kernel .... node #0, CPUs: #1
22.01.21 11:53 kernel mce: [Hardware Error]: Machine check events logged
22.01.21 11:53 kernel mce: [Hardware Error]: CPU 1: Machine Check: 0 Bank 3: f200000000800400
22.01.21 11:53 kernel mce: [Hardware Error]: TSC 0
22.01.21 11:53 kernel mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 2 microcode 28
22.01.21 11:53 kernel #2
22.01.21 11:53 kernel mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 3: fe00000000800400
22.01.21 11:53 kernel mce: [Hardware Error]: TSC 0 ADDR ffffffffc1c2525e MISC ffffffffc1c2525e
22.01.21 11:53 kernel mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 4 microcode 28
22.01.21 11:53 kernel #3
22.01.21 11:53 kernel mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 3: fe00000000800400
22.01.21 11:53 kernel mce: [Hardware Error]: TSC 0 ADDR ffffffff8ff948a1 MISC ffffffff8ff948a1
22.01.21 11:53 kernel mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 6 microcode 28
22.01.21 11:53 kernel #4
22.01.21 11:53 kernel MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
22.01.21 11:53 kernel #5 #6 #7
22.01.21 11:53 kernel smp: Brought up 1 node, 8 CPUs
22.01.21 11:53 kernel smpboot: Max logical packages: 1
22.01.21 11:53 kernel smpboot: Total of 8 processors activated (56019.80 BogoMIPS)
22.01.21 11:53 kernel devtmpfs: initialized
I did a bit of digging, and the errors started the day after the last update, which I installed around midnight going into Jan 20th.
journalctl -p emerg
-- Journal begins at Fri 2020-12-25 05:04:21 CET, ends at Fri 2021-01-22 13:01:49 CET. --
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: be00000000800400
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff9992b212 MISC ffffffff9992b212
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611146408 SOCKET 0 APIC 0 microcode 28
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 3: be00000000800400
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff9992b212 MISC ffffffff9992b212
Jän 20 13:40:11 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611146408 SOCKET 0 APIC 6 microcode 28
-- Boot 5d00180b84de4a24a54fd1d1a038f2a8 --
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: be00000000800400
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff90a957a0 MISC ffffffff90a957a0
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611153763 SOCKET 0 APIC 0 microcode 28
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 3: be00000000800400
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff90a957a0 MISC ffffffff90a957a0
Jän 20 15:42:47 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611153763 SOCKET 0 APIC 6 microcode 28
-- Boot dc7caea2c28946349ec4d476e0aa63c9 --
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 3: f200000000800400
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: TSC 0
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 0 microcode 28
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: CPU 1: Machine Check: 0 Bank 3: f200000000800400
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: TSC 0
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 2 microcode 28
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 3: fe00000000800400
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffffc1c2525e MISC ffffffffc1c2525e
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 4 microcode 28
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 3: fe00000000800400
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff8ff948a1 MISC ffffffff8ff948a1
Jän 22 11:53:00 cheetah kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1611312778 SOCKET 0 APIC 6 microcode 28
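To make the status words a bit easier to read, here is a minimal Python sketch that splits the 64-bit value printed after "Bank N:" into its flag bits and error-code fields. It assumes the architectural IA32_MCi_STATUS layout from the Intel SDM (flags in bits 63 to 57, model-specific code in bits 31:16, MCA error code in bits 15:0). The function name decode_mci_status is made up for illustration, and this is only a rough reading aid, not a replacement for a proper decoder.

```python
# Flag bits of IA32_MCi_STATUS, keyed by bit position.
# Assumption: architectural layout as described in the Intel SDM.
FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds valid data
    57: "PCC",    # processor context may be corrupt
}


def decode_mci_status(hex_value):
    """Split a hex status word (hypothetical helper) into flags and code fields."""
    status = int(hex_value, 16)
    flags = [
        name
        for bit, name in sorted(FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    # The three distinct status values that appear in the logs above.
    for value in ("f200000000800400", "fe00000000800400", "be00000000800400"):
        decoded = decode_mci_status(value)
        print(
            value,
            ",".join(decoded["flags"]),
            "mca=0x%04x" % decoded["mca_error_code"],
            "model=0x%04x" % decoded["model_specific_code"],
        )
```

All three values decode to the same MCA error code, 0x0400, with UC and PCC set. As far as I can tell from the SDM's table of simple error codes, 0x0400 is listed as an internal timer error, but I would treat that reading as unconfirmed.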
Could anyone give me tips on how to tackle this issue?
Thank you,
Beer