System hangs on network activation

Hi, I just installed Manjaro as a bare CLI system, and it hangs as soon as I activate the network connection. Kernels 5.4 and 5.6 (from Architect) work fine; on 5.9, 5.10, 5.11 and 5.12rc5 the issue occurs, I can't even reboot the system, and some other commands hang too. I found an email on LKML about the issue, describing cfg80211 getting deadlocked, which was fixed in June 2020. Is there a way to get it working?

[  245.353452] INFO: task kworker/u64:7:236 blocked for more than 122 seconds.
[  245.353455]       Not tainted 5.9.16-1-MANJARO #1
[  245.353456] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  245.353458] task:kworker/u64:7   state:D stack:    0 pid:  236 ppid:     2 flags:0x00004000
[  245.353482] Workqueue: events_power_efficient reg_check_chans_work [cfg80211]
[  245.353483] Call Trace:
[  245.353489]  __schedule+0x292/0x830
[  245.353493]  schedule+0x46/0xf0
[  245.353495]  schedule_preempt_disabled+0x14/0x20
[  245.353497]  __mutex_lock.constprop.0+0x180/0x530
[  245.353499]  ? _raw_spin_lock+0x13/0x30
[  245.353517]  reg_check_chans_work+0x2d/0x3f0 [cfg80211]
[  245.353522]  process_one_work+0x1da/0x3d0
[  245.353524]  worker_thread+0x4d/0x3d0
[  245.353526]  ? rescuer_thread+0x410/0x410
[  245.353528]  kthread+0x142/0x160
[  245.353530]  ? __kthread_bind_mask+0x60/0x60
[  245.353533]  ret_from_fork+0x22/0x30
[  245.353551] INFO: task ip:1317 blocked for more than 122 seconds.
[  245.353552]       Not tainted 5.9.16-1-MANJARO #1
[  245.353552] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  245.353553] task:ip              state:D stack:    0 pid: 1317 ppid:  1304 flags:0x00004080
[  245.353555] Call Trace:
[  245.353557]  __schedule+0x292/0x830
[  245.353560]  schedule+0x46/0xf0
[  245.353562]  schedule_preempt_disabled+0x14/0x20
[  245.353563]  __mutex_lock.constprop.0+0x180/0x530
[  245.353571]  igb_resume+0xff/0x1d0 [igb]
[  245.353576]  pci_pm_runtime_resume+0xaa/0xc0
[  245.353578]  ? pci_pm_freeze_noirq+0x110/0x110
[  245.353581]  __rpm_callback+0xc5/0x170
[  245.353584]  rpm_callback+0x4f/0x70
[  245.353585]  ? pci_pm_freeze_noirq+0x110/0x110
[  245.353587]  rpm_resume+0x5d7/0x820
[  245.353590]  __pm_runtime_resume+0x3b/0x60
[  245.353593]  __dev_open+0x63/0x1b0
[  245.353596]  __dev_change_flags+0x1a5/0x210
[  245.353598]  dev_change_flags+0x21/0x60
[  245.353601]  do_setlink+0x2bc/0x1160
[  245.353607]  ? __nla_validate_parse+0x5f/0x910
[  245.353610]  __rtnl_newlink+0x65f/0x9e0
[  245.353620]  rtnl_newlink+0x44/0x70
[  245.353622]  rtnetlink_rcv_msg+0x13e/0x390
[  245.353625]  ? rtnl_calcit.isra.0+0x120/0x120
[  245.353627]  netlink_rcv_skb+0x75/0x140
[  245.353630]  netlink_unicast+0x242/0x340
[  245.353632]  netlink_sendmsg+0x243/0x480
[  245.353636]  sock_sendmsg+0x5e/0x60
[  245.353638]  ____sys_sendmsg+0x25a/0x2a0
[  245.353640]  ? copy_msghdr_from_user+0x6e/0xa0
[  245.353642]  ___sys_sendmsg+0x97/0xe0
[  245.353646]  __sys_sendmsg+0x81/0xd0
[  245.353649]  do_syscall_64+0x33/0x40
[  245.353651]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  245.353654] RIP: 0033:0x7f54dfc31737
[  245.353655] RSP: 002b:00007ffe8adaa928 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  245.353656] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f54dfc31737
[  245.353657] RDX: 0000000000000000 RSI: 00007ffe8adaa990 RDI: 0000000000000003
[  245.353658] RBP: 00000000607c3ca8 R08: 0000000000000001 R09: 00007f54dfcf2a60
[  245.353659] R10: 0000000000000230 R11: 0000000000000246 R12: 0000000000000001
[  245.353659] R13: 00007ffe8adaaa60 R14: 0000000000000000 R15: 000055722d3b8020
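
For anyone reading a trace like the one above, here is a minimal sketch (not an official tool) that pulls out which tasks the hung-task detector reported and which kernel modules show up in their call traces. The function name `summarize_hung_tasks` and the regular expressions are my own assumptions, written against the log format shown in this post; other kernel versions may print it differently.

```python
import re

# Header printed by the hung-task detector, e.g.
# "[  245.353452] INFO: task kworker/u64:7:236 blocked for more than 122 seconds."
HEADER = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

# A call-trace frame, optionally tagged with a module, e.g.
# "[  245.353571]  igb_resume+0xff/0x1d0 [igb]"
FRAME = re.compile(
    r"\]\s+(?:\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def summarize_hung_tasks(log_text):
    """Return one dict per blocked task: name, pid, seconds, modules in its trace."""
    reports = []
    current = None
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            current = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "secs": int(header["secs"]),
                "modules": [],
            }
            reports.append(current)
            continue
        frame = FRAME.search(line)
        # Only frames that carry a [module] tag are recorded, without duplicates.
        if current is not None and frame and frame["module"]:
            if frame["module"] not in current["modules"]:
                current["modules"].append(frame["module"])
    return reports
```

Fed the text of `dmesg` (for example, saved to a file and read back in), it returns one entry per blocked task, which makes it easier to see at a glance that both cfg80211 and igb are waiting on a mutex here.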

System info:

System:
  Kernel: 5.4.108-1-MANJARO x86_64 bits: 64 compiler: gcc v: 10.2.0 
  parameters: BOOT_IMAGE=/boot/vmlinuz-5.4-x86_64 
  root=/dev/mapper/manjaro-system rw 
  cryptdevice=UUID=c862b0da-751c-4bbc-bd7c-fde691941c26:cryptroot quiet 
  udev.log_priority=3 
  Console: tty 1 Distro: Manjaro Linux base: Arch Linux 
Machine:
  Type: Desktop System: ASUS product: N/A v: N/A serial: N/A 
  Mobo: ASUSTeK model: ROG CROSSHAIR VII HERO v: Rev 1.xx serial: <filter> 
  UEFI: American Megatrends v: 4301 date: 03/04/2021 
Memory:
  RAM: total: 62.77 GiB used: 1.01 GiB (1.6%) 
  RAM Report: 
  missing: Required tool dmidecode not installed. Check --recommends 
CPU:
  Info: 8-Core model: AMD Ryzen 7 2700X bits: 64 type: MT MCP arch: Zen+ 
  family: 17 (23) model-id: 8 stepping: 2 microcode: 800820D cache: L2: 4 MiB 
  bogomips: 118225 
  Speed: 2415 MHz min/max: 2200/3700 MHz boost: enabled Core speeds (MHz): 
  1: 2415 2: 1714 3: 2087 4: 2067 5: 2194 6: 2190 7: 2187 8: 2191 9: 1718 
  10: 1922 11: 1974 12: 1886 13: 1886 14: 1960 15: 3723 16: 2085 
  Flags: 3dnowprefetch abm adx aes aperfmperf apic arat avic avx avx2 bmi1 
  bmi2 bpext clflush clflushopt clzero cmov cmp_legacy constant_tsc cpb cpuid 
  cr8_legacy cx16 cx8 de decodeassists extapic extd_apicid f16c flushbyasid 
  fma fpu fsgsbase fxsr fxsr_opt ht hw_pstate ibpb irperf lahf_lm lbrv lm mca 
  mce misalignsse mmx mmxext monitor movbe msr mtrr mwaitx nonstop_tsc nopl 
  npt nrip_save nx osvw overflow_recov pae pat pausefilter pclmulqdq pdpe1gb 
  perfctr_core perfctr_llc perfctr_nb pfthreshold pge pni popcnt pse pse36 
  rdrand rdseed rdtscp rep_good sep sev sha_ni skinit smap smca sme smep ssbd 
  sse sse2 sse4_1 sse4_2 sse4a ssse3 succor svm svm_lock syscall tce topoext 
  tsc tsc_scale v_vmsave_vmload vgif vmcb_clean vme vmmcall wdt xgetbv1 xsave 
  xsavec xsaveerptr xsaveopt xsaves 
  Vulnerabilities: Type: itlb_multihit status: Not affected 
  Type: l1tf status: Not affected 
  Type: mds status: Not affected 
  Type: meltdown status: Not affected 
  Type: spec_store_bypass 
  mitigation: Speculative Store Bypass disabled via prctl and seccomp 
  Type: spectre_v1 
  mitigation: usercopy/swapgs barriers and __user pointer sanitization 
  Type: spectre_v2 mitigation: Full AMD retpoline, IBPB: conditional, STIBP: 
  disabled, RSB filling 
  Type: srbds status: Not affected 
  Type: tsx_async_abort status: Not affected 
Network:
  Device-1: Intel I211 Gigabit Network vendor: ASUSTeK driver: igb v: 5.6.0-k 
  port: e000 bus-ID: 05:00.0 chip-ID: 8086:1539 class-ID: 0200 
  IF: enp5s0 state: up speed: 1000 Mbps duplex: full mac: <filter> 
  IP v4: <filter> scope: global broadcast: <filter> 
  IP v6: <filter> scope: link 
  Device-2: Intel Wireless-AC 9260 driver: iwlwifi v: kernel port: e000 
  bus-ID: 07:00.0 chip-ID: 8086:2526 class-ID: 0280 
  IF: wlp7s0 state: down mac: <filter> 
  WAN IP: <filter> 
Bluetooth:
  Device-1: Intel Wireless-AC 9260 Bluetooth Adapter type: USB driver: btusb 
  v: 0.8 bus-ID: 1-10:5 chip-ID: 8087:0025 class-ID: e001 
  Report: rfkill ID: hci0 rfk-id: 1 state: down bt-service: disabled 
  rfk-block: hardware: no software: no address: see --recommends 

Was your computer resumed (from suspend or hibernation) before this happened?

Thanks for the pointers. I connected the dots: the cause was Windows 10 and its fast boot. I disconnected the computer from power for 10 seconds to clear the leftover state, and the new kernels work now.

Not sure if it counts as a regression or not.

You will run into this problem again in the future. It's called a side effect, and the solution is:

  • When rebooting from Windows to Linux, cold boot the system (i.e., power off in Windows, then press the power button again to boot into Linux).
  • When booting from Linux to Windows, you can still warm boot.

Why, oh why?

Because Windows has a monopoly, they don’t care that their network drivers mess up the UEFI system as they re-init said system in their drivers, whereas Linux expects a clean system.

:sob:
