HDD spinning up every 10min

Not yet, no.
I was planning on switching to pure Arch in the foreseeable future anyway, so I think I’ll expedite that and do it in the next few days. That way we can also cross-check whether it’s something Manjaro-specific or affecting upstream as well.

No, I can’t restart my system; I have to wait until at least tomorrow. I will also try to downgrade my kernel, because the sound of the restarting drive is quite annoying at night. I will try the rule first and report back here.

Edit: I’ve managed to restart it today, but adding the rule doesn’t help.
This “Synchronizing SCSI cache” looks suspicious, though.

Edit 2: downgrading the kernel to 5.14.0 or 5.13.8 does NOT resolve this issue either.

Edit 3: I’ve also downgraded systemd, acpid, and udisks2, but that didn’t help either. I’m out of ideas for now. The downgrades were done from the local pacman cache, as sketched below.
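
For anyone wanting to reproduce the downgrades: installing an older version from the local pacman cache looks roughly like this (the filename/version below is illustrative only; yours will differ):

# install an older package version straight from the local cache
# (check /var/cache/pacman/pkg/ for the exact filenames on your system)
sudo pacman -U /var/cache/pacman/pkg/udisks2-2.9.2-1-x86_64.pkg.tar.zst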

[parsec Pulpit]# dmesg | grep sda
[ 0.901385] sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
[ 0.901386] sd 0:0:0:0: [sda] 4096-byte physical blocks
[ 0.901390] sd 0:0:0:0: [sda] Write Protect is off
[ 0.901391] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 0.901397] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 0.920378] sda:
[ 0.933017] sd 0:0:0:0: [sda] Attached SCSI disk
[ 76.167697] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 76.168627] sd 0:0:0:0: [sda] Stopping disk
[ 208.499527] sd 0:0:0:0: [sda] Starting disk
[parsec Pulpit]# dmesg | grep sdb
[ 1.366685] sd 1:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
[ 1.366687] sd 1:0:0:0: [sdb] 4096-byte physical blocks
[ 1.366691] sd 1:0:0:0: [sdb] Write Protect is off
[ 1.366692] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 1.366698] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.368583] sdb:
[ 1.394590] sd 1:0:0:0: [sdb] Attached SCSI disk
[parsec Pulpit]# dmesg | grep SCSI
[ 0.377699] SCSI subsystem initialized
[ 0.410850] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 243)
[ 0.933017] sd 0:0:0:0: [sda] Attached SCSI disk
[ 1.394590] sd 1:0:0:0: [sdb] Attached SCSI disk
[ 1.871869] sd 2:0:0:0: [sdc] Attached SCSI disk
[ 2.341434] sd 3:0:0:0: [sdd] Attached SCSI disk
[ 76.167697] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 242.222462] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[parsec Pulpit]#


I’ve opened my PC and manually swapped the HDD RAID’s cables with the SSD RAID’s.
Now the issue is affecting my SSD, but at least it doesn’t spin up/down.

I’ve been using Manjaro since 2013 on various PCs, ranging from an overclocked VIA C3 terminal to this Ryzen 5950X, and I haven’t had an issue as annoying as this one. It even wakes the sleeping sda drive just to put it to sleep again, and it’s impossible to listen to music from a 400 MB/s RAID because it stops the whole array for a couple of seconds to respin the disk!

My Ivy Bridge laptop is also affected by this (with an SSD). But my Haswell server is NOT; it has a Manjaro MATE install from 2015.

Both affected machines are running quite fresh Cinnamon installs. Maybe Cinnamon 5.0 is to blame?

Here is how it looks with the SSD as sda:

[parsec Pulpit]# dmesg | grep sda
[ 0.734406] sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
[ 0.734407] sd 0:0:0:0: [sda] 4096-byte physical blocks
[ 0.734411] sd 0:0:0:0: [sda] Write Protect is off
[ 0.734412] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 0.734417] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 0.745079] sda:
[ 0.765039] sd 0:0:0:0: [sda] supports TCG Opal
[ 0.765041] sd 0:0:0:0: [sda] Attached SCSI disk
[ 191.010273] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 191.012103] sd 0:0:0:0: [sda] Stopping disk
[ 220.239374] sd 0:0:0:0: [sda] Starting disk
[ 238.674932] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 238.676783] sd 0:0:0:0: [sda] Stopping disk
[ 250.960330] sd 0:0:0:0: [sda] Starting disk
[ 267.756883] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 267.759015] sd 0:0:0:0: [sda] Stopping disk
[ 274.866747] sd 0:0:0:0: [sda] Starting disk

[ 1668.593844] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 1668.595681] sd 0:0:0:0: [sda] Stopping disk
[ 1698.252385] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[ 1698.253247] ata1.00: supports DRM functions and may not be fully accessible
[ 1698.253974] ata1.00: supports DRM functions and may not be fully accessible
[ 1698.254499] ata1.00: configured for UDMA/133
[ 1698.264561] ahci 0000:02:00.1: port does not support device sleep
[ 1698.264600] sd 0:0:0:0: [sda] Starting disk


I have a similar issue, described in “Hard drive (HDD) endlessly cycling stopping/starting disk when laptop is on battery”.
I tried turning off file indexing (along with balooctl disable) and the mentioned udev rule ACTION=="add", SUBSYSTEM=="block", ATTR{events_poll_msecs}="0" (with both 0 and -1), rebooting each time, of course.
Nothing helped, unfortunately.
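
For reference, this is roughly how such a rule gets installed (the file name below is arbitrary; any name ending in .rules under /etc/udev/rules.d/ works):

# /etc/udev/rules.d/90-disable-media-polling.rules
ACTION=="add", SUBSYSTEM=="block", ATTR{events_poll_msecs}="0"

# apply without waiting for a reboot
sudo udevadm control --reload
sudo udevadm trigger --subsystem-match=block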

What about trying earlier kernel generations, and/or passing verbose logging / debug parameters on the kernel command line (the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub)?
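
A sketch of what that would look like (loglevel=7 and printk.devkmsg=on are just common choices for verbose kernel logging, not the only ones; drop "quiet" if it’s present, since it lowers the loglevel):

# in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="loglevel=7 printk.devkmsg=on"

# regenerate the config so the change takes effect on the next boot
sudo grub-mkconfig -o /boot/grub/grub.cfg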

I’ve reverted back to 5.9.11 from Nov ’20 and the problem persisted. My laptop failed to boot 5.5.15 from Apr ’20, so I’m not going to play with kernels anymore. This doesn’t seem like a kernel issue anyway.

OK, I’ve now switched to Arch and the behavior isn’t showing anymore, so it has to be something Manjaro-specific and not coming from upstream.

So there is only one spin-up event in journalctl -b -k during the Arch boot, and no more for several hours afterwards?
I’ve heard that Arch has a few more issues than Manjaro. How does one manage Arch in long-term usage, if we can’t even diagnose the source of this problem on Manjaro? Arch may be heavier in terms of how much technical material there is to learn and test. Maybe I’m wrong.
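
For reference, something like this shows only the relevant events (the grep pattern matches the kernel messages quoted earlier in this thread):

# kernel messages from the current boot, start/stop events only
journalctl -b -k | grep -E 'Starting disk|Stopping disk'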

Pretty much, yeah - no

[ 6021.369285] sd 8:0:0:0: [sda] Starting disk
[ 6047.567109] sd 8:0:0:0: [sda] Synchronizing SCSI cache
[ 6047.567373] sd 8:0:0:0: [sda] Stopping disk

at all; just some entries 3 s into the boot, then nothing more.
I’ve no real idea how to narrow down which configuration or service or whatnot is doing this - but if someone has an idea and wants to compare, I’m glad to help :slight_smile:

Why `journalctl -k` might be preferable to `dmesg`


Hi! I have the latest Arch Linux (not Manjaro), and the same problem reproduces.

By using fatrace and some other magic, I traced the events to the udisks daemon.
sudo systemctl stop udisks2.service
This command temporarily stops the problem until reboot (as I understand it, the service is critical to the system and can’t be fully disabled).
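
In case it helps someone reproduce the tracing, a rough sketch (fatrace reports which process touches which file; flags may differ slightly between versions):

# record all file-access events system-wide for 60 seconds, with timestamps
sudo fatrace -t -s 60 -o /tmp/fatrace.log

# then look for the suspect daemon in the log
grep udisksd /tmp/fatrace.log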

There are also a lot of angry posts on the Ubuntu tracker: https://bugs.launchpad.net/ubuntu/+source/udisks2/+bug/1281588
And there is an issue on GitHub: Please let me configure the housekeeping interval or otherwise unbreak externally configured spindown · Issue #407 · storaged-project/udisks · GitHub


Found an even better solution. Stopping and masking the service either breaks some functionality, or the system randomly restarts the service. It looks like the core problem is udisks polling SMART data. We can disable SMART data, and udisks will stop doing its shit.

smartctl --smart=off /dev/sda

This solution persists across reboots. Stopping / disabling / masking udisks is not required.

Original solution from here: https://bugs.launchpad.net/ubuntu/+source/udisks2/+bug/1373318
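
To verify the change took effect (and to undo it later):

# "SMART support is: Disabled" should now appear in the output
sudo smartctl -i /dev/sda

# re-enable if you change your mind
sudo smartctl --smart=on /dev/sda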


I think it might be related to KDE’s SMART status utility. I disabled the checking for my particular hard drive, and it seems to have stopped spinning up now. IIRC that utility is a fairly recent addition, so maybe that’s why this only started recently.

Edit: Never mind, it started spinning up again; it must’ve been a coincidence that it stopped after I disabled it in the SMART utility :frowning:


Doesn’t that effectively disable SMART self-monitoring entirely?

You won’t see any new errors reported in the SMART logs for the drive, and you may miss out on an early warning of a pre-fail drive.

-s VALUE, --smart=VALUE
Enables or disables SMART on device.
The valid arguments to this option are on and off.

Not solely, at least - I’ve got smartmontools and the KDE SMART utility now, too (and I just checked - it’s working and showing info about sda without a single spin-up logged :wink: ) and I don’t see the behavior anymore. So there must be some other factor involved. But at least we found out it’s a known issue, so thanks @vtyulb!


That is probably true. However, spinning up an HDD every 10 minutes will destroy the drive sooner or later (my guess is soon enough), so that is the bigger concern. In my case, there are only Windows and a boot partition on the HDD, so I don’t care if the HDD dies without warning.


And once non-spinning drives (SATA SSD / NVMe / who knows what the future holds) become the norm, all of these concerns will be nothing more than historical artifacts to read to our grandchildren before bedtime. :open_book: :bed:

Many, and likely the majority, of retail consumer PCs (especially laptops, but even desktops) only have solid-state storage. We just need the big manufacturers (e.g., Western Digital and Seagate) to start making higher profits from producing and selling SSDs than HDDs at large scale.

There are probably still two benefits of HDDs as a storage class over SSDs:
- bigger capacity per single drive;
- lower price (cost per TB of data).

And if you already have one, there’s no extra money to spend to give your PC that extra storage capacity. It can stay useful for a decade for backups and multimedia files.

PS
Did anybody submit a bug report? Maybe post a link here? If everybody solves problems only locally, the issue count will only increase.

I opened a bug for the Arch Linux udisks package: https://bugs.archlinux.org/task/72543
Not sure if this is the correct way, but it’s the cheapest solution (just apply a local Arch Linux patch changing 10 minutes → 13 hours, and generate a new package); see the rebuild sketch below.

The best way would be to let the user configure how udisks works, or maybe to completely disable the SMART timer in udisks. However, the GitHub issue was opened in 2017, and it doesn’t look like anyone will fix it. As I understand it, every distro has its own patches for this purpose.
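
For completeness, the general Arch rebuild workflow looks roughly like this; asp is one way to fetch the packaging files (newer tooling uses pkgctl instead, so adjust to whatever is current), and locating the 10-minute interval in the udisks source is left to whoever writes the patch:

# check out the packaging files for udisks2
asp checkout udisks2
cd udisks2/trunk

# add a patch to the PKGBUILD that raises the housekeeping interval,
# then build and install the patched package
makepkg -si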

