RAID 0 software RAID issues (?)

Hey there guys,

I’ve successfully set up a RAID 0 with two of my NVMe drives. The raid is working well (in the sense that it is usable).

However, I’ve observed a few things that made me question whether it really is working well.

I set up the raid following this Arch Linux guide.

When I tried to do my first scrubbing via

echo check > /sys/block/md127/md/sync_action

it gave me a permission error:

LC_ALL=C echo check > /sys/block/md127/md/sync_action                                       
zsh: keine Berechtigung: /sys/block/md127/md/sync_action

(I don’t know why LC_ALL=C isn’t working here, but never mind; the message just says I don’t have permission.)

Three things I’m wondering now:

  • Why is it giving me permission errors?
  • Why does my raid have this arbitrary number 127 in the first place? Did I set something up wrong?
  • Why do the UUIDs in /etc/mdadm.conf and from blkid /dev/md127 not match?

In the conf it says 63b5e92f:a8276105:cb8562bd:b69717d6 and blkid prints c8acd199-af63-4583-99a3-c9896b346668

Here’s an excerpt from my /etc/mdadm.conf:

ARRAY /dev/md/DownloadCache metadata=1.2 UUID=63b5e92f:a8276105:cb8562bd:b69717d6
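For what it's worth, these two UUIDs are probably not supposed to match. As far as I know, the UUID in /etc/mdadm.conf is the array UUID stored in the md superblock, while blkid /dev/md127 reports the UUID of the filesystem that lives on the array. Running blkid on one of the member partitions should show the array UUID instead, written with dashes rather than colons. A minimal sketch for comparing the two notations; the function name mdadm_uuid_to_blkid is made up for this example:

```shell
# Hypothetical helper: rewrite an mdadm-style UUID (four colon-separated
# groups of 8 hex digits) in the dashed 8-4-4-4-12 notation used by blkid.
mdadm_uuid_to_blkid() {
    printf '%s\n' "$1" | tr -d ':' |
        sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
}

mdadm_uuid_to_blkid 63b5e92f:a8276105:cb8562bd:b69717d6
# prints 63b5e92f-a827-6105-cb85-62bdb69717d6
```

If that dashed value matches what blkid prints for the member partitions, the ARRAY line in mdadm.conf is consistent with the array on disk.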

My questions would now be: Is this expected or did I mess up at some point? If I did, how can I fix it? And also, why is the scrubbing command failing with a permission error?

Any help would be greatly appreciated.

I also tried starting from scratch, which didn’t work as expected. I just can’t seem to entirely delete all traces of that raid.

I must have taken a wrong turn at some point that I can’t see right now.

I remember setting up raid using md/raid and I made some notes. Since the server is no longer in service, I don’t have more than those to offer.


Thank you Aarhus once again for chiming in.

I’m onto something and will report back with my findings and a possible solution/error analysis.

Sadly, neither my research nor your notes have helped me resolve the issue, sigh.

I’m still getting the permission error when running

sudo echo check > /sys/block/mdX/md/sync_action

You need root permissions to write to this sysfs file, and your command does not have them. A simple sudo echo cannot work, because the redirection is performed by your own shell with your user permissions; only echo itself runs as root.

For example, check the owner of the newly created file:

sudo echo TEST > /tmp/testfile
ls -l /tmp/testfile

You should switch to your root account

su -l 

and then try

echo check > /sys/block/mdX/md/sync_action

An alternative is to use tee.

tee works fine; it is one of the simplest programs in your /usr/bin.

LC_ALL=C echo check | sudo tee /sys/block/md127/md/sync_action
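The reason the tee variant gets around the redirection problem is that tee opens the target file itself, so whatever user tee runs as is the user that opens the file. A small sketch of that idea, tried out on a temporary file so it is harmless to run; write_via_tee and the SUDO variable are names invented for this example:

```shell
# write_via_tee FILE VALUE
# tee opens FILE itself, so when tee runs under sudo the file is opened as root.
# SUDO is a made-up variable for this sketch: leave it empty to run as the
# current user, or set SUDO=sudo for root-owned files.
write_via_tee() {
    printf '%s\n' "$2" | ${SUDO:-} tee "$1" > /dev/null
}

# Harmless demonstration on a temporary file:
demo_file=$(mktemp)
write_via_tee "$demo_file" check
cat "$demo_file"    # prints: check
```

With SUDO=sudo set, the same function could be pointed at a root-owned file such as the sync_action path from this thread.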

Thank you for chiming in. Sadly, both commands are still giving me permission errors. I don’t know how I can get past this.

LC_ALL=C echo check | sudo tee /sys/block/md0/md/sync_action                              

tee: /sys/block/md0/md/sync_action: Keine Berechtigung
check
[PowerTower ~]# echo check > /sys/block/md0/md/sync_action
-bash: /sys/block/md0/md/sync_action: Keine Berechtigung

Note that the raid device has changed from md127 to md0. I can’t seem to be able to start this task…

Have you checked your groups?

Switch to root context before executing the command

su -l root

Then make sure the block device name is correct; a simple copy/paste may not work

echo check > /sys/block/md127/md/sync_action

Is the md127 correct?

In the Arch wiki, the example commands are prefixed with a prompt.

The # stands for the root account
(use su to become root).

Just prepending sudo does not always work.

As you found out,
sudo echo check > ...
does not work the same way as
echo check > ... issued from the root account.

The intention of
LC_ALL=C
(I use LANG=C)
is to force the output into English.

Some commands are chained together, with the output of one serving as the input to the next.
A language other than English might break this, because an expected keyword will not be found when the output of the first command is in, say, German instead of English.
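This may also explain why LC_ALL=C seemed to have no effect at the top of the thread: a VAR=value prefix applies only to the command it precedes, while the "keine Berechtigung" message about the failed redirection is printed by the shell itself. A small sketch, using a made-up nonexistent path, that runs an inner sh under LC_ALL=C so that the shell's own message is affected too (exact wording differs between shells):

```shell
# Prefix form: only ls sees LC_ALL=C, so ls reports its error in English.
ls_msg=$(LC_ALL=C ls /nonexistent-demo-dir 2>&1) || true

# The redirection error comes from the shell, so the variable has to be in
# the environment of the shell that performs the redirection.
sh_msg=$(LC_ALL=C sh -c 'echo check > /nonexistent-demo-dir/file' 2>&1) || true

printf '%s\n%s\n' "$ls_msg" "$sh_msg"
```

Exporting LC_ALL=C in the interactive shell before running the command should have the same effect for that session.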

su -l root                                                                                    
Passwort: 
[PowerTower ~]# echo check > /sys/block/md0/md/sync_action
-bash: /sys/block/md0/md/sync_action: Keine Berechtigung
[PowerTower ~]#

That’s what I did, but sadly it still doesn’t like it.

I did not. And I’m not sure how I would go about inspecting the groups. Here’s the id of my own user, is there anything peculiar?

id donatus                                                                                  
uid=1000(donatus) gid=1001(donatus) Gruppen=1001(donatus),998(wheel),991(lp),3(sys),90(network),98(power),1000(autologin),959(docker)

Check the raid build process

cat /proc/mdstat

You need the process to be finished before you can do anything with it.

It is finished, I guess, and it’s showing the raid as active:

cat /proc/mdstat                                                                            
Personalities : [raid0] 
md0 : active raid0 nvme1n1p1[1] nvme2n1p1[0]
      1953257472 blocks super 1.2 512k chunks
      
unused devices: <none>
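To read the name and level of each array out of /proc/mdstat without eyeballing it, something like the following might do. It is a rough sketch that assumes the simple "mdX : active LEVEL ..." line layout shown above (arrays in other states may print extra words); mdstat_levels is a name invented here:

```shell
# Hypothetical helper: print "NAME LEVEL" for each active array found in
# mdstat-formatted text on stdin.
mdstat_levels() {
    awk '$2 == ":" && $3 == "active" { print $1, $4 }'
}

# Demonstration with the output pasted above; on a real system use:
#   mdstat_levels < /proc/mdstat
sample='Personalities : [raid0]
md0 : active raid0 nvme1n1p1[1] nvme2n1p1[0]
      1953257472 blocks super 1.2 512k chunks

unused devices: <none>'
printf '%s\n' "$sample" | mdstat_levels    # prints: md0 raid0
```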

If you list your block devices, you should get something like

[server1 ~]# lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE   MOUNTPOINTS
sda         8:0    0 223,6G  0 disk   
├─sda1      8:1    0   300M  0 part   /boot/efi
├─sda2      8:2    0 214,5G  0 part   /srv/samba/public
│                                     /srv/samba/data
│                                     /
└─sda3      8:3    0   8,8G  0 part   [SWAP]
sdb         8:16   0 447,1G  0 disk   
sdc         8:32   0 447,1G  0 disk   
sdd         8:48   0 447,1G  0 disk   
└─sdd1      8:49   0   400G  0 part   
  └─md127   9:127  0 399,9G  0 raid10 
sde         8:64   0 447,1G  0 disk   
└─sde1      8:65   0   400G  0 part   
  └─md127   9:127  0 399,9G  0 raid10 
sdf         8:80   1     0B  0 disk   
sdg         8:96   1     0B  0 disk   
sdh         8:112  1     0B  0 disk   
sdi         8:128  1     0B  0 disk   
sdj         8:144  1     0B  0 disk  

I do, sir!

lsblk                                                                           
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda           8:0    0  16,4T  0 disk  
└─sda2        8:2    0  16,4T  0 part  /mnt/disk1
sdb           8:16   0  16,4T  0 disk  
└─sdb2        8:18   0  16,4T  0 part  /mnt/disk2
sdc           8:32   0    20T  0 disk  
├─sdc1        8:33   0    16M  0 part  
└─sdc2        8:34   0    20T  0 part  /mnt/disk3
sdd           8:48   1     0B  0 disk  
sde           8:64   1     0B  0 disk  
nvme1n1     259:0    0 931,5G  0 disk  
└─nvme1n1p1 259:1    0 931,5G  0 part  
  └─md0       9:0    0   1,8T  0 raid0 /mnt/DownloadCache
nvme0n1     259:2    0   3,6T  0 disk  
├─nvme0n1p1 259:3    0   300M  0 part  /boot/efi
└─nvme0n1p2 259:4    0   3,6T  0 part  /
nvme2n1     259:5    0 931,5G  0 disk  
└─nvme2n1p1 259:6    0 931,5G  0 part  
  └─md0       9:0    0   1,8T  0 raid0 /mnt/DownloadCache
nvme3n1     259:7    0   3,6T  0 disk  
└─nvme3n1p1 259:8    0   3,6T  0 part  /mnt/EmbyData

Then, if you want to scrub it, you target the md0 device.

As root:

echo check > /sys/block/md0/md/sync_action

But… that’s what I’ve been doing for the past few hours, which sadly doesn’t work, as I said earlier:

sudo -i                                                                                       
[sudo] Passwort für donatus: 
[PowerTower ~]# echo check > /sys/block/md0/md/sync_action
-bash: /sys/block/md0/md/sync_action: Keine Berechtigung

That’s exactly where it always says “No permission”

Hmm, I missed that, sorry.

What happens if you use

sudo mdadm --action=check /dev/md0

But scrubbing is meant to locate disk errors and repair data before a failure. raid0 is just speed with no redundancy, so is it necessary in this case?

The raid I used was for redundancy and speed (raid10far2), and there a check is necessary to preemptively locate data errors before they become critical.
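That may in fact be the whole story. As far as I know, the md driver only offers check/repair for levels with redundancy, and my assumption (not confirmed in this thread) is that on a raid0 array the sync_action file does not exist at all. Trying to create a new file in sysfs is refused even for root, which would look exactly like the permission error above. A sketch for testing that assumption; md_scrub_supported and the MD_DIR variable are invented names, and the default path is the md0 array from this thread:

```shell
# Hypothetical helper: succeed only if the given md sysfs directory
# (e.g. /sys/block/md0/md) exposes a sync_action attribute.
md_scrub_supported() {
    [ -e "$1/sync_action" ]
}

# MD_DIR is a made-up override; by default look at md0.
md_dir=${MD_DIR:-/sys/block/md0/md}
if [ ! -d "$md_dir" ]; then
    echo "$md_dir: no such array on this machine"
elif md_scrub_supported "$md_dir"; then
    echo "$md_dir: level $(cat "$md_dir/level"), scrub available"
else
    echo "$md_dir: level $(cat "$md_dir/level"), no sync_action attribute"
fi
```

If this reports no sync_action attribute for a raid0 array, the array itself is most likely set up fine and there is simply nothing to scrub.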

No problem at all, sir!

That’s exactly what I’ve tried earlier, too! And I was wondering why this doesn’t work either!

sudo mdadm --action=check /dev/md0

LC_ALL=C sudo mdadm --action=check /dev/md0echo                                             
mdadm: Couldn't open /dev/md0echo: No such file or directory

Which is weird, as lsblk reports it as “/dev/md0”.

That’s an entirely reasonable question. It’s not that I think it is urgent or necessary. It’s just that I tried it earlier and was never able to run it, so I began asking myself whether I had even set up the raid correctly.