How do I convert a single disk BTRFS filesystem to a 4 disk RAID0?

So here’s the scenario: I have a PC I use pretty much only for gaming/entertainment. Data protection is a low priority; anything that needs to be safe is backed up regularly.
I’m trying to figure out the best way to put my disks in an array that maximizes speed. I have four 2 TB NVMe SSDs. One of them has a fresh install of Manjaro with BTRFS. I remember reading somewhere that you can add disks to a BTRFS filesystem and change the array type at any time, but my google-fu is failing me.

So that leaves me with 2 questions:

How does one convert an existing BTRFS disk with Manjaro installed to a 4 disk RAID0?

Is that even the best option to get maximum speed from my disks?

Thanks!

You can find this in the btrfs wiki, in the section on RAID.

You can always find good information about Btrfs in the wiki.

(Wisdom lies in asking → listening → reading 😉)

Btrfs is fast if you use compression.

Btrfs is NOT optimized for speed, but for safety!

👣


Yes, RAID0 (RAID - ArchWiki) is probably the fastest option.

(However, I doubt that with NVMe there is much speed to gain anyway; I believe this idea of spreading data between disks dates from when disks were slow.)

You can find the commands here in the btrfs documentation:
https://btrfs.readthedocs.io/en/latest/btrfs-device.html#device-management
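For your setup, the gist from those docs would be something like this (a rough sketch, assuming the filesystem is mounted at / and the three empty drives are /dev/nvme1n1 through /dev/nvme3n1):

sudo btrfs device add /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /
sudo btrfs balance start -dconvert=raid0 /

(device add may need -f if the new drives still carry old filesystem signatures.)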


So if I wanted to use mdadm to make a RAID 0, should that be done from the live CD before installing Manjaro? Or can I do it from an already installed system?

You could, but then you lose the advantages of BTRFS.
Did you actually read the linked pages?


I did. It seems like mdadm RAID 0 with F2FS would suit my needs better. I don’t need safety, I need speed.
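Something like this from the live environment, I’d assume (partition names are placeholders, I haven’t settled on a layout yet):

sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/nvme[0-3]n1p2
sudo mkfs.f2fs /dev/md0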

When you have 4 disks, most people choose RAID 0+1 or RAID 10: a combination of striping and mirroring. But this is more common for server setups. RAID 0 is popular with gamers who only have two SSDs to work with; seeing “faster” attached to anything makes them jump on it.

With 4 drives in RAID 10, you will get well over 2 times the read speed and about double the write speed. Of course this halves the usable space, but it does provide redundancy.

RAID 0, or a stripe, over 4 disks obviously increases speeds, but by anywhere from 1x to 4x. It’s not predictable, since the next block of data may or may not be going to or from the same drive you were just dealing with.

Since it’s mostly a gaming PC, random non-consecutive reads are usually the biggest priority, and RAID 0 doesn’t provide those consistently. With RAID 1 you have another drive ready to provide the next block of data if one is busy; RAID 0 often leaves it to chance. (Consecutive reads do lower that chance, e.g. when dealing with large files.)

It should be noted that the newer NVMe drives now read/write at well over 5 GB/s. Often the SSD(s) aren’t even the bottleneck. The data arrives so fast that it’s often a single CPU core that slows things down the most (games are notorious for not being truly multithreaded, especially when loading the game, new areas, etc.), and improving disk I/O speeds starts to have diminishing returns.

That aside, if I were to run BTRFS with disk performance as my number one concern, I would make an mdraid 10 setup with a single BTRFS filesystem on top of it. I realise RAID 0 and 1 are stable in BTRFS, but everything I’ve read says that mdraid 10 still outperforms BTRFS RAID 10 by a noticeable amount.
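A rough sketch of that setup (run from a live environment; the device and partition names are assumed):

sudo mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/nvme[0-3]n1p2
sudo mkfs.btrfs /dev/md0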

For me, the flexibility of BTRFS usually trumps a little more performance. You can even do stuff like convert RAID types on a live BTRFS filesystem.

If you’re still unsure, while the disks are unpopulated it would be a good time to try both and benchmark them.
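fio is handy for that; for example, a random-read test (the file path and sizes are just placeholders):

fio --name=randread --filename=/mnt/test/fio.bin --size=4G --rw=randread --bs=4k --iodepth=32 --ioengine=libaio --direct=1 --runtime=30 --time_based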


That seems like a solid plan, thanks I’ll test both and report back.

I’m missing something here with my BTRFS setup.
I ran the following after a fresh install of Manjaro with BTRFS:

sudo btrfs device add -f /dev/nvme[1-3]n1 /
sudo btrfs balance start -dconvert=raid0 /

And initially it seemed to balance the disks, so I downloaded a 100 GB file to test, and according to KDE Partition Manager it only stored it on the first disk instead of striping it like I expected.

What have I done wrong?

Edit: Never mind, apparently KDE Partition Manager was misreporting.

sudo btrfs filesystem usage -T /
Overall:
    Device size:                   7.24TiB
    Device allocated:            138.02GiB
    Device unallocated:            7.11TiB
    Device missing:                  0.00B
    Device slack:                  2.50KiB
    Used:                        127.71GiB
    Free (estimated):              7.12TiB      (min: 3.56TiB)
    Free (statfs, df):             7.12TiB
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              273.27MiB      (used: 0.00B)
    Multiple profiles:                  no

                  Data      Metadata  System                              
Id Path           RAID0     DUP       DUP      Unallocated Total   Slack  
-- -------------- --------- --------- -------- ----------- ------- -------
 1 /dev/nvme0n1p2  34.00GiB   2.00GiB 16.00MiB     1.75TiB 1.79TiB 2.50KiB
 2 /dev/nvme1n1    34.00GiB         -        -     1.79TiB 1.82TiB       -
 3 /dev/nvme2n1    34.00GiB         -        -     1.79TiB 1.82TiB       -
 4 /dev/nvme3n1    34.00GiB         -        -     1.79TiB 1.82TiB       -
-- -------------- --------- --------- -------- ----------- ------- -------
   Total          136.00GiB   1.00GiB  8.00MiB     7.11TiB 7.24TiB 2.50KiB
   Used           125.92GiB 913.75MiB 16.00KiB

Are you booting from these drives, and using swap on them as well? Either way, you are passing partition 2 for the first device, then the whole block device for the rest…

When dealing with mdraid or hardware RAID you often pass the whole block device. But in this case, though that technically would work, you should pass the partition to BTRFS (not the whole device).

So is /dev/nvme0n1p1 your EFI partition? And /dev/nvme0n1p3 swap?

UEFI has to boot from a vFAT filesystem. (It’s how the firmware boots your PC.) One of the caveats of using only BTRFS for RAID is that this tiny partition is left out. I’ll come back to that.

And when given multiple swap partitions, the kernel interleaves them kind of like RAID 0, so you can make them smaller when spreading swap over multiple devices. (That is, if swap is on /dev/nvme0n1p3.)

So recreating the same partition table on all 4 devices is the better and cleaner option. (That space is unused anyway.)

To copy the GPT partition table over to the other 3 devices, you can back it up with:

sgdisk --backup=<PATH>/nvme0n1.gpt /dev/nvme0n1

And you can restore it onto the other 3 disks (sgdisk takes one target device at a time, hence the loop):

for d in /dev/nvme[1-3]n1; do sgdisk --load-backup=<PATH>/nvme0n1.gpt "$d"; done
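One caveat: a verbatim copy also duplicates the disk and partition GUIDs, so it’s worth randomizing them on the clones with sgdisk’s -G flag:

sgdisk -G /dev/nvme1n1
sgdisk -G /dev/nvme2n1
sgdisk -G /dev/nvme3n1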

The kernel handles multiple swap partitions efficiently, so you can just put them all in fstab or enable them with swapon.
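For what it’s worth, giving them equal priority makes the kernel stripe pages across them, RAID 0 style. Assuming swap really is on p3 of each drive, the fstab entries might look like:

/dev/nvme0n1p3  none  swap  defaults,pri=10  0  0
/dev/nvme1n1p3  none  swap  defaults,pri=10  0  0
/dev/nvme2n1p3  none  swap  defaults,pri=10  0  0
/dev/nvme3n1p3  none  swap  defaults,pri=10  0  0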

But that leaves out your EFI partition. You could use mdraid (a 4-disk mirror) on that partition alone. That seems to be the preferred approach when using RAID for redundancy, but it does complicate things, especially if you’re not familiar with it.
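If you do go that way, the usual trick is mdadm metadata format 1.0, which puts the RAID superblock at the end of the partition so the firmware still sees a plain FAT filesystem. A sketch, with the partition names assumed from above:

sudo mdadm --create /dev/md/esp --level=1 --metadata=1.0 --raid-devices=4 /dev/nvme[0-3]n1p1
sudo mkfs.fat -F 32 /dev/md/esp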

I’ve tried many apps and other people’s scripts for my backup purposes, but ended up just writing my own simple scripts for my needs. They primarily back up BTRFS snapshots. But as part of that, I back up my EFI partition with dd. It’s a 300M partition which compresses down to about 100kB (at least for me).

So all I do in the script is:

dd if=/dev/nvme0n1p1 | xz -c | dd of=nvme0n1p1.blk.xz

This copies it to a file, but you could also write it to the other nvme[1-3]n1p1 partitions. This partition doesn’t even change all that much through updates. It just contains enough logic to boot the kernel from your BTRFS volume.
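Restoring (or cloning it onto another drive’s first partition, e.g. /dev/nvme1n1p1) is just the reverse:

xz -dc nvme0n1p1.blk.xz | sudo dd of=/dev/nvme1n1p1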

So with the GPT table and the EFI partition backed up, all I need are my BTRFS subvolumes, @ (root) and @home, to restore my whole system. I’ve restored this way multiple times.

Just throwing that out there as an alternative, since it seems like you are just doing RAID 0 anyway and backing up by other means?

Almost forgot…

I noticed that the metadata doesn’t seem to be spread across all your devices. If performance is a primary concern, you definitely want it to be. I thought it defaulted to RAID 1 for metadata with multiple devices, but I only see it on the first drive. That must only apply when creating the filesystem from scratch?

The RAID 1 HDDs in my system come up with:

             Data    Metadata System
Id Path      RAID1   RAID1    RAID1     Unallocated Total    Slack
-- --------- ------- -------- --------- ----------- -------- -----
 1 /dev/sda1 4.73TiB  9.00GiB   8.00MiB   734.02GiB  5.46TiB     -
 2 /dev/sdb1 4.73TiB  9.00GiB   8.00MiB   734.02GiB  5.46TiB     -
-- --------- ------- -------- --------- ----------- -------- -----
   Total     4.73TiB  9.00GiB   8.00MiB     1.43TiB 10.92TiB 0.00B
   Used      4.34TiB  8.48GiB 752.00KiB

So I’m guessing you need to use the -m and -s options with the balance command?
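Something like this, I believe (the -f is needed because converting the system chunks is gated behind force):

sudo btrfs balance start -mconvert=raid1 -sconvert=raid1 -f /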
