SSD slow (NVMe / PCIe 4.0)

Hi all,
my computer has a 1 TB Samsung 980 Pro installed, and I’ve run a speed test under Windows as well as under a fresh Manjaro installation. There is a huge difference between the two results, and maybe someone here has an idea how to solve it.

Thanks a lot!
items

Windows: (benchmark screenshot)

Linux: (benchmark screenshot)

Hello @items :wink:

Not sure, but it’s possible that your NVMe runs under Linux at PCIe Gen 2/3 instead of Gen 4.

That should confirm it:

sudo lspci -vv | egrep "PCI bridge|LnkCap"

Please post the output here.
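
The LnkCap lines only list what a slot supports; the actually negotiated speed shows up in the LnkSta lines, and the kernel also exposes it in sysfs. If it helps, something like this should show it (the nvme0 path is an example and may differ on your system):

sudo lspci -vv | grep -E "LnkCap:|LnkSta:"
cat /sys/class/nvme/nvme0/device/current_link_speed
cat /sys/class/nvme/nvme0/device/current_link_width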

Did you use the same software package from the same developer for both tests?
Different software can use different algorithms and methodologies to calculate the result. Even different versions of the same software can change the methodology, and each such difference can make results incomparable.
So make sure you use the same app, of the same version, from the same developer, in order to compare the results.

But I do see a huge difference in read speeds on the SEQ1M Q8T1 test.
That difference can be caused by not running TRIM, or by not waiting at least 15 minutes of idle time before testing again (some testers on YouTube add that delay, probably to make sure that all the internal housekeeping triggered by the OS’s TRIM request has finished and does not interfere with the measurements).
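
If in doubt, TRIM can be triggered manually and the periodic trim timer checked; these commands assume util-linux’s fstrim and systemd, both of which Manjaro ships:

sudo fstrim -av
systemctl status fstrim.timer

Then give the drive around 15 minutes of idle before re-running the benchmark.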

@items, where is the inxi report? (How to provide good information)

Is it an M.2 drive, or a PCIe card with more than 4 lanes?
If M.2, then it probably uses 4 PCIe lanes.
If the link runs at Gen 3, then the maximum theoretical (ideal) payload data rate is about 3.94 GB/s (not GiB) according to PCI Express - Wikipedia, yet one of the tests shows 4238 MB/s (and even if that figure actually means MiB, converting it to MB to compare with the GB value from Wikipedia only makes it larger).

So if it is an M.2 SSD, that speed is impossible over a PCIe 3.0 link, which means it is running in PCIe 4.0 (or later) mode.
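
For reference, the rough arithmetic behind those limits (8 GT/s per lane for Gen 3, 16 GT/s for Gen 4, 128b/130b encoding, 4 lanes, 8 bits per byte); this is only the ideal link payload rate, real drives land below it:

awk 'BEGIN { printf "PCIe 3.0 x4: %.2f GB/s\n",  8 * 128/130 * 4 / 8 }'
awk 'BEGIN { printf "PCIe 4.0 x4: %.2f GB/s\n", 16 * 128/130 * 4 / 8 }'

which gives about 3.94 GB/s and 7.88 GB/s respectively.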

Does the SSD throttle itself? What temps does it reach?
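
If nvme-cli is installed, the drive’s own view of temperature and thermal throttling can be read directly (the device name is an example):

sudo nvme smart-log /dev/nvme0

The temperature value and the warning/critical composite temperature time counters there indicate whether it has been throttling.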

A bit off-topic: what cooling solution do you prefer for keeping a fast SSD at that kind of speed over time?

I have a 970 EVO Plus in a passively cooled, heavy full-metal case with ventilation holes in it. I get a CPU temperature of 50-52 °C during hours of 70-80 % load, so the passive cooling of the case is not bad (Core i5-8250U with Hyper-Threading turned off, at its nominal TDP of 15 W).

I seed torrents, and after 15-20 minutes of constant reading at 8-11 MiB/s (on my 100 Mbps Ethernet connection) the SSD reached 90-95 °C. That is only about 10 MiB/s, but it is constant.
When copying from the SSD to an external USB 2.0 device at 38-40 MiB/s, I saw 97-105 °C after 15-20 minutes of such usage. It is an M.2 PCIe 3.0 SSD, yet in my use case even the constant read speed of a USB 2.0 transfer heats it up past the boiling point of water.

I added a 4 mm thick copper heatsink (a passive radiator) on top of the SSD’s label and now see a maximum of 80 °C at 10 MiB/s. Still a very high temperature, but lower than before.

So cooling of an NVMe SSD matters: there is no such thing as a cold, or even warm, PCIe 3.0 NVMe device under sustained load (and yours is PCIe 4.0, albeit with a different controller technology); fast drives all run very hot without a cooling solution. So your case could also be this: you power up the PC, test the SSD while it is only warmed up by the Windows boot, then use it for some time, reboot into Linux where the SSD gets even hotter, and end up benchmarking an already-throttling SSD on Linux.


Good morning and thanks a lot first!

The output of

sudo lspci -vv | egrep "PCI bridge|LnkCap" 

provides a lot of information but I’m absolutely uncertain where to find the important part (output below).

I’ve also checked the BIOS, where I can choose between Gen 3 and Gen 4 for the M.2 slot, but after forcing Gen 4 the speed test in Windows slows down as well, so I’ve gone back to “Auto”. Under Windows I’ve used a tool called “CrystalDiskInfo”, which shows that the disk supports PCIe 4.0 and that it is currently connected at 4.0, so I assumed the link speed is determined by the BIOS and not by the OS. I’ve also searched the internet for a way to find out under Linux whether the SSD is connected at PCIe 3.0 or 4.0, but I wasn’t able to find an explanation that was understandable for me.

Are you able to interpret the terminal output? That would be great! As far as I can tell it shows the supported speed range but not the speed that is actually in use.

00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge (prog-if 00 [Normal decode])
                LnkCap: Port #1, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM L1, Exit Latency L1 <64us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L1, Exit Latency L1 <64us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse Switch Upstream (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM L1, Exit Latency L1 <32us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
03:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge (prog-if 00 [Normal decode])
                LnkCap: Port #1, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <32us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
03:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge (prog-if 00 [Normal decode])
                LnkCap: Port #4, Speed 16GT/s, Width x1, ASPM L1, Exit Latency L1 <32us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
03:05.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge (prog-if 00 [Normal decode])
                LnkCap: Port #5, Speed 16GT/s, Width x1, ASPM L1, Exit Latency L1 <32us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
03:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
03:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
03:0a.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCap: Port #4, Speed 5GT/s, Width x1, ASPM L1, Exit Latency L1 <8us
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
                LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS-
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
0a:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c1) (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM L1, Exit Latency L1 <64us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
0b:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us

It is difficult to compare the results unless you ensure your caches are flushed before running the tests.

Linux uses RAM to cache files - so maybe this is the reason for the skewed results.
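
A common way to make sure the page cache is out of the picture before a run is to flush and drop it first, roughly:

sync
echo 3 | sudo tee /proc/sys/vm/drop_caches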

Thank you very much alven! First I have to say that I’m not really deep into hardware issues, nor into Linux itself, but I do my best. My first thought was also that the connection type might be the cause.

I can also paste the full output of inxi, but I assume the part with “lanes: 4” is the interesting one. If I should paste the whole output, no problem, just say so.

Local Storage: total: 3.64 TiB used: 598.88 GiB (16.1%) 
           ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 980 PRO 1TB size: 931.51 GiB block-size: 
           physical: 512 B logical: 512 B speed: 63.2 Gb/s lanes: 4 type: SSD serial: <filter> rev: 3B2QGXA7 temp: 37.9 C 
           scheme: GPT 
           SMART: yes health: PASSED on: 5 hrs cycles: 32 read-units: 1,992,621 [1.02 TB] written-units: 1,261,669 [645 GB] 
           ID-2: /dev/nvme1n1 maj-min: 259:1 vendor: Samsung model: SSD 980 PRO 1TB size: 931.51 GiB block-size: 
           physical: 512 B logical: 512 B speed: 63.2 Gb/s lanes: 4 type: SSD serial: <filter> rev: 3B2QGXA7 temp: 40.9 C 
           scheme: GPT 
           SMART: yes health: PASSED on: 6 hrs cycles: 32 read-units: 2,329 [1.19 GB] written-units: 833,242 [426 GB] 
           ID-3: /dev/sda maj-min: 8:0 type: USB vendor: Toshiba model: External USB 3.0 drive model: MQ01ABB200 
           family: 2.5" HDD MQ01ABB... size: 1.82 TiB block-size: physical: 4096 B logical: 512 B sata: 2.6 speed: 3.0 Gb/s 
           type: HDD rpm: 5400 serial: <filter> drive serial: <filter> rev: 5438 drive rev: AY000U temp: 26 C scheme: GPT 
           SMART: yes state: enabled health: PASSED on: 1y 308d 1h cycles: 229 
           Message: No optical or floppy data found. 

The SSD has a heatsink as well. One is built into the board itself, and for the other drive I’ve used this one: EK Water Blocks EK-M.2 NVMe Heatsink (green).
So the temperature looks pretty good to me at 40.9 °C, and I don’t think the issue is a result of the temperature.

Using the same tool on Windows and Linux seems to be difficult, but I assumed that if I use the same main settings in both tools the results should be roughly comparable. I’ve also tried the “Peak Performance” profile in KDiskMark, which leads to the results below:

Thank you linux-aarhus! There is an option in KDiskMark called “Flush Pagecache” which is activated, so I suppose that takes care of it. I think the settings may differ between the two tools, but I’m not sure how to replicate them exactly.

It just came to my mind that the question could also be turned around as a solution:
If the SSD is connected via PCIe 4.0, can I expect the same performance as under Windows? If yes, I’m fine with that.
Or, and this is a separate point: is there anything I can do to increase the performance under Linux?

Thanks a lot and a nice day to all of you!

I’ve talked with the KDiskMark author and he said you need to check this:

(it seems to be an issue with the Linux kernel)

Also keep in mind that CrystalDiskMark and KDiskMark (fio) results are not directly comparable.
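
For a rough cross-check outside both GUIs, a plain fio job approximating the SEQ1M Q8T1 sequential read test could look roughly like this (a sketch: the file name and size are placeholders, and direct=1 keeps the page cache out of the measurement):

fio --name=seq1m-q8t1 --filename=fio-testfile --size=1G --rw=read --bs=1M --iodepth=8 --numjobs=1 --ioengine=libaio --direct=1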


Ok, thanks LordTermor!
This leads to KDiskMark, then to fio and then to a kernel problem. It’s not really comprehensible to me, and I can’t follow the long discussions because I don’t understand what they are talking about. But in the end it seems that we have a measurement problem here and not a performance problem of the disk.

I will look for some more hints, and maybe I’ll find a way to check the performance using tools that other people have used successfully.

I have been testing my NVMes to see if the promised transfer rates are achievable, and the read/write rates quoted by the vendor are almost never reachable in real-world use.

They are marketing numbers and some vendors have even been caught in the act of changing the device chips and controllers after the initial tests - without telling anyone.

:cough: Western Digital :cough:

“Sure is dusty in here…” cough cough

Guys, with the marketing discussion we are going a bit off-topic: Windows shows 6700 MB/s vs 1800-2000 MB/s under Linux for sequential reads on the same SSD. For me it is hard to believe that different apps use algorithms so different that they lead to a 3.5-4x gap. Maybe 5-25 %, but a factor of 3.5?

Could it be caused by the NVMe driver used on Linux, or by its default mode/settings?
Different file systems on the tested partitions could also matter, but that much?
Partition/disk encryption would also reduce speeds, since CPU and RAM performance (or the SSD’s controller, in the case of hardware encryption) becomes the bottleneck.
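
To clarify the file-system/encryption question, something like this shows what sits between the SSD and the mount points (crypt or LVM layers would show up as extra entries in the tree):

lsblk -o NAME,FSTYPE,SIZE,MOUNTPOINT /dev/nvme0n1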

What about creating a 2 GB file filled with random data, then dd’ing it to /dev/null and timing it?

Something like,

dd if=/dev/urandom bs=512K count=4096 of=~/random.dat status=progress

sync

su -c "echo 3 > /proc/sys/vm/drop_caches"

time sudo dd iflag=direct if=~/random.dat bs=512K of=/dev/null status=progress
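
If hdparm happens to be installed, a read-only timing run directly against the block device is another quick cross-check that leaves the file system out of it (the device name is an example):

sudo hdparm -t --direct /dev/nvme0n1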

I’m absolutely aware of that, of course, and I don’t expect the theoretical read/write rates, but I do expect them to be comparable between Windows and Linux, so that I don’t lose speed just because I bought these disks. And I thought there might be a kernel/driver issue that I could fix by following the right steps, but that doesn’t seem to be the case here.

How did you determine that?
If we are investigating this together, then please share your findings so that the community can take the next step in understanding your case and get closer to resolving the issue.

Another difference, and I didn’t see that clarified, is the file system on Linux and use of encryption. Any clarification?

Also, from the developer’s point of view, as I understand it, the bug is outside his program’s control; it is in fio.

JonMagon added the “related to fio” label on 15 Mar

Tried an old kernel for the sake of testing?
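
On Manjaro that is easy to try: list the installed kernels and add another series (linux510 below is just an example), then pick it from the GRUB menu at the next boot:

mhwd-kernel -li
sudo mhwd-kernel -i linux510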
