NVME M.2 BANDWIDTH TESTS
7 CONSUMER DEVICES

Matthew Dillon

I primarily test random-read bandwidth.  Write bandwidth for these M.2
devices is generally in the 600-800 MBytes/sec range (see the special notes
on the Intel 600P though), which is what I would expect for the M.2 form
factor due to the fewer NAND channels.

These tests are with (roughly) 250GB devices.  Devices with higher
capacities may (and are likely to) have more bandwidth.  But it's usually
better to test with lower-capacity devices because it reveals potential
bottlenecks and other red flags in the firmware and hardware.

All random-read tests are done on a 16GB block of uncompressible data that
I wrote out from /dev/urandom into the partition (a sketch of how this area
is laid down appears just before the FABS notes below).  All of these
devices support 7, 8, or 16 queues, so a 16-thread test will generate a
queue depth of 2 and a 64-thread test will generate a queue depth of 8 (on
the 16-queue devices, 64 threads has a QD of 4).  All tests were done in
PCIe-3.0 sockets.  Note that I renumbered nvme5 and nvme6 for readability
(I only have 5 PCIe sockets).

All tests were done on a DragonFlyBSD system (of course), using all the
MSI-X vectors and parallelism offered by the devices (see the dmesg probes
later on).  There is no OS contention and the scheduler will optimally map
vectors and threads.

SUMMARY RECOMMENDATION (so far)

I recommend the Samsung 951, the Samsung 960 EVO, and the Plextor M8Pe.

Samsung 951 - Recommended
    Excellent all-around characteristics.

Samsung 960 EVO - Recommended
    Excellent all-around characteristics.

Plextor - Recommended
    Excellent all-around characteristics, and it ships with a heat sink
    (the others do not).  Does not take the speed crown but delivers high,
    consistently good performance, and temperature is well regulated.
    Note that this device probably uses commodity flash, so results will
    vary.

MyDigitalSSD - Conditionally Recommended
    Takes the speed crown at 2.4 GBytes/sec, but the controller clearly
    runs hot and the device quickly throttles down to 1.4 GBytes/sec at
    80C, then throttles further to 1.1 GBytes/sec at 100C.  Also, the
    basic QD1 sequential read test is only mediocre (though much better
    than the Intel).  I'll conditionally recommend this device with the
    proviso that the controller runs too hot and it is probably not
    appropriate for a mobile device.  Consider buying a third-party heat
    sink, as this device does not ship with one (it really should,
    frankly).  Note that this device probably uses commodity flash, so
    results will vary.

    I'm giving these guys a bit of a pass with my provisional
    recommendation even though the device clearly runs way too hot,
    because the rest of the results are quite good and they have gone out
    on a limb and are using a third commodity controller (the Phison
    controller).  The commodity space desperately needs a third controller
    vendor.

Toshiba OCZ - NOT RECOMMENDED
    Decent test results, but the controller appears to allow the device to
    overheat without throttling or flagging the condition, which is a big
    black mark.  The OCZ branding is also a big question mark (OCZ had a
    horrible reputation).  I just can't recommend it.

WD Black - NOT RECOMMENDED
    I would recommend it, except that it uses Sandisk planar TLC NAND and
    the endurance is only 80TB for the 256GB model, which is totally
    unacceptable.

Intel 600P - NOT RECOMMENDED
    Does not pass muster for many reasons.  Do not buy this device for any
    reason.

NOT YET AVAILABLE:
    Crucial/Micron M.2 NVMe offerings.  Right now they appear to have
    two-prong M.2 (likely AHCI) stuff, but no one-prong PCIe-only NVMe
    M.2 stuff.
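As a reference for the test setup described at the top, here is a minimal
sketch of how the 16GB uncompressible test area can be laid down before
running the random-read tests.  The target partition (/dev/nvme1s1b) and
the 64KB write size are illustrative assumptions on my part; the per-device
randread commands later on this page point at whichever partition actually
holds the data.

    # Lay down 16GB of uncompressible data on the test partition.
    # 65536 bytes x 262144 blocks = 16GB; the target device is an example.
    dd if=/dev/urandom of=/dev/nvme1s1b bs=65536 count=262144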
FABS:

Samsung
    Currently has the best 3D NAND process.

Micron (Crucial branding)
    Expected to have a competing 3D NAND process, but no real NVMe
    consumer offerings as of this telling.  This is the one I'm really
    waiting for.

Sandisk (WD branding)
    Their best process is an Advanced planar TLC, which is way behind
    everyone else.  Very low durability.

Toshiba
    I don't know a whole lot about Toshiba's NAND fab/process.  The
    Toshiba device I test here has very good performance but also runs
    hot.

Intel (Intel branding)
    Unclear where Intel is going here.  There is an Intel 3D TLC process
    in fab, but to say the performance is godawful would be an
    understatement.

NOTE ON PCI-V3 AND ADVERTISED BANDWIDTHS

None of the M.2 NVMe devices I've tested can actually push real data for
any significant period of time beyond 1.6 GBytes/sec or so.  The devices
can push more if the data is compressible or zero, and I think this is
where some reviewers get confused.  I'm testing with data loaded from
/dev/urandom, meaning it is completely uncompressible.

For example, if I read from a zeroed partition on the EVO using
'randread /dev/nvme1s1a 32768 1 64' (32KB block size, 64 threads), the
iostat output shows around 3.2 GBytes/sec.  If I run the same command
pointed at a partition with uncompressible data in it, the iostat output
shows only around 1.0 GBytes/sec.  The reason it can push 3.2 GBytes/sec
in this example is that the NVMe device doesn't actually have to read its
flash, because that partition is empty.

These bandwidths are limited primarily by the fewer channels and fewer
NAND chips.  A higher-capacity M.2 device may be able to push more by
virtue of having more NAND chips, but for testing purposes I only
purchased (mainly) ~250GB devices.

xeon126# iostat nvme1 1
      tty           nvme1             cpu
 tin tout  KB/t    tps    MB/s  us ni sy in id
   0    3 32.00 102943 3216.97   0  0  5  1 95    <--- from 'a'
   0    3 32.00 102830 3213.45   0  0  5  1 95
   0    3 32.00 102806 3212.70   0  0  4  0 95
   0    3 32.00  32969 1030.27   0  0  2  0 98    <--- from 'b'
   0    3 32.00  32864 1027.01   0  0  2  0 98
   0    3 32.00  33130 1035.30   0  0  2  0 98

HEAT SINK OR NOT?

Most M.2 SSDs do not come with a heat sink.  Adding a heat sink can be
EXTREMELY beneficial for a multitude of reasons, not the least of which is
reducing flash failure rates.  If you don't generally have a heavy workload
then you probably don't have to worry about it.  Nearly all NVMe SSDs have
to throttle to regulate temperature given a write-heavy workload, and quite
a few will have to throttle with a read-heavy workload, so a heat sink will
definitely improve matters in terms of getting consistent performance out
of your device.

Read-heavy workloads don't heat the cards up as much (though as we see
below, a full-on read-heavy workload does affect the Intel, the Toshiba,
and the MyDigitalSSD severely).  Still, it is very unlikely that normal
read operations would cause throttling.  See the other notes, particularly
on the MyDigitalSSD.

If you are an enthusiast, then by all means add a heat sink.  The cooler
you keep those flash chips, the more reliable they will be.
                                                -Matt

SAMSUNG 951

randread /dev/nvme1s1b 100 16
16 threads - SAMSUNG 951 (16GB area, uncompressible data)

    blksize     aggregate bw
    -------     --------------
        512       81 MBytes/sec   (133 MB/s @ 64 threads)
       1024      162 MBytes/sec   (266 MB/s @ 64 threads)
       2048      322 MBytes/sec   (533 MB/s @ 64 threads)
       4096      635 MBytes/sec   (1070 MB/s @ 64 threads)
       8192      987 MBytes/sec   (1661 MB/s @ 64 threads)
      16384     1245 MBytes/sec   (1680 MB/s @ 64 threads)
      32768     1465 MBytes/sec   (1660 MB/s @ 64 threads)
      65536     1434 MBytes/sec   (1410 MB/s @ 64 threads)
     131072     1450 MBytes/sec   (1406 MB/s @ 64 threads)
     262144     1456 MBytes/sec   (1407 MB/s @ 64 threads)

    Sequential read @ 64KB block size: 1090 MBytes/sec

SPECIAL NOTES ON THE SAMSUNG 951:

    * This particular device was only 128GB, unlike the others, so the
      above numbers are actually even more impressive than they might
      otherwise seem.

SAMSUNG 960 EVO

randread /dev/nvme1s1b 100 16
16 threads - SAMSUNG 960 EVO (16GB area, uncompressible data)

    blksize     aggregate bw
    -------     --------------
        512       70 MBytes/sec   (125 MB/s @ 64 threads)
       1024      138 MBytes/sec   (251 MB/s @ 64 threads)
       2048      275 MBytes/sec   (502 MB/s @ 64 threads)
       4096      543 MBytes/sec   (1003 MB/s @ 64 threads)
       8192      815 MBytes/sec   (1076 MB/s @ 64 threads)
      16384     1026 MBytes/sec   (1099 MB/s @ 64 threads)
      32768     1037 MBytes/sec
      65536      930 MBytes/sec
     131072     1024 MBytes/sec
     262144     1162 MBytes/sec   (1166 MB/s @ 64 threads)

    Sequential read @ 64KB block size: 1500 MBytes/sec

TOSHIBA OCZ RD400

randread /dev/nvme1s1b 100 16
16 threads - TOSHIBA OCZ RD400 (16GB area, uncompressible data)

    blksize     aggregate bw
    -------     --------------
        512       69 MBytes/sec   (105 MB/s @ 64 threads)
       1024      137 MBytes/sec   (211 MB/s @ 64 threads)
       2048      273 MBytes/sec   (423 MB/s @ 64 threads)
       4096      549 MBytes/sec   (843 MB/s @ 64 threads)
       8192      690 MBytes/sec   (947 MB/s @ 64 threads)
      16384     1056 MBytes/sec   (1014 MB/s @ 64 threads)
      32768     1332 MBytes/sec   (1120 MB/s @ 64 threads)
      65536     1320 MBytes/sec   (1241 MB/s @ 64 threads)
     131072     1482 MBytes/sec   (1494 MB/s @ 64 threads)
     262144     1459 MBytes/sec   (1494 MB/s @ 64 threads)

    Sequential read @ 64KB block size: 892 MBytes/sec

SPECIAL NOTES ON THE TOSHIBA:

    * This is fairly inconsequential, but note that the Toshiba only
      probed 7 chipset queues.  This means that a 1:1 cpu:queue mapping
      is not possible on a 4-core/8-thread system.

    * The controller allows the device to get too hot.

    * 'OCZ' branding?  Really?

INTEL 600P

randread /dev/nvme1s1b 100 16
16 threads - INTEL 600P (16GB area, uncompressible data)
             (INTEL_SSDPEKKW256G7)

    blksize     aggregate bw
    -------     --------------
        512       23 MBytes/sec
       1024       47 MBytes/sec
       2048       93 MBytes/sec
       4096      180 MBytes/sec
       8192      333 MBytes/sec
      16384      570 MBytes/sec
      32768      716 MBytes/sec
      65536      865 MBytes/sec
     131072      930 MBytes/sec
     262144      945 MBytes/sec

    (NOTE: 64-thread tests showed no improvement at any block size)

    Sequential read @  64KB block size: 220 MBytes/sec   <--- HORRIBLE
    Sequential read @ 128KB block size: 350 MBytes/sec   <--- HORRIBLE

SPECIAL NOTES ON THE INTEL 600P:

    * Horrible device, do not buy it.  Write bandwidth started out OK at
      170 MB/s, but once the SLC 'front-end' cache filled up, write
      bandwidth dropped to 20 MB/s for long periods of time and became
      inconsistent.

    * QD1 single-thread sequential read bandwidth was horrifyingly slow at
      only 194 MBytes/sec.  In comparison, the Samsung 951 can put out
      1.1 GBytes/sec.  I don't usually test multi-thread sequential read
      on SSDs (that is, multiple threads each reading sequentially).
          1-thread sequential read - 220 MB/s   <--- HORRIBLE
          2-thread sequential read - 430 MB/s   <--- HORRIBLE
          3-thread sequential read - 544 MB/s   <--- HORRIBLE
          4-thread sequential read - 715 MB/s
          ... up to around 1100 MB/s.

      If you happen to have saddled yourself with this piece of junk, all
      is not lost.  Filesystems will do read-ahead on sequential I/O, and
      in this situation the Intel does just fine.  If I dd (read) an
      uncompressible file through a filesystem, I get around
      1000 MBytes/sec due to the fact that the filesystem keeps several
      asynchronous I/Os in progress at the same time.

    * Samsung offers much higher and more consistent performance for the
      same price.

    * Intel NVMe SSDs have had some interesting issues associated with
      them.  For example, the Intel 750's performance goes to hell with
      block sizes greater than or equal to 64KB.  Intel clearly needs to
      do a LOT of work on their firmware, and perhaps use (or make
      themselves) a better controller.

    * OVERTEMP!  With just the multi-threaded random read test the device
      overheated.  The NVMe chipset itself flagged that it was too hot and
      began to throttle after less than 40 seconds of testing.  Now, Intel
      IS doing the right thing by flagging the over-temp and throttling at
      75C; that reduces premature flash failures.  The Toshiba on this
      page in fact doesn't seem to throttle (it hit 81C), which is a
      negative.  But the Intel 600P is still junk in my view.  There are
      simply too many issues with it.

WD BLACK 256G

randread /dev/nvme4s1b 100 16
16 threads - WDC_WDS256G1X0C-00ENX0 (16GB area, uncompressible data)

    blksize     aggregate bw
    -------     --------------
        512       51 MBytes/sec   (75 MB/s @ 64 threads)
       1024      101 MBytes/sec   (148 MB/s @ 64 threads)
       2048      202 MBytes/sec   (302 MB/s @ 64 threads)
       4096      413 MBytes/sec   (588 MB/s @ 64 threads)
       8192      676 MBytes/sec   (792 MB/s @ 64 threads)
      16384      938 MBytes/sec   (1094 MB/s @ 64 threads)
      32768     1257 MBytes/sec   (1489 MB/s @ 64 threads)
      65536     1345 MBytes/sec   (1329 MB/s @ 64 threads)
     131072     1301 MBytes/sec   (1282 MB/s @ 64 threads)
     262144     1511 MBytes/sec   (1477 MB/s @ 64 threads)

    Sequential read @  64KB block size: 338 MBytes/sec   <--- MEDIOCRE
    Sequential read @ 128KB block size: 407 MBytes/sec   <--- MEDIOCRE

SPECIAL NOTES ON THE WD BLACK 256G:

    * Sequential write bandwidth (writing /dev/urandom data) was a
      consistent 233 MB/sec.  Quite decent for a 256G M.2 device.
      However, this device apparently uses an SLC front-end for burst
      write performance, which (like the 600P) only adds another stage
      that can fail to the mix.

    * Slow temperature ramp during read tests.  Appears to stabilize at
      72C with no (read) performance degradation after 10 minutes at
      1.4 GB/s (32K @ 64 threads random read).

    * Consistent performance ramp with no gaps, good at both low and high
      queue depths.

    * Nice chipset queue setup for an M.2 device (16/16 instead of 8/8 or
      7/7).

    * Sequential read (queue depth 1) bandwidth is mediocre.  Not as bad
      as the Intel 600P, but far below the Samsung.  As I stated in the
      Intel notes, filesystems will usually do multi-I/O read-ahead, so it
      isn't as big a deal as it sounds (see the dd sketch below).

    * But the rated endurance is too low, only 80TB for the 256GB model
      (160TB for the 512GB model).  That is way, way too low for any M.2
      SSD.  This is the real review killer here.
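To illustrate the read-ahead point made in the Intel and WD Black notes,
here is a minimal sketch of reading a large test file back through the
filesystem with dd.  The file path and the 64KB block size are hypothetical;
the point is that even though dd issues one read at a time (QD1 at the
syscall level), the filesystem's read-ahead keeps several device I/Os in
flight.

    # Sequential read of a previously written uncompressible test file
    # through the filesystem.  Read-ahead keeps multiple device I/Os in
    # flight even though dd itself is single-threaded.
    dd if=/mnt/test/randfile of=/dev/null bs=65536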
MYDIGITALSSD

randread /dev/nvme5s1b 100 16
16 threads - BPX (16GB area, uncompressible data)

    blksize     aggregate bw
    -------     --------------
        512       70 MBytes/sec   (97 MB/s @ 64 threads)
       1024      140 MBytes/sec   (194 MB/s @ 64 threads)
       2048      280 MBytes/sec   (390 MB/s @ 64 threads)
       4096      570 MBytes/sec   (1001 MB/s @ 64 threads)
       8192      904 MBytes/sec   (1229 MB/s @ 64 threads)
      16384     1260 MBytes/sec   (1480 MB/s @ 64 threads)
      32768     1780 MBytes/sec   (1959 MB/s @ 64 threads)
      65536     2208 MBytes/sec   (2310 MB/s @ 64 threads)  (THROTTLES to ~1450, then 1160)
     131072     2460 MBytes/sec   (2367 MB/s @ 64 threads)  (THROTTLES to ~1450, then 1160)
     262144     2440 MBytes/sec   (2371 MB/s @ 64 threads)  (THROTTLES to ~1450, then 1160)

    Sequential read @  64KB block size: 405 MBytes/sec   <--- MEDIOCRE
    Sequential read @ 128KB block size: 611 MBytes/sec   <--- OK

SPECIAL NOTES ON THE MYDIGITALSSD:

    * Consistent sequential write bandwidth (16GB, QD1) of 701 MB/s.

    * Temperature limits out at 80C and clearly throttles the random read
      test to around 1400 MB/sec in fairly short order.  In an extended
      test the temperature continues to climb until topping out at 100C
      (yes, the controller is in fact reporting 100C!), and at that point
      the random read performance has dropped to a best case of around
      1.1 GBytes/sec.

    * Does not flag the hot condition (it really should), but does clearly
      throttle.  The controller clearly runs hot under load.

    * Conditionally recommended if NOT used in a mobile device, and I
      would recommend purchasing a heat sink for it (it doesn't come with
      one).

    * Controller-queried temperature jumps all over the place, even when
      idle.  Implies a lack of polish.  I see this at idle and under load,
      all the time, on this device.  Results from 7 quick queries in a
      row:

          comp_temp: 60C
          comp_temp: 60C
          comp_temp: 61C
          comp_temp: 66C
          comp_temp: 64C
          comp_temp: 63C
          comp_temp: 63C

    * Most consumer NVMe devices only report comp_temp (typically the
      controller temperature) and do not report temp_sensors properly or
      at all.  In the case of this one, temp_sensors is reported but seems
      to get stuck at an incorrect value of 105C even when idle.

PLEXTOR

randread /dev/nvme6s1b 100 16
16 threads - PLEXTOR_PX-256M8PeG (16GB area, uncompressible data)

    blksize     aggregate bw
    -------     --------------
        512       62 MBytes/sec   (94 MB/s @ 64 threads)
       1024      124 MBytes/sec   (135 MB/s @ 64 threads)
       2048      248 MBytes/sec   (271 MB/s @ 64 threads)
       4096      494 MBytes/sec   (542 MB/s @ 64 threads)
       8192      808 MBytes/sec   (899 MB/s @ 64 threads)
      16384     1104 MBytes/sec   (1160 MB/s @ 64 threads)
      32768     1282 MBytes/sec   (1281 MB/s @ 64 threads)
      65536     1391 MBytes/sec   (1385 MB/s @ 64 threads)
     131072     1662 MBytes/sec   (1664 MB/s @ 64 threads)
     262144     1667 MBytes/sec   (1660 MB/s @ 64 threads)

    Sequential read @  64KB block size: 1131 MBytes/sec
    Sequential read @ 128KB block size: 1211 MBytes/sec

SPECIAL NOTES ON THE PLEXTOR:

    * Consistent sequential write bandwidth (16GB, QD1) of 656 MB/s.

    * Consistent random read results.

    * Slow temperature ramp (good), stabilizes at 77C.  Throttling does
      occur after an extended period of continuous testing during the read
      test.  Note that the Plextor was the only NVMe device to ship with a
      heat sink, so this is the expected result.  The extended read test
      showed the throttling was kind of bursty, going from 1.4 GB/sec in
      the 64KB test and suddenly dropping to 177 MBytes/sec for 10 seconds
      or so, then going back to 1.4 GB/sec for around 30 seconds, and
      repeating.  Plextor could probably work on this a bit in the
      firmware; they clearly do not need to throttle all the way down to
      177 MB/sec when they hit the temp limit.
      I don't dock them for this behavior because they ship with a heat
      sink, throttle at a nice conservative 77C (which reduces the chances
      of heat-related flash failures... a gift to the customer), and the
      results are honestly really good.

    * Nice chipset queue setup for an M.2 device (16/16 instead of 8/8 or
      7/7).

NVME PROBE

    nvme0 - Samsung 951       - SAMSUNG_MZVPV128HDGM-00000
    nvme1 - Samsung 960 EVO   - Samsung_SSD_960_EVO_250GB
    nvme2 - Intel 600P        - INTEL_SSDPEKKW256G7
    nvme3 - Toshiba OCZ RD400 - TOSHIBA-RD400
    nvme4 - WD Black 256G     - WDC_WDS256G1X0C-00ENX0
    nvme5 - MyDigitalSSD      - BPX
    nvme6 - Plextor M8Pe      - PLEXTOR_PX-256M8PeG

nvme0: port 0x6000-0x60ff mem 0xc7600000-0xc7603fff irq 26 at device 0.0 on pci1
nvme0: mapped 9 MSIX IRQs
nvme0: NVME Version 1.1 maxqe=16384 caps=00f000203c013fff
nvme0: Model SAMSUNG_MZVPV128HDGM-00000 BaseSerial S1XVNYAGA03031 nscount=1
nvme0: Request 64/32 queues, Returns 8/8 queues, rw-sep map (8, 8)
nvme0: Interrupt Coalesce: 100uS / 4 qentries
nvme0: Disk nvme0 ns=1 blksize=512 lbacnt=250069680 cap=119GB serno=S1XVNYAGA03031-1
nvme1: mem 0xc7500000-0xc7503fff irq 32 at device 0.0 on pci2
nvme1: mapped 8 MSIX IRQs
nvme1: NVME Version 1.2 maxqe=16384 caps=00f000203c033fff
nvme1: Model Samsung_SSD_960_EVO_250GB BaseSerial S3ESNX0J219064Y nscount=1
nvme1: Request 64/32 queues, Returns 8/8 queues, rw-sep map (8, 8)
nvme1: Interrupt Coalesce: 100uS / 4 qentries
nvme1: Disk nvme1 ns=1 blksize=512 lbacnt=488397168 cap=232GB serno=S3ESNX0J219064Y-1
nvme2: mem 0xc7400000-0xc7403fff irq 34 at device 0.0 on pci3
nvme2: mapped 16 MSIX IRQs
nvme2: NVME Version 1.2 maxqe=256 caps=00100030780100ff
nvme2: Model INTEL_SSDPEKKW256G7 BaseSerial BTPY64430Q5B256D nscount=1
nvme2: Request 64/32 queues, Returns 8/8 queues, rw-sep map (8, 8)
nvme2: Interrupt Coalesce: 100uS / 4 qentries
nvme2: Disk nvme2 ns=1 blksize=512 lbacnt=500118192 cap=238GB serno=BTPY64430Q5B256D-1
nvme3: mem 0xc7300000-0xc7303fff irq 40 at device 0.0 on pci4
nvme3: mapped 8 MSIX IRQs
nvme3: NVME Version 1.1 maxqe=65536 caps=000000302803ffff
nvme3: Model TOSHIBA-RD400 BaseSerial Z6TS10AUTPEV nscount=1
nvme3: Request 64/32 queues, Returns 7/7 queues, rw-sep map (7, 7)
nvme3: Interrupt Coalesce: 100uS / 4 qentries
nvme3: Disk nvme3 ns=1 blksize=512 lbacnt=500118192 cap=238GB serno=Z6TS10AUTPEV-1
nvme4: mem 0xfbe00000-0xfbe03fff irq 67 at device 0.0 on pci130
nvme4: mapped 19 MSIX IRQs
nvme4: NVME Version 1.2 maxqe=1024 caps=000000202c0103ff
nvme4: Model WDC_WDS256G1X0C-00ENX0 BaseSerial 170369420988 nscount=1
nvme4: Request 64/32 queues, Returns 16/16 queues, rw-sep map (16, 16)
nvme4: Interrupt Coalesce: 100uS / 4 qentries
nvme4: Disk nvme4 ns=1 blksize=512 lbacnt=500118192 cap=238GB serno=170369420988-1
nvme5: mem 0xc7400000-0xc7403fff irq 34 at device 0.0 on pci3
nvme5: mapped 8 MSIX IRQs
nvme5: NVME Version 1.2 maxqe=65536 caps=000000203c03ffff
nvme5: Model BPX BaseSerial 8B7107720F0823024374 nscount=1
nvme5: Request 64/32 queues, Returns 7/7 queues, rw-sep map (7, 7)
nvme5: Interrupt Coalesce: 100uS / 4 qentries
nvme5: Disk nvme2 ns=1 blksize=512 lbacnt=468862128 cap=223GB serno=8B7107720F0823024374-1
nvme6: mem 0xc7320000-0xc7323fff irq 40 at device 0.0 on pci4
nvme6: mapped 19 MSIX IRQs
nvme6: NVME Version 1.2 maxqe=1024 caps=00300030280303ff
nvme6: Model PLEXTOR_PX-256M8PeG BaseSerial P02652102851 nscount=1
nvme6: Request 64/32 queues, Returns 16/16 queues, rw-sep map (16, 16)
nvme6: Interrupt Coalesce: 100uS / 4 qentries
nvme6: Disk nvme3 ns=1 blksize=512 lbacnt=500118192 cap=238GB serno=P02652102851-1
NVME TEMPERATURES (concurrent read test, after 2 minutes)

xeon126# nvmectl info | egrep 'temp|nvme'
nvme0:
    comp_temp: 76C                      <---- PROBABLY A CAP BUT
    warn_temp_time: 0 min (0.00 hrs)          PERFORMANCE NOT IMPACTED
    crit_temp_time: 0 min (0.00 hrs)
    temp_sensors: none
nvme1:
    comp_temp: 62C                      <---- NICE AND COOL
    warn_temp_time: 0 min (0.00 hrs)
    crit_temp_time: 0 min (0.00 hrs)
    temp_sensors: 62C 88C
nvme2:
    crit_flags: TOO_HOT
    comp_temp: 74C                      <---- THROTTLED
    warn_temp_time: 12 min (0.20 hrs)
    crit_temp_time: 0 min (0.00 hrs)
    temp_sensors: none
nvme3:
    comp_temp: 81C                      <---- TOO HOT BUT NOT FLAGGED
    warn_temp_time: 0 min (0.00 hrs)          OR THROTTLED
    crit_temp_time: 0 min (0.00 hrs)
    temp_sensors: 81C
nvme4:
    comp_temp: 72C                      <---- NICE SLOW TEMP RAMP
    warn_temp_time: 0 min (0.00 hrs)          (NO RD PERF IMPACT)
    crit_temp_time: 0 min (0.00 hrs)
    temp_sensors: none
nvme5:
    comp_temp: 100C                     <---- TOO HOT, SEE NOTES
    warn_temp_time: 0 min (0.00 hrs)
    crit_temp_time: 0 min (0.00 hrs)
    temp_sensors: 105C
nvme6:
    comp_temp: 77C                      <---- NICE SLOW TEMP RAMP
    warn_temp_time: 0 min (0.00 hrs)          (NO RD PERF IMPACT)
    crit_temp_time: 0 min (0.00 hrs)
    temp_sensors: 77C

In all fairness, the Intel SSD (nvme2) is doing the right thing here by
flagging the overheat and throttling.  The Toshiba SSD, on the other hand,
is probably getting too hot, and that's a negative for the Toshiba.  The
Samsung 951 settles out at 76C and maintains full performance, but the
960 EVO is the most impressive here, staying at a cool 62C.  The
MyDigitalSSD definitely gets too hot.  It does throttle, but at too hot a
temperature; see my notes there.

These are the numbers reported by the NVMe controller itself, i.e. the
firmware for each SSD is doing the reporting, so they may not be entirely
accurate on a relative basis.  That said, these devices (except the EVO)
could probably do with heat spreaders... only the Plextor comes with one.

ALL ABOARD TEST

This test shows five NVMe devices running at the same time.  I only have
5 PCIe slots, so nvme5 and nvme6 are not included.  I run two random-read
workloads, 64 threads per device (320 threads total), the first using a
32KB block size and the second using a 128KB block size.

Also included is a snapshot of the systat -pv 1 output showing the system
90% idle during the test (320 threads running in total).  This particular
test shows an aggregate read bandwidth of around 6.0 GBytes/sec (32KB) and
6.5 GBytes/sec (128KB).
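Here is a rough sketch of how such a concurrent run can be launched,
reusing the randread argument pattern from the earlier 32KB, 64-thread
example.  The partition letters, the /bin/sh loop, and monitoring with
iostat are my assumptions about how the run is driven; the 128KB pass
simply substitutes 131072 for the block size.

    # /bin/sh: start a 64-thread, 32KB random-read load on each of the
    # five devices, then watch per-device bandwidth with iostat.
    for d in nvme0s1b nvme1s1b nvme2s1b nvme3s1b nvme4s1b; do
        randread /dev/$d 32768 1 64 &
    done
    iostat nvme0 nvme1 nvme2 nvme3 nvme4 1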
      tty            nvme0              nvme1              nvme2              nvme3              nvme4            cpu
 tin tout  KB/t   tps    MB/s  KB/t   tps    MB/s  KB/t   tps    MB/s  KB/t   tps    MB/s  KB/t   tps    MB/s  us ni sy in id
   0   18 32.00 53205 1662.64 32.00 33165 1036.41 32.00 22626  707.07 32.00 36630 1144.69 32.00 48184 1505.75   0  0  6  1 92
   0   11 32.00 53160 1661.25 32.00 33142 1035.69 32.00 22815  712.96 32.00 36476 1139.87 32.00 47728 1491.54   0  0  7  1 91
   0   11 32.00 53208 1662.74 32.00 33167 1036.47 32.00 22810  712.80 32.00 36286 1133.93 32.00 47578 1486.81   0  0  8  1 91
   0   11 32.00 52891 1652.85 32.00 32917 1028.66 32.00 22674  708.56 32.00 36120 1128.69 32.00 47718 1491.20   0  0  8  1 91
   0   11 32.00 53175 1661.71 32.00 33022 1031.94 32.00 23000  718.74 32.00 36409 1137.78 32.00 47820 1494.41   0  0  7  1 91
   0   11 32.00 53290 1665.31 32.00 33017 1031.78 32.00 22576  705.46 32.00 35995 1124.84 32.00 47825 1494.54   0  0  8  1 91

      tty            nvme0               nvme1               nvme2               nvme3               nvme4            cpu
 tin tout   KB/t   tps    MB/s   KB/t   tps    MB/s   KB/t   tps    MB/s   KB/t   tps    MB/s   KB/t   tps    MB/s  us ni sy in id
   0   11 128.00 11182 1397.75 128.00  9321 1165.15 128.00  7611  951.41 128.00 11754 1469.24 128.00 11840 1479.99   0  0  7  0 93
   0   11 128.00 11252 1406.50 128.00  9270 1158.77 128.00  7619  952.41 128.00 12033 1504.11 128.00 11819 1477.36   0  0  7  0 93
   0   11 128.00 11178 1397.25 128.00  9306 1163.27 128.00  7607  950.92 128.00 11877 1484.62 128.00 11748 1468.50   0  0  7  0 92
   0   11 128.00 11260 1407.50 128.00  9280 1160.03 128.00  7603  950.42 128.00 11756 1469.50 128.00 11873 1484.12   0  0  7  0 92
   0   11 128.00 11257 1407.12 128.00  9282 1160.27 128.00  7612  951.54 128.00 11661 1457.62 128.00 11800 1474.99   0  0  7  0 93
   0   11 128.00 11180 1397.51 128.00  9304 1163.03 128.00  7553  944.17 128.00 11990 1498.75 128.00 11704 1463.01   0  0  7  0 92

        timer     ipi  extint  user%  sys% intr% idle% smpcol label           sample_pc
total           70921  211106                             236
cpu0      289    4429      16    0.0   4.6   2.4  93.0    158 Xnvqlk          cpu_mmw_pause_int+31
cpu1      282       1   10200    0.8   8.5   1.5  89.2      0                 cpu_mmw_pause_int+31
cpu2      282       3   16801    0.8   8.5   0.8  90.0      3 Xgetpbuf_mem    cpu_mmw_pause_int+31
cpu3      282      15   13398    0.0  13.8   1.5  84.6      2 Xnvqlk          cpu_mmw_pause_int+31
cpu4      282    1890   10587    0.0  13.1   1.5  85.4      4 pool            cpu_mmw_pause_int+31
cpu5      282     680   11741    1.5   6.9   2.3  89.2      2 Xrelpbuf        cpu_mmw_pause_int+31
cpu6      282       2   13223    0.0  12.3   1.5  86.2      1 Xnvqlk          cpu_mmw_pause_int+31
cpu7      282       7   15479    0.0  17.7   1.5  80.8      5 Xnvqlk          std_copyout+73
cpu8      282       2   10935    0.0  11.5   1.5  86.9      1 Xnvqlk          cpu_mmw_pause_int+31
cpu9      282    4479       0    0.8   3.1   0.0  96.2      5 Xnvqlk          cpu_mmw_pause_int+31
cpu10     282    3851    7417    0.0   6.9   0.0  93.1      1 Xrelpbuf        cpu_mmw_pause_int+31
cpu11     282    5394     776    0.0   7.7   0.8  91.5      3 Xnvqlk          cpu_mmw_pause_int+31
cpu12     282    5245    6795    0.0   9.2   0.8  90.0      2 Xrelpbuf        lwkt_getalltokens+952
cpu13     283    3746    4408    0.0   6.9   0.0  93.1      1 Xnvqlk          cpu_mmw_pause_int+31
cpu14     283    4931    8985    0.8  10.0   1.5  87.7      1 Xnvqlk          cpu_mmw_pause_int+31
cpu15     282    1822    9301    0.0   7.7   0.0  92.3      1 Xrelpbuf        cpu_mmw_pause_int+31
cpu16     282    2166   12486    0.0   9.2   3.1  87.7      7 Xnvqlk          cpu_mmw_pause_int+31
cpu17     282    3051    3906    0.0   6.2   1.5  92.3      0                 cpu_mmw_pause_int+31
cpu18     282    1680    4639    0.8   4.6   1.5  93.1      0                 cpu_mmw_pause_int+31
cpu19     282     682    6492    0.0   5.4   0.0  94.6      2 Xnvqlk          cpu_mmw_pause_int+31
cpu20     282    1540    6451    0.0  13.1   0.8  86.2      1 XUSB device mut cpu_mmw_pause_int+31
cpu21     282    1577    3984    0.0   3.8   0.0  96.2      1 Xrelpbuf        cpu_mmw_pause_int+31
cpu22     282     851    3599    0.0   5.4   0.8  93.8      0                 cpu_mmw_pause_int+31
cpu23     282    1740    4571    0.0   6.9   0.8  92.3      3 Xrelpbuf        cpu_mmw_pause_int+31
cpu24     282     558    5739    0.0   3.1   1.5  95.4      2 Xrelpbuf        cpu_mmw_pause_int+31
cpu25     282     610    4364    0.0   4.6   0.0  95.4      0                 cpu_mmw_pause_int+31
cpu26     283    2733    2958    0.8   6.2   0.0  93.1      0                 cpu_mmw_pause_int+31
cpu27     283    3416     968    0.0   7.7   0.0  92.3     13 Xnvqlk          cpu_mmw_pause_int+31
cpu28     282    3457    4011    0.0   5.4   0.8  93.8      0                 cpu_mmw_pause_int+31
cpu29     282    2929    2905    0.0   4.6   0.8  94.6      2 Xnvqlk          cpu_mmw_pause_int+31
cpu30     282    4435    1955    1.5   8.5   0.0  90.0     14 tty_token       cpu_mmw_pause_int+31
cpu31     284    2999    2016    0.0   4.6   1.5  93.8      1 pool            cpu_mmw_pause_int+31