Measuring Disk IO Performance on MacOS

Author: seven September 19, 2017

Over time, and after numerous hardware upgrades around the office, I collected a vast number of 2.5″ HDDs in my “hardware junk” box. The other day I noticed two Kingston SSDNow V200 128GB SSDs just sitting there doing nothing, so I decided to make them usable again. I have a really BAD track record with non-SSD 2.5″ external travel disks: 99% of them broke or started showing serious problems after just one year of use (travelling with them alongside the notebook). I wanted to see how an SSD would behave under the same conditions.

I visited my local hardware store to get a USB3 2.5″ HDD enclosure. Being a geek, I had done my homework and decided on a no-name enclosure for 15 EUR with semi-rubber protection.
The nice lady at the counter suggested that, instead of the 15 EUR one, I get a 13 EUR no-name enclosure since “it was better”.

Sceptic that I am, I bought both and decided to run a test and prove her wrong. The one with the higher price had to be better. :)

After fitting the disks into the enclosures, the first issue I stumbled upon was the lack of a disk benchmarking tool on MacOS. On Windows I had used HD Tune for ages and was happy with it. On MacOS, however, Blackmagic Disk Speed Test from the Mac App Store did not inspire confidence in me (black magic, c'mon?), nor did the 11-year-old Xbench or the jDiskMark beta (written in Java).

In Ubuntu/Debian/RHEL land I had benchmarked device IO before and had a good experience with FIO. FIO is a popular tool for measuring IOPS on Linux servers.


Do not make the mistake of benchmarking the /dev/disk device (or of pointing dd at it, for example).
On MacOS you should always use the /dev/rdisk device.

/dev/disk: buffered access, used for kernel filesystem calls, with I/O broken into 4 KB chunks. It takes the longer, more expensive route.
/dev/rdisk: “raw” in the BSD sense, forcing block-aligned I/O. These devices are closer to the physical disk than the buffered ones.
If you do a read or write larger than one sector to /dev/rdisk, that request is passed straight through. The lower layers may break it up (e.g. USB breaks it up into 128 KB pieces due to the maximum payload size in the USB protocol), but you can generally get bigger and more efficient I/Os. When streaming, as with dd, 128 KB to 1 MB are pretty good sizes to get near-optimal performance on current non-RAID hardware. (source)
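Since diskutil reports the buffered node names (/dev/diskN), it is easy to paste the wrong one into a benchmark command. A minimal sketch of a helper that converts a buffered node name into its raw counterpart is shown here; raw_device is a hypothetical name chosen for illustration, not something shipped with MacOS or fio.

```shell
#!/bin/sh
# raw_device: hypothetical helper, not part of MacOS or fio.
# Turns a buffered node name (/dev/diskN) into the raw one (/dev/rdiskN).
raw_device() {
    case "$1" in
        /dev/rdisk*) printf '%s\n' "$1" ;;                # already a raw node
        /dev/disk*)  printf '/dev/r%s\n' "${1#/dev/}" ;;  # disk2 -> rdisk2
        *)           echo "not a disk node: $1" >&2; return 1 ;;
    esac
}

raw_device /dev/disk2     # prints /dev/rdisk2
raw_device /dev/rdisk3    # prints /dev/rdisk3
```

The function only rewrites the name; it does not check that the device exists, so diskutil list is still the place to confirm the disk number.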

1. Install FIO

brew install fio

2. Find the correct disk number

diskutil list

Everything from this step forward can and will delete data on your disk, so BE VERY CAREFUL about which disk you use. You have been warned.

3. Precondition the SSD
We precondition each drive the same way before each measurement, bringing it to the same performance state so that the test process is deterministic.

sudo dd if=/dev/zero of=/dev/rdisk2 bs=1m
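In my experience dd tends to refuse to write to a raw device with "Resource busy" while volumes on that disk are still mounted, so unmounting first is usually needed. The sketch below only prints the two commands for a given disk number so they can be reviewed before being run by hand; precondition_plan is a hypothetical helper name, and the disk number 2 is just the one used in this post.

```shell
#!/bin/sh
# precondition_plan: hypothetical helper that PRINTS the commands, it runs nothing.
# Argument: the disk number reported by "diskutil list".
precondition_plan() {
    n="$1"
    case "$n" in
        ''|*[!0-9]*) echo "usage: precondition_plan DISK_NUMBER" >&2; return 1 ;;
    esac
    # Unmount every volume on the disk so the raw device is not busy.
    echo "diskutil unmountDisk /dev/disk$n"
    # Overwrite the whole disk with zeros through the raw device (destroys all data).
    echo "sudo dd if=/dev/zero of=/dev/rdisk$n bs=1m"
}

precondition_plan 2
```

Printing rather than executing is a deliberate choice here: a typo in the disk number would otherwise wipe the wrong drive.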

4. Running tests

Random read/write performance

fio --randrepeat=1 --ioengine=posixaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

Random read performance

fio --randrepeat=1 --ioengine=posixaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randread

Random write performance

fio --randrepeat=1 --ioengine=posixaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randwrite

(On MacOS we must use the posixaio ioengine. If you are running a different flavour of Unix, just replace --ioengine=posixaio with, for example, --ioengine=libaio on Ubuntu. Note that --filename=test makes fio create a 4 GB file named test in the current directory, so these commands measure whichever disk that directory lives on.)
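If the same benchmark script has to run on both MacOS and Linux, the engine can be picked from the output of uname. This is a small sketch under the assumption that posixaio is an acceptable fallback on other systems; pick_ioengine is a hypothetical helper name, and the fio command is only echoed, not executed.

```shell
#!/bin/sh
# pick_ioengine: hypothetical helper mapping an OS name to a fio ioengine.
# With no argument it uses the current system's "uname -s".
pick_ioengine() {
    case "${1:-$(uname -s)}" in
        Darwin) echo posixaio ;;   # MacOS, as in this post
        Linux)  echo libaio ;;     # e.g. Ubuntu
        *)      echo posixaio ;;   # assumption: POSIX AIO as a generic fallback
    esac
}

pick_ioengine Darwin   # prints posixaio
pick_ioengine Linux    # prints libaio

# Show (without running) the random read test for the current system.
echo "fio --randrepeat=1 --ioengine=$(pick_ioengine) --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randread"
```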

5. The results

The lady at the store was right! Using the same SSDs, the cheaper enclosure gave better results. It was faster by almost 35%.

Tray                   Read (MiB/s)   Write (MiB/s)   Read IOPS   Write IOPS
ASMT (/dev/disk)       10.9           11.9            86          94
ASMT (/dev/rdisk)      69.7           72.8            552         576
PATRIOT (/dev/rdisk)   92.4           93.5            738         747
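The bandwidth and IOPS columns are two views of the same measurement: bandwidth is roughly IOPS multiplied by the block size. As a quick sanity check, and assuming the 128 KiB block size (--bs=128k) used in the fio commands for these runs, the read IOPS values can be converted back to MiB/s. The helper name iops_to_mib is hypothetical.

```shell
#!/bin/sh
# iops_to_mib: hypothetical helper. Converts IOPS at a given block size (KiB)
# into bandwidth in MiB/s: IOPS * KiB / 1024.
iops_to_mib() {
    awk -v iops="$1" -v kib="$2" 'BEGIN { printf "%.2f\n", iops * kib / 1024 }'
}

iops_to_mib 86 128     # prints 10.75  (reported: 10.9 MiB/s)
iops_to_mib 552 128    # prints 69.00  (reported: 69.7 MiB/s)
iops_to_mib 738 128    # prints 92.25  (reported: 92.4 MiB/s)
```

The small gaps are expected, since the figures come from fio's live status line, where bandwidth and IOPS are sampled and rounded separately.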

If you are interested in the values I got, here they are.

The first set of benchmarks (done on the buffered /dev/disk device) revealed really poor performance [r=10.9MiB/s,w=11.9MiB/s][r=86,w=94 IOPS].

sudo fio --filename=/dev/disk2 --direct=1 --rw=randrw --rwmixwrite=50 --refill_buffers --norandommap --randrepeat=0 --ioengine=posixaio --bs=128k --rate_iops=1280  --iodepth=16 --numjobs=1 --time_based --runtime=86400 --group_reporting --name=benchtest
fio-2.18
Starting 1 thread
^Cbs: 1 (f=1), 0-2560 IOPS: [m(1)][0.5%][r=10.9MiB/s,w=11.9MiB/s][r=86,w=94 IOPS][eta 23h:52m:35s]
fio: terminating on signal 2

benchtest: (groupid=0, jobs=1): err= 0: pid=3075: Fri Mar 24 20:14:55 2017
   read: IOPS=94, BW=11.8MiB/s (12.4MB/s)(5234MiB/445379msec)
    slat (usec): min=0, max=303, avg= 0.40, stdev= 2.28
    clat (msec): min=47, max=228, avg=100.40, stdev=14.81
     lat (msec): min=47, max=228, avg=100.40, stdev=14.81
    clat percentiles (msec):
     |  1.00th=[   74],  5.00th=[   82], 10.00th=[   85], 20.00th=[   90],
     | 30.00th=[   93], 40.00th=[   96], 50.00th=[   98], 60.00th=[  102],
     | 70.00th=[  105], 80.00th=[  111], 90.00th=[  119], 95.00th=[  127],
     | 99.00th=[  151], 99.50th=[  161], 99.90th=[  184], 99.95th=[  192],
     | 99.99th=[  208]
  write: IOPS=94, BW=11.8MiB/s (12.4MB/s)(5237MiB/445379msec)
    slat (usec): min=0, max=296, avg= 0.53, stdev= 2.81
    clat (msec): min=25, max=177, avg=69.66, stdev= 9.52
     lat (msec): min=25, max=177, avg=69.66, stdev= 9.52
    clat percentiles (msec):
     |  1.00th=[   51],  5.00th=[   58], 10.00th=[   61], 20.00th=[   63],
     | 30.00th=[   66], 40.00th=[   68], 50.00th=[   69], 60.00th=[   71],
     | 70.00th=[   73], 80.00th=[   76], 90.00th=[   80], 95.00th=[   86],
     | 99.00th=[  105], 99.50th=[  114], 99.90th=[  133], 99.95th=[  137],
     | 99.99th=[  151]
    lat (msec) : 50=0.44%, 100=76.81%, 250=22.76%
  cpu          : usr=0.46%, sys=0.41%, ctx=283619, majf=3, minf=6
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=50.0%, 16=50.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=98.3%, 8=1.7%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=41875,41894,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=11.8MiB/s (12.4MB/s), 11.8MiB/s-11.8MiB/s (12.4MB/s-12.4MB/s), io=5234MiB (5489MB), run=445379-445379msec
  WRITE: bw=11.8MiB/s (12.4MB/s), 11.8MiB/s-11.8MiB/s (12.4MB/s-12.4MB/s), io=5237MiB (5491MB), run=445379-445379msec

Repeating the benchmark on the same enclosure, but using the raw device (/dev/rdisk), revealed much nicer numbers: more than six times the throughput of the buffered device.
[m(1)][0.3%][r=69.7MiB/s,w=72.8MiB/s][r=552,w=576 IOPS][eta 23h:55m:54s]

sudo fio --filename=/dev/rdisk2 --direct=1 --rw=randrw --rwmixwrite=50 --refill_buffers --norandommap --randrepeat=0 --ioengine=posixaio --bs=128k --rate_iops=1280  --iodepth=16 --numjobs=1 --time_based --runtime=86400 --group_reporting --name=benchtest
fio-2.18
Starting 1 thread
^Cbs: 1 (f=1), 0-2560 IOPS: [m(1)][0.3%][r=69.7MiB/s,w=72.8MiB/s][r=552,w=576 IOPS][eta 23h:55m:54s]
fio: terminating on signal 2

benchtest: (groupid=0, jobs=1): err= 0: pid=3075: Fri Mar 24 21:13:39 2017
   read: IOPS=538, BW=67.3MiB/s (70.6MB/s)(16.2GiB/245308msec)
    slat (usec): min=0, max=47, avg= 0.45, stdev= 1.02
    clat (msec): min=8, max=45, avg=15.05, stdev= 2.70
     lat (msec): min=8, max=45, avg=15.05, stdev= 2.70
    clat percentiles (usec):
     |  1.00th=[11200],  5.00th=[12224], 10.00th=[12736], 20.00th=[13376],
     | 30.00th=[13888], 40.00th=[14400], 50.00th=[14784], 60.00th=[15168],
     | 70.00th=[15680], 80.00th=[16320], 90.00th=[17280], 95.00th=[18048],
     | 99.00th=[23936], 99.50th=[36608], 99.90th=[39680], 99.95th=[40192],
     | 99.99th=[42240]
  write: IOPS=538, BW=67.4MiB/s (70.7MB/s)(16.2GiB/245308msec)
    slat (usec): min=0, max=65, avg= 0.46, stdev= 0.67
    clat (msec): min=6, max=45, avg=14.56, stdev= 2.71
     lat (msec): min=6, max=45, avg=14.57, stdev= 2.71
    clat percentiles (usec):
     |  1.00th=[10560],  5.00th=[11712], 10.00th=[12224], 20.00th=[12864],
     | 30.00th=[13376], 40.00th=[13888], 50.00th=[14272], 60.00th=[14784],
     | 70.00th=[15168], 80.00th=[15808], 90.00th=[16768], 95.00th=[17536],
     | 99.00th=[23680], 99.50th=[36096], 99.90th=[39168], 99.95th=[40192],
     | 99.99th=[42240]
    lat (msec) : 10=0.22%, 20=98.34%, 50=1.44%
  cpu          : usr=3.48%, sys=2.40%, ctx=531264, majf=3, minf=5
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=50.0%, 16=50.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=97.9%, 8=1.8%, 16=0.3%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=132027,132160,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=67.3MiB/s (70.6MB/s), 67.3MiB/s-67.3MiB/s (70.6MB/s-70.6MB/s), io=16.2GiB (17.4GB), run=245308-245308msec
  WRITE: bw=67.4MiB/s (70.7MB/s), 67.4MiB/s-67.4MiB/s (70.7MB/s-70.7MB/s), io=16.2GiB (17.4GB), run=245308-245308msec

Finally, the second HDD tray I benchmarked gave the best results, almost 35% faster than the first enclosure.
[m(1)][0.5%][r=92.4MiB/s,w=93.5MiB/s][r=738,w=747 IOPS][eta 23h:52m:50s]

sudo fio --filename=/dev/rdisk3 --direct=1 --rw=randrw --rwmixwrite=50 --refill_buffers --norandommap --randrepeat=0 --ioengine=posixaio --bs=128k --rate_iops=1280  --iodepth=16 --numjobs=1 --time_based --runtime=86400 --group_reporting --name=benchtest
fio-2.18
Starting 1 thread
^Cbs: 1 (f=1), 0-2560 IOPS: [m(1)][0.5%][r=92.4MiB/s,w=93.5MiB/s][r=738,w=747 IOPS][eta 23h:52m:50s]
fio: terminating on signal 2

benchtest: (groupid=0, jobs=1): err= 0: pid=3075: Fri Mar 24 20:37:26 2017
   read: IOPS=761, BW=95.2MiB/s (99.8MB/s)(39.2GiB/430198msec)
    slat (usec): min=0, max=310, avg= 0.55, stdev= 2.23
    clat (msec): min=1, max=48, avg=11.43, stdev= 2.84
     lat (msec): min=1, max=48, avg=11.43, stdev= 2.84
    clat percentiles (usec):
     |  1.00th=[ 6880],  5.00th=[ 8256], 10.00th=[ 8896], 20.00th=[ 9536],
     | 30.00th=[10048], 40.00th=[10560], 50.00th=[11072], 60.00th=[11584],
     | 70.00th=[12224], 80.00th=[12864], 90.00th=[14016], 95.00th=[15296],
     | 99.00th=[22912], 99.50th=[28800], 99.90th=[35584], 99.95th=[37120],
     | 99.99th=[40704]
  write: IOPS=762, BW=95.3MiB/s (99.9MB/s)(40.3GiB/430198msec)
    slat (usec): min=0, max=767, avg= 0.96, stdev= 3.58
    clat (usec): min=492, max=45310, avg=9422.63, stdev=2869.71
     lat (usec): min=493, max=45311, avg=9423.59, stdev=2869.68
    clat percentiles (usec):
     |  1.00th=[ 5024],  5.00th=[ 6240], 10.00th=[ 6944], 20.00th=[ 7712],
     | 30.00th=[ 8256], 40.00th=[ 8640], 50.00th=[ 9024], 60.00th=[ 9536],
     | 70.00th=[10048], 80.00th=[10688], 90.00th=[11712], 95.00th=[13120],
     | 99.00th=[21888], 99.50th=[27264], 99.90th=[35072], 99.95th=[37120],
     | 99.99th=[40704]
    lat (usec) : 500=0.01%
    lat (msec) : 2=0.01%, 4=0.08%, 10=49.48%, 20=49.08%, 50=1.35%
  cpu          : usr=4.59%, sys=2.86%, ctx=1256049, majf=0, minf=11
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=57.4%, 16=42.6%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=98.2%, 8=1.8%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=327551,327861,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=95.2MiB/s (99.8MB/s), 95.2MiB/s-95.2MiB/s (99.8MB/s-99.8MB/s), io=39.2GiB (42.1GB), run=430198-430198msec
  WRITE: bw=95.3MiB/s (99.9MB/s), 95.3MiB/s-95.3MiB/s (99.9MB/s-99.9MB/s), io=40.3GiB (42.1GB), run=430198-430198msec
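The comparisons quoted in this post can be rechecked from the live status lines of the three runs. The sketch below computes how much faster one bandwidth figure is than another, in percent; percent_faster is a hypothetical helper name, and the inputs are the MiB/s values from the status lines above.

```shell
#!/bin/sh
# percent_faster: hypothetical helper. Prints how much faster the first
# bandwidth is than the second, as a percentage: (fast / slow - 1) * 100.
percent_faster() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", (fast / slow - 1) * 100 }'
}

percent_faster 69.7 10.9   # raw vs buffered device, read: prints 539.4
percent_faster 92.4 69.7   # second tray vs first, read:   prints 32.6
percent_faster 93.5 72.8   # second tray vs first, write:  prints 28.4
```

These use the momentary status-line values; the averaged figures in each run's summary would give slightly different percentages.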

Conclusion
fio is a pretty robust utility for IO testing. Beware of the quality of the onboard electronics when buying HDD trays: trays in the same price range can vary by 15-30% in speed.

Author
seven
CEO/CTO at Nivas®
Neven Jacmenović has been passionately involved with computers since the late 80s, the age of Atari and Commodore Amiga. As one of the internet industry pioneers in Croatia, he has been involved since the 90s in the making of many award-winning, innovative and successful online projects. He is an experienced full stack web developer, analyst and system engineer. In his spare time, Neven is transforming his retro-futuristic passion into various golang, Adobe Flash and JavaScript/WebGL projects.
