Saturday, May 20, 2006
ZFS I/O reordering benchmark
I've posted about the write performance of ZFS compared to UFS, ext3fs, and reiserfs: different operating systems, but identical hardware (the same server) and the same disk cylinders. That wasn't the extent of the benchmarking I performed, however. I'll detail another test here, although it's less a benchmark than a demonstration of the I/O reordering that ZFS does to give reads preference over writes.
The basic idea is to read a file while there's a heavy write load to the filesystem. The write load I was applying to the filesystem was the same one I detailed in this earlier entry. Again, I started out with a load that was serious overkill for the filesystems and then reran the test with a more reasonable workload.
I first created a data file to be read: 2GB for the overkill load, 512MB for the reasonable load. (The sizes were chosen to guarantee that the read of the file always finished in less time than the write load took to run. I could have generated a longer write load so that the two tests could reasonably be compared with each other, but the tests were meant to be compared across filesystems under a similar write load, not across loads for the same filesystem.)
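(Creating the files themselves is the easy part; something like the following dd invocations would do, with the paths here being nothing more than placeholders:)

    # Create the files that will be read back later (paths are placeholders)
    dd if=/dev/urandom of=/test/readfile-2g bs=1024k count=2048     # 2GB file for the overkill run
    dd if=/dev/urandom of=/test/readfile-512m bs=1024k count=512    # 512MB file for the reasonable run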
Instead of using cat or dd to read the file, I used md5sum. Why? Well, for one, 'cat file > /dev/null' has interesting behavior under Solaris: the cat utility actually mmaps the file and then relies on demand paging to read it as needed, so 'cat file > /dev/null' essentially translates into a no-op. 'cat file | cat > /dev/null' achieves the goal, but it seems silly. I could have used dd, but I chose md5sum in order to slightly handicap ZFS: given that ZFS does a fair amount of computation itself (checksums and whatnot), I wanted to give the CPU some extra work that would intentionally interfere with it. It also makes for a more realistic test. In general, you don't read data from disk just to throw it away; you do something with it.
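Roughly, each loaded run boiled down to something like this; treat it as a sketch, since 'write-load.sh' is just a stand-in name for the write-load generator from the earlier entry and the path is a placeholder:

    # Sketch of one loaded run; write-load.sh is a placeholder for the
    # write-load generator from the earlier post.
    ./write-load.sh &                 # start the heavy write load in the background
    sleep 10                          # arbitrary pause to let the writes get going
    time md5sum /test/readfile-2g     # timed read of the test file while writes continue
    wait                              # let the write load finish before the next run

The unloaded numbers in the tables below are just the same timed md5sum with no write load running.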
To quote from my earlier post, for completeness: the server I was using was a 2 x 2.8GHz Dell 1850 with a single 73GB SCSI disk and 2GB RAM. I ran the tests using both UFS and ZFS under Solaris x86, and both ext3fs and reiserfs under Linux. To avoid performance differences between the inside and the outside of the disk, I used the same cylinders on the disk for all tests (plus or minus a cylinder or two).
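(If you want to double-check that the slices and partitions really do sit on the same cylinders, the partition tables will tell you; the device names here are just examples:)

    prtvtoc /dev/rdsk/c0t0d0s2    # Solaris: print the VTOC, showing each slice's start and size
    fdisk -l /dev/sda             # Linux: list the partition table for the disk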
So here are the results for the overkill write load:
Filesystem | Time (min:sec, unloaded) | Time (min:sec, loaded) | Ratio loaded:unloaded |
UFS | 0:50.2 | 5:50 | 8.2:1 |
ZFS | 0:31.8 | 0:36.0 | 1.13:1 |
ext3fs | 0:36.3 | 54:21 | 89.9:1 |
reiserfs | 0:33.4 | 69:45 | 124.6:1 |
Wow. So while ZFS performed the read under load in close to the same time it took with no load, ext3fs took 90 times as long, and reiserfs took 125 times as long. Again, all I can say is, "Wow." But I also have to emphasize that this write load was too heavy for the filesystems. (Although one could argue that the write load wasn't too heavy, given that ZFS could handle it gracefully. But it's certainly not a real-world workload, so while the data is interesting, it would be hard to argue that it's useful.)
And here are the results for the reasonable write load (note that I didn't run this test for UFS, mostly due to time constraints, and remember that this one used a 512MB file, not 2GB):
Filesystem | Time (min:sec, unloaded) | Time (min:sec, loaded) | Ratio loaded:unloaded |
ZFS | 0:09.0 | 0:10.3 | 1.14:1 |
ext3fs | 0:08.8 | 5:27 | 37.2:1 |
reiserfs | 0:08.7 | 3:50 | 26.4:1 |
Okay, so these numbers aren't quite as ludicrous as the first set, but they're still impressive. The ZFS engineers appear to have done a very good job.
Comments:
The Linux numbers are incomprehensible unless you quote the kernel version and which IO scheduler was used.
Linux performs the read-vs-write balancing at the IO scheduler level. It matters. A lot.
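(For reference, on a 2.6 kernel you can check and switch the scheduler per disk at runtime; the bracketed name is the one currently in use, and sda is just an example device:)

    cat /sys/block/sda/queue/scheduler               # show available schedulers; the active one is in brackets
    echo deadline > /sys/block/sda/queue/scheduler   # switch this disk to the deadline scheduler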