Blazing Fast Namespace

I wrote a few months ago about how one might leverage SSDs/flash. I found three general categories – flash as cache, flash as storage, and other innovative techniques. I’m happy to say that today, February 11, Isilon is launching a unique set of products that leverage SSDs:

Isilon IQ 10000X-SSD / Isilon IQ 5000S-SSD / Isilon IQ 32000X-SSD

These new products all feature two quad-core processors, 16 GB of DRAM, and 10GbE per node. The 10000X-SSD has 10 TB of SATA combined with two 100 GB SSDs, the 5000S-SSD has 5 TB of SAS combined with 100 GB of SSDs, and the 32000X-SSD has 32 TB of SATA combined with four 100 GB SSDs.

Many vendors are using some form of flash for cache – read or write. The latency and performance characteristics of flash, while much better than disk, are still an order of magnitude away from DRAM. An Isilon cluster uses InfiniBand-connected, globally coherent DRAM as its read caching layer, so SSDs used as cache would be both unnecessary and sub-optimal (a typical 10-node cluster has 160 GB of DRAM cache). Similarly, OneFS uses high-speed battery-backed NVRAM, which is also InfiniBand-connected, allowing us to achieve very high write rates.
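The capacity figures quoted so far can be checked with a few lines of arithmetic. The sketch below is only an illustration: the `NodeSpec` class and the helper names are made up for this post, the per-node numbers are the ones listed above, and it assumes decimal units (1 TB = 1000 GB).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures quoted in the post (hypothetical container type)."""
    name: str
    disk_tb: int       # SATA or SAS capacity per node, in TB
    ssd_gb: int        # total SSD capacity per node, in GB
    dram_gb: int = 16  # every model ships with 16 GB of DRAM


NODES = [
    NodeSpec("10000X-SSD", disk_tb=10, ssd_gb=2 * 100),
    NodeSpec("5000S-SSD", disk_tb=5, ssd_gb=100),
    NodeSpec("32000X-SSD", disk_tb=32, ssd_gb=4 * 100),
]


def cluster_dram_gb(spec, node_count):
    """Aggregate DRAM across a cluster of identical nodes."""
    return spec.dram_gb * node_count


def ssd_fraction(spec):
    """SSD capacity as a fraction of disk capacity (assumes 1 TB = 1000 GB)."""
    return spec.ssd_gb / (spec.disk_tb * 1000)


if __name__ == "__main__":
    for spec in NODES:
        print(f"{spec.name}: {cluster_dram_gb(spec, 10)} GB DRAM in a 10-node "
              f"cluster, SSD is {ssd_fraction(spec):.2%} of disk capacity")
```

In every model the SSD capacity is a small fraction of the disk capacity, which fits using the SSDs for something compact and latency-sensitive rather than as bulk storage.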

Presenting flash for storage is a no-brainer once our customers demand it, but before that could occur we were presented with an interesting challenge of a different nature: speeding up one of the most latency-bound, cache-resistant workflows, namespace operations. Beyond that, every file system operation accesses internal metadata, and unless that metadata is cached, each access adds a latency penalty. So SSDs not only help namespace operations, they also speed up every other kind of un-cached I/O, especially operations that touch several metadata blocks, such as I/O at a random offset in a file.
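To make the latency penalty concrete, here is a rough back-of-the-envelope model. The two latency constants are assumed round numbers chosen purely for illustration, not measurements of any Isilon product, and the function names are hypothetical. The model also assumes the metadata blocks are read one after another, each depending on the previous one.

```python
# Illustrative per-access latencies in milliseconds. These are assumed
# round numbers for the sketch, not measurements of any real system.
ASSUMED_DISK_MS = 8.0
ASSUMED_SSD_MS = 0.1


def uncached_latency_ms(metadata_blocks, metadata_ms, data_ms):
    """Latency of one un-cached operation that reads `metadata_blocks`
    metadata blocks sequentially and then one data block."""
    return metadata_blocks * metadata_ms + data_ms


def compare(metadata_blocks):
    """Return (everything on disk, metadata on SSD with data on disk)."""
    all_disk = uncached_latency_ms(
        metadata_blocks, ASSUMED_DISK_MS, ASSUMED_DISK_MS)
    meta_on_ssd = uncached_latency_ms(
        metadata_blocks, ASSUMED_SSD_MS, ASSUMED_DISK_MS)
    return all_disk, meta_on_ssd


if __name__ == "__main__":
    for blocks in (1, 3, 5):
        all_disk, meta_on_ssd = compare(blocks)
        print(f"{blocks} metadata block(s): {all_disk:.1f} ms all on disk, "
              f"{meta_on_ssd:.1f} ms with metadata on SSD")
```

Under these assumptions, the more metadata blocks an operation has to walk, the larger the share of its latency that comes from metadata, and the more it gains when that metadata sits on faster media. A pure namespace operation is the extreme case, since it reads no file data at all.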

Most storage systems don’t have the ability to place specific file system structures on different media, but OneFS has no LUNs, volumes, or RAID, meaning OneFS has complete control over the layout of every aspect of the file system. With the innovative work of our engineering team, we have placed all file system metadata on SSDs – inodes, directory blocks, file B-trees, etc. – eliminating the un-cached latency penalty for this data. The metadata is still mirrored across the nodes for protection (and performance), although a user can choose whether to have the secondary mirror on SSD or spinning media.
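The placement rule described above is simple enough to write down as a toy policy. This is a sketch only and says nothing about how OneFS is actually implemented: `BlockKind`, `Media`, and `choose_media` are hypothetical names, and treating every mirror after the first as "secondary" is an assumption.

```python
from enum import Enum


class BlockKind(Enum):
    INODE = "inode"
    DIRECTORY = "directory block"
    BTREE = "file B-tree block"
    DATA = "file data"


class Media(Enum):
    SSD = "ssd"
    DISK = "spinning disk"


# The structures the post lists as file system metadata.
METADATA_KINDS = {BlockKind.INODE, BlockKind.DIRECTORY, BlockKind.BTREE}


def choose_media(kind, mirror_index, secondary_on_ssd):
    """Pick the media for one mirror of one block.

    mirror_index 0 is the primary mirror; anything higher is treated as a
    secondary mirror (an assumption made for this sketch).
    """
    if kind not in METADATA_KINDS:
        return Media.DISK  # file data stays on SATA/SAS
    if mirror_index == 0 or secondary_on_ssd:
        return Media.SSD   # primary metadata mirror always lands on SSD
    return Media.DISK      # user chose spinning media for the secondary


if __name__ == "__main__":
    for kind in BlockKind:
        placement = [choose_media(kind, i, secondary_on_ssd=False).value
                     for i in range(2)]
        print(f"{kind.value}: {placement}")
```

The point of the sketch is that the decision is made per structure, not per LUN or volume, which is only possible when the file system itself controls the layout.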

How cool is that? When you have absolute control over where every structure lives in your file system and you are not beholden to the traditional abstractions, you can do some amazing stuff!

I can’t wait to see what comes next…

About the Author: Nick Kirsch