How WhipTail Handles OEM Variances in SSD Hardware; Interview Part III with WhipTail CTO Candelaria

Today I continue my interview series with WhipTail Technologies CTO, James Candelaria, whose company specializes in Solid State Drive (SSD) storage solutions. Last time, we discussed how WhipTail is optimizing SSD performance while minimizing the deficiencies of Multi-Level Cell (MLC) flash. Today, we look at the variances between SSD manufacturers' hardware and firmware and how WhipTail deals with those differences.

Ben:  WhipTail appears to have overcome many of the issues related to SSD technologies, from failure rates to making SSD arrays compliant with high availability expectations. But are you finding that you're still limited by the firmware on the SSDs themselves? Does WhipTail support only certain SSDs, or certain firmware revisions? How does that work?

James:  Absolutely, yes.  Just like other manufacturers, EMC or Network Appliance for example, we tightly control what media goes into our chassis. We have been shipping for over two and a half years, so we've got some experience with this hardware.

I've looked at Samsung, I've looked at SandForce, and others. Early in our maturity cycle, we shipped a lot of these. Later, we decided we were going to work with Intel, the most vertically integrated company manufacturing NAND out there.

Intel's firmware is extremely mature. Intel is also willing to work with a company like ours and give us access to its basic flowcharts showing what input X produces what output Y.

We did a huge amount of work to understand the underlying behavioral characteristics of the drives that we ship. We have vertically integrated our product through Intel hardware with pretty deep knowledge of what happens when we send it a block.  

Now that being said, we also have to be responsible on our end, and we never give a drive something unpredictable, because then, all bets are off.  You have to give the drive a consistent data pattern all the time and, if you deviate from that even once, all of a sudden the flash translation table may get fragmented and garbage collection on the drives may ensue.

Ben:  So is that why WhipTail ships on its own hardware and uses a standard RAID stack? It is being handled at the RAID stack level, not by WhipTail, right?


James:  That is right. When the RAID stack kicks back the I/O and sees that the I/O failed, we go in and replace the drive. That is one of the great things that is different about our product: we ship a field-serviceable array.

When you look at our box, you see a chassis with 24 drives in it. If the hardware fails, you go find what failed, pull it out, and put the new one in.  You can then go to our GUI and replace it, and WhipTail rebuilds the RAID set. As a matter of fact, we always ship with a hot spare.  By the time you get to the data center, the array has probably already been rebuilt on the hot spare.

Ben:  The life expectancy of SSD drives is kind of a big story right now. It's one of the main things driving up cost for these drives.

James:  I am not sure if you saw the new Intel announcement the other day, but Intel just started manufacturing 20 nanometer, 128 gigabit density NAND with a new page size of 16 kilobytes. Even though this NAND has the same 5,000-cycle endurance, the larger page size means more write amplification, and more write amplification equals less usable life for the customer. So this is not a good thing.
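To make the arithmetic behind that point concrete, here is a minimal sketch. It assumes the simplest possible model: every host write is programmed into whole pages with no coalescing by the controller. The function names, the 4-kilobyte host write, and the 8-kilobyte comparison page size are illustrative assumptions, not figures from the interview, and real drives will behave differently.

```python
import math


def write_amplification(host_write_bytes: int, page_bytes: int) -> float:
    """Worst-case write amplification for a single host write.

    Assumes the write is rounded up to whole NAND pages and nothing else
    is combined with it (a deliberately pessimistic, simplified model).
    """
    pages = math.ceil(host_write_bytes / page_bytes)
    return pages * page_bytes / host_write_bytes


def host_writes_before_wearout(capacity_bytes: int, pe_cycles: int, wa: float) -> float:
    """Total bytes the host can write before the rated cycles are used up.

    Assumes perfect wear leveling, so every cell sees the same cycle count.
    """
    return capacity_bytes * pe_cycles / wa


# Same 5,000-cycle endurance, same hypothetical 4 KB host writes,
# but two different page sizes.
wa_8k = write_amplification(4096, 8192)
wa_16k = write_amplification(4096, 16384)
print(wa_8k, wa_16k)
print(host_writes_before_wearout(1000, 5000, wa_8k))
print(host_writes_before_wearout(1000, 5000, wa_16k))
```

Under these assumptions, doubling the page size doubles the worst-case write amplification and halves the data the host can write before the media wears out, even though the rated cycle count has not changed.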

Ben:  It is the same thing when you are dealing with a server's file system. When building out a file system, you might set the node size to 2 kilobytes, and if you are dealing with a bunch of massive files, that is one thing. But with a block size of 16 kilobytes and a bunch of tiny files, you are really shooting yourself in the foot.

James:  It is a great parallel, actually; it is 100 percent true. It is essentially the same thing, except now you have media wear-out to worry about instead of just performance.
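The file system side of that parallel can be sketched the same way. The example below assumes a file system that allocates space in whole blocks only; the function name and the file counts and sizes are hypothetical and chosen purely for illustration.

```python
import math


def allocated_bytes(file_sizes, block_bytes):
    """Space consumed on disk when every file is rounded up to whole blocks."""
    return sum(math.ceil(size / block_bytes) * block_bytes for size in file_sizes)


# A hypothetical workload: 1,000 tiny files of 1 KB each.
tiny_files = [1024] * 1000
logical = sum(tiny_files)

for block in (2048, 16384):
    used = allocated_bytes(tiny_files, block)
    # Ratio of space consumed to data actually stored.
    print(block, used, used / logical)
```

With 2-kilobyte blocks, this workload consumes twice its logical size; with 16-kilobyte blocks, it consumes sixteen times its logical size. On flash, that same kind of rounding shows up as extra wear rather than only as wasted space.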

We realized a long time ago that the average NFS customer has very little tolerance for more storage cost. They are already spending 60 percent of their budgets on storage and capacity creep in the data center.

So we realized we had to make MLC flash work and had to make it work at a reasonable cost. The only way to do that was to dedicate engineering resources to it, rather than moving to SLC or even eMLC, so that we could leverage the buying market of "consumer grade" flash. We had to find a way to make it enterprise ready.

When we shipped two and a half years ago, no one was daring to ship an MLC appliance. The write amplification was out of control, and no one knew what to do with it. We have shown demonstrable results that you can use these devices if you manage them.

Ben:  What happens if Intel starts moving to larger erasure blocks? Does your product presuppose that it is better to have smaller erasure blocks than large ones?

James:  No. Our stack is completely tunable for whatever erase block size exists on the underlying media.
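The interview does not describe how WhipTail's stack does this, so the following is only a generic sketch of one common way to make a write path tunable for erase block size: buffer small writes and hand the media nothing but whole, erase-block-sized chunks. The class name `EraseBlockWriter`, the `sink` callback, and the zero padding on flush are all assumptions made for the example.

```python
class EraseBlockWriter:
    """Buffers small writes and emits only whole erase-block-sized chunks.

    `erase_block_bytes` is the tunable parameter; `sink` is any callable
    that accepts one bytes object per chunk (hypothetical stand-in for
    the device write path).
    """

    def __init__(self, erase_block_bytes, sink):
        if erase_block_bytes <= 0:
            raise ValueError("erase_block_bytes must be positive")
        self.erase_block_bytes = erase_block_bytes
        self.sink = sink
        self._buf = bytearray()

    def write(self, data):
        """Accept data of any size; emit every full chunk that is ready."""
        self._buf += data
        while len(self._buf) >= self.erase_block_bytes:
            chunk = bytes(self._buf[:self.erase_block_bytes])
            del self._buf[:self.erase_block_bytes]
            self.sink(chunk)

    def flush(self, pad=b"\x00"):
        """Pad any remainder to a full chunk so the sink never sees a partial one."""
        if self._buf:
            padding = self.erase_block_bytes - len(self._buf)
            self.sink(bytes(self._buf) + pad * padding)
            self._buf.clear()


# Usage with a tiny 8-byte "erase block" so the behavior is easy to follow.
chunks = []
writer = EraseBlockWriter(8, chunks.append)
writer.write(b"abc")       # too small, stays buffered
writer.write(b"defghij")   # completes one 8-byte chunk, 2 bytes remain
writer.flush()             # remainder is padded to a full chunk
print(chunks)
```

Because the chunk size is a constructor argument, the same code works unchanged if the underlying media moves to a larger erase block; real erase blocks are far larger than the 8 bytes used here.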

Ben:  How are you dealing with erase block size?  Does WhipTail run on its own hardware?

James:  The good news, and I will let you in on a little secret, is that we do not run hardware RAID. We cannot run hardware RAID. There is no RAID controller that keeps up. So we run software RAID.

So we have ultimate control over the RAID algorithm. Even EMC and Network Appliance run software RAID, because you have to. To get the right deterministic behavior, you cannot rely on any hard-programmed Application Specific Integrated Circuit or Field Programmable Gate Array that you do not have visibility into.

In the next installment in this series, I will talk with James about how WhipTail approaches software RAID.

In Part I of this interview series, James explained the SSD garbage collection problem and how WhipTail handles it.

In Part II of the series, James discussed how WhipTail is optimizing SSD performance while minimizing the deficiencies of MLC flash.


In Part V of this series, James and I discuss the hardware and software supported by WhipTail and why FCoE and iSCSI trump Infiniband in today's SSD deployments.


DCIG Disclaimer

    DCIG writes evaluations of products and services in the storage and electronically stored information (ESI) markets for consumers, public relations firms, business analysts and other interested companies. Our analysis is an informed inside look made possible through business blogging agreements.
