Deploying flash as either storage or memory almost inevitably increases application performance. However, to get the real ‘kick’ in performance that today’s transactional applications need, and which flash can provide, a more elegant approach to deploying flash is needed. Today I continue my discussion with Fusion-io Senior Director of Product Management, Brent Compton, who elaborates on the APIs in the Fusion ioMemory SDK that make this boost in transactional performance possible.
Ben: Direct access to flash RAM is definitely a foundational requirement. The ability to offload processing of key-value stores is another basic yet consistently needed functionality for developers. What else can we expect?
Brent: First, I’d like to distinguish between a ‘primitive’ and an ‘API’ in our SDK nomenclature. A primitive is a single, foundational interface call while an API is a family of related interface calls. For instance, the directKey-Value Store API is a family of related interface calls.
One of the primitives is Atomic Multi-block Writes. It takes advantage of one of the native properties of our ioMemory flash translation layer: the log-structured write mechanism. This mechanism provides a basic copy-on-write foundation.
As a simple illustration: if you write three blocks A, B and C and then come along and update A, ioMemory does not rewrite A in place the way a disk does. It writes the new version, A′, to the tail of the log. So both A and the newly written A′ exist on ioMemory.
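The A, B, C illustration above can be sketched as a toy append-only log. This is only a minimal model of the general log-structured idea, not Fusion-io's flash translation layer; the class and method names are hypothetical.

```python
class LogStructuredStore:
    """Toy append-only log: an update never overwrites old data in place."""

    def __init__(self):
        self.log = []     # every version ever written, in write order
        self.index = {}   # logical block name -> position of latest version

    def write(self, block, data):
        self.log.append((block, data))          # always append at the tail
        self.index[block] = len(self.log) - 1   # remap the block to the new version

    def read(self, block):
        return self.log[self.index[block]][1]


store = LogStructuredStore()
for name in ("A", "B", "C"):
    store.write(name, name + " v1")
store.write("A", "A v2")   # the update lands at the tail of the log

print(store.read("A"))     # the latest version of A
print(store.log)           # the original version of A is still in the log
```

After the update, the log holds four entries: the old A, B, C, and the new A at the tail. Only the index decides which version a reader sees.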
There are many different ways that applications could exploit this foundational copy-on-write property, not the least of which is to provide atomicity for multi-block writes. This occurs when an application says, “I need to write a bunch of blocks and I need to ensure that all of them are written, or none of them are written.”
A practical example of this might be writing a combination of data and metadata. Both the data and corresponding metadata need to be written to ensure integrity of the update. If only part of the data is written, or only the metadata is written, the data repository will be out-of-sync. All written, or none written.
This means we have all the makings of a double buffered write. If there is any interruption to a multi-block write we have all the mechanisms in place to roll back to the previous content. We just ignore a partial write of those blocks as if it had never occurred.
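The all-or-nothing behavior described above can be sketched on the same kind of toy log. This is an illustration under assumptions, not the SDK's interface: `AtomicLog`, `atomic_write` and the `fail_after` parameter (which simulates an interruption) are all hypothetical names.

```python
class AtomicLog:
    """Toy log where a group of block writes becomes visible only as a whole."""

    def __init__(self):
        self.log = []     # append-only list of (block, data)
        self.index = {}   # block -> position of the latest committed version

    def read(self, block):
        return self.log[self.index[block]][1]

    def atomic_write(self, blocks, fail_after=None):
        """Append all blocks, then publish them together.

        fail_after simulates an interruption after that many appends.
        """
        staged = {}
        for count, (block, data) in enumerate(blocks.items()):
            if fail_after is not None and count == fail_after:
                # Interrupted: the partial appends stay in the log as garbage,
                # but the index was never updated, so readers never see them.
                return False
            self.log.append((block, data))
            staged[block] = len(self.log) - 1
        self.index.update(staged)   # single publish step for the whole group
        return True


log = AtomicLog()
log.atomic_write({"data": "d1", "meta": "m1"})
ok = log.atomic_write({"data": "d2", "meta": "m2"}, fail_after=1)
print(ok, log.read("data"), log.read("meta"))   # prints: False d1 m1
```

The interrupted second write leaves "d2" physically in the log, yet both reads still return the previous, consistent pair. That is the rollback-by-ignoring behavior: the old versions were never overwritten, so nothing has to be restored.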
We gave a sneak preview of Atomic Writes in October 2011 at Oracle OpenWorld. In conjunction with that, we pulled down some of the MySQL open source code for the InnoDB storage engine.
We modified MySQL InnoDB by replacing its double buffered writes with native calls to our atomic multi-block write interface. We saw significant latency reductions and higher IOPS, while also shortening the code path. So it’s one of those “less code, better performance” stories.
Ben: I think those are some good examples. What other categories of APIs are available?
Brent: The next category is the memory access API family. Note the different use of words: memory access API versus direct IO API.
Flash has always been a hybrid of memory and storage but, to date, the industry has used it as storage. We’re offering the industry’s first memory access semantics to flash. There are a couple of different APIs under this memory access API family. I’m going to highlight one.
On January 5th Fusion-io demonstrated one billion IOPS using a technology we call Auto Commit Memory (ACM). The SDK is the vehicle through which we provide this capability to software developers, via an ACM API.
The essence of it is that an application can designate a region of its process virtual address space as persistent. ACM provides API semantics to attach this persistent region to Auto Commit Memory, that is, to ioMemory, such that anything written to that region of the process virtual address space is guaranteed to be automatically persisted.
This mechanism has some similarities to mmap, so you have the benefit of saying, “Wow! I can eliminate a lot of my code complexity by persisting my data, without having to resort to IO primitives. I can just store data in that location of memory and have somebody else worry about persisting.” In this case the ‘somebody else’ is the ioMemory SDK.
However, unlike mmap, and this is key, ACM guarantees persistence. In other words, when you write something to memory mapped with mmap and there is an interruption of service, you’re not guaranteed that it will be persisted. You don’t have durability of writes, which means you have to resort to other mechanisms such as journaling or double buffered writes.
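For readers less familiar with mmap, here is a minimal sketch of the conventional pattern being contrasted, using Python's standard `mmap` module on a temporary file (the file name is arbitrary). It cannot demonstrate a crash; it only shows where the explicit flush sits, which is the step a plain store into a mapped region does not give you for free. It does not use or imitate the ACM API.

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "region.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * mmap.PAGESIZE)   # the backing file must be non-empty to map

with open(path, "r+b") as f:
    region = mmap.mmap(f.fileno(), mmap.PAGESIZE)
    region[0:5] = b"hello"   # an ordinary memory store, no write() call
    # At this point the bytes may exist only in the OS page cache.
    region.flush()           # explicitly ask the OS to write the dirty pages back
    region.close()

with open(path, "rb") as f:
    persisted = f.read(5)
print(persisted)
```

Without the `flush()` call, whether the store survives a power failure depends on when the operating system happens to write the page back, which is why applications layer journaling or double buffering on top.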
On the other hand, when you write something to Auto Commit Memory, by design it will be automatically committed. In other words, it is durable across service interruptions such as power failures.
Note that part of the ACM API will be a write barrier operation, like a flush, ensuring that the data is cleared from the processor complex, meaning the various levels of CPU caches and so on. Once flushed from the processor complex, it’s automatically persisted to ioMemory.
What attracts a lot of database developers to this new API is the prospect of removing the performance bottleneck at the tail of their transaction log. By definition, it is the transaction log through which they ensure the ACID properties of transactions.
Previously developers had to issue blocking synchronous I/Os at the tail of their log, to ensure that the most recent writes before service interruption were durable. With our ACM API they can convert that blocking synchronous IO to a non-blocking asynchronous IO by maintaining the tail of their transaction log in auto commit memory.
They may still persist the tail of their log to a backing store but they will not need to do it synchronously through a blocking IO. If there’s an interruption, for instance upon a system or an application restart, they can always recover their state through what was persisted in auto commit memory. So developers are quite keen on that.
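The blocking pattern described above, the one ACM is meant to remove, looks roughly like this in a conventional write-ahead log. This is a sketch using only standard `os` calls; the record format and the `commit_blocking` helper are hypothetical, and the ACM calls themselves are not shown because the interview does not spell them out.

```python
import os
import tempfile


def commit_blocking(fd, record):
    """Append a log record and block until the OS reports it durable."""
    os.write(fd, record)
    os.fsync(fd)   # the synchronous, blocking step at the tail of the log


log_path = os.path.join(tempfile.mkdtemp(), "txn.log")
fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
for txn_id in range(3):
    commit_blocking(fd, f"COMMIT {txn_id}\n".encode())
os.close(fd)

with open(log_path, "rb") as f:
    lines = f.read().splitlines()
print(lines)
```

Every commit pays for one `fsync` before it can be acknowledged. With the log tail held in auto commit memory, as described above, the write to the backing store could instead happen later and without blocking the committing thread.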
In Part 1 of our interview series, we discussed how the Fusion-io SDK will help to unleash the next-gen properties of flash.
In Part 3 of this interview series, we will discuss the “DirectFS” API, a native POSIX-compliant direct file system layer, and the more technical aspects of how the SDK works.
In Part 4 of this series, Brent and I discuss the semantics of using the API in the C language and how Fusion-io is leveraging its early access partnerships.