The Harder Problem: File Systems & Cloud Scalability
Andres Rodriguez shares his discovery of how to integrate file systems with the unlimited object store to achieve true cloud scalability.
June 18, 2024 | Andres Rodriguez
When NASA was designing its first manned mission to the moon, the initial proposals called for a larger version of the rockets that had successfully carried astronauts into orbit. Yet ferrying a crew to the moon and back was a different sort of challenge. This was a bigger, harder problem, and often a hard problem demands a disruptive, nearly unimaginable solution.
NASA’s eventual decision to pursue the more complex lunar orbit rendezvous (LOR) mission required two ships to rendezvous in lunar orbit. Nothing demanding such exact telemetry and control had ever been attempted in space. The story of this mission reminds me of the challenges at the dawn of the cloud era. When we started working on what would become Nasuni, organizations had been relying on traditional block-based file systems for decades. A new infinitely scalable storage medium had just been introduced by AWS – publicly available S3 object (or cloud) storage. Azure and GCP released similar offerings in quick succession.
Finding a way to make use of this new medium was a hard problem. The harder problem was figuring out how to integrate file systems with the unlimited object store to achieve true cloud scalability.
The limitations of traditional file systems
The obvious solution was the incremental one. Take any of the well-established traditional file systems (XFS, ZFS, etc.) and bolt on the object store. Make the traditional file system handle all metadata and offload the data payload to the object store. This is a variation on tiering. The shortcoming of tiering is that the traditional file system metadata becomes the new bottleneck for scale. If Nasuni had chosen this route, we would have gotten to market sooner, generated results faster, and pleased our investors in the short run. Eventually, though, we would have stalled.
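To make the tiering bottleneck concrete, here is a minimal Python sketch. It is an illustration only: the `TieredFileSystem` class, its `max_inodes` cap, and the dictionary standing in for the object store are all hypothetical, and the fixed metadata capacity is a simplifying assumption rather than a claim about how XFS or ZFS behave.

```python
class TieredFileSystem:
    """Hypothetical tiering model: metadata stays local, payloads go to the object store."""

    def __init__(self, max_inodes):
        self.max_inodes = max_inodes  # assumed fixed capacity of the local metadata table
        self.inodes = {}              # path -> object key (the local metadata)
        self.object_store = {}        # stand-in for an unbounded object store

    def write(self, path, data):
        # A new path needs a new metadata entry, and the local table is finite.
        if path not in self.inodes and len(self.inodes) >= self.max_inodes:
            raise OSError("metadata table full; the object store still has room")
        key = f"obj-{len(self.object_store)}"  # every write creates a fresh object
        self.object_store[key] = data
        self.inodes[path] = key


fs = TieredFileSystem(max_inodes=2)
fs.write("/a.txt", b"alpha")
fs.write("/b.txt", b"beta")
try:
    fs.write("/c.txt", b"gamma")
except OSError as err:
    print(err)
```

In this toy model the object store could accept the third payload, but the write fails anyway because the local metadata table is the limiting resource.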
Traditional file systems were designed for finite storage hardware. They have fundamental scale limitations. By simply adapting traditional file systems, we would’ve been placing a limit on something with infinite potential. What we needed to do was design a file system without limits. The obvious benefit would be the ability to store any number of files without reformatting the file system. Just as important, unlimited metadata would also enable unlimited immutable versioning. And that would allow us to eliminate the need to ever back up our file system.
This approach demanded we go against everything the industry understood about how file systems were supposed to work. We’d have to capture changes in the file system by creating immutable versions. These versions would have to be stored as objects to avoid violating the eventual consistency model that is at the heart of scale in the cloud. Computationally, this would be an expensive operation that ran straight against the grain of storage orthodoxy.
Committing to the unimaginable
Our founding engineering team thought about every possible way to avoid this route. We had endless arguments and debates. We did everything we could to identify an easier, more incremental, more traditional solution.
There wasn’t one. And so, we committed to what seemed unimaginable.
We understood that to make this work, all the metadata holding the file system together had to live in the object store. This was the single most important requirement. Once we had this rule in place, the only design possible was UniFS®, a file system that is constantly writing data and metadata to the object store in one direction (no modifications or overwrites) without scale limitations. This decision transformed the file system into something the object store could understand natively, and laid the foundation for what our customers rely on today: a scalable global file data platform that functions like a single server.
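The idea of writing data and metadata in one direction can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the UniFS implementation: `WriteOnceStore`, `commit_version`, `read_file`, the content-hash keys, and the JSON manifest format are all hypothetical names and choices made for illustration.

```python
import hashlib
import json


class WriteOnceStore:
    """Hypothetical in-memory stand-in for an object store that is never overwritten."""

    def __init__(self):
        self._objects = {}

    def put(self, payload: bytes) -> str:
        # Content-addressed key: identical bytes map to the same key,
        # so the content behind an existing key never changes.
        key = hashlib.sha256(payload).hexdigest()
        self._objects.setdefault(key, payload)
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


def commit_version(store, parent, files):
    """Write changed files and a new metadata manifest as objects; return the version key."""
    entries = {path: store.put(data) for path, data in files.items()}
    if parent is not None:
        # Unchanged files are inherited by reference from the parent manifest.
        inherited = json.loads(store.get(parent))["entries"]
        entries = {**inherited, **entries}
    manifest = {"parent": parent, "entries": entries}
    # The metadata itself is just another immutable object in the same store.
    return store.put(json.dumps(manifest, sort_keys=True).encode())


def read_file(store, version, path):
    """Resolve a path through the manifest of the given version."""
    manifest = json.loads(store.get(version))
    return store.get(manifest["entries"][path])


store = WriteOnceStore()
v1 = commit_version(store, None, {"/a.txt": b"one", "/b.txt": b"bee"})
v2 = commit_version(store, v1, {"/a.txt": b"two"})
print(read_file(store, v1, "/a.txt"), read_file(store, v2, "/a.txt"))
```

Because nothing is modified in place, the earlier version stays readable after the later one is committed, which is the property that makes immutable versioning a substitute for separate backups in this model.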
Do you want cloud scalability or not?
In aeronautics and file system dynamics alike, enough scale changes the nature of the problem. Designing a round trip to the moon was a challenge of an entirely different scale from sending the first astronaut into Earth orbit. NASA couldn’t simply build a bigger rocket. They had to opt for the more audacious LOR mission. John Houbolt, the NASA engineer who championed the approach, sold the proposal best in an internal memo after an early defeat: “Do we want to go to the Moon or not?”
Here at Nasuni, the question was whether we wanted to create something with native cloud scalability or not. To do that, we couldn’t just drag the outdated file system into a new era. The cloud demanded a brand-new file system design, and although our choice to pursue this new kind of file system slowed our velocity at the start, our last five years of growth and success have only reinforced the validity of that decision.
The harder solution pays off in the end.