Failed Backups, Missing Emails & A New Kind of File System
November 11, 2021
I was talking with the CIO of one of our longstanding clients recently when he mentioned that one of his favorite things about our platform is that he no longer worries about backup. This came as no surprise. Backup is broken, and it has been for decades. Yet comments like these are music to my ears, because backup is one of the reasons I founded Nasuni in the first place.
Let me tell you my backup story. Years ago, I was CTO of a large media company and, like many organizations in that space, we were frequently sued for alleged intellectual property breaches. One of those suits called for us to establish a chronology of events based on a series of emails that were roughly five to seven years old.
We spent weeks looking for the backups for that mail server and couldn’t find them anywhere. There were old backups, but they couldn’t be restored, because the backup formats could no longer be read. Later, as I was walking past a colleague’s office, I noticed an old, decommissioned Dell server that had been transformed into an improvised coffee table. The server was the perfect height to fit neatly under his desk. Fortunately, all of our servers were labeled, and this one happened to be our old mail server.
We grabbed it, found a power cord, monitor — VGA anybody? — and keyboard and cranked up the old machine. Then we held our collective breath. Miraculously, the old emails were still there.
Innovation via Frustration
We had succeeded, yes, but it took a tremendous stroke of luck: we happened to still have the physical machine that held the production data. In most modern data centers, where every server has been virtualized, this would be impossible. I was relieved, but also frustrated. Intense frustration is what drives entrepreneurs to act. Those backups should have been accessible, not hiding under someone’s papers and coffee cups. I decided there had to be a more efficient way to protect file data.
Nasuni didn’t begin immediately — we still needed the cloud to evolve as a medium — and this added time gave me a chance to step back and look at the backup process end to end. That is when I realized there was something intrinsically wrong with the approach. The backup industry was telling companies to put all their files into a proprietary format that might change over time. They were making it very difficult to test your recovery capabilities, too. This put IT professionals in a terrible position.
In IT, if you lose data, you lose your job. Earlier in my career, we had to fire our head of IT because our build machine, where we stored all of the code we were working on, failed. That’s when we discovered that the backup server had been inadvertently skipping the core development repository. This was partially human error, because our head of IT hadn’t verified that the backup server had access to all the files it was supposed to protect, but it also revealed a systemic flaw in backup.
The Systemic Flaw in Backup
Any system that is only fully tested when there is a complete failure is bound to fail as a system of protection. Also, the more moving parts and the more human intervention required, the more likely it is that something will go wrong. Technology is there to make sure that we don’t forget. That we don’t get lazy. Because we are only human.
Forget backup. Backup has been a means to an end, and not a very good means at that. Instead, focus on the end goal. IT is responsible for never, ever losing a file. Data is more precious than money. Data is irreplaceable. When I looked at the problem this way, I found four things that needed to change if we were really going to protect files:
- Any system protecting digital assets should be immune to the decomposition or failure of the storage media. If you move the data offline, the medium will decompose, and the data will eventually disappear — or be transformed into furniture. Cloud object storage would prove to be an ideal alternative here, because it creates multiple copies to guard against the failure of one machine or even one facility.
- Data should not be transformed into a proprietary format. If you transform the data in some way that isn’t related to production — a favorite technique of the backup vendors — then you’re subject to the technological whims of the vendor. If your backups from several years ago were stored in a format they no longer support, you won’t find out until you try to restore, and then you’ll be out of luck.
- This new way of protecting digital assets would have to move away from the one-way push model of backup, in which your data is shoved into some secondary system. You can’t test your complete restore capabilities effectively with this approach, so you leave yourself open to disastrous data loss incidents like the one that struck my old startup.
- The system of production and the system of protection should be one and the same. That unification, more than anything, drove all of our early design work at Nasuni. It called for bringing the physical media protecting the bits together with the production system that generates them. A secondary system is inherently unreliable. The best backup is no backup. I realized that a better way to protect a file system would be to do so within that file system, and to design a technology that automatically protects itself as it operates normally (see the sketch after this list).
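To make that last idea a little more concrete, here is a minimal, purely illustrative sketch (not Nasuni’s actual implementation) of a write path in which storing a file and protecting it are the same operation: every write lands as an immutable, content-addressed version, and restoring is just an ordinary read of an older version. The class and method names are hypothetical, and a plain dictionary stands in for cloud object storage.

```python
# Toy illustration of a "self-protecting" file system: writing a file also
# records an immutable, timestamped version in an object store. A dict
# stands in for a cloud bucket; nothing here reflects a real product.
import hashlib
import time


class VersionedFileSystem:
    def __init__(self):
        self.object_store = {}   # stand-in for cloud object storage
        self.versions = {}       # path -> list of (timestamp, object key)

    def write(self, path: str, data: bytes) -> str:
        # The act of writing is the act of protecting: every write produces
        # a content-addressed, immutable object plus a version record.
        key = hashlib.sha256(data).hexdigest()
        self.object_store[key] = data
        self.versions.setdefault(path, []).append((time.time(), key))
        return key

    def read(self, path: str, version: int = -1) -> bytes:
        # Restore is just an ordinary read of any recorded version, so
        # recovery is exercised every time the data is used.
        _, key = self.versions[path][version]
        return self.object_store[key]


fs = VersionedFileSystem()
fs.write("/projects/report.txt", b"draft 1")
fs.write("/projects/report.txt", b"draft 2")
print(fs.read("/projects/report.txt", version=0))  # b'draft 1'
print(fs.read("/projects/report.txt"))             # b'draft 2'
```

The point of the sketch is the shape, not the details: there is no separate backup job to schedule, no proprietary format to restore from, and no second system to forget about.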
Our observation was that the largest companies in the world were entrusting a fundamentally flawed technology with the protection of their data. We wanted to create a zero-gap approach to data protection, a single system that did everything, from storing files to protecting them, without the need for human intervention or oversight. A platform that would function as the system of production and the system of protection at the same time.
Twelve years later, it’s still working beautifully. When it comes to data protection, two systems are much less than one. The name Nasuni stands for NAS + Unified — one system that brings together production and protection.
The promise of what we do is in our name.