Archive vs. “Archive-ish” Data

Ask anyone in IT to define “archive data” and chances are the answer would go something like this: “Data that is no longer active, but still needs to be kept around for future reference or regulatory compliance.” You may also get an expletive or two thrown in for good measure. In many organizations, archive data is just that, inactive data – though in practice it can range from recently active files to data that is five years old. Once data reaches a certain age, it is generally moved off to tape or to a slower storage array outside the main network. Should anyone need access to it, they submit a request to IT, who then bring the data back to the active storage array.

Archive data has traditionally had a clear place in the data lifecycle. However, in recent years, the notion of archive data has evolved into what can be better thought of as “semi-archive,” “archive-ish” or near-line data. The key drivers blurring the line between archive and active data are the speed at which new data is created and users’ demand for fast access to all of their data, regardless of when it was created.

This poses a challenge for IT. They can continue the traditional approach of moving older data off to slower disk or tape archives and try to accommodate the increasing number of retrieval requests, or they can buy more Tier 1 storage. The first approach is far from convenient given the latency and pain associated with data recovery, and the second dramatically increases costs. Even if the budget were available for the added storage, it would increase both administrative overhead and the burden on backup systems. As more and more “archive” storage systems become mere extensions of the normal file server, things become unnecessarily complicated and the once-strong demarcation between active and archive data blurs further.

Nasuni addresses the issue of archive data through its inherent design. Because it effectively eliminates the pains associated with rapid data growth, there is no longer a need to archive data.

When a file is stored in the Nasuni Service, a copy is saved locally and the golden master is sent off to the cloud where it is then replicated across multiple data centers. Snapshots keep a complete version history of the files and at any point in time a file can be accessed by the end user, with no intervention from IT, whether that file is considered part of the active data set or is considered ‘archive’ data. With Nasuni, all the processes associated with retrieving archived files are eliminated.
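For readers who think in code, the flow described above can be sketched as a toy model. This is only an illustration under stated assumptions, not Nasuni's implementation or API: the class `ToyVersionedStore`, its methods, and the three-replica default are all hypothetical names and values chosen for the example. It shows a local copy, a master copy replicated to several locations, and a version history that can be read back without any separate retrieval step.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVersionedStore:
    """Hypothetical model: a local cache plus replicated, versioned master copies."""

    replica_count: int = 3  # assumed value, purely illustrative
    local_cache: dict = field(default_factory=dict)
    replicas: list = field(init=False, default_factory=list)

    def __post_init__(self):
        # Each replica maps a path to the ordered list of versions stored for it.
        self.replicas = [{} for _ in range(self.replica_count)]

    def write(self, path: str, data: bytes) -> int:
        """Save a local copy and append a new version to every replica.

        Returns the 0-based version number of the data just written.
        """
        self.local_cache[path] = data
        for replica in self.replicas:
            replica.setdefault(path, []).append(data)
        return len(self.replicas[0][path]) - 1

    def evict(self, path: str) -> None:
        """Drop the local copy, as might happen to data that has gone cold."""
        self.local_cache.pop(path, None)

    def read(self, path: str, version=None) -> bytes:
        """Return the latest version, or a specific older one if requested."""
        if version is None and path in self.local_cache:
            return self.local_cache[path]
        history = self.replicas[0][path]
        return history[-1] if version is None else history[version]


store = ToyVersionedStore()
first = store.write("report.doc", b"draft")
second = store.write("report.doc", b"final")
store.evict("report.doc")  # the file is no longer in the local cache

latest = store.read("report.doc")            # still readable, no restore request
original = store.read("report.doc", first)   # an older version from the history
```

In this sketch, "archive" data is simply data that is no longer in the local cache; the same `read` call serves it, which is the point the paragraph above makes.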

Archiving in the traditional storage sense has moved from common business practice to a specialty use case. And while there may be times when true archiving is required, this will not be typical of most enterprises as we enter 2012 and beyond.

Do you find yourself setting up more “archives” that act as extensions of the main file server?

How would your storage needs change if you never needed to archive data?

End-user behavior is difficult to change – if you can change it at all. If anything, users will want access to both their past data and their ever-growing active data. To meet end-user demands without continually expanding your storage infrastructure or increasing your own management workload, consider a solution like Nasuni’s, where all data is accessible all of the time.
