Cloud Report Q&A


On December 12th, 2011, Nasuni published the results of a two-year analysis of the largest public cloud storage providers. This cloud report generated a massive amount of interest and commentary across the industry. Nasuni founders Andres Rodriguez and Rob Mason sat down to address these comments during a recent Nasuni Tech Talk.

Cloud Report Q&A Transcription

Andres Rodriguez:
Welcome to another edition of the Nasuni Tech Talks. My name is Andres Rodriguez. I am the co-founder and CEO of Nasuni, and I’m here today with my co-founder, Rob Mason.

Today we’re going to cover the cloud storage report, answering questions about it that we’ve gotten online in different forums. Rob oversaw the production of that report and all of the software that was used to collect data over the last two-plus years. And without further ado, let’s move into the first question.

The first question comes from Computerworld: Have you considered the biases of the source of this information? Are these results really credible when conveniently the two cloud providers that Nasuni rates as the highest are also the ones they use for their back-end. Hmmm, doesn't this sound a bit suspicious to you? And, since when has Nasuni been designated a cloud storage tester? It's not like they're an independent Consumer Reports-like organization, such as Transaction Processing Performance Council, SNIA, NIST, etc.

Rob Mason:
I think it’s a great question: why is Nasuni qualified to test these cloud providers, and why have we rated these two so high? It comes back to our heritage. We came from companies like EMC, where we worked on the Centera, one of the first scalable object storage repositories in the world. There we dealt with huge scalability problems for one of the first times in the storage industry. From there we went on to companies like Archivas, which was bought by Hitachi and turned into the HCAP product. Huge scalable repositories, these are the private clouds of today. Some of our engineers also worked on the Atmos product at EMC – again, another huge private cloud, very scalable. So we’ve been dealing with the cloud back-end, building the actual clouds ourselves, for the last decade or so. When you build those clouds you don’t just write code and ship it; you have to test and test and test. You have to test scalability, multi-threadedness, performance, every aspect of the systems. We have been testing these systems for probably longer than some of the systems that we’re testing have even existed. There was no Azure back when we were testing Centera; there was no Azure when we were testing Archivas. So we’ve been testing these systems for a very long time.

Now, when it comes down to how we are qualified to make these choices: at Nasuni, we really couldn’t care less which component we use in our product or what name brand is on it. We’re just looking for a component, and we want the best component for our product. I don’t care if that ends up being Azure or Microsoft or some other vendor; we want the best one in our product so we can provide the best one to our customers. So our testing had no predetermined outcome. We have no special partnership with Amazon or Azure or Microsoft. We are looking for the best component, we are testing to find the best component, and the one that comes out on top is the one we select. It’s not like we selected Amazon first and then tested and said “oh, we made the right choice.” We tested first and then chose. That is why we are shipping with Amazon and Azure as our two top providers.

Andres:
Now in GigaOm, we have the question: How safe is it to use cloud storage? I’ve been hearing of various cloud services being hacked and data loss. What is your take on security when it comes to cloud storage? What do you think that Microsoft and Amazon will have to say about it?

Rob:
Well, I think cloud storage security really starts at the edge, and that’s Nasuni’s philosophy. You encrypt at the customer’s site, with encryption keys provided by the customer, so the customer controls the data before it leaves their site. Once the data is outside their site, whatever happens to it, if it gets spread around the Ethernet or gets picked up by other people, it doesn’t matter, because it is safe in the form in which it is transmitted around the Internet. Start with encryption at the edge, and then what happens after that is not a problem.
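The idea of encrypting at the edge with a customer-held key can be sketched as follows. This is only an illustration using the Python standard library: the transcript does not say which cipher Nasuni uses, and the function names `encrypt_at_edge` and `decrypt_at_edge` are hypothetical. The HMAC-SHA256 counter-mode keystream below is a stand-in so the sketch runs without third-party packages; a real deployment would presumably use a vetted authenticated cipher such as AES-GCM.

```python
import hashlib
import hmac
import os


def _subkey(customer_key: bytes, label: bytes) -> bytes:
    # Derive separate encryption and authentication keys from the one customer key.
    return hmac.new(customer_key, label, hashlib.sha256).digest()


def _keystream(enc_key: bytes, nonce: bytes, length: int) -> bytes:
    # Illustrative stand-in cipher: block i = HMAC-SHA256(enc_key, nonce || i).
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(enc_key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


def encrypt_at_edge(customer_key: bytes, plaintext: bytes) -> bytes:
    # Runs at the customer's site; the key never travels with the data.
    nonce = os.urandom(16)
    stream = _keystream(_subkey(customer_key, b"enc"), nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_subkey(customer_key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag  # this blob is all the cloud ever sees


def decrypt_at_edge(customer_key: bytes, blob: bytes) -> bytes:
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(customer_key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("blob was modified or the key is wrong")
    stream = _keystream(_subkey(customer_key, b"enc"), nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))
```

Anything stored or intercepted outside the site is the opaque blob returned by `encrypt_at_edge`; without the customer's key, `decrypt_at_edge` raises instead of returning data.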

Andres:
The next question comes from Ars Technica: Unfortunately, the study doesn't seem to have collected (or at least not published) what is probably the most important metric for cloud storage: p99 (or p95, or p99.9) latency for reads and writes. The biggest issue using cloud storage for any sort of DB (whether MySQL or NoSQL of some sort) tends to be stability of latency for I/O operations. A fast best-case or even average doesn't mean much if every now and then it suddenly starts taking 30ms for every block read or write.

Rob:
This comes back a little bit to the first question: Nasuni is not a general-purpose cloud storage testing company. We are producing a product and service of our own. So when we went out to test the components for our needs, we tested the attributes of those services that we care about, the ones we need to provide our service. The latency of the cloud, the response time, doesn’t matter to us because Nasuni is an edge-caching appliance that takes away all the pain of those latencies, interim outages and things like that. We tested the things we care about.
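As a rough illustration of why an edge cache makes cloud latency less visible to the application, here is a minimal read-through cache sketch. The class `EdgeCache` and the function `fake_cloud_get` are hypothetical names invented for this example; nothing here is taken from Nasuni's implementation.

```python
from typing import Callable, Dict


class EdgeCache:
    """Read-through cache: only a miss pays the cloud round trip."""

    def __init__(self, fetch_from_cloud: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_cloud
        self._local: Dict[str, bytes] = {}
        self.cloud_reads = 0  # how many reads actually reached the cloud

    def read(self, key: str) -> bytes:
        if key not in self._local:
            self.cloud_reads += 1
            self._local[key] = self._fetch(key)
        return self._local[key]


def fake_cloud_get(key: str) -> bytes:
    # Hypothetical stand-in for a slow, variable-latency cloud GET.
    return ("contents of " + key).encode()


cache = EdgeCache(fake_cloud_get)
for _ in range(1000):
    cache.read("reports/q4.doc")
print(cache.cloud_reads)  # 1
```

Of the 1,000 reads, only the first one goes to the cloud; the rest are served locally, so the cloud's tail latency affects only cache misses.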

One of the other questions we were asked recently is why we only tested files in 1MB chunks instead of much larger files. We do test larger files, but we break a lot of the files that are put into our product up into pieces so we can get multi-threaded, concurrent access, and so that when those files change we can send the changes to the cloud in smaller pieces. This report is about the results of our testing, but our testing was not a comprehensive test of everything about the cloud; it was about finding the best cloud that meets Nasuni’s basic needs for our customers. That’s why we only tested certain things. The things we did not test are not as relevant to us.
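The chunking idea can be sketched in a few lines: split a file into fixed-size pieces, fingerprint each piece, and upload only the pieces whose fingerprint changed. This is a sketch under assumptions, not a description of Nasuni's actual chunking scheme: the fixed 1 MiB boundaries, the SHA-256 fingerprints, and the names `chunk_digests` and `changed_chunks` are all chosen for illustration.

```python
import hashlib
from typing import List

CHUNK_SIZE = 1024 * 1024  # 1 MiB, assumed to match the "1MB chunks" in the talk


def chunk_digests(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[str]:
    # One SHA-256 fingerprint per fixed-size chunk.
    return [hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def changed_chunks(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> List[int]:
    # Indexes of chunks in `new` that differ from `old` or did not exist before.
    old_d = chunk_digests(old, chunk_size)
    new_d = chunk_digests(new, chunk_size)
    return [i for i, d in enumerate(new_d) if i >= len(old_d) or old_d[i] != d]


original = bytes(5 * CHUNK_SIZE)        # a 5 MiB file of zeros
edited = bytearray(original)
edited[3 * CHUNK_SIZE + 10] = 0xFF      # change one byte inside chunk 3
print(changed_chunks(original, bytes(edited)))  # [3]
```

A one-byte edit to a 5 MiB file means one 1 MiB chunk to send rather than the whole file, and independent chunks can be transferred concurrently. With fixed boundaries, an insertion that shifts later bytes would mark every following chunk as changed.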

Andres:
Their test creates 100 million files, and they claim (without any supporting evidence whatsoever) that a cloud should support billions of files without a problem. But they don't test metadata operations. How long would it take to traverse a tree of a billion files on S3 or Azure? Is it even possible? Can you walk the tree in parallel, and if you do, does it take less time, or more?

Rob:
It’s another good question. First, in the report we talked about the cost of creating 100 million files in one of these cloud providers. As you know, Nasuni is still a venture-funded company. We’re not a great big public company, we haven’t had our IPO yet, and 100 million objects written into S3 cost us $1500. By our revenue today, that’s not a big number, but we don’t want to just waste money. So there is only so far we could have tested the object count scalability. We had to trust the reports coming out of Amazon and others that talk about billions and billions of objects. And you can see with Netflix and others using them, they do have tremendous object counts. But the tree-traversal question comes back to use case. If you are doing a query or walking through containers or buckets in the cloud, yeah, you are probably going to run into all sorts of problems. Will that scale? I don’t know, but Nasuni doesn’t use the cloud that way. Nasuni has its own index, its own way of finding things in the cloud. We never walk or traverse trees; we simply ask for an object and we get it back. It’s that simple, because we’ve simplified the component down to its basic core elements so that Nasuni can scale and use any component that we need to.
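The access pattern described here, keeping a separate index and asking the cloud for one object by key, might look like the sketch below. `ObjectStore` and `FileIndex` are hypothetical in-memory stand-ins written for this example; they are not Nasuni's index format or any provider's API.

```python
import uuid
from typing import Dict, List


class ObjectStore:
    """Hypothetical flat key/value store standing in for a cloud bucket."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self.get_calls = 0
        self.list_calls = 0

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        self.get_calls += 1
        return self._objects[key]

    def list_keys(self) -> List[str]:
        # The expensive listing operation the question worries about; unused below.
        self.list_calls += 1
        return list(self._objects)


class FileIndex:
    """Hypothetical index kept outside the bucket: file path -> object key."""

    def __init__(self, store: ObjectStore) -> None:
        self._store = store
        self._path_to_key: Dict[str, str] = {}

    def write(self, path: str, data: bytes) -> None:
        key = uuid.uuid4().hex  # opaque key; the bucket itself has no tree
        self._store.put(key, data)
        self._path_to_key[path] = key

    def read(self, path: str) -> bytes:
        return self._store.get(self._path_to_key[path])  # one direct GET


store = ObjectStore()
index = FileIndex(store)
for n in range(10_000):
    index.write(f"/projects/file{n}.dat", b"payload %d" % n)

data = index.read("/projects/file9999.dat")
print(data, store.get_calls, store.list_calls)  # b'payload 9999' 1 0
```

Finding a file costs one index lookup and one GET regardless of how many objects the bucket holds, and the listing operation is never called.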

Andres:
Thank you, Rob. I want to thank everyone for their questions and encourage you to send any questions that you have about cloud storage or about Nasuni directly to us. We will handle them in future sessions. These Tech Talks are really designed to educate everyone about the potential of using cloud storage as a component for the next-generation controllers that you can use in your data centers as primary storage. There are some fantastic things that you can do with it, and we hope to be bringing more and more of that information to you through these venues.
