How Avere Systems is putting the speed back into storage
Q&A with Avere Systems CEO Ron Bianchini.
Companies of all sizes are generating huge amounts of data which needs to be collected and analysed to create business benefits. But storing all that data in a way that makes business and economic sense is an increasing headache for many.
Enterprise storage company Avere Systems aims to help by allowing customers to store files in the cloud or on premises without reducing the availability of their data. It counts the US Centers for Disease Control among its customers.
ZDNet spoke to Avere CEO Ron Bianchini about how his company and the world of storage are changing.
ZDNet: Where are you based?
Bianchini: We are based in Pittsburgh and that is mainly because of the engineering [team]. The bulk of the people in engineering are all Carnegie Mellon grads and that is useful because one of the areas of expertise that they have is in file systems.
All storage products are based on an underlying file system and that expertise is one of the reasons why we are there. But we also have sales and marketing there as well.
How many people do you employ?
One hundred and ten at the moment, so we are doing pretty well. If you peel back the cover and look at what is underneath and what makes our product work, it is the fact that we taught our file system that there are two different sorts of media. There is something that is close, nearby and has very low latency but it is also precious and, relatively speaking, very expensive.
And then there is bulk media that is far away, has a higher latency but is very cheap.
So what you try to figure out is, how do you leverage the local stuff, and you put the stuff that you need there, and only when you are going to run out of space do you shift stuff off into a repository.
Now what's different [about our technology] is that our hit rate, the rate at which you use the local stuff, is the highest in the industry. We have an accuracy of 98 percent. So what that means is that of all the transactions you make, 98 percent of the time we will go local, only two percent of the time do we have to go remote.
So because of that, we can make remote storage look like it's nearby and very fast.
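To make the hit-rate arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. Only the 98 percent and 95 percent hit rates come from the interview; the function name and the latency figures (0.2 ms local, 100 ms remote) are illustrative assumptions, not Avere measurements.

```python
def effective_latency_ms(hit_rate, local_ms, remote_ms):
    """Average latency per operation when a fraction hit_rate is served locally."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * local_ms + (1.0 - hit_rate) * remote_ms


# Assumed, illustrative latencies (not vendor figures).
LOCAL_MS = 0.2
REMOTE_MS = 100.0

for rate in (0.95, 0.98):
    avg = effective_latency_ms(rate, LOCAL_MS, REMOTE_MS)
    print(f"hit rate {rate:.0%}: about {avg:.2f} ms per operation on average")
```

Under these assumptions the remote trips dominate the average, which is why a few points of hit rate make a large difference to how "nearby" the remote storage appears.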
And most companies are happy with a 94 or 95 percent rate?
Exactly. So when most people are doing that, they assume we are doing read caching. Now we do do read caching -- we are actually a very effective read cacher. One of our boxes has nine terabytes and if you put three of them together it's 27TB, so we globally share the media among them.
We can globally share from two up to 50 nodes at 9TB, so if we were only a read cache you would find that our hit rate was closer to 30 percent. So most read caches -- a lot of updates, a lot of writes -- would miss it and have to go to the bulk storage. So not only do we read-cache, but we write-cache.
Now the most interesting thing about our write-caching is that as soon as you have two or more nodes, our write accuracy, our write off-load is close to 100 percent. How we do that is that when a write comes in we make sure that two nodes have it. As soon as two nodes have it, that's it. So our close to 100 percent accuracy comes from a 100 percent off-load of writes.
Then the 98 percent is the reads. The interesting thing about that 98 percent is that you can change it by changing how many nodes you buy. We call 98 percent typical because that is what you get when you run certain benchmarks. But some customers might want higher than that, so we run the benchmarks until they get the accuracy rate they want.
It is all because we are doing the writes. That is what makes it possible.
That is our IP, figuring out what needs to be local in this constrained, precious, and expensive data and what you can afford to put far away.
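The write path described above, where a write counts as done once two nodes hold it, could be sketched roughly as follows. This is a simplified illustration, not Avere's implementation: the class names `Node` and `MirroredWriteCache`, the in-memory dictionaries, and the rule for choosing which nodes hold a copy are all assumptions.

```python
class Node:
    """A hypothetical cache node that keeps pending writes in memory."""

    def __init__(self, name):
        self.name = name
        self.pending = {}  # path -> data not yet flushed to bulk storage

    def store(self, path, data):
        self.pending[path] = data
        return True


class MirroredWriteCache:
    """Acknowledge a write once `copies` nodes hold it (assumed policy)."""

    def __init__(self, nodes, copies=2):
        if len(nodes) < copies:
            raise ValueError("need at least as many nodes as copies")
        self.nodes = nodes
        self.copies = copies

    def write(self, path, data):
        # Choose nodes deterministically from the path; a simplification.
        start = sum(path.encode()) % len(self.nodes)
        holders = []
        for i in range(self.copies):
            node = self.nodes[(start + i) % len(self.nodes)]
            if node.store(path, data):
                holders.append(node.name)
        # The bulk repository is never contacted on this path.
        return len(holders) == self.copies, holders


cluster = MirroredWriteCache([Node("a"), Node("b"), Node("c")])
acked, holders = cluster.write("/proj/frame001.exr", b"pixels")
print(acked, holders)
```

The point of the sketch is that the acknowledgement depends only on the nearby nodes, so no write has to wait for the distant repository.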
What kind of algorithms do you run?
What we do is we look at the past to see what you have accessed in the past, and then we use that to predict the future. Then we do other things, like if you read the first portion of a file, we assume you are going to read the rest so we pre-fetch it. Then we use all the LAN-optimisation tricks. Instead of using one stream to pre-fetch, we can go 100 wide so that we keep the pipe full. So it is all about the file systems predicting what you are going to do.
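As a rough illustration of the ideas in that answer, the sketch below keeps recently used chunks (a plain least-recently-used policy standing in for "look at the past"), and when the first chunk of a file is read it fetches the rest over several parallel streams. The interview does not describe Avere's actual algorithms, so the class name `PrefetchingCache`, the chunk model, and every parameter are assumptions.

```python
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor


class PrefetchingCache:
    def __init__(self, fetch_chunk, capacity=1024, width=8):
        self.fetch_chunk = fetch_chunk  # callable(path, index) -> bytes (slow storage)
        self.capacity = capacity        # maximum number of cached chunks
        self.width = width              # number of parallel prefetch streams
        self.chunks = OrderedDict()     # (path, index) -> bytes, in LRU order

    def _put(self, key, data):
        self.chunks[key] = data
        self.chunks.move_to_end(key)
        while len(self.chunks) > self.capacity:
            self.chunks.popitem(last=False)  # evict the least recently used chunk

    def read(self, path, index, total_chunks):
        key = (path, index)
        if key in self.chunks:           # hit: serve from the local cache
            self.chunks.move_to_end(key)
            return self.chunks[key]
        data = self.fetch_chunk(path, index)  # miss: go to slow storage
        self._put(key, data)
        if index == 0:
            # Reading the start of a file: assume the rest will follow.
            self._prefetch(path, range(1, total_chunks))
        return data

    def _prefetch(self, path, indexes):
        missing = [i for i in indexes if (path, i) not in self.chunks]
        if not missing:
            return
        # Fetch in parallel to keep the pipe full. This sketch waits for the
        # fetches to finish; a real system would run them in the background.
        with ThreadPoolExecutor(max_workers=self.width) as pool:
            results = pool.map(lambda i: self.fetch_chunk(path, i), missing)
            for i, data in zip(missing, results):
                self._put((path, i), data)
```

A caller would pass in its own `fetch_chunk` function for the slow storage; after `read(path, 0, n)` returns, later reads of the same file are served from the cache for as long as the chunks stay within `capacity`.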
So you do this in the box?
It is all in the box and it is all in real-time. And we have some incredible stats on this. For two years in a row -- 2013 and 2014 -- the top 12 grossing movies were rendered on our stuff. So the movie business loves us, because when they render they are using computers to add stuff into frames as fast as they can. Now we can offer them the performance to do that quickly because we are using all that local flash.
And then if you think about large software companies, especially large software development companies, they are all using us. And then there is tools distribution and now we are also starting to get into life sciences. So people like CDC [the US Centers for Disease Control] and Johns Hopkins University use us in areas like genomics.
If you imagine that, especially in genomics, you have very large datasets and to put them into flash would be incredibly expensive. But we only pull in what we think you are going to need and with that 90 percent they are only pulling in what they actually need and that big bulk storage is now on disk.
So if you think of anything that requires a high transaction rate into storage, we do really well.
What we work on then is keeping the important stuff at the edge and the less important stuff in the core, so we call our architecture Edge/Core. Our device is the Edge -- the thing that is close to the application -- and the Core is the rest.
So you have your users and your servers -- in other words anything or anybody who wants to read and write data -- and you have a cluster of those nearby and you have the rest far away. So you have the stuff that you will need 90 percent of the time close and it is only on cold read misses that you have to go to the rest.
All of the writes are done at the Edge and then we asynchronously write them back to the Core. And so all the reads that are hot we take from the Edge and the cold reads we take from the Core.
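The general pattern here is write-back caching: writes land in the fast tier next to the application and are copied to the bulk repository later, while reads go to the repository only on a miss. Below is a minimal sketch of that pattern, not Avere's code; the class `WriteBackCache`, the dictionary standing in for the repository, and the explicit `flush()` call (which a real system would run in the background) are all assumptions.

```python
class WriteBackCache:
    def __init__(self, repository):
        self.repository = repository  # dict standing in for the slow bulk store
        self.local = {}               # fast tier: path -> data
        self.dirty = set()            # paths written locally but not yet flushed
        self.remote_reads = 0         # how often a read had to go far away

    def write(self, path, data):
        # Writes complete against the fast tier only.
        self.local[path] = data
        self.dirty.add(path)

    def read(self, path):
        if path in self.local:        # hot read: served locally
            return self.local[path]
        self.remote_reads += 1        # cold read miss: fetch from the repository
        data = self.repository[path]
        self.local[path] = data
        return data

    def flush(self):
        """Copy dirty data to the repository and return how many paths were flushed."""
        for path in sorted(self.dirty):
            self.repository[path] = self.local[path]
        flushed = len(self.dirty)
        self.dirty.clear()
        return flushed


repo = {"/archive/genome.dat": b"old data"}
cache = WriteBackCache(repo)
cache.write("/scratch/result.dat", b"new data")
cache.read("/archive/genome.dat")  # first read goes to the repository
cache.read("/archive/genome.dat")  # second read is served locally
print(cache.remote_reads, cache.flush())
```

The sketch leaves out eviction and failure handling; it only shows which operations touch the distant repository and which do not.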
We sell the box and we tell people to use any central or core repository that they want.
It started off with the incumbents in the storage space like EMC but now we are seeing more of the cloud as well. Now so much of the storage space is NAS (Network Attached Storage) that we call this NAS optimisation.
We started in the NAS optimisation world where you put our box in front of your on-premises storage and at the beginning of 2014 we expanded into cloud storage. So now you can talk to Amazon S3, Google, and Microsoft or you could talk to a private cloud. And it turns out that our box works as effectively in front of the cloud as it does in front of NAS.
So you support all the major players?
Yes. There is the industry standard benchmark SPEC-FS. You download the widget, you point it at your storage and it scales up your performance until you stop responding, then it scales back and it calls that your [basic figure]. Then it creates an encrypted file which you upload to the SPEC committee and they review it, and if they believe everything is legit, then they let you publish the number.
So we downloaded the SPEC-FS app, we pointed it at a three-node cluster and one of our boxes does 60,000 SPEC-ops, so a three-node cluster does 180,000 SPEC-ops. The next thing we did was run a baseline 1.0. We ran SPEC in Pittsburgh against our three nodes and put a NAS file server behind it and we got 180,000 SPEC-ops.
Then we ran SPEC against three of our nodes and we basically put them in front of a NAS box in Europe and we actually simulated a 180 millisecond latency and we still got 180,000 SPEC-ops.
Next we ran SPEC against three of our nodes and we put them in front of a private cloud in [Amazon] S3 and we still got 180,000 SPEC-ops.
And then the one that really got everybody surprised was that when we ran SPEC against three of our nodes across the public internet, we still got 180,000 SPEC-ops.
The point is that if your users are running against our box, we don't care where the repository is. It could be local or remote, it could be NAS or object, the user experience will be identical. We have the flash that is in our box, and we use that so frequently that it doesn't really matter how distant or what protocol your repository is using.
What I love about this is that we are the only company that has posted a NAS benchmark against the cloud.