David Flynn, Hammerspace | AWS re:Invent 2018


 

>> Live from Las Vegas, it's theCUBE, covering AWS re:Invent 2018. Brought to you by Amazon Web Services, Intel, and their ecosystem partners.

>> And welcome back to our continuing coverage here on theCUBE of AWS re:Invent. We're on day three of three days of wall-to-wall coverage that we've brought you from the Sands Expo, along with David Vellante. I'm John Walls. Glad you're with us here. We're joined by David Flynn from Hammerspace. David, good afternoon to you.

>> Good afternoon.

>> Been quite a year for you, right?

>> Yeah.

>> This has been something else. Set us up a little bit about where you've been, the journey you're on right now with Hammerspace, and maybe, for folks at home who aren't familiar, a little bit about what you do.

>> So Hammerspace is all about data agility. We believe that data should be like the air you breathe: where you need it, when you need it, without having to think about it. Today, data is managed by copying it between the sundry different types of storage, and that's because we're managing data through the storage system itself. What we want is for data to simply be there when you need it. So it's all about data agility.

>> I need to know more. So let's talk about some of your past endeavors. Fusion-io: we watched you grow that company from just an idea. You solved the block storage problem, you solved the performance problems; it's amazing what you guys did with that company. My understanding is you're focused on file.

>> That's right.

>> Which is a much larger--

>> Unstructured data in general, file and object.

>> So a much larger proportion of the data that's out there.

>> Yes.

>> What's the problem that you guys are going after?

>> Well, at Fusion-io, and this was pre-flash, now everybody takes flash for granted. When we started, it didn't really exist in the data center. And if you're using SAN, most likely it's for performance, and there's a better way to get performance, with flash down in the server. We were very successful with that. Now the problem is, people want the ease of manageability of having a global namespace, a file and object namespace. And that's what we're tackling now, because file is not native in the cloud; it's kind of an afterthought. And all of these different forms of storage represent silos into which you copy data: from on-prem into cloud, between the different types of storage, from one site to another. This is what we're addressing by virtualizing the data, putting powerful metadata in control of how that data is realized across multiple data centers and across the different types of storage, so that you see it as a single piece of data regardless of where it lives.

>> Okay, so file's not a first-class citizen. You're making copies, moving data all over the place. You've got copy creep going on.

>> It's like cutting off Hydra's head. When you manage data by copying it, you're just making more of it, and that's because the metadata is down with the data. Every time you make a copy, it's a new piece of data that needs to be managed.

>> So talk more about the metadata structure, the architecture, what you guys are envisioning.

>> Fundamentally, the technology is a separate metadata control plane that is powerful enough to present data as both file and object.
It takes that powerful metadata and puts it in control of where the data is realized, both in terms of what data center it's in and what type of storage it's on, allowing you to tap into the full dynamic range: the performance of server-attached flash, of course Fusion-io, very near and dear to my heart, getting tens of millions of IOPS and tens of gigabytes per second, which you can't do across the network. You have to have the data be very agile and able to be promoted into the server, and then be able to manage it all the way to global scale between whole different data centers. So that's the magic: being able to cover the full dynamic range, performance to capacity, scale and distance, and have it be that same piece of data that's simply instantiated where you need it, when you need it, based on the power of the metadata.

>> So when you talk about object, you talk about a simplified means of interacting; it's a get-put paradigm, right?

>> That's right.

>> So that's something that you're checking up?

>> That's right. Ultimately you need to also have random read and write semantics and very high performance, and today the standard model is that you put your data in object storage and then you have your application rewritten to pull it down, store it on some local storage, work with it, and then put it back. That's great for very large-scale applications, where you can invest the effort to rewrite them. But what about the world where they want the convenience of the data simply being there, in something that you can mount as a file system or access as object, and it can be at the highest performance of random I/O against local flash, all the way to cold in the cloud where it's cheap?

>> I get it, so it's great for Shutterfly, because they've got the resources to rewrite the application, but for everybody else...

>> That's right, and that's why the web scalers pioneered the notion of object storage, and we helped them with the local block to get very, very high performance. So the world bifurcated, because the spectrum got stretched so wide that a single size fits all no longer works: you take object on the capacity, distance, and scale side, and local block on the performance side. But what I realized early on, all the way back to Fusion-io, is that it is possible to have a shared namespace, both file system and object, that can span that whole spectrum. To do that, you have to provide really powerful metadata as a separate service that has the competency to actually manage the realization of the data across the infrastructure.
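To make the get-put versus random-read/write contrast above concrete, here is a minimal Python sketch. The bucket, key, and mount path are invented placeholders rather than anything from the interview, and the snippet only contrasts the two access models; it is not tied to any particular product.

```python
import os
import boto3  # AWS SDK for Python

BUCKET, KEY = "example-bucket", "dataset.bin"  # hypothetical object location
PATH = "/mnt/shared/dataset.bin"               # hypothetical file-system mount

# Object (get-put) model: to change even a few bytes, the application must
# download the whole object, patch it locally, and upload the whole thing back.
s3 = boto3.client("s3")
blob = bytearray(s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read())
blob[4096:4100] = b"\x00" * 4
s3.put_object(Bucket=BUCKET, Key=KEY, Body=bytes(blob))

# File model: the same data presented through a mounted file system supports
# random reads and writes in place, with no application rewrite required.
fd = os.open(PATH, os.O_RDWR)
old = os.pread(fd, 4, 4096)        # read 4 bytes at offset 4096
os.pwrite(fd, b"\x00" * 4, 4096)   # overwrite them in place
os.close(fd)
```

That is the rewrite burden at issue: web-scale shops can afford the first pattern, while most applications still expect the second.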
>> You know, David, you talk about data agility, and that's what we're all about, right? We're all about being agile. Just conceptually, today there's a lot more data than you've ever had to deal with before, in a lot more places.

>> It's a veritable forest.

>> With a lot more demands. So just fundamentally, how do you secure that agility? How can you provide that kind of reliability and agility in that environment? That's the challenge for you.

>> Oh yeah. Well, the challenge really goes back to the fact that the network storage protocols haven't seen innovation for like 20 years, because the world of NAS has been so dominated by a few players, well, one. There really hasn't been a lot of innovation. You know, NFSv3 has been around for decades. NFSv4 didn't really happen; it was slower and worse off. At the heart of it, the storage networking protocols for presenting a file system hadn't even been enhanced to communicate across hostile networks. So how are you going to use that at the kind of scale and distance of cloud, right? So what I did after leaving Fusion-io was go and team up with the world's top experts. We're talking here about Trond Myklebust, the Linux kernel author and maintainer of the storage networking stack. And we have spent the last five-plus years fixing the fundamental plumbing that makes it possible to bring the shared-file semantic into something that becomes cloud native. And that really is two things. One is the ability to scale, both performance and capacity, in the metadata and in the data. You couldn't do that before, because NAS systems fundamentally keep the metadata and the data together; splitting the two allows you to scale them both. So scale is one. The other is the ability to secure it over large distances and networks, the ability to operate in an eventually consistent way, to work across multiple data centers. NAS had never made the multi-data-center leap, or secured access across other networks; it just hadn't gotten there. But that is actually secondary compared to the fact that the world of NAS is very focused on the infrastructure guys and the storage admin. What you have to do is elevate the discussion to be about the data user, and empower them with powerful metadata to do self-service, as a service, so that they can completely automate all of the concerns about the infrastructure. Because if there's anything that's cloud, it's being able to delegate and hand off the infrastructure concerns, and you simply can't do that when you're coming at it from a storage-administration, data-janitorial kind of model.

>> So I want to pause for a second and just talk to our audience and stress how important it is to pay attention to this man. There's no such thing as a sure thing in business, but there is one sure thing: if David Flynn is involved, you're going to disrupt something. So you disrupted SCSI, the horrible storage stack. So when you hear things today like NVMe and CAPI and atomic writes and storage-class memory, you got it all started at Fusion-io.

>> That's right.

>> And it was your vision that really got that started. When I used to talk to people about it, they would say I was crazy, and you educated myself and Floyer, and now you see it coming to fruition today. So you're taking aim at decades-old infrastructure and protocols, namely NAS, and trying to do the same thing at cloud scale, which is obviously something you know a lot about.

>> That's right. I mean, if you think about it, the spectrum of data goes from performance on one hand to ease of manageability, distance, and scale on the other, cost per capacity versus cost per performance. And that's inherent to our physical universe, because it takes time to propagate information over a distance, and encoding things very, very tightly for capacity efficiency takes time, which works against performance. And as technology advances, the spectrum only gets wider, and that's why we're stuck to the point of having to bifurcate it: performance is locally attached flash. That's what I pioneered with flash in the server and NVMe. I told everybody: EMC, SAN, it sucks; if you want performance, put flash in the server. Now we're saying that if you want ease of use and manageability, there's a better way to do that than NAS, and even object storage. It's to separate the metadata as a distinct control plane that is put in charge of managing data through very rich and powerful metadata, and that puts the data owner in control of their data, not just across different types of storage in the performance-capacity spectrum, but also across on-prem and the cloud, and across multi-cloud.
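To illustrate the metadata/data split described above, here is a toy Python sketch, assuming only that a metadata service hands out layouts while clients talk to the data nodes directly. It is a teaching-sized model of the concept, not Hammerspace's implementation or any real protocol, and every name in it is invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List

# Toy model of a separate metadata control plane: the metadata service only
# maps paths to layouts (which node holds which extent); clients then do I/O
# directly against the data nodes, so metadata and data scale independently.

@dataclass
class Extent:
    data_node: str   # e.g. "flash-07" or "s3-cold" -- illustrative tier names
    offset: int
    length: int

class MetadataService:
    def __init__(self) -> None:
        self._layouts: Dict[str, List[Extent]] = {}

    def set_layout(self, path: str, extents: List[Extent]) -> None:
        self._layouts[path] = extents

    def get_layout(self, path: str) -> List[Extent]:
        # Returns where the bytes live; the metadata plane never moves the bytes.
        return self._layouts[path]

def read_file(meta: MetadataService, data_nodes: Dict[str, bytes], path: str) -> bytes:
    # Client read path: one small metadata round trip, then direct,
    # extent-by-extent access to whichever nodes hold the data.
    out = bytearray()
    for ext in meta.get_layout(path):
        out += data_nodes[ext.data_node][ext.offset:ext.offset + ext.length]
    return bytes(out)

# Usage: one file whose two extents live on two different "tiers".
meta = MetadataService()
data_nodes = {"flash-07": b"hello ", "s3-cold": b"world"}
meta.set_layout("/projects/demo.txt",
                [Extent("flash-07", 0, 6), Extent("s3-cold", 0, 5)])
print(read_file(meta, data_nodes, "/projects/demo.txt"))  # b'hello world'
```

Because the metadata plane never carries the bulk data, it can stay small while extents are promoted to local flash or demoted to cheap capacity without the namespace ever changing.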
Because the cloud, after all, is just another big storage silo, and given the inertia of data, they've got you by the balls when they've got all the data there. (laughing) I'm sorry, I know I'm at AWS, I should be careful what I say.

>> Well, this is live.

>> Yeah, okay, so they can't censor us, right? It's just like the storage vendors of yesteryear, who would charge you an arm and a leg once their arrays were out of service, because they knew that if you were trying to extend the service life of an array, it was because it was really hard for you to get the data off of it; you had to suffer application downtime and all of that. In the same fashion, when you have your data in the cloud, the egress costs are so expensive. So this is all about putting the data owner in control of the data by giving them a rich, powerful metadata platform to do that.

>> You always want strategies that give you flexibility, exit strategies if things don't work out, so that's fascinating. I know we've got to wrap, but give us the low-down on the company: the funding, what can you share with us, go-to-market, et cetera.

>> So it's a tightly held company. I was very successful financially, so from that point of view we're...

>> Self-funded.

>> Self-funded, funded from angels. I made some friends with Fusion-io, right? So from that point of view, yeah, it's the highest-powered team you can get. I mean, these are great guys, the Linux kernel maintainer of the storage networking stack. This was a heavy lift, because you have to fix the fundamental plumbing in the way storage networking works. It's like a directory service for data, and then all the management services. This has been a while in the making, but it's that foundational engineering.

>> You love heavy lifts.

>> I love hard problems.

>> I feel like I mis-introduced you; the great disruptor is what I should have said.

>> Well, we'll see. I think disrupting the performance side was a pure play and very easy. Disrupting the ease-of-use side of the data spectrum, that's the fun one, the one that's actually so transformative, because it touches the people that use the data.

>> Well, best of luck. I'm excited for ya.

>> Thanks for joining us, David. We appreciate the time. David Flynn, joining us from Hammerspace, and back with more on theCUBE at AWS re:Invent. (upbeat music)

Published Date: Nov 29, 2018

SUMMARY:

theCUBE's John Walls and David Vellante interview David Flynn of Hammerspace at AWS re:Invent 2018. Flynn describes Hammerspace's focus on data agility for unstructured file and object data: a separate metadata control plane that presents a single namespace and decides where data is realized, rather than managing data by copying it between storage silos. Drawing on lessons from Fusion-io, he argues that NAS protocols have stagnated for decades, explains how splitting metadata from data lets performance and capacity scale from server-attached flash to cold cloud storage across data centers and clouds, and notes that the self-funded company was built with veteran Linux storage-networking engineers.
