Liran Zvibel & Andy Watson, WekaIO | CUBE Conversation, December 2018

(cheery music)
>> Hi, I'm Peter Burris, and welcome to another CUBE Conversation from our studios in Palo Alto, California. Today we're going to be talking about some new advances in how data gets processed. Now, it may not sound exciting, but when you hear about some of the performance capabilities, and how they liberate new classes of applications, this is important stuff. To have that conversation we've got WekaIO here with us: specifically Liran Zvibel, the CEO of WekaIO, joined by Andy Watson, the CTO of WekaIO. Liran, Andy, welcome to theCUBE.
>> Thanks.
>> Thank you very much for having us.
>> So Liran, you've been here before; Andy, you're a newbie. So Liran, let's start with you. Give us the WekaIO update: what's going on with the company?
>> So 2018 has been a grand year for us. We've had great market adoption. We spent last year proving our technology, and this year we have accelerated our commercial successes: we've expanded to Europe, we've hired quite a lot of sales in the US, and we're seeing a lot of successes around machine learning, deep learning, and life sciences data processing.
>> And you've hired a CTO.
>> And we've hired the CTO, Andy Watson, which I am excited about.
>> So Andy, what's your pedigree? What's your background?
>> Well, I've been around a while, got the scars on my back to show it, mostly in storage, dating back to even Auspex before NetApp, but probably best known for the years I spent at NetApp. I was there from '95 through 2007, kind of the glory years; I was the second CTO at NetApp, as a matter of fact, and that was a pretty exciting time. We changed the way the world viewed shared storage, I think it's fair to say, at NetApp, and it feels the same here at WekaIO. That's one of the reasons I'm so excited to have joined this company, because it's the same kind of experience of having something so revolutionary that quite often, whether it's a customer or an analyst like yourself, people are a little skeptical. They find it hard to believe that we can do the things that we do, so it's gratifying when we have the data to back it up, and it's really a lot of fun to see how customers react when they actually have it in their environment and it changes their workflow and their life experience.
>> Well, I will admit, and I might be undermining my credibility here, that back in the mid 90s I was a little bit skeptical about NetApp. But I'm considerably less skeptical about WekaIO, just based on the conversations we've had. Let's turn to that, because there are classes of applications that are highly dependent on very large numbers of small files being moved very, very rapidly, like machine learning. So you mentioned machine learning, Liran: talk a little bit about some of the market success that you're having, some of those applications' successes.
>> Right, so machine learning actually works extremely well for us, for two reasons.
For one big reason, machine learning is performed by GPU servers, servers with several GPU offload engines in them, and what we see with this kind of server is that a single GPU server replaces ten or tens of CPU-based servers, so you actually need the IO performance to be ten or tens of times what the CPU servers had. So we came up with a way of providing significantly higher IO, two orders of magnitude higher, to a single client on the one hand; on the other hand, we have solved the performance problem from the metadata perspective, so we can have directories with billions of files and a whole file system with trillions of files. When we look at the autonomous driving problem, for example: if you look at the high-end car makers, they have eight cameras around the car. These cameras capture at low resolution, because you don't need very high resolution to recognize a line, or a cat, or a pedestrian, but they capture at 60 frames per second, so in 30 minutes you get about 100,000 files, which is about as many as traditional filers could put in a directory. But if you'd like to have your cars running in the Bay Area, you'd like to have all the data from the Bay Area in a single directory, and then you need directories with billions of files, which works for us. And what we have heard from some of our customers that have had great success with our platform is that not only do they get hundreds of gigabytes per second of small-file read performance, they tell us they have taken their standard time per training epoch from about two weeks, before they switched to us, down to four hours.
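
To put rough numbers on that camera example, here is a quick back-of-the-envelope sketch in Python. The per-drive figures follow directly from the numbers quoted above; the fleet-size figures are purely illustrative assumptions, not from the conversation:

```python
# Back-of-the-envelope file counts for the autonomous-driving example.
# Quoted figures: 8 cameras per car, 60 frames per second, a 30-minute drive.
cameras, fps, minutes = 8, 60, 30

frames_per_camera = fps * 60 * minutes        # 108,000 frames in 30 minutes,
print(f"{frames_per_camera:,} files per camera per drive")
# roughly the ~100k-files-per-directory ceiling attributed to traditional filers

frames_per_car = frames_per_camera * cameras  # 864,000 frames per car per drive
print(f"{frames_per_car:,} files per car per drive")

# Hypothetical fleet (assumed values only): even a modest test fleet pushes a
# single shared directory into the billions of files described above.
cars, drives_per_day = 2_000, 8
daily_total = frames_per_car * cars * drives_per_day
print(f"{daily_total:,} files per day across the fleet")  # ~13.8 billion
```
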
>> Now let's explore that, because one of the key reasons there is the scalability of the number of files you can handle. In other words, instead of running up against a limit on the number of files they can push through the system to saturate these GPUs, as they would on some other storage or file technology, they don't have to stop, set the job up again, and run it over and over; they can run the whole job against the entire expansive set of files, and that's crucial to speeding up the delivery of the outcome, right?
>> Definitely. What these customers used to do before us is local caching, because NFS was not fast enough for them, so they would copy the data locally and then run over it on the local file system, because that has been the pinnacle of performance in recent years. We are the only storage currently, and I think we'll actually be the first wave of storage solutions, where a shared platform built for NVMe is actually faster than a local file system. So we let them go through any file; they don't have to decide up front which files go to which server, and we are even faster than the traditional caching solutions.
>> And imagine having to collect the data and copy it to the local application server, and do that again and again for a whole server farm, right? It's bad enough to do it even once; to do it many times, and then to do it over and over and over again, is a huge amount of work.
>> And a lot of time?
>> And a lot of time, and cumulatively that burden is going to slow you down, so that makes a big, big difference. And secondly, as Liran was explaining, if you put 100,000 files in a directory of other file systems, that is stressful. You want to put more than 100,000 files in a directory of other file systems? That is a tragedy. We routinely handle millions of files in a directory; it doesn't matter to us at all, because just as we distribute the data, we also distribute the metadata. That's completely counter to the way other file systems are designed, because they were all designed in an era when the focus was on the physical geometry of hard disks, and we have been designed for flash storage.
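
The distributed-metadata idea Andy describes can be illustrated with a toy sketch. This is a minimal illustration of hash-distributed metadata in general, not WekaIO's actual design; the shard count and hashing scheme are assumptions:

```python
import hashlib
from collections import Counter

NUM_METADATA_SHARDS = 16  # assumed shard count, purely illustrative

def shard_for(directory: str, filename: str) -> int:
    """Map a directory entry to a metadata shard by hashing its full path.

    Because placement depends on the hash rather than on the directory,
    a directory with millions of entries spreads evenly across all shards
    instead of serializing on a single metadata server.
    """
    digest = hashlib.sha256(f"{directory}/{filename}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_METADATA_SHARDS

# A million files in ONE directory still land on all 16 shards evenly.
load = Counter(shard_for("/frames/bay_area", f"cam3_{i:07d}.jpg")
               for i in range(1_000_000))
print(load.most_common(3))  # each shard holds roughly 1/16 of the entries
```
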
>> And the metadata associated with the distribution of that data typically was in one file, in one place, and that was the master serialization problem when you come right down to it. So we've got a lot of ML workloads: very large numbers of files, definitely improved performance because of the parallelism through your file system in, as I said, the ML world. Let's generalize this. What does this mean overall? You've kind of touched upon it, but what does it mean overall for the way customers are going to think about storage architectures in the future, as they combine ML and related workloads with more traditional types of things? What's the impact of this on storage?
>> So if you look at how people have architected their solutions around storage recently, you have four different kinds of storage systems. If you need the utmost performance, you go to DAS; Fusion-io had a run perfecting DAS, and then the whole industry realized it.
>> Direct-attached storage.
>> Direct-attached storage, right. The industry realized, hey, it makes so much sense, and they created a standard out of it, NVMe. But then you're wasting a lot of capacity, and you cannot manage it, you cannot back it up. If you need some way to manage it, you put your data on a SAN; actually our previous company was XIV Storage, which IBM acquired, and the vast majority of its use cases are actually people buying block and then overlaying a local file system on it, because that gets you so much higher performance, but you cannot share the data. Now, if you put it on a filer, which is NetApp, or Isilon, or the other solutions, you can share the data, but your performance is limited, and your scalability is limited, as Andy just said. And if you had to scale through the roof...
>> With a shared storage approach.
>> With a shared storage approach, you had to go and port your application to an object storage, which is an enormous feat of engineering, and tons of those projects actually failed. We actually bring a new kind of storage: a shared storage as scalable as an object store but faster than direct-attached storage. So looking at the traditional storage systems of the last 20 or 30 years, we have all the advantages people have come to expect from the different categories, but we don't have any of the downsides.
>> Now give us some numbers. Do you have any benchmarks you can talk about that show, or verify, or validate this vision that Weka's delivering on?
>> Definitely. The IO500?
>> Sure, sure. We recently published our IO500 performance results at the SC18 event in Dallas, and there are two different metrics.
>> So fast you can go back in time?
>> Yes, exactly, there are two different metrics. One metric is an aggregate total amount of performance; that's a much longer list. I think the more interesting one is the 10-client version, which we like to focus on, because we believe the most important thing for a customer to focus on is how much IO you can deliver to an individual application server. This part of the benchmark is most representative of that, and on that rating we came in second. Well, after you filter out the irrelevant results, which is a separate process.
>> Typical of every benchmark.
>> Yes, exactly. Of the relevant, meaningful results, we came in second, behind the world's largest and most expensive supercomputer at Oak Ridge, the Summit system. They have a 40-rack system, and we have a half, or maybe a little more than half, of one rack of industry-standard hardware running our software. So compare that: the cost of our hardware footprint and so forth is much less than a million dollars.
>> And what was the differential between the two?
>> Five percent.
>> Five percent? So, okay, sound of jaw dropping. A 40-rack system at Oak Ridge gets five percent more performance than you guys running on effectively a half rack of, like, Supermicro or something like that?
>> Oh, and it was the first time we ran the benchmark. We were just learning how to run it, and those guys are all experts; they had IBM in there at their elbow helping them with all their tuning and everything. This was literally the first time our engineers ran the benchmark.
>> Is a large feature of that the fact that Oak Ridge had to get all that hardware to get the physical IO necessary to run serial jobs, and you guys can just do this in parallel on a relatively standard NVMe subsystem?
>> Beyond that, you have to learn how to use all those resources, right? All the tuning, all the expertise. One of the things people say is you need a PhD to administer one of those systems, and they're not far off, because it does take a lot of expertise. Our systems are dirt simple.
>> Well, you've got to move the parallelism somewhere, and either you create it yourself, like you do at Oak Ridge, or you do it using your guys' stuff, through a file system.
>> Exactly, and what we are showing is that we have tremendously higher IO density. Instead of using a local file system, most of which were created in the 90s, with a serial way of thinking optimized for hard drives, you can say: hey, NVMe devices, SSDs, are beasts at running 4k IOs. If you solve the networking problem, if the network is not the bottleneck anymore, and you run all your IO as a massively parallelized workload of 4k IOs, you actually get much higher performance than what was, until we came along, the pinnacle of performance: a local file system over a local device.
>> Well, so NFS has an effective throughput limitation of somewhere around a gigabyte per second, so if you've got a bunch of GPUs each wanting four, five, ten gigabytes of data coming in, you're not saturating them out of an effective one-gigabyte throughput rate. It's almost like you've got the New York City waterworks coming in to some of these big file systems, and a little kitchen tap that's actually spitting the data out into the GPUs. Have I got that right?
>> Good analogy. If you are creating a data lake and then you're going to sip at it with some tiny little straw, it doesn't matter how much data you have; you can't really leverage the value of all that data you've accumulated, because if you're feeding it into your compute farm, GPU or not, too slowly, then you'll never get to it all, right? And meanwhile more data is coming in every day, at a faster rate. It's an impossible situation, so the only solution really is to increase the rate at which you access the data, and that's what we do.
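
A quick sanity check on the throughput mismatch Peter describes, using the round numbers quoted in the conversation; the farm size is an assumed example, not a figure from the interview:

```python
# Throughput mismatch: effective NFS limit per client vs. GPU-server appetite.
nfs_per_client_gbs = 1.0    # ~1 GB/s effective NFS throughput per client (quoted)
gpu_server_need_gbs = 5.0   # GPU servers want 4-10 GB/s each (quoted range)

print(f"each GPU server is starved roughly "
      f"{gpu_server_need_gbs / nfs_per_client_gbs:.0f}x behind its appetite")

# A hypothetical farm of 20 GPU servers (assumed size, for illustration):
farm = 20
need = farm * gpu_server_need_gbs    # ~100 GB/s aggregate demand
ceiling = farm * nfs_per_client_gbs  # ~20 GB/s even if every client hits the cap
print(f"aggregate demand ~{need:.0f} GB/s vs ~{ceiling:.0f} GB/s NFS ceiling")
```
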
>> So I can see how you're making the IO bandwidth junkies at Oak Ridge happy, or would make them really happy. But the other thing I find interesting about WekaIO, as you just talked about, is that you've come up with an approach built specifically for SSD. You've moved the parallelism into the file system, as opposed to having it be somewhere else, which is natural, because SSD is not built just to persist data, it's built to deliver data. And that suggests, as you said earlier, that we're looking at a new way of thinking about storage as a consequence of technologies like Weka and technologies like NVMe. Now Andy, you came from NetApp, and I remember what NetApp did to the industry when it started talking about the advantages of sharing storage. Are we looking at something similar happening here with SSD and NVMe and Weka?
>> Indeed, I think that's the whole point. It's one of the reasons I'm so excited about it. It's not only because we have this technology that opens up this opportunity, this potential being realized; the other thing is, there's a lot of features, a lot of meaningful software, that needs to be written around this architectural capability. And the team that I joined, with their background coming from having created XIV before, the almost amazing way they all think together and recognize the market, and the way they interact with customers, allows the organization to address customer requirements realistically. So instead of just doing things we want to do because they seem elegant, or because the technology sparkles in some interesting way, this company, and it reminds me of NetApp in the early days, and it was a driver of NetApp's big success, is very customer-focused, very customer-driven. When customers tell us what they're trying to do, we want to know more. Tell us in detail how you're trying to get there. What are your requirements? Because if we understand better, then we can engineer what we're doing to meet you there, because we have the fundamental building blocks. Those are mostly done; now what we're trying to do is add the pieces that allow you to implement it into your workflow, into your data center, or into your strategy for leveraging the cloud.
>> So Liran, when you're here in 2019 and we're having a similar conversation, with this customer focus, you've got a value proposition for the IO bandwidth junkies, you can give them more; what's next in your sights? Are you going to show how, for example, you can get higher performance with less hardware?
>> So we are already showing how you can get higher performance with less hardware, and I think as we go forward, we're going to have more customers embracing us for more workloads. What we see already is that they bring us in for either the high end of their life sciences or their machine learning, and then people working around those teams realize, hey, I could get some faster speed as well, and then we start expanding within these customers and get to see more and more workloads where people like us, and we can start telling stories about them. The other thing that comes naturally to us: we run natively in the cloud, and we actually let you move your workloads seamlessly between on-premises and the cloud. We are seeing tremendous interest in moving to the cloud today, but not a lot of organizations do it yet. I think from 2019 forward, we are going to see more and more enterprises seriously considering moving to the cloud, because almost 100% of our customers are POCing cloudbursting, but not a lot of them are using it yet. I think as time passes, all of them that have seen it working, when they did the initial test, will start leveraging this and getting the elasticity out of the cloud, because this is what you should get out of the cloud. So this is one avenue of expansion for us. We are going to put more resources into Europe, where we have recently started building the team, and later in the year, also JAPAC.
>> Gentlemen, thanks very much for coming on theCUBE and talking to us about some new advances in file systems that are leading to greater performance, less specialized hardware, and enabling new classes of applications. Liran Zvibel is the CEO of WekaIO, Andy Watson is the CTO of WekaIO. Thanks for being on theCUBE.
>> Thank you very much.
>> Yeah, thanks a lot.
>> And once again, I'm Peter Burris, and thanks very much for participating in this CUBE Conversation. Until next time.
(cheery music)

Published Date : Dec 14 2018
