Liran Zvibel, WekaIO & Maor Ben Dayan, WekaIO | AWS re:Invent 2017


 

>> Announcer: Live from Las Vegas, it's theCube, covering AWS re:Invent 2017, presented by AWS, Intel, and our ecosystem of partners.

>> And we're back, here on the show floor in the exhibit hall at Sands Expo, live at re:Invent for AWS, along with Justin Warren. I'm John Walls. We're joined by a couple of executives now from WekaIO: to my immediate right is Liran Zvibel, who is the co-founder and CEO, and then Maor Ben Dayan, who's the chief architect at WekaIO. Gentlemen, thanks for being with us.

>> Thanks for having us.

>> Appreciate you being here on theCube. First off, tell the viewers a little bit about your company, and I think a little about the unusual origin of the name, which you were sharing with me as well. So let's start with that, and then tell us a little bit more about what you do.

>> Alright, so the name is WekaIO. Weka is actually a Greek unit, like mega and tera and peta: it's a trillion exabytes, ten to the power of thirty. It's a huge capacity, so it works well for a storage company. Hopefully we will end up storing wekabytes. It will take some time.

>> I think a little bit of time to get there.

>> A little bit.

>> We're working on it.

>> One customer at a time. Give us a little more about what you do, in terms of your relationship with AWS.

>> Okay, so at WekaIO we've created the highest-performance file system, either on prem or in the cloud. It's a parallel file system over NVMe. Previous-generation file systems did parallel work over hard drives, but that's 20-year-old technology. We're the first file system to bring new parallel algorithms to NVMe, so we get you the lowest latency and highest throughput, either on prem or in the cloud. We are perfect for machine learning and life sciences applications. Also, you've mentioned media and entertainment earlier.
We can run on your hardware on prem, we can run on i3 instances in AWS, and we can also take snapshots at native performance, so they don't take away performance. We also have the ability to take these snapshots and push them to S3-based object storage. This allows you to have DR or backup functionality if you work on prem, but if your object storage is actually AWS S3, it also lets you do cloud bursting: you can take your on-prem cluster, connect it to AWS S3, take a snapshot, and push it to S3. Now, if you have a huge amount of computation that you need to do, and your local GPU servers don't have enough capacity, or you just want to get the results faster, you can build a big enough cluster on AWS, get the results, and bring them back.

>> You were explaining before that it's a big challenge to do something that handles both low latency with millions and millions of small files and high throughput for a few large files. Media and entertainment tends to have very few but very, very large files, while with something like genomics research you'll have millions and millions of files, but they're all quite tiny. That's quite hard, but you were saying it's actually easier to do the high throughput than the low latency. Maybe explain some of that.

>> You want to take it?

>> Sure. On the one hand, streaming lots of data is easy when you distribute the data over many servers or instances in AWS, like Lustre does, or other solutions, but then doing small files becomes really hard. Now, this is where Weka innovated and really solved this bottleneck, so it really frees you to do whatever you want with the storage system without hitting any bottlenecks. This is the secret sauce of Weka.

>> Right, and you were mentioning before, it's a file system, so there's NFS and SMB access to this data, but you were also saying that you can export to S3.
>> Actually, we have NFS, we have SMB, but we also have native POSIX, so any application that up until now you could only run on a local file system such as EXT4 or ZFS, you can actually run in a shared manner. Anything that's written in the man pages, we do, so it just works: locking, everything. One thing we're showing for life sciences, for genomics workflows, is that we can scale their workflows without losing any performance. If one server doing one kind of transformation takes time x, then 10 servers will take the same time to get 10x the results, and 100 servers will get 100x the results. What customers see with other storage solutions, either on prem or in the cloud, is that they're adding servers but getting way less results. We're giving customers five to 20 times more results than what they got on what they thought were high-performance file systems prior to the WekaIO solution.

>> Can you give me a real-life example of this? When you talk about life sciences, you talk about genomics research, and we talk about the itty-bitty files and millions of samples and whatever, but translate it for me: when it comes down to a real job, a real task, what exactly are you bringing to the table that enables whatever research or examination is being done?

>> I'll give you a general example, not specifically from life sciences. We were doing a POC at a very large customer last week, and we were compared head to head with a best-of-breed, all-flash file system. They did a simple test: they created a large file system on both storage solutions, filled with many, many millions of small files, maybe even billions of small files, and they wanted to go through all the files, so they just ran the find command. The leading competitor finished the work in six and a half hours. We finished the same work in just under two hours.
That's more than a 3x difference compared to a solution that is currently considered probably the fastest.

>> The gold standard, allegedly, right? Allegedly.

>> It's a big difference. During the same comparison, that customer also did an ls of a directory with a million files; the other leading solution took 55 seconds, and it took just under 10 seconds for us.

>> We just get you the results faster, meaning your compute remains occupied and working. If you're working with, let's say, GPU servers that are costly, usually they are just idling around, waiting for the data to come to them. We unstarve these GPU servers and let you get what you paid for.

>> And particularly with something like the elasticity of AWS, if it takes me only two hours instead of six, that's gonna save me a lot of money, because I don't have to pay for those extra hours.

>> It does, and if you look at the price of the P3 instances, for a reason, those Volta GPUs aren't inexpensive; any second they're not idling around is a second you saved, and you're actually saving a lot of money. So we're showing customers that by deploying WekaIO on AWS and on premises, they're actually saving a lot of money.

>> Explain some more about how you're able to bridge between on-premises and cloud workloads, because I think you mentioned before that you can actually snapshot and then send the data as a cloud-bursting capability. Is that the primary use case you see customers using, or is it another way of getting your data from your site into the cloud?

>> Actually, we have a slightly more complex feature; it's called tiering through the object storage.
Now, customers have humongous namespaces, some of them hundreds of petabytes, and it doesn't make sense to keep it all on NVMe flash; it's too expensive. So a big feature we have is that we let you tier between your flash and object storage, which lets you manage the economics. We're actually chopping large files down into many objects, similarly to how a traditional file system treats hard drives. We treat NVMe drives in a parallel fashion, which is a world first, but we also apply to the object storage all the tricks that a traditional parallel file system does to get good performance out of hard drives. Now we take that tiering functionality and couple it with our high-performance snapshotting abilities, so you can take a snapshot and push it completely into the object storage, in a way that you don't require the original cluster anymore.

>> So you've mentioned a few of the areas of your expertise now, and certainly where you're working. What are some other verticals that you're looking at? What are some other areas where you think you can bring what you're doing, maybe in the life sciences space, and provide equal if not superior value?

>> Currently...

>> Like, where are you going?
Currently we focus on GPU-based execution, because that's where we save the most money for the customers; we give the biggest bang for the buck. Also genomics, because they have severe performance problems. And around building: we've shown a huge semiconductor company that was trying to build, and they were forced to build on a local file system; it took them 35 minutes. The fastest shared option they tried was actually a battery-backed, RAM-based shared file system using NFS v4, and it took them four hours. That was too long; you only got two compiles a day. It doesn't make sense.

We showed them that they can actually compile in 38 minutes, showing that a shared file system that is fully coherent, consistent, and protected only took 10% more time. But really it didn't take 10% more time, because what we enabled them to do is share the build cache, so the next build coming in only took 10 minutes. A full build took slightly longer, but if you take the average, their builds were now 13 or 14 minutes, so we've actually shown that a shared file system can save time. Other use cases are media and entertainment, for rendering: these use cases parallelize amazingly well. You can have tons of render nodes rendering your scenes, and the more render nodes you have, the quicker you can come up with your videos and your movies, or the nicer they look. We enable our customers to scale their clusters to sizes they couldn't even imagine prior to us.

>> It's impressive, really impressive, great work, and thanks for sharing it with us here on theCube, first time for each, right? You're now Cube alumni, congratulations.

>> Okay, thanks for having us.

>> Thank you for being with us here. Again, we're live here at re:Invent, back with more live coverage here on theCube right after this time out.
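The find-command benchmark discussed above is a pure metadata-scan workload: the cost is in walking directories and stat-ing millions of tiny files, not in reading data. A minimal, scaled-down sketch of that workload in Python follows; the directory layout and file counts are illustrative stand-ins, not the actual benchmark dataset:

```python
import os
import tempfile
import time

def build_tree(root: str, dirs: int, files_per_dir: int) -> int:
    """Create a tree of tiny files to stand in for the small-file dataset."""
    total = 0
    for d in range(dirs):
        dpath = os.path.join(root, f"dir{d:03d}")
        os.makedirs(dpath, exist_ok=True)
        for f in range(files_per_dir):
            with open(os.path.join(dpath, f"file{f:04d}"), "w") as fh:
                fh.write("x")  # tiny payload; the scan cost is metadata, not data
            total += 1
    return total

def scan_tree(root: str) -> int:
    """Walk the tree and count files, roughly what `find <root> -type f` does."""
    count = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        count += len(filenames)
    return count

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        created = build_tree(root, dirs=20, files_per_dir=50)
        start = time.perf_counter()
        found = scan_tree(root)
        elapsed = time.perf_counter() - start
        print(found, created, f"{elapsed:.4f}s")
```

On a local file system this completes almost instantly; the interview's point is that at the scale of millions or billions of files, the same traversal becomes dominated by the storage system's metadata latency, which is where the 6.5-hour versus 2-hour gap came from.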
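The tiering feature described above chops large files into many objects for the object store, the way a parallel file system chops I/O across hard drives. A toy illustration of that general chop-and-reassemble idea follows; the 4 MiB object size and `chunk-` key naming are invented for the example and are not WekaIO's actual layout:

```python
OBJECT_SIZE = 4 * 1024 * 1024  # illustrative "object" size, not Weka's actual value

def to_objects(data: bytes, object_size: int = OBJECT_SIZE) -> dict:
    """Chop a file's bytes into fixed-size chunks keyed like object-store keys."""
    return {
        f"chunk-{i:08d}": data[off:off + object_size]
        for i, off in enumerate(range(0, len(data), object_size))
    }

def from_objects(objects: dict) -> bytes:
    """Reassemble the original bytes by concatenating chunks in key order."""
    return b"".join(objects[key] for key in sorted(objects))

if __name__ == "__main__":
    data = bytes(range(256)) * 70000  # ~17 MiB of sample data
    objs = to_objects(data)
    assert from_objects(objs) == data  # round trip is lossless
    print(len(objs))
```

Each fixed-size chunk can then be written or read as an independent object, which is what makes parallel access to the object tier possible: many chunks of one file can be pushed to or pulled from S3 concurrently.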

Published Date: Dec 1, 2017

