
Liran Tal, Snyk | CUBE Conversation


 

(upbeat music) >> Hello, everyone. Welcome to theCUBE's coverage of the "AWS Startup Showcase", season two, episode one. I'm Lisa Martin, and I'm excited to be joined by Snyk next in this episode. Liran Tal joins me, the director of developer advocacy. Liran, welcome to the program. >> Lisa, thank you for having me. This is so cool. >> Isn't it cool? (Liran chuckles) All the things that we can do remotely. So I had the opportunity to speak with your CEO, Peter McKay, just about a month or so ago at AWS re:Invent. So much growth and momentum going on with Snyk, it's incredible. But I wanted to talk to you specifically, let's start with your role from a developer advocate perspective, because Snyk is saying modern development is changing, so traditional AppSec gatekeeping doesn't apply anymore. Talk to me about your role as a developer advocate. >> It definitely is. The landscape is changing, both development and security; it's just not what it was before, and what we're seeing is that developers need to be empowered. They need some help working through all of those security issues and security incidents happening, using open source, building cloud native applications. So my role is basically about making them successful, helping them any way we can: getting that security awareness out, making sure people have those best practices, making sure we understand what frustrations developers have and what we can help them with to be successful day to day, and how they can be a really good part of the organization in terms of fixing security issues, not just knowing about them but being proactive about them. >> And one of the things I was also reading is that Shift Left is not a new concept; we've been talking about it for a long time. But Snyk is saying it was missing some things, and proactivity is one of those things that was missing. What else was it missing, and how does Snyk help to fix that gap?
>> So I think Shift Left is a good idea. In general, the idea is that we want to find and fix security issues as soon as we can, and that is a small nuance that was kind of missing in the industry. Usually, what we've seen with traditional security before was that the security department was like a silo: once the organization found some findings, they pushed them over to the development team, the R&D leader, or things like that, but until it actually trickled down, it took a lot of time. What we needed to do is put developer security tools, which is what Snyk is building with this whole security platform, into the hands of developers, at the scale and speed of modern development. So, for example, instead of just finding security issues in your open source dependencies, what we actually do at Snyk is not just tell you about them: we actually open a pull request in your source code version management system. Through that, you can merge it, you can review it, you can have it as part of your day-to-day workflows. And we're doing that in so many other ways that are really helpful in actually remediating the problem. Another example would be the IDE. We embed an extension within your IDE, so as you type your own code, that is when we find the vulnerabilities that could exist within it, if there is insecure code, and we can tell you about it as you hit Command+S and save the file. Which is totally different from what SAST tools, static application security testing, were before, because when things started, you usually had SAST tools running in the background, in CI jobs at the weekend and on deltas of code bases, because they were so slow to run. But developers really need to be at speed. They're developing really fast. They need to deploy.
Code is deployed to production several times a day, so we need to really enable developers to find and fix those security issues as fast as we can. >> Yeah, that speed you mentioned is absolutely critical to their workflow and what they're expecting. And one of the unique things about Snyk that you mentioned is the integration into the development workflow, with the IDE, with CI/CD, with the Git environment, enabling developers to work at speed and not have to be security experts. I imagine those are two important elements of the culture of the developer environment, right? >> Correct, yes. A large part is that we don't expect developers to be security experts. We want to help them; we want to give them the tools, give them the knowledge. We do it in several ways. For example, that IDE extension has a really cool capability that is kind of unique to it, and that is, when you're writing code and maybe there's a path traversal vulnerability in the function you just wrote, what we'll actually do when we tell you about it is also show you: hey, look, these are some commits made by other open source projects where we found the same vulnerability, and those commits actually fixed it. So we're giving you example cases of what potentially good code looks like. If you think about it, who really knows what path traversal is, or prototype pollution, or many other types of vulnerabilities? We don't expect developers to know the deep aspects of security. Otherwise they're left with some findings that they want to fix, but they don't really have the expertise to fix them. So what we're doing is bridging that gap and being helpful. I think this is what proactive security really is for developers: helping them remediate it.
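To make the example concrete: a path traversal flaw of the kind described is a function that lets user input escape its intended directory, and the typical fix constrains the resolved path. The sketch below is an illustrative Python example, not Snyk's output; the base directory and function names are invented for the sketch.

```python
import os

BASE_DIR = "/var/app/uploads"

def read_upload_unsafe(filename):
    # Vulnerable: a name like "../../../etc/passwd" escapes the upload
    # directory, because the joined path is never validated.
    with open(os.path.join(BASE_DIR, filename)) as f:
        return f.read()

def read_upload_safe(filename):
    # Fixed: resolve the full path, then verify it is still inside
    # BASE_DIR before touching the filesystem.
    path = os.path.realpath(os.path.join(BASE_DIR, filename))
    if not path.startswith(os.path.realpath(BASE_DIR) + os.sep):
        raise ValueError("path traversal attempt blocked")
    with open(path) as f:
        return f.read()
```

The unsafe version happily resolves `../../../etc/passwd`; the safe version rejects it before opening anything.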
And I can give more examples. The vulnerability database is a wonderful place where we also provide examples and references: where does the vulnerability come from, what is the flaw in the open-source package? We highlight that with a lot of references, like the pull request that fixed it, or the issue where it was discussed. You have an entire context of what made this vulnerability happen, a little more context than just merging some update. And there's a ton more; I'm happy to dive deeper into this. >> Well, I can hear your enthusiasm for it; a developer advocate you certainly seem to be. But talking about the gaps that you guys are filling, it also seems like this is a bridge for the developer and security teams to work better together. >> Correct. It is not siloed anymore. I think the ideas of having security champions or running threat modeling activities are really, really good, insightful practices for both developers and security, but more than being insightful, they are useful practices that organizations should actually do: bringing a discussion together, creating a more cohesive environment for both of those kinds of expertise, development and security, to work together on mitigating security issues. And one of the things Snyk is doing to bring security into the developer mindset is also providing the ability to prioritize and to understand what policies to put in place.
>> So a lot of the time, what the security org wants to do is put guardrails in place so that developers have good leeway to work, but aren't doing things they definitely shouldn't do, things that bring a big risk into the organization. And that's something I think we do really well: we let the security folks put the policies in place, and then developers can work well within them. Understanding how to prioritize vulnerabilities is an important part of that, and we quantify it. We put an urgency score on it that says, hey, you should fix this vulnerability first. Why? Because, first of all, you can upgrade really quickly; the fix is right there. Secondly, there's an exploit in the wild, which means an attacker can potentially weaponize this vulnerability and attack your organization in an automated fashion. So you definitely want to put a lid on that broken window, so to speak. We add other kinds of metrics that we can quantify into this urgency score, which we call a priority score, and that helps developers really know what to fix first, because they could get a scan with hundreds of vulnerabilities, but what do they start with? So I find that very useful for both the security and developer teams working together. >> Right, and especially now, as we've seen such changes in the last couple of years to the threat landscape, the vulnerabilities, the security issues that are impacting every industry. The ability to empower developers to not only work at the speed with which they are accustomed and need to work, but also to find those vulnerabilities faster and prioritize which ones need to be fixed.
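A priority score of the kind described, combining severity, the availability of a fix, and whether an exploit is known in the wild, might be sketched like this. The weights and the 0-1000 range here are illustrative assumptions, not Snyk's actual formula.

```python
def priority_score(cvss, fix_available, exploit_maturity):
    """Combine severity, fixability, and exploitability into one number.

    cvss: base severity, 0.0-10.0
    fix_available: True if an upgrade or patch already exists
    exploit_maturity: "none", "proof-of-concept", or "mature"
    """
    score = cvss * 50                     # severity contributes up to 500
    if fix_available:
        score += 200                      # quick wins rank higher
    score += {"none": 0, "proof-of-concept": 150, "mature": 300}[exploit_maturity]
    return min(int(score), 1000)

# A fixable, actively exploited critical issue outranks an equally
# severe one with no fix and no known exploit.
assert priority_score(9.8, True, "mature") > priority_score(9.8, False, "none")
```

The point of the design is the ordering, not the absolute numbers: two findings with the same CVSS rating can land far apart once fixability and exploitability are folded in.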
I mean, I think of Log4Shell, for example, and the challenges going on with the supply chain. This is really a critical capability, from a developer empowerment perspective but also from an overall business health and growth perspective. >> Definitely. First of all, let's step back for a second in terms of what has changed, what the landscape is. We're seeing several things happening. First, there's this big, tremendous... I would call it a trend, but by now it's the default: the growth of open source software. Developers are using more and more open source, and that's a growing trend, there are graphs of this, and it's always increasing across every ecosystem: Go, Rust, .NET, Java, JavaScript, whatever you're building, it's probably on a growing trend of more open source. We can talk in a second about what the risks are there, but that is one trend we're seeing. The other one is cloud native applications, which is also worth diving into, because the way we build applications today has completely shifted. And I think what AWS is doing in that sense is also creating a tremendous shift in mindset. For example, infrastructure as code has basically democratized infrastructure. I do not need to own my servers, own my monitoring, and configure everything. I can actually write code, and when I deploy it, something parses it and runs it, and it creates the servers, the monitoring, the logging, different kinds of things for me. So it democratized the whole business of building applications compared to what it was decades ago. This is important and really, really fast, and it makes things scalable, but it also introduces some risks, for example in some of these configurations. So there's a lot that has changed.
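The configuration risk mentioned here is the kind of thing infrastructure-as-code scanners look for before anything is deployed. A toy checker might look like the following; the template shape and the two rules are invented for illustration and match no particular tool's format.

```python
# A toy infrastructure-as-code checker: scan a parsed template for
# risky settings before deployment. Each rule pairs a predicate with a
# human-readable finding message.

RISKY_RULES = [
    ("public_read",
     lambda r: r.get("acl") == "public-read",
     "storage bucket is world-readable"),
    ("open_ssh",
     lambda r: r.get("ingress_cidr") == "0.0.0.0/0" and r.get("port") == 22,
     "SSH open to the entire internet"),
]

def scan_template(resources):
    findings = []
    for name, resource in resources.items():
        for rule_id, predicate, message in RISKY_RULES:
            if predicate(resource):
                findings.append((name, rule_id, message))
    return findings

template = {
    "logs-bucket": {"acl": "public-read"},
    "bastion-sg": {"ingress_cidr": "0.0.0.0/0", "port": 22},
    "app-bucket": {"acl": "private"},
}
print(scan_template(template))  # flags logs-bucket and bastion-sg
```

Because the check runs on the template rather than the live account, the misconfiguration is caught at the same point in the workflow where a dependency vulnerability would be: before anything ships.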
And in that landscape of what the modern developer is, I think we need to lean in a little more, be helpful to developers, and help them avoid all those cases. I'm happy to dive further into the open source and cloud native sides as follow-ups to this. >> I want to get into a little bit more about your relationship with AWS. When I spoke with Peter McKay at re:Invent, he talked about the partnership being a couple of years old, but there are some really interesting things that AWS is doing in terms of leveraging Snyk. Talk to me about that. >> Indeed. Snyk integrates with a lot of services, probably almost all of those that are unique and relevant to developers building on top of the AWS platform. For example, if you are building your code, it connects to the source code editor. If you are pushing that code, it integrates with CodeCommit. As you build and CI is running, maybe CodeBuild is something you're using within CodePipeline; those are native integrations. At the end of the day you have your container registry, or Lambda if you're using functions as a service for your applications, and what we're doing is integrating with all of that. It depends where you're integrating, but at all of those points of integration you have Snyk there to help you out and make sure that if we find any potential issues, anything from licenses to vulnerabilities in your containers, your code, or your open source dependencies, you actually find them at that point and mitigate the issue.
So if you're using Snyk from your development machine, it accompanies you through that whole journey, all the way across what a CI/CD landscape looks like as a development architecture. And I think what you might want to put an emphasis on is the recent integration with Amazon Inspector, which is a very pivotal part of the AWS platform: it integrates a lot of services and provides you with insights on security. The idea that Inspector is now able to leverage vulnerability data from Snyk's security intelligence database is tremendous, and we can talk about that with Log4Shell and recent issues. >> Yeah, let's dig into that. We have a few minutes left, but that was obviously a huge issue in November of 2021. Obviously we're in a very dynamic global situation, but it's now not a matter of if an organization is going to be hit by vulnerabilities and security threats; it's a matter of when. Talk to me about how impactful Snyk was with the Log4Shell vulnerability and how you helped customers evade some serious threats that could have impacted revenue growth, customer satisfaction, and brand reputation. >> Definitely. Log4Shell is, well, I mean was, a vulnerability that was disclosed, but it's probably still a major issue for organizations, and going to be for the foreseeable future, as they need to deal with it. We'll dive into why in a second, but in summary, Log4Shell was a vulnerability found in a Java library called Log4j, a logging library that is so popular and widely used today.
And the thing is, the ability to react fast to newly disclosed vulnerabilities is a really vital part for organizations, because when one is as impactful as we've seen Log4Shell being, that is when you find out whether the security tool you're using is actually helping you or is just an added checkbox. And that is what I think makes Snyk so unique in this sense. We have a team of folks who are both manually curating the ecosystem of CVEs and finding issues ourselves, but there's also an entire intelligence platform behind us. We get a lot of notifications on chatter that happens, so when someone opens an issue on an open source repository and says, hey, I found an issue here, maybe that's an XSS or code injection or something like that, we find it really fast, before it even goes to CVE assignment through MITRE and the NVD, and we can add it to the database. This is something we did with Log4Shell, where we found it as it was disclosed, not just within the open source project but as it became generally known to everyone. But not only that, because Log4j, as a library, needed several iterations of fixes. They fixed one version, and that was the recommendation to upgrade to; then that version was itself found vulnerable, so they needed to fix it another time, and then another time, and so on. Being able to react fast to that is what I think helped a ton of Snyk customers and users. And what I really liked, and what has been received very well, is that we were very fast in creating command line tools that allow developers to find cases of the Log4j library embedded into (indistinct), not through a package manifest.
So sometimes you have those legacy applications deployed somewhere, probably not even legacy, just the Log4j library bundled into an application or Java source code base, so you may not even know that you're using it. What we've done is expose, in the Snyk CLI tool, a command line argument that allows you to search for all of those cases, so we can find them and help you try to mitigate those issues. That has been amazing. >> So you've talked in great length and detail, Liran, about how Snyk is really enabling and empowering developers. One last question for you: when I spoke with Peter last month at re:Invent, he talked about the goal of reaching 28 million developers. Your passion as a director of developer advocacy is palpable; I can feel it through the screen here. Talk to me about where you are on that journey of reaching those 28 million developers, and what personally excites you about what you're doing here. >> Oh, yeah. So many things. (laughs) I don't know where to start. We are constantly talking to developers on community days and things like that, so here are a couple of examples. We have this developer security community, which is a growing and kicking community of developers and security people coming together to work, understand, and just learn from each other. We have events coming up. We have "The Big Fix", a big security event that we're launching on February 25th, and the idea is that we want to help the ecosystem secure their applications, open source or even closed source; we help you fix them. We've launched the Snyk Ambassadors program, which is developers and security people, even CISOs, and the idea is: how can we help them also be helpful to the community?
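A simplified version of that kind of search, finding Log4j bundled inside archives rather than declared in any package manifest, can be sketched as below. Looking inside archives for `JndiLookup.class` is the widely published heuristic for CVE-2021-44228; the script is illustrative, not Snyk's actual implementation.

```python
import os
import zipfile

# JndiLookup.class is the component that made Log4Shell (CVE-2021-44228)
# exploitable; its presence inside an archive flags a bundled Log4j.
MARKER = "core/lookup/JndiLookup.class"

def find_bundled_log4j(root):
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".jar", ".war", ".ear", ".zip")):
                continue
            path = os.path.join(dirpath, name)
            try:
                with zipfile.ZipFile(path) as archive:
                    if any(entry.endswith(MARKER) for entry in archive.namelist()):
                        hits.append(path)
            except zipfile.BadZipFile:
                pass  # not a readable archive; skip it
    return hits
```

A real scanner would also recurse into archives nested inside archives (fat JARs inside WARs), but the manifest-free idea is the same: inspect what is actually on disk, not what the build files claim.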
Because they are known, and they are as passionate as we are about application security and helping developers code securely and build securely. So we're launching all of those programs. We also have social impact programs in the way we work with organizations, maybe non-profits that just need help getting the security part of things figured out, students, and things like that. There's a ton of those initiatives across the board, helping the world be a little bit more secure. >> Well, we could absolutely use Snyk's help in making the world more secure. Liran, it's been great talking to you. Like I said, your passion for what you do and what Snyk is able to facilitate and enable is palpable, and it was a great conversation. I appreciate that. We look forward to hearing what transpires during 2022 for Snyk, so you've got to come back. >> I will. Thank you. Thank you, Lisa. This has been fun. >> All right. Excellent. Liran Tal, I'm Lisa Martin. You're watching theCUBE's second season of the "AWS Startup Showcase". This has been episode one. Stay tuned for more great episodes, full of fantastic content. We'll see you soon. (upbeat music)

Published Date : Jan 17 2022


Liran Zvibel, WekaIO | CUBEConversations, June 2019


 

>> From our studios in the heart of Silicon Valley, Palo Alto, California, it is a Cube Conversation. >> Hi, and welcome to the Cube studios for a Cube Conversation, where we go in depth with thought leaders driving innovation across the tech industry. I'm your host, Peter Burris. What are we talking about today? One of the key indicators of success in a digital business is how fast you can translate your data into new value streams. That means sharing it better, accelerating the rate at which you're running those models, and making it dramatically easier to administer large volumes of data at scale with a lot of different uses. That's a significant challenge, and it's going to require a rethinking of how we manage and how we utilize many of those data assets. To have that conversation, we're here with Liran Zvibel, who is the CEO of WekaIO. Liran, welcome back to the Cube. >> Thank you very much for having me. >> So before we get to the big problem, give us an update. What's going on at WekaIO these days? >> So very recently we announced a new round of financing for the company, another $31.7 million. We've actually had a very unorthodox way of raising this round: instead of going to a traditional VC-led round, we went to our business partners and joined forces with them to build stronger value for our customers. We started with NVIDIA, which has seen a lot of success going with us to their customers, because we enable NVIDIA to deploy more GPUs, so their customers can either solve bigger problems or solve their problems faster. The second pillar of the data center is networking, so we've had Mellanox investing in the company, because they are the leader in fast networking. So between NVIDIA, Mellanox, and WekaIO, you have very strong pillars.
Around compute, network, and storage, performance is crucial, but it's not the only thing customers care about. Customers need extremely fast access to their data, but they're also accumulating, keeping, and storing tremendous amounts of it. So we've actually had the whole hard drive industry investing in us, with Seagate and Western Digital both investing in the company, and finally one of our very successful go-to-market partners, Hewlett Packard Enterprise, invested in us through their Pathfinder program. So we're seeing tremendous backing from the industry, supporting our vision of enabling next generation performance for applications and the ability to scale to any workload. >> Congratulations. And it's good money, but it's also smart money that has a lot of operational elements. Just to repeat it: it's Mellanox, NVIDIA, HPE, Seagate, and Western Digital. It's an interesting group, but it's a group that will absolutely sustain and further your drive to solve some of these key data-oriented problems. So let's talk about what some of those key data-oriented problems are. I set up front that one of the challenges for any business that generates a lot of its value out of digital assets is how fast, how easily, and with what kind of fidelity it can reuse, process, and move those data assets. How is the industry attending to that? How is that working in the industry today, and where do you think we're going? >> So that's spot on. Businesses today, through different kinds of workloads, need to access tremendous amounts of data extremely quickly, and how they compare to their cohort is actually based on how quickly and how well they can go through the data and process it. And that's what we're solving for our customers. We're now looking into several applications where speed and performance on the one hand have to go hand in hand with extreme scale.
So we see great success in machine learning, which is where NVIDIA is. We're going after life sciences, where the genomic models, the cryo-electron microscopy, the computational chemistry are all now accelerated, because for the pharma researchers to actually get to a conclusion, they have to sift through a lot of data. We are working extremely well in financial analytics, either for the banks, for the hedge funds, or for the quantitative trading companies, because we allow them to go through data much, much quicker. Actually, only last week I visited a customer where we were able to change the amount of time they take to go through one analytic cycle from almost two hours to four minutes. >> This was in financial analytics? >> Exactly. And I think last time I was here I was telling you about one of the autonomous driving companies using us, taking the time to run another epoch of their model from two weeks down to four hours. So we see a consistent one to two orders of magnitude speedup in wall clock time. We're not just showing that we're faster on a benchmark; we're showing customers that by leveraging our technology, they get results significantly faster. We're also successful in engineering, around chip design, software builds, fluid dynamics. We've announced Mellanox as an EDA customer on the chip design side, so they're not only a partner: they have brought our technology in house, and they're leveraging us for their next chips. And recently we've also discovered that we are a great help for running NoSQL databases in the cloud: running Spark or Cassandra over WekaIO is more than twice as fast as running over the standard elastic block services. >> All right, so let's talk about this, because you're solving problems that only recently have come within range of the technology, but we still see some struggling. The way I'd describe it is that storage for a long time was focused on persisting data, on transactions executed.
Make sure you persist them. Now it's moved to these life sciences, machine learning, genomics types of workloads we're talking about: how can I share data, how can I deploy and use data faster? But the history of the storage industry is still predicated on designs mainly focused on persistence; think about block storage and filers and whatnot. How is WekaIO advancing that technology space, reorganizing or rethinking storage for the kinds of performance and scale that some of these use cases require? >> This is actually a great question. When we started the company, we had a long legacy at IBM, and we now have people who joined from across the storage industry. We see what happens: the current storage portfolios of the large players are very big and very convoluted, and we decided when we were starting the company that we're solving that. Our aim is to solve all the little issues storage has had for the last four decades. Look at what customers use today. If they need the utmost performance, they go to direct attached; that's what Fusion-io or Violin Memory were, and today these are NVMe devices. The downside is that the data cannot be shared; it cannot even be backed up. If a server goes away, you're done. Then, if customers had to have some way of managing the data, they bought block SAN, deployed a volume to a server, and still ran a local file system over it. It wasn't as performant as the DAS, but at least you could back it up and manage it some. What has happened over the last 15 years is that customers realized Moore's law has ended, so up-scaling stopped working and people had to go out-scaling. And now it means they have to share data to solve their problems. >> More parallelism, more... >> More parallelism out of more servers.
More computers have to share data to actually be able to solve the problem. For a while customers were able to use a traditional filer like a NetApp for this, or a scale-out filer like an Isilon, or a traditional parallel file system like GPFS, Spectrum Scale, or Lustre, but these were significantly slower than SAN, block, or direct attached. Also, they could never scale metadata: you were limited in how many files you could put in a single directory, and you were limited by hot spots in that metadata. To solve that, some customers moved to an object storage. It was a lot harder to work with, performance was unimpressive, and you had to rewrite your application, but at least it could scale. What we're doing at WekaIO is reconfiguring the storage market. We're creating a storage solution that's actually not part of any of these four categories the industry has become used to. We are as fast as direct attached; actually, some people's minds blow when they hear it, we're faster than direct attached. We're as resilient and durable as SAN. We provide the semantics of shared file, so it's perfectly usable. And we're as scalable, for capacity and metadata, as an object storage. >>
So going forward, how does this how does the work I owe solution generally hit that hot spot And specifically, how are you going to apply these partnerships that you just put together on the investment toe actually come to market even faster and more successfully? >> All right, so these are actually two questions. True, the technology that we have eyes the only one that paralyzed Io in a perfect way and also meditate on the perfect way >> to strangers >> and sustains it parla Liz, um, buy load balancing. So for a CZ, we talked about the hot sport some customers have, or we also run natively in the cloud. You may get a noisy neighbor, so if you aren't employing constant load balancing alongside the extreme parallelism, you're going to be bound to a bottleneck, and we're the only solution that actually couples the ability to break each operation to a lot of small ones and make sure it distributed work to the re sources that are available. Doing that allows us to provide the tremendous performance at tremendous scale, so that answers the technology question >> without breaking or without without introducing unbelievable complexity in the administration. >> It's actually makes everything simpler because looking, for example, in the ER our town was driving example. Um, the reason they were able to break down from two weeks to four hours is that before us they had to copy data from their objects, George to a filer. But the father wasn't fast enough, so they also had to copy the data from the filer to a local file system. And these copies are what has added so much complexity into the workflow and made it so slow because when you copy, you don't compute >> and loss of fidelity along the way right? OK, so how is this money and these partnerships going to translate into accelerated ionization? 
>> So we are leveraging some of the funds for more engineering, coming up with more features, supporting more enterprise applications. We're going to leverage some of the funds for doing marketing, and we're actually spending on marketing programs with these five good partners: with NVIDIA, with Mellanox, with Seagate, with Western Digital, and with Hewlett Packard Enterprise. But we're also deploying joint sales motions. So we're now plugged into NVIDIA, and plugged into Mellanox, and plugged into Western Digital, and into Hewlett Packard Enterprise, so we can leverage their internal resources, now that they have realized, through their business units and their investment arms, that we make sense, that we can actually go and serve their customers more effectively and better. >> Well, WekaIO has introduced a road through the unique and new technology that makes perfect sense. But it is unique and it's relatively new, and sometimes enterprises might go, well, that's a little bit too immature for me, but if the problem that it solves is that valuable, we'll bite the bullet. But even more importantly, a partnership lineup like this has got to be ameliorating some of the concerns that you're hearing from the marketplace. >> Definitely. So when NVIDIA tells the customers, hey, we have tested it in our labs, or when Hewlett Packard Enterprise tells the customer, not only have we tested it in our lab, but the support is going to come out of Pointnext, these customers now have the ability to keep buying from their trusted partners, but get the intellectual property of a newer company with better intellectual property abilities. Another great benefit that comes to us: we are a 100% channel-led company. We are not doing direct sales, and working with these partners, we actually have their channel plans open to us, so we can go together and we can implement go-to-market strategies together with their partners that already know how to work with them.
And we're just enabling and answering the technical questions, talking about the roadmap, talking about how to deploy. But the whole ecosystem keeps running in the efficient way it already runs, so we don't have to go and reinvent the wheel on how we interact with these partners. Obviously, we also interact with them directly. >> You can focus on solving the problem. >> Exactly. >> Great. Alright, so once again, thanks for joining us for another CUBE Conversation. Liran Zvibel of WekaIO, it's been great talking to you again on theCUBE. >> Thank you very much. I always enjoy coming over here. >> I'm Peter Burris, until next time.

Published Date : Jun 5 2019



Liran Zvibel & Andy Watson, WekaIO | CUBE Conversation, December 2018


 

(cheery music) >> Hi I'm Peter Burris, and welcome to another CUBE Conversation from our studios in Palo Alto, California. Today we're going to be talking about some new advances in how data gets processed. Now it may not sound exciting, but when you hear about some of the performance capabilities, and how it liberates new classes of applications, this is important stuff, now to have that conversation we've got Weka.IO here with us, specifically Liran Zvibel, the CEO of Weka.IO, joined by Andy Watson, who's the CTO of Weka.IO. Liran, Andy, welcome to theCUBE. >> Thanks. >> Thank you very much for having us. >> So Liran, you've been here before, Andy, you're a newbie, so Liran, let's start with you. Give us the Weka.IO update, what's going on with the company? >> So 2018 has been a grand year for us, we've had great market adoption, so we've spent last year proving our technology, and this year we have accelerated our commercial successes, we've expanded to Europe, we've hired quite a lot of sales in the US, and we're seeing a lot of successes around machine learning, deep learning, and life sciences data processing. >> And you've hired a CTO. >> And we've hired the CTO, Andy Watson, which I am excited about. >> So Andy, what's your pedigree, what's your background? >> Well I've been around a while, got the scars on my back to show it, mostly in storage, dating back to even Auspex before NetApp, but probably best known for the years I spent at NetApp, I was there from '95 through 2007, kind of the glory years, I was the second CTO at NetApp, as a matter of fact, and that was a pretty exciting time.
We changed the way the world viewed shared storage, I think it's fair to say, at NetApp, and it feels the same here at Weka.IO, and that's one of the reasons I'm so excited to have joined this company, because it's the same kind of experience of having something that is so revolutionary that quite often, whether it's a customer, or an analyst like yourself, people are a little skeptical, they find it hard to believe that we can do the things that we do, and so it's gratifying when we have the data to back it up, and it's really a lot of fun to see how customers react when they actually have it in their environment, and it changes their workflow and their life experience. >> Well I will admit, I might be undermining my credibility here, but I will admit that back in the mid 90s I was a little bit skeptical about NetApp, but I'm considerably less skeptical about Weka.IO, just based on the conversations we've had, but let's turn to that, because there are classes of applications that are highly dependent on very large numbers of small files being able to be moved very, very rapidly, like machine learning, so you mentioned machine learning, Liran, talk a little bit about some of the market success that you're having, some of those applications' successes. >> Right, so machine learning actually works extremely well for us for two reasons.
For one big reason, machine learning is being performed by GPU servers, so a server with several GPU offload engines in it, and what we see with this kind of server, a single GPU server replaces ten or tens of CPU-based servers, and what we see is that you actually need the IO performance to be ten or tens of times what the CPU servers had, so we came up with a way of providing significantly higher, two orders of magnitude higher, IO to a single client on the one hand, and on the other hand, we have solved the data performance from the metadata perspective, so we can have directories with billions of files, we can have the whole file system with trillions of files, and when we look at the autonomous driving problem, for example, if you look at the high-end car makers, they have eight cameras around the cars, and these cameras capture small resolution, because you don't need a very high resolution to recognize a line, or a cat, or a pedestrian, but they capture at 60 frames per second, so in 30 minutes you get about the 100k files traditional filers could put in a directory, but if you'd like to have your cars running in the Bay Area, you'd like to have all the data from the Bay Area in a single directory, then you would need directories with billions of files, which you get with us, and what we have heard from some of our customers that have had great success with our platform is that not only do they get hundreds of gigabytes of small-file read performance per second, they tell us that they take their standard training epoch from about two weeks, before they switched to us, down to four hours.
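The directory-count arithmetic in that answer is easy to sanity-check. A quick sketch (the eight cameras, 60 fps, and 30-minute figures are quoted above; one file per frame and the fleet size are assumptions for illustration):

```python
# Back-of-envelope check of the file counts quoted above
# (assumed workload: 8 cameras per car, 60 frames/sec, one file per frame).
CAMERAS = 8
FPS = 60
MINUTES = 30

frames_per_camera = FPS * 60 * MINUTES          # 108,000: the "about 100k files"
frames_per_drive = frames_per_camera * CAMERAS  # 864,000 files per 30-minute drive

print(f"{frames_per_camera:,} files per camera, {frames_per_drive:,} per drive")

# Collecting a fleet's footage into one directory is what pushes the
# requirement from ~100k entries toward billions of entries per directory.
drives_for_a_billion = 1_000_000_000 // frames_per_drive
print(f"~{drives_for_a_billion:,} drives fill a billion-entry directory")
```

So a regional fleet logging only a thousand or so half-hour drives into a shared directory already exceeds the single-directory limits of traditional filers, which is the metadata-scaling point being made.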
>> Now let's explore that, because one of the key reasons there is the scalability of the number of files you can handle, so in other words, instead of having to run against a limit of the number of files that they can typically run through the system, saturate these GPUs based on some other storage or file technology, they now don't have to stop and set up the job again and run it over and over, they can run the whole job against the entire expansive set of files, and that's crucial to speeding up the delivery of the outcome, right? >> Definitely, so what these customers used to do before us, they would do local caching, because NFS was not fast enough for them, so they would copy the data locally, and then they would run it over the local file system, because that has been the pinnacle of performance in recent years. We are the only storage currently, and I think we'll actually be the first wave of storage solutions, where a shared platform built for NVMe is actually faster than a local file system, so we let them go through any file, they don't have to pick initially which files go to which server, and also we are even faster than the traditional caching solutions. >> And imagine having to collect the data and copy it to the local server, the application server, and do that again and again and again for a whole server farm, right? So it's bad enough to even do it once, to do it many times, and then to do it over and over and over again, it's a huge amount of work. >> And a lot of time? >> And a lot of time, and cumulatively that burden is going to slow you down, so that makes a big, big difference, and secondly, as Liran was explaining, if you put 100,000 files in a directory of other file systems, that is stressful.
You want to put more than 100,000 files in a directory of other file systems, that is a tragedy, and we routinely can handle millions of files in a directory, it doesn't matter to us at all, because just like we distribute the data, we also distribute the metadata, and that's completely counter to the way the other file systems are designed, because they were all designed in an era where their focus was on the physical geometry of hard disks, and we have been designed for flash storage. >> And the metadata associated with the distribution of that data typically was in one file, in one place, and that was the master serialization problem when you come right down to it. So we've got a lot of ML workloads, a very large number of files, definitely improved performance because of the parallelism through your file system, in, as I said, the ML world. Let's generalize this. What does this mean overall, you've kind of touched upon it, but what does it mean overall for the way that customers are going to think about storage architectures in the future, as they are combining ML and related types of workloads with more traditional types of things? What's the impact of this on storage? >> So if you look at how people architect their solutions around storage recently, you have four different kinds of storage systems. If you need the utmost performance, you go to DAS, Fusion-io had a run perfecting DAS, and then the whole industry realized it. >> Direct attached storage.
>> Direct attached storage, right, and then the industry realized, hey, it makes so much sense, they created a standard out of it, created NVMe, but then you're wasting a lot of capacity, and you cannot manage it, you cannot back it up, and then if you need some way to manage it, you would put your data over SAN. Actually, our previous company was XIV Storage, which IBM acquired, and the vast majority of our use cases were actually people buying block and then overlaying a local file system over it, because it gets you so much higher performance, but you cannot share the data. Now, if you put it on a filer, which is a NetApp, or an Isilon, or the other solutions, you can share the data, but your performance is limited, and your scalability is limited, as Andy just said, and if you had to scale through the roof- >> With a shared storage approach. >> With a shared storage approach you had to go and port your application to an object storage, which is an enormous feat of engineering, and tons of these projects actually failed. We actually bring a new kind of storage, which is a shared storage as scalable as an object storage, but faster than direct attached storage, so looking at the other traditional storage systems of the last 20 or 30 years, we actually have all the advantages people would come to expect from the different categories, but we don't have any of the downsides. >> Now give us some numbers, do you have any benchmarks that you can talk about that kind of show or verify or validate this kind of vision that you've got, that Weka's delivering on? >> Definitely, the IO500? >> Sure, sure, we recently actually published our IO500 performance results at the SC18 event in Dallas, and there are two different metrics- >> So fast you can go back in time? >> Yes, exactly, there are two different metrics, one metric is like an aggregate total amount of performance, it's a much longer list.
I think the one that's more interesting is the one where it's the 10-client version, which we like to focus on because we believe that the most important area for a customer to focus on is how much IO can you deliver to an individual application server? And so this part of the benchmark is most representative of that, and on that rating, we were able to come in second well, after you filter out the irrelevant results, which, that's a separate process. >> Typical of every benchmark. >> Yes exactly, of the relevant meaningful results, we came in second behind the world's largest and most expensive supercomputer at Oak Ridge, the SUMMIT system. So they have a 40 rack system, and we have a half, or maybe a little bit more than half, one rack system of industry standard hardware running our software. So compare that, the cost of our hardware footprint and so forth is much less than a million dollars. >> And what was the differential between the two? >> Five percent. >> Five percent? So okay, sound of jaw dropping. 40 rack system at Oak Ridge? Five percent more performance than you guys running on effectively a half rack of like a supermicro or something like that? >> Oh and it was the first time we ran the benchmark, we were just learning how to run it, so those guys are all experts, they had IBM in there at their elbow helping them with all their tuning and everything, this was literally the first time our engineers ran the benchmark. >> Is a large feature of that the fact that Oak Ridge had to get all that hardware to get the physical IO necessary to run serial jobs, and you guys can just do this parallel on a relatively standard IO subset, NVME subset? >> Because beyond that, you have to learn how to use all those resources, right? All the tuning, all the expertise, one of the things people say is you need a PhD to administer one of those systems, and they're not far off, because it's true that it takes a lot of expertise. Our systems are dirt simple. 
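Taken at face value, those numbers imply a striking per-rack difference; a quick back-of-envelope check (the 40 racks, roughly half a rack, and 5% gap all come from the exchange above, and normalizing Weka's score to 1.0 is an assumption for the arithmetic):

```python
# Performance density implied by the IO500 anecdote above: a ~40-rack
# system scoring about 5% higher than roughly half a rack of standard
# hardware (figures taken at face value from the conversation).
summit_racks = 40
weka_racks = 0.5

summit_score = 1.05   # 5% ahead, normalized to Weka's score of 1.0
weka_score = 1.00

density_ratio = (weka_score / weka_racks) / (summit_score / summit_racks)
print(f"~{density_ratio:.0f}x the per-rack IO delivery")  # ~76x
```

The headline gap is 5%, but the footprint gap is what makes the anecdote notable: roughly two orders of magnitude less hardware for nearly the same 10-client result.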
>> Well you got to move the parallelism somewhere, and either you create it yourself, like you do at Oak Ridge, or you do it using your guys' stuff, through a file system. >> Exactly, and what we are showing is that we have tremendously higher IO density, and what we're showing is that instead of using a local file system, most of which were created in the 90s, in the serial way of thinking, of optimizing over hard drives, if now you say, hey, NVMe devices, SSDs, are beasts at running 4k IOs, if you solve the networking problem, if the network is not the bottleneck anymore, and you just run all your IOs as a massively parallelized workload over 4k IOs, you actually get much higher performance than what was, up until we came, the pinnacle of performance, which is a local file system over a local device.
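The shift described above, from one serial stream to many small parallel operations, can be sketched in miniature: split a read into 4 KiB chunks and issue them concurrently, the way a parallel file system client fans requests out to many devices. This is a toy illustration of the idea, not Weka's implementation:

```python
# Toy sketch of parallelizing one large read into many concurrent 4 KiB
# operations (an illustration of the idea, not Weka's implementation).
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4096  # 4 KiB: the small-IO size flash devices excel at

def read_chunk(path, offset):
    # Each worker opens its own descriptor, so chunks are serviced
    # concurrently instead of one after another through a single cursor.
    with open(path, "rb") as f:
        f.seek(offset)
        return offset, f.read(CHUNK)

def parallel_read(path, workers=16):
    """Read the whole file as independent 4 KiB operations, then reassemble."""
    offsets = range(0, os.path.getsize(path), CHUNK)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = dict(pool.map(lambda off: read_chunk(path, off), offsets))
    return b"".join(parts[off] for off in sorted(parts))

# Round-trip check against ordinary serial IO.
data = os.urandom(1 << 20)  # 1 MiB = 256 chunks
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
assert parallel_read(tmp.name) == data
os.unlink(tmp.name)
print("parallel read reassembled", len(data), "bytes correctly")
```

On a single local disk the threads mostly queue behind one device; the point of the NVMe-era design being described is that when chunks land on many devices across a fast network, this fan-out actually multiplies delivered bandwidth.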
>> So I could see how you're making the IO bandwidth junkies at Oak Ridge, or would make them really happy, but the other thing that at least I find interesting about Weka.IO is as you just talked about is that, that you've come up with an approach that's specifically built for SSD, you've moved the parallelism into the file system, as opposed to having it be somewhere else, which is natural, because SSD is not built to persist data, it's built to deliver data, and that suggests as you said earlier, that we're looking at a new way of thinking about storage as a consequence of technologies like Weka, technologies like NVME. Now Andy, you came from NetApp, and I remember what NetApp did to the industry, when it started talking about the advantages of sharing storage. Are we looking at something similar happening here with SSD and NVME and Weka? >> Indeed, I think that's the whole point, it's one of the reasons I'm so excited about it. It's not only because we have this technology that opens up this opportunity, this potential being realized. I think the other thing is, there's a lot of features, there's a lot of meaningful software that needs to be written around this architectural capability, and the team that I joined, their background, coming from having created XIV before, and the almost amazing way they all think together and recognize the market, and the way they interact with customers allows the organization to address realistically customer requirements, so instead of just doing things that we want to do because it seems elegant, or because the technology sparkles in some interesting way, this company, and it remains me of NetApp in the early days, and it was a driver of NetApp's big success, this company is very customer-focused, very customer driven. So when customers tell us what they're trying to do, we want to know more. Tell us in detail how you're trying to get there. What are your requirements? 
Because if we understand better, then we can engineer what we're doing to meet you there, because we have the fundamental building blocks. Those are mostly done, now what we're trying to do is add the pieces that allow you to implement it into your workflow, into your data center, or into your strategy for leveraging the cloud. >> So Liran, when you're here in 2019, we're having a similar conversation with this customer focus, you've got a value proposition to the IO bandwidth junkies, you can give more, but what's next in your sights? Are you going to show how this for example, you can get higher performance with less hardware? >> So we are already showing how you can get higher performance with less hardware, and I think as we go forward, we're going to have more customers embracing us for more workloads, so what we see already, they get us in for either the high end of their life sciences or their machine learning, and then people working around these people realize hey, I could get some faster speed as well, and then we start expanding within these customers and we get to see more and more workloads where people like us and we can start telling stories about them. The other thing that we have natural to us, we run natively in the cloud, and we actually let you move your workload seamlessly between your on-premises and the cloud, and we are seeing tremendous interest about moving to the cloud today, but not a lot of organizations already do it. I think 19 and forward, we are going to see more and more enterprises considering seriously moving to the cloud, cause we have almost 100% of our customers PFCing, cloudbursting, but not a lot of them using them. I think as time passes, all of them that has seen it working, when they did the initial test, will start leveraging this, and getting the elasticity out of the cloud, because this is what you should get out of the cloud, so this is one way for expansion for us. 
We are going to spend more resources in Europe, where we have recently started building the team, and later in the year, also JAPAC. >> Gentlemen, thanks very much for coming on theCUBE and talking to us about some new advances in file systems that are leading to greater performance, less specialized hardware, and enabling new classes of applications. Liran Zvibel is the CEO of Weka.IO, Andy Watson is the CTO of Weka.IO, thanks for being on theCUBE. >> Thank you very much. >> Yeah, thanks a lot. >> And once again, I'm Peter Burris, and thanks very much for participating in this CUBE Conversation, until next time. (cheery music)

Published Date : Dec 14 2018



Liran Zvibel, WekaIO | CUBEConversation, April 2018


 

[Music] >> Hi, I'm Stu Miniman, and this is a CUBE Conversation in SiliconANGLE's Palo Alto office. Happy to welcome back to the program Liran Zvibel, who is the co-founder and CEO of WekaIO. Thanks so much for joining me. >> Thank you for having me over. >> Alright, so on our research side, you know, we've really been saying that data is at the center of everything. It's in the cloud, it's in the network, and of course in the storage industry data has always been there, but I think especially for customers it's been more front and center. You know, why is data becoming more important? It's not data growth and some of the other things that we've talked about for decades, but you know, how is it changing, what are you hearing from customers today? >> So I think the main difference is that organizations are starting to understand that the more data they have, the better service they're going to provide to their customers, and they will be an overall better company than their competitors. So about 10 years ago we started hearing about big data and other ways that, in a simpler form, just sieved through a lot of data and tried to get some sort of high-level meaning out of it. The last few years, people are actually employing deep learning and machine learning techniques on their vast amounts of data, and they're getting a much higher level of intelligence out of their huge capacities of data, and actually with deep learning, the more data you have, the better outputs you get. >> Before we go into the ML and the deep learning piece, let's just focus on data itself. There's some that say, you know, digital transformation, it's this buzzword. When I talk to users, absolutely they're going through transformations, you know, we're saying everybody's becoming a software company, but how does data specifically help them with that? You know, what is your viewpoint there, and what are you hearing from your customers? >> So if you look at it from the consumer perspective, so people now keep
track of their lives at a much higher resolution than before, and I'm not talking about the image resolution, I'm talking about the vast amount of data that they store. So if I look at how many pictures I have of myself as a kid, and how many pictures I have of my kids, you could fit all of my pictures into albums; I could probably fit a week's worth of my kids' time into albums. So people keep a lot more data as consumers, and then organizations keep a lot more data on their customers in order to provide better service and a better overall product. >> You know, as an industry we saw a real mixed bag when it came to big data, where I was saying, great, I have lots more volume of data, but that doesn't necessarily mean that I got more value out of it. So what are the trends that you're seeing? Why is it, with things like deep learning, machine learning, AI, you know, is it going to be different, or is this just kind of the next iteration of, well, we tried and maybe we didn't hit as well with big data, let's see if this does better? >> So I think that big data had its glory days, and now we're coming to the end of that crescendo, because people realized that what they got was a sort of aggregate of things that they couldn't make too much sense of. And then people really understood that to make better use of your data, you need to employ ways similar to how the brain works: so look at a lot of data, and then you have to make some sense out of that data, and once you've made some sense out of that data, we can now get computers to go through way more data and make a similar amount of sense out of it, and actually get much, much better results. So instead of finding anecdotes, which is the thing that you were able to do with big data, you're actually now able to generate intelligent systems. >> You know, one of the other things we saw is, it used to be, okay, I have this huge back catalogue, or I'm going to survey all the data I've
collected today. You know, it's much more, you know, real time's a word that's been thrown around for many years, whether you say live data, or, you know, if you're at sensors, where I need to have something where I can train models and react immediately. That kind of immediacy is much more important, you know, and I'm assuming that's something that you're seeing from customers too? >> Indeed. So what we say is that customers end up collecting vast amounts of data, and then they train their models on these kinds of data, and then they're pushing these intelligent models to the edges, and then you're going to have edges running inference, and that could be a street camera, it could be a camera in the store, or it could be your car. And then usually you run this inference at the endpoints using all the things you've trained the models on back then, and you will still keep the data, push it back, and then you still run inference at the data center, sort of doing QA, and now the edges also know to mark where they couldn't make sense of what they saw, so the data center systems know what should we look at first, how do we make our models smarter for the next iteration. Because these are closed-loop systems: you train them, you push to the edges, the edges tell you how well they think they understood, you train again, and things improve. We're now at the infancy of a lot of these loops, but I think the following, probably two to five, years will take us through a very, very fascinating revolution, where systems all around us will become way, way more intelligent. >> Yeah, and there's interesting architectural discussions going on. If you talk about this edge environment, if I'm an autonomous vehicle or an airplane, of course I need to react there, I can't go back to the cloud, but you know, what happens in the cloud versus what happens at the edge? Where does Weka fit into that whole discussion? >> So where we currently are running, we're running at
the data centers so at Weka we created the fastest file system that's perfect for AI and machine learning and training and we make sure that your GPU field servers that are very expensive never sit idle the second component of our system is tearing two very effective object storages that can run into exabytes so we have the system that makes sure you can have as many GPU servers churning all the time and getting the results getting the new models while having the ability to read any form of data that was collected in the several years really through hundreds of petabytes of data sets and now we have customers talking about exabytes of data sets representing a single application not throughout the organization just for that training application yeah so a I in ml you know Keita is that that the killer use case for your customers today so that's one killer application just because of the vast amount of data and the high-performance nature of the clients we actually show clients that runwa kayo finished training sessions ten times faster than how they would use traditional NFS based solutions but just based on the different way we handle data another very strong application for us is around Life Sciences and genomics where we show that we're the only storage that let these processes remain CPU bound so any other storage at some points becomes IO bound so you couldn't paralyzed paralyzed the processing anymore we actually doesn't matter how many servers you run as clients you double the amount of clients you either get the twice the result the same amount of time or you get the same result it's half the time and with genomics nowadays there are applications that are life-saving so hospitals run these things and they need results as fast as they can so faster storage means better healthcare yeah without getting too deep in it because you know the storage industry has lots of wonkiness and it's there's so many pieces there but you know I hear life scientists I think 
object storage I hear nvme I think block storage your file storage when it comes down to it you know why is that the right architecture you know for today and what advantages does that give you so we we are actually the only company that went through the hassles and the hurdles of utilizing nvme and nvme of the fabrics for a parallel file system all other solutions went the easier route and created the block and the reason we've created a file system is that this is what computers understand this is what the operating system understand when you go to university you learn computer science they teach you how to write programs they need a file system now if you want to run your program over to servers or ten servers what you need is a shirt file system up until we came gold standard was using NFS for sharing files across servers but NFS was actually created in the 80s when Ethernet run at 10 megabit so currently most of our customers run already 100 gigabytes which is four orders of magnitude faster so they're seeing that they cannot run a network protocol that was designed four orders of magnitude last speed with the current demanding workloads so this explains why we had to go and and pick a totally different way of pushing data to the to the clients with regarding to object storages object storages are great because they allow customers to aggregate hard drives into inexpensive large capacity solutions the problem with object storages is that the programming model is different than the standard file system that computers can understand in too thin two ways a when you write something you don't know when it's going to get actually stored it's called eventual consistency and it's very difficult for mortal programmers to actually write a system that is sound that is always correct when you're writing eventual consistency storage the second thing is that objects cannot change you cannot modify them you need to create them you get them or you can delete them they can 
have versions but this is also much different than how the average programmer is used to write its programs so we are actually tying between the highest performance and vme of the fabrics at the first year and these object storages that are extremely efficient but very difficult to work with at the back and tier two a single solution that is highest performance and best economics right there on I want to give you the last word give us a little bit of a long view you talked about where we've gone how parallel you know architecture helps now that we're at you know 100 Gig look out five years in the future what's gonna happen you know blockchain takes over the world cloud dominates everything but from an infrastructure application in you know storage world you know where does wek I think that the things look like so one one very strong trend that we are saying is around encryption so it doesn't matter what industry I think storing things in clear-text for many organizations just stops making sense and people will demand more and more of day of their data to be encrypted and tighter control around everything that's one very strong trend that we're seeing another very strong trend that we're seeing is enterprises would like to leverage the public cloud but in an efficient way so if you were to run economics moving all your application to the public cloud may end up being more expensive than running everything on Prem and I think a lot of organizations realized that the the trick is going to be each organisation will have to find a balance to what kind of services are run on Prem and these are going to be the services that are run around the clock and what services have the more of a bursty nature and then organization will learn how to leverage the public cloud for its elasticity because if you're just running on the cloud you're not leveraging the elasticity you're doing it wrong and we're actually helping a lot of our customers do it with our hybrid cloud ability to 
have local workloads and the cloud workloads and getting these whole workflows to actually run is a fascinating process they're on thank you so much for joining us great to hear the update not only on Weka but really where the industry is going dynamic times here in the industry data at the center of all cubes looking to cover it at all the locations including here and our lovely Palo Alto Studio I'm Stu minimun thanks so much for watching the cube thank you very much [Music] you
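As a back-of-the-envelope illustration of the hybrid-cloud economics Liran closes on, the sketch below compares running everything in the cloud against keeping an around-the-clock base load on prem and bursting only the spiky work to the cloud. Every hourly rate and server count here is a made-up number for illustration, not AWS or Weka pricing.

```python
# Toy model: steady base load runs on prem around the clock; a short burst
# rents cloud elasticity only for the hours it actually runs.
# All prices are hypothetical illustrative figures.

HOURS_PER_MONTH = 730

def monthly_cost(base_servers, burst_servers, burst_hours,
                 onprem_hourly=0.50, cloud_hourly=1.00):
    """Cost of a steady on-prem base load plus a short cloud burst."""
    base = base_servers * HOURS_PER_MONTH * onprem_hourly
    burst = burst_servers * burst_hours * cloud_hourly
    return base + burst

# All-cloud alternative: pay cloud rates around the clock for the base load too.
all_cloud = (10 * HOURS_PER_MONTH + 90 * 20) * 1.00
hybrid = monthly_cost(base_servers=10, burst_servers=90, burst_hours=20)

print(f"all-cloud: ${all_cloud:,.0f}/mo, hybrid: ${hybrid:,.0f}/mo")
# → all-cloud: $9,100/mo, hybrid: $5,450/mo
```

The point of the toy model is the shape of the answer, not the numbers: the burst only pays cloud rates for 20 hours, so the hybrid bill undercuts the all-cloud bill even though the cloud's hourly rate is double.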

Published Date : Apr 6 2018


Liran Zvibel, WekaIO & Maor Ben Dayan, WekaIO | AWS re:Invent


 

>> Announcer: Live from Las Vegas, it's The Cube, covering AWS re:Invent 2017, presented by AWS, Intel, and our ecosystem of partners. >> And we're back, here on the show floor in the exhibit hall at Sands Expo, live at re:Invent for AWS, along with Justin Warren. I'm John Walls. We're joined by a couple of executives now from Weka IO; to my immediate right is Liran Zvibel, who is the co-founder and CEO, and then Maor Ben Dayan, who's the chief architect at Weka IO. Gentlemen, thanks for being with us. >> Thanks for having us. >> Appreciate you being here on theCube. First off, tell the viewers a little bit about your company, and I think a little about the unusual origination of the name. You were sharing that with me as well. So let's start with that, and then tell us a little bit more about what you do. >> Alright, so the name is Weka IO. Weka is actually a Greek unit, like mega and tera and peta, so it's actually a trillion exabytes, ten to the power of thirty. It's a huge capacity, so it works well for a storage company. Hopefully we will end up storing wekabytes. It will take some time. >> I think a little bit of time to get there. >> A little bit. >> We're working on it. >> One customer at a time. >> Give a little more about what you do, in terms of your relationship with AWS. >> Okay, so at Weka IO we create the highest-performance file system, either on prem or in the cloud. We have a parallel file system over NVMe; previous-generation file systems did parallel work over hard drives, but those are 20-year-old technologies. We're the first file system to bring new parallel algorithms to NVMe, so we get you the lowest latency and highest throughput, either on prem or in the cloud. We are perfect for machine learning and life sciences applications. Also, you've mentioned media and entertainment earlier.
We can run on your hardware on prem, we can run on our instances, I3 instances, in AWS, and we can also take snapshots at native performance, so they don't take away performance. We also have the ability to take these snapshots and push them to S3-based object storage. This allows you to have DR or backup functionality if you're on prem, but if your object storage is actually AWS S3, it also lets you do cloud bursting: you can take your on-prem cluster, connect it to AWS S3, take a snapshot, and push it to S3, and now, if you have a huge amount of computation to do and your local GPU servers don't have enough capacity, or you just want the results faster, you build a big enough cluster on AWS, get the results, and bring them back. >> You were explaining before that it's a big challenge to do something that delivers both low latency with millions and millions of small files and high throughput for large files. Media and entertainment tends to be very few but very, very large files, while something like genomics research has millions and millions of files that are all quite tiny. That's quite hard, but you were saying it's actually easier to do the high throughput than the low latency; maybe explain some of that. >> You want to take it? >> Sure. On the one hand, streaming lots of data is easy when you distribute the data over many servers or instances in AWS, like Lustre does, or other solutions; but then doing small files becomes really hard. Now, this is where Weka innovated and really solved this bottleneck, so it frees you to do whatever you want with the storage system without hitting any bottlenecks. This is the secret sauce of Weka. >> Right, and you were mentioning before, it's a file system, so it's NFS and SMB access to this data, but you're also saying that you can export to S3.
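The snapshot-push cloud-bursting flow described above can be mimicked with a toy in-memory model: snapshot the on-prem namespace, stage it in an object bucket, rehydrate it on a burst cluster, compute there, and bring the result home. The `Cluster` class and `cloud_burst` helper are invented stand-ins for illustration, not the real Weka or AWS APIs.

```python
# In-memory simulation of snapshot-based cloud bursting. Dicts stand in for
# the file system and the object bucket; nothing here is a real storage API.
import copy

class Cluster:
    def __init__(self):
        self.fs = {}  # path -> bytes, a stand-in for the file system

    def snapshot(self):
        return copy.deepcopy(self.fs)  # point-in-time, immutable copy

def cloud_burst(on_prem, bucket, job):
    bucket["snap-1"] = on_prem.snapshot()       # push snapshot to object store
    burst = Cluster()
    burst.fs = copy.deepcopy(bucket["snap-1"])  # rehydrate on cloud instances
    result = job(burst.fs)                      # heavy compute runs in the cloud
    on_prem.fs["results/out"] = result          # bring the results back on prem
    return result

on_prem = Cluster()
on_prem.fs["data/genome.txt"] = b"ACGTACGT"
total = cloud_burst(on_prem, bucket={},
                    job=lambda fs: sum(len(v) for v in fs.values()))
print(total)  # → 8
```

The burst cluster only ever sees the snapshot, so the on-prem file system keeps serving unchanged while the cloud copy does the heavy lifting, which matches the "get the results and bring them back" shape of the workflow.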
>> Actually, we have NFS, we have SMB, but we also have native POSIX, so any application that up until now you could only run on a local file system such as EXT4 or ZFS, you can actually run in a shared manner. Anything that's written in the man pages, we do, so it just works: locking, everything. That's one thing we're showing for life sciences genomics workflows: we can scale their workflows without losing any performance. If one server doing one kind of transformation takes time x, then with 10 servers you get 10x the results in the same time, and with 100 servers you get 100x the results. What customers see with other storage solutions, either on prem or in the cloud, is that they're adding servers but getting far fewer results. We're giving customers five to 20 times more results than they got on what they thought were high-performance file systems prior to the Weka IO solution. >> Can you give me a real-life example of this? When you talk about life sciences, you talk about genomic research, the itty bitty files and millions of samples, and whatever. Translate it for me: when it comes down to a real job, a real task, what exactly are you bringing to the table that enables whatever research or examination is being done? >> I'll give you a general example, not specifically from life sciences. We were doing a POC at a very large customer last week, and we were compared head to head with a best-of-breed all-flash file system. They did a simple test: they created a large file system on both storage solutions, filled with many, many millions of small files, maybe even billions of small files, and they wanted to go through all the files, so they just ran the find command. The leading competitor finished the work in six and a half hours. We finished the same work in just under two hours.
More than a 3x time difference compared to a solution that is currently considered probably the fastest. >> Gold standard, allegedly, right? Allegedly. >> It's a big difference. During the same comparison, that customer also did an ls of a directory with a million files; the other leading solution took 55 seconds, and it took just under 10 seconds for us. >> We just get you the results faster, meaning your compute remains occupied and working. If you're working with, let's say, GPU servers that are costly, usually they are just idling around, waiting for the data to come to them. We unstarve these GPU servers and let you get what you paid for. >> And particularly with something like the elasticity of AWS, if it takes me only two hours instead of six, that's going to save me a lot of money, because I don't have to pay for those extra hours. >> It does, and if you look at the price of the P3 instances, it's high for a reason: those Volta GPUs aren't inexpensive. Any second they're not idling around is a second you saved, and you're actually saving a lot of money. So we're showing customers that by deploying Weka IO on AWS and on premises, they're actually saving a lot of money. >> Explain some more about how you're able to bridge between on-premises and cloud workloads, because I think you mentioned before that you would actually snapshot, and then you could send the data as a cloud-bursting capability. Is that the primary use case you see customers using, or is it another way of getting your data from your side into the cloud? >> Actually, we have a slightly more complex feature; it's called tiering through the object storage.
Now customers have humongous namespaces, hundreds of petabytes some of them, and it doesn't make sense to keep it all on NVMe flash; it's too expensive. So a big feature that we have is letting you tier between your flash and object storage, which lets you manage the economics. We're actually chopping large files down into many objects, similarly to how a traditional file system treats hard drives. We treat NVMe devices in a parallel fashion, which is a world first, but we also apply all the tricks that a traditional parallel file system uses to get good performance out of hard drives to the object storage. Now we take that tiering functionality and couple it with our high-performance snapshotting abilities, so you can take a snapshot and push it completely into the object storage in a way that no longer requires the original cluster. >> So you've mentioned a few of the areas that are your expertise now and certainly where you're working. What are some other verticals that you're looking at? What are some other areas where you think you can bring what you're doing, maybe in the life sciences space, and provide equal if not superior value? >> Currently... >> Like, where are you going? >> Currently we focus on GPU-based execution, because that's where we save customers the most money; we give the biggest bang for the buck. Also genomics, because they have severe performance problems. Another area is builds: we've worked with a huge semiconductor company that was forced to build on a local file system, which took them 35 minutes. They tried alternatives; their fastest was actually a battery-backed, RAM-based shared file system using NFSv4, and it took them four hours. It was too long; you could only get so many compiles in a day. It doesn't make sense.
We showed them that they could actually compile in 38 minutes, showing that a shared file system that is fully coherent, consistent, and protected took only 10% more time. But effectively it didn't even take 10% more time, because what we enabled them to do is share the build cache, so the next build coming in took only 10 minutes. A full build took slightly longer, but if you take the average, their build was now 13 or 14 minutes, so we actually showed that a shared file system can save time. Other use cases are media and entertainment, for rendering: these use cases parallelize amazingly well. You can have tons of render nodes rendering your scenes, and the more render nodes you have, the quicker you come up with your videos and your movies, or the nicer they look. We enable our customers to scale their clusters to sizes they couldn't even imagine prior to us. >> It's impressive, really impressive, great work, and thanks for sharing it with us here on theCube. First time for each, right? You're now Cube alumni, congratulations. >> Okay, thanks for having us. >> Thank you for being with us here. Again, we're live here at re:Invent, and back with more live coverage here on theCube right after this time out.
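The tiering mechanism described earlier, chopping large files into many write-once objects and reassembling them on read, can be sketched in a few lines. The chunk size, the key naming, and the dict standing in for an object store are all arbitrary choices for illustration; the real system's layout is certainly more sophisticated.

```python
# Minimal sketch of file-to-object tiering: split a file into fixed-size
# chunks, store each as an immutable object (create/read/delete, never
# modify in place), and reassemble by reading the chunks back in order.

CHUNK = 4  # bytes per object; real systems use megabyte-scale chunks

def tier_out(name, data, object_store):
    """Split `data` into CHUNK-sized objects and record their keys."""
    keys = []
    for i in range(0, len(data), CHUNK):
        key = f"{name}/{i // CHUNK:08d}"
        object_store[key] = data[i:i + CHUNK]  # write-once object
        keys.append(key)
    return keys

def tier_in(keys, object_store):
    """Reassemble the file by concatenating its chunk objects in order."""
    return b"".join(object_store[k] for k in keys)

store = {}
keys = tier_out("scene.exr", b"ABCDEFGHIJ", store)
assert tier_in(keys, store) == b"ABCDEFGHIJ"
print(len(keys))  # → 3 objects: 4 + 4 + 2 bytes
```

Because each chunk is its own object, a "modification" means writing a new object and updating the key list, which sidesteps the object-store immutability constraint Liran described in the first interview while keeping reads parallel across chunks.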

Published Date : Dec 1 2017

