
Search Results for Weka IO:

theCUBE Previews Supercomputing 22


 

(inspirational music)

>> The history of high performance computing is unique and storied. It's generally accepted that the first true supercomputer was shipped in the mid 1960s by Control Data Corporation, CDC, designed by an engineering team led by Seymour Cray, the father of supercomputing. He left CDC in the '70s to start his own company, of course carrying his own name. Now that company, Cray, became the market leader in the '70s and '80s, and then the decade of the '80s saw attempts to bring new designs, such as massively parallel systems, to reach new heights of performance and efficiency. Supercomputing design was one of the most challenging fields, and a number of really brilliant engineers became quasi-famous in their little industry. In addition to Cray himself: Steve Chen, who worked for Cray and then went out to start his own companies; Danny Hillis of Thinking Machines; Steve Frank of Kendall Square Research; and Steve Wallach, who tried to build a mini supercomputer at Convex. These new entrants all failed, for the most part, because the market at the time just wasn't large enough and the economics of these systems weren't that attractive.

The late '80s and the '90s saw big Japanese companies like NEC and Fujitsu entering the fray, and governments around the world began to invest heavily in these systems to solve societal problems and make their nations more competitive. As we entered the 21st century, we saw the coming of petascale computing, with China cracking the top 100 of the high performance computing rankings. And today we're entering the exascale era, with systems that can complete a billion billion calculations per second, or 10 to the 18th power. Astounding.

Today the high performance computing market generates north of $30 billion annually and is growing in the high single digits. Supercomputers solve the world's hardest problems in simulation, life sciences, weather, energy exploration, aerospace, astronomy, automotive, and many other high-value areas. And supercomputers are expensive. The highest performing supercomputers used to cost tens of millions of dollars, maybe $30 million. We've seen that steadily rise to over $200 million, and today we're even seeing systems that cost more than half a billion dollars, even into the low billions when you include all the surrounding data center infrastructure and cooling required. The US, China, Japan, and EU countries, as well as the UK, are all investing heavily to keep their countries competitive, and no price seems to be too high.

Now, there are five megatrends going on in HPC today, in addition to the massive rising cost we just talked about. One, systems are becoming more distributed and less monolithic. Two, the power of these systems is increasing dramatically, both in terms of processor performance and energy consumption. x86 dominates processor shipments today and will probably continue to do so; Power has some presence, but Arm is growing very rapidly; Nvidia, with GPUs, is becoming a major player as AI comes in, and we'll talk about that in a minute; and both the EU and China are developing their own processors. We're seeing massive densities, with hundreds of thousands of cores being liquid-cooled with novel phase-change technology.

The third big trend is AI, which of course is still in its early stages, but it's being combined with ever larger and more massive data sets to attack new problems and accelerate research in dozens of industries. The fourth big trend: HPC in the cloud reached critical mass at the end of the last decade, and all of the major hyperscalers now provide HPC-as-a-service capabilities. Finally, quantum computing is often talked about and predicted to become more stable by the end of the decade and crack new dimensions in computing. The EU has even announced a hybrid QC effort, with the goal of having a stable system in the second half of this decade, most likely around 2027 or 2028.

Welcome to theCUBE's preview of SC22, the big supercomputing show, which takes place the week of November 13th in Dallas. theCUBE is going to be there. Dave Nicholson will be one of the co-hosts and joins me now to talk about trends in HPC and what to look for at the show. Dave, welcome, good to see you.

>> Hey, good to see you too, Dave.

>> You heard my narrative up front, Dave. You've got a technical background, CTO chops. What did I miss? What are the major trends that you're seeing?

>> I don't think you really... you didn't miss anything. I think it's just a question of double-clicking on some of the things that you brought up. If you look back historically, supercomputing was sort of relegated to things like weather prediction and nuclear weapons modeling, and these systems would live in places like Lawrence Livermore Labs or Los Alamos. Today, that requirement for cutting-edge, leading-edge, highest-performing supercompute technology is bleeding into the enterprise, driven by AI and ML, artificial intelligence and machine learning. So when we think about the conversations we're going to have and the coverage we're going to do of the SC22 event, a lot of it is going to be looking under the covers, seeing what kinds of architectural things contribute to these capabilities moving forward, and asking a whole bunch of questions.

>> Yeah, so there's this theory that the world is moving beyond compute-centricity to connectivity-centricity. We've talked about that, you and I, in the past. Is that a factor in the HPC world? How is it impacting supercomputing design?

>> Well, if you're designing an island that is the tip of the spear, one that doesn't have to offer any level of interoperability or compatibility with anything else in the compute world, then connectivity is important simply from a speeds-and-feeds perspective: lowest-latency connectivity between nodes and things like that. But as we democratize supercomputing, to a degree, as it moves from solely the purview of academia into truly ubiquitous architecture leveraged by enterprises, you start asking the question, "Hey, wouldn't it be kind of cool if we could have this hooked up into our Ethernet networks?" And that's a whole interesting subject to explore, because with things like RDMA over Converged Ethernet, you now have the ability to make these supercomputing capabilities directly accessible to enterprise computing. So that level of detail, opening up the box and looking at the NICs or the storage cards inside, is actually critically important. And as an old-school hardware knuckle-dragger myself, I am super excited to see what the cutting edge holds right now.

>> Yeah, when you look at the SC22 website, they're covering all kinds of different areas: parallel clustered systems, AI, storage, servers, system software, application software, security. I mean, HPC is no longer this niche. It really touches virtually every industry, or most industries anyway, and is really driving new advancements in society and research, solving some of the world's hardest problems. So what are some of the topics that you want to cover at SC22?

>> Well, I touched on some of them. I really want to ask people questions about this idea of HPC moving from just academia into the enterprise, and the question of whether that means there are architectural concerns that might not be the same as the concerns someone in academia or in a lab environment would have. And by the way, just a little historical context, I can't help it. I just went through the upgrade from iPhone 12 to iPhone 14. This has got one terabyte of storage in it. One terabyte of storage! In 1997, I helped build a one terabyte NAS system that a government defense contractor purchased for almost $2 million. $2 million! This was, I don't even know, $9.99 a month extra on my cell phone bill. We had a team of seven people who were going to manage that one terabyte of storage. So, similarly, when we talk about where we are from a supercompute resource perspective, if you consider it historically, it's absolutely insane. I'm going to be asking people about what's going on today, of course, but also the near future. What can we expect? What is the sort of singularity that needs to occur for natural language processing across all of the world's languages to exist in a perfect way? Do we have the compute power now? What's the interface between software and hardware? But really, this is going to be an opportunity that is a little bit unique in terms of the things we typically cover, because this is a lot about cracking open the box, the server box, looking at what's inside, and carefully considering all of the components.

>> You know, Dave, I'm looking at the exhibitor floor. It's like everybody is here: NASA, Microsoft, IBM, Dell, Intel, HPE, AWS, all the hyperscale guys, Weka IO, Pure Storage, companies I've never heard of. It's just hundreds and hundreds of exhibitors: Nvidia, Oracle, Penguin Solutions, on and on and on. Google, of course, has a presence there, and theCUBE has a major presence; we've got a 20x20 booth. So, as I say, to your point, HPC is going mainstream. A lot of times we think of HPC and supercomputing as off in some eclectic, far-off corner, but when you think about big data, when you think about AI, a lot of the advancements that occur in HPC will trickle through and go mainstream in commercial environments. And I suspect that's why there are so many companies here that are really relevant to the commercial market as well.

>> Yeah, this is like the Formula 1 of computing. If you're a motorsports nerd, you know that F1 is the pinnacle of the sport. SC22 is where everybody wants to be. Another little historical reference that comes to mind: there was a time, I think in the early 2000s, when Unisys partnered with Intel and Microsoft to come up with, I think it was the ES7000, which was supposed to be the sort of Intel mainframe. It was an early attempt to use, and I don't say this in a derogatory way, commodity resources to create something really, really powerful. Here we are 20 years later, and we are absolutely smack in the middle of that. You mentioned the focus on x86 architecture, but all of the other components that the silicon manufacturers bring to bear, companies like Broadcom, Nvidia, et al., are contributing to this mix, in addition to, of course, the microprocessor folks like AMD, Intel, and others. So yeah, this is a big-time nerd fest. Lots of academics will still be there. Supercomputing.org, the loose affiliation that's been running these SC events for years, has major hooks into academia, and they're bringing legit computer scientists to this event. This is all cutting-edge stuff.

>> Yeah. So like you said, it's going to be a lot of techies there, a very technical computing audience, of course. At the same time, we expect there's going to be a fair amount, as they say, of crossover. And so I'm excited to see what the coverage looks like: yourself, John Furrier, Savannah, and I think even Paul Gillin is going to attend the show, because I believe we're going to be there three days. So we're doing a lot of editorial. Dell is an anchor sponsor, so we really appreciate them providing funding so we can have this community event and bring people on. So, if you are interested...

>> Dave, Dave, just something on that point. I think it's indicative of where this world is moving when you have Dell so directly involved in something like this. It's an indication that this is moving out of just the realm of academia and in the direction of the enterprise, because, as we know, they tend to ruthlessly drive down the cost of things. And so I think that's an interesting indication right there.

>> Yeah, as do the cloud guys. So again, this is mainstream. So if you're interested, if you've got something interesting to talk about, if you have market research, you're an analyst, you're an influencer in this community, you've got technical chops, maybe you've got an interesting startup, you can contact David at david.nicholson@siliconangle.com, John Furrier at john@siliconangle.com, or me at david.vellante@siliconangle.com. I'd be happy to listen to your pitch and see if we can fit you onto the program. So, really excited. It's the week of November 13th. I think November 13th is a Sunday, so I believe David will be broadcasting Tuesday, Wednesday, and Thursday. Really excited. I'll give you the last word here, Dave.

>> No, I'm not embarrassed to admit that I'm really, really excited about this. It's cutting-edge stuff, and I'm really going to be exploring this question of where it fits in the world of AI and ML. I think that's really going to be the center of what I'm seeking to understand when I'm there.

>> All right, Dave Nicholson, thanks for your time. theCUBE at SC22. Don't miss it. Go to thecube.net and to siliconangle.com for all the news. This is Dave Vellante for theCUBE and for Dave Nicholson. Thanks for watching, and we'll see you in Dallas. (inquisitive music)

Published Date : Oct 25 2022


Liran Zvibel, WekaIO | CUBEConversation, April 2018


 

[Music]

>> Hi, I'm Stu Miniman, and this is a CUBE Conversation from SiliconANGLE's Palo Alto office. Happy to welcome back to the program Liran Zvibel, who is the co-founder and CEO of Weka IO. Thanks so much for joining me.

>> Thank you for having me over.

>> All right, so on our research side, we've really been saying that data is at the center of everything: it's in the cloud, it's in the network, and of course in the storage industry data has always been there, but I think especially for customers it's become more front and center. Why is data becoming more important? It's not just data growth and some of the other things we've talked about for decades. How is it changing? What are you hearing from customers today?

>> So I think the main difference is that organizations are starting to understand that the more data they have, the better service they're going to provide to their customers, and they will be an overall better company than their competitors. About 10 years ago we started hearing about big data and other approaches that, in a simpler form, just sifted through a lot of data and tried to get some sort of high-level meaning out of it. In the last few years, people have actually been applying deep learning and machine learning techniques to their vast amounts of data, and they're getting a much higher level of intelligence out of their huge capacities of data. And with deep learning, the more data you have, the better outputs you get.

>> Before we go into the ML and deep learning piece, let's focus on data itself. Some say digital transformation is just a buzzword, but when I talk to users, they absolutely are going through transformations. We're saying everybody's becoming a software company, but how does data specifically help them with that? What is your viewpoint, and what are you hearing from your customers?

>> So if you look at it from the consumer perspective, people now keep a record of their lives at much higher resolution than before, and I'm not talking about image resolution; I'm talking about the vast amount of data they store. If I look at how many pictures I have of myself as a kid versus how many pictures I have of my kids: you could fit all of my pictures into albums, while a week's worth of my kids' pictures would probably fill albums. So people keep a lot more data as consumers, and organizations keep a lot more data about their customers in order to provide better service and a better overall product.

>> As an industry, we saw a real mixed bag when it came to big data. I was saying, great, I have lots more volume of data, but that doesn't necessarily mean I got more value out of it. So what are the trends you're seeing? Why is deep learning, machine learning, AI going to be different? Or is this just the next iteration: we tried and maybe didn't hit as well with big data, let's see if this does better?

>> So I think big data had its glory days, and now we're coming to the end of that crescendo, because people realized that what they got was a sort of aggregate of things they couldn't make too much sense of. People now understand that to make better use of your data, you need to work similarly to how the brain works: you look at a lot of data, and then you have to make some sense out of that data. And once you've made some sense out of that data,
you can now get computers to go through way more data, make a similar amount of sense out of it, and actually get much, much better results. So instead of finding anecdotes, the kind of thing you were able to do with big data, you're now able to build genuinely intelligent systems.

>> One of the other things we saw: it used to be, okay, I have this huge back catalog, or I'm going to survey all the data I've collected today. Now it's much more real time, a word that's been thrown around for many years. Whether you say live data or, if you're at the sensors, I need to be able to train models and react immediately, that kind of immediacy is much more important. I'm assuming that's something you're seeing from customers too?

>> Indeed. What we see is that customers end up collecting vast amounts of data, then they train their models on that data, and then they push these intelligent models out to the edges, and the edges run inference. That could be a street camera, it could be a camera in a store, or it could be your car. Usually you run inference at the endpoints using the models you trained, and you still keep the data and push it back, and you still run inference at the data center, sort of doing QA. And the edges also know to mark where they couldn't make sense of what they saw, so the data center systems know what to look at first and how to make the models smarter for the next iteration. These are closed-loop systems: you train, you push to the edges, the edges tell you how well they think they understood, you train again, and things improve. We're now at the infancy of a lot of these loops, but I think the following two to five years will take us through a very, very fascinating revolution, where systems all around us will become way, way more intelligent.
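To make the closed loop Zvibel describes concrete, here is a toy sketch in Python. Everything in it is invented for illustration: the nearest-centroid model, the confidence margin, the synthetic data, and the faked labeling step. It only shows the pattern: edges flag low-confidence samples so the data center knows what to retrain on next.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(samples, labels):
    """'Data center' step: fit a trivial nearest-centroid model."""
    return {c: samples[labels == c].mean(axis=0) for c in np.unique(labels)}

def edge_infer(model, sample, margin=0.2):
    """'Edge' step: classify by nearest centroid and flag low-confidence calls."""
    dists = {c: np.linalg.norm(sample - centroid) for c, centroid in model.items()}
    ranked = sorted(dists, key=dists.get)
    confident = (dists[ranked[1]] - dists[ranked[0]]) > margin
    return ranked[0], confident

# Initial labeled set: two synthetic 2-D blobs standing in for real data.
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

for round_no in range(3):
    model = train(X, y)                        # retrain at the data center
    stream = rng.normal(1.5, 1.5, (200, 2))    # new samples arriving at the edge
    hard = np.array([s for s in stream if not edge_infer(model, s)[1]])
    print(f"round {round_no}: {len(hard)} samples flagged for review")
    if len(hard):
        # In a real loop these would be labeled by people or by a larger
        # data-center model; here the labels are faked to close the loop.
        new_labels = (hard.sum(axis=1) > 3).astype(int)
        X, y = np.vstack([X, hard]), np.concatenate([y, new_labels])
```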
>> Yeah, and there are interesting architectural discussions going on. If you talk about this edge environment, if I'm in an autonomous vehicle or an airplane, of course I need to react right there; I can't go back to the cloud. So what happens in the cloud versus what happens at the edge, and where does Weka fit into that whole discussion?

>> So we currently run at the data centers. At Weka we created the fastest file system, one that's perfect for AI and machine learning training, and we make sure that your GPU-filled servers, which are very expensive, never sit idle. The second component of our system is tiering to very effective object storages that can run into exabytes. So we have a system that makes sure you can have as many GPU servers as you like churning all the time, getting the results and the new models, while having the ability to read any form of data collected over several years, really through hundreds of petabytes of data sets. And now we have customers talking about exabytes of data sets representing a single application, not the whole organization, just that one training application.

>> So is AI and ML the killer use case for your customers today?

>> That's one killer application, just because of the vast amount of data and the high-performance nature of the clients. We actually show clients that run Weka IO finishing training sessions ten times faster than they would with traditional NFS-based solutions, just based on the different way we handle data. Another very strong application for us is around life sciences and genomics, where we show that we're the only storage that lets these processes remain CPU-bound; any other storage at some point becomes IO-bound, so you can't parallelize the processing anymore. With us it doesn't matter how many servers you run as clients: double the number of clients and you either get twice the result in the same amount of time, or the same result in half the time. And with genomics nowadays there are applications that are life-saving, so hospitals run these things and need results as fast as they can get them. Faster storage means better healthcare.

>> Without getting too deep into it, because the storage industry has lots of wonkiness and there are so many pieces: I hear life sciences, I think object storage; I hear NVMe, I think block storage; you're file storage. When it comes down to it, why is that the right architecture for today, and what advantages does it give you?

>> So we are actually the only company that went through the hassles and hurdles of utilizing NVMe and NVMe over Fabrics for a parallel file system; all other solutions went the easier route and created block storage. And the reason we created a file system is that this is what computers understand, this is what the operating system understands. When you go to university and learn computer science, they teach you how to write programs that need a file system. Now, if you want to run your program over two servers or ten servers, what you need is a shared file system. Up until we came along, the gold standard was using NFS for sharing files across servers, but NFS was actually created in the '80s, when Ethernet ran at 10 megabit. Currently most of our customers already run 100 gigabit, which is four orders of magnitude faster, so they're finding that they can't run a network protocol designed for four orders of magnitude less speed against today's demanding workloads. That explains why we had to pick a totally different way of pushing data to the clients. As for object storages, they're great because they allow customers to aggregate hard drives into inexpensive, large-capacity solutions. The problem with object storages is that the programming model is different from the standard file system that computers understand, in two ways. One, when you write something, you don't know when it's actually going to get stored; this is called eventual consistency, and it's very difficult for mortal programmers to write a system that is sound, that is always correct, on top of eventually consistent storage. The second thing is that objects cannot change: you cannot modify them. You can create them, get them, or delete them; they can have versions, but this too is much different from how the average programmer is used to writing programs. So we are actually tying together the highest-performance NVMe over Fabrics at the first tier and these object storages, which are extremely efficient but very difficult to work with, at the back-end tier, into a single solution with the highest performance and the best economics.
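To illustrate the programming-model gap Zvibel is pointing at, here is a small sketch using boto3 against an S3-style object store. The get_object and put_object calls are standard AWS SDK calls, but the bucket, key, and local path are hypothetical names invented for the example. One caveat: Amazon S3 itself has offered strong read-after-write consistency since late 2020, so the eventual-consistency point applies to object stores generally as of this 2018 conversation.

```python
import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "example-tier-bucket", "dataset/sample.bin"  # hypothetical names

# Create a local scratch file so the POSIX half of the sketch is runnable.
with open("/tmp/sample.bin", "wb") as f:
    f.write(b"\x00" * 8192)

# A POSIX file supports in-place, byte-granular modification:
with open("/tmp/sample.bin", "r+b") as f:
    f.seek(4096)       # jump into the middle of the file
    f.write(b"\x01")   # rewrite one byte; the rest of the file is untouched

# An object does not. To "change" one byte you fetch and rewrite the whole object:
body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
patched = body[:4096] + b"\x01" + body[4097:]
s3.put_object(Bucket=BUCKET, Key=KEY, Body=patched)

# Under eventual consistency, a reader racing that put_object call may still
# see the old bytes for a while, which is why a file system layered on object
# storage has to manage consistency itself.
```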
>> Right. Liran, I want to give you the last word; give us a bit of a long view. You've talked about where we've gone and how parallel architecture helps now that we're at 100 gig. Look out five years into the future. What's going to happen? Blockchain takes over the world, cloud dominates everything? From an infrastructure, application, and storage perspective, what do things look like?

>> So one very strong trend that we are seeing is around encryption. It doesn't matter what industry: I think storing things in clear text, for many organizations, just stops making sense, and people will demand more and more of their data to be encrypted, with tighter control around everything. That's one very strong trend we're seeing. Another very strong trend is that enterprises would like to leverage the public cloud, but in an efficient way. If you run the economics, moving all your applications to the public cloud may end up being more expensive than running everything on prem, and I think a lot of organizations have realized that. The trick is going to be that each organization has to find a balance: which services run on prem, and these are going to be the services that run around the clock, and which services have more of a bursty nature. Then organizations will learn how to leverage the public cloud for its elasticity, because if you're just running steadily in the cloud, you're not leveraging the elasticity; you're doing it wrong. And we're actually helping a lot of our customers do this, with our hybrid cloud ability to have local workloads and cloud workloads, and getting these whole workflows to actually run is a fascinating process.

>> Liran, thank you so much for joining us. Great to hear the update, not only on Weka but really on where the industry is going. Dynamic times here in the industry, with data at the center of it all, and theCUBE is looking to cover it at all the locations, including here in our lovely Palo Alto studio. I'm Stu Miniman. Thanks so much for watching theCUBE.

>> Thank you very much. [Music]

Published Date : Apr 6 2018


Liran Zvibel, WekaIO & Maor Ben Dayan, WekaIO | AWS re:Invent


 

>> Announcer: Live from Las Vegas, it's theCUBE, covering AWS re:Invent 2017, presented by AWS, Intel, and our ecosystem of partners.

>> And we're back, here on the show floor in the exhibit hall at Sands Expo, live at re:Invent for AWS, along with Justin Warren. I'm John Walls. We're joined by a couple of executives from Weka IO: to my immediate right is Liran Zvibel, who is the co-founder and CEO, and then Maor Ben Dayan, who's the chief architect at Weka IO. Gentlemen, thanks for being with us.

>> Thanks for having us.

>> Appreciate you being here on theCUBE. First off, tell the viewers a little bit about your company, and I think a little about the unusual origin of the name. You were sharing that with me as well. So let's start with that, and then tell us a little more about what you do.

>> All right, so the name is Weka IO. Weka is actually a Greek unit, like mega and tera and peta: it's a trillion exabytes, ten to the power of thirty. It's a huge capacity, so it works well for a storage company. Hopefully we will end up storing wekabytes. It will take some time.

>> I think a little bit of time to get there.

>> A little bit.

>> We're working on it.

>> One customer at a time. Give us a little more about what you do, in terms of your relationship with AWS.

>> Okay, so at Weka IO we created the highest-performance file system, either on prem or in the cloud. We have a parallel file system over NVMe. Previous-generation file systems did their parallel work over hard drives, but those are 20-year-old technologies. We're the first file system to bring new parallel algorithms to NVMe, so we get you the lowest latency and highest throughput, either on prem or in the cloud. We are perfect for machine learning and life sciences applications, and also media and entertainment, which you mentioned earlier. We can run on your hardware on prem, we can run on i3 instances in AWS, and we can also take snapshots at native performance, so they don't take away performance. We also have the ability to take these snapshots and push them to S3-based object storage. That gives you DR or backup functionality if you're on prem, but if your object storage is actually AWS S3, it also lets you do cloud bursting: you can take your on-prem cluster, connect it to AWS S3, take a snapshot, push it to S3, and now, if you have a huge amount of computation to do and your local GPU servers don't have enough capacity, or you just want the results faster, you can build a big enough cluster on AWS, get the results, and bring them back.

>> You were explaining before that it's a big challenge to build something that can do both low latency, with millions and millions of small files, and high throughput for large files. Media and entertainment tends to be very few but very, very large files, while with something like genomics research you'll have millions and millions of files that are all quite tiny. That's quite hard, but you were saying it's actually easier to do the high throughput than the low latency; maybe explain some of that.

>> You want to take it?

>> Sure. On the one hand, streaming lots of data is easy when you distribute the data over many servers or instances in AWS, like Lustre does, or other solutions, but then doing small files becomes really hard. Now, this is where Weka innovated and really solved the bottleneck, so it really frees you to do whatever you want with the storage system without hitting any bottlenecks.
This is the secret sauce of Weka.

>> Right, and you were mentioning before, it's a file system, so there's NFS and SMB access to this data, but you're also saying you can export to S3.

>> Actually, we have NFS, we have SMB, but we also have native POSIX, so any application that until now you could only run on a local file system, such as ext4 or ZFS, you can actually run in a shared manner. Anything that's written in the man pages, we do, so everything just works: locking, everything. That's one thing we're showing for life sciences and genomic workflows: we can scale their workflows without losing any performance. If one server doing one kind of transformation takes time x, with 10 servers you get 10x the results in that same time x, and with 100 servers you get 100x the results. What customers see with other storage solutions, either on prem or in the cloud, is that they're adding servers but getting far fewer results back. We're giving customers five to 20 times more results than what they got on what they thought were high-performance file systems prior to the Weka IO solution.
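As a back-of-the-envelope way to read that scaling claim (the formula below is an editorial gloss, not anything from the interview): while the storage keeps a job CPU-bound, the total work W splits evenly across n client servers, each processing at rate r, so wall-clock time falls linearly with n. Once the storage tops out at some aggregate bandwidth B over a dataset of size D, adding clients stops helping; that is the IO-bound regime being contrasted here.

```latex
T(n) = \max\!\left(\frac{W}{n\,r},\ \frac{D}{B}\right),
\qquad \text{CPU-bound while } \frac{W}{n\,r} \ge \frac{D}{B}
```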
>> Can you give me a real-life example of this? When you talk about life sciences, you talk about genomic research, the itty-bitty files and millions of samples and whatnot, but translate it for me. When it comes down to a real job, a real task, what exactly are you bringing to the table that enables whatever research or examination is being done?

>> I'll give you a general example, not specifically from life sciences. We were doing a POC at a very large customer last week, and we were compared head to head with a best-of-breed all-flash file system. They did a simple test: they created a large file system on both storage solutions, filled with many, many millions of small files, maybe even billions of small files, and they wanted to go through all the files, so they just ran the find command. The leading competitor finished the work in six and a half hours; we finished the same work in just under two hours. More than a 3x difference compared to a solution that is currently considered probably the fastest.

>> The gold standard, allegedly, right? Allegedly.

>> It's a big difference. During the same comparison, that customer also did an ls of a directory with a million files. The other leading solution took 55 seconds, and it took just under 10 seconds for us.

>> We just get you the results faster, meaning your compute remains occupied and working. If you're working with, let's say, GPU servers, which are costly, usually they're just idling around, waiting for the data to come to them. We unstarve these GPU servers and let you get what you paid for.

>> And particularly with the elasticity of AWS, if it takes me only two hours instead of six, that's going to save me a lot of money, because I don't have to pay for those extra hours.

>> It does, and if you look at the price of the P3 instances, for good reason, those Volta GPUs aren't inexpensive. Any second they're not idling is a second you've saved, and you're actually saving a lot of money. So we're showing customers that by deploying Weka IO on AWS and on premises, they're actually saving a lot of money.

>> Explain some more about how you're able to bridge between on-premises and cloud workloads, because I think you mentioned before that you can snapshot and then send the data up as a cloud-bursting capability. Is that the primary use case you see customers using, or is it another way of getting your data from your site into the cloud?

>> Actually, we have a slightly more complex feature: it's called tiering through the object storage. Customers have humongous namespaces, hundreds of petabytes some of them, and it doesn't make sense to keep it all on NVMe flash; it's too expensive. So a big feature we have is letting you tier between your flash and object storage, which lets you manage the economics. We're actually chopping down large files into many objects, similarly to how a traditional file system treats hard drives. We treat NVMe in a parallel fashion, a world first, but we also do all the tricks that traditional parallel file systems do to get good performance out of hard drives, applied to the object storage. Now we take that tiering functionality and couple it with our high-performance snapshotting abilities, so you can take a snapshot and push it completely into the object storage in a way that no longer requires the original cluster.

>> So you've mentioned a few of the areas of your expertise and certainly where you're working. What are some other verticals you're looking at? Where else do you think you can bring the kind of value you're providing in the life sciences space?

>> Currently we focus on GPU-based execution, because that's where we save customers the most money; we give the biggest bang for the buck. Also genomics, because they have severe performance problems. Around builds, we've shown a huge semiconductor company that was trying to speed up its builds: they were forced to build on a local file system, which took 35 minutes. The fastest shared alternative they had tried was a battery-backed, RAM-based shared file system using NFS v4, and that took four hours. It was too long; you only got one compile a day, which doesn't make sense. We showed them they could actually compile in 38 minutes on a shared file system that is fully coherent, consistent, and protected: only about 10% more time than the local build. But it didn't really cost 10% more, because what we enabled them to do is share the build cache, so the next build coming in took only 10 minutes. A full build took slightly longer, but the average build came down to 13 or 14 minutes, so we actually showed that a shared file system can save time. Other use cases are media and entertainment, for rendering. These use cases parallelize amazingly well: you can have tons of render nodes rendering your scenes, and the more render nodes you have, the quicker your movies come out, or the nicer they look. We enable our customers to scale their clusters to sizes they couldn't even imagine prior to us.

>> It's impressive, really impressive. Great work, and thanks for sharing it with us here on theCUBE, first time for each, right? You're now CUBE alumni, congratulations.

>> Okay, thanks for having us.

>> Thank you for being with us here. Again, we're live here at re:Invent, and back with more live coverage here on theCUBE right after this time out.
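To make the "chopping large files into many objects" tiering idea from this interview concrete, here is a minimal illustrative sketch. It is not Weka's implementation: the chunk size, key-naming scheme, and bucket name are assumptions chosen for clarity, and a real tiering layer would add parallel transfers, metadata tracking, and prefetching.

```python
import os
import boto3

CHUNK_SIZE = 8 * 1024 * 1024      # 8 MiB per object; an arbitrary choice here
BUCKET = "example-cold-tier"      # hypothetical bucket name
s3 = boto3.client("s3")

def tier_file_to_objects(path):
    """Split one large file into fixed-size chunks stored as separate objects.

    Many moderate-size objects can be written and read back in parallel, which
    is how a tiering layer pulls streaming bandwidth out of an object store.
    """
    keys = []
    with open(path, "rb") as f:
        index = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            key = f"{os.path.basename(path)}/chunk-{index:08d}"
            s3.put_object(Bucket=BUCKET, Key=key, Body=chunk)
            keys.append(key)
            index += 1
    return keys

def restore_file_from_objects(keys, dest):
    """Reassemble the original file by reading the chunks back in order."""
    with open(dest, "wb") as out:
        for key in keys:
            out.write(s3.get_object(Bucket=BUCKET, Key=key)["Body"].read())
```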

Published Date : Dec 1 2017
