
Anthony Dina, Dell Technologies and Bob Crovella, NVIDIA | SuperComputing 22


 

>> Howdy, y'all, and welcome back to Supercomputing 2022. We're theCUBE, and we are live from Dallas, Texas. I'm joined by my co-host, David Nicholson. David, hello.

>> Hello.

>> We are gonna be talking about data and enterprise AI at scale during this segment, and we have the pleasure of being joined by both Dell and NVIDIA. Anthony and Bob, welcome to the show. How are you both doing?

>> Doing good.

>> Great. Great show so far.

>> Love that enthusiasm, especially in the afternoon on day two. What's in that cup? Is there something exciting in there that maybe we should all be sharing with you?

>> Let's just say it's still... yeah, water.

>> Yeah, I love that. So, 'cause we haven't talked about this at all during the show yet on theCUBE, I wanna make sure that everyone's on the same page when we're talking about unstructured versus structured data. It's in your title, Anthony — tell me, what's the difference?

>> Well, look, the world has been based in analytics around rows and columns: spreadsheets, data warehouses. We've made predictions around the forecast of sales, maintenance issues. But when we take computers and we give them eyes, ears, and fingers — cameras, microphones, and temperature and vibration sensors — we now translate that into more of a human experience. But that kind of data, the sensor data, that video camera, is unstructured or semi-structured. That's what that means.

>> We live in a world of unstructured data. Structure is something we add later, after the fact. But the world that we see and the world that we experience is unstructured data. And one of the promises of AI is to be able to take advantage of everything that's going on around us and augment it, improve it, solve problems based on it. And so if we're gonna do that job effectively, we can't just depend on structured data to get the problem done. We have to be able to incorporate everything that we can see, hear, taste, smell, touch, and use that as part of the problem solving.

>> We want the chaos — bring it.

>> Chaos has been a little bit of a theme of our show.

>> It has been, yeah. And chaos is in the eye of the beholder. You think about the reason for structuring data to a degree: we had limited processing horsepower back when everything was being structured, as a way to allow us to reason over it and gain insights. So it made sense to put things into rows and tables. I'm curious, diving right into where NVIDIA fits into this puzzle: how does NVIDIA accelerate or enhance our ability to glean insight from, or reason over, unstructured data in particular?

>> Yeah, great question. It's really all about AI, and NVIDIA is a leader in the AI space. We've been investing and focusing on AI since at least 2012, if not before — accelerated computing is how we do it at NVIDIA, and it's an important part of it. We believe that AI is gonna revolutionize nearly every aspect of computing — really, nearly every aspect of problem solving, even nearly every aspect of programming. And one of the reasons, for what we're talking about now, is the impact: being able to incorporate unstructured data into problem solving is really critical to being able to solve the next generation of problems. AI unlocks tools and methodologies that let us realistically do that.
It's not realistic to write procedural code that's gonna look at a picture and solve all the problems we need to solve if we're talking about a complex problem like autonomous driving. But with AI and its ability to naturally absorb unstructured data and make intelligent, reasoned decisions based on it, it's really a breakthrough. And that's what NVIDIA's been focusing on for at least a decade or more.

>> And how does NVIDIA fit into Dell's strategy?

>> Well, I mean, look, we've been partners for many, many years, delivering beautiful experiences on workstations and laptops. But as we see the transition away from taking something that was designed to make something pretty on screen to being useful in solving problems in life sciences, manufacturing, and other places, we work together to provide integrated solutions. So take, for example, the DGX A100 platform: brilliant design, revolutionary bus technologies. But the rocket ship can't go to Mars without the fuel, and so you need a tank that can scale in performance at the same rate as you throw GPUs at it. That's where the relationship really comes alive. We enable people to curate the data, organize it, and then feed those algorithms that get the answers Bob's been talking about.

>> So, as a gamer, I must say that was a little shot at making things pretty on a screen. Come on. That was a low blow.

>> That was a low blow.

>> Sassy.

>> Now what's in your cup? That's what I wanna know, Dave.

>> I apparently have the most boring cup of anyone on set today. I don't know what happened. We're gonna have to talk to the production team; I'm looking at all of you, we're gonna have to make that better. One of the themes that's been on this show — and I love that you all embrace the chaos — is that we're seeing a lot of the trend stuck in the experimentation stage. We're in an academic zone with AI: companies are excited to adopt, but most companies haven't really rolled out their strategy. What is necessary for us to move from this kind of science experiment, science fiction in our heads, to practical application at scale?

>> Well, let me take this, Bob. So I've noticed there's a pattern of three levels of maturity. The first level is just what you described: it's about having an experience, proof of value, getting stakeholders on board, and then just picking out what technology and what algorithm do I need, and what's my data source. That's all fun, but it is chaos. Over time, people start actually making decisions based on it. This moves us into production, and what's important there is normality, predictability, commonality — and hidden and embedded in that is a center of excellence: the community of data scientists and business intelligence professionals sharing a common platform. In the last stage, we get hungry to replicate those results to other use cases, throwing even more information at it to get better accuracy and precision — but doing this on a budget you can afford. And so, how do you figure out all the knobs and dials to turn in order to take billions of parameters and process that? That's where—

>> That casual decision matrix there with billions of parameters?

>> Yeah.
Oh, I mean—

>> But you're right that—

>> That's exactly it — we're on this continuum, and this is where I think the partnership does really well: marrying high-performant, enterprise-grade scalability that provides the consistency, the audit trail, all of the things you need to make sure you don't get in trouble, plus all of the horsepower to get to the results. Bob, what would you add there?

>> I think the thing that we've been talking about here is complexity. There's complexity in the AI problem-solving space; there's complexity everywhere you look. And we talked about the idea that NVIDIA can help with some of that complexity from the architecture and the software development side of it. And Dell helps with that in a whole range of ways, not the least of which is the infrastructure and the server design and everything that goes into unlocking the performance of the technology we have available to us today. So even the center of excellence is an example of: how do I take this incredibly complex problem and simplify it down so that the real world can absorb and use it? That's really what Dell and NVIDIA are partnering together to do, and that's really what the center of excellence is — an idea to help us say, let's take this extremely complex problem and extract some good value out of it.

>> So what is NVIDIA's superpower in this realm? I mean, look, we're in a season of microprocessor manufacturers one-upping one another with their latest announcements. There's been an ebb and a flow in our industry between doing everything via the CPU versus offloading processes. NVIDIA comes up and says, hey, hold on a second — the GPU — which, again, was focused on graphics processing originally, doing something very, very specific. How does that translate today? What's NVIDIA's superpower? Because people will say, well, hey, I've got a CPU, why do I need you?

>> I think our superpower is accelerated computing, and that's really a hardware and software thing. I think your question is slanted towards the hardware side, which is, yes, very typical, and we do make great processors. But the graphics processor that you talked about from 10 or 20 years ago was designed to solve a very complex task, and it was exquisitely designed to solve that task with the resources we had available at that time. Now, fast forward 10 or 15 years, and we're talking about a new class of problems called AI. And it requires both exquisite processor design as well as very complex and exquisite software design sitting on top of it — plus the systems and infrastructure knowledge, high-performance storage, and everything else we're talking about in the solution today. So NVIDIA's superpower is really about that accelerated computing stack: at the bottom you've got hardware, above that you've got systems, above that you have middleware and libraries, and above that you have what we call application SDKs that simplify this really complex problem for this domain or that domain, while still allowing you to take advantage of that processing horsepower we put in that exquisitely designed thing called the GPU.
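To make the accelerated computing stack Bob describes a little more concrete, here is a minimal sketch of the idea at the library layer. CuPy is used purely as an illustration of the kind of GPU-accelerated library such a stack exposes — it is not a product named in this conversation:

```python
# A minimal sketch of "accelerated computing" at the library layer.
# Assumption: CuPy stands in for the kind of GPU-accelerated library
# the stack exposes; it is not named in this conversation.
import numpy as np
import cupy as cp

# The same matrix multiply, expressed against the CPU and GPU libraries.
a_cpu = np.random.random((1024, 1024)).astype(np.float32)
a_gpu = cp.asarray(a_cpu)            # copy the data to GPU memory

b_cpu = a_cpu @ a_cpu                # runs on CPU cores
b_gpu = a_gpu @ a_gpu                # dispatched to GPU kernels (cuBLAS underneath)
cp.cuda.Stream.null.synchronize()    # wait for the asynchronous GPU work to finish

# Results agree; only the layer underneath the array API changed.
print(np.allclose(b_cpu, cp.asnumpy(b_gpu), rtol=1e-3))
```

The application code barely changes; the middleware and library layers underneath it are where the acceleration lives, which is the point being made about the stack.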
>> Decreasing complexity and increasing speed: two very key themes of the show. Shocking no one, you all wanna do more, faster. Speaking of that — and I'm curious because you both serve a lot of different unique customers, verticals, and use cases — is there a specific project that you're allowed to talk about? Or if you wanna give us the scoop, that's totally cool too; we're here for the scoop on theCUBE. Is there a specific project or use case that has you personally excited, Anthony? We'll start with that.

>> Look, I've always been a big fan of natural language processing. I don't know why, but deriving intent based on word choices is very interesting to me. I think what complements that is natural language generation. So now we're having AI programs actually discover and describe what's inside of a package. It wouldn't surprise me if, over time, we move from doing the typical summary on the economics of the day, or what happened in football, and start moving that toward the more creative advertising and marketing arts, where you're no longer needed because the AI is gonna spit out the result. I don't think we're gonna get there, but I really love this idea of human language and computational linguistics.

>> What a marriage. I agree, I think it's fascinating. What about you, Bob? What's got you pumped?

>> The thing that really excites me is the problem solving — sort of the tip of the spear in problem solving. The stuff that you've never seen before, the stuff that, in a geeky way, kind of takes your breath away. And I'm gonna jump, or pivot, off of what Anthony said: large language models are really one of those areas that are just amazing, and they're kind of surprising everyone with what they can do. Here on the show floor, I was looking at a demonstration from a large language model startup, and they were showing that you could ask a question about some obscure news piece that was reported only in a German newspaper — it was about a little shipwreck that happened in a harbor. I could type a query into this system, and it would immediately know where to find that information, as if it had read the article. It summarized it for you, and it could even answer questions that you could only answer by looking at the pictures in that article. Just amazing stuff that's going on. Just phenomenal stuff.

>> That's huge for accessibility.

>> That's right. And I geek out when I see stuff like that. And that's where I feel like all this work that Dell and NVIDIA and many others are putting into this space is really starting to show potential in ways that we wouldn't have dreamed of really five years ago. Just really amazing.

>> And we see this in media and entertainment. So in broadcasting, you have a sudden event: someone leaves this planet, or they discover something new, or they get a divorce and they're a major quarterback. You wanna go back somewhere in all of your archives to find that footage — that's a very laborious project. But if you can use AI technology to categorize it and provide the metadata tags so it's searchable, then we're off to better productions, more interesting content, and a much richer viewer experience.

>> And a much more dynamic picture of what's really going on, factoring all of that in. I love that. I mean, David and I are both nerds, and I know we've had take-your-breath-away moments, so I appreciate that you just brought that up. Don't worry, you're in good company in terms of the geek squad over here.

>> I think actually maybe this entire show qualifies.

>> Yes, exactly.
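For readers who want to see what the natural language processing and generation described above look like in practice, here is a minimal sketch using the open-source Hugging Face transformers library — an illustration of the general technique, not the specific systems Anthony and Bob mention; the model names are just common public defaults:

```python
# A minimal sketch of NLP (deriving intent/sentiment from word choices)
# and NLG (generating text), per the discussion above. Uses the open-source
# Hugging Face `transformers` library as an illustration only — not the
# specific systems demonstrated on the show floor.
from transformers import pipeline

# NLP: classify the intent/sentiment carried by word choices.
classifier = pipeline("sentiment-analysis")
print(classifier("The archive search took days and the footage was unusable."))
# e.g. [{'label': 'NEGATIVE', 'score': 0.99...}]

# NLG: generate a continuation — the "AI writes the summary" direction.
generator = pipeline("text-generation", model="gpt2")
print(generator("In tonight's football recap,", max_length=40, num_return_sequences=1))
```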
>> I mean, we were talking about how steampunk some of the liquid cooling stuff is, and this is the only place on Earth, really — the only show — where you would come and see it at this level and scale. It's very exciting. How important for the future of innovation in HPC are partnerships like the one that NVIDIA and Dell have?

>> You wanna start?

>> Sure. I'm gonna be bold and brash and arrogant and say they're essential. You do not want to try to roll this on your own. Even if we just zoomed in to one little piece of the technology — the software stack that does modern, accelerated deep learning — it is incredibly complicated. There can easily be 20 or 30 components that all have to be the right version, with the right buttons pushed, built the right way, assembled the right way, and we've got lots of technologies to help with that. But you do not want to be trying to pull that off on your own. That's just one little piece of the complexity we talked about. As technology providers in this space, we really need to do as much as we do to try to unlock the potential; we have to do a lot to make it usable and capable as well.

>> I got a question for Anthony.

>> All right.

>> So in your role — and I'm sort of projecting here — I think your superpower personally is likely in the realm of being able to connect the dots between technology and the value that that technology holds in a variety of contexts.

>> That's right.

>> Whether it's business or whatever it is. Okay. Now, it's critical to have people like you to connect those dots. Today, in the era of pervasive AI, how important will it be to have AI explain its answer? In other words, should I trust the information the AI is giving me? If I am a decision maker, should I just take it at face value? Or am I going to want to demand of the AI kind of what you deliver today, which is: no, no, no — you need to explain this to me. How did you arrive at that conclusion? How important will that be for people to move forward and trust the results? We can all say, oh hey, just trust us, it's AI, it's great, it's got NVIDIA acceleration and it's Dell, you can trust us — but come on, there are so many variables in the background.

>> It's an interesting one. And explainability is a big function of AI. People want to know how the black box works, right? Because if you have an AI engine that's looking for potential maladies in an X-ray, but it misses one — do you sue the hospital, the doctor, or the software company? That accountability element is huge. I think as we progress and we trust it to be part of our everyday decision making, it's as simple as a recommendation engine: it isn't actually making all of the decisions, it's supporting us. We still, after decades of advanced technology and proven algorithms, can't predict what the market price of any object is gonna be tomorrow. And you know why? Human beings. We are so unpredictable; how we feel in the moment is radically different. And whereas we can extrapolate for a population, we can't do that for an individual choice. So humans and computers will not be separated — it's a joint partnership.
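Anthony's point about explainability — wanting to know how the black box arrived at a recommendation — is straightforward to sketch in code. Here is a minimal, generic illustration using scikit-learn's permutation importance on synthetic data; it is an assumed setup, not the panel's specific tooling:

```python
# A minimal sketch of model explainability, per the discussion above.
# Assumptions: synthetic data and scikit-learn's permutation importance;
# a generic illustration, not the panel's specific tooling.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=2000, n_features=6, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Shuffle one feature at a time; the accuracy drop tells us how much the
# model's answer actually depended on that feature — a first step toward
# "explain how you arrived at that conclusion."
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, importance in enumerate(result.importances_mean):
    print(f"feature_{i}: {importance:.3f}")
```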
>> But I wanna get back to your point, because I think this is very fundamental to the philosophy of both companies. Yeah, it's about a community. It's always about the people — sharing ideas, getting the best. Anytime you have a center of excellence, an algorithm that works for sales forecasting may actually be really interesting for churn analysis, to make sure the employees or students don't leave the institution. So it's that community of interest that I think is unparalleled at other conferences. This is the place where a lot of that happens.

>> I totally agree with that. We've felt that on the show. I think that's a beautiful note to close on. Anthony, Bob, thank you so much for being here. I'm sure everyone feels more educated, and perhaps more at peace with the chaos. David, thanks for sitting next to me asking the best questions of any host on theCUBE. And thank you all for being a part of our community. Speaking of community, here on theCUBE, we're live from Dallas, Texas. It's Supercomputing all week. My name is Savannah Peterson, and I'm grateful you're here.

Published Date : Nov 16 2022


Matt Burr, Pure Storage


 

(Intro Music)

>> Hello everyone, and welcome to this special Cube conversation with Matt Burr, who is the general manager of FlashBlade at Pure Storage. Matt, how you doing? Good to see you.

>> I'm doing great. Nice to see you again, Dave.

>> Yeah, welcome back. We're going to be broadcasting this at Accelerate. You guys have big news — of course, FlashBlade//S — and we're going to dig into it. The famous FlashBlade now has a new letter attached to it. Tell us what it is, what it's all about.

>> (laughing) You know, it's easy to say it's just the latest and greatest version of the FlashBlade, but obviously it's a lot more than that. We've had a lot of success with FlashBlade kind of across the board, in particular with Meta and their research supercluster, which is one of the largest AI superclusters in the world. But it's not enough to just build on the thing that you had, right? So with the FlashBlade//S, we've increased modularity; we've done things like building co-designed software and hardware and leveraging that into something that actually doubles density, performance, and power efficiency. On top of that, you can scale storage, networking, and compute independently, which is a pretty big deal because it gives you more flexibility, a little more granularity around performance or capacity, depending on which direction you want to go. And we believe the end result is, I guess the way to put it is, the highest-performance, capacity-optimized unstructured data platform on the market today, without the need for an expensive data caching tier. So we're pretty excited about what we've ended up with here.

>> Yeah. So I think sometimes people forget about how much core engineering Meta does. You go on Facebook and play around and post things, but their backend cloud is just amazing. So talk a little bit more about the problem targets for FlashBlade. It's a pretty wide scope, and we're going to get into that, but what's the core of it?

>> Yeah. We've talked about that extensively in the past; the use cases generally remain the same. I know we'll probably explore this a little more deeply, but really what we're talking about here is performance and scalability. We have written an essentially unlimited metadata software layer, which gives us the ability to expand — we're already starting to think about computing at exabyte scale. So the problem the customer has of, hey, I've got a greenfield object environment, or I've got a file environment and my 10K and 7,500 RPM disk is just spiraling out of control — it's an environmental problem, it's a management problem. We have effectively simplified the process of bringing together highly performant, very large, multi-petabyte to eventually exabyte-scale unstructured data systems.

>> So people are obviously trying to inject machine intelligence, AI, ML into applications — bringing data into applications, bringing those worlds closer together. Analytics is obviously exploding. You see some other things happening in the news — ransomware protection and the like. Where does FlashBlade fit, in terms of FlashBlade//S, in some of these new use cases?

>> All those things — we're only going wider and broader. So, we've talked in the past about having a horizontal approach to this market.
The unstructured data market has often had vertical specificity. You could see successful infrastructure companies in oil and gas that may not play in media and entertainment, or successful companies that play in media and entertainment but don't play well in financial services, for example. We're playing the long game here, and we're focused on bringing an all-QLC architecture that combines our traditional Pure DFM with software that is now, I guess, seven years hardened from the original FlashBlade system. And so when we look at customers, we look at them in three categories — well, more than three, but let's bucketize it this way. We have customers that fit into the very traditional EDA and HPC space. Then you have the data protection space, and I believe ransomware falls under that as well. The world has changed, right? Customers want their data back faster. Rapid restore is a real thing. We have customers that come to us and say, anybody can back up my data, but if I want to get something back fast — and I mean in less than a week, or a couple of days — what do I do? So we can solve that problem. And then, as you accurately pointed out where you started, there is the AI/ML side of things, where the NVIDIA relationship that we have comes in. DGXs are a pretty powerful weapon in that market and in solving those problems, but they're not cheap, and keeping those DGXs running all the time requires an extremely efficient underpinning of a flash system. And we believe we have that market as well.
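Since FlashBlade is described throughout this conversation as a unified fast file and object platform, a minimal sketch of the object half from a client's point of view may help. This assumes an S3-compatible endpoint; the endpoint URL, credentials, and bucket name are made-up placeholders purely for illustration:

```python
# A minimal sketch of talking to an S3-compatible object endpoint,
# per the "unified fast file and object" discussion. The endpoint URL,
# credentials, and bucket name below are placeholders, not real values.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://flashblade.example.internal",  # placeholder endpoint
    aws_access_key_id="ACCESS_KEY_PLACEHOLDER",
    aws_secret_access_key="SECRET_KEY_PLACEHOLDER",
)

# Write an object, then read it back — in a unified file/object design,
# the same data service also speaks file protocols.
s3.put_object(Bucket="training-data", Key="sample.txt", Body=b"hello unstructured world")
obj = s3.get_object(Bucket="training-data", Key="sample.txt")
print(obj["Body"].read())
```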
>> It's interesting — when Pure was first coming out as a startup, you obviously had some cool new tech, but your stack wasn't as hardened. Now you've got seven years under your belt. The last time you were on theCUBE, we talked about some of the things you guys were doing differently. We talked about UFFO — unified fast file and object. How does this new product, FlashBlade//S, compare to previous generations of FlashBlade in terms of solving unstructured data and some of these other trends we've been talking about?

>> Yeah. I touched on this a little bit earlier, but I want to go a little deeper on this concept of modularity. For those familiar with Pure Storage, we have what's called the Evergreen storage program. It's not as much a program as it is an engineering philosophy: the belief that everything we build should be modular in nature, so that we can have a chassis with 100% modular components inside of it, such that we can upgrade all of those features non-disruptively from one version to the next. Think about it this way: if you have an iPhone, when you go get a new iPhone, what do you do with your old one? You either throw it away or you sell it. Well, imagine if your iPhone just got newer and better each time you renewed your two-year or three-year subscription with Apple. That's effectively our core engineering philosophy within Pure, and it is now a completely full and robust program with this instantiation of the FlashBlade//S. What that means for a customer is: I'm future-proofed for X number of years, knowing that we have a track record, on the FlashArray side, of keeping customers current from the FA-400 all the way through FlashArray//X and //XL — which is about a ten-year time span.

That, in and of itself, starts to play into customers that have concerns around ESG. Last time I checked, power, space, and cooling still mattered in the data center. Although I have people tell me all the time that power and space clearly don't matter anymore, at the end of the day most customers seem to say they do. You're not throwing away refrigerator-size pieces of equipment that once held spinning disk; something the size of a microwave, populated with DFMs of QLC flash, is what you can actually upgrade over time. So if you want to scale more performance, we can do that through adding CPU. If you want to scale more capacity, we can do that through adding more DFMs. And we're in control of those parameters because we're building our own DFMs — our direct fabric modules — on our own storage nodes, if you will. So instead of relying on the consumer packaging of an SSD, we're upgrading our own stuff and growing it as we can. So again, on the ESG side, I think for many customers going into the next decade it's going to be a huge deal.

>> Yeah, interesting comments, Matt. I mean, I don't know if you guys invented it, but you certainly popularized the idea of no-forklift upgrades, and sort of set the industry on its head when you drove that Evergreen strategy. And on that note, you guys talk about simplicity. I remember at the last Accelerate I went deep with Coz on your philosophy of keeping things simple, keeping things uncomplicated. You guys talk about using better science to do that, and there's a lot of talk these days about outcomes. How does FlashBlade//S support those claims, and what do you guys mean by better science?

>> Yeah, you know, better science is kind of a funny term. It was an internal term — I was on a sales call, actually, and the customer said, well, I understand the difference between these two, but could you tell me how you got there? I was a little stumped on the answer, and I just said, well, I think we have better scientists. And that kind of morphed into better science. A good example of it is our metadata architecture: our scalable metadata allows us to avoid the caching tier that other architectures have to rely on in order to anticipate which files are going to need to be in read cache — and read misses become very expensive. Now, a good follow-up question there — not to do your job, but it's the question I always get — is: when you're designing your own hardware and your own software, what's the real material advantage? The real material advantage is that you are in control of the combination and the interaction of those two things. You trade away the general-purpose nature, if you will, of the performance characteristics that come along with commodity parts, and you get a very specific performance profile that's tailored to the software being married to it. Now, in some instances you could say, does that really matter? Well, when you start talking about 20, 40, 50, 100, 500-petabyte data sets, every percentage matters. Those individual percentages equate to space savings; they equate to power and cooling savings. We believe we're going to have industry-best dollars per watt, and industry-best dollars per rack unit. So really the whole game here is around scale.
>> Yeah. I mean, look, there's clearly a place for the purely software-defined approach. And when cloud first came out, everybody said, oh, they'll build the cloud on commodity, they won't build custom hardware. Now you see all the hyperscalers building custom software, custom hardware-and-software integration, custom silicon. So co-innovation between hardware and software seems as important, if not more important, than ever — especially for some of these new workloads, and who knows what the edge is going to bring. What's the downside of not having that philosophy, in your view? Is it just that you can't scale to the degree that you want, you can't support the new workloads, or performance? What should customers be thinking about there?

>> I think the downside plays in two ways. First is kind of the future, and at scale, as I alluded to earlier, around cost and savings over time. If you're using a commodity SSD, there's packaging around that SSD that is wasteful — both in the environmental sense and in the computing-performance sense. So that's one thing. On the second side, it's easier for us to control the controllables around reliability when you can reduce the number of things that sit in that workflow — and by workflow I mean, from when a write is acknowledged from a host until it gets down to the media. The more control you have over that, the more reliability you have over that piece.

>> Yeah, I know. And we talked about ESG earlier. You've certainly heard Jensen talk about wasted CPU cycles in the data center — I think he's forecasted that 25 to 30% of cycles are wasted on things like storage offload, or certainly networking and security. So that sort of confirms your ESG thought: we can do things more efficiently. But as it relates to NVIDIA and some of the news around AIRI — what is the AIRI? What does it stand for? What's the high-level overview of AIRI?

>> So the AIRI has been really successful for both us and NVIDIA. It's a really great partnership, and we're appreciative of it. In fact, Tony Paikeday will be speaking here at Accelerate, so we're really looking forward to that. Look, there are a couple of ways to look at this, and I take the macro view — I know there's an equally good micro example, but I think the macro is really where it's at. We don't have data center space anymore, right? There are only so many data centers we can build; there's only so much power we can create. We are going to reach a point in time where municipalities struggle against the businesses in those municipalities for power, and now you're essentially bidding big corporations against people who have an electric bill. That's only going to last so long — and you know who doesn't win in that? The big corporation doesn't win, because elected officials will have to find a way to serve the people so they can get power, no matter how skewed we may think that is. That is the reality. And so, as we look at this transition: the first decade of the disk-to-flash transition was really in the block world. The second decade — and we're really fortunate to be a multi-decade company, of course — of riding that wave from disk to flash is about improving space, power efficiency, and density. It's a long way of getting to the point about NVIDIA: these AI clusters are extremely powerful things.
And they're only going to get bigger, right? They're not going to get smaller. It's not like anybody out there is saying, oh, it's a fad, or this isn't going to yield any results or outcomes. They yield tremendous outcomes in healthcare, they yield tremendous outcomes in financial services, they yield tremendous outcomes in cancer research. These are not things that we as a society are going to give up; in fact, we're going to want to invest more in them. But they come at a cost, and one of the resources required is power. And so when you look at what we've done, in particular with NVIDIA, you've found something extremely power-efficient that meets the needs — going back to that macro view — of both the community and the business. It's a win-win.

>> You know, you're right. It's not going to get smaller; it's just going to continue to gain momentum. But it could get increasingly distributed — and I talked about the edge earlier; you think about AI inferencing at the edge. I think about Bitcoin mining: very distributed, but it consumes a lot of power. So we're not exactly sure what the next-level architecture is, but we do know that science is going to be behind it. Talk a little bit more about your NVIDIA relationship, because I think you guys were the first — I might be wrong about this, but I think you were the first storage company to announce a partnership with NVIDIA, probably four years ago. How is this new solution, with AIRI//S, building on that partnership? What can we expect with NVIDIA going forward?

>> Yeah, I think what you can expect to see is us putting the foot on the gas on where we've been with NVIDIA. As I mentioned earlier, Meta runs, by some measurements, the world's largest research supercluster — they're a huge NVIDIA customer, built on Pure infrastructure. So we see those types of reference architectures — not that everyone's going to have a Meta-scale reference architecture, but the base principles of what they're solving for are the base principles of what we're going to begin to see in the enterprise. I know "begin" sounds like a strange word, because there's already a big business in DGX, and there's already a sizable business in performant, unstructured data. But those are only going to get exponentially bigger from here. So what we see is a deepening and strengthening of the relationship, and an opportunity for us to talk jointly to customers that are going to be building these big facilities and big data centers for these types of compute problems — and to talk about efficiency, right? DGXs are much more efficient, and FlashBlades are much more efficient. It's a great pairing.

>> Yeah. A lot of AI today is modeling in the cloud, and we're seeing HPC and data just slam together — all kinds of new use cases. These types of partnerships are the only way we're going to solve the future problems and go after these future opportunities. I'll give you the last word. You've got to be excited with Accelerate — what should people be looking for at Accelerate and beyond?

>> You know, look, I am really excited. This is going on my 12th year at Pure Storage, which has to be seven or eight Accelerates, however long we've been doing this thing.
So it's a great time of the year. We maybe took a couple off because of COVID, but I love reconnecting, in particular with partners and customers, and just hearing what they have to say. And this is kind of a nice one. This is four or five years' worth of work for my team, who, candidly, I'm extremely proud of, for choosing to take on the problems they chose to take on and find solutions for. So as Accelerate rolls around, I think we have some pretty interesting evolutions of the Evergreen program coming to be announced. We have some exciting announcements in the other product arenas as well, but the big one for this event is FlashBlade. And I think we will see — look, no one's going to completely control this transition from disk to flash, right? That's a macro trend. But there are these points in time where individual companies can accelerate the pace at which it's happening, and that happens through cost, and it happens through performance. My personal belief is this will be one of the largest points of that type of acceleration in the transformation from disk to flash in unstructured data. This is such a leap. It's essentially the equivalent of us going from the 400 series on the block side to the //X, for those familiar with the FlashArray lines. So it's a huge, huge leap for us, and I think it's a huge leap for the market. And look, I think you should be proud of the company you work for, and I am immensely proud of what we've created here. One of the great joys in life is being able to talk to customers about things you care about. I've always told people my whole life: inefficiency is the bane of my existence. I think we've rooted out a ton of inefficiency with this product, and I'm looking forward to going and reclaiming a bunch of data center space and power without sacrificing any performance.

>> Well, congratulations on making it into the second decade, and I'm looking forward to the orange in the third decade. Matt Burr, thanks so much for coming back on theCUBE. It's good to see you.

>> Thanks, Dave. Nice to see you as well. We appreciate it.

>> All right. And thank you for watching. This is Dave Vellante for theCUBE, and we'll see you next time.

(outro music)

Published Date : May 24 2022


Abhinav Joshi & Tushar Katarki, Red Hat | KubeCon + CloudNativeCon Europe 2020 – Virtual


 

>> Announcer: From around the globe, it's theCUBE, with coverage of KubeCon + CloudNativeCon Europe 2020 Virtual, brought to you by Red Hat, the Cloud Native Computing Foundation, and ecosystem partners.

>> Welcome back, I'm Stu Miniman, and this is theCUBE's coverage of KubeCon + CloudNativeCon Europe 2020, the virtual event. Of course, when we talk about cloud native we talk about Kubernetes — there's a lot happening to modernize the infrastructure — but a very important thing we're going to talk about today is also what's happening up the stack: what sits on top of it, and some of the new use cases and applications enabled by all of this modern environment. And for that, we're going to talk about artificial intelligence and machine learning — AI and ML, as we tend to say in the industry. So, happy to welcome to the program two first-time guests joining us from Red Hat: Abhinav Joshi and Tushar Katarki. They are both senior managers, part of the OpenShift group — Abhinav is in product marketing and Tushar is in product management. Abhinav and Tushar, thank you so much for joining us.

>> Thanks a lot, Stu; we're glad to be here.

>> Thanks, Stu, glad to be here at KubeCon.

>> All right. So, Abhinav, as I mentioned in the intro, modernization of the infrastructure is awesome, but really it's an enabler. I'm an infrastructure person; the whole reason we have infrastructure is to drive those applications, interact with my data, and the like. And of course AI and ML are exciting — a lot going on there — but they can also be challenging. So, Abhinav, if I could start with you: bring us inside the customers you're talking to. What are the challenges and the opportunities? What are they seeing in this space? Maybe, what's been holding them back from really unlocking the value that's expected?

>> Yup, that's a very good question to kick off the conversation. What we're seeing is that organizations typically face a lot of challenges when they're trying to build an AI/ML environment. The first one is a talent shortage. There's a limited amount of AI/ML expertise in the market — especially the data scientists who are responsible for building out the machine learning and deep learning models. It's hard to find them and to retain them, and the same goes for other talent like data engineers and app dev and DevOps folks — and the lack of talent can actually stall the project. The second key challenge is the lack of readily usable data. Businesses collect a lot of data, but they must find the right data and make it ready for the data scientists to build, test, and train the machine learning models. If you don't have the right kind of data, the predictions your model makes in the real world are only going to be so good. So finding, and being able to wrangle, the right kind of data becomes a challenge as well. And the third key challenge we see is the lack of rapid availability of the compute infrastructure, the data and machine learning tooling, and the app dev tools for the various personas — the data scientist, the data engineer, the software developers, and so on. That can also slow down the project, because if all your teams are waiting on the infrastructure and the tooling of their choice to be provisioned on a recurring basis, and they don't get it in a timely manner, it can stall the projects.
And then the next one is the lack of collaboration. You have all these teams involved in the AI project, and they have to collaborate with each other, because the work one team does has a dependency on another team. Say, for example, the data scientists are responsible for building the machine learning models; they then have to work with the app dev teams to make sure the models get integrated into the app dev processes and ultimately rolled out into production. So if all these teams are operating in silos, and there's a lack of collaboration between them, this can stall the projects as well. And finally, what we see is that data scientists typically start the machine learning modeling on their individual PCs or laptops, and they don't focus on the operational aspects of the solution. What this means is that when the IT teams have to roll all this out into a production deployment, they're challenged to take all the work done by individuals, make sense of it, and make sure it can be seamlessly brought up in a production environment in a consistent way — be it on-premises, in the cloud, or, say, at the edge. So these are some of the key challenges we see organizations facing as they try to take AI projects from pilot to production.

>> Well, some of those things seem like repetition of what we've had in the past. Obviously, silos have been the bane of IT, and of course for many years we've been talking about that gap between developers and what's happening on the operations side. So, Tushar, help us connect the dots: containers, Kubernetes, the whole DevOps movement — how is this setting us up to actually be successful for solutions like AI and ML?

>> Sure, Stu. In fact, you said it right: in the world of software — in the world of microservices, app modernization, and DevOps over the past 10, 15 years — we have seen this evolution, this revolution, happen with containers and Kubernetes driving more DevOps behavior, more agile behavior. And this, in fact, is what we're saying can ease the path to AI/ML as well. Containers, Kubernetes, DevOps, and OpenShift for software development are directly applicable to AI projects: to make them more agile, to get them into production, to make them more valuable to the organization so it can realize the full potential of AI. We already touched upon a few personas, so it's useful to think about who the users are. Abhinav talked about data scientists — these are the people who do the machine learning itself, the modeling. Then there are data engineers, who do the plumbing and provide the essential data — and data is so essential to machine learning and deep learning. And there are app developers, who in some ways will use the output of what the data scientists have produced, in terms of models, and incorporate them into services. And of course, none of these roles is purely cast in stone; there's a lot of overlap. You could find data scientists who are app developers as well, and you'll see some app developers being data scientists, or a data scientist who is also a data engineer.
So it's a continuum rather than strict boundaries. But regardless, what all of these personas — these groups of experts — need is self-service access to their preferred tools and to compute and storage resources, so they can be productive. And let's not forget the IT, engineering, and operations teams that need to make all this happen in an easy, reliable, available manner, and in something that is really safe and secure. So containers help: they help you quickly and easily deploy a broad set of machine learning and data tools across the hybrid cloud — from the data center to the public cloud to the edge — in a very consistent way. Teams can iteratively modify and change shared container images and machine learning models, with (indistinct), and track changes — and this applies both to containers and to the data, by the way — and be transparent. Transparency helps collaboration, and it can also help with regulatory requirements later in the process. And then, because of the inherent process isolation, resource control, and protection from threats, containers can also be very secure. Now, Kubernetes takes it to the next level. First of all, it forms a cluster of all your compute and data resources, and it helps you run your containerized tools, and whatever you develop on them, in a consistent way, with access to shared, centralized compute, storage, and networking resources from the data center, the edge, or the public cloud. It provides things like resource management, workload scheduling, multi-tenancy controls — so that you can be proper neighbors, if you will — and quota enforcement, as the sketch after this answer illustrates. Now, if you want to up-level it further — if you want to enhance what Kubernetes offers — then you get into how you write applications, how you actually turn those models into services, and how you lifecycle them. And that's where the power of Helm and, furthermore, of Kubernetes Operators really comes into the picture. While Helm helps with installing some of this, for a complete lifecycle experience a Kubernetes Operator is the way to go: Operators simplify the acceleration, deployment, and lifecycle management of your entire AI/ML toolchain, end to end. So, all in all, organizations need to define and deploy models rapidly, just like applications — that's how they get value out of it quickly. And there's a lack of collaboration across teams, as Abhinav pointed out earlier; you've noticed that happen in the world of software too. So we're talking about bringing those best practices to AI/ML — DevOps approaches for machine learning operations, or what many analysts and others have started calling MLOps. Bringing DevOps to machine learning fosters better collaboration between teams — application developers and IT operations — and creates a feedback loop, so that the time to production, and the ability to take more machine learning and ML-powered applications into production, improves significantly. So that's where I wanted to shine the light on what you were referring to earlier, Stu.
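To ground Tushar's points about workload scheduling, multi-tenancy, and quota enforcement, here is a minimal sketch of submitting a containerized training run to a Kubernetes cluster with the official Python client. The namespace, image, and GPU request are hypothetical placeholders, not anything prescribed in the conversation:

```python
# A minimal sketch of scheduling a containerized training run on Kubernetes,
# per the discussion above. Namespace, image, and resource values are
# hypothetical placeholders for illustration only.
from kubernetes import client, config

config.load_kube_config()  # use the current kubeconfig context

container = client.V1Container(
    name="trainer",
    image="registry.example.com/ml/train:latest",   # placeholder image
    command=["python", "train.py"],
    # The scheduler enforces quotas/multi-tenancy against requests like this:
    resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "1"}),
)

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="train-model"),
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(containers=[container], restart_policy="Never")
        )
    ),
)

# Each team submits into its own namespace, where its quotas are enforced.
client.BatchV1Api().create_namespaced_job(namespace="ml-team-a", body=job)
```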
>> All right. Abhinav, of course one of the good things about OpenShift is you have quite a lot of customers that have deployed the solution over the years. Bring us inside some of your customers — what are they doing for AI/ML — and help us understand what really differentiates OpenShift in the marketplace for this solution set.

>> Yeah, absolutely, that's a very good question as well. We're seeing a lot of traction across all kinds of industries — financial services, healthcare, automotive, insurance, oil and gas, manufacturing, and so on — for a wide variety of use cases. And what we're seeing is that, at the end of the day, all these deployments are focused on helping improve the customer experience, automate business processes, and then help them increase revenue, serve their customers better, and save costs. If you go to openshift.com/ai-ml, it has a lot of customer stories, but today I'll touch on three of our customers in different industries. The first one is Royal Bank of Canada. They're a top global financial institution based in Canada, with more than 17 million clients globally. They recently announced that they built an AI-powered private cloud platform based on OpenShift as well as NVIDIA DGX AI compute systems. This whole solution is helping them transform the customer banking experience by delivering AI-powered intelligent apps, while also improving the operational efficiency of the organization. With this kind of solution, they're able to run thousands of simulations and analyze millions of data points in a fraction of the time compared with the solution they had before. A lot of great work going on there. The next one is HCA Healthcare. HCA is one of the leading healthcare providers in the country, based in Nashville, Tennessee, with more than 184 hospitals and more than 2,000 sites of care in the U.S. and the UK. What they did was develop a very innovative, machine-learning-powered data platform on top of OpenShift to help save lives. The first use case was to help with the early detection of sepsis — a life-threatening condition — and more recently they've been able to use OpenShift, with the same kind of stack, to roll out new applications powered by machine learning and deep learning to, let's say, help them fight COVID-19. They recently did a webinar as well, with all the details on the challenges they had, how they went about it — the people, process, and technology — and what the outcomes are. We're proud to be a partner in a solution that supports such a noble cause. And the third example I want to share is the BMW Group and our partner DXC Technology. What they've done is develop a very high-performing, data-driven development platform based on OpenShift, to analyze the massive amount of data from their test fleet and help speed up their autonomous driving initiatives. They've also redesigned the connected-drive capability they have on top of OpenShift, which is helping them deliver various use cases that improve the customer experience — their customers can leverage a lot of different value-added services directly from within their own cars. At the Red Hat Summit last year they gave a keynote, and this year at Summit they were one of the Innovation Award winners.
And we have a lot more stories, but these are the three that I thought were compelling enough to talk about here on theCUBE. >> Yeah, Abhinav, just a quick follow-up for you. One of the things of course we're looking at in 2020 is how the COVID-19 pandemic, people working from home, how has that impacted projects? I have to think that AI and ML are one of those projects that take a little bit longer to deploy. Is it something that you see, are they accelerating it, are they putting it on pause, or are new projects kicking off? Anything you can share from customers you're hearing right now as to the impact that they're seeing this year? >> Yeah, what we are seeing is that customers are now even more keen to roll out the digital (indistinct), and we see a lot of customers now on an accelerated timeline to complete their AI and ML projects. So yeah, it's picking up a lot of momentum, and we talk to a lot of analysts as well, and they are reporting the same thing: the interest in AI and ML projects is ramping up across their customer base. So yeah, it's the right time to be looking at innovative services that can help improve the customer experience in the new virtual world that we live in now with COVID-19. >> All right, Tushar, you mentioned that there's a few projects involved, and of course we know at this conference there's a very large ecosystem. Red Hat is a strong contributor to many, many open-source projects. Give us a little bit of a view as to, in the AI and ML space, who's involved, which pieces are important, and how Red Hat looks at this entire ecosystem? >> Thank you, Stu. So as you know, technology partnerships and the power of open are really what is driving the technology world these days, particularly in the AI ecosystem. And that is mainly because machine learning has been bootstrapped over the past 10 years or so, and a lot of the emerging technology, built to take advantage of the emerging data as well as compute power, has been built on the Linux ecosystem, with openness and popular languages like Python, et cetera. And of course there's tons of technology based in Java, but the point really here is that the ecosystem plays a big role, and open plays a big role, and that's Red Hat's best cup of tea, if you will; Red Hat really plays a leadership role in the open ecosystem. So if we take your question and put it into two parts, what we are doing in the community, and then what we are doing in terms of partnerships themselves, commercial and technology partnerships, we'll take it one step at a time. In terms of the community itself, if you step back three years, we worked with other vendors and users, including Google, NVIDIA, H2O, Seldon, et cetera, both startups and big companies, to develop the Kubeflow ecosystem. Kubeflow is an upstream community that is focused on developing MLOps, as we talked about earlier: end-to-end machine learning on top of Kubernetes.
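To make the Kubeflow idea concrete, here is a minimal sketch of the kind of pipeline that MLOps workflow is built around. It assumes the Kubeflow Pipelines v1 SDK (`kfp`); the container images and paths are hypothetical placeholders, not anything the speakers reference:

```python
import kfp
from kfp import dsl

@dsl.pipeline(
    name="mlops-demo",
    description="Preprocess a dataset, then train a model on it.",
)
def train_pipeline(data_path: str = "/data/raw"):
    # Each step runs as its own container on the Kubernetes cluster.
    preprocess = dsl.ContainerOp(
        name="preprocess",
        image="example.registry/preprocess:latest",  # hypothetical image
        arguments=["--input", data_path, "--output", "/data/clean"],
    )
    train = dsl.ContainerOp(
        name="train",
        image="example.registry/train:latest",  # hypothetical image
        arguments=["--input", "/data/clean"],
    )
    train.after(preprocess)  # explicit dependency between the two steps

if __name__ == "__main__":
    # Compile to a spec the Kubeflow pipeline engine can run on the cluster.
    kfp.compiler.Compiler().compile(train_pipeline, "train_pipeline.yaml")
```

Compiling yields a spec the pipeline engine runs on the cluster, which is the "models lifecycled like applications" pattern being described.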
So Kubeflow right now is at 1.0, which happened a few months ago, and now it's actually at 1.1; you'll see that at KubeCon here. So that's the Kubeflow community. In addition to that, we are augmenting it with the Open Data Hub community, which extends the capabilities of the Kubeflow community to also add some of the data pipelining stuff and some of the data pieces that I talked about, and forms a reference architecture for how to run some of this on top of OpenShift. The Open Data Hub community also has a great way of including partners, from a technology partnership perspective, and then you tie that with something I mentioned earlier, which is the idea of Kubernetes operators. Now, if you take a step back, as I mentioned earlier, Kubernetes operators help manage the lifecycle of the entire application, or containerized application, including not only the configuration on day one but also day-two activities like updates, backups, restores, et cetera: whatever the application needs for proper functioning, an "operator" makes sure of it. So anyways, the Kubernetes operators ecosystem is also flourishing, and we have interfaced that with OperatorHub.io, which is a community marketplace, if you will; I don't call it a marketplace, a community hub, because it's just comprised of community operators. So the Open Data Hub can actually take community operators and show you how to run them on top of OpenShift and manage the lifecycle. Now that's the reference architecture. Now, the other aspect of it, as I mentioned earlier, is the commercial aspect. From a customer point of view, how do I get certified, supported software? And to that extent, what we have, at the top, from a user experience point of view, is certified operators and certified applications from the AI/ML ISV community in the Red Hat Marketplace. And the Red Hat Marketplace is where it becomes easy for end users to deploy these ISVs and manage the complete lifecycle, as I said. Some examples of these kinds of ISVs include startups like H2O (although H2O is well known in certain sectors), PerceptiLabs, Cnvrg, Seldon, Starburst, et cetera, and then on the other side we do have the big giants in this as well, which includes partnerships we have announced with NVIDIA, Cloudera, et cetera, and also SAS, I've got to mention. So anyways, these create a rich ecosystem for data scientists to take advantage of. At Red Hat Summit back in April, we, along with Cloudera, SAS and Anaconda, showcased a live demo that showed all these things working together on top of OpenShift, with this operator kind of idea that I talked about. So I welcome people to go and take a look; the openshift.com/ai-ml page that Abhinav already referenced should have a link to it, and a simple Google search might find it if you need some of that. But anyways, the other part of it is really our work with the hardware OEMs, right? Obviously NVIDIA GPUs are hardware, and that acceleration is really important in this world, but we are also working with other OEM partners like HP and Dell to produce accelerated AI platforms, turnkey solutions, to create this open AI platform for the "private cloud" or the data center.
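For a sense of what the operator pattern looks like from the user's side, here is a hedged sketch using the official Kubernetes Python client. The `NotebookServer` resource, its group and its fields are hypothetical stand-ins for whatever CRD a real operator from OperatorHub.io would define:

```python
from kubernetes import client, config

config.load_kube_config()  # reads your local kubeconfig
api = client.CustomObjectsApi()

# A small declaration of intent; the operator reconciles everything else
# (deployment, storage, day-2 updates and backups) to match this object.
notebook = {
    "apiVersion": "example.org/v1alpha1",  # hypothetical CRD group/version
    "kind": "NotebookServer",              # hypothetical custom resource
    "metadata": {"name": "team-a-notebook"},
    "spec": {"image": "jupyter/scipy-notebook", "gpus": 1},
}

api.create_namespaced_custom_object(
    group="example.org",
    version="v1alpha1",
    namespace="data-science",
    plural="notebookservers",
    body=notebook,
)
```

From here on, day-two operations are just edits to this object; the operator watches it and drives the cluster to match.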
The other thing obviously is IBM. IBM Cloud Pak for Data is based on OpenShift, has been around for some time, and is seeing very good traction; if you think about a very turnkey solution, IBM Cloud Pak is definitely well ahead there. And then finally, Red Hat is about driving innovation in the open-source community. So, as I said earlier, we are doing the Open Data Hub, which is that reference architecture showcasing a combination of upstream open-source projects and all these ISV ecosystems coming together. I welcome you to take a look at that at opendatahub.io. So I think that would be the sum total of how we are not only doing open and community building but also doing certifications, and providing our customers the assurance that they can run these tools in production with the help of a rich, certified ecosystem. >> And the customer is always key to us, so that's the other thing: the goal here is to provide our customers with a choice, right? They can go with open source or they can go with a commercial solution as well. So we want to make sure that they get the best-in-class cloud experience on top of OpenShift and our broader portfolio as well. >> All right, great note to end on. Abhinav, thank you so much, and Tushar, great to see the maturation in this space; such an important use case. Really appreciate you sharing this with theCUBE and the KubeCon community. >> Thank you, Stu. >> Thank you, Stu. >> Okay, thank you, thanks a lot, and have a great rest of the show. Thanks everyone, stay safe. >> Thank you, and stay with us for a lot more coverage from KubeCon + CloudNativeCon Europe 2020, the virtual edition. I'm Stu Miniman, and thank you as always for watching theCUBE. (soft upbeat music plays)

Published Date : Aug 18 2020


John Curran & Jim Benedetto, Core Scientific | Pure Accelerate 2019


 

>> Announcer: From Austin, Texas, it's theCUBE, covering Pure Storage Accelerate 2019. Brought to you by Pure Storage. >> Welcome back to theCUBE, Lisa Martin live on the Pure Accelerate floor in Austin, Texas. Dave Vellante is joining me, and we're pleased to welcome a couple of guests from Core Scientific for the first time to theCUBE. We have Jim Benedetto, Chief Data Officer, and John Curran, the SVP of Business Development. Gentlemen, welcome to theCUBE. >> Both: Thank you. >> Pleasure to be here. >> So John, we're going to start with you. Give our audience an overview of who Core Scientific is, what you guys do, what you deliver. >> Sure, well, we're a two-year-old startup, headquartered out of Bellevue, Washington, and we really focus on two primary businesses: we have a blockchain business and we have an AI business. In blockchain, we are one of the largest blockchain cryptocurrency hosting companies in North America. We've got four facilities, in North Carolina, South Carolina, Georgia, and Kentucky, and really the business there is helping companies take advantage of blockchain and then positioning them for the future, you know. And then on the AI side of our business, we really operate in two ways. One is we can also co-locate and host people, just like we do on the blockchain side. But primarily, we're focused on creating a public cloud focused on GPU-centric computing and artificial intelligence, and we're there to help really usher in the new age of AI. >> So you guys were founded, you said, two years ago. >> Yes. >> From what I can tell you haven't raised a ton of dough. Is that true, or are you guys quiet about that? >> John: We're very well capitalized. >> Okay, so it hasn't hit Crunchbase yet. >> Yeah, no. So we're a very well capitalized company. We've got, you know, to give you-- >> 'Cause what you do is not cheap. >> No, no, we've got about 675 megawatts of power under contract, so each one of our facilities is about 50 megawatts plus in size. So no, it's not cheap. They're large installations and large build-outs. >> And to even give you a comparison, a standard data center is about five to 10 megawatts. We won't even look at a facility or a plot of land unless we can supply at least 50 megawatts of power. >> So I was going to ask you to describe what's different between blockchain hosting and conventional data centers. You kind of just did, but are there other technical factors that you guys consider? >> Absolutely. We custom build our own data centers from the ground up. We've got patent-pending technology, and if you look at virtually every data center in the world today, it's built with one thing at its core, and that's the CPU. The CPU is fundamentally different than the GPU, and if you try to retrofit CPU-based data centers for GPUs, you're not going to fully maximize the performance and the capabilities of the GPU. So we build data centers from the ground up with the GPU at the center, not the CPU. >> And is "center" in quotes? Because you have all this alternative processing, GPUs in particular, popping up all over the place, as opposed to the traditional CPU approach, which is, okay, just jam as much as I can on the real estate as possible. Is that a factor? >> Well, it's the GPU at the center, but there's also a lot of supporting infrastructure. So you've got to look at, first off, the power density, which is very, very different.
GPUs require significantly more power than CPUs do, and then also, just from a fluid-dynamics perspective, the heating and cooling of them is, again, fundamentally different. You're not looking at standard hot/cold aisles and raised floors. But the overall goal is also to provide a supporting infrastructure which, from an AI-ready design, means the interconnected networking and the incredibly fast storage behind it. Because the name of the game with GPUs is different than with CPUs. With GPUs, the one thing you want to do is get as much data into the GPU as fast as possible, because compute will very rarely be your limiting factor with the GPU, so the supporting infrastructure is significantly more important than it is when you're dealing with CPUs. >> So the standard narrative is, well, I don't know about cryptocurrency, but the underlying technology of blockchain has a lot of potential. I personally think they're very much related, and I wonder if you guys can comment on that. You started during the most recent big uptick, and I know it's bounced back in cryptocurrency, so you must've had a lot of activity in your early days. And then maybe the crypto winter affected you, maybe it didn't. Some of those companies were so well capitalized, it was kind of their time to innovate, right? And yeah, there were some bad actors, but that's really not the core of it. So I wonder what you guys have seen in the blockchain market. We'll get to AI and Pure and all that other stuff, but this is a great topic, so I wonder if you could comment. >> So you know, yes, there's certainly cyclicality in the blockchain market, right? I think one of the key things is that being well capitalized allows you to invest through the downturns, to position yourself to come out stronger as the market recovers, and you know, we've certainly seen that. Our growth in blockchain continues to really be substantial. And you know, we're making all the right strategic investments, right? Whether it's blockchain or AI, because you have such significant power requirements, you've got to be very strategic about where you put the facilities. You're looking for facilities that have large, sustained power capabilities, green power. You know, we've seen carbon taxes come in; that'll adversely affect folks. We want to make sure we're positioned for the long term in terms of capabilities. And then some geopolitical uncertainty has certainly been a factor, you know. On the blockchain side of the business, it's driven more business to North America, which has been fantastic for us. >> To me you're hosting innovation: you're talking blockchain and AI, and like you're saying, include crypto in there. You have some cryptocurrency guys, right? >> We do blockchain, or cryptocurrency, mining for ourselves as well. >> For yourselves, okay. But my take on it is that a whole new internet is being built, and the crypto craze actually has funded a lot of that innovation. New protocols: when's the last time? The protocols of the internet, SMTP, HTTP, they were all government funded or education funded, academic institutions, and the big internet companies sort of co-opted them. So you had a dearth of innovation, and that's now come back. And you guys are hosting that innovation; that's kind of how I look at it. And I feel like we've seeded the base, and there's going to be this massive explosion of innovation, in blockchain, crypto, AI, automation, and you're in the heart of it.
>> Yeah, I agree. I think cryptocurrencies, or digital currencies, are really just the first successful experiment of the blockchain, and I agree with you: I think it is as revolutionary, and is going to change as many industries, as the internet did. We're still at a very nascent stage of the technology, but at Core we're working to position ourselves to really be the underlying platform, almost like the Akamai of the early days of the internet: the underlying platform and the plumbing for both blockchain and AI applications. >> Right, whether it's smart contracts, like I say, new innovation, AI, it's all powering the next generation of distributed apps. Really, okay, so, sorry, I love this topic. >> I know you do. (laughs) >> Okay, so where do these guys fit in? >> John: So do we. >> I mean, it's just so exciting. I think it's misunderstood. I mean, the people who are into it are believers. Like myself: I really believe in a store of value, I believe in smart contracts, immutability, you know, and I believe in responsibility too, and all that other good stuff, but so. >> Innovation in private blockchain is just starting. If you look at it, I think there's going to be multiple waves on the blockchain side, and we want to be there to make sure that we're helping power and position folks from both an infrastructure as well as a software perspective. >> Every financial institution; you've got VMware doing stuff; Libra, and I love Libra even though it's getting a lot of criticism, it just shined a light on the whole topic. But bring us back to the commercial mainstream. What are you guys doing here? What's going on with Pure? >> So we have built, and we're the first, an AI-ready certified data center, and we've actually partnered very closely with Pure and NVIDIA. As we went through the selection process for what type of storage we were going to use to back our GPUs, we went through a variety of different evaluation criteria, and Pure came out ahead, so we decided to go with Pure. And again, for me, it boils down to one thing as a Chief Data Officer: how much data can I get into those GPUs, as fast as possible? And what you see, if you look at existing cloud providers, is that they're retrofitting CPU-based data centers for GPUs, and you see a lot of problems with that. The storage that they provide is not fast enough to drive, quote unquote, warm or cold data into the GPUs, so people end up adding more and more GPUs, really just to increase GPU memory, when they're usually running at around a couple percent, like one or two percent, five percent compute utilization. But you have to add more just for the memory, because the storage is so slow. >> So, Jim, you were saying before, when we were chatting earlier, that you've had 20 years of experience looking at different storage vendors and working with them. What were some of the criteria? You talked about the speed and the performance, but you also mentioned, John, that green is an important component of the way that you build data centers. Where was Pure's vision on sustainability, Evergreen? Where was that a factor in the decision to go with Pure? >> If you look at Pure's power density requirements and things like that, I think it's important. One thing that also applies, from the sustainability perspective: a lot of other storage vendors say that they're horizontally scalable forever, but they're actually running different heads, in a variety of different ways.
Pure is the only storage vendor I've ever come across that is truly horizontally scalable. And when you start to try to build stuff like that, you get into all the hard problems of supercomputing, where you've got, you know, split-brain scenarios and fencing, and it's very complex. But their ability to scale horizontally, with not even disk, just the storage, is something that was really important to us. >> I think the other thing that's certainly interesting for our customers is that these are important workloads they're driving, and so the ability to do in-place upgrades, business continuity, right, to make sure that we're able to deliver them technology that doesn't disrupt their business when their business needs the results, is critically important. So Pure is a great choice for us from that perspective, and the innovations they're driving on that side of the business have really been helpful. >> I read a stat on the Pure website where users of Core Scientific infrastructure are seeing performance improvements of up to 800%. Are you delighting the heck out of data scientists now? >> Yeah, I mean. >> Are those the primary users? >> That again references what we see with people using GPUs in the public cloud. Again, going back to the thing that I keep hammering on: driving data into that GPU. We had one customer that had somewhere around 14 or 15 GPUs running an analytics application in the public cloud, and we told them, keep all your CPU compute in one of the largest cloud providers, but move just your GPU compute to us. And they went from 14 or 15 GPUs down to two, V100s in a DGX-1, backed by Pure Storage with Arista, and going from 14 GPUs to two GPUs, they saw an 800% increase in performance. >> Wow. >> And there's a really important additional part to that. Let's say I'm running a dashboard or running a query, and a .5-second query gets an 800% increase in performance: how much do I really care? Now, if I'm the guy running 100 queries every single day, I probably do. But it's not just that. It doesn't just speed things up; it allows you to look at data you were never able to look at before. So it's not just that they got an 800% performance increase, it's that instead of having tables with 100s of millions of rows, they can now have tables with billions of rows. So data that was previously not looked at, data that was previously not turned into actionable information to help drive their business, they're now getting visibility into; data they didn't have access to before. >> So you're a CDO, and it sounds like you have technical chops. >> Yeah, I'm a tech nerd at heart. >> It's kind of rare, actually, for a CDO. I've interviewed a lot of CDOs, and most of them come from a data-quality background, or a governance and compliance world. They don't dress like you. (laughs) They dress like I do. (laughs) Even quite a bit better. But the reason I ask is that it sounds like you're a different type of CDO. Even in a business like yours, I almost think you're a data scientist. So describe your role. >> I've actually held, well, I was with the company from the beginning, so I've held quite a few roles, actually. I think this might be my third title at this point. >> Okay. >> But in general, I'm a very technical person. I'm hands on, I love technology. I've held CTO titles in the past as well. >> Dave: Right.
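Jim's point, that the game is feeding the GPU rather than adding compute, is worth a concrete illustration. A minimal PyTorch sketch (the dataset is a synthetic stand-in) showing the loader-side knobs that decide whether a GPU stays saturated:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for an image dataset: 1,000 samples of 3x64x64 floats.
dataset = TensorDataset(torch.randn(1_000, 3, 64, 64),
                        torch.randint(0, 10, (1_000,)))

loader = DataLoader(
    dataset,
    batch_size=64,
    shuffle=True,
    num_workers=4,    # parallel readers; this is where storage IOPS show up
    pin_memory=True,  # page-locked buffers allow asynchronous copies to GPU
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for images, labels in loader:
    images = images.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
    # ... forward/backward pass would go here. If the GPU idles between
    # batches, the fix usually lives in the loader and the storage,
    # not in the model.
```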
>> But I've always been very interested in data, and interested in storage, because that's where data lives, and it's a great fit for me. >> So I've always been interested in this, because you know, the narrative is that CDOs shouldn't be technical, they should be business, and I get all that. But the flip side is, when you talk to CDOs about AI projects, which is, you know, not digital transformation but specifically AI projects, most CDOs in healthcare, financial services, even government are not intimately involved. They're kind of like, yeah, Chief Data Officer, we'll let you know when we have a data quality problem. And I don't think that's right. I mean, the CDO should be intimately involved. >> I agree. >> In those AI projects. >> I think a lot of times, if you ask a lot of people, are you interested in deploying AI in your organization, the answer is 100% yes. And then the next follow-up question is, what would you like to do with it? And most of the time the answer is, we don't know. So what I have found is that when I go into organizations, I don't ask if people want to use AI. I ask, what are your problems? What problems are you facing, what KPIs are you trying to optimize for? There are some problems on that list that might not be helped by AI, but usually there are problems on that list that can be helped by AI, with the right data in the right place. >> So my translation of what you're asking is, how can you make more money? (laughs) >> That's what it comes down to. >> That's what you're asking: how can you cut costs or raise revenue? That's really ultimately what you're getting to. >> Data. >> Find new customers. I think the other interesting thing about our partnership with Pure, especially with regards to AIRI: AIRI is an exciting technology, but for a lot of companies looking to get started in AI, there's almost this moment of pause. How do I get started? And then if I look at some of the greatest technology out there, it's like, okay, well now I have to retrofit my data center to get it in here, right? There's a bunch of technical barriers that slow down the progression, and what we've been able to do with AIRI and the cloud is really help people jumpstart, to get started right away. So rather than, you know, thinking for six months or 12 months or 18 months about what you would analyze, start analyzing, get started, and you can do it on a very cost-effective OpEx model, as opposed to a capital-intensive CapEx model. >> Alright, so I've got to ask you. >> Yeah. >> And Pure will be pissed off that I'm asking this question, because you're talking about AIRI as, well, it's real, and I want some color on that. But I felt like when the first announcement came out with NVIDIA, it was rushed, so that Pure could have another first. (laughs) Ink was drying, like, we beat the competition. But the way you're talking, AIRI is real: you're using it, it's a tangible solution. It's a value to your business. >> It's a core solution in our facility. >> Dave: That was a year ago. >> It's a core thing that we go to market with, and it's something that, you know, we're seeing customer demand for, to go out and really start to drive some business value. So you know, absolutely. >> A core component of helping them jumpstart that AI. Well, you guys just, I think an hour or so ago, announced your new partnership level with Pure.
John, take us away as we wrap here with the news, please. >> Yeah, so we're really excited. We're one of a handful of elite-level MSP partners for Pure. I think there's only a few of us in the world, so that's something, and we're really the ones focused on bringing AIRI to the cloud. So it's a unique partnership. It's a deep partnership, and it allows us to really coordinate our technical teams, our sales teams, you know, and bring this technology across the industry. And so we're excited; it's just the start, but it's a great start, and we're looking forward to nothing but upside from here. >> Fantastic. You'll have to come back, guys, and talk to us about a customer who's done a jumpstart with AIRI and is just taking the world by storm. So we thank you both for stopping by theCUBE. >> Absolutely, we'd love to do that. >> Lisa: Alright John, Jim, thank you so much for your time. >> Thank you. >> Absolutely. >> John: Really appreciate it. >> For Dave Vellante, I'm Lisa Martin, and you're watching theCUBE from Pure Accelerate 2019. (upbeat techno music)

Published Date : Sep 18 2019


Renee Yao, NVIDIA & Bharat Badrinath, NetApp


 

>> Announcer: Live from Las Vegas, it's theCUBE, covering NetApp Insight 2018. Brought to you by NetApp. >> Welcome back to theCUBE, we are live. We've been here all day at NetApp Insight in Las Vegas at the Mandalay Bay. I'm Lisa Martin with Stu Miniman, and we're joined by a couple of guests. One of our alumni, Bharat Badrinath, the V.P. of Product Solutions and Marketing at NetApp. Hey, Bharat, welcome back. >> Thank you, thanks for having me. >> And we've also got Renee Yao, who is a Senior Product Marketing Manager for Deep Learning and AI Systems at NVIDIA. Renee, welcome to theCUBE. >> Thanks for having me. >> So guys, this is a pretty big event, NetApp's biggest customer-partner event. The keynote was standing room only this morning, five thousand plus people, lots of buzz, lots of momentum. Speaking of momentum, NetApp and NVIDIA just launched an interesting partnership a couple months ago. Bharat, talk to us about how NetApp is working with NVIDIA to really take advantage of AI and allow your customers to do that as well. >> Sure. So, as we started talking to customers and looking at what they were investing in, AI bubbled up, right up to the top. And given our rich history in NFS, high performance NFS, it became an obvious choice for NetApp to invest in this space. So we've been working with NVIDIA for a really long time, probably close to a year, to integrate our products with their DGX-1 supercomputer and provide it as a single package to our customers, which makes it a lot easier for them to deploy their AI instead of waiting months for testing infrastructure, which the data scientists don't want to do. We get them a pre-tested, pre-validated system, and our All Flash FAS, which has been winning multiple awards, along with the recent A800 announcement, was the perfect choice for us to integrate into this architecture. >> Alright, Renee, in the keynote this morning, the futurist said, well, we talked about data as the new oil, and he said AI is the new electricity. Maybe you can speak a little bit as to why this is so important. Having gone to a lot of shows this year, it felt like every single show I go to, I see NVIDIA arm in arm with partners, because there's a huge wave coming. >> Yes, absolutely. I think there was this hype about data, there was this hype about AI, and the years of the big data world creating data are absolutely the foundation for AI, so AI as the new electricity is a very, very good analogy. And let's do some math, shall we? Swiss Federal Railways is a very good customer of ours. For those of you who don't know, they're kind of like the heart, or center, of all the railway traffic going through, serving about 1.2 million passengers on a day-to-day basis. Ensuring their safety is very, very important. Now, they also have a lot of switches that have to be set before a train can go by, and with the tunnels and bridges and switches, they need to make sure that these trains don't collide. So when one train goes by, with 11 switches, that gives you 30 possible routings. Two trains, 900 ways. 80 trains, 10 to the eightieth power of ways. That's more than the observed atoms in the universe. And they actually have more than 10 thousand trains. So think about it: can a human being possibly calculate that much data and that many possibilities in their brain? As smart as we all want to think we are, they turned to DGX, and a full day of simulation on DGX-1 took only 17 seconds to get back results.
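Taking Renee's figures at face value (roughly 30 possible routings per train, multiplying with each additional train), a quick sketch of the combinatorial explosion; the spoken numbers are loose, but the shape is the point:

```python
ROUTINGS_PER_TRAIN = 30          # Renee's figure for one train, 11 switches
ATOMS_IN_UNIVERSE = 10 ** 80     # common order-of-magnitude estimate

for trains in (1, 2, 80):
    ways = ROUTINGS_PER_TRAIN ** trains
    print(f"{trains:>2} trains -> {ways:.2e} possible routing combinations")
# 1 train -> 3.00e+01, 2 trains -> 9.00e+02, 80 trains -> ~1.5e+118

print(ROUTINGS_PER_TRAIN ** 80 > ATOMS_IN_UNIVERSE)  # True: "more than atoms"
```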
And I think that analogy of AI as the new electricity, just talking about the speed of light, is very spot on. >> So this isn't hype anymore; this is actually reality. And you gave a really great example of how a large transportation system is using it to get almost real-time information. Bharat, talk to us about NetApp: a storage history of 26 years, and you guys have really made a lot of pivots in terms of your digital transformation, your cultural transformation. How are you helping, now with the added power of NVIDIA, helping customers, the hype gone, to actually deploy it, live it, and benefit the business from it? >> Yeah, absolutely. I think, as you rightly pointed out, NetApp has made a lot of pivots, right? And I think the latest journey, in terms of empowering our customers with data, has been a very powerful mission for the company. We entered the flash market a little bit later than our competitors, but we have made dramatic progress in that space. In fact, recently, based on the latest IDC report, we were number one in the all-flash market worldwide, so that is quite an accomplishment for a company which was late to the market. And having said that, that's because of the innovation engine that is still alive and well within NetApp. We're announcing, as you've seen in the conference, a lot of new products and technology which are way ahead of what our competitors are offering, but I think it is all hinged on what our customers need. The customer benefits because AI has the profound ability to change how customers operate; their entire operations can transform dramatically overnight. And as Renee pointed out, big data gave the foundation that collected all the data but wasn't able to process it. AI, with the power of NVIDIA and DGX, is able to utilize that data to create those outcomes for customers. And from our perspective, we bring two key value-adds to the space. One, we're able to serve up the data at incredibly high speeds with our award-winning all-flash systems. But more importantly, data today lives everywhere. If you think about it, the edge is becoming even more important. You can't expect an autonomous car to make an instantaneous decision without the backing of data, which means everything can't reside in the cloud; it may be at the edge, and some of it may be in your data center. How do you tie all three together: edge, core, and cloud? And that's where the data fabric, the vision of data fabric that you saw today, comes into the picture. So one is performance, the ability to stream data at the speed the new processors are demanding, at the speed customers are demanding to make business decisions; and the other is edge to core to cloud, our data fabric, which is unique and unparalleled in the industry. >> Now, I'm wondering if you could both bring us inside the customers a little bit. If I think of the traditional storage customer: I need performance, I have more and more data that I need to deal with. But Renee pointed out real outcomes, which is beyond what a traditional storage person would be doing. Who are you working with at the customers? How do they put it together? It almost sounds like you're building a car: I've got the engine, I've got all the pieces, but who helps put this whole solution together? How does the partnership on the customer's side come together? >> That's a great question. I'll give my take and you can jump in, because she's just returned from being on road shows with joint customers and prospects.
So I believe it has to be a joint decision. It's not like IT does it first and the data scientists come in later, although that may be the case in certain instances, or the data scientists start the discussion and then IT gets brought in. In an ideal case, just like building a car, you want all the teams sitting together, making sure they're making the right calls, because every compromise you make at one end will impact the other. So you want to make sure you make the optimal decision end to end. And that's where some of our channel partners come in, who kind of bridge the data scientist team and the IT team. In some cases, customers show up with data scientists and IT teams together, and in some, it's one after the other. >> Absolutely. We see the same thing when we're on the road show. Literally two weeks ago, in Canada, by the way, there was a snowstorm, and it was an unforeseen snowstorm; you don't get snowstorms in October-- >> Yes, even for Canada, it was unforeseen. >> Yeah, and we had a packed room of people coming to learn about AI, and in the audience we absolutely see people from the infrastructure side, from the data center side, from the data scientist side, and they realize that they really have to start talking, because none of them can afford to be reactive. For example, the data scientists want to do the innovation, but they can't just go to the infrastructure guys and say, "Hey, this is my workload, do something about it." And the infrastructure guys don't want to be handed that problem and then not know what to do with it. They really need to be ahead of everything. And I think the interesting thing is, among those four cities that we were in, we saw customers from government, oil and gas, transportation, healthcare, just about any industry you can think of; they're all here. One specific example: a company that came to us has about 15 petabytes of data, storing 20 years of historical data, and they only have two staff, and they were not hiring more staff. They were like, "We just want something that's going to work, without us having to know everything, so just give us a solution that can easily scale up and out and enable us to continue to store more data, manage more data, and get insights out of the data fast." So they came to both of us; it was just a very natural decision. That's why we have a partnership together as well. >> So you guys talked about connecting the data scientists with the infrastructure folks. Where's the business involved in this conversation? In terms of, we want to identify new products and services, to deliver faster than our competition, new markets. Talk to us about this: are the data scientists and the infrastructure guys and girls following business initiatives that have been set, or are the business leaders involved in these joint conversations? >> Go ahead, you take it. >> Sure. So, I think we see both. We definitely see top-level executives saying, this is our initiative and we have to do it, and they will make the decision to refresh the infrastructure from the ground up to make sure they're supporting their data scientists' innovation.
We've also seen brilliant minds, researchers, data scientists, doing amazing things, and then rolling it up to the VP level and then to the CEO level to say that this has to be done. For example, that simulation with the 17-second result: things that people used to not be able to do in a lifetime, they can now do in seconds. That kind of innovation just cannot be ignored. >> Yeah, we see the same thing. In fact, any team that has possession of that data, or is accountable for that data, is usually the one driving the decisions. Because as you mine the data, as you start deploying new techniques, you realize new opportunities, which means the business gets more interested in it, and vice versa: if the business is interested, they're going to look for those answers within the data that they have. >> So, last thing. Renee, you were on the Women in Tech panel that ended yesterday; Bharat and I were both in the audience, and one of the things that I thought was really inspiring about your story is that you gave us, the audience, an interesting example of a TV opportunity that you were inspired to take on by the CEO of NVIDIA. Give our audience who didn't have a chance to see that panel a little bit of that story, in the last minute, and how you were able to step forward and go, "I'm going to try this." >> Yeah, of course. I think that brings us back to a concept that we have at NVIDIA, the speed-of-light concept: you really have to learn and act at the speed of light, just like our GPUs, with extreme performance. And obviously, at that speed, none of us knows everything. So what Jensen, our CEO, shared with us in an internal all-hands meeting was that none of us here is qualified to do our jobs, maybe besides his legal counsel and CFO. All of us are here to learn, and we need to learn as fast and as much as we can. And we can't just let the competition determine where our limit is; instead, the limit is what is possible. So that is very much a fundamental mindset change in this AI revolution. >> Well, thanks so much, Renee and Bharat, for stopping by and sharing with us the exciting things that you guys are doing with NetApp. We look forward to talking with you again soon. >> Thank you. >> Me too, thanks. >> For Stu Miniman, I'm Lisa Martin. You're watching theCUBE, live from NetApp Insight 2018 in Las Vegas. Stu and I will be right back with our next guests after a short break. (techno music)

Published Date : Oct 23 2018


DDN Crowdchat | October 11, 2018


 

(uptempo orchestral music) >> Hi, I'm Peter Burris, and welcome to another Wikibon theCUBE special feature: a special digital community event on the relationship between AI, infrastructure and business value. It's sponsored by DDN, with participation from NVIDIA, and over the course of the next hour we're going to reveal something about this special and evolving relationship between sometimes tried-and-true storage technologies and the emerging potential of AI, as we try to achieve these new business outcomes. To do that, we're going to start off with a series of conversations with some thought leaders from DDN and from NVIDIA, and at the end, we're going to go into a crowd chat, and this is going to be your opportunity to engage these experts directly: ask your questions, share your stories, find out what your peers are thinking and how they're achieving their AI objectives. That's at the very end, but to start, let's begin the conversation with Kurt Kuckein, who is a senior director of marketing at DDN. >> Thanks Peter, happy to be here. >> So tell us a little bit about DDN at the start. >> So DDN is a storage company that's been around for 20 years. We've got a legacy in high performance computing, and that's where we see a lot of similarities with this new AI workload. DDN is well known in that HPC community. If you look at the top 100 supercomputers in the world, we're attached to 75% of them. And so we have a fundamental understanding of that type of scalable need, and that's where we're focused. We're focused on performance requirements. We're focused on scalability requirements, which can mean multiple things: it can mean the scaling of performance, it can mean the scaling of capacity, and we're very flexible. >> Well, let me stop you and say, so you've got a lot of customers in the high performance world, and a lot of those customers are at the vanguard of moving to some of these new AI workloads. What are customers saying? With this significant engagement that you have with the best and the brightest out there, what are they saying about this transition to AI? >> Well, I think it's fascinating that we have a bifurcated customer base here, where we have those traditionalists who have probably been looking at AI for over 40 years. They've been exploring this idea, and they've gone through the peaks and troughs in the promise of AI, and then contraction because CPUs weren't powerful enough. Now we've got this emergence of GPUs in the supercomputing world, and if you look at how the supercomputing world has expanded in the last few years, it is through investment in GPUs. And then we've got an entirely different segment, which is a much more commercial segment, and they may be newly invested in this AI arena. They don't have the legacy of 30, 40 years of research behind them, and they are trying to figure out exactly what to do here. A lot of companies are coming to us: hey, I have an AI initiative. Well, what's behind it? We don't know yet, but we've got to have something. And they don't yet understand where this infrastructure is going to come from. >> So the general availability of AI technologies, and obviously flash has been a big part of that, very high-speed networks within data centers, and virtualization certainly helps as well, now opens up the possibility of bringing these algorithms, some of which have been around for a long time and used to require very specialized, bespoke configurations of hardware, to the enterprise. That still begs the question.
There are some differences between high performance computing workloads and AI workloads. Let's start with the similarities, and then let's explore some of the differences. >> So the biggest similarity, I think, is that it's an intractably hard IO problem, at least from the storage perspective. It requires a lot of high throughput, and depending on where those IO characteristics come from, it can be a very small-file, IOPS-intensive type of workflow, but it needs the entire infrastructure to deliver all of that seamlessly, end to end. >> So really high performance throughput, so that you can get to the data you need and keep the computing element saturated. >> Keeping the GPU saturated is really the key. That's where the huge investment is. >> So how do AI and HPC workloads differ? >> Where they are fundamentally different is that AI workloads often operate on a smaller scale in terms of the amount of capacity, at least today's AI workloads, right? As soon as a project encounters success, our forecast is that those things will take off, and you'll want to apply those algorithms against bigger and bigger data sets. But today, we encounter things like 10-terabyte data sets, 50-terabyte data sets, and a lot of customers are focused only on that. But what happens when you're successful? How do you scale your current infrastructure to petabytes and multi-petabytes when you'll need it in the future? >> So when I think of HPC, I think of often very, very big batch jobs and very, very large, complex data sets. When I think about AI, like image processing or voice processing or whatever else it might be, I think of lots of small files, randomly accessed, that nonetheless require some very complex processing that you don't want to have to restart all the time. Have I got that right? >> You've got that right. Now, one misconception, I think, is that on the HPC side, that whole random small-file thing has come in over the last five, 10 years, and it's something DDN has been working on quite a bit. Our legacy was in high-performance throughput workloads, but the workloads have evolved so much on the HPC side as well, and as you posited at the beginning, so much of it has become AI and deep learning research.
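A toy illustration of the small-file point Kurt is making: the same number of bytes costs far more when it arrives as many random 4 KiB reads than as one sequential stream, which is why IOPS, not just throughput, matter for these workloads. The file name and sizes are arbitrary test values, and the OS page cache will flatter both numbers on a freshly written file:

```python
import os
import random
import time

PATH = "io_test.bin"                 # arbitrary scratch file
SIZE = 64 * 1024 * 1024              # 64 MiB of random bytes
with open(PATH, "wb") as f:
    f.write(os.urandom(SIZE))

def sequential_read(chunk=1024 * 1024):
    # One pass of large sequential reads: throughput-bound.
    with open(PATH, "rb") as f:
        while f.read(chunk):
            pass

def random_small_reads(n=16384, chunk=4096):
    # Same total bytes (16384 x 4 KiB = 64 MiB), but random seeks: IOPS-bound.
    with open(PATH, "rb") as f:
        for _ in range(n):
            f.seek(random.randrange(0, SIZE - chunk))
            f.read(chunk)

for name, fn in [("sequential 1 MiB reads", sequential_read),
                 ("random 4 KiB reads", random_small_reads)]:
    start = time.perf_counter()
    fn()
    print(f"{name}: {time.perf_counter() - start:.2f}s")

os.remove(PATH)
```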
>> Kurt, thank you very much, great conversation. How did this turn into value for users? Well let's take a look at some use cases that come out of these technologies. >> DDN A3I within video DGX-1 is a fully integrated and optimized technology solution that provides an enable into acceleration for a wide variety of AI and the use cases in any scale. The platform provides tremendous flexibility and supports a wide variety of workflows and data types. Already today, customers in the industry, academia and government all around the globe are leveraging DDN A3I within video DGX-1 for their AI and DL efforts. In this first example used case, DDN A3I enables the life sciences research laboratory to accelerate through microscopic capture and analysis pipeline. On the top half of the slide is the legacy pipeline which displays low resolution results from a microscope with a three minute delay. On the bottom half of the slide is the accelerated pipeline where DDN A3I within the video DGX-1 delivers results in real time. 200 times faster and with much higher resolution than the legacy pipeline. This used case demonstrates how a single unit deployment of the solution can enable researchers to achieve better science and the fastest times to results without the need to build out complex IT infrastructure. The white paper for this example used case is available on the DDN website. In the second example used case, DDN A3I with NVIDIA DGX-1 enables an autonomous vehicle development program. The process begins in the field where an experimental vehicle generates a wide range of telemetry that's captured on a mobile deployment of the solution. The vehicle data is used to train capabilities locally in the field which are transmitted to the experimental vehicle. Vehicle data from the fleet is captured to a central location where a large DDN A3I within video DGX-1 solution is used to train more advanced capabilities, which are transferred back to experimental vehicles in the field. The central facility also uses the large data sets in the repository to train experimental vehicles and simulate environments to further advance the AV program. This used case demonstrates the scalability, flexibility and edge to data center capability of the solution. DDN A3I within video DGX-1 brings together industry leading compute, storage and network technologies, in a fully integrated and optimized package that makes it easy for customers in all industries around the world to pursue break from business innovation using AI and DL. >> Ultimately, this industry is driven by what users must do, the outcomes if you try to seek. But it's always is made easier and faster when you got great partnerships working on some of these hard technologies together. Let's hear how DDN and NVIDIA are working together to try to deliver new classes of technology capable of making these AI workloads scream. Specifically, we've got Kurt Kuckein coming back. He's a senior director of marketing for DDN and Darrin Johnson who is global director of technical marketing for NVIDIA in the enterprise and deep learning. Today, we're going to be talking about what infrastructure can do to accelerate AI. And specifically we're going to use a relationship. A virgin relationship between DDN and NVIDIA to describe what we can do to accelerate AI workloads by using higher performance, smarter and more focused infrastructure for computing. Now to have this conversation, we've got two great guest here. 
We've got Kurt Kuckein, who is the senior director of marketing at DDN. And also Darrin Johnson, who's the global director of technical marketing for enterprise at NVIDIA. Kurt, Darrin, welcome to theCUBE. >> Thank you very much. >> So let's get going on this, 'cause this is a very, very important topic, and I think it all starts with this notion that there is a relationship that you guys put forward. Kurt, why don't you describe it. >> Sure, well, so what we're announcing today is DDN's A3I architecture, powered by NVIDIA. So it is a full rack-level solution, a reference architecture that's been fully integrated and fully tested to deliver an AI infrastructure very simply, very completely. >> So if we think about why this is important: AI workloads clearly put special stress on underlying technology. Darrin, talk to us a little bit about the nature of these workloads, and why in particular things like GPUs and other technologies are so important to make them go fast. >> Absolutely, and as you probably know, AI is all about the data. Whether you're doing medical imaging, whether you're doing natural language processing, whatever it is, it's all driven by the data. The more data that you have, the better results that you get, but to drive that data into the GPUs, you need great IO, and that's why we're here today, to talk about DDN and the partnership, and how to bring that IO to the GPUs on our DGX platforms. >> So if we think about what you describe: a lot of small files, often randomly distributed, with nonetheless very high profile jobs that just can't stop midstream and start over. >> Absolutely, and if you think about the history of high performance computing, which is very similar to AI, really IO is just that. Lots of files. You have to get it there. Low latency, high throughput, and that's why DDN's nearly 20 years of experience working in that exact same domain is perfect. Because you get the parallel file system, which gives you that throughput, gives you that low latency. Just helps drive the GPU. >> So you mentioned HPC, from 20 years of experience. Now it used to be that with HPC you'd have a scientist with a bunch of graduate students setting up some of these big, honking machines. But now we're moving into the commercial domain. You don't have graduate students running around. You have lower cost, high quality people, a lot of administrators, nonetheless capable people, but with a lot to learn. So how does this relationship actually start making or bringing AI within reach of the commercial world? Kurt, why don't you-- >> Yeah, that's exactly where this reference architecture comes in. So a customer doesn't need to start from scratch. They have a design now that allows them to quickly implement AI. It's something that's really easily deployable. We fully integrated the solution. DDN has made changes to our parallel file system appliance to integrate directly with the DGX-1 environment. That makes it even easier to deploy from there, and to extract the maximum performance out of this without having to run around tuning a bunch of knobs, changing a bunch of settings. It's really going to work out of the box. >> And NVIDIA has done more than the DGX-1. It's more than hardware. You've done a lot of optimization of different AI toolkits, et cetera, so talk a little bit about that, Darrin. >> Going back to the example of researchers in the past with HPC: what we have today are data scientists. Data scientists understand PyTorch, they understand TensorFlow, they understand the frameworks.
They don't want to understand the underlying file system, networking, RDMA, InfiniBand, any of that. They just want to be able to come in, run their TensorFlow, get the data, get the results, and just keep churning that, whether it's a single GPU or 90 DGXs or as many DGXs as you want. So this solution helps bring that to customers much more easily, so those data scientists don't have to be system administrators. >> So it's the reference architecture that makes things easier, but it's more than just for some of these commercial things. It's also the overall ecosystem: new applications fire up, application developers. How is this going to impact the aggregate ecosystem that's growing up around the need to do AI-related outcomes? >> Well, I think one point that Darrin was getting to there, and one of the big effects, is also as these ecosystems reach a point where they're going to need to scale. That's somewhere where DDN has tons of experience. So many customers are starting off with smaller datasets. They still need the performance, and a parallel file system in that case is going to deliver that performance. But then also as they grow, going from one GPU to 90 DGXs is going to be an incredible amount of both performance scalability that they're going to need from their IO, as well as probably capacity scalability. And that's another thing that we've made easy with A3I: being able to scale that environment seamlessly within a single namespace, so that people don't have to deal with a lot of, again, tuning and turning of knobs to make this stuff work really well and drive those outcomes that they need as they're successful. In the end, it is the application that's most important to both of us, right? It's not the infrastructure. It's making the discoveries faster. It's processing information out in the field faster. It's doing analysis of the MRI faster. Helping the doctors, helping anybody who is using this to really make faster decisions, better decisions. >> Exactly. >> And just to add to that: in the automotive industry, you have datasets that are 50 to 500 petabytes, and you need access to all that data, all the time, because you're constantly training and retraining to create better models, to create better autonomous vehicles, and you need the performance to do that. DDN helps bring that to bear, and this reference architecture simplifies it, so you get the value add of NVIDIA GPUs plus its ecosystem software plus DDN. It's a match made in heaven. >> Kurt, Darrin, thank you very much. Great conversation. To learn more about what they're talking about, let's take a look at a video created by DDN to explain the product and the offering. >> DDN A3I with NVIDIA DGX-1 is a fully integrated and optimized technology solution that enables and accelerates end-to-end data pipelines for AI and DL workloads of any scale. It is designed to provide extreme amounts of performance and capacity, backed by a jointly engineered and validated architecture. Compute is the first component of the solution. The DGX-1 delivers over one petaflop of DL training performance, leveraging eight NVIDIA Tesla V100 GPUs in a 3RU appliance. The GPUs are configured in a hybrid cube mesh topology using the NVIDIA NVLink interconnect. DGX-1 delivers linearly predictable application performance and is powered by the NVIDIA DGX software stack. DDN A3I solutions can scale from single to multiple DGX-1s. Storage is the second component of the solution.
The DDN AI200 is an all-NVMe parallel file storage appliance that's optimized for performance. The AI200 is specifically engineered to keep GPU computing resources fully utilized. The AI200 ensures maximum application productivity while easily managing day-to-day data operations. It's offered in three capacity options in a compact 2RU chassis. The AI200 appliance can deliver up to 20 gigabytes a second of throughput and 350,000 IOPS. The DDN A3I architecture can scale up and out seamlessly over multiple appliances. The third component of the solution is a high performance, low latency, RDMA-capable network. Both EDR InfiniBand and 100 gigabit Ethernet options are available. This provides flexibility, ensuring seamless scaling and easy integration of the solution within any IT infrastructure. DDN A3I solutions with NVIDIA DGX-1 bring together industry leading compute, storage and network technologies in a fully integrated and optimized package that's easy to deploy and manage. It's backed by deep expertise and enables customers to focus on what really matters: extracting the most value from their data with unprecedented accuracy and velocity. >> Always great to hear about the product. Let's hear the analyst's perspective. Now I'm joined by Dave Vellante, my colleague here at Wikibon and co-CEO of SiliconANGLE. Dave, welcome to theCUBE. Dave, a lot of conversations about AI. What is it about today that is making AI so important to so many businesses? >> Well, I think it's three things, Peter. The first is the data. We've been on this decade-long Hadoop bandwagon, and what that did is really focus organizations on putting data at the center of their business, and now they're trying to figure out, okay, how do we get more value out of that? So the second piece of that is that technology is now becoming available. AI of course has been around forever, but the infrastructure to support it, GPUs, the processing power, flash storage, deep learning frameworks like TensorFlow, have really started to come to the marketplace. So the technology is now available to act on that data. And I think the third is people are trying to get digital right. This is all about digital transformation. Digital meets data. We talk about that all the time, and every corner office is trying to figure out what their digital strategy should be. So they're trying to remain competitive, and they see automation, and artificial intelligence, machine intelligence, applied to that data as a linchpin of their competitiveness. >> So a lot of people talk about the notion of data as a source of value, and the presumption that it's all going to the cloud. Is that accurate? >> Oh yes, it's funny that you say that, because as you know, we've done a lot of work on this, and I think the thing that's important that organizations have realized in the last 10 years is that the idea of bringing five megabytes of compute to a petabyte of data is far more valuable. And as a result the pendulum is really swinging in many different directions. One being the edge: data is going to stay there, and certainly the cloud is a major force. And most of the data still today lives on premises, and that's where most of the data is likely going to stay. And so no, all the data is not going to go into the cloud. >> It's not the central cloud? >> That's right, the central public cloud. You can redefine the boundaries of the cloud, and the key is you want to bring that cloud-like experience to the data.
We've talked about that a lot in the Wikibon and Cube communities, and that's all about the simplification and cloud business models. >> So that suggests pretty strongly that there is going to continue to be a relationship between choices about hardware infrastructure on premises and the success at making some of these advanced, complex workloads run and scream, and really drive some of those innovative business capabilities. As you think about that, what is it about AI technologies, or AI algorithms and applications, that has an impact on storage decisions? >> Well, the characteristics of the workloads are oftentimes going to be largely unstructured data in small files. There's going to be a lot of those small files, and they're going to be randomly distributed, and as a result, that's going to change the way in which people design systems to accommodate those workloads. There's going to be a lot more bandwidth. There's going to be a lot more parallelism in those systems in order to accommodate that and keep those processors busy. And yeah, as we're going to talk about, the workload characteristics are changing, so the fundamental infrastructure has to change as well. >> And so our goal ultimately is to ensure that we keep these new high performing GPUs saturated by flowing data to them without a lot of spiky performance throughout the entire subsystem. Have we got that right? >> Yeah, I think that's right, and when I was talking about parallelism, that's what you want to do. You want to be able to load up that processor, especially these alternative processors like GPUs, and make sure that they stay busy. The other thing is, when there's a problem, you don't want to have to restart the job. So you want to have real time error recovery, if you will. And that's been crucial in the high performance world for a long, long time, because these jobs, as you know, take a long, long time. To the extent that you don't have to restart a job from ground zero, you can save a lot of money. >> Yeah, especially, as you said, as we start to integrate some of these AI applications with some of the operational applications that are actually recording the results of the work that's being performed, or the prediction that's being made, or the recommendation that's being offered. So I think ultimately, if we start thinking about this crucial role that AI workloads are going to have in business, and that storage is going to have on AI, moving more processing closer to data, et cetera, that suggests that there are going to be some changes in the offerings from the storage industry. What is your thinking about how the storage industry is going to evolve over time? >> Well, there's certainly a lot of hardware stuff that's going on. We always talk about software-defined, but hey, hardware stuff matters. Obviously flash storage changed the game from spinning mechanical disk, and that's part of this. Also, as I said before, we're seeing a lot more parallelism, and high bandwidth is critical. A lot of the discussion that we're having in our community is the affinity between HPC, high performance computing, and big data, and I think that was pretty clear, and now that's evolving to AI. So the internal network, things like InfiniBand, are pretty important. NVIDIA is coming onto the scene. So those are some of the things that we see. I think the other one is file systems. NFS tends to deal really well with unstructured data and data that is sequential. When you have all the-- >> Streaming.
>> Exactly, and you have all of this, what we just described as random in nature, and you have the need for parallelism. You really need to rethink file systems. File systems are, again, a linchpin of getting the most out of these AI workloads. And the other is, if we talk about the cloud model, you've got to make this stuff simple. If we're going to bring AI and machine intelligence workloads to the enterprise, it's got to be manageable by enterprise admins. You're not going to be able to have a scientist be the one to deploy this stuff, so it's got to be simple, or cloud-like. >> Fantastic. Dave Vellante, Wikibon. Thanks very much for being on theCUBE. >> My pleasure. >> We've had the analyst's perspective. Now let's take a look at some real numbers. Not a lot of companies have delivered a rich set of benchmarks relating AI, storage and business outcomes. DDN has. Let's take a look at a video that they prepared describing the benchmarks associated with these new products. >> DDN A3I with NVIDIA DGX-1 is a fully integrated and optimized technology solution that provides massive acceleration for AI and DL applications. DDN has engaged in extensive performance and interoperability testing programs in close collaboration with expert technology partners and customers. Performance testing has been conducted with synthetic throughput and IOPS workloads. The results demonstrate that the DDN A3I parallel architecture delivers over 100,000 IOPS and over 10 gigabytes per second of throughput to a single DGX-1 application container. Testing with multiple containers demonstrates linear scaling up to full saturation of the DGX-1's IO capabilities. These results show concurrent IO activity from four containers, with an aggregate delivered performance of 40 gigabytes per second. The DDN A3I parallel architecture delivers true application acceleration. Extensive interoperability and performance testing has been completed with a dozen popular DL frameworks on DGX-1. The results show that with the DDN A3I parallel architecture, DL applications consistently achieve a higher training throughput and faster completion times. In this example, Caffe achieves almost eight times higher training throughput on DDN A3I, and it completes over five times faster than when using a legacy file sharing architecture and protocol. Comprehensive tests and results are fully documented in the DDN A3I solutions guide, available from the DDN website. This test illustrates the DGX-1 GPU utilization and read activity from the AI200 parallel storage appliance during a TensorFlow training iteration. The green line shows that the DGX-1 GPUs achieve maximum utilization throughout the test. The red line shows the AI200 delivers a steady stream of data to the application during the training process. In the graph below, we show the same test using a legacy file sharing architecture and protocol. The green line shows that the DGX-1 never achieves full GPU utilization, and that the legacy file sharing architecture and protocol fails to sustain consistent IO performance. These results show that with DDN A3I, this DL application on the DGX-1 achieves maximum GPU productivity and completes twice as fast. This test and result is also documented in the DDN A3I solutions guide, available from the DDN website. DDN A3I solutions with NVIDIA DGX-1 bring together industry leading compute, storage and network technologies in a fully integrated and optimized package that enables widely used DL frameworks to run faster, better and more reliably.
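GPU-utilization traces like the ones the video describes are straightforward to collect yourself. Below is a minimal sampling sketch; it assumes a system with NVIDIA drivers installed, where the nvidia-smi tool reports per-GPU utilization (the one-second interval and 60-sample window are arbitrary choices).

```python
import subprocess
import time

def sample_gpu_utilization():
    """Return a list of per-GPU utilization percentages reported by nvidia-smi."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=utilization.gpu", "--format=csv,noheader,nounits"],
        text=True,
    )
    return [int(line) for line in out.strip().splitlines()]

# Sample once a second while a training job runs. A sustained dip below
# ~100% across all GPUs usually points at the storage or input pipeline,
# which is exactly what the legacy-filer trace in the video shows.
for _ in range(60):
    print(sample_gpu_utilization())
    time.sleep(1)
```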
>> You know, it's great to see real benchmarking data, because this is a very important domain, and there is not a lot of benchmarking information out there around some of these other products that are available. But let's try to turn that benchmarking information into business outcomes. And to do that, we've got Kurt Kuckein back from DDN. Kurt, welcome back. Let's talk a bit about how these high-value outcomes that businesses seek with AI are going to be achieved as a consequence of this new performance, faster capabilities, et cetera. >> So there are a couple of considerations. The first consideration, I think, is just the selection of AI infrastructure itself. Right, we have customers telling us constantly that they don't know where to start. Now they have readily available reference architectures that tell them, hey, here's something you can implement, get it installed quickly, and you're up and running your AI from day one. >> So the decision process for what to get is reduced. >> Exactly. >> Okay. >> Number two is, you're unlocking all ends of the investment with something like this, right. You're maximizing the performance on the GPU side, you're maximizing the performance on the ingest side for the storage. You're maximizing the throughput of the entire system. So you're really gaining the most out of your investment there. And not just gaining the most out of your investment, but truly accelerating the application, and that's the end goal, right, that we're looking for with customers. Plenty of people can deliver fast storage, but if it doesn't impact the application and deliver faster results, cut run times down, then what are you really gaining from having fast storage? And so that's where we're focused. We're focused on application acceleration. >> So simpler architecture, faster implementation based on that, integrated capabilities, ultimately all resulting in better application performance. >> Better application performance, and in the end, something that's more reliable as well. >> Kurt Kuckein, thanks so much for being on theCUBE again. So that ends our prepared remarks. We've heard a lot of great stuff about the relationship between AI, infrastructure, especially storage, and business outcomes, but here's your opportunity to go into the crowd chat and ask your questions, get your answers, share your stories, engage your peers and some of the experts that we've been talking with about this evolving relationship between these key technologies, and what it's going to mean for business. So I'm Peter Burris. Thank you very much for listening. Let's step into the crowd chat and really engage and get those key issues addressed.
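One point from the analyst segment above deserves a concrete illustration: Dave's remark that real-time error recovery matters because long jobs should never restart from ground zero is, in practice, checkpointing. Here is a minimal, framework-agnostic sketch of the pattern, using only the Python standard library; the file name and epoch count are illustrative.

```python
import os
import pickle

CKPT = "train_state.pkl"  # illustrative path; real jobs write to shared storage

def save_checkpoint(state):
    # Write to a temp file, then rename. The rename is atomic, so a crash
    # mid-save still leaves the previous valid checkpoint to resume from.
    tmp = CKPT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CKPT)

def load_checkpoint():
    if os.path.exists(CKPT):
        with open(CKPT, "rb") as f:
            return pickle.load(f)
    return {"epoch": 0, "weights": None}  # fresh start

state = load_checkpoint()
for epoch in range(state["epoch"], 100):
    # ... one epoch of training would update state["weights"] here ...
    state["epoch"] = epoch + 1
    save_checkpoint(state)  # a failure now costs at most one epoch of work
```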

Published Date : Oct 10 2018


9_20_18 with Peter, Kuckein & Johnson DDN


 

>> Hi everybody, welcome to our theCUBE conversation from our fantastic studios in beautiful Palo Alto, California. Today we're going to be talking about what infrastructure can do to accelerate AI. And specifically, we're going to use a relationship, a burgeoning relationship between DDN and NVIDIA, to describe what we can do to accelerate AI workloads by using higher performance, smarter, and more focused infrastructure for computing. Now to have this conversation, we've got two great guests here. We've got Kurt Kuckein, who's the senior director of marketing at DDN. And also Darren Johnson, who's the global director of technical marketing for Enterprise at NVIDIA. Kurt, Darren, welcome to theCUBE. >> Thanks for having us. >> Thank you very much. >> So let's get going on this, because this is a very, very important topic. And I think it all starts with this notion that there is a relationship that you guys put forth. Kurt, why don't you describe it. >> So what we're announcing today is the DDN A3I architecture, powered by NVIDIA. So it is a full, rack-level solution, a reference architecture that's been fully integrated and fully tested to deliver an AI infrastructure very simply, very completely. >> So if we think about how, or why, this is important: AI workloads clearly put a special stress on underlying technology. Darren, talk to us a little bit about the nature of these workloads, and why in particular things like GPUs and other technologies are so important to make them go fast. >> Absolutely. And as you probably know, AI is all about the data. Whether you're doing medical imaging, or whether you're doing natural language processing, whatever it is, it's all driven by the data. The more data that you have, the better results that you get. But to drive that data into the GPUs, you need great IO. And that's why we're here today, to talk about DDN and the partnership, and how to bring that IO to the GPUs on our DGX platforms. >> So if we think about what you describe: a lot of small files, often randomly distributed, with nonetheless very high profile jobs that just can't stop midstream and start over. >> Absolutely. And if you think about the history of high-performance computing, which is very similar to AI, really IO is just that: lots of files, you have to get it there, low latency, high throughput, and that's why DDN's nearly 20 years of experience working in that exact same domain is perfect. Because you get the parallel file system which gives you that throughput, gives you that low latency, just helps drive the GPU. >> So you mentioned HPC, from twenty years of experience. Now, it used to be that with HPC you'd have some scientists with a bunch of graduate students setting up some of these big, honking machines. But now we're moving into the commercial domain. You don't have graduate students running around. You have a lot of administrators who are nonetheless good people, but want to learn. So, how does this relationship actually start making or bringing AI within reach of the commercial world? Kurt, why don't- >> That's exactly where this reference architecture comes in, right. So a customer doesn't need to start from scratch. They have a design now that allows them to quickly implement AI. It's something that's really easily deployable. We've fully integrated this solution. DDN has made changes to our parallel file system appliance to integrate directly within the DGX-1 environment.
That makes it even easier to deploy from there, and extract the maximum performance out of this without having to run around and tune a bunch of knobs, change a bunch of settings. It's really going to work out of the box. >> And you know, NVIDIA has really done more than just the DGX-1, it's more than hardware. You've done a lot of optimization of different AI toolkits, et cetera, et cetera. Talk a little about that, Darren. >> Yeah so, I mean, going back to the example of researchers in the past with HPC, what we have today are data scientists. Data scientists understand PyTorch, they understand TensorFlow, they understand the frameworks. They don't want to understand the underlying file system, networking, RDMA, InfiniBand, any of that. They just want to be able to come in, run their TensorFlow, get the data, get the results. And just churn that, keep churning that, whether it's a single GPU or 90 DGXs or as many DGXs as you want. So this solution helps bring that to customers much more easily, so those data scientists don't have to be system administrators. >> So, a reference architecture that makes things easier. But it's more than just for some of these commercial things. It's also the overall ecosystem: you have application providers, application developers. How is this going to impact the aggregate ecosystem that's growing up around the need to do AI-related outcomes? >> Well, I think the one point that Darren was getting to there, and one of the big impacts, is also as these ecosystems reach a point where they're going to need to scale. There's somewhere where DDN has tons of experience. So many customers are starting off with smaller data sets. They still need the performance, and the parallel file system in that case is going to deliver that performance. But then also, as they grow, going from one GPU to 90 DGXs is going to be an incredible amount of both performance scalability that they're going to need from their IO, as well as probably capacity scalability. And that's another thing that we've made easy with A3I: being able to scale that environment seamlessly, within a single namespace, so that people don't have to deal with a lot of, again, tuning and turning of knobs to make this stuff work really well and drive those outcomes that they need as they're successful. In the end, it is the application that's most important to both of us. It's not the infrastructure, it's making the discoveries faster, it's processing the information out in the field faster, it's doing analysis of the MRI faster, and helping the doctor, helping anybody who's using this to really make faster decisions, better decisions. >> Exactly. And just to add to that, in the automotive industry, you have data sets that are from 50 to 500 petabytes, and you need access to all that data, all the time, because you're constantly training and retraining to create better models, to create better autonomous vehicles. And you need the performance to do that. DDN helps bring that to bear, and with this reference architecture, simplifies it. So you get the value add of NVIDIA GPUs, plus its ecosystem software, plus DDN. It's a match made in heaven. >> Darren Johnson, NVIDIA, Kurt Kuckein, DDN. Thanks very much for being on theCUBE. >> Thank you very much. >> Glad I could be here. >> And I'm Peter Burris, and once again I'd like to thank you for watching this CUBE Conversation. Until next time.

Published Date : Sep 28 2018


Neil Vachharajani, Pure Storage | CUBEConversation, Sept 2018


 

(upbeat music) >> Hi, I'm Peter Burris. Welcome to another CUBE Conversation from our wonderful studios in beautiful Palo Alto, CA. Today we are going to be talking about new architectures, new disciplines required to really make possible the opportunities associated with digital business. And to do that, we've got Neil Vachharajani, who is the Technical Director at Pure Storage. Neil, welcome to theCUBE. >> Thank you for having me, Peter. >> So Neil, we have spent a fair amount of time within Wikibon and within the CUBE community talking a lot about what digital business is. So, give me a second, let me run something by you, and tell me if you agree. We think that there is a difference between business and digital business. And specifically, we think that difference is that a digital business uses data assets differently than a business does. Walmart beat Sears because it used data differently. And Amazon is putting the pressure on Walmart, because it uses data differently. So that is at the centerpiece of a lot of these digital transformations: how are you using data to re-institutionalize your work, realign your resources, reestablish a new engagement model with your marketplace? Would you agree with that? >> Yeah, I absolutely agree with that, and I think a lot of it has to do with the volume of data, and where the data is coming from. If you look at traditional business, it really was about just putting into computers what we used to do on paper. And digital business today, I think, is about generating huge volumes of data by really looking at every interaction we have, no matter how small or how big. >> So, putting telemetry on as many things as possible. IoT for machines, mobile for human beings. But it used to be, as you said, a known process, unknown technology world for a long time. And now, these are data driven processes. We're actually using data to describe what the next best action should be, what the recommendation should be. >> That's right. >> So, as we think about this, you know, business has been around for a long time. There's this notion of evidence-based management, which is the idea that we use data differently, from the boardroom all the way down to the drivers. How does a business start to bring forward the discipline required to really make possible this data driven world? >> Well, you know, I think the first thing is to really recognize why this new paradigm shift changes things. And I think in the old world, if you looked at a piece of data, you actually could articulate, all the way from the boardroom down to the stockroom, every use of the data. And that meant that you could build a lot of siloed applications, and that wasn't a big deal. You got your money's worth out of the data. So for example, recording transactions in store number 17. >> That's right. >> But in the new world, you actually don't know what the value of the data is ahead of time. Right? In some sense, you're trying to capture a lot of data and then use technology to correlate it with things, mix and match, mash it up, and then drive business decisions that you didn't even know you were going to be making a few weeks ago. And that means that you can't really lock up your data, you can't constrain it, because that's going to limit your possibilities. It's going to limit your ROI on that data.
>> Yeah, we like to say that data as an asset is different from all other assets, because it is inherently sharable, reusable; it doesn't follow the laws of scarcity. And so, in many respects, what the IT organization has had to do is find new ways to privatize that data through things like security, but as you're saying, they don't want to introduce technologies that artificially constrain derivative and future uses of that data. >> And I think that's where, really, the big architectural shift is happening in the data center. Because if you look traditionally, we have siloed the data, and it wasn't like this intentional thing that we wanted to put it into a silo. But that's how we packaged our applications, and that's how we deployed our applications. And now, we need a new discipline inside the data center that makes the data available, lets people put policies on it, like security policies, but then also makes it available for the innovators all throughout the company to get access to that data. You know, we're trying to crystallize this whole philosophy into something we refer to as the data-centric architecture. Where data is at the center, people have access to the data, and then there's just applications all around it that are all hitting this common pool of data and doing different things, driving new business processes. >> Now, you're talking not about a physical pool of data, but rather a logical pool of data. Data is still going to be very distributed, right? >> Well, you know, data gets generated in a distributed way, and data is very large. I think it would be a bit naive to be able to point to one rack in one data center and say all your data is going to be right here in this one rack. >> Or in one cloud. >> Or in one cloud, for that matter. But just from a philosophical perspective, you do want to pull your data out of anything that is, like you said a minute ago, constraining it. So, I think one really good example of this is when we went, quote unquote, web scale, we saw a lot of applications move onto direct attached storage, to dive deep into a technology. And that was great if you wanted to only come in the front door and access the data through the application that was managing that DAS. But if you wanted to do anything else, you were kind of stuck. >> So to summarize this point, we're moving from a world in which data is a place to a world in which data is a service. Have I got that right? >> That's absolutely right. I mean, the way I like to think about it is that data and storage need to really be different things, and storage's job is to give you access to the data. Storage in its own right, you know, doesn't solve a business problem. It's the data that solves the business problem. Storage is the vehicle that gets you there. And so I think it's pretty exciting that there are new technologies that are coming out, or that honestly are here, that are enabling that. Things like flash and NVMe and, you know, their futures. >> Well, let's talk about that, because the observation that I've made to clients for quite some time is that if you go back, disk was a great technology for persisting data. So again, store number 17, transaction at a certain time. It's already occurred, we have to record it. So we record it, we persist it on disk. Now what we are trying to do is utilize technologies that are inherently structured to deliver data, so that we can have the data be very distributed, but still look at it from a logical standpoint.
And have that data be delivered to a lot of applications, whether that is locally or, as long as we don't undermine basic physics, perhaps further away. But even more importantly, deliver it to different roles: the same data being delivered to developers, the same data being delivered to a new application. What are some of those core technologies that are going to be necessary to do this? You mentioned NVMe, let's start there. >> Yeah, if I just back up a little bit: in some sense, even for that recording-the-data workflow that you talked about, we made disk work. But it was actually a pretty challenging medium, and so we put a lot of optimizations and things in place, because we said, we know the usage pattern. And if we know the usage pattern, we know how to organize our data. And so as step one, the transformation that I think is in pretty full swing these days was moving from disk to flash. And that was a huge transformation, because it meant that random access to the data was just as performant as this carefully crafted sequential access. That meant you could start accepting unknown workloads into your applications, but you were still stuck behind this very serial, very antiquated SCSI protocol. And NVMe is now bringing a lot more parallelism into play. And that's going to help us drive just simple, plain old data center stuff: density, and performance density, and power, and that kind of thing. So that's sort of step one in terms of the technology, that you can package all of this stuff in a pretty dense package and put petabytes of storage with enough I/O to actually access that data. Because if you can have petabytes, but you only have one I/O for each gig, well, you're not going to get a lot out of that data. >> So, just to stop right there, that leads to a world in which, as long as you're disciplined and architected properly, you do not have to know what workloads are going to access that data near term. >> Well, you know, that's only step one, right. >> Right. >> Because the other challenge is that very few people access storage directly, right. We hide this behind databases, and we hide this behind a whole bunch of other technologies. Now, those technologies might have their own limitations in place. But we have a lot of really rich things we can do at the storage level to present the same data out of multiple frontends. And so the simplest idea is, we don't have one copy of a database; we often will have the transactional database that's recording those transactions, but then we'll have an analytics copy of the database, and now we need to keep the two of those things in sync. And this is where the discipline and the architecture really come into place. And we kind of have a lot of that figured out for things like relational databases, and best practices there. But in the meantime, the world also moved over to the new world of NoSQL databases, queues, Kafka, things of that nature. And those brought direct attached storage as the best practice. And so I think where the discipline comes in, and where some of the new technologies that we're talking about right now come in, is: how do you bring those old disciplines that we figured out in, let's say, the relational world to bear on the new technologies that are meeting the scale requirements that we have today? >> Well, one of the more important workloads that are going to require scale is, for example, AI.
So, how are we going to organize some of these technologies, add them to these new disciplines, to be able to make some of these AI workloads run really, really fast? >> You know, I think a lot of this really comes down to pulling the storage out and putting it into its own tier. And so Pure Storage has an offering called AIRI, which packages NVIDIA DGX boxes with FlashBlade. And we say, hey, you don't need a whole bunch of direct attached storage, which is siloing your data; you can go put it into this common shared pool. And I think that on, you know, the other side of the house, our FlashArray business is doing something really similar with NVMe. The FlashArray/X is essentially commoditizing NVMe. It's saying everybody has access to this high performance density. And looking into the future with technologies like NVMe over Fabric, what we're really saying is, your apps that used to use direct attached storage, there's no reason why they can't go to a SAN-based architecture that offers rich data services and not compromise one iota on latency. >> Or access, or any number of other activities as well. So we've got NVMe, NVMe over Fabric, flash, new approaches for thinking about packaging some of these things. Are there any other technologies that you envision on the horizon that are going to be really important to customers, and that Pure is going to take advantage of? >> Yeah, you know, I really think the other thing is, once you collect all this stuff, you need a way to tame the beast. You need a way to deploy your applications. You need a way to catalog everything. And honestly, things like Kubernetes and container orchestration are becoming this platform where you deploy all of this stuff. And some of the assumptions that are baked into that really go back and tie in nicely with those other technologies. In particular, they assume that I can schedule this compute wherever I want and I have access to the data. So in that way, having a fabric, if you will, between your compute and your data is essential. And it's just another reason why siloing things off into particular units of compute is really the architecture of the past. And the architecture going forward is going to be to logically centralize. And maybe put some smarts at that other layer, saying, hey, if this data is in the public cloud, let me schedule up there. But if this data is in my data center, let me schedule the compute down there. But then not having to worry about the micro decisions about, does it have to be in this rack, or, you know, on this particular physical node. All your data is accessible.
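Stepping back to Neil's earlier claim that flash makes random access as performant as sequential: that is easy to check for yourself. Here is a minimal microbenchmark using only the Python standard library; the file path is a placeholder, it assumes a test file of at least a few tens of megabytes on the device you want to measure, and note that the OS page cache can mask device behavior on repeat runs.

```python
import os
import random
import time

PATH = "testfile.bin"   # placeholder: point this at a file on the device under test
BLOCK = 4096
READS = 10_000

fd = os.open(PATH, os.O_RDONLY)
size = os.fstat(fd).st_size
assert size >= READS * BLOCK, "test file too small for the sequential pass"

offsets_seq = [i * BLOCK for i in range(READS)]
offsets_rnd = [random.randrange(0, size - BLOCK) // BLOCK * BLOCK for _ in range(READS)]

for name, offsets in [("sequential", offsets_seq), ("random", offsets_rnd)]:
    start = time.perf_counter()
    for off in offsets:
        os.pread(fd, BLOCK, off)   # read one block at the given offset
    elapsed = time.perf_counter() - start
    print(f"{name}: {READS / elapsed:,.0f} reads/sec")

os.close(fd)
```

On a spinning disk the random pass is typically orders of magnitude slower; on flash the two numbers converge, which is exactly the property that lets storage accept unknown workloads.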
If you've got a good networking fabric and you've got a good storage fabric, the end of the day, all your data is accessible, to whatever new application you envision. And you just, there's no reason why you have to lock it up. You mentioned security before. You know, you should absolutely be able to orchestrate things like taking a snapshot of your data, putting it through, masking, or whatever anonymization you need to make it safely accessible to new applications and innovators inside of your company to drive that digital business. >> Yes, and we like to talk about moving from a world that is focused on infrastructure, taking cost out, making it static, by removing all uncertainty to a world where we've no workloads, and elastic capacity, or elastic scale to a plastic world. Where plastic, using of the physicals, you know, the physic sense is unknown workload, unknown scale. And just making sure that we have the option to use data any way we want as much as possible in the future. >> And I think that that's why you see the rise of service catalogs and self service coming up in IT, it's that plasticity that you have the brightest minds in your company trying to figure out what to do, and you don't want to have infrastructure be this bottleneck that's causing everything to go slower. Or for people to say no. You just always want to say, yes. And that's where I think it's always exciting to see, see these technologies, NVMe, come out and say, we've now got the performance to say yes. NVMe over Fabric to say there's no compromise over latency. And then honestly, having this stuff packaged in things like FlashArray/X, where the CIO or the CFO, doesn't complain about breaking the bank as well. Because now these technologies are the status quo. They're the standard. There's no premium for them. And if anyone is trying to charge you that premium, you should really, you know, ask them why. This is the new architecture, this should be, this should be, what, the only thing you offer >> Right. >> In some sense >> Yeah, we're bringing all these new technologies into economic envelope that IT has to be in for business today. >> That's right, and you know, you look at something like flash memory, right. It's not a new technology. I remember in college having a flash card to put into like a digital camera in the early days of digital cameras. But for it to make it into the data center, the thing that was critical was that economic aspect of it. So it's not just about being on the bleeding edge of technology, but it's packaging that in a way that's actually palatable for the entire C-Suite to consume inside your organization. >> And I remember my disk pack that I carried around in college from the PDP system that we had to use. (laughter) Alright, Neil Vachharajani, Technical Director of Pure Storage talking about the relationship between new technologies, data centeric architectures, and digital business. Thanks very much for being on theCUBE. >> Thanks so much Peter. >> And once again, I'm Peter Burris, you've been participating in another CUBE conversation. 'Til we talk again. (upbeat music)

Published Date : Sep 25 2018


Santosh Rao, NetApp | Accelerate Your Journey to AI


 

>> From Sunnyvale, California, in the heart of Silicon Valley, it's theCUBE, covering Accelerate Your Journey to AI. Brought to you by NetApp. >> Hi, I'm Peter Burris. Welcome to another conversation here from the Data Visionary Center at NetApp's headquarters in beautiful Sunnyvale, California. I'm being joined today by Santosh Rao. Santosh is a Senior Technical Director at NetApp. Specifically, Santosh, we're going to talk about some of the challenges and opportunities associated with AI, and how NetApp is making that possible. Welcome to theCUBE. >> Thank you, Peter, I'm excited to be here. Thank you for that. >> So Santosh, what is your role at NetApp? Why don't we start there. >> Wonderful, glad to be here. My name is Santosh Rao, I'm a Senior Technical Director at NetApp, part of the Product Operations group, and I've been here 10 years. My role is to drive up new lines of opportunity for NetApp, build up new product businesses. The most recent one has been AI, so I've been focused on bootstrapping and incubating the AI effort at NetApp for the last nine months. It's been exciting to be part of this effort. >> So nine months of talking, both internally, but spending time with customers too. What are customers telling you about NetApp's opportunities, and what does NetApp have to do to respond to those opportunities? >> That's a great question. We are seeing a lot of focus around expanding the digital transformation to really get value out of the data, and starting to look at AI, and deep learning in particular, as a way to prove the ROI on the opportunities that they've had. AI and deep learning require a tremendous amount of data. We're actually fascinated to see the size of the data sets that customers are starting to look at. A petabyte of data is sort of the minimum data set size. So when you think about petabyte-scale data lakes, the first thing you want to think about is how you optimize the TCO for the solution. NetApp is seen as a leader in that, just because of our rich heritage of storage efficiency. A lot of these are video, image and audio files, so you're seeing a lot of unstructured data in general, and we're a leader in NFS as well. So a lot of that starts to come together from a NetApp perspective. And that's where customers see us as the leader in NFS, the leader in files, and the leader in storage efficiency, all coming together. >> And you want to join that together with some leadership, especially in GPUs, so that leads to NVIDIA. So you've announced an interesting partnership between NetApp and NVIDIA. How does that factor into your products, and where do you think that goes? >> It's kind of interesting how that came about, because when you look at the industry, it's a small place. Some of the folks driving the NVIDIA leadership have been working with us in the past, when we bootstrapped converged infrastructures with other vendors. We're known to have been a 10-year veteran in the converged infrastructure space. The way this came about was, NVIDIA is clearly a leader in GPUs and AI acceleration from a compute perspective. But they also have a long history of GPU virtualization and GPU graphics acceleration. When they look at NetApp, what NetApp brings to NVIDIA is the converged infrastructure, the maturity of that solution, the depth that we have in the enterprise, and the rich partner ecosystem.
All of that starts to come together, and some of the players in this particular case have aligned in the past, working on virtualization-based converged infrastructures. It's an exciting time; we're really looking forward to working closely with NVIDIA. >> So NVIDIA brings these lightning fast machines, optimized for some of the new data types, data forms, data structures associated with AI. But they've got to be fed; you've got to get the data to them. What is NetApp doing, from the standpoint of the underlying hardware, to improve the overall performance and ensure that these solutions really scream for customers? >> Yeah, it's kind of interesting, because when you look at how customers are designing this, they're thinking about digital transformation as: "What is the flow of that data? What am I doing to create new sensors and endpoints that create data? How do I flow the data in? How do I forecast how much data I'm going to create quarter over quarter, year over year? How many endpoints? What is the resolution of the data?" And then, as that starts to come into the data center, they've got to think about where the bottlenecks are. So you start looking at a wide range of bottlenecks. You look at the edge data aggregation, then you start looking at network bandwidth to push data into the core data centers. You've got to think smart about some of these things. For example, no matter how much network bandwidth you throw at it, you want to reduce the amount of data you're moving. Smart data movement technologies like SnapMirror, which NetApp brings to the table, are some things that we uniquely enable compared to others. The fact of the matter is, when you take a common operating system, like ONTAP, and you can layer it across the Edge, Core and Cloud, that gives us some unnatural advantages. We can do things that you can't do in a silo, where you've got a commodity server trying to push data and having to do raw full copies of data into the data center. So we think smart data movement is a huge opportunity. When you look at the core, obviously it's a workhorse, and you've got the random sampling of data onto this hardware. And we think the A800 is a workhorse built for AI. It is a beast of a system in terms of performance: it does about 25 gigabytes per second just on a dual controller pair. You'll recall that we spent a number of years building out the foundation of Clustered ONTAP to allow us to scale to gigantic sizes. So a 24-node, or 12 controller pair, A800 cluster gets us to over 300 gigabytes per second, and over 11 million IOPS, if you think about that. That's about four to six times greater than anybody else in the industry. So when you think about NVIDIA's investment in DGX, and the performance investment they've made there, we think only NetApp can keep up with that in terms of performance. >> So 11 million IOPS, phenomenal performance for today. But the future is going to demand ever more. Where do you think these trends go? >> Well, nobody really knows for sure. The most exciting part of this journey is that nobody knows where this is going. This is where you need to future-proof customers, and you need to enable the technology to have sufficient legs, and the architecture to have sufficient legs, so that no matter how it evolves and where customers go, the vendors working with customers can go there with them. And actually, customers look at NetApp and say, "You guys are working with the cloud partners, you're now working with NVIDIA.
"And in the past you worked with a "variety of data source vendors. "So we think we can work with NetApp because, "you're not affiliated to any one of them, "and yet you're giving us that full range of solutions." So we think that performance is going to be key. Acceleration of compute workloads is going to demand orders of magnitude performance improvement. We think data set efficiencies and storage efficiencies is absolutely key. And we think you got to really look at PCO, because customers want to build these great solutions for the business, but they can't afford it unless vendors give them viable options. So it's really up to partners like NVIDIA and NetApp to work together to give customers the best of breed solutions that reduce the TCO, accelerate compute, accelerate the data pipeline, and yet, bring the cost of the overall solution down, and make it simple to deploy and pre integrated. These are the things customers are looking for and we think we have the best bet at getting there. >> So that leads to... Great summary, but that leads to some interesting observations on what customers should be basing their decisions on. What would you say are the two or three most crucial things that customers need to think about right now as a conceptualized, where to go with their AI application, or AI workloads, their AI projects and initiatives? >> So when customers are designing and building these solutions, they're thinking the entire data lifecycle. "How am I getting this new type of "data for digital transformation? "What is the ingestion architecture? "What are my data aggregation endpoints for ingestion? "How am I going to build out my AI data sources? "What are the types of data? "Am I collecting sensor data? Is it a variety of images? "Am I going to add in audio transcription? "Is there video feeds that come in over time?" So customers are having to think about the entire digital experience, the types of data, because that leads to the selection of data sources. For example, if you're going to be learning sensor data, you want to be looking at maybe graph databases. If you want to be learning log data, you're going to be looking at log analytics over time, as well as AI. You're going to look at video image and audio accordingly. Architecting these solutions requires an understanding of, what is your digital experience? How does that evolve over time? What is the right and optimal data source to learn that data, so that you get the best experience from a search, from an indexing, from a tiering, from analytics and AI? And then, what is the flow of that data? And how do you architect it for a global experience? How do you build out these data centers where you're not having to copy all data maybe, into your global headquarters. If you're a global company with presence across multiple Geo's, how do you architect for regional data centers to be self contained? Because we're looking at exabyte scale opportunities in some of these. I think that's pretty much the two or three things that I'd say, across the entire gamut of space here. >> Excellent, turning that then into some simple observations about the fact that data still is physical. There's latency issues, there's the cost of bandwidth issues. There's other types of issues. This notion of Edge, Core, Cloud. How do you see the ONTAP operating system, the ONTAP product set, facilitating being able to put data where it needs to be, while at the same time creating the options that a customer needs to use data as they need to use it? 
>> The fact of the matter is, these things cannot be achieved overnight. It takes a certain amount of foundational work that, frankly, takes several years. The fact that ONTAP can run on small form-factor hardware at the edge is a journey that we started several years ago. The fact that ONTAP can run on commodity white-box hardware has been a journey that we have run over the last three, four years. Same thing in the Cloud: we have virtualized ONTAP to the point that it can run on all hyperscalers, and now we are at the point where ONTAP can be consumed as a service, where you don't even know that it is an infrastructure product. So the process of building an Edge, Core, and Cloud data pipeline leverages the investments that we've made over time. When you think about the scale of compute, data, and performance needed, that's a five to six year journey in Clustered ONTAP, if you look at NetApp's past. These are all elements that are coming together from a product and solution perspective, but the reality is that it leverages years and years of investment that NetApp engineering has made, in a way that the industry really did not invest in the same areas. So when we compare and contrast what NetApp has done versus the rest of the industry: at a time when people were building monolithic engineered systems, we were building software-defined architectures. At a time when they were building tightly coupled systems for the traditional enterprise, we were building flexible, scale-out systems that assumed you would want to scale in modular increments. Now as the world has shifted from enterprise into third platform and webscale, we're finding all those investments NetApp made over the years are really starting to pay off for us. >> Including some of the investments in how AI can be used to handle how ONTAP operates at each of those different levels of scale. >> Absolutely, yes. >> Santosh Rao, Senior Technical Director at NetApp, talking about AI and some of the new changes in the relationships between AI and storage. Thanks very much for being on theCUBE. >> Thank you, appreciate it.
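As a quick check on the scaling arithmetic quoted above (roughly 25 gigabytes per second per dual-controller pair, scaled across a 12-pair cluster), here is a back-of-the-envelope sketch in Python. The near-linear scaling is the idealization implied by the quoted numbers; real clusters lose some efficiency at scale.

```python
# Back-of-the-envelope check of the A800 cluster figures quoted above.
# Assumes near-linear scaling across controller pairs, which is an
# idealization; real-world efficiency at scale will be somewhat lower.

PER_PAIR_GBPS = 25        # ~25 GB/s per dual-controller (HA) pair, as quoted
PAIRS = 12                # a 24-node cluster is 12 HA pairs

aggregate_gbps = PER_PAIR_GBPS * PAIRS
print(f"Aggregate bandwidth: ~{aggregate_gbps} GB/s")    # ~300 GB/s, as quoted

# How long one full pass over a 1 PB training set would take at that rate:
petabyte_in_gb = 1_000_000
hours = petabyte_in_gb / aggregate_gbps / 3600
print(f"One pass over 1 PB: ~{hours:.1f} hours")         # roughly an hour
```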

Published Date : Aug 1 2018


Partha Seetala, Robin Systems | DataWorks Summit 2018


 

>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back everyone, you are watching day two of theCUBE's live coverage of DataWorks Summit here in San Jose, California. I'm your host, Rebecca Knight, coming at you with my cohost James Kobielus. We're joined by Partha Seetala, he is the Chief Technology Officer at Robin Systems. Thanks so much for coming on theCUBE. >> Pleasure to be here. >> You're a first-timer, so we promise we don't bite. >> Actually I'm not, I was on theCUBE- >> Oh! >> At DockerCon in 2016. >> Oh well excellent, okay, so now you're a veteran, right. >> Yes, ma'am. >> So Robin Systems, as we were talking about before the cameras were rolling, is about four years old, based here in San Jose, a venture-backed company. Tell us a little bit more about the company and what you do. >> Absolutely. First of all, thanks for hosting me here. Like you said, Robin is a Silicon Valley based company. Our focus is on allowing applications, such as big data, databases, NoSQL, and AI/ML, to run within the Kubernetes platform. What we have built is a product that converges storage, complex storage, networking, and application workflow management, along with Kubernetes, to create a one-click experience where users get a managed-services kind of feel when they're deploying these applications. They can also do one-click lifecycle management on these apps. Our thesis has from the start been, instead of looking at this problem from the infrastructure up into the application, to actually look at it from the applications down, and then say, "Let the applications drive the underlying infrastructure to meet the user's requirements." >> Is that your differentiating factor, would you say? >> Yeah, I think it is, because most of the folks out there today are looking at it as a component-based play; they want to bring storage to Kubernetes or networking to Kubernetes, but the challenges are not really around storage and networking. If you talk to the operations folks, they say, "You know what? Those are underlying problems, but my challenge is more along the lines of, okay, my CIO says the initiative is to make my applications mobile. They want to go across different Clouds. That's my challenge." The line-of-business user says, "I want to get a managed service experience." Yes, storage is the thing that you want to manage underneath, but I want to go and click and create, let's say, an Oracle database or a distributed log. >> In terms of the developer experience here, from the application down, give us a sense for how Robin Systems' tooling, your product, enables that degree of specification of the application logic that will then get containerized within. >> Absolutely. Like I said, we want applications to drive the infrastructure. What that means is that Robin is a software platform. We layer ourselves on top of the machines that we sit on, whether those are bare-metal machines on premises, or VMs, or even in Azure, Google Cloud, as well as AWS. Then we make the underlying compute, storage, and network resources almost invisible; we treat them as a pool of resources. Now once you have this pool of resources, they can be attached to the applications that are being deployed inside containers. I mean, it's a software play, installed on machines. Once it's installed, the experience moves away from infrastructure into applications.
You log in, you see a portal, and you have a lot of applications in that portal. We ship support for about 25 applications or so. >> So these are templates? >> Yes. >> That the developer can then customize to their specific requirements? Or no? >> Absolutely, we ship reference templates for a wide variety of the most popular big data, NoSQL, database, and AI/ML applications today. But again, as I said, it's a reference implementation. Typically customers take the reference recommendation and enhance it, or they use it to onboard their custom apps, for example, or the apps that we don't ship out of the box. So it's a very open, extensible platform, but the goal being that whatever the application might be, in fact we keep saying that if it runs somewhere else, it runs on Robin, right? So the idea here is that you can bring anything, and with the flip of a switch, you can make it a one-click deploy, one-click manage, one-click mobile across Clouds. >> You keep mentioning this one click and this idea of it being so easy, so convenient, so seamless. Is that what you'd say is the biggest concern of your customers, this ease and speed? Or what are some other things that are on their minds that you want to deliver? >> Right, so one click of course is the user experience part, but what is the real challenge? The real challenge is that there's a wide variety of tools being used by enterprises today. Even in the data analytics pipeline, there's a lot across the data store and processing pipeline. Users don't want to deal with setting it up and keeping it up and running. They don't want that, they want to get the job done, right? Now when you just want to get the job done, you really want to hide the underlying details of those platforms, and the best way to convey that, the best way to give that experience, is to make it a single-click experience from the UI. So I keep calling it one click, because that is the experience that hides the underlying complexity of these apps. >> Does your environment actually compile executable code based on that one-click experience? Or where does the compilation and containerization actually happen in your distributed architecture? >> Alright, so, I think the simplest- >> You're a prem-based offering, right? You're not in the Cloud yourself? >> No, we are. We work on all three big public clouds, whether it is Azure, AWS, or Google. >> So your entire application is containerized itself for deployment into these Clouds? >> Yes, it is. >> Okay. >> So the idea here is, let's simplify it significantly, right? You have Kubernetes today, it can run anywhere, on premises, in the public Cloud and so on. Kubernetes is a great platform for orchestrating containers, but it is largely inaccessible to a certain class of data-centric applications. >> Yeah. >> We make that possible. But our take is, just onboarding those applications onto Kubernetes does not solve your CXO's or your line-of-business user's problems. You have to make the management a lot easier, from an application point of view, not from a container-management point of view, and that is where we create this experience that I'm talking about, the one-click experience. >> Give us a sense for how, we're here at DataWorks and it's the Hortonworks show.
Discuss with us your partnership with Hortonworks. You know, we've heard the announcement of HDP 3.0 and containerization support; just give us a rough sense for how you align or partner with Hortonworks in this area. >> Absolutely. It's kind of interesting, because Hortonworks is a data management platform, if you think about it from that point of view, and that's how we engaged with them first. Some of our customers have been using the product, Hortonworks, on top of Robin, so orchestrating Hortonworks, making it a lot easier to use. >> Right. >> One of the requirements was, "Are you certified with Hortonworks?" And the challenge that Hortonworks also had is they had never certified a container-based deployment of Hortonworks before. They actually were very skeptical, you know, "You guys are saying all these things. Can you actually containerize and run Hortonworks?" So we worked with Hortonworks, and, I mean, if you go to the Hortonworks website, you'll see that we are the first in the entire industry to have been certified as a container-based play that can actually deploy and manage Hortonworks. They certified us by running a wide variety of tests, which they call the QATS test suite, and when we got certified, the only other players in the market that got that stamp of approval were Microsoft with Azure and EMC with Isilon. >> So you're in good company? >> I think we are in great company. >> You're certified to work with HDP 3.0 or the prior version or both? >> When we got certified we were still on the 2.x version of Hortonworks; HDP 3.0 is a relatively newer version. But our plan is to continue working with Hortonworks to get certified as they release new versions, and also to help them, because HDP 3.0 also has some container-based orchestration and deployment, so we want to help them provide the underlying infrastructure so that it becomes easier for YARN to spin up more containers. >> The higher-level security and governance and all these things you're describing, they have to be over the Kubernetes layer. Hortonworks supports it in their DataPlane Services portfolio. Does Robin Systems' solutions portfolio tap into any of that, or do you provide your own layer of, sort of, security and metadata management and so forth? >> Yeah, so we don't want- >> In the context of what you offer? >> Right, so we don't want to take away the security model that the application itself provides, because customers might have set it up so that they are doing governance; it's not just log-in and access control and things like this. Some governance is built in. We don't want to change that. We want to keep the same experience and the same workflow that customers have, so we just integrate with whatever security the application has. We, of course, provide security in terms of isolating the different apps that are running on the Robin platform, but the security and the access into the application itself is left to the apps themselves. When I say apps, I'm talking about Hortonworks. >> Yeah, sure. >> Or any other databases. >> Moving forward, as you think about ways you're going to augment and enhance and alter the Robin platform, what are some of the biggest trends that are driving your decision making around that? In the sense that, as we know, companies are living with this deluge of data, how are you helping them manage it better? >> Sure. I think there are a few trends that we are closely watching. One is around Cloud mobility.
CIOs want their applications, along with their data, to be available where their end users are. It's almost like a follow-the-sun model, where you might have generated the data in one Cloud, and at a different time, in a different time zone, you'll want to keep the app as well as the data moving. So we are following that very closely: how we can make the mobility of data and apps a lot easier in that world. The other one is around the general AI/ML workflow. One of the challenges there, of course, is that you have great tools like TensorFlow or Theano or Caffe; these are very good AI/ML toolkits. But one of the challenges that people face is they are buying very expensive hardware, let's say an NVIDIA DGX box; these boxes cost about $150,000 each. How do you keep these boxes busy so that you're getting a good return on investment? It requires you to better manage the resources offered by these boxes. We are also monitoring that space, and we're looking at how we can take the Robin platform and enable better utilization of GPUs, or the sharing of GPUs, for running your AI/ML kinds of workloads. >> Great. >> Those are, I think, two key trends that we are closely watching. >> We'll be discussing those at the next DataWorks Summit, I'm sure, at some other time in the future. >> Absolutely. >> Thank you so much for coming on theCUBE, Partha. >> Thank you. >> Thank you, my pleasure. Thanks. >> I'm Rebecca Knight for James Kobielus. We will have more from DataWorks coming up in just a little bit. (techno beat music)
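To make the one-click, template-driven pattern Partha describes concrete, here is a minimal sketch using the official Kubernetes Python client. It illustrates the general idea only (a reference template plus a couple of user choices turned into a running, containerized app); it is not Robin's actual API, and the application name, image, and sizing values are invented for the example.

```python
# Illustrative sketch of template-driven, "one click" style deployment on
# Kubernetes with the official Python client. Not Robin's actual API; the
# image and sizing values below are invented for the example.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a cluster
apps = client.AppsV1Api()

def deploy_from_template(name: str, image: str, replicas: int = 3,
                         namespace: str = "default") -> None:
    """Turn a trimmed-down 'reference template' into a running Deployment."""
    container = client.V1Container(
        name=name,
        image=image,
        resources=client.V1ResourceRequirements(
            requests={"cpu": "1", "memory": "2Gi"}),
    )
    spec = client.V1DeploymentSpec(
        replicas=replicas,
        selector=client.V1LabelSelector(match_labels={"app": name}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": name}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    )
    deployment = client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name=name),
        spec=spec,
    )
    apps.create_namespaced_deployment(namespace=namespace, body=deployment)

# The "one click": a named template, an image, and a size, nothing else.
deploy_from_template("kafka-demo", image="bitnami/kafka:latest", replicas=3)
```

A real platform layers storage, networking, and lifecycle workflows on top of a call like this, which is exactly the gap in plain Kubernetes that Partha is pointing at.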

Published Date : Jun 20 2018


Jim McHugh, NVIDIA | SAP SAPPHIRE NOW 2018


 

>> From Orlando, Florida, it's theCUBE! Covering SAP SAPPHIRE NOW 2018, brought to you by NetApp. >> Welcome to theCUBE, I'm Lisa Martin with Keith Townsend, and we are in Orlando at SAP SAPPHIRE NOW 2018, where we're in the NetApp booth talking with lots of partners, and we're excited to welcome back to theCUBE distinguished alumnus Jim McHugh from NVIDIA. You are the VP and GM of Deep Learning and "other stuff," as you said in the keynote. (all laugh) >> Yeah, and other stuff. That's a lot of responsibility! That other stuff, that, you know, that can really pile up! >> That can kill ya. Yeah, exactly. >> So here we are at SAPPHIRE. You've been working with SAP in various forms for a long time, this event is enormous, and there's lots of momentum at NVIDIA. What is NVIDIA doing with SAP? >> We're really helping SAP figure out and drive the development of their SAP Leonardo machine learning services. So machine learning, as we saw in the keynote today with Hasso, is a key component of it, and really what it's doing is automating a lot of the standard processes that people did in their interactions. So whether it's closing your invoices at the end of the quarter, which can take weeks to go through manually, you can apply machine learning and deep learning and do that instantaneously, so you can get a continuous close. Things like service ticketing: when a service ticket comes in, you know, we all know, you pick up the phone, you call 'em and they collect your information, and then they pass you on to someone else who wants to confirm the information. All of that can be handled just from an email, because now I know a lot about you when you send me an email: I know who you are, I know what company you're with, I know your problem 'cause you stated it, and I can route it, using machine learning, to the appropriate person. Not only can I route it to the appropriate person, I can look it up in a knowledge database and say, hey, have we seen this question answered before, feed that to the customer service representative, and when they start interacting with the customer they already have a lot of information about them, and it's already well underway.
So when you go to the SAP cloud, you're saying, I wanna actually take advantage of the SAP service for service ticketing, or, you know, I wanna figure out how I can do my invoice processing better, or, I'm an HR representative and I don't wanna spend 60% of my time reading resumes, I wanna actually have an AI do it for me, and then it's a service that you can consume. And we do make it possible: if you have a developer in your enterprise and you say, you know what, I'm a big SAP user but I actually wanna develop a custom app or some other things, then SAP makes available the Leonardo machine learning foundation, and you can take advantage of that and develop a custom app. And if you have a really big problem and you wanna take it off on your own, NVIDIA's happy to work with you directly and figure out how to solve different problems. And most of our customers are in all three of those, right? They're consuming the services, 'cause they automate things today; they're figuring out what custom apps they need to build around SAP; and then they're, you know, figuring out the product-building or something else that's a much bigger machine learning, deep learning problem. >> So yesterday during Bill McDermott's keynote he talked about tech for good. Now, there's been a lot of news recently of tech for not-so-good and data privacy, with GDPR, you know, compliance going into effect last week. NVIDIA really has been an integral part of this AI renaissance. You talked about, you know, how you can help loads of different customers; there's so much potential with AI, as Bill McDermott said yesterday, AI to augment humanity. I can imagine, you know, life-and-death situations, like in healthcare. Can you give us an example of what you guys are doing with SAP that, you know, maybe is transforming healthcare at a particular hospital? >> Yeah, so one of the great examples I was just talking about is what Massachusetts General is doing. Massachusetts General is one of the largest research hospitals in the United States, and they're doing a lot of work in AI to really automate processes. You know, when you would take your child in for a bone density scan, which basically tells you the bone age of your child, they compare it to your biological age, and that can tell you a lot of things: is it just, you know, a growth problem, or is there something more serious to be concerned about? Well, they would do these MRIs, and then you would have to wait for days while the technician and the doctor would flip through a textbook from the 1950s to determine it. Well, Massachusetts General automated all that: they actually trained a neural network on all these different scans and all these different components, and now you find out in minutes. So it greatly reduces the stress, right? And there's plenty of other projects going on, and you can see it in determining whether that's a cancer cell, or, you know, so many different aspects of it. Your retina happens to be an incredible window into whether you have hypertension, whether you have malaria or dengue fever; so things like, you know what, maybe you shouldn't be around anywhere where you're gonna get bit by a mosquito and it's gonna pass it to your family, all of that can now be handled, and you don't need expensive healthcare; you can actually take it to a clinician out in the field. So, we love all that.
But if you think about the world of SAP, which, you know, controls the data records of most companies, right? Their supply chain information, their resource information about, you know, what they have available, all of that's being automated. So if we think about the production of food, we're having tractors now that have the ability to go over a plant and say, you know what, that needs insecticide, or that needs weeds to be removed 'cause they're just bad for the whole crop, or that's a diseased plant and I'm gonna remove it, or it just needs water so it can grow, right? That is increasing the production of food in an organic way. Then we improve the distribution centers so it doesn't sit as long, right? So we can actually have drones flying through the warehouses and knowing what needs to be moved first. Go from there, and we're moving to autonomous driving vehicles, where deliveries can happen at night when there's not so much traffic, and then we can get the food as fresh as possible and deliver it. So if you think of that whole distribution chain, and everything in the pipeline, as being automated, it's doing an incredible amount of good. And then, jumping into the world of autonomous driving vehicles, it's a 10 trillion dollar business that's being changed, radically. >> So as we think about these super complex systems that we're trying to improve, we start to break them down into smaller and smaller components, and you end up with these edge scenarios, use cases where, you know, whether it's data frequency, data value, or data latency, we have to push compute out to the edge. Can you talk about use cases where NVIDIA has pushed the technology far out to the edge to take in massive amounts of data that effectively can't be sent back to the core or to the data center for processing? What are some of these use cases and solutions? >> So the world of IoT is changing as well, right? The compute power has to be where it's needed, right, and in any form, whether that's cloud based, data center based, or at the edge. And we have a great customer that is actually doing inspection of oil refineries, bridges, you know, where they spot a crack or some sort of mark where they have to go look at it. Well, traditionally what you do is you send out a whole team and they build up scaffolding, or they have people rappel down to try to inspect it. Well, now what we're doing is flying drones and sending wall crawlers up. So they find something, they get data, and then, instead of actually, like you said, putting it, you know, on a truck and taking it back to your data center, or trying to figure out how to have enough bandwidth to get there, they're taking one of our products, which is a DGX Station, basically the equivalent of half a row of servers but in a single box, water cooled, and they're putting it in vans sitting out in remote areas of Alaska, and retraining the model there on site. So they get the latest model, they get more intelligence, and they just collect it, and they can resend the drones up and then discover more about it. So it really, really is saving, and that saves a lot of money: you have a group of really smart, you know, technicians and people who understand it, and a guy who can do the neural network capability, instead of a whole team coming up and setting up scaffolding that would cost millions of dollars.
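As a rough illustration of the retrain-on-site pattern Jim describes (a pretrained network refreshed on newly collected inspection imagery), here is a minimal PyTorch sketch. The dataset path, class layout, and training settings are invented for the example; this shows the general pattern, not the customer's actual pipeline.

```python
# Minimal fine-tuning sketch for the retrain-on-site pattern described above.
# Paths, class names, and hyperparameters are invented for illustration.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Newly collected drone imagery, e.g. crack/ and no_crack/ folders.
tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
data = datasets.ImageFolder("new_inspection_images/", transform=tf)
loader = DataLoader(data, batch_size=32, shuffle=True, num_workers=4)

# Start from a pretrained backbone and retrain only the classifier head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(data.classes))
model.to(device)

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # a quick on-site refresh, not a full training run
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()
```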
>> That reminds me of that commercial that they showed yesterday during the general session, the SAP commercial with Clive Owen, the actor. You mentioned, you know, cracks in oil wells and things like that, and it just reminded me of that. What they talked about in that video was really how invisible software, like SAP, is transforming industries and saving lives. I think I saw on their website an example of how they're leveraging AI and technology to reduce water scarcity in India, or to save the rhino; and what you just described with NVIDIA seems to be quite in alignment with the direction that SAP is going. >> Oh absolutely, yeah, I mean we believe in SAP's view of the intelligent enterprise, and people gotta remember, the enterprise isn't just the corporate office or whatever; enterprises are many different things, alright. Public safety, if you think about that, that's a big thing we focus on. A really amazing thing that's going on is thinking about using drones for first responders: they actually can know what's going on at the scene, and when the other people are showing up, they know what kind of area they're going into. Or for search and rescue: drones can cover a lot of territory and detect a human faster than a human can, right? And if you can actually find someone within the first 24 hours, the chance of survival is so much higher. All of that is, you know, leveraging the exact same technology that we use for looking at our business processes, right? And it's not as, you know, dramatic, it's not gonna show up on the evening news, but honestly, streamlining our business processes, making them happen so much faster and more efficiently, makes businesses more efficient; you know, it's better for the company, it's better for the employees as well. >> So let's talk about something that's, that's taboo: financial services, making money with data, or with analytics or machine learning from data. Again, John Furrier is here, and we have someone from NVIDIA here, and if we don't bring up blockchain in some type of way he's gonna throw something at his team, so, >> Let's give a shout out to John Furrier. (laughing) >> Give a shout out to John. But from a practical sense, let's subtract the digital currency part of blockchain: do you see applications for blockchain from a machine learning perspective? >> Yeah, I mean, well, if you just boil blockchain down to trusted networks, right? And you know you heard Bill McDermott say that on stage, he called it marketplaces, or areas where he could do an exchange, it makes total sense. If I can have a trusted way of doing things where I have a common ledger between companies, and we know that it's valid, and that we can each interchange with it, yeah, it makes complete sense, right? Now we've gotta get to the practical implementation of that, and we have to build the trust of the companies to understand, okay, this technology can take you there. And that's where I think, you know, we come in with our technology capabilities, ensuring to people that it's reliable and works; SAP comes in with the customer relationships, trusted in what they've been doing in helping people run their business for years; and then it becomes cultural. Like all things, we can kid ourselves in technology that we'll just solve everything; it's a cultural change.
I'm gonna share that common ledger, I'm gonna share that common network and feel confident in it; it's something that people have to do, and, you know, my take on that always is, when the accuracy is so much better, when the efficiency is so much better, when the return is so much better, we get a lot more comfortable. People used to be nervous about giving the grocery store their phone number, right, 'cause they would track their food, right? And today we're just like, okay, yeah, here's my phone number. (Keith laughing) >> So. (laughs) >> Give you a 30 cent discount, here's my number. >> Exactly. We're so cheap. (laughing) >> So we're in the NetApp booth, and you guys recently announced a combined AI reference architecture with NetApp. Tell us a little bit more about that. >> Yeah, well, the little secret behind all the things we just talked about is there's an incredible amount of data, right? And as you collect this data, it's really important to store it in a way that it's accessible when you need it. And when you're doing training, I have a product that's called DGX-1; DGX-1 takes an incredible amount of data that helps us train these neural networks, and it's fast, and it has an insatiable desire for data. So what we've done with NetApp is actually pull together a reference architecture, so that when a data scientist, who is a very valuable resource, is working on this, he's ensured that the infrastructure pieces are gonna work together seamlessly and deliver that data to the training process. And then when you create that model, you put it in production using something that's called inference, and at the same time, when you have that inference running, you wanna make sure that data can get to it and it can interact with the data seamlessly, and the reference architectures play out there as well. So our goal is to start knocking off, one by one, the things customers need to be successful. We put a lot of effort into the GPUs, we put a lot of effort into the deep learning software that runs on top of that, we put a lot of effort into, you know, what are the models they need to use, etc. And now we have to spend a lot more time on what their infrastructure is, and make sure that's reliable, because you would hate to do all that work only to find that your infrastructure had a hiccup and took your job down. So we're working really hard to make sure that never happens.
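On the "insatiable desire for data" point: in practice, keeping a GPU busy is largely about parallelizing the input pipeline so that storage reads overlap with compute, which is what a balanced compute-plus-storage reference architecture is sized for. Here is a minimal sketch of that overlap with a standard PyTorch DataLoader; the dataset and worker counts are illustrative stand-ins, not values from the published reference architecture.

```python
# Sketch of overlapping data loading with compute, the core idea behind
# feeding fast GPUs from fast storage. The dataset and settings here are
# illustrative stand-ins, not the published reference architecture.
import torch
from torch.utils.data import DataLoader, Dataset

class ImageShards(Dataset):
    """Stand-in for a large training set sitting on shared storage."""
    def __len__(self):
        return 1_000_000

    def __getitem__(self, idx):
        # A real pipeline would read and decode a file from storage here.
        return torch.randn(3, 224, 224), idx % 1000

loader = DataLoader(
    ImageShards(),
    batch_size=256,
    num_workers=8,      # parallel reader processes hide storage latency
    pin_memory=True,    # faster host-to-GPU copies
    prefetch_factor=4,  # keep batches queued ahead of the GPU
)

for step, (batch, labels) in enumerate(loader):
    # The training step would run here while the workers prefetch the
    # next batches in parallel.
    if step == 10:
        break
```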
>> Well so, there's again the two aspects of it, the training takes an incredible amount of data that's why they would have to take a super computer and put it there so they could do the retraining, but, when you think about when you can have the pro-- something the size of a credit card, which is our Jetson solution, and you can install it in a drone or you can put in cameras for public safety, etc. Which is, has incredible, think about looking for a lost child or parents with Alzheimer's, you can scan through video real quick and find them, right? All because of a credit card sized processor, that's pretty impressive. But that's what's happening at the edge, we're now writing applications that are much more intelligent using AI, there are AI applications sitting at the edge that, instead of just processing the data in a way where I'm getting a average, average number of people who walked into my store, right, that's what we used to do five years ago, now we're actually using intelligent applications that are making calculated decisions, it's understanding who's coming in a store, understanding their buying/purchasing power, etc. That's extremely important in retail, because, if you wanna interact with someone and give them that, you know when they're doing self checkout, try to sell 'em one more thing, you know, did you forget the batteries that go with that, or whatever you want it to be, you only have a few seconds, right? And so you must be able to process that and have something really intelligent doing that instead of just trying to do the law of average and get a directionally correct-- and we've known this, anytime you've been on your webpage or whatever and someone recommends something you're like that doesn't have anything to do with me and then all of a sudden it started getting really good that's where they're getting more intelligent. >> When I walk into the store with my White Sox hat and then they recommend the matching jersey. I'm gonna look, gonna come lookin' for you guys at NVIDIA like wa-hey! I don't have money for a jersey, but things like that, yeah. >> We're just behind the scenes somewhere. >> Well, you title VP and GM of Deep Learning and stuff, there's a lot of stuff. (all laugh) Jim thanks so much for coming back on theCUBE sharing with us what's new at NVIDIA it sounds like the world of possibilities is endless, so exciting! >> Yeah, it is an exciting time, thank you. >> Thanks for your time, we wanna thank you for watching theCUBE, Lisa Martin with Keith Townsend from SAP SAPPHIRE 2018, thanks for watching. (bubbly music)

Published Date : Jun 9 2018


Matt Burr, Pure Storage & Rob Ober, NVIDIA | Pure Storage Accelerate 2018


 

>> Announcer: Live from the Bill Graham Auditorium in San Francisco, it's theCUBE! Covering Pure Storage Accelerate 2018, brought to you by Pure Storage. >> Welcome back to theCUBE's continuing coverage of Pure Storage Accelerate 2018. I'm Lisa Martin, sporting the clong, and apparently this symbol actually has a name, the clong; I learned that in the last half an hour. I know, who knew? >> Really? >> Yes! Is that a C or a K? >> Is that a Prince orientation or, what is that? >> Yes, I'm formerly known as. >> Nice. >> Who of course played at this venue, as did Roger Daltrey, and The Who. >> And I might have been staff for one of those shows. >> You could have been, yeah, could I show you to your seat? >> Maybe you're performing later. You might not even know this. We have a couple of guests joining us. We've got Matt Burr, the GM of FlashBlade, and Rob Ober, the Chief Platform Architect at NVIDIA. Guys, welcome to theCUBE. >> Hi. >> Thank you. >> Dave: Thanks for coming on. >> So, lots of excitement going on this morning. You guys, Pure and NVIDIA, announced a partnership, AIRI, just a couple of months ago. Talk to us about AIRI, what is it? How is it going to help organizations in any industry really democratize AI? >> Well, AIRI is something that we announced, the AIRI Mini, today here at Accelerate 2018. AIRI was originally announced at GTC, NVIDIA's GPU Technology Conference, back in March, and what it is, essentially, is NVIDIA's DGX servers, connected with either Arista or Cisco switches, down to the Pure Storage FlashBlade. This is something that sits in less than half a rack in the data center and replaces something that was probably 25 or 50 racks of compute and storage, so I think Rob and I like to talk about it as kind of a great leap forward in terms of compute potential. >> Absolutely, yeah. It's an AI supercomputer in a half rack. >> So one of the things we saw this morning during the general session that Charlie talked about, and I think Matt (mumbles) kind of a really brief history of the last 10 to 20 years in storage: why is modern external storage essential for AI? >> Well, Rob, you want that one, or you want me to take it? Coming from the non-storage guy, maybe? (both laugh) >> Go ahead. >> So, when you look at the structure of GPUs, and servers in general, we're talking about massively parallel compute, right? We're now taking not just tens of thousands of cores but even more cores, and we're actually finding a path for them to communicate with storage that is also massively parallel. Storage has traditionally been something that's been kind of serial in nature; legacy storage has always waited for the next operation to happen. You actually want things that are parallel, so that you can have parallel processing both at the compute tier and parallel processing at the storage tier. But you need to have big network bandwidth, which was what Charlie was alluding to, when Charlie said-- >> Lisa: You like his stool? >> When Charlie was, one of his stools, or one of the legs of his stool, was talking about: 20 years ago we were still, or 10 years ago, we were at 10 gig networks, and the emergence of 100 gig networks has really made the data flow possible. >> So I wonder if we can unpack that. We talked a little bit to Rob Lee about this, the infrastructure for AI, and wonder if we can go deeper.
So take the three legs of the stool, and you can imagine this massively parallel compute-storage-networking grid, if you will; one of our guys calls it uni-grid, not crazy about the name, but this idea of alternative processing, which is your business, really spanning this scaled-out architecture, not trying to stuff as much function on a die as possible, really is taking hold. But how does that infrastructure for AI evolve from an architect's perspective? >> The overall infrastructure? I mean, it is incredibly data intensive. A typical training set is terabytes, in the extreme petabytes, for a single run, and you will typically go through that data set again and again and again in a training run, (mumbles) and so you have one massive set that needs to go to multiple compute engines. And the reason it's multiple compute engines is people are discovering that as they scale up the infrastructure, you get pretty much linear improvements, and you get a time-to-solution benefit. Some of the large data centers will run a training run for literally a month, and if you start scaling it out, even on these incredibly powerful things, you can bring time to solution down; you can have meaningful results much more quickly. >> And to give you a sense of a practical application of that: there's a large hedge fund based in the U.K. called Man AHL. They're a systematic quantitative trading firm, and what that means is, humans really aren't doing a lot of the trading; machines are doing the vast majority, if not all, of the trading. What the humans are doing is, they're essentially quantitative analysts. The number of simulations that they can run is directly correlated with the number of trades that their machines can make. And so the more simulations you can run, the more trades you can make; and the shorter your simulation time is, the more simulations you can run. So we're talking, in a sort of meta context, about a concept that applies to everything: from retail and understanding, if you're a grocery store, what products are not on my shelves at a given time; to healthcare, discovering new forms of pathologies for cancer treatments; financial services we touched on; but even broader, right down into manufacturing, right? Looking at, what are my defect rates on my lines; and if it used to take me a week to understand the efficiency of my assembly line, if I can get that down to four hours and make adjustments in real time, that's more than just productivity, it's progress. >> Okay so, I wonder if we can talk about how you guys see AI emerging in the marketplace. You just gave an example. We were talking earlier again to Rob Lee about how it seems today to be applied in narrow use cases, and maybe that's going to be the norm, whether it's autonomous vehicles or facial recognition, natural language processing. How do you guys see that playing out? Will it be this kind of ubiquitous horizontal layer, or do you think the adoption is going to remain along those sort of individual lines, if you will? >> At the extreme, like when you really look out at the future, let me start by saying that my background is processor architecture. I've worked in computer science; the whole thing is to understand problems, and create the platforms for those things. What really excited me and motivated me about AI and deep learning is that it is changing computer science. It's just turning it on its head.
And instead of explicitly programming, it's now implicitly programming, based on the data you feed it. And this changes everything, and it can be applied to almost any use case. So I think that eventually it's going to be applied in almost any area that we use computing today. >> Dave: So another way of asking that question is, how far can we take machine intelligence? And your answer is pretty far, pretty far. So as a processor architect, obviously this is very memory intensive. I was at the Micron financial analyst meeting earlier this week, listening to what they were saying about these emerging memories: you've got DRAM, and obviously you have Flash, and people are excited about 3D XPoint; I heard it, somebody mentioned 3D XPoint on the stage today. What do you see there in terms of memory architectures and how they're evolving, and what do you need as a systems architect? >> I need it all. (all talking at once) No, if I could build a GPU with more than a terabyte per second of bandwidth and more than a terabyte of capacity, I could use it today. I can't build that, I can't build that yet. But I need, it's a different stool: I need teraflops, I need memory bandwidth, and I need memory capacity. And really, we just push to the limit. Different types of neural nets, different types of problems, will stress different things. They'll stress the capacity, the bandwidth, or the actual compute. >> This makes the data warehousing problem seem trivial, but do you see, you know what I mean? Data warehousing was always a chase, chasing the chips; a snake swallowing a basketball, I called it. But do you see a day when these problems are going to be solved architecturally? We talk about Moore's Law moderating; or is this going to be a perpetual race that we're never going to get to the end of? >> So let me put things in perspective first. It's easy to forget that the big bang moment for AI and deep learning was the summer of 2012, so slightly less than six years ago. That's when AlexNet hit the scene and people went, wow, this is a whole new approach, this is amazing. So we're a little less than six years in. I mean, it is a very young area, it is in incredible growth, and the state of the art is literally changing month by month right now. So it's going to continue on for a while, and we're just going to keep growing and evolving. Maybe five years, maybe 10 years, things will stabilize, but it's an exciting time right now. >> Very hard to predict, isn't it? >> It is. >> I mean, who would've thought that Alexa would be such a dominant factor in voice recognition, or that a bunch of cats on the internet would lead to facial recognition. I wonder if you guys can comment, right? >> Strange beginnings. (all laughing) >> I wonder if I can ask you guys about the black box challenge. I've heard some companies talk about how we're going to white-box everything, make it open; but the black box problem, meaning: if I have to describe, and we may have talked about this, how I know that it's a dog, I struggle to do that, but a machine can do that. I don't know how it does it; it probably can't tell me how it does it; but it knows, with a high degree of accuracy. Is that black box phenomenon a problem, or do we just have to get over it? >> Up to you. >> Certainly, I don't think it's a problem. I know that mathematicians, who are friends, it drives them crazy, because they can't tell you why it's working.
So it's an intellectual problem that people just need to get over. But it's the way our brains work, right? And our brains work pretty well. There are certain areas, I think, where for a while there will be certain laws in place where, if you can't prove the exact algorithm, you can't use it. But by and large, I think the industry's going to get over it pretty fast. >> I would totally agree, yeah. >> You guys are optimists about the future. I mean, you're not up there talking about how jobs are going to go away; that's not something that you guys are worried about, and generally, we're not either. However, machine intelligence, AI, whatever you want to call it, is very disruptive. There's no question about it. So I've got to ask you guys a few fun questions. Do you think large retail stores are going to, I mean nothing's in the extreme, but do you think they'll generally go away? >> Do I think large retail stores will generally go away? When I think about retail, I think about grocery stores, and the things that are going to go away: I'd like to see standing in line go away. I would like my customer experience to get better. I don't believe that 10 years from now we're all going to live inside our houses and communicate over the internet and text, with half of that being with chat bots; I just don't believe that's going to happen. I think the Amazon effect has a long way to go. I just ordered a pool thermometer from Amazon the other day, right? I'm getting old, I ordered readers from Amazon the other day, right? So I kind of think it's that spur-of-the-moment item that you're going to buy. Because even in my own personal habits, I'm not buying shoes and returning them, and waiting through five to ten cycles to get there. You still want that experience of going to the store. Where I think retail will improve is understanding that I'm on my way to their store, and improving the experience once I get there. So I think you'll see, they need to see the Amazon effect that's going to happen, but what you'll see is technology being employed to reach a place where my end-user experience improves, such that I want to continue to go there. >> Do you think owning your own vehicle, and driving your own vehicle, will be the exception, rather than the norm? >> It pains me to say this, 'cause I love driving, but I think you're right. I think it's a long, I mean it's going to take a while, it's going to take a long time, but I think inevitably it's just too convenient; things are too congested. By freeing up autonomous cars, things that'll go park themselves, whatever, I think it's inevitable. >> Will machines make better diagnoses than doctors? >> Matt: Oh I mean, that's not even a question. Absolutely. >> They already do. >> Do you think banks, traditional banks, will keep control of the payment systems? >> That's a good one, I haven't thought about-- >> Yeah, I'm not sure that's an AI-related thing, maybe more of a blockchain thing, but, it's possible. >> Blockchain and AI, kind of cousins. >> Yeah, they are, they are actually. >> I fear a world though where we actually end up like WALL-E in the movie, and everybody's on these floating chaise lounges. >> Yeah, let's not go there. >> Eating and drinking. No, but I'm just wondering, you talked about, Matt, the number of different types of industries that really converge here.
Do you see maybe the consumer world, with our expectation that we can order anything on Amazon from a thermometer to a pair of glasses to shoes, as driving other industries to kind of follow what we as consumers have come to expect? >> Absolutely, no question. I mean, consumer drives everything, right? All-flash arrays were driven by the phone you have there, right? The consumerization of that device was what drove Toshiba and all the other fab manufacturers to build more NAND flash, which is what commoditized NAND flash, which is what brought us faster systems. These things all build on each other, and from a consumer perspective, there are so many things that are inefficient in our world today, right? Let's just think about your last call center experience. If you're a normal human being-- >> I prefer not to, but okay. >> Yeah, you said it, you prefer not to, right? My next comment was going to be, most people's call center experiences aren't that good. But what if the call center technology had the ability to analyze your voice and understand your intonation and your inflection, and that call center employee was being given information to react to what you were saying on the call, such that they either immediately escalated that call without you asking, or they were sent down a decision path which brought you to a resolution, because we know that 62% of the time, if we offer this person a free month of this, that person is going to go away a happy customer and rate this call 10 out of 10. That is the type of thing that's going to improve with voice recognition and all of the voice analysis. >> And that really gets into how far we can take machine intelligence, the things that humans can do that machines can't, and that list changes every year. The gap gets narrower and narrower, and that's a great example. >> And I think one of the things, going back to whether stores will continue being there or not, one of the biggest benefits of AI is recommendation, right? You can consider it usurious maybe, or on the other hand it's great service, where something like an Amazon is able to say, I've learned about you, I've learned about what people are looking for, and you're asking for this, but I would suggest something else, and you look at that and you go, "Yeah, that's exactly what I'm looking for". I think that's really where it shows up in the sales cycle. >> Can machines stop fake news? That's what I want to know. >> Probably. >> Lisa: To be continued. >> People are working on that. >> They are. There's a lot, I mean-- >> That's a big use case. >> It is not a solved problem, but there's a lot of energy going into that. >> I'd take that before I take the floating WALL-E chaise lounges, right? Deal. >> What if it was just for you? What if it was just one floating chaise lounge, it wasn't everybody, then it would be alright, right? >> Not for me. (both laughing) >> Matt and Rob, thanks so much for stopping by and sharing some of your insights, and we should have a great rest of the day at the conference. >> Great, thank you very much. Thanks for having us. >> For Dave Vellante, I'm Lisa Martin, we're live at Pure Storage Accelerate 2018 at the Bill Graham Civic Auditorium. Stick around, we'll be right back after a break with our next guest. (electronic music)
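A footnote on the three-legged stool the architect describes above: whether a given workload stresses compute, memory bandwidth, or capacity falls out of a simple roofline calculation on its arithmetic intensity. A minimal sketch; the peak numbers are illustrative placeholders, not any real GPU's specs:

```python
# Which leg of the stool binds: compute, or memory bandwidth?
# Peak numbers below are illustrative placeholders, not real GPU specs.
PEAK_FLOPS = 15e12   # peak compute, flops/second (assumed)
PEAK_BW = 0.9e12     # peak memory bandwidth, bytes/second (assumed)

def limiting_leg(flops, bytes_moved):
    """Classify a kernel by arithmetic intensity (flops per byte moved)."""
    intensity = flops / bytes_moved
    ridge = PEAK_FLOPS / PEAK_BW   # intensity where compute and bandwidth balance
    return "compute-bound" if intensity >= ridge else "bandwidth-bound"

# A dense layer reuses each weight many times: high intensity, compute-bound.
print(limiting_leg(flops=2e12, bytes_moved=4e10))  # compute-bound
# An embedding lookup touches each byte roughly once: bandwidth-bound.
print(limiting_leg(flops=1e9, bytes_moved=1e9))    # bandwidth-bound
```

Capacity is the third leg: once the working set no longer fits in GPU memory, neither of the other two numbers matters.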
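And the call-center scenario, with its 62% success rate for the free-month offer, is at bottom an expected-value rule. A toy sketch; the 62% comes from the conversation above, while every dollar figure is an assumption for illustration:

```python
# Toy expected-value rule for the call-center offer described above.
P_HAPPY_WITH_OFFER = 0.62   # chance a free month ends the call happily (from the example)
VALUE_HAPPY = 50.0          # assumed value of a retained, happy customer
COST_FREE_MONTH = 10.0      # assumed cost of the free month
EV_ESCALATE = 18.0          # assumed expected value of escalating instead

def next_action():
    ev_offer = P_HAPPY_WITH_OFFER * VALUE_HAPPY - COST_FREE_MONTH
    return "offer free month" if ev_offer > EV_ESCALATE else "escalate"

print(next_action())        # offer free month: 0.62 * 50 - 10 = 21 > 18
```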

Published Date : May 23 2018

Gus Horn, NetApp | Big Data NYC 2017


 

>> Narrator: Live from Midtown Manhattan, it's theCUBE. Covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Hello everyone. Welcome back to our CUBE coverage here in New York City, live in Manhattan for theCUBE's coverage of Big Data NYC, our event we've had five years in a row. Eight years covering Big Data: Hadoop World originally in 2010, then it moved to the Hadoop Strata Conference, Strata Hadoop, now called Strata Data. In conjunction with that event we have our Big Data NYC event. SiliconANGLE Media's CUBE. I'm John Furrier, your cohost, with Jim Kobielus, analyst at wikibon.com for Big Data. Our next guest is Gus Horn, who is the global Big Data analytics and CTO ambassador for NetApp, a machine learning and AI guru who gives talks all around the world. Great to have you, thanks for coming in and spending the time with us. >> Thanks, John, appreciate it. >> So we were talking before the camera came on, you're doing a lot of jet setting, really around evangelizing, but also educating a lot of folks on the impact of machine learning and AI in particular. Obviously AI we love, we love the hype. It motivates young kids getting into software development, computer science, makes it kind of real for them. But there's still a lot further to go in terms of what AI really is. And that's good, but what is really going on with AI? Machine learning is where the rubber hits the road. That seems to be the hot area, and that's your wheelhouse. Give us the update, where is AI now? Obviously machine learning is super important, it's one of the hot topics here in New York City. >> Well, I think it's super important globally, and it's going to be disruptive. As we were saying before, this is going to be a disruptive technology for all of society. But regardless of that, what machine learning brings is a methodology to deal with this influx of IOT data, whether it's autonomous vehicles, active safety in cars, or looking at predictive analytics for complex manufacturing processes like an automotive assembly line. Can I predict when a welding machine is going to break, and can I take care of it during a scheduled maintenance cycle so I don't take the whole line down? Because the impacts are really cascading and dramatic when you have a failure that you couldn't predict. And what we're finding is that Hadoop and the Big Data space are uniquely positioned to help solve these problems, both from quality control and process management, and how you can get better uptime and better quality. Then we take it full circle: how can I build an environment to help automotive manufacturers do test and dev, retest, and retraining of the AI modules and the AI engines that have to exist in these autonomous vehicles? And the only way you can do that is with data, and managing data like a data steward, which is what we do at NetApp. So for us, it's not just about the solution; the underlying architecture is going to be absolutely critical in setting up the agility and the flexibility you'll need in this environment. Because the other thing that's happening in the space right now is that technology's evolving very quickly. You see this with the DGX from NVIDIA, you see P100 cards from NVIDIA. So I have an architecture in Germany right now where we have multiple NVIDIA cards in a Hadoop cluster that we've architected. But I don't make NVIDIA cards. I don't make servers. I make really good storage.
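The welding-machine prediction Gus describes is, at its core, anomaly detection on a sensor stream: model the recent baseline, flag what deviates sharply from it. A minimal rolling z-score sketch; the window size and threshold are illustrative assumptions, not values from any real production line:

```python
# Rolling z-score anomaly detector for a sensor stream.
from collections import deque
import math

def anomalies(readings, window=50, threshold=4.0):
    """Yield (index, value) for readings more than `threshold` standard
    deviations from the mean of the previous `window` readings."""
    recent = deque(maxlen=window)
    for i, x in enumerate(readings):
        if len(recent) == window:
            mean = sum(recent) / window
            std = math.sqrt(sum((r - mean) ** 2 for r in recent) / window) or 1e-9
            if abs(x - mean) / std > threshold:
                yield i, x                 # candidate maintenance alert
        recent.append(x)

# Steady current draw with one spike: only the spike is flagged.
stream = [10.0 + 0.1 * (i % 5) for i in range(100)]
stream[80] = 25.0
print(list(anomalies(stream)))             # [(80, 25.0)]
```

In production the alert would feed a maintenance scheduler rather than a print statement, so the fix lands in a planned window instead of taking the line down.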
And I have an ecosystem that helps manage where that data is when it needs to be there, and especially when it doesn't need to be there, so we can get new data. >> Yeah, Gus, we were talking also before camera, for the folks watching, about how you were involved with AI going way back to your days at MIT, and that's super important. Because the pattern that we're seeing across all the events that we go to, and we'll be at the NetApp event next week, Insight, in Vegas, the pattern is pretty clear. You have one camp saying, oh, AI is just the same thing that was going on in the late '70s, '80s, and '90s, but it now has a new dynamic with the cloud. So a lot of people are saying okay, there have been concepts developed in AI, in computer science, but now with the evolution of hyperconverged infrastructure, with cloud computing, with a new architecture, it seems to be turbocharging and accelerating. So I'd like to get your thoughts on why it's so hot now. Obviously machine learning, everyone should be on that, no doubt, but you've got the dynamic of the cloud. And NetApp's in the storage business, so that stores data, I get that. What's the dynamic with the cloud? Because that seems to be the accelerant right now, with open source and with AI. >> Yeah, I think you've got to stay focused. The cloud is going to be playing an integral role in everything. And what we do at NetApp as a data steward, as George Kurian, our CEO, said, data is the currency of today, right? It's really fundamentally what drives business value, the data. But there's one slight attribute change that I'd like to add to that, and that's that it's a perishable commodity. It has a certain value at T-sub-zero, when you first get it. That's especially true when you're trying to do machine learning and you're trying to learn new events and new things, but it rapidly degrades and becomes less valuable. You still need to keep it because it's historical, and if we forget historical data, we're doomed to repeat mistakes. So you need to keep it and you have to be a good steward. And that's where we come into play with our technologies. Because we have a portfolio of different kinds of products and management capabilities that move the data where it needs to be, whether you're in the cloud, whether you're near the cloud, like in an Equinix colo, or even on prem. And the key attribute there, especially in automotive, is they want to keep the data forever, because of liability, because of intellectual property and privacy concerns. >> Hold on, one quick question on that. 'Cause I think you bring up a good point. The perishability's interesting, because realtime, we see this now, is the buzzword in the industry, but you're talking about something that's really important. The value of the data when you get it fast, in context, is super important. But then the historical piece, where you store it, also plays into the machine learning dynamics of how deep learning and machine learning have to use the historical perspective. So in a way it's perishable in the realtime piece, in the moment. If you're a self-driving car you want the data in milliseconds 'cause it's important, but then again, the historical data will come back into play. Is that kind of where you're getting at with that? >> Yeah, because the way that these systems operate, the paradigm is deep learning. You want them to learn the way a human learns, right?
The only reason we walk on our feet is 'cause we fell down a lot. But we remember falling down, we remember how we got up and walked. So if you don't have the historical context, you're just always falling down, right? So you have to have that to build up the proper machine learning neural network, the kind of connections you need to do the right things. And then as you get new data and varieties of data, and I'll stick with automotive here, it can almost be thought of as an intractable amount of data. Because most people will keep cars for a long time, measured in decades. The quality of the car is incredible now, and they're all just loaded with sensors, right? High definition cameras, radars, GPS tracking. And you want to make sure you get improvements there, because you have liability issues coming as well with these same technologies. >> Yeah, so we talked about the perishability of the data, that's a given. What is less perishable, it seems to me and to Wikibon, is what you derive from the data: the correlations, the patterns, the predictive models, the meat of machine learning and deep learning, AI in general, is less perishable in the sense that it has validity over time. What are your thoughts at NetApp about how those data-derived assets should be stored, and how they should be managed for backup and recovery and protected? To what extent do those requirements need to be reflected in your storage retention policies if you're an enterprise doing this? >> That's a great question. So I think what we find is that it starts with that first landing zone, and everybody talks about that being the cloud. And for me it's a cloudy day, although in New York today it's not. There are lots of clouds, and there are lots of other things that come with that data, like GDPR and privacy: what are you allowed to store, what are you allowed to keep? And how do you distinguish one from the other? That's one part. But then you're going to have to ETL it, you're going to have to transform that data. Because like everything, there's a lot of noise. And the noise is really fundamentally not that important. It's those anomalies within the stream of noise that you need to capture. And then use that as your training data, right? So that you learn from it. So there's a lot of processing, I think, that's going to have to happen in the cloud, regardless of which cloud, and it has to be kind of ubiquitous in every cloud. And then from there you decide, how am I going to curate the data and move it? And then how am I going to monetize the data? Because that's another part of the equation, what can I monetize? >> Well that's a question that we hear a lot on theCUBE. On day one we were ripping into some of the concepts that we see, and certainly we talk to enterprise customers, whether it's a CIO, CVO, chief data officer, chief security officer. There's huge application development going on in the enterprise right now. You see open source booming. A huge security practice is being built up, and then there's governance around the data. Overlay that with IOT, and it's kind of an architectural, I don't want to say reset, but a retrenching for a lot of enterprises. So the question I have for you guys, as a critical part of the infrastructure of storage, storage isn't going away, there's no doubt about that, but now the architecture's changing: how are you guys advising your customers? What's your position when you come in to a CXO and give a talk, and I say, hey, Gus, the house is on fire, we got so much going on.
Bottom line me, what's the architecture? What's best for me, but don't lose the headroom. I need to have some headroom to grow, that's where I see some machine learning, what do I do? >> I think you have to embrace the cloud, and that's one of the key attributes that NetApp brings to the table. Our core software, ONTAP, is in the cloud now. And for us, we want to make it very easy for our customers to be in the cloud, to be very protected in the cloud with encryption and protection of the data, and also to get the scale and all of the benefits of the cloud. But on top of that, we want to make it easy for them to move the data wherever they want it to be. So for us it's all about data mobility and the fact that we want to become that data steward, that data engine that helps them drive to where they get the best business value. >> So it's going to be on prem and in the cloud. 'Cause just for the record, you guys were, if not the earliest, one of the earliest in with AWS, when it wasn't fashionable. I interviewed you guys on that many years ago. >> And let me ask a related question. What is NetApp's position, or your personal thinking, on what data should be persisted closer to the edge in the new generation of IOT devices? So IOT edge devices, they do inference, they do actuation and sensing, but they also do persistence. Should any data be persisted there long term as part of your overall storage strategy, if you're an enterprise? >> It could be. The question is durability, and what's the impact if for some reason that edge is damaged or destroyed, or the data lost. So a lot of times when we start talking about open source, one of the key attributes we always have to take into account is data durability. And traditionally it's been done through replication. To me that's a very inefficient way to do it, but you have to protect the data. Because it's like if you've got 20 bucks in your wallet, you don't want to lose it, right? You might split it into two 10s, but you still have 20, right? You want that durability, and if it has that intrinsic value, you've got to take care of it and be a good steward. So if it's at the edge, it doesn't mean that's the only place it's going to be. It might be at the edge because you need it there. Maybe you need what I call reflexive actions. This is like with a car: you have deep learning and machine learning and vision and GPS tracking and all these things there, but the sensors themselves that are coming from Delphi and Bosch and ZF and all of these companies, they also have to have this capability of being what I call a reflex, right? The reason we can blink and not get a stone in our eye is not because the signal went to our cerebral cortex. It's because it went to the nerve stem, and that triggered the blink. >> Yeah, it's a cache. And you have to do the same thing in a lot of these environments. So autonomous vehicles is one. It could be using facial recognition for restricting access to a gate: all of a sudden this guy's on a blacklist, and you've stopped the gate. >> Before we get into some of the product questions I have for you, Hadoop in-place analytics, as well as some of the regulations around GDPR, to end the trend segment here, what are your thoughts on decentralization? You see a lot of decentralized apps coming out, you see blockchain getting a lot of traction. Obviously that's a tell sign, certainly in the headroom category of what may be coming down.
Not really on the agenda for most enterprises today, but it does kind of indicate that the wave is coming for a lot more decentralization on top of distributed computing and storage. So how do you look at that, as someone who's out on the cutting edge? >> For me it's just yet another industry trend where you have to embrace it. I'm constantly astonished at the people who try to push back against things that are coming, who think that they're going to stop the train that's going to run 'em over. And the key is, how can we make even those trends better and more reliable, and do the right thing for our customers? Because if we're the trusted advisor for our customers, regardless of whether or not I'm going to sell a lot of storage to them, I'm going to be the person they trust to give 'em good advice as things change, 'cause that's the one thing that's absolutely coming: change. And oftentimes when you lock yourself into these, quote, commodity approaches with a lot of internal storage, the counterpart is that you've also locked yourself in, probably for two to four years, to a technology that you can't be agile with. And this is one of the key attributes of the in-place analytics that we do with our ONTAP product. We also have our E-Series product, which has been around for six-plus years in this space and is the de facto performance leader in the space, even. And by decoupling that storage, in some cases still closely connected to the data node, and in other cases shared, like an NFS share, you get enormous benefits from an agility perspective. And that's the key. >> That kind of ties back to the blockchain thing as a tell sign, but you mentioned the in-place analytics. That decoupling gives you a lot more cohesiveness, if you will, in each area. But tying 'em together is critical. How do you guys do that? What's the key feature? Because that's compelling for someone who wants agility. Certainly DevOps and infrastructure as code, that's going mainstream, you're seeing that now. That's clearly cloud operations, whatever you want to call it, on prem, off prem. Cloud ops is here. This is a key part of it. What are the unique features that make it work so well? >> Well, some of the unique features we have, so if we look at our portfolio of products, I'll stick with the ONTAP product. One of the key things we have there is the ability to get incredible speed with our AFF product, but we can also dedupe it, clone it, and snapshot it, snapshotting it into, for example, NPS, or NetApp Private Storage, which sits in an Equinix colo. And now all of a sudden I can choose to go to Amazon, or I can go to Azure, I can go to Google, I can go to SoftLayer. It gives me options as a customer to use whoever has the best computational engine, versus being stuck in one place. I can now do what's right for my business. And I also have a DR strategy that's quite elegant. But there's one really unique attribute, too, and that's the cloning. A lot of my big customers have 1000-plus-node traditional Hadoop clusters, but it's nearly impossible for them to set up a test/dev environment with production data without enormous cost. But if I put it in my ONTAP, I can clone that. I can make hundreds of clones very efficiently.
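The reason hundreds of clones can be that cheap is copy-on-write: a clone shares its parent's blocks and copies a block only when it is first written. A toy sketch of the idea; this is not NetApp's actual on-disk format, just the mechanism that makes a clone O(1) to create:

```python
# Toy copy-on-write clone: blocks are shared until first write.
class Volume:
    def __init__(self, blocks=None, parent=None):
        self.blocks = blocks if blocks is not None else {}  # block_id -> data
        self.parent = parent

    def read(self, block_id):
        if block_id in self.blocks:
            return self.blocks[block_id]
        return self.parent.read(block_id) if self.parent else None

    def write(self, block_id, data):
        self.blocks[block_id] = data   # only now does the clone own this block

    def clone(self):
        return Volume(parent=self)     # O(1): no data copied at clone time

prod = Volume({0: "customer rows", 1: "index"})
dev = prod.clone()                     # instant test/dev copy
dev.write(1, "rebuilt index")          # diverges only where written
print(prod.read(1), "|", dev.read(1))  # index | rebuilt index
```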
>> And the Sandboxes are using true production data so that you don't have to worry about oh, I didn't have it in my test set, and now I have a bug. >> A lot of guys are losing budget because they just can't prove it and they can't get it working, it's too clunky. All right, cool, I want to get one more thing in before we run out of time. The role of machine learning we talked about, that's super important. Algorithms are going to be here, it's going to be a big part of it, but as you look at that policy, where the foundational policy governance thing is huge. So you're seeing GDPR, I want to get your comments on the impact of GDPR. But in addition to GDPR, there's going to be another Equifax coming, they're out there, right? It's inevitable. So as someone who's got code out there, writing algorithms, using machine learning, I don't want to rewrite my code based upon some new policy that might come in tomorrow. So GDPR is one we're seeing that you guys are heavily involved in. But there might be another policy I might want to change, but I don't want to rewrite my software. How should a CXO think about that dynamic? Not rewriting code if a new governance policy comes in, and then the GDPR's obvious. >> I don't think you can be so rigid to say that you don't want to rewrite code, but you want to build on what you have. So how can I expand what I already have as a product, let's say, to accommodate these changes? Because again, it's one of those trains. You're not going to stop it. So GDPR, again, it's one of these disruptive regulations that's coming out of EMEA. But what we forget is that it has far reaching implications even in the United States. Because of their ability to reach into basically the company's pocket and fine them for violations. >> So what's the impact of the Big Data system on GDPR? >> It can potentially be huge. The key attribute there is you have to start when you're building your data lakes, when you're building these things, you always have to make sure that you're taking into account anonymizing personal identifying information or obfuscating it in some way, but it's like with everything, you're only as strong as your weakest link. And this is again where NetApp plays a really powerful role because in our storage products, we actually can encrypt the data at rest, at wire speed. So it's part of that chain. So you have to make sure that all of the parts are doing that because if you have data at rest in a drive, let's say, that's inside your server, it doesn't take a lot to beat the heck out of it and find the data that's in there if it's not encrypted. >> Let me ask you a quick question before we wrap up. So how does NetApp incorporate ML or AI into these kinds of protections that you offer to customers? >> Well for us it's, again, we're only as successful as our customers are, and what NetApp does as a company, we'll just call us the data stewards, that's part of the puzzle, but we have to build a team to be successful. So when I travel around the world, the only reason a customer is successful is because they did it with a team. Nobody does it on an island, nobody does it by themself, although a lot of times they think they can. So it's not just us, it's our server vendors that work with us, it's the other layers that go on top of it, companies like Zaloni or BlueData and BlueTalon, people we've partnered with that are providing solutions to help drive this for our customers. >> Gus, great to have you on theCUBE. Looking forward to next week. 
I know you're super busy at NetApp Insight. I know you've got like five major talks you're doing, but if we can get some time, I think that'd be great. My final question, a personal one. We were talking about how you do search and rescue in Tahoe, in case there's an avalanche or a lost skier. A lot of enterprises feel lost right now. So you kind of come in a lot when the avalanche is coming, the waves or whatever are coming, so you've probably seen these situations. You don't need to name names, but talk about what should someone do if they're lost? You come in, you do a lot of consulting. What's the best advice you could give someone? A lot of CXOs and CEOs, their heads are spinning right now. There's so much on the table, so much to do, they've got to prioritize. >> It's a great question. And here's the one thing: don't try to boil the ocean. You've got to be hyper-focused. If you're not seeing a return on investment within 90 days of setting up your data lake, something's going wrong. Either the scope of what you're trying to do is too large, or you haven't identified the use case that will give you an immediate ROI. There should be no hesitation to going down this path, but you've got to do it in a manner where you're tackling the biggest problems that have the best hit value for you. Whether it's ETLing data into your plan-of-record systems or your enterprise data warehouses, you've got to get started, but you want to make sure you have measurable, tangible success within 90 days. And if you don't, you have to reset and say, okay, why is that not happening? Am I reinventing the wheel because my consultant said I have to write all this Sqoop and Flume code to get the data in? Or maybe I should have chosen as a partner another company that's done this 1000 times. It's not a science experiment. We've got to move away from science experiments to solving business problems. >> So on science experiments and boiling the ocean: don't try to overreach, build a foundational building block. >> The successful guys are the ones who are very disciplined, and they want to see results. >> Some call it baby steps, some call it building blocks, but ultimately the foundation right now is critical. >> Gus: Yeah. >> All right, Gus, thanks for coming on theCUBE. Great day, great to chat with you. Great conversation about machine learning's impact on organizations. theCUBE, bringing you the data here live in Manhattan. I'm John Furrier, with Jim Kobielus of Wikibon. More after this short break. We'll be right back. (digital music) (synthesizer music)
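A closing footnote on the anonymization step Gus raises in the GDPR discussion: a common approach is to pseudonymize identifying fields with a keyed hash before data lands in the lake, so records stay joinable without exposing raw identities. A minimal sketch; the field names and key handling are assumptions for illustration:

```python
# Pseudonymize PII fields with a keyed hash before landing data in the lake.
import hashlib
import hmac

SECRET_KEY = b"rotate-me-and-keep-in-a-vault"  # hypothetical key management
PII_FIELDS = {"name", "email", "phone"}        # assumed schema

def pseudonymize(record):
    out = {}
    for field, value in record.items():
        if field in PII_FIELDS:
            digest = hmac.new(SECRET_KEY, str(value).encode(), hashlib.sha256)
            out[field] = digest.hexdigest()[:16]  # same input -> same stable token
        else:
            out[field] = value
    return out

print(pseudonymize({"name": "Ada", "email": "ada@example.com", "ticket": 42}))
```

Keyed hashing is pseudonymization rather than full anonymization under GDPR, since whoever holds the key can re-identify records, which is exactly why the key belongs in a vault and not in the lake.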

Published Date : Sep 28 2017
