
Search Results for Nanda:

Nanda Vijaydev, HPE (BlueData) | CUBE Conversation, September 2019


 

>> From our studios in the heart of Silicon Valley, Palo Alto, California, this is a CUBE Conversation.
>> Hi, and welcome to theCUBE Studios for another CUBE Conversation, where we go in-depth with thought leaders driving innovation across the tech industry. I'm your host, Peter Burris. AI is at the forefront of every board in every enterprise on a global basis, as well as machine learning, deep learning, and other advanced technologies that are intended to turn data into business action that differentiates the business, leads to more revenue, and leads to more profitability. But the challenge is that all of these new use cases can't be addressed with the traditional workflows we've set up to address them. So as a consequence, we're going to need greater operationalization of how we translate business problems into ML and related technology solutions. Big challenge. We've got a great guest today to talk about it. Nanda Vijaydev is a distinguished technologist and lead data scientist at HPE in the BlueData team. Nanda, welcome to theCUBE.
>> Thank you, happy to be here.
>> So Nanda, let's start with this notion of a need for an architected approach to how we think about matching AI and ML technology to operations, so that we get more certain results, better outcomes, more understanding of where we're going, and how the technology is working within the business.
>> Absolutely. Doing AI in an enterprise is not new; there have been enterprise-grade tools in this space before, but most of them have a very prescribed way of doing things. Sometimes you use custom SQL to use that particular tool, or the way you present data to that tool requires some level of pre-processing, which makes you copy the data into the tool. So data fidelity is maybe already at risk, and you have data duplication happening. And then the scale: when you talk about doing AI at the scale that is required now, considering data is so big and there is such a variety of data sets, it can probably be done, but there is a huge cost associated with it, and you may still not cover the variety of use cases that you actually want to work on. So the problem now is to make sure that you empower the users who are working in this space and augment them with the right set of technologies, and the ability to bring data to them in a timely manner, so they can work on these solutions.
>> So it sounds as though what we're trying to do is simplify the process of taking great ideas and turning them into great outcomes. But you mentioned users, and I think it has to start there. We've always centered this on the data scientist, but as these solutions have started to become more popularized and diffused across the industry, a lot more people are engaging. Are all the roles being served as well as they need to be?
>> Absolutely, and I think that's the biggest challenge. In the past, when we talked about very prescribed solutions, end to end happened within those tools, so the different user personas were probably part of that particular solution. Also, the way these models came into production, which is really making them available to a consumer, was by recoding or redeveloping them in technologies that were production-friendly: you're rewriting it in SQL, you're recoding it in C. So there is a lot of detail that gets lost in translation. And the third big problem was really having visibility, or having a say, from a developer's or a data scientist's point of view, in how these things are performing in production.
How do you actually take that feedback back into deciding whether this model is still good, or how to retrain? When you look at this lifecycle holistically, it is an iterative process. It is no longer a workflow where you hand things off; this is not a waterfall methodology anymore. It is a very continuous and iterative process, especially with the new-age data science tools that are developing, where you build the model, the developer decides what the runtime is, and the runtimes are capable of serving those models as is. You don't have to recode, and you don't lose things in translation. So, back to your question of how you serve all the different roles: all those personas and all those roles have to be part of the same project and part of the same experiment; they're just serving different parts of the lifecycle. Whatever tooling, architecture, and technologies you provide have to look at it holistically. There has to be continuous development, there has to be collaboration, and there have to be central repositories that actually cater to those needs.
>> So the architected approach needs to serve each of the roles, but in a way that is collaborative and ultimately put in service to the outcome, driving the use of the technology forward. Well, that leads to another question: should this architected approach be tied to one or another set of algorithms, or one or another set of implementation infrastructure, or does it have to serve a wide array of technology types?
>> Great question. This is a living ecosystem; we can no longer build or plan for just the next two years or the next three years. Technologies are coming every day, because the types of use cases are evolving, and what you need to solve one use case is completely different from what you need for another. So whatever standards you come up with, the consistency has to be in how a user is onboarded into the system, in data access, in security, in how one provisions these environments. But as far as what tool is used, or how that tool is applied to a specific problem, there's a lot of variability there, and your architecture has to make sure that variability is addressed, because it is growing.
>> So HPE spends a lot of time with customers, and you're learning from your customer successes and turning that into tooling that leads to this type of operationalization. Give us some visibility into some of those successes that really stand out for you, that have been essential to how HPE has participated in this journey to create better tools for better AI and ML.
>> Absolutely. Traditionally with BlueData, HPE now, we've been exposed to a lot of big data processing technologies. In the current landscape the data is different: data is not always at rest, data is not structured, it could be a stream of data, it could be a picture. And in the use cases we talked about, it could be image recognition or voice recognition, where the type of data is very different. Back to how we've learned from our customers: in my role I talk to tens of customers on a daily or weekly basis, and each one of them is at a different level of maturity in their lifecycle.
These are some very established customers, but even within an organization there is a lot of variability among the various groups adopting these new-age technologies, so whatever we offer has to support all of those user groups. There are some who come from a classic R background, some who come from a Python background, some doing things in Scala, some doing things in Spark, and there are commercial tools they're using, like H2O Driverless AI or Dataiku. So what we have to look at across this lifecycle is making sure all of these communities are represented and addressed: if they build a model in a specific technology, how do we consume it, how do we take it in, and how do we deploy it? From an endpoint standpoint, it doesn't matter where a model gets built; it does matter how end users access it, how security is applied to it, and how scaling is applied to it. So a lot of consistency is required in the operationalization, and also in how you onboard those different tools: how do you make sure consistency, methodology, and standard practices are applied across this entire lifecycle? And also monitoring, that's a huge aspect. When you have deployed a model and it's in production, monitoring means two different things to people. One is: is it even available? When you go to a website and click on something, is the website available? Very similarly, when you go to an endpoint and you're scoring against a model, is that model available, do you have enough resources, and can it scale depending on how many requests come in? That's one aspect of monitoring. The second aspect is really how the model is performing: what is the accuracy, what is the drift, and when is it time to retrain? You no longer have the luxury of looking at these things in isolation. We want to make sure all of these things can be addressed, knowing that this iteration can sometimes be a month, sometimes a day, sometimes probably a few hours, and that is why it can no longer be isolated. Even from an infrastructure point of view, some of these workloads may need things like GPUs, and you may need them for a very short amount of time. How do you make sure you give what is needed for the duration required, then take it back and assign it to something else? Because these are very valuable resources.
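To make the two aspects of monitoring Nanda describes concrete — is the scoring endpoint even available, and is the model still performing — here is a minimal sketch in Python. The endpoint URL, the health route, the PSI-style drift score, and the retrain threshold are all assumptions added for illustration; they are not part of any specific HPE ML Ops API.

```python
# Hypothetical sketch: the endpoint URL, health route, thresholds, and the
# PSI-style drift score are illustrative assumptions, not an HPE ML Ops API.
import numpy as np
import requests

SCORING_URL = "https://models.example.com/churn/v3"   # assumed endpoint
PSI_RETRAIN_THRESHOLD = 0.2                            # assumed policy

def endpoint_available(url: str, timeout: float = 2.0) -> bool:
    """Aspect 1: is the model endpoint up and answering?"""
    try:
        return requests.get(url + "/health", timeout=timeout).status_code == 200
    except requests.RequestException:
        return False

def population_stability_index(baseline, live, bins: int = 10) -> float:
    """Aspect 2: how far live scores have drifted from the training baseline."""
    baseline, live = np.asarray(baseline), np.asarray(live)
    edges = np.linspace(min(baseline.min(), live.min()),
                        max(baseline.max(), live.max()), bins + 1)
    b_frac = np.histogram(baseline, edges)[0] / len(baseline)
    l_frac = np.histogram(live, edges)[0] / len(live)
    b_frac = np.clip(b_frac, 1e-6, None)
    l_frac = np.clip(l_frac, 1e-6, None)
    return float(np.sum((l_frac - b_frac) * np.log(l_frac / b_frac)))

def monitor(baseline_scores, live_scores) -> dict:
    psi = population_stability_index(baseline_scores, live_scores)
    return {
        "available": endpoint_available(SCORING_URL),
        "psi": psi,
        "retrain": psi > PSI_RETRAIN_THRESHOLD,   # feedback signal to retrain
    }
```

In practice a check like this would run on a schedule against recent scoring traffic, and a breach of the threshold would feed back to the data scientist as the retraining signal described above.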
>> So I want to build on, if I may, that notion of onboarding the tools. We're talking about use cases that enterprises are using today to create business value, and we're talking about HPE, as an example, delivering tooling that operationalizes how that's done today. But the reality is we're going to see the state of the art still evolve pretty dramatically over the next few years. How is HPE going about ensuring that your approach, and the approach you're working on with your customers, does not get balkanized, does not get sclerotic, and is capable of evolving and changing as folks learn new approaches to doing things?
>> Absolutely. This has to start with having an open architecture. There have to be standards, without which enterprises can't run, but at the same time those standards shouldn't be so constricting that they don't allow you to expand into newer use cases. So what HPE ML Ops offers is really making sure that you can do what you do today in a best-practice manner, in the most efficient manner, bringing time to value: making sure that there is instant provisioning and access to data, making sure that you don't duplicate data, compute-storage separation, containerization. These are some of the standard best-practice technologies that are out there, and making sure you adopt them is what sets users up to evolve with the later use cases. You can never have things frozen in time; you just want to make sure that you can evolve with different use cases and different tools as they come along.
>> Nanda, thanks very much, it's been a great conversation. We appreciate you being on theCUBE.
>> Thank you, Peter.
>> My guest has been Nanda Vijaydev, distinguished technologist and lead data scientist at HPE BlueData. And for all of you, thanks for joining us again for another CUBE Conversation. I'm Peter Burris, see you next time. [Music]

Published Date : Sep 5 2019

**Summary and Sentiment Analysis are not shown because of an improper transcript**

ENTITIES

Entity | Category | Confidence
September 2019 | DATE | 0.99+
Nanda Vijaydev | PERSON | 0.99+
Scala | TITLE | 0.99+
Python | TITLE | 0.99+
second aspect | QUANTITY | 0.99+
tens of customers | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
HPE | ORGANIZATION | 0.99+
Peter Burris | PERSON | 0.99+
Peter | PERSON | 0.98+
HP | ORGANIZATION | 0.98+
BlueData | ORGANIZATION | 0.97+
each | QUANTITY | 0.97+
two different use cases | QUANTITY | 0.97+
a day | QUANTITY | 0.97+
third big problem | QUANTITY | 0.97+
a month | QUANTITY | 0.96+
two different things | QUANTITY | 0.96+
each one | QUANTITY | 0.95+
two different roles | QUANTITY | 0.94+
one | QUANTITY | 0.94+
today | DATE | 0.92+
Palo Alto California | LOCATION | 0.92+
SPARC | TITLE | 0.91+
lot of details | QUANTITY | 0.87+
New Age | DATE | 0.83+
blue data | ORGANIZATION | 0.83+
lot | QUANTITY | 0.81+
ananda | PERSON | 0.76+
h2o | TITLE | 0.76+
lot more | QUANTITY | 0.74+
a few hours | QUANTITY | 0.72+
few years | DATE | 0.7+
next two years | DATE | 0.69+
daily | QUANTITY | 0.65+
years | QUANTITY | 0.62+
weekly | QUANTITY | 0.62+
next three | DATE | 0.61+
HPE | TITLE | 0.59+
time | QUANTITY | 0.55+
Division I | QUANTITY | 0.54+

John Morello, Twistlock & Nanda Kumar, Verizon Global Technology Services | KubeCon 2018


 

>> It's been great. >> Robert Herjavec. >> I mean, you guys are excited where you are, no? >> Dancing with the Stars, of course. >> His CUBE alumni. (techno music) Live from Seattle, Washington, it's theCUBE covering KubeCon and CloudNativeCon North America 2018 brought to you by Red Hat, the Cloud Native Computing Foundation, and its ecosystem partners. (crowd talking) >> And welcome back to our live coverage here in Seattle for KubeCon and CloudNativeCon 2018. I'm John Furrier, Stu Miniman, here for three days of wall to wall coverage, 8,000 people up from 4,000 last year. Growing Kubernetes and the Cloud Native ecosystem around KubeCon. Next two guests, John Morello, CTO of Twistlock, hot start-up to the news. And Nanda Kumar, who's a Fellow Systems engineer at Verizon's Global Technology Service. Guys, welcome to theCUBE. >> Thank you. Thanks for having us. >> Congratulations on your news and Kelsey wearing your shirt on theCUBE earlier. (they laugh) >> Thanks for having us. >> So take a minute to explain what you guys do, your story, you guys got to lot of hot things happening. Take a minute to talk about the company's value-- >> Yeah, sure, so we've been around for about four years now or going on four years. We're kind of the first company in this space that's really focused on cloud native cybersecurity. So, the idea is not just to take the existing capabilities that you've had on traditional systems and kind of retrofit them onto this new platform. But really to leverage the way that the cloud native space works, to be able to do security in a different and hopefully a more effective way. Cloud native has this notion of immutability and being able to take the same artifact from development to staging to production. And that enables us to do things in a security fashion that you really haven't been able to do in the past. Like actually be able to enforce security controls at the very beginning of the life cycle of the app. To be able to ensure consistency in your compliance posture all the way through production. And then as we learn things at runtime, to be able to signal that knowledge back to the developer, so they can actually improve the security application in the beginning. We basically have a platform that gives you those capabilities, vulnerability management, compliance, runtime defense, and firewalling across VMs, containers, and serverless across any clouds you have. We're not specific to any one cloud provider-- >> Is like telemetry coming back to the developer in real time? >> Yeah, basically as an example, when you have an application that's deployed, in the old world you as the developer would give the app to an operator, they would deploy it, and maybe weeks later, somebody would scan it, and they'd say you've got these vulnerabilities and then they have to go back and tell somebody to go and fix them. There's a lot of time where you're exposed, there's a lot of cost with that operation. The way that we're able to do it for the vulnerability case is as the developer builds the application, every build they do, Twistlock can scan that and see the vulnerabilities and actually enforce that as a quality gate and say if you've got critical vulnerabilities, you have to fix 'em before you progress. And then as you take that application and move that into test and staging and production, we create this dynamic runtime model that describes basically an implicit allow list of what's normal behaviors. 
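A minimal sketch of the "implicit allow list" idea John Morello is describing — learn what a workload normally does, then treat anything outside that model as an anomaly — follows. The event fields and the learning/enforcement split are invented for the example; this is not Twistlock's actual data model or API.

```python
# Minimal sketch of a learned runtime allowlist. The event fields and the
# learning/enforcement phases are assumptions for illustration only.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class RuntimeEvent:
    container: str      # e.g. "web-frontend"
    process: str        # e.g. "nginx"
    listen_port: int    # e.g. 80

@dataclass
class RuntimeModel:
    learning: bool = True
    allowed: set = field(default_factory=set)

    def observe(self, event: RuntimeEvent) -> bool:
        """Return True if the event is allowed, False if it is an anomaly."""
        key = (event.container, event.process, event.listen_port)
        if self.learning:
            # Learning phase: everything observed becomes part of the
            # implicit allow list -- no manual blacklist rules needed.
            self.allowed.add(key)
            return True
        return key in self.allowed

model = RuntimeModel()
model.observe(RuntimeEvent("web-frontend", "nginx", 80))      # learned as normal
model.learning = False                                        # switch to enforcement
ok = model.observe(RuntimeEvent("web-frontend", "nc", 4444))  # flagged: not in model
```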
So you don't have to tell us that my web server normally runs nginx and listens on port 80, we learn that automatically. We create this reference model where you can understand what's normal and then we automatically prevent anomalies. So unlike that traditional world of security where you had to have a whole bunch of manual rules that try to blacklist every thing that was bad, (John Furrier laughs) we just say, we learn what's good and only allow that. >> It's predictive and prescriptive in one. >> Yeah, exactly. >> What's the role here with Kubernetes, how do you fit into the Kubernetes standardization, momentum? >> For us, we've kind of pre-dated the rise of Kubernetes in some ways, and really supported Kubernetes from the very beginning when the project became popular. Our platform is designed to work as a cloud native app itself, so when you deploy Twistlock, you run the Twistlock console, our management service and API controller. All that's run just as a cloud native app. You deploy it as a replication controller. When you deploy Twistlock Defender, our agent, effectively containerized agents on all the nodes where you're running compute jobs, you run that as a DaemonSet. So for us, not only do we protect the platform, but we just are a part of the platform. There's nothing abnormal that you have to do. You deploy it and manage it like you would any other Kubernetes application. >> All right, Nanda, let's pull you into the conversation here. >> Sure. >> Verizon, obviously most people know, but explain what your group does, how cloud native fits into what you're doing. >> I'm part of the Global Technology Services organization. Verizon, as you probably know, is a mixed bag of different types of businesses brought together, wireless being the most prominent one that most of you know about. But we also have other solutions, like our Fios solutions, and recently with our acquisition of Yahoo, which is Oath, and so forth. Verizon is actually on a major transformation journey. Our transformation journey spans a five year program. We are in year number three of this transformation, and cloud native and cloud technology is a very foundational aspect for us as part of this transformation. I was just chatting with John earlier. An opportunity like this doesn't come that often, because we are at a perfect intersection where automation, Verizon's cloud migration, and these newly available cloud native technologies, whether it's Kubernetes, containers, and so forth, all come together. So there's the opportunity to migrate, and as you migrate, you're taking advantage of these technologies, and modernizing your application stack is a big win.
>> Help boil it down for us because, just in general, you say even domestically, I think it's like 40% of the U.S. population doesn't have access to broadband. Those of us at the conference here understand that wireless isn't always reliable. 5G silver bullet, everybody's going to have infinite bandwidth everywhere, right? >> Absolutely. (Stu laughs) And that's the valued proposition of the technology that it brings to the table. I know the spread of the technology is going to vary depending upon the commercialization of the product, the solution, and so forth. But the reality is in the new world that we live in, it is not just one piece of technology that's going to make it. It's going to be a mesh of the new technologies like 5G with a combination of WiFi and so forth. All of this coming together. It all comes down to fundamentally what are the use cases or what type of solutions are you going to go after and how it's going to make sense. >> How has cloud native in this transformation changed how you guys make investments? Obviously, the security equation's paramount. Central to the that, lot of data. How is the investments and how you guys are building out changed? Obviously you're looking at re-imagining operations, security, et cetera et cetera. How's that going to shape for you guys-- >> One of the things that Nanda and I were talking about earlier that not because of cloud native but it's enabled by cloud native. I think you look at almost all organizations today, and to reuse that phrase that Andreessen quoted about softwaring the world. It really is a true thing. Unlike in the past where IT had been this cost center that most organizations sought to strangle out and reduce as much as possible, I think most, at least modern companies that will be successful in the future, realize that that's part of their competitive advantage. It's not just about providing an app because your competitor has an app, it's about providing a better experience so that you're driving more revenue, having a better relationship, a longer term deeper relationship with that customer. Like we were talking about, in his case, if they build kind of a minimal application or minimal experience for their customers, their customers may choose to go to AT&T or whomever else if they can feel like hey, it's easier for me to work with them. I get better data, I can use my systems more easily. If you have that inflection point where people are having to really invest in building better software, better industry specific software, you need those tools of mass innovation to do that. And that's what cloud native really is. It's about being able to take and innovate and iterate on those innovations much more rapidly than you've been able to do in the past. And so it's really this confluence of those two trends that make this space as big as it is. That's why we have so many people here at KubeCon. >> Oh, you go faster too. The investment in apps, your applications, faster. And your talking about your security solution replaces the old way of hey, is there a problem, we'll patch it. >> It also has to get away from that approach where people took in the past where security was always this friction. It was this impediment, you know, you wanted to deploy something and you had to go through the security review and create all this rules and it was a hassle and slowed things down. If that's your approach to security, you're going to be at a fundamental conflict to this new approach. 
>> I think you'll be out of business personally, I think that ship has sailed, that's dead. We see the breaches every day, you see on all the dark webs who've been harvesting all that. IoT though is a different kind of animal. How are you guys looking at the IoT equation because that's a good use case for cloud? You can push now compute to the edge, you don't have to move data around. Certainly you guys are in the telecom business, you know what that means, so latency matters. How are you looking at the edge, IoT, and where does security fit into that? >> In terms of IoT, I think as you mentioned, there are going to be use cases where IoT's going to be very critical. There are two paradigms to the concept of the mobile edge compute. One is for the IoT use cases, the other could be even for like AR/VR is a good example. You want the compute to be so fast where you want responses immediately based on the location you are and so forth. So that's a very important foundation that we're working on and making that a reality for our organization to come use it. And of course any solution that we provide, security needs to be baked into it, because that's going to be foundation for how to-- >> Back to your 5G point, that's great back haul too for those devices. That one at least. If they want to send data back or interface with the edge, and power and compute, you need power and connectivity. >> Yep, exactly, very true. >> What's next, I guess? If you look forward, where's this journey going? How does this partnership help solve things? >> I think the key to any successful transformation is you got to take into consideration your current landscape. You certainly can have a broad vision of where the future is and so forth, but if you can't build the bridge between where we are and where we need to go, that's going to be a very challenging space so when you look at the cloud native technologies, we look at making it operational efficiency for us. In terms of how do we do our operations, like the earlier question we talked about, what is changing for us? Our operation's getting better. Our security portion is getting better because we're now shifting more of this to left. Which means as the workloads are being built and so forth. We're taking into consideration how it's going to run, where it's going to run and so forth. So that's going to create the savings and operational efficiency, which then allows us to take that and transform it into how do we focus on more modern technologies and modern solutions and so forth. >> Customer satisfaction. >> And customer satisfaction. >> Those are the top line business for every new model. >> So I got to ask, how is it going with Twistlock? Where's their role in your transformation? It's on the security side? >> Mm-hmm. >> Where do they play into your mix? >> So when we rolled out our solution for our Kubernetes platform, we certainly want to make sure that, to John's earlier point, where we can shift left and really look at security wholistically. And the only way you could do that is you need to capture the essence or integrate security as the project's being built. Because today we do have a security portion, but it's kind of where you have it during the development phase or during operations or doing it on time. You're not able to stitch it together. But with container and Kubernetes, you now have the advantage of really knowing what is end to end. 
And that is where our partnership with Twistlock has to be able to oversee that and provide that insight on what is running, where it's running, what levels exist, and how do we fix it. >> It kind of makes sense too. We've talked for years, the perimeter is dead. You guys are addressing security upfront at the application level where it's coding. This is working out for you guys well? >> Yep, and that's been a big shift in fact for why they've been successful with this transformation. Because we know have inside steward and everybody in the organization has a line off-site to what's going on, where things are running and so forth. It's been a great partnership. >> John, talk about this dynamic 'cause this is really kind of compelling because we've heard, "Oh, yeah, we're throwing everything "against the wall in security." And everyone always says, "Hey, the perimeter is dead "and you got to start with security in mind from day one." Well, I mean, what is day one? The minute you start coding, right? >> I get your overall point about the perimeter being dead. I would actually rephrase it a bit and say, "The perimeter being dissolved." And I think that's really a more probably accurate way to look at it. What used to be this very tightly defined like, we deploy things in this network or even VPC and it's got this control around it. Whereas a lot of customers today we see choosing an intentional multi-cloud strategy. They want to preserve the ability to have some leverage, not just with Amazon, but with Azure, or with Google, or whomever it may be on-premises. And when you have that model where you've got infrastructure and multiple regions, multiple different providers, you no longer have that very clean separation between what's yours and what's kind of out on the outside. And so one of the things that we really think is important is to be able to bring the perimeter to the application. So the way that we look at protecting the application is around the app itself, regardless of what the underlying compute platform is, the cloud, the region, it's really about protecting the app. You learn how those different microservices normally communicate with each other. You only allow that normal good communication unless you can really constrain a blast radius if you do have some kind of compromise in the future. And the minute you really try to mitigate that compromise is to again find those vulnerabilities as you develop the app, and prevent them in development before they ever get out to production. >> And that's a super smart approach, I love that. I think it's a winner, congratulations. Final question, what's the prediction for multi-cloud in 2019? Since you brought it up, multi-cloud seems to be the hot thing. What's your prediction 2019? It becomes a conversation? It becomes practice? >> I would say at this point, it already is practice in most organizations. And I would say that in 2019, you'll see that become something that's accepted not just as an option but as really the preferred, the better operational model. So you're able to choose technology platforms and operational approaches that are designed to work in a model in which you have multiple providers. Because you have a dependency layer that you can take now with Kubernetes and containers that's universal across those. Theoretically, you could have always taken a VM you put in ager and moved it to AWS, but it was really difficult and painful and hard to do that. 
If you do that well with Kubernetes, it's really pretty straightforward to deploy an application across multiple providers or multiple regions of the same provider even. And I think you'll see that become a more real thing in 2019 because it gives you as a company, or you as a customer, more leverage to be able to choose the services and negotiate the rates that you want with your provider. >> And if you move security to the app level like you guys are doing, you take away all that extra work around how to send policy and make it dynamic. >> Exactly. Our customers may have one Twistlock environment that manages things in Azure and AWS and GCP and on-premises and that's fine because we care about protecting the app not the interlying infrastructure. >> You agree? >> Absolutely, I think that's going to be the case even from our perspective. You're always going to look for where is the best place around these workloads and in a cost-effective way and secure manner. And as long as you're a single-controlled plane that you can manage it, I think the multi-cloud is going to be the ideal-- >> Make it easier to operate, standard language for developers, lock in security at the front end. >> That's right. >> Good stuff. Guys thanks for coming out. >> Sure. >> Appreciate the insight. Smart commentary here on security, cloud native, Kubernetes, I'll break it down here on theCUBE. I'm John Furrier, Stu Miniman, stay with us. More day one coverage of three days of live coverage here in Seattle for KubeCon and CloudNativeCon. We'll be right back. (upbeat music)
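Earlier in the conversation Morello described enforcing the image scan "as a quality gate" so builds with critical vulnerabilities never progress. A rough sketch of such a gate as a CI step follows; the report format and severity thresholds are assumptions, not the actual Twistlock CLI or output.

```python
# Hypothetical CI gate: fail the build if the image scan reports too many
# vulnerabilities. The report structure is assumed for illustration; consult
# your scanner's actual output format.
import json
import sys

MAX_ALLOWED = {"critical": 0, "high": 5}   # assumed policy thresholds

def gate(report_path: str) -> int:
    with open(report_path) as f:
        findings = json.load(f)            # e.g. [{"id": "...", "severity": "critical"}, ...]
    counts = {}
    for finding in findings:
        sev = finding.get("severity", "unknown").lower()
        counts[sev] = counts.get(sev, 0) + 1
    for sev, limit in MAX_ALLOWED.items():
        if counts.get(sev, 0) > limit:
            print(f"Build blocked: {counts[sev]} {sev} vulnerabilities (limit {limit})")
            return 1                        # non-zero exit fails the pipeline stage
    print("Scan passed policy, promoting image")
    return 0

if __name__ == "__main__":
    sys.exit(gate(sys.argv[1]))
```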

Published Date : Dec 12 2018

SUMMARY :

John Morello, CTO of Twistlock, and Nanda Kumar, Fellow Systems Engineer at Verizon Global Technology Services, join theCUBE at KubeCon + CloudNativeCon North America 2018. They discuss shifting security left into the build pipeline, learning runtime behavior models to block anomalies instead of maintaining blacklists, Verizon's five-year transformation and 5G plans built on cloud native technology, and why multi-cloud portability on Kubernetes makes application-level security the preferred operational model.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Verizon | ORGANIZATION | 0.99+
John | PERSON | 0.99+
Nanda Kumar | PERSON | 0.99+
Stu Miniman | PERSON | 0.99+
John Morello | PERSON | 0.99+
Robert Herjavec | PERSON | 0.99+
Yahoo | ORGANIZATION | 0.99+
John Furrier | PERSON | 0.99+
2019 | DATE | 0.99+
AT&T | ORGANIZATION | 0.99+
Cloud Native Computing Foundation | ORGANIZATION | 0.99+
Seattle | LOCATION | 0.99+
Amazon | ORGANIZATION | 0.99+
Andreessen | PERSON | 0.99+
Kelsey | PERSON | 0.99+
Nanda | PERSON | 0.99+
Twistlock | PERSON | 0.99+
Red Hat | ORGANIZATION | 0.99+
John Morello | PERSON | 0.99+
40% | QUANTITY | 0.99+
AWS | ORGANIZATION | 0.99+
four years | QUANTITY | 0.99+
KubeCon | EVENT | 0.99+
Verizon Global Technology Services | ORGANIZATION | 0.99+
Google | ORGANIZATION | 0.99+
4,000 | QUANTITY | 0.99+
CUBE | ORGANIZATION | 0.99+
three days | QUANTITY | 0.99+
Twistlock | ORGANIZATION | 0.99+
CloudNativeCon | EVENT | 0.99+
8,000 people | QUANTITY | 0.98+
last year | DATE | 0.98+
two trends | QUANTITY | 0.98+
two paradigms | QUANTITY | 0.98+
Twistlock | TITLE | 0.98+
two guests | QUANTITY | 0.98+
today | DATE | 0.98+
Seattle, Washington | LOCATION | 0.98+
one piece | QUANTITY | 0.98+
One | QUANTITY | 0.98+
CloudNativeCon North America 2018 | EVENT | 0.98+
Cloud Native | ORGANIZATION | 0.97+
CloudNativeCon 2018 | EVENT | 0.97+
Kubernetes | TITLE | 0.96+
Dancing with the Stars | TITLE | 0.96+
one | QUANTITY | 0.95+
single | QUANTITY | 0.94+
weeks later | DATE | 0.93+
about four years | QUANTITY | 0.92+
Global Technology Services | ORGANIZATION | 0.89+
KubeCon 2018 | EVENT | 0.89+
Global Technology Service | ORGANIZATION | 0.88+
CTO | PERSON | 0.87+
first company | QUANTITY | 0.86+
U.S. | LOCATION | 0.86+
year number three | QUANTITY | 0.84+
day one | QUANTITY | 0.8+
five year | QUANTITY | 0.77+
More day one | QUANTITY | 0.76+
years | QUANTITY | 0.73+
Azure | TITLE | 0.63+
Gen X | OTHER | 0.63+

Breaking Analysis: How JPMC is Implementing a Data Mesh Architecture on the AWS Cloud


 

>> From theCUBE studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is braking analysis with Dave Vellante. >> A new era of data is upon us, and we're in a state of transition. You know, even our language reflects that. We rarely use the phrase big data anymore, rather we talk about digital transformation or digital business, or data-driven companies. Many have come to the realization that data is a not the new oil, because unlike oil, the same data can be used over and over for different purposes. We still use terms like data as an asset. However, that same narrative, when it's put forth by the vendor and practitioner communities, includes further discussions about democratizing and sharing data. Let me ask you this, when was the last time you wanted to share your financial assets with your coworkers or your partners or your customers? Hello everyone, and welcome to this week's Wikibon Cube Insights powered by ETR. In this breaking analysis, we want to share our assessment of the state of the data business. We'll do so by looking at the data mesh concept and how a leading financial institution, JP Morgan Chase is practically applying these relatively new ideas to transform its data architecture. Let's start by looking at what is the data mesh. As we've previously reported many times, data mesh is a concept and set of principles that was introduced in 2018 by Zhamak Deghani who's director of technology at ThoughtWorks, it's a global consultancy and software development company. And she created this movement because her clients, who were some of the leading firms in the world had invested heavily in predominantly monolithic data architectures that had failed to deliver desired outcomes in ROI. So her work went deep into trying to understand that problem. And her main conclusion that came out of this effort was the world of data is distributed and shoving all the data into a single monolithic architecture is an approach that fundamentally limits agility and scale. Now a profound concept of data mesh is the idea that data architectures should be organized around business lines with domain context. That the highly technical and hyper specialized roles of a centralized cross functional team are a key blocker to achieving our data aspirations. This is the first of four high level principles of data mesh. So first again, that the business domain should own the data end-to-end, rather than have it go through a centralized big data technical team. Second, a self-service platform is fundamental to a successful architectural approach where data is discoverable and shareable across an organization and an ecosystem. Third, product thinking is central to the idea of data mesh. In other words, data products will power the next era of data success. And fourth data products must be built with governance and compliance that is automated and federated. Now there's lot more to this concept and there are tons of resources on the web to learn more, including an entire community that is formed around data mesh. But this should give you a basic idea. Now, the other point is that, in observing Zhamak Deghani's work, she is deliberately avoided discussions around specific tooling, which I think has frustrated some folks because we all like to have references that tie to products and tools and companies. So this has been a two-edged sword in that, on the one hand it's good, because data mesh is designed to be tool agnostic and technology agnostic. 
On the other hand, it's led some folks to take liberties with the term data mesh and claim mission accomplished when their solution, you know, maybe more marketing than reality. So let's look at JP Morgan Chase in their data mesh journey. Is why I got really excited when I saw this past week, a team from JPMC held a meet up to discuss what they called, data lake strategy via data mesh architecture. I saw that title, I thought, well, that's a weird title. And I wondered, are they just taking their legacy data lakes and claiming they're now transformed into a data mesh? But in listening to the presentation, which was over an hour long, the answer is a definitive no, not at all in my opinion. A gentleman named Scott Hollerman organized the session that comprised these three speakers here, James Reid, who's a divisional CIO at JPMC, Arup Nanda who is a technologist and architect and Serita Bakst who is an information architect, again, all from JPMC. This was the most detailed and practical discussion that I've seen to date about implementing a data mesh. And this is JP Morgan's their approach, and we know they're extremely savvy and technically sound. And they've invested, it has to be billions in the past decade on data architecture across their massive company. And rather than dwell on the downsides of their big data past, I was really pleased to see how they're evolving their approach and embracing new thinking around data mesh. So today, we're going to share some of the slides that they use and comment on how it dovetails into the concept of data mesh that Zhamak Deghani has been promoting, and at least as we understand it. And dig a bit into some of the tooling that is being used by JP Morgan, particularly around it's AWS cloud. So the first point is it's all about business value, JPMC, they're in the money business, and in that world, business value is everything. So Jr Reid, the CIO showed this slide and talked about their overall goals, which centered on a cloud first strategy to modernize the JPMC platform. I think it's simple and sensible, but there's three factors on which he focused, cut costs always short, you got to do that. Number two was about unlocking new opportunities, or accelerating time to value. But I was really happy to see number three, data reuse. That's a fundamental value ingredient in the slide that he's presenting here. And his commentary was all about aligning with the domains and maximizing data reuse, i.e. data is not like oil and making sure there's appropriate governance around that. Now don't get caught up in the term data lake, I think it's just how JP Morgan communicates internally. It's invested in the data lake concept, so they use water analogies. They use things like data puddles, for example, which are single project data marts or data ponds, which comprise multiple data puddles. And these can feed in to data lakes. And as we'll see, JPMC doesn't strive to have a single version of the truth from a data standpoint that resides in a monolithic data lake, rather it enables the business lines to create and own their own data lakes that comprise fit for purpose data products. And they do have a single truth of metadata. Okay, we'll get to that. But generally speaking, each of the domains will own end-to-end their own data and be responsible for those data products, we'll talk about that more. 
Now the genesis of this was sort of a cloud first platform, JPMC is leaning into public cloud, which is ironic since the early days, in the early days of cloud, all the financial institutions were like never. Anyway, JPMC is going hard after it, they're adopting agile methods and microservices architectures, and it sees cloud as a fundamental enabler, but it recognizes that on-prem data must be part of the data mesh equation. Here's a slide that starts to get into some of that generic tooling, and then we'll go deeper. And I want to make a couple of points here that tie back to Zhamak Deghani's original concept. The first is that unlike many data architectures, this puts data as products right in the fat middle of the chart. The data products live in the business domains and are at the heart of the architecture. The databases, the Hadoop clusters, the files and APIs on the left-hand side, they serve the data product builders. The specialized roles on the right hand side, the DBA's, the data engineers, the data scientists, the data analysts, we could have put in quality engineers, et cetera, they serve the data products. Because the data products are owned by the business, they inherently have the context that is the middle of this diagram. And you can see at the bottom of the slide, the key principles include domain thinking, an end-to-end ownership of the data products. They build it, they own it, they run it, they manage it. At the same time, the goal is to democratize data with a self-service as a platform. One of the biggest points of contention of data mesh is governance. And as Serita Bakst said on the Meetup, metadata is your friend, and she kind of made a joke, she said, "This sounds kind of geeky, but it's important to have a metadata catalog to understand where data resides and the data lineage in overall change management. So to me, this really past the data mesh stink test pretty well. Let's look at data as products. CIO Reid said the most difficult thing for JPMC was getting their heads around data product, and they spent a lot of time getting this concept to work. Here's the slide they use to describe their data products as it related to their specific industry. They set a common language and taxonomy is very important, and you can imagine how difficult that was. He said, for example, it took a lot of discussion and debate to define what a transaction was. But you can see at a high level, these three product groups around wholesale, credit risk, party, and trade and position data as products, and each of these can have sub products, like, party, we'll have to know your customer, KYC for example. So a key for JPMC was to start at a high level and iterate to get more granular over time. So lots of decisions had to be made around who owns the products and the sub-products. The product owners interestingly had to defend why that product should even exist, what boundaries should be in place and what data sets do and don't belong in the various products. And this was a collaborative discussion, I'm sure there was contention around that between the lines of business. And which sub products should be part of these circles? They didn't say this, but tying it back to data mesh, each of these products, whether in a data lake or a data hub or a data pond or data warehouse, data puddle, each of these is a node in the global data mesh that is discoverable and governed. 
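One way to read "a node in the global data mesh that is discoverable and governed" is that every data product carries a small descriptive contract — owning domain, accountable owner, physical location, schema reference — that a central metadata catalog can index. The sketch below is a generic illustration of that idea; the field names are assumptions, not JPMC's registration schema.

```python
# Illustrative only: a generic "data product" descriptor that a metadata
# catalog could index so products stay discoverable and governed. Field
# names are assumptions, not JPMC's actual registration schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class DataProduct:
    name: str                      # e.g. "party.kyc"
    domain: str                    # owning line of business
    owner: str                     # accountable product owner
    location: str                  # where the data physically lives
    schema_ref: str                # pointer into a schema registry
    classification: str = "internal"
    consumers: List[str] = field(default_factory=list)

catalog: dict = {}                 # stand-in for the federated metadata catalog

def register(product: DataProduct) -> None:
    """Make the product discoverable; governance tooling reads the same entry."""
    catalog[product.name] = product

register(DataProduct(
    name="trade.position",
    domain="wholesale",
    owner="trade-data-product-team",
    location="s3://wholesale-trade-lake/refined/positions/",
    schema_ref="registry://trade/position/v4",
))
```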
And supporting this notion, Serita said that, "This should not be infrastructure-bound, logically, any of these data products, whether on-prem or in the cloud can connect via the data mesh." So again, I felt like this really stayed true to the data mesh concept. Well, let's look at some of the key technical considerations that JPM discussed in quite some detail. This chart here shows a diagram of how JP Morgan thinks about the problem, and some of the challenges they had to consider were how to write to various data stores, can you and how can you move data from one data store to another? How can data be transformed? Where's the data located? Can the data be trusted? How can it be easily accessed? Who has the right to access that data? These are all problems that technology can help solve. And to address these issues, Arup Nanda explained that the heart of this slide is the data in ingestor instead of ETL. All data producers and contributors, they send their data to the ingestor and the ingestor then registers the data so it's in the data catalog. It does a data quality check and it tracks the lineage. Then, data is sent to the router, which persists the data in the data store based on the best destination as informed by the registration. This is designed to be a flexible system. In other words, the data store for a data product is not fixed, it's determined at the point of inventory, and that allows changes to be easily made in one place. The router simply reads that optimal location and sends it to the appropriate data store. Nowadays you see the schema infer there is used when there is no clear schema on right. In this case, the data product is not allowed to be consumed until the schema is inferred, and then the data goes into a raw area, and the inferer determines the schema and then updates the inventory system so that the data can be routed to the proper location and properly tracked. So that's some of the detail of how the sausage factory works in this particular use case, it was very interesting and informative. Now let's take a look at the specific implementation on AWS and dig into some of the tooling. As described in some detail by Arup Nanda, this diagram shows the reference architecture used by this group within JP Morgan, and it shows all the various AWS services and components that support their data mesh approach. So start with the authorization block right there underneath Kinesis. The lake formation is the single point of entitlement and has a number of buckets including, you can see there the raw area that we just talked about, a trusted bucket, a refined bucket, et cetera. Depending on the data characteristics at the data catalog registration block where you see the glue catalog, that determines in which bucket the router puts the data. And you can see the many AWS services in use here, identity, the EMR, the elastic MapReduce cluster from the legacy Hadoop work done over the years, the Redshift Spectrum and Athena, JPMC uses Athena for single threaded workloads and Redshift Spectrum for nested types so they can be queried independent of each other. Now remember very importantly, in this use case, there is not a single lake formation, rather than multiple lines of business will be authorized to create their own lakes, and that creates a challenge. So how can that be done in a flexible and automated manner? And that's where the data mesh comes into play. 
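Before getting to how those lake accounts federate, here is a rough Python rendering of the ingestion flow Arup Nanda outlined above: register the product, run a quality check, record lineage, infer the schema when none is declared on write, and let the router persist the data wherever the registration says it belongs. Every name and structure here is an assumption used to show the flow, not JPMC code.

```python
# Illustrative flow only: names and structures are assumptions, not JPMC code.
import json

inventory = {   # "registration" -> preferred destination store and schema
    "trade.position": {"store": "s3://refined/trade/positions/", "schema": "v4"},
}

def infer_schema(records: list) -> dict:
    """No schema declared on write: derive one so the product can be routed."""
    sample = records[0]
    return {key: type(value).__name__ for key, value in sample.items()}

def route(entry: dict, records: list, lineage: dict) -> str:
    """Router reads the optimal location from the registration and persists there."""
    destination = entry["store"]
    payload = json.dumps({"records": records, "lineage": lineage})
    # a real implementation would write `payload` to `destination` here
    return destination

def ingest(product: str, records: list) -> str:
    entry = inventory.setdefault(product, {"store": "s3://raw/" + product, "schema": None})
    if entry["schema"] is None:
        entry["schema"] = infer_schema(records)           # update the inventory system
    assert all(r for r in records), "quality check failed"  # stand-in quality gate
    lineage = {"product": product, "count": len(records)}    # stand-in lineage record
    return route(entry, records, lineage)

where = ingest("party.kyc", [{"party_id": 42, "kyc_status": "approved"}])
```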
So JPMC came up with this federated lake formation accounts idea, and each line of business can create as many data producer or consumer accounts as they desire and roll them up into their master line of business lake formation account. And they cross-connect these data products in a federated model. And these all roll up into a master glue catalog so that any authorized user can find out where a specific data element is located. So this is like a super set catalog that comprises multiple sources and syncs up across the data mesh. So again to me, this was a very well thought out and practical application of database. Yes, it includes some notion of centralized management, but much of that responsibility has been passed down to the lines of business. It does roll up to a master catalog, but that's a metadata management effort that seems compulsory to ensure federated and automated governance. As well at JPMC, the office of the chief data officer is responsible for ensuring governance and compliance throughout the federation. All right, so let's take a look at some of the suspects in this world of data mesh and bring in the ETR data. Now, of course, ETR doesn't have a data mesh category, there's no such thing as that data mesh vendor, you build a data mesh, you don't buy it. So, what we did is we use the ETR dataset to select and filter on some of the culprits that we thought might contribute to the data mesh to see how they're performing. This chart depicts a popular view that we often like to share. It's a two dimensional graphic with net score or spending momentum on the vertical axis and market share or pervasiveness in the data set on the horizontal axis. And we filtered the data on sectors such as analytics, data warehouse, and the adjacencies to things that might fit into data mesh. And we think that these pretty well reflect participation that data mesh is certainly not all compassing. And it's a subset obviously, of all the vendors who could play in the space. Let's make a few observations. Now as is often the case, Azure and AWS, they're almost literally off the charts with very high spending velocity and large presence in the market. Oracle you can see also stands out because much of the world's data lives inside of Oracle databases. It doesn't have the spending momentum or growth, but the company remains prominent. And you can see Google Cloud doesn't have nearly the presence in the dataset, but it's momentum is highly elevated. Remember that red dotted line there, that 40% line, anything over that indicates elevated spending momentum. Let's go to Snowflake. Snowflake is consistently shown to be the gold standard in net score in the ETR dataset. It continues to maintain highly elevated spending velocity in the data. And in many ways, Snowflake with its data marketplace and its data cloud vision and data sharing approach, fit nicely into the data mesh concept. Now, a caution, Snowflake has used the term data mesh in it's marketing, but in our view, it lacks clarity, and we feel like they're still trying to figure out how to communicate what that really is. But is really, we think a lot of potential there to that vision. Databricks is also interesting because the firm has momentum and we expect further elevated levels in the vertical axis in upcoming surveys, especially as it readies for its IPO. The firm has a strong product and managed service, and is really one to watch. 
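Going back to the master Glue catalog for a moment: because every line-of-business lake registers its tables upward, an authorized consumer can locate a data element and query it in place. A rough boto3 sketch of that discover-then-query step is below; the database, table, and bucket names are invented, and in a real deployment Lake Formation grants would govern what the caller can actually see.

```python
# Sketch only: database, table, and bucket names are invented. Access is
# assumed to be governed by Lake Formation grants in a real deployment.
import boto3

glue = boto3.client("glue")
athena = boto3.client("athena")

def find_tables(database: str, keyword: str):
    """Discover where a data element lives by searching the (master) Glue catalog."""
    tables = glue.get_tables(DatabaseName=database)["TableList"]
    return [t["Name"] for t in tables if keyword in t["Name"]]

def query(database: str, sql: str, results_bucket: str) -> str:
    """Athena queries the data in place, the pattern noted for lighter workloads."""
    resp = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": f"s3://{results_bucket}/athena/"},
    )
    return resp["QueryExecutionId"]

tables = find_tables("wholesale_lob_catalog", "position")
qid = query("wholesale_lob_catalog",
            "SELECT book, SUM(notional) FROM trade_position GROUP BY book",
            "wholesale-analytics-results")
```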
Now we included a number of other database companies for obvious reasons like Redis and Mongo, MariaDB, Couchbase and Terradata. SAP as well is in there, but that's not all database, but SAP is prominent so we included them. As is IBM more of a database, traditional database player also with the big presence. Cloudera includes Hortonworks and HPE Ezmeral comprises the MapR business that HPE acquired. So these guys got the big data movement started, between Cloudera, Hortonworks which is born out of Yahoo, which was the early big data, sorry early Hadoop innovator, kind of MapR when it's kind of owned course, and now that's all kind of come together in various forms. And of course, we've got Talend and Informatica are there, they are two data integration companies that are worth noting. We also included some of the AI and ML specialists and data science players in the mix like DataRobot who just did a monster $250 million round. Dataiku, H2O.ai and ThoughtSpot, which is all about democratizing data and injecting AI, and I think fits well into the data mesh concept. And you know we put VMware Cloud in there for reference because it really is the predominant on-prem infrastructure platform. All right, let's wrap with some final thoughts here, first, thanks a lot to the JP Morgan team for sharing this data. I really want to encourage practitioners and technologists, go to watch the YouTube of that meetup, we'll include it in the link of this session. And thank you to Zhamak Deghani and the entire data mesh community for the outstanding work that you're doing, challenging the established conventions of monolithic data architectures. The JPM presentation, it gives you real credibility, it takes Data Mesh well beyond concept, it demonstrates how it can be and is being done. And you know, this is not a perfect world, you're going to start somewhere and there's going to be some failures, the key is to recognize that shoving everything into a monolithic data architecture won't support massive scale and agility that you're after. It's maybe fine for smaller use cases in smaller firms, but if you're building a global platform in a data business, it's time to rethink data architecture. Now much of this is enabled by the cloud, but cloud first doesn't mean cloud only, doesn't mean you'll leave your on-prem data behind, on the contrary, you have to include non-public cloud data in your Data Mesh vision just as JPMC has done. You've got to get some quick wins, that's crucial so you can gain credibility within the organization and grow. And one of the key takeaways from the JP Morgan team is, there is a place for dogma, like organizing around data products and domains and getting that right. On the other hand, you have to remain flexible because technologies is going to come, technology is going to go, so you got to be flexible in that regard. And look, if you're going to embrace the metaphor of water like puddles and ponds and lakes, we suggest maybe a little tongue in cheek, but still we believe in this, that you expand your scope to include data ocean, something John Furry and I have talked about and laughed about extensively in theCUBE. Data oceans, it's huge. It's the new data lake, go transcend data lake, think oceans. And think about this, just as we're evolving our language, we should be evolving our metrics. Much the last the decade of big data was around just getting the stuff to work, getting it up and running, standing up infrastructure and managing massive, how much data you got? 
Massive amounts of data. And there were many KPIs built around, again, standing up that infrastructure, ingesting data, a lot of technical KPIs. This decade is not just about enabling better insights, it's a more than that. Data mesh points us to a new era of data value, and that requires the new metrics around monetizing data products, like how long does it take to go from data product conception to monetization? And how does that compare to what it is today? And what is the time to quality if the business owns the data, and the business has the context? the quality that comes out of them, out of the shoot should be at a basic level, pretty good, and at a higher mark than out of a big data team with no business context. Automation, AI, and very importantly, organizational restructuring of our data teams will heavily contribute to success in the coming years. So we encourage you, learn, lean in and create your data future. Okay, that's it for now, remember these episodes, they're all available as podcasts wherever you listen, all you got to do is search, breaking analysis podcast, and please subscribe. Check out ETR's website at etr.plus for all the data and all the survey information. We publish a full report every week on wikibon.com and siliconangle.com. And you can get in touch with us, email me david.vellante@siliconangle.com, you can DM me @dvellante, or you can comment on my LinkedIn posts. This is Dave Vellante for theCUBE insights powered by ETR. Have a great week everybody, stay safe, be well, and we'll see you next time. (upbeat music)
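For readers who want to reproduce the layout of the chart described above — net score on the vertical axis, market share on the horizontal, with the dotted 40% elevation line — here is a generic matplotlib sketch. The vendor names and values are placeholders, not ETR survey data.

```python
# Placeholder numbers only -- not ETR data. Reproduces the layout described:
# net score (vertical) vs. market share (horizontal), with the 40% line.
import matplotlib.pyplot as plt

vendors = {            # name: (market_share, net_score) -- illustrative values
    "Vendor A": (0.30, 0.55),
    "Vendor B": (0.25, 0.48),
    "Vendor C": (0.08, 0.72),
    "Vendor D": (0.15, 0.22),
}

fig, ax = plt.subplots(figsize=(7, 5))
for name, (share, score) in vendors.items():
    ax.scatter(share, score)
    ax.annotate(name, (share, score), textcoords="offset points", xytext=(5, 5))

ax.axhline(0.40, linestyle=":", color="red")   # the 40% elevated-momentum line
ax.set_xlabel("Market share (pervasiveness in data set)")
ax.set_ylabel("Net score (spending momentum)")
ax.set_title("Net score vs. market share (illustrative)")
plt.show()
```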

Published Date : Jul 12 2021

SUMMARY :

This is Breaking Analysis and the adjacencies to things

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
JPMC | ORGANIZATION | 0.99+
Dave Vellante | PERSON | 0.99+
2018 | DATE | 0.99+
Zhamak Deghani | PERSON | 0.99+
James Reid | PERSON | 0.99+
JP Morgan | ORGANIZATION | 0.99+
JP Morgan | ORGANIZATION | 0.99+
Cloudera | ORGANIZATION | 0.99+
Serita Bakst | PERSON | 0.99+
IBM | ORGANIZATION | 0.99+
HPE | ORGANIZATION | 0.99+
AWS | ORGANIZATION | 0.99+
Scott Hollerman | PERSON | 0.99+
Hortonworks | ORGANIZATION | 0.99+
Boston | LOCATION | 0.99+
40% | QUANTITY | 0.99+
JP Morgan Chase | ORGANIZATION | 0.99+
Serita | PERSON | 0.99+
Yahoo | ORGANIZATION | 0.99+
Arup Nanda | PERSON | 0.99+
each | QUANTITY | 0.99+
ThoughtWorks | ORGANIZATION | 0.99+
first | QUANTITY | 0.99+
Oracle | ORGANIZATION | 0.99+
Palo Alto | LOCATION | 0.99+
david.vellante@siliconangle.com | OTHER | 0.99+
each line | QUANTITY | 0.99+
Terradata | ORGANIZATION | 0.99+
Redis | ORGANIZATION | 0.99+
$250 million | QUANTITY | 0.99+
first point | QUANTITY | 0.99+
three factors | QUANTITY | 0.99+
Second | QUANTITY | 0.99+
MapR | ORGANIZATION | 0.99+
today | DATE | 0.99+
Informatica | ORGANIZATION | 0.99+
Talend | ORGANIZATION | 0.99+
John Furry | PERSON | 0.99+
Zhamak Deghani | PERSON | 0.99+
first platform | QUANTITY | 0.98+
YouTube | ORGANIZATION | 0.98+
fourth | QUANTITY | 0.98+
single | QUANTITY | 0.98+
One | QUANTITY | 0.98+
Third | QUANTITY | 0.97+
Couchbase | ORGANIZATION | 0.97+
three speakers | QUANTITY | 0.97+
two data | QUANTITY | 0.97+
first strategy | QUANTITY | 0.96+
one | QUANTITY | 0.96+
one place | QUANTITY | 0.96+
Jr Reid | PERSON | 0.96+
single lake | QUANTITY | 0.95+
SAP | ORGANIZATION | 0.95+
wikibon.com | OTHER | 0.95+
siliconangle.com | OTHER | 0.94+
Azure | ORGANIZATION | 0.93+

Kaustubh Das, Cisco & Laura Crone, Intel | Cisco Live US 2019


 

>> Live from San Diego, California, it's theCUBE, covering Cisco Live US 2019, brought to you by Cisco and its ecosystem partners. >> Welcome back, it's theCUBE here at Cisco Live, San Diego 2019. I'm Stu Miniman, and my co-host is Dave Vellante. First, I want to welcome back Kaustubh Das, KD, who is the vice president of product management with Cisco Compute; we talked with him a lot about HyperFlex Anywhere in Barcelona. And I want to welcome to the program first-time guest Laura Crone, who's a vice president in the NSG sales and marketing group at Intel. Laura, thanks so much for joining us. All right, so since KD has been on our program before, let's start with you. We've watched Cisco UCS and that compute business since it rolled out about a decade ago, and Intel is always up on stage with Cisco talking about the latest enhancements. Everywhere I go this year, people are talking about Optane and how technologies like NVMe are baking into the environment, with storage class memory coming. So let's start with Intel: what's happening in your world and with your activities at Cisco Live? >> Great. So I'm glad to hear you've heard a lot about Optane, because I have some marketing in my organization. So Optane is the first new memory architecture in over 25 years, and it is different than NAND, right? You can write data to the silicon, it programs faster, and it has greater endurance. So when you think of Optane, it's fast like DRAM, but it's persistent, like NAND, like 3D NAND. And it has some industry-leading combinations of capabilities, such as high throughput, high endurance, high quality of service and low latency. And for a storage device, what could be better than having fast performance and high consistency? >> Laura, as you say, it's been 25 years since a move like this. I remember when I started working with Dave, it was, how do we get out of the horrible SCSI stack we had lived on for decades? And finally, now it feels like we're coming through the clearing, and there is just going to be wave after wave of new technologies that are freed up to give us high performance, low latency and the like. >> Yeah, and I think the other big part of that, which is part of Cisco's HyperFlex All NVMe, is the NVMe standard. You know, we've lived in a world of legacy SATA controllers, which created a lot of bottlenecks in performance. Now that the industry is moving to NVMe, that opens it up even more. And so, as we were developing Optane, we knew we had to go move the industry to a new protocol; otherwise, that pairing was not going to be very successful. >> All right, so KD, all NVMe, tell us more. >> So we come here and we talk about all the cool innovations we do within the company, and then sometimes we come here and talk about all the cool innovation we do with our partners, our technology partners, and Intel is a fantastic technology partner; obviously, being in the server business, you've got to partner with Intel, and we've really worked across the walls of the two organizations to bring this stuff to life, right? So Cisco HyperFlex is one of the products we've talked about in the past. HyperFlex All NVMe uses Intel's Optane technology as well as Intel's 3D NAND all-NVMe devices to power really the fastest workloads that customers want to put on this device.
So you talked about 3D NAND and NVMe. Pricing is getting to a point where it becomes that much more accessible to use these for powering databases, for powering the kinds of workloads that require those latency characteristics and require those IOPS. That's what we've enabled with Cisco HyperFlex, collaborating with Intel's NVMe portfolio. >> I remember when I started in the business, somebody educated me with a pyramid: think of the pyramid as a storage hierarchy. And at the top of it was actually an Intel solid state device, which back then was not persistent, it was volatile, right? So you had to put backup power supplies on it. But at any rate, with all this memory architecture and flash coming toward us, people have been saying, well, it's going to flatten that pyramid. But now, with Optane, you're seeing the reemergence of that pyramid. So help us understand where it fits from a supplier standpoint, an OEM, and the ultimate customer. Because if I understand it, Optane is faster than NAND but it's going to be more expensive, and it's slower than DRAM but it's cheaper, right? So where does it fit? What are the use cases? Where does it fit in that hierarchy? >> Yeah, so if you think about the hierarchy, at the very top is DRAM, which is going to be your fastest, lowest-latency product. But right below that is Optane persistent memory, the DIMMs, and you get greater density, because that's one of the challenges with DRAM: it's not dense enough, nor affordable enough, right? And so that creates a new tier in the storage hierarchy. Go below that and you have Optane SSDs, which bring even more density; we go up to 1.5 terabytes in an Optane SSD, and you now get performance for your storage and memory expansion. Then you have 3D NAND, and even below that you have 3D NAND QLC, which gives you cost-effective, high-density capacity. And then below that is the old-fashioned hard disk drive, and then magnetic tape. You start inserting all these tiers, and that gives architects, in both hardware and software, an opportunity to rethink how they want to do storage. >> So the demand for this granularity is obviously coming from your buyers, your direct buyers and your customers. What does it do for you and specifically your customers? >> Yeah, so the name of the game is performance, and the ability, in a landscape where things are not very predictable, to support anything that your end customers may throw at you if you're an IT department. That may mean an internal data scientist team or a traditional architect with a traditional application. Now, what Intel and Cisco can do together is truly unique, because we control all parts of the stack, everything from the server itself to the storage devices to the distributed file system that sits on top of it. So, for example, in HyperFlex we're using Optane as a caching tier, and because we write the distributed file system, we can strike a balance between what we put in the caching tier and how we move data out to the non-caching tier. As Intel came out with their latest processors that support storage class memory, we support that. Now we can engineer this whole system end to end, so that we can deliver to customers the innovation that Intel is bringing to the table in a way that's consumable by them. One more thing I'll throw out there.
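As an editorial aside, to make the tiering Laura describes concrete, here is a small sketch (not from the interview; the latency and cost figures are rough, illustrative assumptions, not vendor specifications) of how an architect might model the hierarchy from DRAM down to disk and pick the cheapest tier that still meets a latency target:

```python
# Illustrative storage/memory hierarchy, ordered fastest to slowest.
# Latency and cost figures are rough placeholders, not vendor specifications.
TIERS = [
    {"name": "DRAM",                     "latency_us": 0.1,    "relative_cost": 10.0},
    {"name": "Optane persistent memory", "latency_us": 0.35,   "relative_cost": 5.0},
    {"name": "Optane SSD (NVMe)",        "latency_us": 10.0,   "relative_cost": 2.0},
    {"name": "3D NAND SSD (NVMe)",       "latency_us": 100.0,  "relative_cost": 1.0},
    {"name": "QLC NAND SSD",             "latency_us": 150.0,  "relative_cost": 0.6},
    {"name": "HDD",                      "latency_us": 5000.0, "relative_cost": 0.1},
]

def cheapest_tier_for(latency_budget_us):
    """Return the lowest-cost tier whose latency still fits the budget."""
    candidates = [t for t in TIERS if t["latency_us"] <= latency_budget_us]
    return min(candidates, key=lambda t: t["relative_cost"]) if candidates else None

print(cheapest_tier_for(1.0)["name"])    # fits in persistent memory territory
print(cheapest_tier_for(200.0)["name"])  # a QLC-class SSD is enough here
```

The takeaway matches the conversation: each new tier gives architects another cost/latency trade-off point rather than flattening the pyramid.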
So technology is great, but it needs to be resilient, because IT departments will occasionally yank out the wrong wire, they'll occasionally yank out the wrong drive. One of the things we worked on together with Intel is: how do we harden this? How do we build in reliability, availability, serviceability? How do we protect against accidental removal or accidental insertion? Some of those co-innovations have let us get out into the market a HyperFlex system that uses these technologies in a way that's really usable by the teams at our customers. >> I'd love to double click on that in the context of NVMe, what you guys were talking about. You mentioned the horrible storage stack, I think you called it the horrible SCSI stack. And Laura, you were talking about how the cheap and deep now is the spinning disk. So my understanding is that you've got a lot of overhead in the traditional SCSI protocol, but nobody ever noticed because you had this mechanical device. Now, with flash storage, it all becomes exposed, and NVMe gives you something like a direct line, right? Okay, so correct me where I got that wrong, but maybe you could give us the perspective: why is NVMe important from your standpoint, and how are you guys using it? >> Yeah, I think NVMe is just a much faster protocol, and you're absolutely right. We have a graph that we show of the old world and how much overhead there is, all the way down to when you have Optane in a DIMM solution with no overhead; an Optane SSD still has a tiny bit, but there's a graph that shows all of that latency is removed when you deploy Optane. So NVMe gives you much greater bandwidth, right? The CPU is not bottlenecked, and you get greater CPU efficiency when you have a faster interface like NVMe. >> And HyperFlex is taking advantage of this how? >> Yeah, let me give you a couple of examples. On performance, the first thing that comes to mind is databases; for those kinds of workloads, this system gets about 25% better performance. The next thing that comes to mind is that people really don't know what they're going to put on the system. Sometimes they put databases on it, sometimes mixed workloads. So when we look at mixed workloads, we get about 65% or so better IOPS and 37% better latencies. So even in a mixed I/O environment, where you may have databases, a web tier and other things, this thing is definitely resilient enough to handle the workload, so it just opens up the spectrum of use cases. >> One of the other questions I had was specific to Optane. DRAM has consumer applications, as does flash NAND. Does Optane have similar consumer applications that can achieve that volume, so that the prices can come down, not free, but continue to sort of drive the curves? >> So when we look at the overall TAM, we see the TAM growing over time. I don't know exactly when it crosses over the volume or the bits of DRAM, but we absolutely see it growing over time, and as the technology ramps, it'll have its cost-ramping curves as well. >> It'll follow that curve. Okay, good. >> Yeah, so KD, give us a bit of a broad view of HyperFlex here at the show. Are there labs where people can play with the brand new Optane pieces, or what other highlights do you and the team have this week? >> Yeah, absolutely. So in Barcelona we talked about HyperFlex 4.0, and all of that is live today.
So on the show floor, people can look at HyperFlex at the edge combined with SD-WAN: how do you control and deploy thousands of edge locations from a centralized location, powered by Intersight, which is cloud-based management? So that whole experience is enabled. Now, at the other end of the spectrum is how we drive even more performance. We were always the performance leader; now we're comparing ourselves to ourselves, and we're about 35% better than our previous all-flash with the innovation Intel is bringing to the table. Some of the other pieces are actually use cases. So there's a big hospital chain where my kids go to get treated and see the doctor. There are lots of medical use cases which require Epic, the medical software company, to power them, whether it's the end terminals or the back-end database. So Epic Hyperspace and Caché have now been validated on HyperFlex, using the technology we just talked about around Optane and all NVMe, and that gives them that much more power. That means that when my doctor or the nurse pulls up the records, it's not just that the records show up fast; all the medical records, all of those other high-performance-seeking applications, also run that much more streamlined. So I would encourage people to look at our solution; we've got a tremendous set of demos out there, so go up there and check us out. >> And there's a great white paper out on this, right? ESG? >> ESG is one of the companies that I've seen benchmarking HyperFlex. >> So elaborate, did they do a lab report or...? >> What they do is benchmark different hyperconverged infrastructure vendors. They did this the first time around and said, well, we can pack that many more VMs on a HyperFlex with rotating drives. Then they did it again and said, well, now that you've got all-flash, you've got the performance and the latency leadership. And then they did it again and said, well, hang on, you've kind of left the competition behind; it's not going to make a pretty chart to compare your all-NVMe against the others. When you get that good, you compare against yourselves. We've been the performance leader, and ESG has been doing the >> The data, with Optane, the next generation, added up. >> And this is with a database workload. Okay, so now you bring Optane into the latest report >> which measures Optane against our earlier all-flash report, and then also measures across vendors. >> So where can I get this? Is it on some third-party website, or...? >> All of this is off the Cisco HyperFlex website on cisco.com, but ESG is there for companies that want to go directly to them to get more. >> I guess the final question for you: I think back to the early days of UCS; it was the memory enhancements it had that allowed the densest virtualization in the industry back when it started. It sounds like we're just taking that to the next level with this next generation of solutions. What else would you call out about the relationship between Cisco and Intel? >> So, Intel and Cisco have worked together for years, right, on innovation around the CPU and the platform, and it's super exciting to be expanding our relationship to storage.
And I'm even more excited that the Cisco HyperFlex solution is adopting Intel Optane and 3D NAND, and we're seeing great examples of real workloads where our end customers can benefit from this technology. >> KD, Laura, thanks so much for the update, and congratulations on the progress that you've made so far. For Dave Vellante, I'm Stu Miniman, and we'll be back with more coverage here from Cisco Live 2019 in San Diego. Thanks for watching theCUBE. >> (upbeat music)

Published Date : Jun 10 2019

SUMMARY :

Live from San Diego, California It's the queue covering So you know, So when you think of obtain its fast like D ram But it's You know, I remember when I when I started working with Dave, it was, you know, how do we get out of you So, you know, we've lived in a world of legacy So Cisco 80 I hyper flex is one of the products So you talked about free envy me. So you had to put, you know, backup power supplies on it. Persistent memory, the dims and you get greater density So what does it do for you and specifically your customers? One of the things that we work And Laura, you were talking about the You know, of that Leyton C is removed when you deploy, obtain so envy me gives and like hyper flexes taking advantage of this house. So anything performance, the first thing that comes to mind is databases. prices, you can come down, not free, but continue to sort of drive the curves. are the bits of the ram, but we absolutely see it growing over time. it'll follow that curve. What other highlights that you and the team have this week? So in the show floor, people can look at the hyper flex at the edge e g is made one of the a company that I've seen benchmarking Ah, And then they did it again And I said, Well, now that you got all flash Well, deacon, you got now the performance and the The next generation added up, and this is what a database workload. So But F is the companies that want to go directly to What what else would you out about? And I'm even more excited that the Cisco hyper flex solution Congratulations on the progress that you've made so far for

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Laura Crone | PERSON | 0.99+
Laura | PERSON | 0.99+
Cisco | ORGANIZATION | 0.99+
Katie | PERSON | 0.99+
Miami | LOCATION | 0.99+
Barcelona | LOCATION | 0.99+
Dave | PERSON | 0.99+
Katie Laura | PERSON | 0.99+
David Dante | PERSON | 0.99+
San Diego | LOCATION | 0.99+
Vienna | LOCATION | 0.99+
37% | QUANTITY | 0.99+
Kaustubh Das | PERSON | 0.99+
First | QUANTITY | 0.99+
San Diego, California | LOCATION | 0.99+
Intel | ORGANIZATION | 0.99+
Eso | ORGANIZATION | 0.99+
intel | ORGANIZATION | 0.99+
25 years | QUANTITY | 0.99+
first | QUANTITY | 0.99+
Hyper Flex | COMMERCIAL_ITEM | 0.99+
over 25 years | QUANTITY | 0.98+
one | QUANTITY | 0.98+
both | QUANTITY | 0.98+
about 25% | QUANTITY | 0.98+
today | DATE | 0.98+
Leighton | ORGANIZATION | 0.98+
first time | QUANTITY | 0.98+
Teo | PERSON | 0.97+
this week | DATE | 0.97+
about 65% | QUANTITY | 0.97+
Envy Emmy | PERSON | 0.97+
thousands | QUANTITY | 0.97+
1.5 terabyte | QUANTITY | 0.96+
three | QUANTITY | 0.96+
One | QUANTITY | 0.95+
35% | QUANTITY | 0.95+
Two minute | QUANTITY | 0.95+
Cisco Compute | ORGANIZATION | 0.94+
two organizations | QUANTITY | 0.94+
3 | QUANTITY | 0.92+
hyper flex | ORGANIZATION | 0.9+
decades | QUANTITY | 0.88+
90 year | QUANTITY | 0.88+
90 department | QUANTITY | 0.87+
this year | DATE | 0.87+
2019 | DATE | 0.87+
Mohr | PERSON | 0.87+
first thing | QUANTITY | 0.84+
Cisco UCS | ORGANIZATION | 0.84+
envy | PERSON | 0.83+
Cisco Live | EVENT | 0.83+
Nanda | ORGANIZATION | 0.81+
NAND | ORGANIZATION | 0.8+
octane | OTHER | 0.8+
envy | ORGANIZATION | 0.78+
a decade ago | DATE | 0.78+
hyper flex | COMMERCIAL_ITEM | 0.78+
NSG | ORGANIZATION | 0.74+
US | LOCATION | 0.72+
Flex | ORGANIZATION | 0.72+

Deploying AI in the Enterprise


 

(orchestral music) >> Hi, I'm Peter Burris and welcome to another digital community event. As we do with all digital community events, we're gonna start off by having a series of conversations with real thought leaders about a topic that's pressing to today's enterprises as they try to achieve new classes of business outcomes with technology. At the end of that series of conversations, we're gonna go into a crowd chat and give you an opportunity to voice your opinions and ask your questions. So stay with us throughout. So, what are we going to be talking about today? We're going to be talking about the challenge that businesses face as they try to apply AI, ML, and new classes of analytics to their very challenging, very difficult, but nonetheless very value-producing outcomes associated with data. The challenge that all these businesses have is that often, you spend too much time in the infrastructure and not enough time solving the problem. And so what's required is new classes of technology and new classes of partnerships and business arrangements that allow for us to mask the underlying infrastructure complexity from data science practitioners, so that they can focus more time and attention on building out the outcomes that the business wants and a sustained business capability so that we can continue to do so. Once again, at the end of this series of conversations, stay with us, so that we can have that crowd chat and you can, again, ask your questions, provide your insights, and participate with the community to help all of us move faster in this crucial direction for better AI, better ML and better analytics. So, the first conversation we're going to have is with Anant Chintamaneni. Anant's the Vice President of Products at BlueData. Anant, welcome to theCUBE. >> Hi Peter, it's great to be here. I think the topic that you just outlined is a very fascinating and interesting one. Over the last 10 years, data and analytics have been used to create transformative experiences and drive a lot of business growth. You look at companies like Uber, AirBnB, and you know, Spotify, practically, every industry's being disrupted. And the reason why they're able to do this is because data is in their DNA; it's their key asset and they've leveraged it in every aspect of their product development to deliver amazing experiences and drive business growth. And the reason why they're able to do this is they've been able to leverage open-source technologies, data science techniques, and big data, fast data, all types of data to extract that business value and inject analytics into every part of their business process. Enterprises of all sizes want to take advantage of that same assets that the new digital companies are taking and drive digital transformation and innovation, in their organizations. But there's a number of challenges. First and foremost, if you look at the enterprises where data was not necessarily in their DNA and to inject that into their DNA, it is a big challenge. The executives, the executive branch, definitely wants to understand where they want to apply AI, how to kind of identify which huge cases to go after. There is some recognition coming in. They want faster time-to-value and they're willing to invest in that. >> And they want to focus more on the actual outcomes they seek as opposed to the technology selection that's required to achieve those outcomes. >> Absolutely. 
I think it's, you know, a boardroom mandate for them to drive new business outcomes, new business models, but I think there is still some level of misalignment between the executive branch and the data worker community which they're trying to upgrade with the new-age data scientists, the AI developer and then you have IT in the middle who has to basically bridge the gap and enable the digital transformation journey and provide the infrastructure, provide the capabilities. >> So we've got a situation where people readily acknowledge the potential of some of these new AI, ML, big data related technologies, but we've got a mismatch between the executives that are trying to do evidence-based management, drive new models, the IT organization who's struggling to deal with data-first technologies, and data scientists who are few and far between, and leave quickly if they don't get the tooling that they need. So, what's the way forward, that's the problem. How do we move forward? >> Yeah, so I think, you know, I think we have to double-click into some of the problems. So the data scientists, they want to build a tool chain that leverages the best in-class, open source technologies to solve the problem at hand and they don't want, they want to be able to compile these tool chains, they want to be able to apply and create new algorithms and operationalize and do it in a very iterative cycle. It's a continuous development, continuous improvement process which is at odds with what IT can deliver, which is they have to deliver data that is dispersed all over the place to these data scientists. They need to be able to provide infrastructure, which today, they're not, there's an impotence mismatch. It takes them months, if not years, to be able to make those available, make that infrastructure available. And last but not the least, security and control. It's just fundamentally not the way they've worked where they can make data and new tool chains available very quickly to the data scientists. And the executives, it's all about faster time-to-value so there's a little bit of an expectation mismatch as well there and so those are some of the fundamental problems. There's also reproducibility, like, once you've created an analytics model, to be able to reproduce that at scale, to be then able to govern that and make sure that it's producing the right results is fundamentally a challenge. >> Audibility of that process. >> Absolutely, audibility. And, in general, being able to apply this sort of model for many different business problems so you can drive outcomes in different parts of your business. So there's a huge number of problems here. And so what I believe, and what we've seen with some of these larger companies, the new digital companies that are driving business valley ways, they have invested in a unified platform where they've made the infrastructure invisible by leveraging cloud technologies or containers and essentially, made it such that the data scientists don't have to worry about the infrastructure, they can be a lot more agile, they can quickly create the tool chains that work for the specific business problem at hand, scale it up and down as needed, be able to access data where it lies, whether it's on-prem, whether it's in the cloud or whether it's a hybrid model. 
And so that's something that's required from a unified platform where you can do your rapid prototyping, you can do your development and ultimately, the business outcome and the value comes when you operationalize it and inject it into your business processes. So, I think fundamentally, this start, this kind of a unified platform, is critical. Which, I think, a lot of the new age companies have, but is missing with a lot of the enterprises. >> So, a big challenge for the enterprise over the next few years is to bring these three groups together; the business, data science world and infrastructure world or others to help with those problems and apply it successfully to some of the new business challenges that we have. >> Yeah, and I would add one last point is that we are on this continuous journey, as I mentioned, this is a world of open source technologies that are coming out from a lot of the large organizations out there. Whether it's your Googles and your Facebooks. And so there is an evolution in these technologies much like we've evolved from big data and data management to capture the data. The next sort of phase is around data exploitation with artificial intelligence and machine learning type techniques. And so, it's extremely important that this platform enables these organizations to future proof themselves. So as new technologies come in, they can leverage them >> Great point. >> for delivering exponential business value. >> Deliver value now, but show a path to delivery value in the future as all of these technologies and practices evolve. >> Absolutely. >> Excellent, all right, Anant Chintamaneni, thanks very much for giving us some insight into the nature of the problems that enterprises face and some of the way forward. We're gonna be right back, and we're gonna talk about how to actually do this in a second. (light techno music) >> Introducing, BlueData EPIC. The leading container-based software platform for distributed AI, machine learning, deep learning and analytics environments. Whether on-prem, in the cloud or in a hybrid model. Data scientists need to build models utilizing various stacks of AI, ML and DL applications and libraries. However, installing and validating these environments is time consuming and prone to errors. BlueData provides the ability to spin up these environments on demand. The BlueData EPIC app store includes, best of breed, ready to run docker based application images. Like TensorFlow and H2O driverless AI. Teams can also add their own images, to provide the latest tools that data scientists prefer. And ensure compliance with enterprise standards. They can use the quick launch button. which provides pre configured templates with the appropriate application image and resources. For example, they can instantly launch a new Sandbox environment using the template for TensorFlow with a Jupyter Notebook. Within just a few minutes, it'll be automatically configured with GPUs and easy access to their data. Users can launch experiments and make GPUs automatically available for analysis. In this case, the H2O environment was set up with one GPU. With BlueData EPIC, users can also deploy end points with the appropriate run time. And the inference run times can use CPUs or GPUs. With a container based BlueData Platform, you can deploy fully configured distributed environments within a matter of minutes. Whether on-prem, in the public cloud, or in a hybrid a architecture. BlueData was recently acquired by Hewlett Packward Enterprise. 
And now, HPE and BlueData are joining forces to help you on your AI journey. (light techno music) To learn more, visit www.BlueData.com >> And we're back. I'm Peter Burris and we're continuing to have this conversation about how businesses are turning experience with the problems of advance analytics and the solutions that they seek into actual systems that deliver continuous on going value and achieve the business capabilities required to make possible these advanced outcomes associated with analytics, AI and ML. And to do that, we've got two great guests with us. We've got Kumar Sreekanti, who is the co-founder and CEO of BlueData. Kumar, welcome back to theCUBE. >> Thank you, it is nice to be here, back again. >> And Kumar, you're being joined by a customer. Ramesh Thyagarajan, is the executive director of the Advisory Board Company which is part of Optum now. Ramesh, welcome to theCUBE. >> Great to be here. >> Alright, so Kumar let's start with you. I mentioned up front, this notion of turning technology and understanding into actual business capabilities to deliver outcomes. What has been BlueData's journey along, to make that happen? >> Yeah, it all started six years ago, Peter. It was a bold vision and a big idea and no pun intended on big data which was an emerging market then. And as everybody knows, the data was enormous and there was a lot of innovation around the periphery. but nobody was paying attention to how to make the big data consumable in enterprise. And I saw an enormous opportunity to make this data more consumable in the enterprise and to give a cloud-like experience with the agility and elasticity. So, our vision was to build a software infrastructure platform like VMware, specially focused on data intensity distributed applications and this platform will allow enterprises to build cloud like experiences both on enterprise as well as on hybrid clouds. So that it pays the journey for their cloud experience. So I was very fortunate to put together a team and I found good partners like Intel. So that actually is the genesis for the BlueData. So, if you look back into the last six years, big data itself has went through a lot of evolution and so the marketplace and the enterprises have gone from offline analytics to AI, ML based work loads that are actually giving them predictive and descriptive analytics. What BlueData has done is by making the infrastructure invisible, by making the tool set completely available as the tool set itself is evolving and in the process, we actually created so many game changing software technologies. For example, we are the first end-to-end content-arised enterprise solution that gives you distributed applications. And we built a technology called DataTap, that provides computed data operation so that you don't have to actually copy the data, which is a boom for enterprises. We also actually built multitenancy so those enterprises can run multiple work loads on the same data and Ramesh will tell you in a second here, in the healthcare enterprise, the multitenancy is such a very important element. And finally, we also actually contributed to many open source technologies including, we have a project called KubeDirector which is actually is our own Kubernetes and how to run stateful workloads on Kubernetes. which we have actually very happy to see that people like, customers like Ramesh are using the BlueData. >> Sounds like quite a journey and obviously you've intercepted companies like the advisory board company. 
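As a generic, editorial illustration of the "spin up a containerized environment on demand" idea from the EPIC overview above (this uses the plain Docker SDK for Python, not BlueData's own API, and the image, port and volume choices are only examples), a data scientist's sandbox might be launched like this:

```python
import docker

# Connect to the local Docker daemon.
client = docker.from_env()

# Launch a TensorFlow + Jupyter sandbox on demand; the image tag, port
# mapping and mounted data path are illustrative, not a BlueData-specific
# configuration.
container = client.containers.run(
    "tensorflow/tensorflow:latest-jupyter",
    detach=True,
    ports={"8888/tcp": 8888},
    volumes={"/data/projects": {"bind": "/tf/data", "mode": "ro"}},
    name="ds-sandbox",
)

print(container.status)  # the notebook is then reachable on localhost:8888
```

A platform like EPIC layers multi-tenancy, security, and in-place data access (the DataTap idea Kumar mentions) on top of this kind of container lifecycle, so the sketch shows only the bare on-demand, pre-built-image workflow.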
So Ramesh, a lot of enterprises have mastered or you know, gotten, understood how to create data lakes with a dupe but then found that they still weren't able to connect to some of the outcomes that they saw. Is that the experience that you had. >> Right, to be precise, that is one of the kind of problems we have. It's not just the data lake that we need to be able to do the workflows or other things, but we also, being a traditional company, being in the business for a long time, we have a lot of data assets that are not part of this data lake. We're finding it hard to, how do we get the data, getting them and putting them in a data lake is a duplication of work. We were looking for some kind of solutions that will help us to gather the benefits of leaving the data alone but still be able to get into it. >> This is where (mumbles). >> This is where we were looking for things and then I was lucky and fortunate to run into Kumar and his crew in one of the Hadoop conferences and then they demonstrated the way it can be done so immediately hit upon, it's a big hit with us and then we went back and then did a POC, very quickly adapt to the technology and that is also one of the benefits of corrupting this technology is the level of contrary memorization they are doing, it is helping me to address many needs. My data analyst, the data engineers and the data scientists so I'm able to serve all of them which otherwise wouldn't be possible for me with just this plain very (mumbles). >> So it sounds as though the partnership with BlueData has allowed you to focus on activities and problems and challenges above the technology so that you can actually start bringing data science, business objectives and infrastructure people together. Have I got that right? >> Absolutely. So BlueData is helping me to tie them all together and provide an excess value to my business. We being in the healthcare, the importance is we need to be able to look at the large data sets for a period of time in order to figure out how a patient's health journey is happening. That is very important so that we can figure out the ways and means in which we can lower the cost of health care and also provide insights to the physician, they can help get people better at health. >> So we're getting great outcomes today especially around, as you said that patient journey where all the constituents can get access to those insights without necessarily having to learn a whole bunch of new infrastructure stuff but presumably you need more. We're talking about a new world that you mentioned before upfront, talking about a new world, AI, ML, a lot of changes. A lot of our enterprise customers are telling us it's especially important that they find companies that not only deliver something today but demonstrate a commitment to sustain that value delivery process especially as the whole analytics world evolves. Are you experiencing that as well? >> Yes, we are experiencing and one of the great advantage of the platform, BlueData platform that gave me this ability to, I had the new functionality, be it the TensorFlow, be it the H2O, be it the heart studio, anything that I needed, I call them, they give me the images that are plug-and-play, just put them and all the prompting is practically transparent to nobody need to know how it is achieved. 
Now, in order to get to the next level of the predictive and prescriptive analytics, it is not just you having the data, you need to be able to have your curated data asset set process on top of a platform that will help you to get the data scientists to make you. One of the biggest challenges that are scientist is not able to get their hands on data. BlueData platform gives me the ability to do it and ensure all the security meets and all the compliances with the various other regulated compliances we need to make. >> Kamar, congratulations. >> Thank you. >> Sounds like you have a happy customer. >> Thank you. >> One of the challenges that every entrepreneur faces is how did you scale the business. So talk to us about where you are in the decisions that you made recently to achieve that. >> As an entrepreneur, when you start a company, odds are against you, right? You're always worried about it, right. You make so many sacrifices, yourself and your team and all that but the the customer is the king. The most important thing for us to find satisfied customers like Rameshan so we were very happy and BlueData was very successful in finding that customer because i think as you pointed out, as Ramesh pointed out, we provide that clean solution for the customer but as you go through this journey as a co-founder and CEO, you always worry about how do you scale to the next level. So we had partnerships with many companies including HPE and we found when this opportunity came in front of me with myself and my board, we saw this opportunity of combining the forces of BlueData satisfied customers and innovative technology and the team with the HPs brand name, their world-class service, their investment in R&D and they have a very long, large list of enterprise customers. We think putting these two things together provides that next journey in the BlueData's innovation and BlueData's customers. >> Excellent, so once again Kumar Sreekanti, co-founder and CEO of BlueData and Ramesh Thyagarajan who is the executive director of the advisory board company and part of Optum, I want to thank both of you for being on theCUBE. >> Thank you >> Thank you, great to be here. >> Now let's hear a little bit more about how this notion of bringing BlueData and HPE together is generating new classes of value that are making things happen today but are also gonna make things happen for customers in the future and to do that we've got Dave Velante who's with Silicon Angle Wiki Bond joined by Patrick Osbourne who's with HPE in our Marlborough studio so Dave over to you. >> Thanks Peter. We're here with Patrick Osbourne, the vice president and general manager of big data and analytics at Hewlett Packard Enterprise. Patrick, thanks for coming on. >> Thanks for having us. >> So we heard from Kumar, let's hear from you. Why did HPE purchase, acquire BlueData? >> So if you think about it from three angles. Platform, people and customers, right. Great platform, built for scale addressing a number of these new workloads and big data analytics and certainly AI, the people that they have are amazing, right, great engineering team, awesome customer success team, team of data scientists, right. So you know, all the folks that have some really, really great knowledge in this space so they're gonna be a great addition to HPE and also on the customer side, great logos, major fortune five customers in the financial services vertical, healthcare, pharma, manufacturing so a huge opportunity for us to scale that within HP context. 
>> Okay, so talk about how it fits into your strategy, specifically what are you gonna do with it? What are the priorities, can you share some roadmap? >> Yeah, so you take a look at HPE strategy. We talk about hybrid cloud and specifically edge to core to cloud and the common theme that runs through that is data, data-driven enterprises. So for us we see BlueData, Epic platform as a way to you know, help our customers quickly deploy these new mode to applications that are fueling their digital transformation. So we have some great plans. We're gonna certainly invest in all the functions, right. So we're gonna do a force multiplier on not only on product engineering and product delivery but also go to market and customer success. We're gonna come out in our business day one with some really good reference architectures, with some of our partners like Cloud Era, H2O, we've got some very scalable building block architectures to marry up the BlueData platform with our Apollo systems for those of you have seen that in the market, we've got our Elastic platform for analytics for customers who run these workloads, now you'd be able to virtualize those in containers and we'll have you know, we're gonna be building out a big services practice in this area. So a lot of customers often talk to us about, we don't have the people to do this, right. So we're gonna bring those people to you as HPE through Point Next, advisory services, implementation, ongoing help with customers. So it's going to be a really fantastic start. >> Apollo, as you mentioned Apollo. I think of Apollo sometimes as HPC high performance computing and we've had a lot of discussion about how that's sort of seeping in to mainstream, is that what you're seeing? >> Yeah absolutely, I mean we know that a lot of our customers have traditional workloads, you know, they're on the path to almost completely virtualizing those, right, but where a lot of the innovation is going on right now is in this mode two world, right. So your big data and analytics pipeline is getting longer, you're introducing new experiences on top of your product and that's fueling you know, essentially commercial HPC and now that folks are using techniques like AI and modeling inference to make those services more scalable, more automated, we're starting to bringing these more of these platforms, these scalable architectures like Apollo. >> So it sounds like your roadmap has a lot of integration plans across the HPE portfolio. We certainly saw that with Nimble, but BlueData was working with a lot of different companies, its software, is the plan to remain open or is this an HPE thing? >> Yeah, we absolutely want to be open. So we know that we have lots of customers that choose, so the HP is all about hybrid cloud, right and that has a couple different implications. We want to talk about your choice of on-prem versus off-prem so BlueData has a great capability to run some of these workloads. It essentially allows you to do separation of compute and storage, right in the world of AI and analytics we can run it off-prem as well in the public cloud but then we also have choice for customers, you know, any customer's private cloud. So that means they want to run on other infrastructure besides HPE, we're gonna support that, we have existing customers that do that. 
We're also gonna provide infrastructure that marries the software and the hardware together with frameworks like Info Site that we feel will be a you know, much better experience for the customers but we'll absolutely be open and absolutely have choice. >> All right, what about the business impact to take the customer perspective, what can they expect? >> So I think from a customer perspective, we're really just looking to accelerate deployment of AI in the enterprise, right and that has a lot of implications for us. We're gonna have very scalable infrastructure for them, we're gonna be really focused on this very dynamic AI and ML application ecosystems through partnerships and support within the BlueData platform. We want to provide a SAS experience, right. So whether that's GPUs or accelerators as a service, analytics as a service, we really want to fuel innovation as a service. We want to empower those data scientists there, those are they're really hard to find you know, they're really hard to retain within your organization so we want to unlock all that capability and really just we want to focus on innovation of the customers. >> Yeah, and they spend a lot of time wrangling data so you're really going to simplify that with the cloud (mumbles). Patrick thank you, I appreciate it. >> Thank you very much. >> Alright Peter, back to you in Palo Alto. >> And welcome back, I'm Peter Burris and we've been talking a lot in the industry about how new tooling, new processes can achieve new classes of analytics, AI and ML outcomes within a business but if you don't get the people side of that right, you're not going to achieve the full range of benefits that you might get out of your investments. Now to talk a little bit about how important the data science practitioner is in this equation, we've got two great guests with us. Nanda Vijaydev is the chief data scientists of BlueData. Welcome to theCUBE. >> Thank you Peter, happy to be here. >> Ingrid Burton is the CMO and business leader at H2O.AI, Ingrid, welcome to the CUBE. >> Thank you so much for having us. >> So Nanda Vijaydev, let's start with you. Again, having a nice platform, very, very important but how does that turn into making the data science practitioner's life easier so they can deliver more business value. >> Yeah thank you, it's a great question. I think end of the day for a data scientist, what's most important is, did you understand the question that somebody asked you and what is expected of you when you deliver something and then you go about finding, what do I need for them, I need data, I need systems and you know, I need to work with people, the experts in the process to make sure that the hypothesis I'm doing is structured in a nice way where it is testable, it's modular and I have you know, a way for them to go back to show my results and keep doing this in an iterative manner. That's the biggest thing because the satisfaction for a data scientist is when you actually take this and make use of it, put it in production, right. To make this whole thing easier, we definitely need some way of bringing it all together. That's really where, especially compared to the traditional data science where everything was monolithic, it was one system, there was a very set way of doing things but now it is not so you know, with the growing types of data, with the growing types of computation algorithms that's available, there's a lot of opportunity and at the same time there is a lot of uncertainty. 
So it's really about putting that structure and it's really making sure you get the best of everything and still deliver the results, that is the focus that all data scientists strive for. >> And especially you wanted, the data scientists wants to operate in the world of uncertainty related to the business question and reducing uncertainty and not deal with the underlying some uncertainty associated with the infrastructure. >> Absolutely, absolutely you know, as a data scientist a lot of time used to spend in the past about where is the data, then the question was, what data do you want and give it to you because the data always came in a nice structured, row-column format, it had already lost a lot of context of what we had to look for. So it is really not about you know, getting the you know, it's really not about going back to systems that are pre-built or pre-processed, it's getting access to that real, raw data. It's getting access to the information as it came so you can actually make the best judgment of how to go forward with it. >> So you describe the world with business, technology and data science practitioners are working together but let's face it, there's an enormous amount of change in the industry and quite frankly, a deficit of expertise and I think that requires new types of partnerships, new types of collaboration, a real (mumbles) approach and Ingrid, I want to talk about what H2O.AI is doing as a partner of BlueData, HPE to ensure that you're complementing these skills in pursuit or in service to the customer's objectives. >> Absolutely, thank you for that. So as Nanda described, you know, data scientists want to get to answers and what we do at H2O.AI is we provide the algorithms, the platforms for data scientist to be successful. So when they want to try and solve a problem, they need to work with their business leaders, they need to work with IT and they actually don't want to do all the heavy lifting, they want to solve that problem. So what we do is we do automatic machine learning platforms, we do that with optimizing algorithms and doing all the kind of, a lot of the heavy lifting that novice data scientists need and help expert data scientists as well. I talk about it as algorithms to answers and actually solving business problems with predictions and that's what machine learning is really all about but really what we're seeing in the industry right now and BlueData is a great example of kind of taking away some of the hard stuff away from a data scientist and making them successful. So working with BlueData and HPE, making us together really solve the problems that businesses are looking for, it's really transformative and we've been through like the digital transformation journey, all of us have been through that. We are now what I would term an AI transformation of sorts and businesses are going to the next step. They had their data, they got their data, infrastructure is kind of seamlessly working together, the clusters and containerization that's very important. Now what we're trying to do is get to the answers and using automatic machine learning platforms is probably the best way forward. >> That's still hard stuff but we're trying to get rid of data science practitioners, focusing on hard stuff that doesn't directly deliver value. >> It doesn't deliver anything for them, right. They shouldn't have to worry about the infrastructure, they should worry about getting the answers to the business problems they've been asked to solve. 
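To ground the "automatic machine learning" idea Ingrid describes, here is a minimal sketch using the open-source h2o Python package (the Driverless AI product mentioned earlier is a separate commercial offering with its own interface; the file name and target column below are hypothetical):

```python
import h2o
from h2o.automl import H2OAutoML

h2o.init()  # start or connect to a local H2O cluster

# Hypothetical churn dataset; the path and target column are placeholders.
frame = h2o.import_file("customer_churn.csv")
train, test = frame.split_frame(ratios=[0.8], seed=42)

# AutoML trains and cross-validates a set of models, then ranks them,
# doing the heavy lifting described in the conversation.
aml = H2OAutoML(max_models=10, max_runtime_secs=600, seed=42)
aml.train(y="churned", training_frame=train)

print(aml.leaderboard.head())           # ranked models
predictions = aml.leader.predict(test)  # score new data with the best model
```

The design point is the one made in the conversation: the practitioner states the business target and the data, and the platform handles algorithm selection and tuning.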
>> So let's talk a little bit about some of the new business problems that are going to be able to be solved by these kinds of partnerships between BlueData and H2O.AI. Start, Nanda, what do you, what gets you excited when we think about the new types of business problems that customers are gonna be able to solve. >> Yeah, I think it is really you know, the question that comes to you is not filtered through someone else's lens, right. Someone is trying an optimization problem, someone is trying to do a new product discovery so all this is based on a combination of both data-driven and evidence-based, right. For us as a data scientist, what excites me is that I have the flexibility now that I can choose the best of the breed technologies. I should not be restricted to what is given to me by an IT organization or something like that but at the same time, in an organization, for things to work, there has to be some level of control. So it is really having this type of environments or having some platforms where some, there is a team that can work on the control aspect but as a data scientist, I don't have to worry about it. I have my flexibility of tools of choice that I can use. At the same time, when you talk about data, security is a big deal in companies and a lot of times data scientists don't get access to data because of the layers and layers of security that they have to go through, right. So the excitement of the opportunity for me is if someone else takes care of the problem you know, just tell me where is the source of data that I can go to, don't filter the data for me you know, don't already structure the data for me but just tell me it's an approved source, right then it gives me more flexibility to actually go and take that information and build. So the having those controls taken care of well before I get into the picture as a data scientist, it makes it extremely easy for us to focus on you know, to her point, focus on the problem, right, focus on accessing the best of the breed technology and you know, give back and have that interaction with the business users on an ongoing basis. >> So especially focus on, so speed to value so that you're not messing around with a bunch of underlying infrastructure, governance remaining in place so that you know what are the appropriate limits of using the data with security that is embedded within that entire model without removing fidelity out of the quality of data. >> Absolutely. >> Would you agree with those? >> I totally agree with all the points that she brought up and we have joint customers in the market today, they're solving very complex problems. We have customers in financial services, joint customers there. We have customers in healthcare that are really trying to solve today's business problems and these are everything from, how do I give new credit to somebody? How do I know what next product to give them? How do I know what customer recommendations can I make next? Why did that customer churn? How do I reach new people? How do I do drug discovery? How do I give a patient a better prescription? How do I pinpoint disease than when I couldn't have seen it before? Now we have all that data that's available and it's very rich and data is a team sport. 
It takes data scientists, it takes business leaders and it takes IT to make it all work together and together the two companies are really working to solve problems that our customers are facing, working with our customers because they have the intellectual knowledge of what their problems are. We are providing the tools to help them solve those problems. >> Fantastic conversation about what is necessary to ensure that the data science practitioner remains at the center and is the ultimate test of whether or not these systems and these capabilities are working for business. Nanda Vijaydev, chief data scientist of BlueData, Ingrid Burton CMO and business leader, H2O.AI, thank you very much for being on theCUBE. >> Thank you. >> Thank you so much. >> So let's now spend some time talking about how ultimately, all of this comes together and what you're going to do as you participate in the crowd chat. To do that let me throw it back to Dave Velante in our Marlborough studios. >> We're back with Patrick Osbourne, alright Patrick, let's wrap up here and summarize. We heard how you're gonna help data science teams, right. >> Yup, speed, agility, time to value. >> Alright and I know a bunch of folks at BlueData, the engineering team is very, very strong so you picked up a good asset there. >> Yeah, it means amazing technology, the founders have a long lineage of software development and adoption in the market so we're just gonna, we're gonna invested them and let them loose. >> And then we heard they're sort of better together story from you, you got a roadmap, you're making some investments here, as I heard. >> Yeah, I mean so if we're really focused on hybrid cloud and we want to have all these as a services experience, whether it's through Green Lake or providing innovation, AI, GPUs as a service is something that we're gonna be you know, continuing to provide our customers as we move along. >> Okay and then we heard the data science angle and the data science community and the partner angle, that's exciting. >> Yeah, I mean, I think it's two approaches as well too. We have data scientists, right. So we're gonna bring that capability to bear whether it's through the product experience or through a professional services organization and then number two, you know, this is a very dynamic ecosystem from an application standpoint. There's commercial applications, there's certainly open source and we're gonna bring a fully vetted, full stack experience for our customers that they can feel confident in this you know, it's a very dynamic space. >> Excellent, well thank you very much. >> Thank you. Alright, now it's your turn. Go into the crowd chat and start talking. Ask questions, we're gonna have polls, we've got experts in there so let's crouch chat.

Published Date : May 7 2019

SUMMARY :

and give you an opportunity to voice your opinions and to inject that into their DNA, it is a big challenge. on the actual outcomes they seek and provide the infrastructure, provide the capabilities. and leave quickly if they don't get the tooling So the data scientists, they want to build a tool chain that the data scientists don't have to worry and apply it successfully to some and data management to capture the data. but show a path to delivery value in the future that enterprises face and some of the way forward. to help you on your AI journey. and the solutions that they seek into actual systems of the Advisory Board Company which is part of Optum now. What has been BlueData's journey along, to make that happen? and in the process, we actually created Is that the experience that you had. of leaving the data alone but still be able to get into it. and that is also one of the benefits and challenges above the technology and also provide insights to the physician, that you mentioned before upfront, and one of the great advantage of the platform, So talk to us about where you are in the decisions and all that but the the customer is the king. and part of Optum, I want to thank both of you in the future and to do that we've got Dave Velante and general manager of big data and analytics So we heard from Kumar, let's hear from you. and certainly AI, the people that they have are amazing, So a lot of customers often talk to us about, about how that's sort of seeping in to mainstream, and modeling inference to make those services more scalable, its software, is the plan to remain open and storage, right in the world of AI and analytics those are they're really hard to find you know, Yeah, and they spend a lot of time wrangling data of benefits that you might get out of your investments. Ingrid Burton is the CMO and business leader at H2O into making the data science practitioner's life easier and at the same time there is a lot of uncertainty. the data scientists wants to operate in the world of how to go forward with it. and Ingrid, I want to talk about what H2O and businesses are going to the next step. that doesn't directly deliver value. to the business problems they've been asked to solve. of the new business problems that are going to be able and a lot of times data scientists don't get access to data So especially focus on, so speed to value and it takes IT to make it all work together to ensure that the data science practitioner remains To do that let me throw it back to Dave Velante We're back with Patrick Osbourne, Alright and I know a bunch of folks at BlueData, and adoption in the market so we're just gonna, And then we heard they're sort of better together story that we're gonna be you know, continuing and the data science community and then number two, you know, Go into the crowd chat and start talking.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Peter | PERSON | 0.99+
Ramesh Thyagarajan | PERSON | 0.99+
Kumar Sreekanti | PERSON | 0.99+
Dave Velante | PERSON | 0.99+
Peter Burris | PERSON | 0.99+
Kumar | PERSON | 0.99+
Nanda Vijaydev | PERSON | 0.99+
AirBnB | ORGANIZATION | 0.99+
Uber | ORGANIZATION | 0.99+
BlueData | ORGANIZATION | 0.99+
Patrick Osbourne | PERSON | 0.99+
Patrick | PERSON | 0.99+
Ingrid Burton | PERSON | 0.99+
Ramesh | PERSON | 0.99+
Anant Chintamaneni | PERSON | 0.99+
Spotify | ORGANIZATION | 0.99+
Nanda | PERSON | 0.99+
HPE | ORGANIZATION | 0.99+
Palo Alto | LOCATION | 0.99+
two companies | QUANTITY | 0.99+
Ingrid | PERSON | 0.99+
Anant | PERSON | 0.99+
Hewlett Packward Enterprise | ORGANIZATION | 0.99+
H2O.AI | ORGANIZATION | 0.99+
both | QUANTITY | 0.99+
HPs | ORGANIZATION | 0.99+
Facebooks | ORGANIZATION | 0.99+
Googles | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
Intel | ORGANIZATION | 0.99+
Marlborough | LOCATION | 0.99+
First | QUANTITY | 0.99+
first | QUANTITY | 0.99+
one | QUANTITY | 0.99+
one system | QUANTITY | 0.99+
today | DATE | 0.99+
two approaches | QUANTITY | 0.99+
Apollo | ORGANIZATION | 0.99+
www.BlueData.com | OTHER | 0.99+
HP | ORGANIZATION | 0.99+
Hewlett Packard Enterprise | ORGANIZATION | 0.98+
theCUBE | ORGANIZATION | 0.98+
six years ago | DATE | 0.98+
two things | QUANTITY | 0.98+
One | QUANTITY | 0.98+