
Next Gen Analytics & Data Services for the Cloud that Comes to You | An HPE GreenLake Announcement


 

(upbeat music)

>> Welcome back to theCUBE's coverage of the HPE GreenLake announcements. We're seeing the transition of Hewlett Packard Enterprise as a company. Yes, they're going all in on as a service, but we're also seeing a transition from a hardware company to what I look at increasingly as a data management company. We're going to talk today to Vishal Lall, who leads GreenLake cloud services solutions at HPE, and Matt Maccaux, who's the global field CTO for Ezmeral Software at HPE. Gents, welcome back to theCUBE. Good to see you again.

>> Thank you for having us here.

>> Thanks, Dave.

>> So Vishal, let's start with you. What are the big megatrends that you're seeing in data? When you talk to customers, when you talk to partners, what are they telling you? What does your optic say?

>> Yeah, I mean, I would say the first thing is data is getting even more important. It's not that data hasn't been important for enterprises, but over the last, I would say, 24 to 36 months it has become really important, right? And it's become important because customers look at data and they're trying to stitch data together across different sources, whether it's marketing data, supply chain data, or financial data. And they're looking at that as a source of competitive advantage. So enterprises that are able to make sense out of that data really do have a competitive advantage, right? And they actually get better business outcomes. So that's really important, right?

If you start looking at where we are from an analytics perspective, I would argue we are in maybe the third generation of data analytics. The first one was in the '80s and '90s with data warehousing, kind of EDW. A lot of companies still have that; think of Teradata, right? The second generation, more in the 2000s, was around data lakes, right? And that was all about Hadoop and others. Really, the difference was that the first generation was more around structured data, right? The second became more about unstructured data, but you really couldn't run transactions on that data. And I would say now we are entering this third generation, which is about data lakehouses, right? What customers, or enterprises, really want is structured and unstructured data all together. They want to run transactions on it, right? They want to mine the data for machine learning purposes, right? Use it for SQL as well as non-SQL, right? And that's kind of where we are today. So that's really what we are hearing from our customers in terms of at least the top trends, and that's how we are thinking about our strategy in the context of those trends.
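As an illustration of the lakehouse pattern Vishal describes, structured and unstructured data landing in one place, with transactions, SQL, and machine learning against the same tables, here is a minimal sketch using open source Apache Spark with the Delta Lake libraries mentioned later in the conversation. The paths, table name, and columns are hypothetical, and the snippet assumes the delta-spark package is installed; it is not specific to any HPE product.

```python
# Minimal lakehouse sketch: one transactional Delta table serving SQL and ML users.
# Assumes open source Apache Spark with the delta-spark package; paths and schema
# are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("lakehouse-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Land semi-structured marketing events as an ACID Delta table.
events = spark.read.json("/data/raw/marketing_events/")
events.write.format("delta").mode("append").save("/data/lakehouse/events")
spark.sql(
    "CREATE TABLE IF NOT EXISTS events USING DELTA LOCATION '/data/lakehouse/events'"
)

# SQL users query the table directly...
spark.sql("SELECT channel, COUNT(*) AS clicks FROM events GROUP BY channel").show()

# ...while data scientists pull the same table into a machine learning workflow.
training_df = spark.table("events").select("channel", "spend", "converted").toPandas()
```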
>> So, lakehouse, you use that term. It's an increasingly popular term. It connotes, "Okay, I've got the best of the data warehouse and I've got the best of the data lake. I'm going to try to simplify the data warehouse, and I'm going to try to clean up the data swamp, if you will." Matt, talk a little bit more about what you guys are doing specifically and what that means for your customers.

>> Well, what we think is important is that there has to be a hybrid solution. Organizations are going to build their analytics and deploy algorithms where the data either is being produced or where it's going to be stored. And that could be anywhere. That could be in the trunk of a vehicle. It could be in a public cloud, or in many cases it's on-premises in the data center. And where organizations struggle is they feel like they have to make a choice and a trade-off going from one to the other. And so what HPE is offering is a way to unify the experiences of these different applications, workloads, and algorithms, while connecting them together through a fabric, so that the experience is tied together with consistent security policies, without having to refactor your applications, and deploying tools like Delta Lake, to ensure that the organization that needs to build a data product in one cloud, or deploy another data product in the trunk of an automobile, can do so.

>> So, Vishal, I wonder if we could talk about some of the patterns that you're seeing with customers as you go to deploy solutions. Are there industry patterns? Are there any sort of things you can share that you're discerning?

>> Yeah, no, absolutely. As we hear back from our customers across industries, I think the problem sets are very similar, right? Whether you look at healthcare customers, telco customers, consumer goods, or financial services, they're all quite similar. I mean, what are they looking for? They're looking to make sense of, and get business value from, the data, breaking down the silos that I think Matt spoke about just now, right? How do I stitch intelligence across my data silos to get more business intelligence out of it? They're looking for openness. I think the problem that's happened is, over time, people have realized that they are locked in with certain vendors or certain technologies. So they're looking for openness and choice. That's an important one that we've at least heard back from our customers. The other one is just being able to run machine learning algorithms on the data. I think that's another important one for them as well. And the last one I would say is TCO. Customers over the last few years have realized that it's starting to become quite expensive to run really large workloads on public cloud, especially as they want to egress data. So cost-performance trade-offs are starting to become really important and are entering the conversation now. So I would say those are some of the key things and themes that we are hearing from customers, cutting across industries.

>> And Matt, you talked about basically being able to essentially leave the data where it belongs and bring the compute to the data. We talk about that all the time. And so that has to include on-prem, it's got to include the cloud. And I'm kind of curious on the edge, where you see that, 'cause that's... Is that an eventual piece? Is that something that's actually moving in parallel? There's a lot of fuzziness, as an observer, in the edge.

>> I think the edge is driving the most interesting use cases. The challenge up until recently has been, well, I think it's always been connectivity, right? Whether we have a poor connection, little connection, or no connection, being able to asynchronously deploy machine learning jobs into some sort of remote location. Whether it's a very tiny edge or a very large edge, like a factory floor, the challenge, as Vishal mentioned, is that if we're going to deploy machine learning, we need some sort of consistency of runtime to be able to execute those machine learning models. Yes, we need consistent access to data, but consistency in terms of runtime is so important. And I think Hadoop got us started down this path, the ability to very efficiently and cost-effectively run large data jobs against large data sets. And it attempted to work in the open source ecosystem, but because of the monolithic deployment, the tight coupling of the compute and the data, it never achieved that cloud native vision. And so what Ezmeral and HPE, through GreenLake services, are delivering with open source-based Kubernetes, open source Apache Spark, and open source Delta Lake libraries, are those same cloud native services that you can develop on your workstation and deploy in your data center in the same way you deploy, through automation, out at the edge. And I think that is what's so critical about what we're going to see over the next couple of years. The edge is driving these use cases, but it's the consistency to build and deploy those machine learning models, and connect them consistently with data, that's what's going to drive organizations to success.
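A rough sketch of the consistent-runtime idea Matt describes, using open source Spark on Kubernetes: the same application code runs on a workstation, in a data center, or at an edge site, with only configuration changing. The cluster endpoints, container image, and data paths below are hypothetical placeholders rather than anything specific to Ezmeral or GreenLake, and driver networking settings are omitted for brevity.

```python
# Same PySpark job everywhere; only the master URL and cluster settings change.
# Endpoints, image, and paths are hypothetical; extra driver/network configuration
# for Kubernetes client mode is omitted for brevity.
from pyspark.sql import SparkSession

def build_session(master_url: str) -> SparkSession:
    return (
        SparkSession.builder.appName("edge-scoring")
        .master(master_url)  # "local[*]" on a workstation,
                             # "k8s://https://dc-cluster:6443" in the data center,
                             # "k8s://https://edge-cluster:6443" at an edge site
        .config("spark.kubernetes.container.image", "registry.example.com/spark:3.2.0")
        .config("spark.executor.instances", "2")
        .getOrCreate()
    )

spark = build_session("local[*]")  # develop and test locally first
# spark = build_session("k8s://https://edge-cluster:6443")  # then deploy unchanged

# Stand-in for a real model: score sensor readings and write the results back.
readings = spark.read.parquet("/data/sensor_readings")
scores = readings.selectExpr("device_id", "reading * 0.8 + 1.2 AS predicted_load")
scores.write.mode("overwrite").parquet("/data/predicted_load")
```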
>> So you're saying you're able to decouple the compute from the storage.

>> Absolutely. You wouldn't have a cloud if you didn't decouple compute from storage. And I think that was sort of the demise of Hadoop: it forced that coupling. We have high-speed networks now. Whether I'm in a cloud, in my data center, or even at the edge, I have high-performance networks, so I can now do distributed computing and separate compute from storage. And so, if I want to, I can have high-performance compute for my really data-intensive applications and cost-effective storage where I need it. And by separating that off, I can now innovate at the pace of the individual tools in that open source ecosystem.

>> So, can I stay on this for a second? 'Cause you certainly saw Snowflake popularize that; they were kind of early on. I don't know if they were the first, but they're certainly one of the most successful. And you saw Amazon Redshift copy it. Redshift was kind of a bolt-on; essentially what they did is tier it off. You could never turn off the compute. You still had to pay for a little bit of compute. That's kind of interesting. Snowflake has the t-shirt sizes, so there are trade-offs there. There's a lot of ways to skin the cat. How did you guys skin the cat?

>> What we believe we're doing is taking the best of those worlds. Through GreenLake cloud services, the ability to pay for and provision, on demand, the computational services you need. So, if someone needs to spin up a Delta Lake job to execute a machine learning model, you spin that up. We're of course spinning that up behind the scenes. The job executes, it spins down, and you only pay for what you need. And we've got reserve capacity there, of course, just like you would in the public cloud. But more importantly, being able to then extend that through a fabric across clouds and edge locations, so that if a customer wants to deploy in some public cloud service, like we know they're going to, again, we're giving that consistency across that, and exposing it through an S3 API.
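Putting Matt's last two points together, compute sized independently of storage, and data reached through an S3 API, here is a hedged sketch using Spark's s3a connector against an S3-compatible object store. The endpoint, bucket, credentials, and table layout are placeholders, not a documented GreenLake interface, and the hadoop-aws and delta-spark packages are assumed to be available.

```python
# Compute and storage scale independently: executor count is a per-job choice,
# while the data sits in an S3-compatible object store reached over the network.
# Endpoint, bucket, and credentials are hypothetical placeholders; assumes the
# hadoop-aws and delta-spark packages are on the classpath.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("decoupled-compute")
    .config("spark.executor.instances", "8")  # size compute for this job only
    .config("spark.hadoop.fs.s3a.endpoint", "https://objects.example.internal")
    .config("spark.hadoop.fs.s3a.access.key", "<ACCESS_KEY>")
    .config("spark.hadoop.fs.s3a.secret.key", "<SECRET_KEY>")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)

# Read from and write back to the object store through the S3 API (s3a connector).
orders = spark.read.format("delta").load("s3a://analytics/orders")
summary = orders.groupBy("region").sum("amount")
summary.write.format("delta").mode("overwrite").save("s3a://analytics/orders_by_region")

spark.stop()  # the compute spins down; the data stays put in the object store
```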
>> So, Vishal, at the end of the day, I mean, I love to talk about the plumbing and the tech, but the customer doesn't care, right? They want the lowest cost. They want the fastest outcome. They want the greatest value. My question is, how are you seeing data organizations evolve to sort of accommodate this third era, this next generation?

>> Yeah, I mean, the way I look at it from a customer perspective, what they're trying to do, first of all, and I think Matt addressed it somewhat, is they're looking for a consistent experience across the different groups of people within the company that do something with data, right? It could be SQL users, people who are just writing SQL code. It could be people who are writing machine learning models and running them. It could be people who are writing code in Spark. Right now, the experience is completely disjointed across those three types of users, or more. And so that's one thing they're trying to do, just get that consistency. We spoke about performance. The decoupling of compute and storage does provide the agility, because customers are looking for elasticity. How can I have an elastic environment? So that's the other thing they're looking at. And performance and TCO, I think, are a big deal now. So that's definitely on customers' minds. So, as enterprises look at their data journey, those are at least the attributes they're trying to hit as they organize themselves to make the most out of the data.

>> Matt, you and I have talked about this sort of trend to the decentralized future. We're sort of hitting on that. And whether it's in a first-gen data warehouse, second-gen data lake, data hub, bucket, whatever, that data essentially should ideally stay where it is, wherever it should be from a performance standpoint, a governance standpoint, and a cost perspective, and just be a node, I like the term data mesh, a node on that mesh, and essentially allow the business owners, those with domain context, you've mentioned data products before, to actually build data products, maybe air quotes, but a data product is something that can be monetized. Maybe it cuts costs. Maybe it adds value in other ways. How do you see HPE fitting into that long-term vision, which we know is going to take some time to play out?

>> I think what's important for organizations to realize is that they don't have to go to the public cloud to get the experience they're looking for. Many organizations are still reluctant to push all of their data, their critical data, the data that is going to be the next way to monetize the business, into the public cloud. And so what HPE is doing is bringing the cloud to them: bringing that cloud from the infrastructure, the virtualization, the containerization, and most importantly, those cloud native services. So they can do that development rapidly and test it, using those open source tools and frameworks we spoke about. And if that model ends up being deployed on a factory floor, on some common x86 infrastructure, that's okay, because the lingua franca is Kubernetes and, as Vishal mentioned, Apache Spark. These are the common tools and frameworks. And so I want organizations to think about this unified analytics experience, where they don't have to trade off security for cost, or efficiency for reliability. HPE, through GreenLake cloud services, is delivering all of that where they need to do it.

>> And what about the speed-to-quality trade-off? Have you seen that pop up in customer conversations, and how are organizations dealing with that?

>> Like I said, it depends on what you mean by speed. Do you mean computational speed?

>> No, accelerating the time to insights, if you will. We've got to go faster, faster, be agile with the data. And it's like, "Whoa, move fast, break things." "Whoa, whoa, what about data quality and governance?" Right? They seem to be at odds.

>> Yeah, well, because the processes are fundamentally broken.
You've got a developer who maybe is able to spin up an instance in the public cloud to do their development, but then, to actually do model training, they bring it back on-premises, where they're waiting for a data engineer to make the data available, and then for the tools to be provisioned, which is some esoteric stack, and then the runtime is somewhere else. The entire process is broken. So again, using consistent frameworks and tools, bringing that computation to where the data is, and sort of blowing this construct of pipelines out of the water, I think, is what is going to drive that success in the future. A lot of organizations are not there yet, but that's, I think, aspirationally where they want to be.

>> Yeah, I think you're right. I think that is potentially an answer as to how you, not incrementally, but revolutionize sort of the data business. Last question is talking about GreenLake and how this all fits in. Why GreenLake? Why do you guys feel as though it's differentiable in the marketplace?

>> So, I mean, something that you asked earlier as well: time to value, right? I think that's a very important attribute and kind of a design factor as we look at GreenLake. If you look at GreenLake overall, what does it stand for? It stands for experience. How do we make sure that we have the right experience for the users, right? We spoke about it in the context of data: how do we have a similar experience for different users of data, but also broadly across an enterprise? So it's all about experience. How do you automate it, right? How do you automate the workloads? How do you provision fast? How do you give folks the cloud experience that they have been used to in the public cloud, or using an Apple iPhone? So it's all about experience; I think that's number one.

Number two is about choice and openness. I mean, as we look at it, GreenLake is not a proprietary platform. We are very, very clear that one of the important design principles is choice and openness. And that's the reason you hear us talk about Kubernetes, about Apache Spark, about Delta Lake, et cetera, right? We're using those open source models where customers have a choice. If they don't want to be on GreenLake, they can go to public cloud tomorrow. Or they can run in our colos if they want to do it that way, or in their own colos if they want to do it. So they should have the choice.

Third is about performance. What we've done is not just about the software; we as a company know how to configure infrastructure for the workload, and that's an important part of it. I mean, if you think about machine learning workloads, we have the right Nvidia chips that accelerate those transactions. So that's kind of the third one. And the last one, I think, as I spoke about earlier, is cost. We are very focused on TCO. From a customer perspective, we want to make sure that we are giving a value proposition which is not just about experience, performance, and openness, but also about cost. So if you think about GreenLake, that's the value proposition that we bring to our customers across those four dimensions.

>> Guys, great conversation. Thanks so much, really appreciate your time and insights.

>> Matt: Thanks for having us here, David.

>> All right, you're welcome. And thank you for watching, everybody. Keep it right there for more great content from the HPE GreenLake announcements. You're watching theCUBE. (upbeat music)

Published Date: Sep 28, 2021

