Infrastructure For Big Data Workloads
>> From the SiliconANGLE media office in Boston, Massachusetts, it's theCUBE! Now, here's your host, Dave Vellante. >> Hi, everybody, welcome to this special CUBE Conversation. You know, big data workloads have evolved, and the infrastructure that runs big data workloads is also evolving. Big data, AI, other emerging workloads need infrastructure that can keep up. Welcome to this special CUBE Conversation with Patrick Osborne, who's the vice president and GM of big data and secondary storage at Hewlett Packard Enterprise, @patrick_osborne. Great to see you again, thanks for coming on. >> Great, love to be back here. >> As I said up front, big data's changing. It's evolving, and the infrastructure has to also evolve. What are you seeing, Patrick, and what's HPE seeing in terms of the market forces right now driving big data and analytics? >> Well, some of the things that we see in the data center, there is a continuous move to move from bare metal to virtualized. Everyone's on that train. To containerization of existing apps, your apps of record, business, mission-critical apps. But really, what a lot of folks are doing right now is adding additional services to those applications, those data sets, so, new ways to interact, new apps. A lot of those are being developed with a lot of techniques that revolve around big data and analytics. We're definitely seeing the pressure to modernize what you have on-prem today, but you know, you can't sit there and be static. You gotta provide new services around what you're doing for your customers. A lot of those are coming in the form of this Mode 2 type of application development. >> One of the things that we're seeing, everybody talks about digital transformation. It's the hot buzzword of the day. To us, digital means data first. Presumably, you're seeing that. Are organizations organizing around their data, and what does that mean for infrastructure? >> Yeah, absolutely. We see a lot of folks employing not only technology to do that. They're doing organizational techniques, so, peak teams. You know, bringing together a lot of different functions. Also, too, organizing around the data has become very different right now, that you've got data out on the edge, right? It's coming into the core. A lot of folks are moving some of their edge to the cloud, or even their core to the cloud. You gotta make a lot of decisions and be able to organize around a pretty complex set of places, physical and virtual, where your data's gonna lie. >> There's a lot of talk, too, about the data pipeline. The data pipeline used to be, you had an enterprise data warehouse, and the pipeline was, you'd go through a few people that would build some cubes and then they'd hand off a bunch of reports. The data pipeline, it's getting much more complex. You've got the edge coming in, you've got, you know, core. You've got the cloud, which can be on-prem or public cloud. Talk about the evolution of the data pipeline and what that means for infrastructure and big data workloads. >> For a lot of our customers, and we've got a pretty interesting business here at HPE. We do a lot with the Intelligent Edge, so, our Edgeline servers in Aruba, where a a lot of the data is sitting outside of the traditional data center. Then we have what's going on in the core, which, for a lot of customers, they are moving from either traditional EDW, right, or even Hadoop 1.0 if they started that transformation five to seven years ago, to, a lot of things are happening now in real time, or a combination thereof. The data types are pretty dynamic. Some of that is always getting processed out on the edge. Results are getting sent back to the core. We're also seeing a lot of folks move to real-time data analytics, or some people call it fast data. That sits in your core data center, so utilizing things like Kafka and Spark. A lot of the techniques for persistent storage are brand new. What it boils down to is, it's an opportunity, but it's also very complex for our customers. >> What about some of the technical trends behind what's going on with big data? I mean, you've got sprawl, with both data sprawl, you've got workload sprawl. You got developers that are dealing with a lot of complex tooling. What are you guys seeing there, in terms of the big mega-trends? >> We have, as you know, HPE has quite a few customers in the mid-range in enterprise segments. We have some customers that are very tech-forward. A lot of those customers are moving from this, you know, Hadoop 1.0, Hadoop 2.0 system to a set of essentially mixed workloads that are very multi-tenant. We see customers that have, essentially, a mix of batch-oriented workloads. Now they're introducing these streaming type of workloads to folks who are bringing in things like TensorFlow and GPGPUs, and they're trying to apply some of the techniques of AI and ML into those clusters. What we're seeing right now is that that is causing a lot of complexity, not only in the way you do your apps, but the number of applications and the number of tenants who use that data. It's getting used all day long for various different, so now what we're seeing is it's grown up. It started as an opportunity, a science project, the POC. Now it's business-critical. Becoming, now, it's very mission-critical for a lot of the services that drives. >> Am I correct that those diverse workloads used to require a bespoke set of infrastructure that was very siloed? I'm inferring that technology today will allow you to bring those workloads together on a single platform. Is that correct? >> A couple of things that we offer, and we've been helping customers to get off the complexity train, but provide them flexibility and elasticity is, a lot of the workloads that we did in the past were either very vertically-focused and integrated. One app server, networking, storage, to, you know, the beginning of the analytics phase was really around symmetrical clusters and scaling them out. Now we've got a very rich and diverse set of components and infrastructure that can essentially allow a customer to make a data lake that's very scalable. Compute, storage-oriented nodes, GPU-oriented nodes, so it's very flexible and helps us, helps the customers take complexity out of their environment. >> In thinking about, when you talk to customers, what are they struggling with, specifically as it relates to infrastructure? Again, we talked about tooling. I mean, Hadoop is well-known for the complexity of the tooling. But specifically from an infrastructure standpoint, what are the big complaints that you hear? >> A couple things that we hear is that my budget's flat for the next year or couple years, right? We talked earlier in the conversation about, I have to modernize, virtualize, containerizing my existing apps, that means I have to introduce new services as well with a very different type of DevOps, you know, mode of operations. That's all with the existing staff, right? That's the number one issue that we hear from the customers. Anything that we can do to help increase the velocity of deployment through automation. We hear now, frankly, the battle is for whether I'm gonna run these type of workloads on-prem versus off-prem. We have a set of technology as well as services, enabling services with Pointnext. You remember the acquisition we made around cloud technology partners to right-place where those workloads are gonna go and become like a broker in that conversation and assist customers to make that transition and then, ultimately, give them an elastic platform that's gonna scale for the diverse set of workloads that's well-known, sized, easy to deploy. >> As you get all this data, and the data's, you know, Hadoop, it sorta blew up the data model. Said, "Okay, we'll leave the data where it is, "we'll bring the compute there." You had a lot of skunk works projects growing. What about governance, security, compliance? As you have data sprawl, how are customers handling that challenge? Is it a challenge? >> Yeah, it certainly is a challenge. I mean, we've gone through it just recently with, you know, GDPR is implemented. You gotta think about how that's gonna fit into your workflow, and certainly security. The big thing that we see, certainly, is around if the data's residing outside of your traditional data center, that's a big issue. For us, when we have Edgeline servers, certainly a lot of things are coming in over wireless, there's a big buildout in advent of 5G coming out. That certainly is an area that customers are very concerned about in terms of who has their data, who has access to it, how can you tag it, how can you make sure it's secure. That's a big part of what we're trying to provide here at HPE. >> What specifically is HPE doing to address these problems? Products, services, partnerships, maybe you could talk about that a little bit. Maybe even start with, you know, what's your philosophy on infrastructure for big data and AI workloads? >> I mean, for us, we've over the last two years have really concentrated on essentially two areas. We have the Intelligent Edge, which is, certainly, it's been enabled by fantastic growth with our Aruba products in the networks in space and our Edgeline systems, so, being able to take that type of compute and get it as far out to the edge as possible. The other piece of it is around making hybrid IT simple, right? In that area, we wanna provide a very flexible, yet easy-to-deploy set of infrastructure for big data and AI workloads. We have this concept of the Elastic Platform for Analytics. It helps customers deploy that for a whole myriad of requirements. Very compute-oriented, storage-oriented, GPUs, cold and warm data lakes, for that matter. And the third area, what we've really focused on is the ecosystem that we bring to our customers as a portfolio company is evolving rapidly. As you know, in this big data and analytics workload space, the software development portion of it is super dynamic. If we can bring a vetted, well-known ecosystem to our customers as part of a solution with advisory services, that's definitely one of the key pieces that our customers love to come to HP for. >> What about partnerships around things like containers and simplifying the developer experience? >> I mean, we've been pretty public about some of our efforts in this area around OneSphere, and some of these, the models around, certainly, advisory services in this area with some recent acquisitions. For us, it's all about automation, and then we wanna be able to provide that experience to the customers, whether they want to develop those apps and deploy on-prem. You know, we love that. I think you guys tag it as true private cloud. But we know that the reality is, most people are embracing very quickly a hybrid cloud model. Given the ability to take those apps, develop them, put them on-prem, run them off-prem is pretty key for OneSphere. >> I remember Antonio Neri, when you guys announced Apollo, and you had the astronaut there. Antonio was just a lowly GM and VP at the time, and now he's, of course, CEO. Who knows what's in the future? But Apollo, generally at the time, it was like, okay, this is a high-performance computing system. We've talked about those worlds, HPC and big data coming together. Where does a system like Apollo fit in this world of big data workloads? >> Yeah, so we have a very wide product line for Apollo that helps, you know, some of them are very tailored to specific workloads. If you take a look at the way that people are deploying these infrastructures now, multi-tenant with many different workloads. We allow for some compute-focused systems, like the Apollo 2000. We have very balanced systems, the Apollo 4200, that allow a very good mix of CPU, memory, and now customers are certainly moving to flash and storage-class memory for these type of workloads. And then, Apollo 6500 were some of the newer systems that we have. Big memory footprint, NVIDIA GPUs allowing you to do very high calculations rates for AI and ML workloads. We take that and we aggregate that together. We've made some recent acquisitions, like Plexxi, for example. A big part of this is around simplification of the networking experience. You can probably see into the future of automation of the networking level, automation of the compute and storage level, and then having a very large and scalable data lake for customers' data repositories. Object, file, HTFS, some pretty interesting trends in that space. >> Yeah, I'm actually really super excited about the Plexxi acquisition. I think it's because flash, it used to be the bottleneck was the spinning disk, flash pushes the bottleneck largely to the network. Plexxi gonna allow you guys to scale, and I think actually leapfrog some of the other hyperconverged players that are out there. So, super excited to see what you guys do with that acquisition. It sounds like your focus is on optimizing the design for I/O. I'm sure flash fits in there as well. >> And that's a huge accelerator for, even when you take a look at our storage business, right? So, 3PAR, Nimble, All-Flash, certainly moving to NVMe and storage-class memory for acceleration of other types of big data databases. Even though we're talking about Hadoop today, right now, certainly SAP HANA, scale-out databases, Oracle, SQL, all these things play a part in the customer's infrastructure. >> Okay, so you were talking before about, a little bit about GPUs. What is this HPE Elastic Platform for big data analytics? What's that all about? >> I mean, we have a lot of the sizing and scalability falls on the shoulders of our customers in this space, especially in some of these new areas. What we've done is, we have, it's a product/a concept, and what we do is we have this, it's called the Elastic Platform for Analytics. It allows, with all those different components that I rattled off, all great systems in of their own, but when it comes to very complex multi-tenant workloads, what we do is try to take the mystery out of that for our customers, to be able to deploy that cookie-cutter module. We're even gonna get to a place pretty soon where we're able to offer that as a consumption-based service so you don't have to choose for an elastic type of acquisition experience between on-prem and off-prem. We're gonna provide that as well. It's not only a set of products. It's reference architectures. We do a lot of sizing with our partners. The Hortonworks, CloudEra's, MapR's, and a lot of the things that are out in the open source world. It's pretty good. >> We've been covering big data, as you know, for a long, long time. The early days of big data was like, "Oh, this is great, "we're just gonna put white boxes out there "and off the shelf storage!" Well, that changed as big data got, workloads became more enterprise, mainstream, they needed to be enterprise-ready. But my question to you is, okay, I hear you. You got products, you got services, you got perspectives, a philosophy. Obviously, you wanna sell some stuff. What has HPE done internally with regard to big data? How have you transformed your own business? >> For us, we wanna provide a really rich experience, not just products. To do that, you need to provide a set of services and automation, and what we've done is, with products and solutions like InfoSight, we've been able to, we call it AI for the Data Center, or certainly, the tagline of predictive analytics is something that Nimble's brought to the table for a long time. To provide that level of services, InfoSight, predictive analytics, AI for the Data Center, we're running our own big data infrastructure. It started a number of years ago even on our 3PAR platforms and other products, where we had scale-up databases. We moved and transitioned to batch-oriented Hadoop. Now we're fully embedded with real-time streaming analytics that come in every day, all day long, from our customers and telemetry. We're using AI and ML techniques to not only improve on what we've done that's certainly automating for the support experience, and making it easy to manage the platforms, but now introducing things like learning, automation engines, the recommendation engines for various things for our customers to take, essentially, the hands-on approach of managing the products and automate it and put into the products. So, for us, we've gone through a multi-phase, multi-year transition that's brought in things like Kafka and Spark and Elasticsearch. We're using all these techniques in our system to provide new services for our customers as well. >> Okay, great. You're practitioners, you got some street cred. >> Absolutely. >> Can I come back on InfoSight for a minute? It came through an acquisition of Nimble. It seems to us that you're a little bit ahead, and maybe you say a lot a bit ahead of the competition with regard to that capability. How do you see it? Where do you see InfoSight being applied across the portfolio, and how much of a lead do you think you have on competitors? >> I'm paranoid, so I don't think we ever have a good enough lead, right? You always gotta stay grinding on that front. But we think we have a really good product. You know, it speaks for itself. A lot of the customers love it. We've applied it to 3PAR, for example, so we came out with some, we have VMVision for a 3PAR that's based on InfoSight. We've got some things in the works for other product lines that are imminent pretty soon. You can think about what we've done for Nimble and 3PAR, we can apply similar type of logic to Elastic Platform for Analytics, like running at that type of cluster scale to automate a number of items that are pretty pedantic for the customers to manage. There's a lot of work going on within HPE to scale that as a service that we provide with most of our products. >> Okay, so where can I get more information on your big data offerings and what you guys are doing in that space? >> Yeah, so, we have, you can always go to hp.com/bigdata. We've got some really great information out there. We're in our run-up to our big end user event that we do every June in Las Vegas. It's HPE Discover. We have about 15,000 of our customers and trusted partners there, and we'll be doing a number of talks. I'm doing some work there with a British telecom. We'll give some great talks. Those'll be available online virtually, so you'll hear about not only what we're doing with our own InfoSight and big data services, but how other customers like BTE and 21st Century Fox and other folks are applying some of these techniques and making a big difference for their business as well. >> That's June 19th to the 21st. It's at the Sands Convention Center in between the Palazzo and the Venetian, so it's a good conference. Definitely check that out live if you can, or if not, you can all watch online. Excellent, Patrick, thanks so much for coming on and sharing with us this big data evolution. We'll be watching. >> Yeah, absolutely. >> And thank you for watcihing, everybody. We'll see you next time. This is Dave Vellante for theCUBE. (fast techno music)
SUMMARY :
From the SiliconANGLE media office and the infrastructure that in terms of the market forces right now to modernize what you have on-prem today, One of the things that we're seeing, of their edge to the cloud, of the data pipeline A lot of the techniques What about some of the technical trends for a lot of the services that drives. Am I correct that a lot of the workloads for the complexity of the tooling. You remember the acquisition we made the data where it is, is around if the data's residing outside Maybe even start with, you know, of the Elastic Platform for Analytics. Given the ability to take those apps, GM and VP at the time, automation of the compute So, super excited to see what you guys do in the customer's infrastructure. Okay, so you were talking before about, and a lot of the things But my question to you and automate it and put into the products. you got some street cred. bit ahead of the competition for the customers to manage. that we do every June in Las Vegas. Definitely check that out live if you can, We'll see you next time.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Patrick | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Aruba | LOCATION | 0.99+ |
Antonio | PERSON | 0.99+ |
BTE | ORGANIZATION | 0.99+ |
Patrick Osborne | PERSON | 0.99+ |
HPE | ORGANIZATION | 0.99+ |
June 19th | DATE | 0.99+ |
Antonio Neri | PERSON | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
Pointnext | ORGANIZATION | 0.99+ |
Hewlett Packard Enterprise | ORGANIZATION | 0.99+ |
NVIDIA | ORGANIZATION | 0.99+ |
third area | QUANTITY | 0.99+ |
21st Century Fox | ORGANIZATION | 0.99+ |
Apollo 4200 | COMMERCIAL_ITEM | 0.99+ |
@patrick_osborne | PERSON | 0.99+ |
Apollo 6500 | COMMERCIAL_ITEM | 0.99+ |
InfoSight | ORGANIZATION | 0.99+ |
MapR | ORGANIZATION | 0.99+ |
Sands Convention Center | LOCATION | 0.99+ |
Boston, Massachusetts | LOCATION | 0.98+ |
Apollo 2000 | COMMERCIAL_ITEM | 0.98+ |
CloudEra | ORGANIZATION | 0.98+ |
HP | ORGANIZATION | 0.98+ |
Nimble | ORGANIZATION | 0.98+ |
Spark | TITLE | 0.98+ |
SAP HANA | TITLE | 0.98+ |
next year | DATE | 0.98+ |
GDPR | TITLE | 0.98+ |
One app | QUANTITY | 0.98+ |
Venetian | LOCATION | 0.98+ |
two areas | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
hp.com/bigdata | OTHER | 0.97+ |
one | QUANTITY | 0.97+ |
Hortonworks | ORGANIZATION | 0.97+ |
Mode 2 | OTHER | 0.96+ |
single platform | QUANTITY | 0.96+ |
SQL | TITLE | 0.96+ |
One | QUANTITY | 0.96+ |
21st | DATE | 0.96+ |
Elastic Platform | TITLE | 0.95+ |
3PAR | TITLE | 0.95+ |
Hadoop 1.0 | TITLE | 0.94+ |
seven years ago | DATE | 0.93+ |
CUBE Conversation | EVENT | 0.93+ |
Palazzo | LOCATION | 0.93+ |
Hadoop | TITLE | 0.92+ |
Kafka | TITLE | 0.92+ |
Hadoop 2.0 | TITLE | 0.91+ |
Elasticsearch | TITLE | 0.9+ |
Plexxi | ORGANIZATION | 0.87+ |
Apollo | ORGANIZATION | 0.87+ |
of years ago | DATE | 0.86+ |
Elastic Platform for Analytics | TITLE | 0.85+ |
Oracle | ORGANIZATION | 0.83+ |
TensorFlow | TITLE | 0.82+ |
Edgeline | ORGANIZATION | 0.82+ |
Intelligent Edge | ORGANIZATION | 0.81+ |
about 15,000 of | QUANTITY | 0.78+ |
one issue | QUANTITY | 0.77+ |
five | DATE | 0.74+ |
HPE Discover | ORGANIZATION | 0.74+ |
both data | QUANTITY | 0.73+ |
data | ORGANIZATION | 0.73+ |
years | DATE | 0.72+ |
SiliconANGLE | LOCATION | 0.71+ |
EDW | TITLE | 0.71+ |
Edgeline | COMMERCIAL_ITEM | 0.71+ |
HPE | TITLE | 0.7+ |
OneSphere | ORGANIZATION | 0.68+ |
couple | QUANTITY | 0.64+ |
3PAR | ORGANIZATION | 0.63+ |