Seamus Jones & Milind Damle
>> Welcome to theCUBE's continuing coverage of AMD's fourth generation EPYC launch. I'm Dave Nicholson, and I'm joining you here in our Palo Alto studios. We have two very interesting guests to dive into some of the announcements that have been made, and maybe take a look at this from an AI and ML perspective. Our first guest is Milind Damle. He's a senior director for software and solutions at AMD, and we're also joined by Seamus Jones, who's a director of server engineering at Dell Technologies. Welcome, gentlemen. How are you?
>> Very good, thank you.
>> Welcome to theCUBE. So let's start out really quickly. Seamus, give us a thumbnail sketch of what you do at Dell.
>> Yeah, so I'm the director of technical marketing engineering here at Dell, and our team really takes a look at the technical server portfolio and solutions, and ensures that we can look at the performance metrics, benchmarks, and performance characteristics, so that we can give customers a good idea of what they can expect from the server portfolio when they're looking to buy PowerEdge from Dell.
>> Milind, how about you? What's new at AMD? What do you do there?
>> Great to be here. Thank you for having me. At AMD, I'm the senior director of performance engineering and ISV ecosystem enablement, which is a long-winded way of saying we do a lot of benchmarks, improve performance, and demonstrate, with wonderful partners such as Seamus and Dell, the combined leverage that AMD fourth generation processors and Dell systems can bring to bear on a multitude of applications across the industry spectrum.
>> Seamus, talk about that relationship a little bit more. The relationship between AMD and Dell: how far back does it go? What does it look like in practical terms?
>> Absolutely. So, you know, ever since AMD reentered the server space, we've had a very close relationship. It's one of those things where we are offering solutions to our customers no matter what generation of portfolio, whether they're demanding it from AMD's competitor or from AMD. We offer a portfolio of solutions that are out there. What we're finding is that with each generational improvement, they're just getting better and better. Really exciting things are happening from AMD at the moment, and we're seeing that as we engineer those CPU stacks into our server portfolio, we're really seeing unprecedented performance across the board. So I'm excited about the history. My team and Milind's team work very closely together, so much so that we're communicating almost on a daily basis around portfolio platforms and updates, and around the benchmark testing and validation efforts.
>> So Milind, are you happy with these PowerEdge boxes that Seamus is building to house your baby?
>> We are delighted. You know, it's hard to find stronger partners than Seamus and Dell. With AMD's second generation EPYC server CPUs, we already had indisputable industry performance leadership, and then with the third and now the fourth generation CPUs, we've just increased our lead over the competition. We've got so many outstanding features at the platform and CPU level. Everybody focuses on the high core counts, but there's also DDR5, the memory, the I/O, and the storage subsystem.
So we believe we have a fantastic performance, performance per dollar, and performance per watt edge over the competition, and we look to partners such as Dell to help us showcase that leadership.
>> Well, so Seamus...
>> Yeah, go ahead, Dave.
>> What I'd add, Dave, is that through the partnership we've had, we've been able to develop subsystems and platform features that historically we couldn't have, really, things around thermals, power efficiency, and efficiency within the platform. That means that customers can get the most out of their compute infrastructure.
>> So this is going to be a big question moving forward as next generation platforms are rolled out: there's the potential for people to have sticker shock. You talk about something that has eight or 12 cores in a physical enclosure versus 96 cores, and I guess the question is, do the ROI and TCO numbers look good for someone to make that upgrade? Seamus, you want to hit that first?
>> Absolutely. I'll tell you what, at the moment, customers really can't afford not to upgrade, right? We've taken a look at the cost basis of keeping older infrastructure in place, let's say five or seven year old servers that are drawing more power, maybe are poorly utilized within the infrastructure, and take more and more effort and time to manage, maintain, and really keep in production. So as customers look to upgrade or refresh their platforms, what we're finding is that they can achieve a dramatic consolidation, sometimes 5, 7, or 8 to 1, depending on which platform they have historically and which one they're looking to upgrade to. Within AI specifically and machine learning frameworks, we're seeing really unprecedented performance. Milind's team partnered with us to deliver multiple benchmarks for the launch, some of which we're still continuing to see the goodness from, things like TPCx-AI as a framework, and I'm talking here specifically about the CPU-based performance. Even though in a lot of those AI frameworks you would also expect to have GPUs, all four of the platforms that we're offering in the AMD portfolio today offer multiple GPU options. So we're seeing a balance between a huge amount of CPU gain and performance, as well as more and more GPU offerings within the platform. That was a real challenge for us because of the thermals. I mean, you think GPUs are going up to 300, 400 watts, and these CPUs at 96 cores are quite demanding thermally. But what we're able to do, through some unique smart cooling engineering within the PowerEdge portfolio, is take a look at those platforms and make the most efficient use case by having things like telemetry within the platform, so that we can dynamically change fan speeds to get customers the best performance without throttling, based on their need.
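[Editor's note: as a concrete illustration of the platform telemetry Seamus describes, below is a minimal sketch of reading temperature and fan readings from a server's baseboard management controller via the standard Redfish Thermal resource. The host, chassis path, and credentials are placeholder assumptions, not a documented Dell procedure; the closed-loop fan control he refers to happens in platform firmware, not in scripts like this.]

```python
# Minimal sketch: polling thermal telemetry over Redfish.
# Assumes a BMC (such as Dell iDRAC) exposing the standard Redfish
# Thermal resource; host, path, and credentials are placeholders.
import requests

BMC_HOST = "https://bmc.example.com"                       # hypothetical BMC
THERMAL = "/redfish/v1/Chassis/System.Embedded.1/Thermal"  # common iDRAC path

session = requests.Session()
session.auth = ("root", "password")        # placeholder credentials
resp = session.get(BMC_HOST + THERMAL, verify=False)
resp.raise_for_status()
data = resp.json()

# Temperature sensors reported by the platform.
for sensor in data.get("Temperatures", []):
    print(f"{sensor.get('Name')}: {sensor.get('ReadingCelsius')} C")

# Fan speeds the platform is currently driving.
for fan in data.get("Fans", []):
    print(f"{fan.get('Name')}: {fan.get('Reading')} {fan.get('ReadingUnits')}")
```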
>> Milind, theCUBE was at the Supercomputing conference in Dallas this year, Supercomputing Conference 2022, and a lot of the discussion was around not only advances in microprocessor technology, but also advances in interconnect technology. How do you manage that sort of research partnership with Dell when you aren't strictly focusing on the piece that you're bringing to the party? It's kind of a potluck, you know. We mentioned PCIe Gen 5, or 5.0, whatever you want to call it, new DDR, storage cards, NICs, accelerators, all of those things. How do you keep that straight when those aren't things that you actually build?
>> Excellent question, Dave. As we are developing the next platform, obviously the ongoing relationship with Dell is there, but we start way before launch, sometimes multiple years before launch. So we are not just focusing on the super high core counts at the CPU level and the platform configurations, whether it's single socket or dual socket; we are looking at it from the memory subsystem and the I/O subsystem. PCIe lanes for storage are a big deal, for example, in this generation. So it's really a holistic approach. And look, core counts are more important at the higher end for some customers, the HPC space and some of the AI applications, but on the lower end you have database applications or other ISV applications that care a lot about those other attributes. So different things matter to different folks across verticals. We partnered with Dell very early in the cycle, and it's really joint co-engineering. Seamus talked about the focus on AI with TPCx-AI, and we set five world records in that space just on that one benchmark with AMD and Dell. A fantastic kickoff across a multitude of scale factors. But TPCx-AI is not the only thing we are focusing on. We are also collaborating with Dell on some of the transformer-based natural language processing models that we worked on, for example. So it's not just a CPU story; it's CPU, platform, subsystem, software, the whole thing delivering goodness across the board to solve end user problems in AI and other verticals.
>> Yeah, the two of you are at the tip of the spear from a performance perspective. So I know it's easy to get excited about world records, and they're fantastic, but Seamus, you know that end user customers might immediately have the reaction: well, I don't need a Ferrari in my data center; what I need is to be able to do more with less. Aren't we delivering that also? And Milind, you mentioned natural language processing. Seamus, are you thinking in 2023 that a lot more enterprises are going to be able to afford to do things like that? I mean, what are you hearing from customers on this front?
>> I mean, while the adoption of the top bin CPU stack is definitely the exception, not the rule, today we are seeing marked performance gains even when we look at the mid bin CPU offerings from AMD; those are the most commonly sold SKUs. And when we look at customers' implementations, really what we're seeing is that they're trying to make the most, not just of dollar spend, but of the whole subsystem that Milind was talking about. The fact is that balanced memory configs can give you marked performance improvements, not just at the CPU level, but actually all the way through to application performance. So it's trying to find the correct balance between the application needs, your budget, power draw, and infrastructure within the data center, right?
Because not only could you be purchasing and looking to deploy the most powerful systems, but if you don't have an infrastructure with the right power (and that's a large challenge happening right now) and the right cooling to deal with the thermal differences of the systems, you want to ensure that you can accommodate those, not just for today but for the future, right? So it's planning that balance.
>> If I may just add on to that: when we launched, not just the fourth generation but any generation in the past, there's a natural tendency to zero in on the top bin and say, wow, we've got so many cores. But as Seamus correctly said, it's not just that one top core count OPN; it's the whole stack. And we believe with our fourth gen EPYC processor stack, we've simplified things so much. We don't have dozens and dozens of offerings; we have a fairly simple SKU stack, but we also have a very efficient SKU stack. So even though at the top end we've got 96 cores, the thermal budget that we require is fairly reasonable. And look, with the energy crisis going on, especially in Europe, this is a big deal. Not only do customers want performance, but they're also super focused on performance per watt. And so we believe with this generation we really delivered not just on raw performance, but also on performance per dollar and performance per watt.
>> Yeah, and it's not just Europe. We are here in Palo Alto right now, which is in California, where we all know the cost of an individual kilowatt hour of electricity, because it's quite high. So thermals, power, cooling, all of that goes together, and that drives cost. So it's a question of how much you can get done per dollar. Seamus, you made the point that you don't just have a one size fits all solution, that it's fit for function. I'm curious to hear from the two of you what your thoughts are from a general AI and ML perspective. We're starting to see right now, if you hang out on any kind of social media, the rise of these experimental AI programs that are being presented to the public. Some will write stories for you based on prompts, some will create images for you. One of the more popular ones will create sort of a superhero alter ego for you. I can't wait to do it; I just got the app on my phone. So those are all fun and they're trivial, but they sort of get us used to this idea that, wow, these systems can do things; they can think on their own in a certain way. What do you see the future of that looking like over the next year in terms of enterprises, what they're going to do with it? Milind?
>> Yeah, I can go first.
>> Sure, good.
>> So the couple of examples, Dave, that you mentioned are, I guess, a blend of novelty and curiosity. You know, people using AI to write stories or poems, or even carve out little jokes, check grammar and spelling: very useful, but still kind of in the realm of novelty. In the mainstream, in the enterprise, look, in my opinion, AI is not just going to be a vertical; it's going to be a horizontal capability.
We are seeing AI deployed across the board, once the models have been suitably trained, for disparate functions ranging from fraud detection or anomaly detection, both in the financial markets and in manufacturing, to things like the image classification or object detection that you talked about in the core AI space itself. So we don't think of AI necessarily as a vertical, although we are showcasing it with a specific benchmark for launch; we really look at AI emerging as a horizontal capability, and frankly, companies that don't adopt AI on a massive scale run the risk of being left behind.
>> Yeah, absolutely. AI as an outcome is really something that companies are adopting, and the frameworks that you're now seeing as the novelty pieces Milind was talking about are really indicative of the under-the-covers activity that's been happening within infrastructures and within enterprises for the past, let's say, five, six, seven years. The fact that you have object detection within manufacturing, to be able to do defect detection on manufacturing lines: now that can be done on edge platforms, all the way out at the device. So you're no longer having to do things only in the data center; you can bring it right out to the edge and have high performance inferencing models there. Not necessarily training at the edge, but the inferencing models especially, so that you can have more and better use cases for some of these instances, things like smart cities with video detection. During COVID we saw a lot of hospitals and a lot of customers using image and spatial detection within their video feeds to determine which employees were at risk. So there are a lot of different use cases that have been coming around. I think the novelty aspect of it is really interesting, and I know my kids, my daughters, love that portion of it, but really, what's been happening has been exciting for quite a period of time in the enterprise space. We're just now starting to actually see it come to light in more of a consumer relevant kind of use case. The technology that's been developed in the data center around all of these different use cases is now starting to feed in, because we do have more powerful compute at our fingertips, and we do have the ability to put more of the framework and infrastructure right out at the edge. You know, Dave, in the past you've said things like: the data center of 20 years ago is now in my hand as my cell phone.
>> That's right.
>> And that's a fact, and it's exciting to think where it's going to be in the next 10 or 20 years.
>> One terabyte, baby. One terabyte. It's mind boggling.
>> Exactly.
>> And it makes me feel old.
>> Yeah, me too.
>> And Seamus, that all sounded great, but all I want is a picture of me as a superhero, so you guys are already way ahead of the curve. On that note, Seamus, wrap us up with a summary of the highlights of what we just went through, in terms of the performance you're seeing out of this latest gen architecture from AMD.
>> Absolutely.
So within the TPCx-AI frameworks that Milind's team and my team have worked together on, we're seeing unprecedented price performance. The fact that you can get a 220% uplift gen on gen for some of these benchmarks, and that you can have a five to one consolidation, means that if you're looking to refresh platforms that are legacy, you can get a huge amount of benefit, both in the reduction in the number of units you need to deploy and in the amount of performance you can get per unit. You know, Milind mentioned earlier CPU performance and performance per watt. Specifically on the two-socket 2U platform using the fourth generation AMD EPYC, we're seeing 55% higher CPU performance per watt. For people who aren't necessarily looking at these statistics every generation of servers, that is a huge leap forward. That, combined with 121% higher SPEC scores as a benchmark: those are huge. Normally we see, let's say, a 40 to 60% performance improvement on the SPEC benchmarks; we're seeing 121%. And while that's really impressive at the top bin, we're actually seeing large percentage improvements across the mid bins as well, things in the range of 70 to 90% performance improvements in those standard bins. So it's a huge improvement in performance and power efficiency, which means customers are able to save energy, space, and time based on their deployment size.
>> Thanks for that, Seamus. Sadly, gentlemen, our time has expired. With that, I want to thank both of you. It's been a very interesting conversation. Thanks for being with us, both of you. Thanks for joining us here on theCUBE for our coverage of AMD's fourth generation EPYC launch. Additional information, including white papers and benchmarks, plus editorial coverage, can be found on doeshardwarematter.com.
Ian Colle, AWS | SuperComputing 22
(lively music)
>> Good morning. Welcome back to theCUBE's coverage at Supercomputing Conference 2022, live here in Dallas. I'm Dave Nicholson with my co-host Paul Gillin. So far so good, Paul? It's been a fascinating morning. Three days in, and a fascinating guest, Ian from AWS. Welcome.
>> Thanks, Dave.
>> What are we going to talk about? Batch computing, HPC.
>> We've got a lot, let's get started. Let's dive right in. Yeah, we've got a lot to talk about. I mean, the first thing is we recently announced our batch support for EKS. EKS is our managed Kubernetes offering at AWS. And batch computing is still a large portion of HPC workloads. While the interactive component is growing, the vast majority of systems are just kind of fire and forget, and we want to run thousands and thousands of nodes in parallel. We want to scale out those workloads. And what's unique about our AWS Batch offering is that we can dynamically scale based upon the queue depth. Customers can go from seemingly nothing up to thousands of nodes, and while they're executing their work, they're only paying for the instances while they're working. And then, as the queue depth starts to drop and the number of jobs waiting in the queue starts to drop, we start to dynamically scale down those resources. And so it's extremely powerful. We see lots of distributed machine learning, autonomous vehicle simulation, and traditional HPC workloads taking advantage of AWS Batch.
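[Editor's note: to make the queue-depth-driven scaling concrete, here is a hedged boto3 sketch of registering a Batch compute environment against an existing EKS cluster. The ARNs, subnets, instance types, and roles are placeholders, and the exact required fields should be taken from the AWS Batch on EKS documentation; the key idea is minvCpus=0, so the environment scales to nothing when the queue is empty.]

```python
# Sketch: an AWS Batch managed compute environment targeting an EKS
# cluster, scaling from 0 vCPUs (idle) up to a cap as jobs queue up.
# All identifiers below are illustrative placeholders.
import boto3

batch = boto3.client("batch")

batch.create_compute_environment(
    computeEnvironmentName="eks-batch-ce",
    type="MANAGED",                       # Batch drives the scaling
    state="ENABLED",
    eksConfiguration={
        "eksClusterArn": "arn:aws:eks:us-east-1:111122223333:cluster/hpc",
        "kubernetesNamespace": "batch-jobs",
    },
    computeResources={
        "type": "EC2",
        "minvCpus": 0,                    # scale to zero when queue is empty
        "maxvCpus": 4096,                 # ceiling as queue depth grows
        "instanceTypes": ["c5", "m5"],
        "subnets": ["subnet-0123456789abcdef0"],
        "securityGroupIds": ["sg-0123456789abcdef0"],
        "instanceRole": "ecsInstanceRole",  # placeholder instance profile
    },
)

# Jobs submitted to this queue trigger scale-up of the environment above.
batch.create_job_queue(
    jobQueueName="eks-batch-queue",
    state="ENABLED",
    priority=1,
    computeEnvironmentOrder=[{"order": 1, "computeEnvironment": "eks-batch-ce"}],
)
```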
>> So when you have a Kubernetes cluster, does it have to be located in the same region as the HPC cluster that's going to be doing the batch processing? Or does the nature of batch processing mean, in theory, you can move something from here to somewhere relatively far away to do the batch processing? How does that work? 'Cause look, we're walking around here and people are talking about lengths of cables in order to improve performance. So what does that look like when you peel back the cover and you look at it physically, not just logically? AWS is everywhere, but physically, what does that look like?
>> Oh, physically, for us, it depends on what the customer's looking for. We have workflows that are entirely within a single region, where they could have a portion of, say, the traditional HPC workflow within that region as well as the batch, and they're saving off the results, say, to a shared storage file system like our Amazon FSx for Lustre, or maybe aging that back to S3 object storage for a lower cost storage solution. Or you can have customers with a kind of multi-region orchestration layer, where they say, "You know what? I've got a portion of my workflow that occurs over on the other side of the country, and I replicate my data between the East Coast and the West Coast just based upon business needs, and I want to have that available to customers over there. And so I'll do a portion of it on the East Coast and a portion of it on the West Coast." Or you can think of that even globally. It really depends upon the customer's architecture.
>> So is the intersection of Kubernetes with HPC relatively new? I know you're saying you're announcing it.
>> It really is. I think we've seen a growing perspective. I mean, Kubernetes has for a long time been kind of eating everything, right, in the enterprise space? And now a lot of CIOs in the industrial space are saying, "Why am I using one orchestration layer to manage my HPC infrastructure and another one to manage my enterprise infrastructure?" And so there's a growing appreciation that, you know what, why don't we just consolidate on one? And that's where we've seen a growth of Kubernetes infrastructure and our own managed Kubernetes, EKS, on AWS.
>> Last month you announced general availability of Trainium, a chip that's optimized for AI training. Talk about what's special about that chip, or how it is customized to training workloads.
>> Yeah, what's unique about Trainium is that you'll see 40% better price performance than any other GPU available in the AWS cloud. And so we've really geared it to be the most price performant of options for our customers. And that's what we like about the silicon team that came with the Annapurna acquisition: it has really enabled us to have this differentiation and to innovate not just at the software level, but across the entire stack. That Annapurna Labs team develops our network cards, they develop our Arm cards, they developed this Trainium chip. So that silicon innovation has become a core part of our differentiation from other vendors. And what Trainium allows you to do is perform similar workloads, just at a better price performance.
>> And you also have a chip several years older, called Inferentia...
>> Um-hmm.
>> ...which is for inferencing. What is the difference between the two? I mean, when would a customer use one versus the other? How would you move the workload?
>> What we've seen is that customers traditionally have looked for a certain class of machine, more of a compute type that is not as accelerated or as heavy as you would need Trainium to be, for the inference portion of their workload. When they do that training, they want the really beefy machines that can grind through a lot of data. But when you're doing the inference, it's a little lighter weight. So it's a different class of machine, and that's why we've got those two different product lines, with Inferentia being there to support the inference portions of the workflow, and Trainium to do that kind of heavy duty training work.
>> And then you advise them on how to migrate their workloads from one to the other? And once the model is trained, would they switch to an Inferentia-based instance?
>> Definitely, definitely. We help them work through what the design of that workflow looks like. Some customers are very comfortable doing self-service and just kind of building it on their own. Other customers look for a more professional services engagement, to say, "Hey, can you come in and help me work through how I might modify my workflow to take full advantage of these resources?"
>> The HPC world has been somewhat slower than commercial computing to migrate to the cloud because...
>> You're very polite. (panelists all laughing)
>> Latency issues, they want to control the workload; I mean, there are even issues with moving large amounts of data back and forth. What do you say to them? What's the argument for ditching the on-prem supercomputer and going all-in on AWS?
>> Well, I mean, to be fair, I started at AWS five years ago, and I can tell you, when I showed up at Supercomputing, even though I'd been part of this community for many years, they said, "What is AWS doing at Supercomputing? Wait, it's Amazon Web Services."
"You care about the web; can you actually handle supercomputing workloads?" Now, the thing that very few people appreciated is that yes, we could. Even at that time in 2017, we had customers that were performing HPC workloads. Now, that being said, there were some real limitations on what we could perform. And over those past five years, as we've grown as a company, we've started to really eliminate those frictions for customers to migrate their HPC workloads to the AWS cloud. When I started in 2017, we didn't have our Elastic Fabric Adapter, our low-latency interconnect, so customers were stuck with standard TCP/IP. For their highly demanding Open MPI workloads, we just didn't have the latencies to support them, so the jobs didn't run as efficiently as they could. We didn't have Amazon FSx for Lustre, our managed Lustre offering for a high-performance, POSIX-compliant file system, which is key, because a large portion of HPC workloads require a high-performance file system. We had about 25 gigs of networking when I started; now, with our accelerated instances, we've got 400 gigs of networking. So we've really continued to grow across that spectrum and to eliminate a lot of those frictions to adoption. I mean, one of the key ones: we had an open source toolkit that was jointly developed by Intel and AWS called CfnCluster that customers were using just to instantiate their clusters. And now we've migrated that all the way to a fully functional, supported service at AWS called AWS ParallelCluster. And so you've seen, over those past five years, we have had to develop, we've had to grow, we've had to earn the trust of these customers and say: come run your workloads on us and we will demonstrate that we can meet your demanding requirements. And at the same time, there's been, I'd say, more of a cultural acceptance. People have gone from, again, five years ago, "what are you doing walking around the show?" to "okay, I'm not sure I get it, I need to look at it," to "okay, now it needs to be a part of my architecture," with the standard questions: is it secure? Is it price performant? How does it compare to my on-prem? And really, culturally, a lot of it is just getting IT administrators used to the idea that we're not eliminating a whole field, right? We're just upskilling the people that used to rack and stack actual hardware to learning AWS services and how to operate within that environment. And it's still key to have those people really supporting these infrastructures. So I'd say it's a combination: a cultural shift over the past five years to see that cloud is a super important part of HPC workloads, and part of it has been us meeting the market segment where we needed to, by innovating both at the hardware level and at the software level, which we're going to continue to do.
>> You do have an on-prem story, though. I mean, you have Outposts. We don't hear a lot of talk about Outposts lately, but these innovations, like Inferentia, like Trainium, like the networking innovation you're talking about, are these going to make their way into Outposts as well? Will that essentially become the supercomputing solution for customers who want to stay on-prem?
>> Well, we'll see what the future holds, but we believe that we've got, as you noted, the hardware, the network, and the storage.
All those put together give you a high-performance computer, right? And whether you want it to be resident in your local data center or you want it to be accessible via APIs from the AWS cloud, we want to provide that service to you.
>> So to be clear, that's not available now, but that is something that could be made available?
>> Outposts are available right now that have the services you need.
>> All these capabilities?
>> Often with a move to cloud, the impetus behind it comes from the highest levels in an organization. They're looking at the difference between OpEx versus CapEx. CapEx for a large HPC environment can be very, very high. Are these HPC clusters consumed as an operational expense? Are you essentially renting time? And then a fundamental question: are these multi-tenant environments? Or, when you're referring to batches being run in HPC, are these dedicated HPC environments for customers who are running batches against them? When you think about batches, there are times when batches are being run and there are times when they're not being run. So that would sort of conjure, in the imagination, multi-tenancy. What does that look like?
>> Definitely, and let me start with your second part first.
>> Yeah.
>> That's been a core area within AWS: we do not see it as, okay, we're going to carve out this supercomputer and then allocate it to you. We are going to dynamically allocate multi-tenant resources to you to perform the workloads you need. And especially with the batch environment, we're going to spin up containers on those, and then, as the workloads complete, we're going to turn those resources over to where they can be utilized by other customers. And that's where the batch computing component really is powerful, because, as you say, you're releasing resources from workloads that you're done with, and those can be used for another portion of the workflow, for other work.
>> Okay, so it makes a huge difference, yeah.
>> You mentioned that, five years ago, people couldn't quite believe that AWS was at this conference. Now you've got a booth right out in the center of the action. What kind of questions are you getting? What are people telling you?
>> Well, I love being on the show floor. This is like my favorite part: talking to customers and hearing, one, what do they love, what do they want more of? Two, what do they wish we were doing that we're not currently doing? And three, what are the friction points that still exist; like, how can I make their lives easier? And what we're hearing is, "Can you help me migrate my workloads to the cloud? Can you give me the information I need, both for a price for performance and for an operational support model, and really help me be an internal advocate within my environment, to explain how my resources can be operated proficiently within the AWS cloud?" And a lot of times it's: let's just take a subset of your applications and let's benchmark them. And really, at AWS, one of the key things is we are a data-driven environment. So when you take that data, you can help a customer say, "Let's not look at hypothetical, synthetic benchmarks. Let's take the actual LS-DYNA code that you're running, perhaps. Let's take the OpenFOAM code that you're currently running in your on-premises workloads, and let's run it on the AWS cloud and see how it performs."
And then we can take that back to the decision makers and say: okay, here's the price for performance on AWS, here's what we're currently doing on-premises, how do we think about that? And that also ties into your earlier question about CapEx versus OpEx. We have models where you can actually capitalize a longer-term purchase at AWS, so it doesn't have to be OpEx, depending upon the accounting models you want to use. But we do have a majority of customers that will stay with that OpEx model, and they like that flexibility of saying, "Okay, spend as you go." We need to have true-ups and make sure that they have insight into what they're doing. I think one of the boogeymen is: oh, I'm going to spend all my money and I'm not going to know what's available. And so we want to provide the cost visibility and the cost controls, so that as an HPC administrator you feel like you have insight into what your customers are doing and you have control over that. Once you take away some of those fears and give them the information they need, what you start to see, too, is: you know what, we really didn't have a lot of that cost visibility and control with our on-premises hardware. We've had some customers tell us they had one portion of the workload where a work center was spending thousands of dollars a day, and we went back to them and said, "Hey, we started to show this, what you were spending on-premises," and they went, "Oh, I didn't realize that." And so I think that's part of a cultural thing: in HPC, the question was, well, on-premises is free; how do you compete with free? And we need to really change that culturally, to where people see there is no free lunch. You're paying for the resources whether it's on-premises or in the cloud.
>> Data scientists don't worry about budgets.
>> Wait, on-premises is free? Paul mentioned something that reminded me: you said you were here in 2017, and people said, "AWS, web, what are you even doing here?" Now in 2022, you're talking in terms of migrating to cloud. Paul mentioned Outposts. Let's say that a customer says, "Hey, I'd like you to put a thousand-node cluster in this data center that I happen to own, but from my perspective, I want to interact with it just like it's in your data center." In other words, the location doesn't matter. My experience is identical to interacting with AWS in an AWS data center, or in a colo that works with AWS, but instead it's my physical data center. When we're tracking the percentage of IT that's on-prem versus off-prem, what is that? Is what I just described cloud? And in five years, are you no longer going to be talking about migrating to cloud, because people will go, "What do you mean, migrating to cloud? What are you even talking about? What difference does it make?" It's either something that AWS is offering or it's something that someone else is offering. Do you think we'll be at that point in five years, where, in this world of virtualization and abstraction (you talked about Kubernetes; we should be there already), we're thinking in terms of: it doesn't matter, as long as it meets latency and sovereignty requirements? So, your prediction. We're all about insights and supercomputing...
>> My prediction...
>> In five years, will you still be talking about migrating to cloud, or will that be something from the past?
>> In five years, I still think there will be a component.
I think the majority assumption will be that things are cloud-native, that you start in the cloud, and that there is perhaps an aspect that will be interacting with some sort of an edge device or some sort of an on-premises device. And we hear more and more customers saying, "Okay, I can see the future. I can see that I'm shrinking my footprint." And you can see them still saying, "I'm not sure how small that beachhead will be, but right now I want to at least say that I'm going to operate in that hybrid environment." So I'd say, given the pace of this community, in five years we're still going to be talking about migrations, but I'd say the vast majority will be a cloud-native, cloud-first environment. And how do you classify that Outpost sitting in someone's data center? I'll leave that up to the analysts, but I think it would probably come down as cloud spend.
>> Great place to end. Ian, you and I now officially have a bet. In five years we're going to come back. My contention is: no, we're not going to be talking about it anymore.
>> Okay.
>> And kids in college are going to be like, "What do you mean, cloud? It's all IT." And they won't remember this whole phase of moving to cloud and back and forth. With that, join us in five years to see the result of this mega-bet between Ian and Dave. I'm Dave Nicholson with theCUBE, here at Supercomputing Conference 2022, day three of our coverage, with my co-host Paul Gillin. Thanks again for joining us. Stay tuned; after this short break, we'll be back with more action. (lively music)
Dhabaleswar “DK” Panda, Ohio State University | SuperComputing 22
>> Welcome back to theCUBE's coverage of Supercomputing Conference 2022, otherwise known as SC22, here in Dallas, Texas. This is day three of our coverage, the final day of coverage here on the exhibition floor. I'm Dave Nicholson, and I'm here with my co-host, tech journalist extraordinaire, Paul Gillin. How's it going, Paul?
>> Hi, Dave. It's going good.
>> And we have a wonderful guest with us this morning, Dr. Panda from the Ohio State University. Welcome, Dr. Panda, to theCUBE.
>> Thanks a lot. Thanks a lot to you too.
>> Paul, I know you're chomping at the bit.
>> You have incredible credentials, over 500 papers published. The impact that you've had on HPC is truly remarkable. But I wanted to talk to you specifically about a project you've been working on for over 20 years now called MVAPICH, a high performance computing platform that's used by more than 3,200 organizations across 90 countries. You've shepherded this from its infancy. What is the vision for what MVAPICH will be, and how is it a proof of concept that others can learn from?
>> Yeah, Paul, that's a great question to start with. I mean, I started with this conference in 2001. That was the first time I came. It's very coincidental, if you remember: the InfiniBand networking technology was introduced in October of 2000. At that time, in my group we were working on MPI for Myrinet and Quadrics, the old technologies of that era, if you can recollect. When InfiniBand arrived, we were the very first ones in the world to really jump in. Nobody knew how to use InfiniBand in an HPC system. So that's how the MVAPICH project was born. And in fact, at Supercomputing 2002, on the exhibition floor in Baltimore, we had the first demonstration: the open source MVAPICH actually running on an eight-node InfiniBand cluster. And that was a big challenge. But now, over the years, we have continuously worked with all the InfiniBand vendors, with the MPI Forum (we are a member of the MPI Forum), and also with all the other network interconnects. So we have steadily evolved this project over the last 21 years. I'm very proud of my team members, working nonstop, continuously bringing not only performance but scalability. If you see now, InfiniBand is being deployed in 8,000 and 10,000 node clusters, and many of these clusters actually use our software stack, MVAPICH. Our focus is that we first do research, because we are in academia. We come up with good designs, we publish, and in six to nine months we actually bring it to the open source version, and people can just download it and use it. And that's how it's currently being used by more than 3,000 organizations in 90 countries. But the interesting thing is happening with your second part of the question: now, as you know, the field is moving into not just HPC, but AI and big data, and we support those. This is where we look at the vision for the next 20 years: we want to design this MPI library so that not only HPC, but also all other workloads, can take advantage of it.
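[Editor's note: for readers new to MPI, here is a minimal sketch of the message-passing style of program a library such as MVAPICH2 runs at scale, written with the mpi4py bindings. It is illustrative rather than MVAPICH-specific; any standards-compliant MPI implementation underneath will execute it, launched with something like `mpirun -np 8 python pi_mpi.py`.]

```python
# Sketch: estimate pi via a midpoint-rule integral of 4/(1+x^2) on [0,1],
# with the work split evenly across MPI ranks and reduced at rank 0.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's ID
size = comm.Get_size()   # total number of ranks

n = 10_000_000
local = sum(4.0 / (1.0 + ((i + 0.5) / n) ** 2) / n
            for i in range(rank, n, size))  # strided slice of the sum

total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print(f"pi ~= {total:.8f} using {size} ranks")
```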
>> We have seen libraries become critical development platforms supporting AI, TensorFlow and PyTorch, and the emergence of some sort of default languages that are driving the community. How important are these frameworks to making progress in the HPC world?
>> Yeah, those are great. I mean, PyTorch and TensorFlow are now the bread and butter of deep learning and machine learning, am I right? But the challenge is that people use these frameworks while models are continuously becoming larger. You need very fast turnaround time. So how do you train faster? How do you do inferencing faster? This is where HPC comes in, and what we have done is link PyTorch and TensorFlow to our MVAPICH library. Now you see the MPI library running on a million-core system, and PyTorch and TensorFlow can also be scaled to those numbers of cores and GPUs. So we have done that kind of tight coupling, and that helps the researchers really take advantage of HPC.
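[Editor's note: the PyTorch-to-MPI coupling Dr. Panda describes can be sketched with PyTorch's distributed package, which supports an "mpi" backend when PyTorch is built against an MPI library; stock pip wheels usually ship only the gloo/nccl backends, so an MPI-enabled build is an assumption here. The allreduce below is the core collective behind synchronous data-parallel training.]

```python
# Sketch: initialize torch.distributed over MPI so collectives ride the
# underlying MPI library; launch with mpirun, ranks come from the launcher.
import torch
import torch.distributed as dist

dist.init_process_group(backend="mpi")  # requires an MPI-enabled build
rank = dist.get_rank()
world = dist.get_world_size()

# Each rank contributes a gradient-like tensor; summing and dividing
# yields the average, the heart of data-parallel training.
t = torch.full((4,), float(rank))
dist.all_reduce(t, op=dist.ReduceOp.SUM)
t /= world

if rank == 0:
    print("averaged tensor:", t)
```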
>> So if a high school student is thinking about computer science, looking for a place, looking for a university: the Ohio State University, world renowned, widely known. But talk about what that looks like on a day to day basis, in terms of the opportunity for undergrad and graduate students to participate in the kind of work that you do. What does that look like? And is that a good pitch for people to consider the university?
>> Yes. I mean, from a university perspective, by the way, the Ohio State University is one of the largest single campuses in the US, one of the top three or top four. We have 65,000 students.
>> Wow.
>> It's one of the very largest campuses. And especially within computer science, where I am located, high performance computing is a very big focus, and we are, again, one of the top schools all over the world for high performance computing. We also have great strength in AI. So we always encourage the new students who would like to really work on state of the art solutions to get exposed to the concepts, the principles, and also the practice. And we can really bring them that kind of experience. Many of my past students and staff are all in top companies now; they have all become big managers.
>> How long did you say you've been at it?
>> 31 years.
>> 31 years. So you've had people who weren't alive when you were already doing this stuff?
>> That's correct.
>> They then were born, grew up, went to university and graduate school, and now they're on...
>> Now they're in many top companies, national labs, and universities all over the world. So they have been trained very well.
>> Well, you've touched a lot of lives, sir.
>> Yes, thank you.
>> We've seen a real burgeoning of AI-specific hardware emerge over the last five years or so, and architectures going beyond just CPUs and GPUs to ASICs and FPGAs and accelerators. Does this excite you? I mean, are there innovations you're seeing in this area that you think have great promise?
>> Yeah, there is a lot of promise. I think every time in supercomputing technology you see, at some point, a big barrier jump: some disruptive technology comes, and then you move to the next level. That's what we are seeing now. A lot of these AI chips and AI systems are coming up, which take you to the next level. But the bigger challenge is whether it is cost effective or not, and whether it can be sustained longer. And this is where commodity technology comes in, because commodity technology takes you much further. So we might see all of these, Gaudi, a lot of new chips coming up; can they really bring down the cost? If that cost can be reduced, you will see a much bigger push for AI solutions which are cost effective.
>> What about on the interconnect side of things? Your start sort of coincided with the initial standards for InfiniBand; Intel was really big in that architecture originally. Do you see interconnects like RDMA over Converged Ethernet playing a part in that sort of democratization or commoditization of things?
>> Yes, yes.
>> What are your thoughts there, for Ethernet?
>> No, this is a great thing. So we saw InfiniBand coming; of course, InfiniBand is commodity, it is available. But then, over the years, people have been trying to see how those RDMA mechanisms can be used for Ethernet, and then RoCE was born. So RoCE is also being deployed. But besides these, now you talk about Slingshot, the Cray Slingshot: it is also an Ethernet-based system, and a lot of those RDMA principles are actually being used under the hood. So any modern network you see, whether it is InfiniBand or a RoCE-based or Slingshot-style network, you name any of these networks, they are using all the very latest principles. And of course, everybody wants to make it commodity, and this is what you see on the show floor: everybody is trying to compete against each other to give you the best performance with the lowest cost, and we'll see whoever wins over the years.
>> Sort of a macroeconomic question: Japan, the US, and China have been leapfrogging each other for a number of years in terms of the fastest supercomputer performance. How important do you think it is for the US to maintain leadership in this area?
>> It's a big thing, significant, right? I think for the last five to seven years we lost that lead, but now, with Frontier being number one starting from the June ranking, I think we are getting that leadership back. And I think it is very critical, not only for fundamental research, but for national security, to really keep the US at the leading edge. So I hope the US will continue to lead the trend for the next few years, until another new system comes out.
>> And one of the gating factors there is a shortage of people with data science skills. Obviously you're doing what you can at the university level. What do you think can change at the secondary school level to prepare students better for data science careers?
>> Yeah, that is also very important. We always talk about a pipeline, you know. That means at the PhD level we expect a lot, but we also want students to get exposed to many of these concepts from the high school level. And things are actually changing. These days I see a lot of high school students who know Python, how to program in Python, how to program in C, object-oriented things. They're even being exposed to AI at that level. So I think that is a very healthy sign. And in fact, from the Ohio State side, we are always engaged with K-12 in many different programs, gradually trying to take students to the next level. And I think we need to accelerate that in a very significant manner, because we need that kind of workforce. It is not just about building a number one system: how do we really utilize it? How do we utilize that science? How do we propagate it to the community? For that, we need all these trained personnel.
So, in fact, in my group we are also involved in a lot of cyber training activities for HPC professionals. In fact, today there is a session at, I think, 12:15 to 1:15, where we'll be talking more about that.
>> About education.
>> Yeah, cyber training: how do we do it for professionals? So we have funding, together with my co-PI, Dr. Karen Tomko from the Ohio Supercomputer Center. We have a grant from the National Science Foundation to really educate HPC professionals about cyberinfrastructure and AI. Even though they work on some of these things, they don't have the complete knowledge; they don't get the time to learn, and the field is moving so fast. So this is how it has been: we got the initial funding, and in fact, the first time we advertised, in 24 hours we got 120 applications. 24 hours! We couldn't even take all of them, so we are trying to offer it in multiple phases. There is a big need for those kinds of training sessions to take place. I also offer a lot of tutorials at different conferences. We had a high performance networking tutorial; here we have a high performance deep learning tutorial, a high performance big data tutorial. I've been offering tutorials, even at this conference, since 2001.
>> Good. So in the last 31 years at the Ohio State University, as my friends remind me it is properly called, you've seen the world get a lot smaller. Because 31 years ago, Ohio, roughly in the middle of North America and the United States, was not as connected as it is now to everywhere else in the globe. It kind of boggles the mind when you think of that progression over 31 years. But globally, since we talk about the world getting smaller, we're sort of in the thick of the celebratory seasons, where many groups of people exchange gifts for a variety of reasons. If I were to offer you a holiday gift that is the result of what AI can deliver the world, what would that be? What would the first thing be? It's like the genie, but you only get one wish.
>> I know, I know.
>> So what would the first one be?
>> Yeah, it's very hard to answer in one way, but let me bring in a little different context, and then I can answer this. I talked about the MVAPICH project and all, but recently, last year actually, we were awarded an NSF AI Institute award. It's a 20 million dollar award. I am the overall PI, but there are 14 universities involved.
>> And what is that institute?
>> It's called ICICLE. You can just go to icicle.ai. And that aligns exactly with what you are asking: how to bring a lot of AI to the masses, democratizing AI. That is the overall goal of this institute. We have three verticals we are working on. One is digital agriculture. So that will be my first wish: how do you take HPC and AI to agriculture? The world just crossed 8 billion people.
>> Yeah, that's right.
>> We need continuous food and food security. How do we grow food with the lowest cost and with the highest yield?
>> Water consumption.
>> Water consumption. Can we minimize the water consumption, or the fertilization? Don't do it blindly; the technologies are out there. Let's say there is a wheat field. A traditional farmer sees that, yeah, there is some disease, and they will just go and spray pesticides. It is not good for the environment.
Now I can fly a drone, get images of the field in real time, check them against the models, and then it'll tell me, okay, this part of the field has disease one, this part of the field has disease two. I indicate to the tractor or the sprayer, saying, okay, spray only pesticide one here, pesticide two here. That has a big impact. So this is what we are developing in that NSF AI institute, ICICLE. We have also chosen two additional verticals. One is animal ecology, because that is very much related to wildlife conservation and climate change: how do you understand how the animals move, can we learn from them, and then see how human beings need to act in the future? And the third one is food insecurity and logistics: smart food distribution. So these are our three broad goals in that institute. How do we develop cyberinfrastructure from below, combining HPC, AI, and security? We have a large team; as I said, there are 40 PIs, 60 students. We are a hundred-member team working together. So that will be my wish: how do we really democratize AI? >> Fantastic. I think that's a great place to wrap the conversation, here on day three at Supercomputing Conference 2022 on theCUBE. It was an honor, Dr. Panda, working tirelessly at The Ohio State University with his team for 31 years, toiling in the field of computer science, and the end result: improving the lives of everyone on Earth. That's not a stretch. If you're in high school thinking about a career in computer science, keep that in mind. It isn't just about the bits and the bobs and the speeds and the feeds. It's about serving humanity. Maybe a little too profound a statement? I would argue not even close. I'm Dave Nicholson with theCUBE, with my cohost Paul Gillin. Thank you again, Dr. Panda. Stay tuned for more coverage from theCUBE at Supercomputing 2022, coming up shortly. >> Thanks a lot.
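As a concrete picture of the sense-infer-actuate loop Dr. Panda describes, here is a minimal sketch: split a field image into tiles, classify each tile, and emit per-zone spray instructions. The tile size, thresholds, and classify_tile function are hypothetical stand-ins; a real deployment would run a trained vision model against the drone imagery.

# Minimal sketch of the drone-imagery workflow described above.
# `classify_tile` is a hypothetical placeholder for a trained disease model.

from typing import List, Tuple

def classify_tile(tile_pixels: List[List[int]]) -> str:
    # Placeholder heuristic: a real system would run a trained CNN here.
    mean = sum(sum(row) for row in tile_pixels) / (len(tile_pixels) * len(tile_pixels[0]))
    if mean < 80:
        return "disease_1"
    if mean < 160:
        return "disease_2"
    return "healthy"

def spray_plan(field: List[List[int]], tile: int) -> List[Tuple[int, int, str]]:
    """Return (row, col, pesticide) instructions for diseased tiles only."""
    plan = []
    for r in range(0, len(field), tile):
        for c in range(0, len(field[0]), tile):
            block = [row[c:c + tile] for row in field[r:r + tile]]
            label = classify_tile(block)
            if label != "healthy":
                pesticide = "pesticide_1" if label == "disease_1" else "pesticide_2"
                plan.append((r // tile, c // tile, pesticide))
    return plan

# Toy 4x4 "image": dark values simulate diseased zones.
toy_field = [[40, 40, 200, 200],
             [40, 40, 200, 200],
             [200, 200, 120, 120],
             [200, 200, 120, 120]]
print(spray_plan(toy_field, tile=2))
# -> [(0, 0, 'pesticide_1'), (1, 1, 'pesticide_2')]

The point of the design is that only the flagged tiles get treated, which is exactly the selective-spraying outcome described in the interview.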
David Schmidt, Dell Technologies and Scott Clark, Intel | SuperComputing 22
(techno music intro) >> Welcome back to theCube's coverage of SuperComputing Conference 2022. We are here at day three covering the amazing events that are occurring here. I'm Dave Nicholson, with my co-host Paul Gillin. How's it goin', Paul? >> Fine, Dave. Winding down here, but still plenty of action. >> Interesting stuff. We got a full day of coverage, and we're having really, really interesting conversations. We sort of wrapped things up at Supercomputing 22 here in Dallas. I've got two very special guests with me, Scott from Intel and David from Dell, to talk about, yeah, supercomputing, but guess what? We've got some really cool stuff coming up after this whole thing wraps. So not all of the holiday gifts have been unwrapped yet, kids. Welcome gentlemen. >> Thanks so much for having us. >> Thanks for having us. >> So, let's start with you, David. First of all, explain the relationship in general between Dell and Intel. >> Sure, so obviously Intel's been an outstanding partner. We built some great solutions over the years. I think the market reflects that. Our customers tell us that. The feedback's strong. The products you see out here this week at Supercompute, you know, put that on display for everybody to see. And then as we think about AI and machine learning, there's so many different directions we need to go to help our customers deliver AI outcomes. Right, so we recognize that AI has kind of spread outside of just the confines of everything we've seen here this week. And now we've got really accessible AI use cases that we can explain to friends and family. We can talk about going into retail environments and how AI is being used to track inventory, to monitor traffic, et cetera. But really what that means to us as a bunch of hardware folks is we have to deliver the right platforms and the right designs for a variety of environments, both inside and outside the data center. And so if you look at our portfolio, we have some great products here this week, but we also have other platforms, like the XR4000, our shortest rack server ever, that's designed to go into Edge environments, but is also built for those Edge AI use cases that support GPUs. It supports AI on the CPU as well. And so there's a lot of really compelling platforms that we're starting to talk about, have already been talking about, and it's going to really enable our customers to deliver AI in a variety of ways. >> You mentioned AI on the CPU. Maybe this is a question for Scott. What does that mean, AI on the CPU? >> Well, as David was talking about, we're just seeing this explosion of different use cases, some of those on the Edge, some of them in the Cloud, some of them on Prem. But within those individual deployments, there's often different ways that you can do AI, whether that's training or inference. And what we're seeing is a lot of times the memory locality matters quite a bit. You don't want to have to pay necessarily a cost going across the PCI express bus, especially with some of our newer products like the CPU Max series, where you can have a huge amount of high bandwidth memory just sitting right on the CPU. Things that traditionally would have been accelerator only can now live on a CPU, and that includes both on the inference side.
We're seeing some really great things with images, where you might have a giant medical image that you need to be able to do extremely high resolution inference on, or even text, where you might have a huge corpus of extremely sparse text that you need to be able to randomly sample very efficiently. >> So how are these needs influencing the evolution of Intel CPU architectures? >> So, we're talking to our customers. We're talking to our partners. This presents both an opportunity, but also a challenge, with all of these different places that you can put these great products, as well as applications. And so we're very thoughtfully trying to go to the market, see where their needs are, and then meet those needs. This industry obviously has a lot of great players in it, and it's no longer the case that if you build it, they will come. So what we're doing is we're finding where are those choke points, how can we have that biggest difference? Sometimes there's generational leaps, and I know David can speak to this, that can be huge from one system to the next, just because everything's accelerated on the software side, the hardware side, and the platforms themselves. >> That's right, and we're really excited about that leap. If you take what Scott just described, we've been writing white papers, our team with Scott's team, we've been talking about those types of use cases, doing large image analysis and leveraging system memory, leveraging the CPU to do that, for several generations now. Right, going back to Cascade Lake, going back to what we would call 14th generation PowerEdge. And so now as we prepare and continue to unveil, kind of we're in launch season, right, you and I were talking about how we're in launch season. As we continue to unveil and launch more products, the performance improvements are just going to be outstanding, and we'll continue that evolution that Scott described. >> Yeah, I'd like to applaud Dell just for a moment for its restraint. Because I know you could've come in and taken all of the space in the convention center to show everything that you do. >> Would have loved to. >> In the HPC space. Now, worst kept secrets on earth at this point. Vying for number one place is the fact that there is a new Mission Impossible movie coming. And there's also new stuff coming from Intel. I know, I think allegedly we're getting close. What can you share with us on that front? And I appreciate it if you can't share a ton of specifics, but where are we going? David just alluded to it. >> Yeah, as David talked about, we've been working on some of these things for many years. And it's just, this momentum is continuing to build, both with respect to some of our hardware investments. We've unveiled some things both here, both on the CPU side and the accelerator side, but also on the software side. OneAPI is gathering more and more traction, and the ecosystem is continuing to blossom. Some of our AI and HPC workloads, and the combination thereof, are becoming more and more viable, as well as displacing traditional approaches to some of these problems. And it's this type of thing where it's not linear. It all builds on itself. And we've seen some of these investments that we've made for the better half of a decade starting to bear fruit, but it's not just a one time thing. It's just going to continue to roll out, and we're going to be seeing more and more of this. >> So I want to follow up on something that you mentioned.
I don't know if you've ever heard the Charlie Brown saying that sometimes the most discouraging thing can be to have immense potential. Because between Dell and Intel, you offer so many different versions of things from a fit for function perspective. As a practical matter, how do you work with customers, and maybe this is a question for you, David, how do you work with customers to figure out what the right fit is? >> I'll give you a great example. Just this week, customer conversations, and we can put it in terms of kilowatts per rack, right. How many kilowatts are you delivering at a rack level inside your data center? I've had an answer anywhere from five all the way up to 90. There's some that have been a bit higher that probably don't want to talk about those cases, kind of customers we're meeting with very privately. But the range is really, really large, right, and there's a variety of environments. Customers might be ready for liquid today. They may not be ready for it. They may want to maximize air cooling. Those are the conversations, and then of course it all maps back to the workloads they wish to enable. AI is an extremely overloaded term. We don't have enough time to talk about all the different things that tuck under that umbrella, but for the workloads and the outcomes they wish to enable, we have the right solutions. And then we take it a step further by considering where they are today and where they need to go. And I just love that five to 90 example: not every customer has an identical cookie cutter environment, so we've got to have the right platforms, the right solutions, for the right workloads, for the right environments. >> So, I'd like to dive in on this power issue, to give people who are watching an idea. Because we say five kilowatts, 90 kilowatts, and people are like, oh wow, hmm, what does that mean? 90 kilowatts is more than 100 horsepower, if you want to translate it over to EV terms. It's a massive amount of power. You know, a hairdryer's around a kilowatt, 1,000 watts, right. But the point is, 90 kilowatts in a rack, that's insane. That's absolutely insane. The heat that that generates has got to be insane, and so it's important. >> Several houses in the size of a closet. >> Exactly, exactly. Yeah, in a rack, I explain to people, you know, it's like a refrigerator. But, so in the arena of thermals, I mean, is that something during the development of next gen architectures, is that something that's been taken into consideration? Or is it just a race to die size? >> Well, you definitely have to take thermals into account, as well as just the power consumption itself. I mean, people are looking at their total cost of ownership. They're looking at sustainability. And at the end of the day, they need to solve a problem. There's many paths up that mountain, and it's about choosing that right path. We've talked about this before: having extremely thoughtful partners, we're just not going to combinatorially try every single solution. We're going to try to find the ones that fit that right mold for that customer. And we're seeing more and more people, excuse me, care about this, more and more people wanting to say, how do I do this in the most sustainable way? How do I do this in the most reliable way, given maybe different fluctuations in their power consumption or their power pricing? We're developing more software tools and obviously partnering with great partners to make sure we do this in the most thoughtful way possible.
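For a back-of-the-envelope feel for that five-to-90-kilowatt range, here is a small sizing sketch. The per-server draw is an assumed figure for illustration, not a spec for any particular platform.

# Back-of-the-envelope rack sizing for the 5 kW vs 90 kW range discussed above.
# The 1.1 kW per dense two-socket server is an assumed, illustrative figure.

SERVER_KW = 1.1  # assumed average draw per server under load

def servers_per_rack(rack_budget_kw: float, server_kw: float = SERVER_KW) -> int:
    return int(rack_budget_kw // server_kw)

for budget in (5, 15, 45, 90):
    n = servers_per_rack(budget)
    print(f"{budget:>2} kW rack budget -> ~{n} servers ({n * SERVER_KW:.1f} kW drawn)")

# A 5 kW rack tops out at a handful of dense nodes, while a 90 kW rack can
# take around 80, which is why the cooling conversation changes so much
# across that range.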
>> Intel made a big investment by buying Habana Labs for its acceleration technology. They're based in Israel. You're based on the west coast. How are you coordinating with them? How will the Habana technology work its way into more mainstream Intel products? And how would Dell integrate those into your servers? >> Good question. I guess I can kick this off. So Habana is part of the Intel family now. They've been integrated in. It's been a great journey with them, as some of their products have launched on AWS, and they've had some very good wins on MLPerf and things like that. I think it's about finding the right tool for the job, right. Not every problem is a nail, so you need more than just a hammer. And so we have the Xeon series, which is incredibly flexible and can do so many different things. It's what we've come to know and love. On the other end of the spectrum, we obviously have some of these more deep learning focused accelerators. And if that's your problem, then you can solve that problem in incredibly efficient ways. The GPUs themselves are somewhere in the middle, so you get that kind of Goldilocks zone of flexibility and power. And depending on your use case, depending on what you know your workloads are going to be day in and day out, one of these solutions might work better for you. A combination might work better for you. Hybrid compute starts to become really interesting. Maybe you have something that you need 24/7, but then you only need a burst for certain things. There's a lot of different options out there. >> The portfolio approach. >> Exactly. >> And then what I love about the work that Scott's team is doing: customers have told us this week in our meetings that they do not want to spend developers' time porting code from one stack to the next. They want that flexibility of choice. Everyone does. We want it in our everyday lives. They need that flexibility of choice, but there's also an opportunity cost when their developers have to choose between porting some code over from one stack to another or spending time improving algorithms and doing things that actually generate, you know, meaningful outcomes for their business or their research. And so they are, you know, desperately searching, I would say, for that solution and for help in that area, and that's what we're working to enable soon. >> And this is what I love about oneAPI, our software stack: it's open first, heterogeneous first. You can take SYCL code, and it can run on competitors' hardware. It can run on Intel hardware. It's one of these things where you have to believe that, long term, the future is open. Walled gardens, the walls eventually crumble. And we're just trying to continue to invest in that ecosystem to make sure that the developer at the end of the day really gets what they need to do, which is solving their business problem, not tinkering with our drivers. >> Yeah, I actually saw an interesting announcement that I hadn't been tracking. I hadn't been tracking this area. Chiplets, and the idea of an open standard where competitors of Intel from a silicon perspective can have their chips integrated via a universal standard. And basically you had the top three silicon vendors saying, yeah, absolutely, let's work together. Cats and dogs. >> Exactly, but at the end of the day, it's whatever menagerie solves the problem. >> Right, right, exactly. And of course Dell can solve it from any angle. >> Yeah, we need strong partners to build the platforms to actually do it.
At the end of the day, silicon without software is just sand. Sand with silicon is poorly written prose. But without an actual platform to put it on, it's nothing; it's a box that sits in the corner. >> David, you mentioned that 90% of PowerEdge servers now support GPUs. So how is the growth of high performance computing, the demand, influencing the evolution of your server architecture? >> Great question, a couple of ways. You know, I would say 90% of our platforms support GPUs; 100% of our platforms support AI use cases. And it goes back to the CPU compute stack. As we look at how we deliver different form factors for customers, we go back to that range, that power range I mentioned this week, of how do we enable the right air cooling solutions? How do we deliver the right liquid cooling solutions, so that wherever the customer is in their environment, and whatever footprint they have, we're ready to meet it? That's something you'll see as we go into kind of the second half of launch season and continue rolling out products. You're going to see some very compelling solutions, not just in air cooling, but liquid cooling as well. >> You want to be more specific? >> We can't unveil everything at Supercompute. We have a lot of great stuff coming up here in the next few months, so. >> It's kind of like being at a great restaurant when they offer you dessert, and you're like, yeah, dessert would be great, but I just can't take any more. >> It's a multi course meal. >> At this point. Well, as we wrap, I've got one more question for each of you. Same question for each of you. When you think about high performance computing, supercomputing, all of the things that you're doing in your partnership, driving artificial intelligence at that tip of the spear, what kind of insights are you looking forward to us being able to gain from this technology? In other words, what cool thing, what do you think is cool out there from an AI perspective? What problem do you think we can solve in the near future? What problems would you like to solve? What gets you out of bed in the morning? Cause it's not the bits and the bobs and the speeds and the feeds, it's what we're going to do with them. So what do you think, David? >> I'll give you an example. And I think, I saw some of my colleagues talk about this earlier in the week, but for me, what we could do in the past two years to enable our customers in a quarantine, pandemic environment: we were delivering platforms and solutions to help them do their jobs, help them carry on in their lives. And that's just one example, and if I were to map that forward, it's about enabling that human progress. And, you know, you ask the 20-years-ago version of me, you know, if you could imagine some of these things, I don't know what kind of answer you would get. And so mapping forward a decade, two decades, I can go back to that example of, hey, we did great things in the past couple of years to enable our customers. Just imagine what we're going to be able to do going forward to enable that human progress. You know, there's great use cases, there's great image analysis. We talked about some. The images that Scott was referring to had to do with taking CAT scan images and being able to scan them for tumors and other things in the healthcare industry. That is stuff that feels good when you get out of bed in the morning, to know that you're enabling that type of progress. >> Scott, quick thoughts? >> Yeah, and I'll echo that.
It's not one specific use case, but it's really this wavefront of all of these use cases, from the very micro of developing the next drug to finding the next battery technology, all the way up to the macro of trying to have an impact on climate change or even the origins of the universe itself. All of these fields are seeing these massive gains, both from the software, the hardware, and the platforms that we're bringing to bear on these problems. And at the end of the day, humanity is going to be fundamentally transformed by the computation that we're launching and working on today. >> Fantastic, fantastic. Thank you, gentlemen. You heard it here first: Intel and Dell just committed to solving the secrets of the universe by New Year's Eve 2023. >> Well, next Supercompute, let's give us a little time. >> The next Supercompute Convention. >> Yeah, next year. >> Yeah, SC 2023, we'll come back and see what problems have been solved. You heard it here first on theCube, folks. By SC 23, Dell and Intel are going to reveal the secrets of the universe. From here, at SC 22, I'd like to thank you for joining our conversation. I'm Dave Nicholson, with my co-host Paul Gillin. Stay tuned to theCube's coverage of Supercomputing Conference 22. We'll be back after a short break. (techno music)
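One way to picture the "giant image, inference on the CPU" pattern Scott described is tiled processing, where only one tile of the image is hot in memory at a time. The sketch below is illustrative only: the 3x3 mean filter stands in for a real inference kernel, and the tile size is an assumed tuning knob, not the CPU Max programming model.

# Illustrative tiled processing of a large 2D image on a CPU: work one tile
# at a time so the hot working set stays small. The 3x3 mean filter is a
# stand-in for a real model; the tile size is an assumed tuning parameter.

import numpy as np

def mean3x3(tile: np.ndarray) -> np.ndarray:
    padded = np.pad(tile, 1, mode="edge")
    out = np.zeros_like(tile, dtype=np.float64)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            out += padded[1 + dr : 1 + dr + tile.shape[0],
                          1 + dc : 1 + dc + tile.shape[1]]
    return out / 9.0

def process_tiled(image: np.ndarray, tile: int = 1024) -> np.ndarray:
    result = np.empty_like(image, dtype=np.float64)
    for r in range(0, image.shape[0], tile):
        for c in range(0, image.shape[1], tile):
            block = image[r : r + tile, c : c + tile]
            result[r : r + tile, c : c + tile] = mean3x3(block)
    return result

image = np.random.rand(4096, 4096)  # stand-in for a giant medical image
out = process_tiled(image, tile=1024)
print(out.shape)  # (4096, 4096)

# Note: a production pipeline would also exchange one-pixel halos between
# tiles so border results match the untiled computation exactly.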
Brian Payne, Dell Technologies and Raghu Nambiar, AMD | SuperComputing 22
(upbeat music) >> We're back at the SC22 SuperComputing Conference in Dallas. My name's Paul Gillin, with my co-host, John Furrier, SiliconANGLE founder. And a huge exhibit floor here, so much activity, so much going on in HPC, and much of it around the chips from AMD, which has been on a roll lately. And in partnership with Dell, our guests are Brian Payne, Dell Technologies, VP of Product Management for ISG mid-range technical solutions, and Raghu Nambiar, corporate vice president of data center ecosystem and application engineering, that's quite a mouthful, at AMD. Gentlemen, welcome. >> Thank you. >> Thanks for having us. >> This has been an evolving relationship between your two companies, obviously a growing one, and Dell was part of the big rollout of AMD's new chips last week. Talk about how that relationship has evolved over the last five years. >> Yeah, sure. Well, so it goes back to the advent of the EPYC architecture. So we were there from the beginning, partnering well before the launch five years ago, thinking about, hey, how can we come up with a way to solve customer problems, address workloads in unique ways? And that was kind of the origin of the relationship. We came out with some really disruptive and capable platforms, and it's continued since then, all the way to the launch of last week, where we've introduced four of the most capable platforms we've ever had in the PowerEdge portfolio. >> Yeah, I'm really excited about the partnership with Dell. As Brian said, we have been partnering very closely for the last five years, since we introduced the first generation of EPYC. So we collaborate on, you know, system design, validation, performance benchmarks, and more importantly on software optimizations and solutions, to offer an out of the box experience to our customers, whether it is HPC or databases, big data analytics or AI. >> You know, you guys have been on theCUBE, you guys are veterans, 2012, 2014, back in the day. So much has changed over the years. Raghu, you were the founding chair of the TPC for AI. We've talked about the different iterations of PowerEdge servers. So much has changed. Why the focus on these workloads now? What's the inflection point that we're seeing here at SuperComputing? It feels like we've been in this, you know, run the ball, gain a yard, move the chains mode, but I feel like there's a moment where there's going to be an unleashing of innovation around new use cases. Where are the workloads? Why the performance? What are some of those use cases right now that are front and center? >> Yeah, I mean, if you look at today, the enterprise ecosystem has become extremely complex, okay? People are running traditional workloads like relational database management systems, and also a new generation of workloads with AI and HPC, and actually HPC augmented with some of the AI technologies. So what customers are looking for is, as I said, an out of the box experience, where time to value is extremely critical. Unlike in the past, you know, the customers don't have the time and resources to run months-long POCs, okay? So that's one area that we are focusing on, you know, working closely with Dell, to give an out of the box experience.
Again, you know, the enterprise application ecosystem is, you know, really becoming complex, and, as you mentioned, the industry standard benchmarks are designed to give a fair comparison of performance, and price performance, for our end customers. And you know, Brian and my team have been working closely to demonstrate our joint capabilities in the AI space with a set of TPCx-AI benchmark results last week; it was a major highlight of our launch. >> Brian, you've got the product showing in the booth at Dell here. Not a demo, the product, it's available. What are you seeing for your use cases that customers are kind of rallying around now, and what are they doubling down on? >> Yeah, you know, so Raghu I think teed it up well. Really, data is the currency of business and all organizations today. And that's what's pushing people to figure out, hey, both traditional workloads as well as new workloads. So in the traditional workload space, you still have ERP systems like SAP, et cetera, and we've announced world records there, a hundred plus percent improvements in our single socket system, 70% in dual. We actually posted a 40% advantage over the best Genoa result just this week. So, I mean, we're excited about that in the traditional space. But what's exciting, like why are we here? Why are people thinking about HPC and AI? It's about how do we make use of that data, that data being the currency, and how do we push in that space? So Raghu mentioned the TPCx-AI benchmark. We announced, in collaboration, you talk about how we work together, nine world records in that space. In one case it's a 3x improvement over prior generations. So the workloads that people care about are, how can I process this data more effectively? How can I store it and secure it more effectively? And ultimately, how do I make decisions about where we're going, whether it's a scientific breakthrough or a commercial application? That's what's really driving the use cases and the demand from our customers today. >> I think one of the interesting trends we've seen over the last couple of years is a resurgence in interest in task specific hardware around AI. In fact, venture capital companies invested $1.8 billion last year in AI hardware startups. I wonder, and these companies are not doing CPUs necessarily, or GPUs, they're doing accelerators, FPGAs, ASICs. But you have to be looking at that activity and what these companies are doing. What are you taking away from that? How does that affect your own product development plans, both on the chip side and on the system side? >> I think the future of computing is going to be heterogeneous. Okay, I mean, a CPU solving certain types of problems, like general purpose computing, databases, big data analytics; GPUs solving, you know, problems in AI and visualization; and DPUs and FPGA accelerators, you know, offloading some of the tasks from the CPU and providing realtime performance. And of course, you know, the software optimizations are going to be critical to stitch everything together, whether it is HPC or AI or other workloads. You know, again, as I said, heterogeneous computing is going to be the future. >> And for us as a platform provider, heterogeneous, you know, solutions mean we have to design systems that are capable of supporting that.
So if you think about the compute power, whether it's a GPU or a CPU, continuing to push the envelope in terms of, you know, the computations, power consumption, things like that: how do we design a system that can be, you know, incredibly efficient, and also be able to support the scaling, you know, to solve those complex problems? So that gets into challenges around, you know, both liquid cooling, but also making the most out of air cooling. And so not only are we driving up, you know, the capability of these systems, we're actually improving the energy efficiency. And the most recent systems that we launched around the CPU, which is still kind of at the heart of everything today, you know, are seeing 50% improvement, you know, gen to gen, in terms of performance per watt capabilities. So it's about, like, how do we package these systems in effective ways and make sure that our customers can get, you know, the advertised benefits, so to speak, of the new chip technologies. >> Yeah, to add to that, you know, performance, scalability, total cost of ownership, these are the key considerations, but now energy efficiency has become more important than ever, with, you know, our commitment to sustainability. This is one of the things that we demonstrated last week: with our new generation of EPYC Genoa based systems, we can do a five-to-one consolidation, significantly reducing the energy requirement. >> Power's huge, costs are going up. It's a global issue. >> Raghu: Yeah, it is. >> How do you squeeze more performance out of it at the same time? I mean, smaller, faster, cheaper. Paul, you wrote a story, you know, this weekend about hardware, and AI making hardware so much more important. You've got more power requirements, you've got the sustainability, but you need more horsepower, more compute. What's different in the architecture, if you guys could share, like today versus years ago? What's different as these generations deliver step function value increases? >> So one of the major drivers, from the processor perspective, if you look at the latest generation of processors, is the five nanometer technology, bringing efficiency and density. So we are able to pack 96 processor cores per socket; you know, in a two socket system, we are talking about 192 processor cores. And of course, you know, other enhancements like IPC uplift, bringing DDR5 to the market, PC (indistinct) to the market, offering overall, you know, a performance uplift of more than 2.5x for certain workloads, and of course, you know, significantly reducing the power footprint. >> Also, I was just going to cut in, I mean, architecturally speaking, you know, then how do we take the 96 cores and surround it, deliver a balanced ecosystem, to make sure that we can get the IO out of the system and make sure we've got the right data storage? So I mean, you'll see 60% improvements in total storage in the system. I think in 2012 we were talking about 10 gig ethernet. Well, you know, now we're on to 100 and 400 on the forefront. So it's like, how do we keep up with this increased computing capability, both offload and core computing, and make sure we've got a system that can deliver the desired (indistinct). >> So the little things like the bus, the PCI cards, the NICs, the connectors have to be rethought through. Is that what you're getting at? >> Yeah, absolutely. >> Paul: And the GPUs, which are huge power consumers. >> Yeah, absolutely.
So I mean, cooling: we introduced what we call smart cooling as a part of our latest generation of servers. I mean, the thermal design inside of a server is a complex, you know, complex system, right? And you have to do it efficiently, because of course fans consume power. So I mean, yeah, those are the kinds of considerations we have to think through to make sure that you're not throttling performance because you're not, you know, keeping the chips at the right temperature. And, you know, ultimately when you do that, you're hurting the productivity of the investment. >> You mention data too. If you bring in the data, one of the big discussions going into the big Amazon show coming up, re:Invent, is egress costs. Right, so now you've got compute, and how you design data latency, you know, processing. It's not just contained in a machine. You've got to think about outside that machine, talking to other machines. Is there an intelligent (chuckles) network developing? I mean, what does the future look like? >> Well, I mean, this is an area that's, you know, fun, and, you know, Dell's in a unique position to work on this problem, right? We house 70% of the mission critical data that exists in the world. How do we bring that closer to compute? How do we deliver system level solutions? So, server compute: recently we announced innovations around NVMe over Fabrics. So now you've got the NVMe technology in the SAN. How do we connect that more efficiently across the servers, and then guide our customers to make use of that? Those are the kinds of challenges where we're trying to unlock the value of the data by making sure we're (indistinct). >> There are a lot of lessons learned from, you know, classic HPC and some of the, you know, big data analytics, like, you know, the Hadoops of the world, you know, distributed processing for crunching a large amount of data. >> With the growth of the cloud, you see, you know, some pundits saying that data centers will become obsolete in five years, and everything's going to move to the cloud. Obviously the data center market is still growing, and is projected to continue to grow. But what's the argument for captive hardware, for owning a data center these days, when the cloud offers such convenience and, allegedly, cost benefit? >> I would say the reality is, and I think the industry at large has acknowledged this, that we're living in a multicloud world, and multicloud methods are going to be necessary to, you know, solve problems and compete. And so, I mean, you know, in some cases, whether it's security or latency, you know, there's a push to have things in your own data center. And then of course growth at the edge, right? I mean, that's really turning, you know, things on their head, if you will, getting data closer to where it's being generated. And so I would say we're going to live in this edge, cloud, and core data center environment, with, you know, different cloud providers providing solutions and services where it makes sense, and it's incumbent on us to figure out how do we stitch together that data platform, that data layer, and help customers, you know, synthesize this data to generate, you know, the results they need.
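On the egress-cost point in that exchange, a toy comparison shows why data gravity pulls compute toward the data. Every rate below is an invented placeholder, not real cloud or on-prem pricing.

# Toy decision helper for the egress-cost discussion above: is it cheaper to
# process data in place or to move it to another site first? All rates here
# are invented placeholders, not real cloud or on-prem pricing.

def job_cost(data_gb: float, egress_per_gb: float, compute_cost: float,
             move_data: bool) -> float:
    egress = data_gb * egress_per_gb if move_data else 0.0
    return egress + compute_cost  # compute_cost = flat cost to run the job

data_gb = 50_000  # 50 TB of mission critical data
stay = job_cost(data_gb, egress_per_gb=0.08, compute_cost=900.0, move_data=False)
move = job_cost(data_gb, egress_per_gb=0.08, compute_cost=400.0, move_data=True)

print(f"process in place:  ${stay:,.0f}")   # $900
print(f"move then process: ${move:,.0f}")  # $4,400

# Even with cheaper remote compute, moving 50 TB at an assumed $0.08/GB
# dominates the bill, which is the gravity argument for bringing compute
# to the data rather than the other way around.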
>> You know, one of the things I want to get into, on the cloud you mentioned there, Paul, is that we see the rise of graph databases. And so is that on the radar for AI? Because a lot more graph data is being brought in, and the database market's incredibly robust. It's one of the key areas that people want performance out of. And as cloud native becomes the modern application development, a lot more infrastructure as code is happening, which means that the internet and the networks and the processes should be programmable. So graph databases have been one of those things. Have you guys done any work there? What's some data there you can share on that? >> Yeah, actually, you know, we have worked closely with a company called TigerGraph in the graph database space. And we have done a couple of case studies, one on the healthcare side, and the other one on the financial side, for fraud detection. Yeah, this is an emerging area, and we are able to demonstrate industry leading performance for graph databases. Very excited about it. >> Yeah, it's interesting. It brings up the vertical versus horizontal applications. Where is the AI HPC kind of shining? Is it in horizontal or vertical solutions? What's your vision there? >> Yeah, well, I mean, so this is a case where I'm also a user. So I own our analytics platform internally. We actually have a chatbot for our product development organization to figure out, hey, what trends are going on with the systems that we sell, whether it's how they're being consumed or what we've sold. And we actually use graph database technology to power that chatbot. So I'm actually in a position where I'm like, I want to get these new systems into our environment so we can deliver. >> Paul: Graphs underlie most machine learning models. >> Yeah, yeah. >> So, there's so much to talk about in this space, and so little time. And unfortunately we're out of it. So, fascinating discussion. Brian Payne, Dell Technologies, Raghu Nambiar, AMD. Congratulations on the successful launch of your new chips and the growth in your relationship over these past years. Thanks so much for being with us here on theCUBE. >> Super. >> Thank you much. >> It's great to be back. >> We'll be right back from SuperComputing 22 in Dallas. (upbeat music)
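As a closing illustration of the graph database thread in that last exchange, here is a tiny sketch of the relationship-query pattern that graph stores make cheap: finding accounts linked to a known-fraud account within a couple of hops. It uses networkx as a generic stand-in; TigerGraph and other graph databases have their own query languages, and the fraud-ring data below is invented.

# Tiny illustration of the graph-query pattern discussed above: find accounts
# linked to a seed account within two hops. networkx is a generic stand-in;
# the accounts and edges are invented sample data.

import networkx as nx

g = nx.Graph()
g.add_edges_from([
    ("acct_A", "device_1"), ("acct_B", "device_1"),   # shared device
    ("acct_B", "card_9"), ("acct_C", "card_9"),       # shared card
    ("acct_D", "device_7"),                           # unrelated
])

def linked_accounts(graph: nx.Graph, seed: str, hops: int = 2) -> set:
    reach = nx.single_source_shortest_path_length(graph, seed, cutoff=hops)
    return {n for n in reach if n.startswith("acct_") and n != seed}

print(linked_accounts(g, "acct_A"))  # {'acct_B'}, reached via the shared device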