The University of Edinburgh and Rolls Royce Drive in Exascale Style | Exascale Day

>>welcome. My name is Ben Bennett. I am the director of HPC Strategic programs here at Hewlett Packard Enterprise. It is my great pleasure and honor to be talking to Professor Mark Parsons from the Edinburgh Parallel Computing Center. And we're gonna talk a little about exa scale. What? It means we're gonna talk less about the technology on Maura about the science, the requirements on the need for exa scale. Uh, rather than a deep dive into the enabling technologies. Mark. Welcome. >>I then thanks very much for inviting me to tell me >>complete pleasure. Um, so I'd like to kick off with, I suppose. Quite an interesting look back. You and I are both of a certain age 25 plus, Onda. We've seen these milestones. Uh, I suppose that the S I milestones of high performance computing's come and go, you know, from a gig a flop back in 1987 teraflop in 97 a petaflop in 2000 and eight. But we seem to be taking longer in getting to an ex a flop. Um, so I'd like your thoughts. Why is why is an extra flop taking so long? >>So I think that's a very interesting question because I started my career in parallel computing in 1989. I'm gonna join in. IPCC was set up then. You know, we're 30 years old this year in 1990 on Do you know the fastest computer we have them is 800 mega flops just under a getting flogged. So in my career, we've gone already. When we reached the better scale, we'd already gone pretty much a million times faster on, you know, the step from a tariff block to a block scale system really didn't feel particularly difficult. Um, on yet the step from A from a petaflop PETA scale system. To an extent, block is a really, really big challenge. And I think it's really actually related to what's happened with computer processes over the last decade, where, individually, you know, approached the core, Like on your laptop. Whoever hasn't got much faster, we've just got more often So the perception of more speed, but actually just being delivered by more course. And as you go down that approach, you know what happens in the supercomputing world as well. We've gone, uh, in 2010 I think we had systems that were, you know, a few 1000 cores. Our main national service in the UK for the last eight years has had 118,000 cores. But looking at the X scale we're looking at, you know, four or five million cores on taming that level of parallelism is the real challenge. And that's why it's taking an enormous and time to, uh, deliver these systems. That is not just on the hardware front. You know, vendors like HP have to deliver world beating technology and it's hard, hard. But then there's also the challenge to the users. How do they get the codes to work in the face of that much parallelism? >>If you look at what the the complexity is delivering an annex a flop. Andi, you could have bought an extra flop three or four years ago. You couldn't have housed it. You couldn't have powered it. You couldn't have afforded it on, do you? Couldn't program it. But you still you could have You could have bought one. We should have been so lucky to be unable to supply it. Um, the software, um I think from our standpoint, is is looking like where we're doing mawr enabling with our customers. You sell them a machine on, then the the need then to do collaboration specifically seems mawr and Maura around the software. Um, so it's It's gonna be relatively easy to get one x a flop using limb pack, but but that's not extra scale. So what do you think? On exa scale machine versus an X? A flop machine means to the people like yourself to your users, the scientists and industry. What is an ex? A flop versus >>an exa scale? So I think, you know, supercomputing moves forward by setting itself challenges. And when you when you look at all of the excess scale programs worldwide that are trying to deliver systems that can do an X a lot form or it's actually very arbitrary challenge. You know, we set ourselves a PETA scale challenge delivering a petaflop somebody manage that, Andi. But you know, the world moves forward by setting itself challenges e think you know, we use quite arbitrary definition of what we mean is well by an exit block. So, you know, in your in my world, um, we either way, first of all, see ah flop is a computation, so multiply or it's an ad or whatever on we tend. Thio, look at that is using very high precision numbers or 64 bit numbers on Do you know, we then say, Well, you've got to do the next block. You've got to do a billion billion of those calculations every second. No, a some of the last arbitrary target Now you know today from HPD Aiken by my assistant and will do a billion billion calculations per second. And they will either do that as a theoretical peak, which would be almost unattainable, or using benchmarks that stressed the system on demonstrate a relaxing law. But again, those benchmarks themselves attuned Thio. Just do those calculations and deliver and explore been a steady I'll way if you like. So, you know, way kind of set ourselves this this this big challenge You know, the big fence on the race course, which were clambering over. But the challenge in itself actually should be. I'm much more interesting. The water we're going to use these devices for having built um, eso. Getting into the extra scale era is not so much about doing an extra block. It's a new generation off capability that allows us to do better scientific and industrial research. And that's the interesting bit in this whole story. >>I would tend to agree with you. I think the the focus around exa scale is to look at, you know, new technologies, new ways of doing things, new ways of looking at data and to get new results. So eventually you will get yourself a nexus scale machine. Um, one hopes, sooner rather >>than later. Well, I'm sure you don't tell me one, Ben. >>It's got nothing to do with may. I can't sell you anything, Mark. But there are people outside the door over there who would love to sell you one. Yes. However, if we if you look at your you know your your exa scale machine, Um, how do you believe the workloads are going to be different on an extra scale machine versus your current PETA scale machine? >>So I think there's always a slight conceit when you buy a new national supercomputer. On that conceit is that you're buying a capability that you know on. But many people will run on the whole system. Known truth. We do have people that run on the whole of our archer system. Today's A 118,000 cores, but I would say, and I'm looking at the system. People that run over say, half of that can be counted on Europe on a single hand in a year, and they're doing very specific things. It's very costly simulation they're running on. So, you know, if you look at these systems today, two things show no one is. It's very difficult to get time on them. The Baroque application procedures All of the requirements have to be assessed by your peers and your given quite limited amount of time that you have to eke out to do science. Andi people tend to run their applications in the sweet spot where their application delivers the best performance on You know, we try to push our users over time. Thio use reasonably sized jobs. I think our average job says about 20,000 course, she's not bad, but that does mean that as we move to the exits, kill two things have to happen. One is actually I think we've got to be more relaxed about giving people access to the system, So let's give more people access, let people play, let people try out ideas they've never tried out before. And I think that will lead to a lot more innovation and computational science. But at the same time, I think we also need to be less precious. You know, we to accept these systems will have a variety of sizes of job on them. You know, we're still gonna have people that want to run four million cores or two million cores. That's absolutely fine. Absolutely. Salute those people for trying really, really difficult. But then we're gonna have a huge spectrum of views all the way down to people that want to run on 500 cores or whatever. So I think we need Thio broaden the user base in Alexa Skill system. And I know this is what's happening, for example, in Japan with the new Japanese system. >>So, Mark, if you cast your mind back to almost exactly a year ago after the HPC user forum, you were interviewed for Premier Magazine on Do you alluded in that article to the needs off scientific industrial users requiring, you know, uh on X a flop or an exa scale machine it's clear in your in your previous answer regarding, you know, the workloads. Some would say that the majority of people would be happier with, say, 10 100 petaflop machines. You know, democratization. More people access. But can you provide us examples at the type of science? The needs of industrial users that actually do require those resources to be put >>together as an exa scale machine? So I think you know, it's a very interesting area. At the end of the day, these systems air bought because they are capability systems on. I absolutely take the argument. Why shouldn't we buy 10 100 pattern block systems? But there are a number of scientific areas even today that would benefit from a nexus school system and on these the sort of scientific areas that will use as much access onto a system as much time and as much scale of the system as they can, as you can give them eso on immediate example. People doing chroma dynamics calculations in particle physics, theoretical calculations, they would just use whatever you give them. But you know, I think one of the areas that is very interesting is actually the engineering space where, you know, many people worry the engineering applications over the last decade haven't really kept up with this sort of supercomputers that we have. I'm leading a project called Asimov, funded by M. P S O. C in the UK, which is jointly with Rolls Royce, jointly funded by Rolls Royce and also working with the University of Cambridge, Oxford, Bristol, Warrick. We're trying to do the whole engine gas turbine simulation for the first time. So that's looking at the structure of the gas turbine, the airplane engine, the structure of it, how it's all built it together, looking at the fluid dynamics off the air and the hot gasses, the flu threat, looking at the combustion of the engine looking how fuel is spread into the combustion chamber. Looking at the electrics around, looking at the way the engine two forms is, it heats up and cools down all of that. Now Rolls Royce wants to do that for 20 years. Andi, Uh, whenever they certify, a new engine has to go through a number of physical tests, and every time they do on those tests, it could cost them as much as 25 to $30 million. These are very expensive tests, particularly when they do what's called a blade off test, which would be, you know, blade failure. They could prove that the engine contains the fragments of the blade. Sort of think, continue face really important test and all engines and pass it. What we want to do is do is use an exa scale computer to properly model a blade off test for the first time, so that in future, some simulations can become virtual rather than having thio expend all of the money that Rolls Royce would normally spend on. You know, it's a fascinating project is a really hard project to do. One of the things that I do is I am deaf to share this year. Gordon Bell Price on bond I've really enjoyed to do. That's one of the major prizes in our area, you know, gets announced supercomputing every year. So I have the pleasure of reading all the submissions each year. I what's been really interesting thing? This is my third year doing being on the committee on what's really interesting is the way that big systems like Summit, for example, in the US have pushed the user communities to try and do simulations Nowhere. Nobody's done before, you know. And we've seen this as well, with papers coming after the first use of the for Goku system in Japan, for example, people you know, these are very, very broad. So, you know, earthquake simulation, a large Eddie simulations of boats. You know, a number of things around Genome Wide Association studies, for example. So the use of these computers spans of last area off computational science. I think the really really important thing about these systems is their challenging people that do calculations they've never done before. That's what's important. >>Okay, Thank you. You talked about challenges when I nearly said when you and I had lots of hair, but that's probably much more true of May. Um, we used to talk about grand challenges we talked about, especially around the teraflop era, the ski red program driving, you know, the grand challenges of science, possibly to hide the fact that it was a bomb designing computer eso they talked about the grand challenges. Um, we don't seem to talk about that much. We talk about excess girl. We talk about data. Um Where are the grand challenges that you see that an exa scale computer can you know it can help us. Okay, >>so I think grand challenges didn't go away. Just the phrase went out of fashion. Um, that's like my hair. I think it's interesting. The I do feel the science moves forward by setting itself grand challenges and always had has done, you know, my original backgrounds in particle physics. I was very lucky to spend four years at CERN working in the early stage of the left accelerator when it first came online on. Do you know the scientists there? I think they worked on left 15 years before I came in and did my little ph d on it. Andi, I think that way of organizing science hasn't changed. We just talked less about grand challenges. I think you know what I've seen over the last few years is a renaissance in computational science, looking at things that have previously, you know, people have said have been impossible. So a couple of years ago, for example, one of the key Gordon Bell price papers was on Genome Wide Association studies on some of it. If I may be one of the winner of its, if I remember right on. But that was really, really interesting because first of all, you know, the sort of the Genome Wide Association Studies had gone out of favor in the bioinformatics by a scientist community because people thought they weren't possible to compute. But that particular paper should Yes, you could do these really, really big Continental little problems in a reasonable amount of time if you had a big enough computer. And one thing I felt all the way through my career actually is we've probably discarded Mawr simulations because they were impossible at the time that we've actually decided to do. And I sometimes think we to challenge ourselves by looking at the things we've discovered in the past and say, Oh, look, you know, we could actually do that now, Andi, I think part of the the challenge of bringing an extra service toe life is to get people to think about what they would use it for. That's a key thing. Otherwise, I always say, a computer that is unused to just be turned off. There's no point in having underutilized supercomputer. Everybody loses from that. >>So Let's let's bring ourselves slightly more up to date. We're in the middle of a global pandemic. Uh, on board one of the things in our industry has bean that I've been particularly proud about is I've seen the vendors, all the vendors, you know, offering up machine's onboard, uh, making resources available for people to fight things current disease. Um, how do you see supercomputers now and in the future? Speeding up things like vaccine discovery on help when helping doctors generally. >>So I think you're quite right that, you know, the supercomputer community around the world actually did a really good job of responding to over 19. Inasmuch as you know, speaking for the UK, we put in place a rapid access program. So anybody wanted to do covert research on the various national services we have done to the to two services Could get really quick access. Um, on that, that has worked really well in the UK You know, we didn't have an archer is an old system, Aziz. You know, we didn't have the world's largest supercomputer, but it is happily bean running lots off covert 19 simulations largely for the biomedical community. Looking at Druk modeling and molecular modeling. Largely that's just been going the US They've been doing really large uh, combinatorial parameter search problems on on Summit, for example, looking to see whether or not old drugs could be reused to solve a new problem on DSO, I think, I think actually, in some respects Kobe, 19 is being the sounds wrong. But it's actually been good for supercomputing. Inasmuch is pointed out to governments that supercomputers are important parts off any scientific, the active countries research infrastructure. >>So, um, I'll finish up and tap into your inner geek. Um, there's a lot of technologies that are being banded around to currently enable, you know, the first exa scale machine, wherever that's going to be from whomever, what are the current technologies or emerging technologies that you are interested in excited about looking forward to getting your hands on. >>So in the business case I've written for the U. K's exa scale computer, I actually characterized this is a choice between the American model in the Japanese model. Okay, both of frozen, both of condoms. Eso in America, they're very much gone down the chorus plus GPU or GPU fruit. Um, so you might have, you know, an Intel Xeon or an M D process er center or unarmed process or, for that matter on you might have, you know, 24 g. P. U s. I think the most interesting thing that I've seen is definitely this move to a single address space. So the data that you have will be accessible, but the G p u on the CPU, I think you know, that's really bean. One of the key things that stopped the uptake of GPS today and that that that one single change is going Thio, I think, uh, make things very, very interesting. But I'm not entirely convinced that the CPU GPU model because I think that it's very difficult to get all the all the performance set of the GPU. You know, it will do well in H p l, for example, high performance impact benchmark we're discussing at the beginning of this interview. But in riel scientific workloads, you know, you still find it difficult to find all the performance that has promised. So, you know, the Japanese approach, which is the core, is only approach. E think it's very attractive, inasmuch as you know They're using very high bandwidth memory, very interesting process of which they are going to have to, you know, which they could develop together over 10 year period. And this is one thing that people don't realize the Japanese program and the American Mexico program has been working for 10 years on these systems. I think the Japanese process really interesting because, um, it when you look at the performance, it really does work for their scientific work clothes, and that's that does interest me a lot. This this combination of a A process are designed to do good science, high bandwidth memory and a real understanding of how data flows around the supercomputer. I think those are the things are exciting me at the moment. Obviously, you know, there's new networking technologies, I think, in the fullness of time, not necessarily for the first systems. You know, over the next decade we're going to see much, much more activity on silicon photonics. I think that's really, really fascinating all of these things. I think in some respects the last decade has just bean quite incremental improvements. But I think we're supercomputing is going in the moment. We're a very very disruptive moment again. That goes back to start this discussion. Why is extra skill been difficult to get? Thio? Actually, because the disruptive moment in technology. >>Professor Parsons, thank you very much for your time and your insights. Thank you. Pleasure and folks. Thank you for watching. I hope you've learned something, or at least enjoyed it. With that, I would ask you to stay safe and goodbye.

Published Date : Oct 16 2020

SUMMARY :

I am the director of HPC Strategic programs I suppose that the S I milestones of high performance computing's come and go, But looking at the X scale we're looking at, you know, four or five million cores on taming But you still you could have You could have bought one. challenges e think you know, we use quite arbitrary focus around exa scale is to look at, you know, new technologies, Well, I'm sure you don't tell me one, Ben. outside the door over there who would love to sell you one. So I think there's always a slight conceit when you buy a you know, the workloads. That's one of the major prizes in our area, you know, gets announced you know, the grand challenges of science, possibly to hide I think you know what I've seen over the last few years is a renaissance about is I've seen the vendors, all the vendors, you know, Inasmuch as you know, speaking for the UK, we put in place a rapid to currently enable, you know, I think you know, that's really bean. Professor Parsons, thank you very much for your time and your insights.

ENTITIES

Entity	Category	Confidence
Ben Bennett	PERSON	0.99+
1989	DATE	0.99+
Rolls Royce	ORGANIZATION	0.99+
UK	LOCATION	0.99+
500 cores	QUANTITY	0.99+
10 years	QUANTITY	0.99+
20 years	QUANTITY	0.99+
Japan	LOCATION	0.99+
Parsons	PERSON	0.99+
1990	DATE	0.99+
Mark	PERSON	0.99+
2010	DATE	0.99+
1987	DATE	0.99+
HP	ORGANIZATION	0.99+
118,000 cores	QUANTITY	0.99+
first time	QUANTITY	0.99+
four years	QUANTITY	0.99+
America	LOCATION	0.99+
CERN	ORGANIZATION	0.99+
third year	QUANTITY	0.99+
four	QUANTITY	0.99+
first	QUANTITY	0.99+
30 years	QUANTITY	0.99+
2000	DATE	0.99+
four million cores	QUANTITY	0.99+
two million cores	QUANTITY	0.99+
Genome Wide Association	ORGANIZATION	0.99+
two services	QUANTITY	0.99+
Ben	PERSON	0.99+
first systems	QUANTITY	0.99+
two forms	QUANTITY	0.99+
US	LOCATION	0.99+
both	QUANTITY	0.99+
IPCC	ORGANIZATION	0.99+
three	DATE	0.99+
today	DATE	0.98+
Hewlett Packard Enterprise	ORGANIZATION	0.98+
University of Cambridge	ORGANIZATION	0.98+
five million cores	QUANTITY	0.98+
a year ago	DATE	0.98+
single	QUANTITY	0.98+
Mark Parsons	PERSON	0.98+
two things	QUANTITY	0.98+
$30 million	QUANTITY	0.98+
one	QUANTITY	0.98+
Edinburgh Parallel Computing Center	ORGANIZATION	0.98+
Aziz	PERSON	0.98+
Gordon Bell	PERSON	0.98+
May	DATE	0.98+
64 bit	QUANTITY	0.98+
Europe	LOCATION	0.98+
One	QUANTITY	0.97+
each year	QUANTITY	0.97+
about 20,000 course	QUANTITY	0.97+
Today	DATE	0.97+
Alexa	TITLE	0.97+
this year	DATE	0.97+
HPC	ORGANIZATION	0.96+
Intel	ORGANIZATION	0.96+
Xeon	COMMERCIAL_ITEM	0.95+
25	QUANTITY	0.95+
over 10 year	QUANTITY	0.95+
1000 cores	QUANTITY	0.95+
Thio	PERSON	0.95+
800 mega flops	QUANTITY	0.95+
Professor	PERSON	0.95+
Andi	PERSON	0.94+
one thing	QUANTITY	0.94+
couple of years ago	DATE	0.94+
over 19	QUANTITY	0.93+
U. K	LOCATION	0.92+
Premier Magazine	TITLE	0.92+
10 100 petaflop machines	QUANTITY	0.91+
four years ago	DATE	0.91+
Exascale	LOCATION	0.91+
HPD Aiken	ORGANIZATION	0.91+

Recommend Videos

Sentiment Analysis

AWS Comprehend

Search Results for Gordon Bell: