Dr. Eng Lim Goh, HPE | HPE Discover 2021
>> Welcome back to HPE Discover 2021, theCUBE's virtual coverage, continuous coverage of HPE's annual customer event. My name is Dave Vellante, and we're going to dive into the intersection of high-performance computing, data, and AI with Dr. Eng Lim Goh, who is the Senior Vice President and CTO for AI at Hewlett Packard Enterprise. Dr. Goh, great to see you again. Welcome back to theCUBE.

>> Hello, Dave. Great to talk to you again.

>> You might remember last year we talked a lot about swarm intelligence and how AI is evolving. Of course, you hosted the Day 2 keynote here at Discover, and you talked about thriving in the age of insights and how to craft a data-centric strategy. And you addressed some of the biggest problems I think organizations face with data: data is plentiful, but insights are harder to come by. And you really dug into some great examples in retail, banking, medicine and healthcare, and media. But stepping back a little bit, if we zoom out on Discover '21, what do you make of the event so far, and what are some of your big takeaways?

>> Well, you started with an insightful question. Data is everywhere, but we lack the insight. That's the main reason why Antonio, on Day 1, focused on the fact that we are now in the age of insight, and on how to thrive in this new age. What I then did in the Day 2 keynote, following Antonio, was to talk about the challenges that we need to overcome in order to thrive in this new age.

>> So maybe we could talk a little bit about some of the things that you took away. I'm specifically interested in some of the barriers to achieving insights when customers are drowning in data. What do you hear from customers? What should we take away from some of the ones you talked about today?

>> Very pertinent question, Dave. Of the two challenges I spoke about that we need to overcome in order to thrive in this new age, the first one is the current challenge, and that current challenge is, as stated, the barriers to insight when we are awash with data. How do we overcome those barriers? In the Day 2 keynote I spoke about three main areas that we hear about from customers. The first barrier is that with many of our customers, data is siloed. In a big corporation you've got data siloed by sales, finance, engineering, manufacturing, supply chain, and so on. And there's a major effort ongoing in many corporations to build a federation layer above all those silos, so that the applications you build above it can be more intelligent: they have access to all the different silos of data, to get better intelligence and build more intelligent applications. So that was the first barrier to insight when we are awash with data.
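The federation pattern Dr. Goh describes can be illustrated with a short sketch. This is a minimal, hypothetical example in Python; the silo names, record layout, and in-memory dictionaries are stand-ins for what would really be live departmental databases behind a query layer.

```python
# Minimal sketch of a federation layer over departmental data silos.
# The silo names and records are hypothetical; a real layer would sit
# over live databases and handle auth, schemas, and caching.

# Each silo knows one slice of the business.
sales_silo   = {"cust-42": {"lifetime_value": 18_500}}
finance_silo = {"cust-42": {"credit_limit": 5_000}}
supply_silo  = {"cust-42": {"open_orders": 3}}

class FederationLayer:
    """Answers queries by consulting every registered silo."""
    def __init__(self):
        self.silos = {}

    def register(self, name, silo):
        self.silos[name] = silo

    def customer_view(self, customer_id):
        # Merge whatever each silo knows into a single record, so
        # applications above never deal with the silos directly.
        view = {}
        for silo in self.silos.values():
            view.update(silo.get(customer_id, {}))
        return view

fed = FederationLayer()
fed.register("sales", sales_silo)
fed.register("finance", finance_silo)
fed.register("supply_chain", supply_silo)

# An application built above the federation sees all silos at once.
print(fed.customer_view("cust-42"))
# {'lifetime_value': 18500, 'credit_limit': 5000, 'open_orders': 3}
```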
The second barrier we see amongst our customers is that data is raw and dispersed when it is stored, and it's tough to get value out of it. In that case I used the example of the May 6, 2010 event, where the stock market dropped a trillion dollars in tens of minutes. Those of us who are financially attuned know about this incident, but it is not the only one; there are many of them out there. And for that particular May 6 event, it took a long time to get insight. For months we had no insight as to what happened or why it happened. There were many other incidents like this, and the regulators were looking for that one rule that could mitigate many of them. One of our customers decided to take the hard road and go with the tough data, because the data was raw and dispersed. They went into all the different feeds of financial transaction information, took the tough road, and analyzed that data. It took a long time to assemble, and they discovered that there was quote stuffing: people were sending in a lot of trades and then cancelling them almost immediately, to manipulate the market. And why didn't we see it immediately? The reason is that the processed reports that everybody sees had a rule in them that said trades of less than 100 shares need not be reported. So what people did was send a lot of trades of less than 100 shares each, to fly under the radar while doing this manipulation. So here is the second barrier: data can be raw and dispersed, and sometimes you just have to take the hard road to get insight. And this is one great example.
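A toy version of that raw-feed analysis is sketched below: scan order events for orders cancelled almost immediately after being placed, then note how many of them fall under the 100-share threshold that kept them out of the standard reports. The feed format, the 10-millisecond window, and the numbers are all assumptions for illustration, not the actual exchange data.

```python
# Toy sketch of the quote-stuffing analysis described above: scan a
# raw order feed for orders cancelled almost immediately. The feed
# format and the cancel window are hypothetical.

raw_feed = [
    # (order_id, action, shares, timestamp_ms)
    (1, "NEW",    50, 1000), (1, "CANCEL", 50, 1003),
    (2, "NEW",    60, 1001), (2, "CANCEL", 60, 1002),
    (3, "NEW",   500, 1005),                 # a genuine resting order
]

CANCEL_WINDOW_MS = 10    # "almost immediately", assumed threshold

def find_stuffed_quotes(feed):
    placed, stuffed = {}, []
    for order_id, action, shares, ts in feed:
        if action == "NEW":
            placed[order_id] = (shares, ts)
        elif action == "CANCEL" and order_id in placed:
            shares, placed_ts = placed.pop(order_id)
            if ts - placed_ts <= CANCEL_WINDOW_MS:
                stuffed.append((order_id, shares, ts - placed_ts))
    return stuffed

suspicious = find_stuffed_quotes(raw_feed)
# Orders under 100 shares were exactly the ones the standard reports
# omitted, so only the raw feed reveals this pattern.
under_threshold = [s for s in suspicious if s[1] < 100]
print(suspicious)        # [(1, 50, 3), (2, 60, 1)]
print(under_threshold)   # both fly under the 100-share reporting rule
```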
And then the last barrier has to do with the times when you start a project to get answers and insight, and you realize that all the data is around you, but you don't seem to find the right data to get what you need. Here we have three quick examples from customers. One was a great example: they were trying to build a machine language translator between two languages. To do that, they needed hundreds of millions of word pairs, words in one language matched with the corresponding words in the other. Where were they going to get all these word pairs? Someone creative thought of a willing source, and it turned out to be the United Nations. So sometimes you think you don't have the right data, but there might be another source, and a willing one, that could give you that data. The second example shows that sometimes you may just have to generate the data; an interesting one. We had an autonomous car customer that collects massive amounts of data from their cars: lots of sensors collecting lots of data. But sometimes they don't have the data they need even after collection. For example, they had collected data with a car in fine weather, and driving on the highway in rain and in snow, but never had the opportunity to collect data in hail, because that's a rare occurrence. So instead of waiting for a time when the car could drive in hail, they built a simulation, taking the data the car collected in snow and simulating hail. So these are some of the examples of customers working to overcome barriers: where data is siloed, they federated it; where data is tough to get at, they took the hard road; and thirdly, sometimes you just have to be creative to get the right data you need.

>> Wow, I'll tell you, I have about a hundred questions based on what you just said. The flash crash is a great example; in fact, Michael Lewis wrote about this in his book Flash Boys. Essentially it was high-frequency traders trying to front-run the market, sending in small block trades trying to get on the front end of it. And they chalked it up to a glitch; like you said, for months nobody really knew what it was. So technology got us into this problem. I guess my question is, can technology help us get out of the problem? And maybe that's where AI fits in.

>> Yes, yes. In fact, a lot of analytics went into going back to the raw data, which is highly dispersed across different sources, and assembling it to see if you can find a material trend. You can see lots of trends; when humans look at things, we tend to see patterns in clouds. So sometimes you need to apply statistical analysis and math to be sure that what the model is seeing is real, and that required work. That's one area. The second area is that there are times when you just need to go through that tough approach to find the answer. The issue that comes to mind here is that humans put in the rules that decide what goes into the reports everybody sees; in this case, before the change in the rules. By the way, after the discovery, the authorities changed the rules, and now trades of any size have to be reported. But the rule that was applied, as I said earlier, was that trades under 100 shares need not be reported. So you have to understand that reports were designed by humans, and for understandable reasons: they probably didn't want to put everything in there, so that people could still read the report in a reasonable amount of time. But we need to understand that the rules in the reports we read were put in by humans, and as such, there are times you just need to go back to the raw data.

>> I want to ask...

>> Albeit that it's going to be tough.
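Dr. Goh's point about patterns in clouds, that you need statistics to confirm a trend is real before acting on it, is the kind of check a permutation test gives you. The sketch below uses synthetic numbers purely to show the method; it is not tied to any of the datasets mentioned in the conversation.

```python
# Minimal sketch of the "is this trend real?" check: a permutation
# test on the difference between two groups. The numbers are
# synthetic; the point is the method, not the data.
import random

group_a = [2.9, 3.1, 3.4, 3.0, 3.3, 3.2]   # metric under condition A
group_b = [3.6, 3.8, 3.5, 3.9, 3.7, 3.6]   # metric under condition B

def mean(xs):
    return sum(xs) / len(xs)

observed = mean(group_b) - mean(group_a)

# Shuffle the group labels many times: how often does chance alone
# produce a difference at least as large as the observed one?
pooled = group_a + group_b
random.seed(0)
extreme, TRIALS = 0, 10_000
for _ in range(TRIALS):
    random.shuffle(pooled)
    diff = mean(pooled[len(group_a):]) - mean(pooled[:len(group_a)])
    if abs(diff) >= abs(observed):
        extreme += 1

p_value = extreme / TRIALS
print(f"observed difference = {observed:.2f}, p = {p_value:.4f}")
# A small p-value says the trend is unlikely to be a
# pattern-in-the-clouds artifact of random variation.
```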
>> Yeah. So I want to ask a question about AI. It's obviously in your title, and it's something you know a lot about, and I want to make a statement; you tell me if it's on point or off point. It seems that most of the AI going on in the enterprise is modeling: data science applied to troves of data. But there's also a lot of AI going on in consumer, whether it's fingerprint technology or facial recognition or natural language processing. A two-part question: will the consumer market, as it has so often, inform the enterprise? That's the first part. And then, will there be a shift from modeling, if you will, to more, you mentioned autonomous vehicles, more AI inferencing in real time, especially at the edge? Can you help us understand that better?

>> Yeah, it's a great question. There are three stages, to simplify (it's probably more sophisticated than that, but let's simplify), to building an AI system that ultimately can make a prediction, or assist you in making a decision toward an outcome. You start with the data, massive amounts of data, and you have to decide what to feed the machine with. So you feed the machine with this massive chunk of data, and the machine starts to evolve a model based on all the data it is seeing. It evolves to the point where, using a test set of data that you have kept aside separately and for which you know the answers, you test the model after you have trained it with all that data, to see whether its prediction accuracy is high enough. And once you are satisfied with it, you deploy the model to make the decision, and that's the inference. So a lot of the time, depending on what we are focusing on: in data science we work hard on assembling the right data to feed the machine with; that's the data preparation and organization work. After which you build your models; you have to pick the right models for the decisions and predictions you want to make. You pick the right models, and then you start feeding the data to them. Sometimes you pick one model and the prediction isn't that robust; it is good, but it is not consistent. What you do then is try another model, so sometimes you just keep trying different models until you get the right kind, one that gives you robust, consistent decision-making and prediction. After which, once it is tested well and QA'ed, you take that model and deploy it at the edge. And at the edge you are essentially just looking at new data, applying it to the model you have trained, and that model gives you a prediction or a decision. So it is these three stages.
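Those three stages (assemble the data, train and select a model against a held-out test set, then deploy the winner for inference) map directly onto a few lines of scikit-learn. The dataset and the two candidate models below are arbitrary stand-ins chosen for the sketch, not anything specific to HPE's pipelines.

```python
# Compact sketch of the three stages described above: (1) prepare the
# data and keep a test set aside, (2) try candidate models and keep
# the most robust, (3) "deploy" the winner for inference on new data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stage 1: data preparation, with a held-out test set whose answers
# we know, exactly as described in the conversation.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Stage 2: keep trying models until one is robust enough.
candidates = [LogisticRegression(max_iter=1000),
              RandomForestClassifier(n_estimators=100, random_state=0)]
best_model, best_score = None, 0.0
for model in candidates:
    model.fit(X_train, y_train)
    score = model.score(X_test, y_test)   # accuracy on held-out data
    if score > best_score:
        best_model, best_score = model, score

print(f"selected {type(best_model).__name__}, accuracy = {best_score:.3f}")

# Stage 3: at the edge, inference is just applying the trained model
# to new data as it arrives.
new_observation = X_test[:1]
print("prediction:", best_model.predict(new_observation))
```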
But more and more, and your question reminds me of this, people are thinking, as the edge becomes more and more powerful: can you also do learning at the edge? That's the reason we spoke about swarm learning the last time: learning at the edge as a swarm, because individually the devices may not have enough power to do so, but as a swarm, they may.

>> Is that learning from the edge, or learning at the edge?

>> That's a great question. The quick answer is learning at the edge, and also from the edge. But the main goal is to learn at the edge, so that you don't have to move the data the edge sees back to the cloud or the core to do the learning. That is one of the main reasons you want to learn at the edge: so that you don't need to send all that data back and assemble it from all the different edge devices on the cloud side to do the learning. With swarm learning you can keep the data at the edge and learn at that point.
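A heavily simplified sketch of that idea follows: each node fits a model on data that never leaves the device, and only the model parameters are shared and combined. HPE's actual Swarm Learning framework does considerably more (decentralized coordination among the peers, among other things); this sketch only illustrates the data-stays-local principle, and the "model" is deliberately a toy.

```python
# Heavily simplified sketch of learning at the edge as a swarm: each
# node trains on its own local data, and only model parameters, never
# raw data, are exchanged and combined. The "model" here is a toy
# (estimating a mean) standing in for real model weights.
import random
random.seed(1)

def local_fit(samples):
    """Each node 'learns' by estimating the mean of its local data,
    a stand-in for a real model's parameters."""
    return sum(samples) / len(samples)

# Three edge nodes, each with data that never leaves the device.
node_data = [
    [random.gauss(5.0, 1.0) for _ in range(200)],   # node 1
    [random.gauss(5.2, 1.0) for _ in range(150)],   # node 2
    [random.gauss(4.9, 1.0) for _ in range(250)],   # node 3
]

local_params = [local_fit(d) for d in node_data]

# The swarm combines parameters, weighted by local sample count.
# The only things transmitted are one number and one count per node.
weights = [len(d) for d in node_data]
swarm_param = sum(p * w for p, w in zip(local_params, weights)) / sum(weights)

print("local estimates:", [round(p, 3) for p in local_params])
print("swarm estimate :", round(swarm_param, 3))
```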
>> And then maybe only selectively send data back. The autonomous vehicle example you gave is great, because maybe they're only persisting data when there's inclement weather, or when a deer runs across the front. They do that, send that smaller data set back, and maybe that's where the modeling is done, but the rest can be done at the edge. It's a new world that's coming. Let me ask you a question: is there a limit to what data should be collected and how it should be collected?

>> That's a great question again; wow, today is full of these insightful questions. That actually touches on the second challenge we need to overcome in order to thrive in this new age of insight. The second challenge is our future challenge: what do we do for our future? And in there, the statement we make is that we have to focus on collecting data strategically for the future of our enterprise. Within that, I talk about what to collect, when to organize it as you collect it, and where your data will be, going forward, as you collect it. So: what, when, and where. For the what, what data to collect, that was the question you asked. It's a question that different industries have to ask themselves, because it will vary. Let me use your autonomous car example. We have this customer collecting massive amounts of data: we're talking about ten petabytes a day from their fleet of cars. And these are not production autonomous cars; these are training autonomous cars, collecting data so they can train and eventually deploy commercial cars. So this fleet of data-collection cars collects ten petabytes a day, and when the customer came to us to build a storage system to store all of that data, they realized they cannot afford to store all of it. Now here comes the dilemma: after I have spent so much effort building all these cars and sensors and collecting data, I now have to decide what to delete? That's a dilemma. In working with them on this process of trimming down what they collected, I am constantly reminded of the sixties and seventies, when we called a large part of our DNA junk DNA. Today we realize that a large part of what we called junk has function, valuable function: they are not genes, but they regulate the function of genes. So what was junk yesterday could be valuable today, and what's junk today could be valuable tomorrow. So there's this tension between deciding you cannot afford to store everything you can get your hands on, and on the other hand worrying that you ignored the wrong ones. You can see this tension in our customers, and it depends on the industry. In healthcare, they say: I have no choice, I want it all. One very insightful point brought up by one healthcare provider that really touched me was: we don't only care about the people we are caring for (of course we care a lot about them), we also care about the people we are not caring for. How do we find them? Therefore they not only need to collect the data they have from their patients, they also need to reach out to outside data so they can figure out who they are not caring for. So they want it all. So I ask them, what do you do about funding if you want it all? They say they have no choice but to figure out a way to fund it, and perhaps monetizing what they have now is the way to find it. Of course, they also come back to us, rightfully, to work out a way to help them build that system. So that's healthcare. And if you go to other industries like banking, they say they can't afford to keep it all, but, like healthcare, they are regulated with regard to privacy and such. So there are many examples of different industries having different needs and different approaches to what they collect. But there is this constant tension between deciding not to fund storing all that you can store, and, on the other hand, some of what you decide not to store becoming highly valuable in the future.

>> We can make some assumptions about the future, can't we? I mean, we know there's going to be a lot more data than we've ever seen before. We know that, notwithstanding supply constraints on things like NAND, the price of storage is going to continue to decline. We also know, and not a lot of people are really talking about this, that processing power (people say Moore's Law is dead; okay, it's waning), when you combine CPUs and NPUs and GPUs and accelerators and so forth, is actually increasing. And so when you think about these use cases at the edge, you're going to have much more processing power, you're going to have cheaper storage, and it's going to be less expensive processing. So as an AI practitioner, what can you do with that?

>> That goes to the where: where will your data be? We have one estimate that says that by next year there will be 55 billion connected devices out there. What's the population of the world? On the order of ten billion. But this is 55 billion, and most of them can collect data. So the amount of data that's going to come in will far exceed our drop in storage costs and our increase in compute power. So what's the answer? Even the drop in price and the increase in bandwidth will be overwhelmed; 55 billion devices collecting will overwhelm 5G. So the answer must be a balance against needing to bring all the data from those 55 billion devices back to one central location, or a number of central locations, because you may not be able to afford to do that. Firstly, bandwidth: even with 5G and SD-WAN, it will still be too expensive given the number of devices out there. And even with storage costs dropping, it will still be too expensive to try to store it all. So the answer must be, at least to mitigate the problem, to leave a lot of the data out there at the edge, and only send back the pertinent ones, as you said before. But then, if you did that, how are we going to do machine learning at the core and on the cloud side if we don't have all the data? You want rich data to train with; sometimes you want a mix of the positive-type data and the negative-type data so you can train the machine in a more balanced way. So the answer must be, eventually, as we move forward with this huge number of devices at the edge, to do machine learning at the edge. Today we don't have enough power there: the edge is typically characterized by lower energy capability and therefore lower compute power. But soon, even at lower energy, these devices can do more, with compute power improving in energy efficiency. So, learning at the edge: today we do inference at the edge; we train a model, deploy it, and you do inference at the edge; that's what we do today. But more and more, I believe, given the massive amount of data at the edge, you will have to start doing machine learning at the edge, and when a device doesn't have enough power, you aggregate multiple devices' compute power into a swarm and learn as a swarm.
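The "leave the data at the edge and send back only the pertinent ones" idea can be sketched as a filter in front of the uplink: the deployed model scores each reading, ordinary samples stay local, and only rare or surprising ones are uploaded for retraining. The on-device model, the confidence threshold, and the sensor values below are hypothetical placeholders.

```python
# Sketch of edge-side filtering: score incoming readings with the
# deployed model and upload only the rare or surprising samples.
# The model stub and thresholds are hypothetical.

def deployed_model_confidence(reading):
    """Stand-in for an on-device model: confident on ordinary
    readings, unsure on out-of-range ones."""
    return 0.95 if 0.0 <= reading <= 10.0 else 0.40

UPLOAD_THRESHOLD = 0.6      # below this, a sample is "pertinent"

sensor_stream = [3.2, 4.8, 5.1, 47.9, 6.0, -12.3, 5.5]

kept_local, sent_back = [], []
for reading in sensor_stream:
    if deployed_model_confidence(reading) < UPLOAD_THRESHOLD:
        sent_back.append(reading)    # rare event: worth the bandwidth
    else:
        kept_local.append(reading)   # ordinary: stays at the edge

print(f"{len(kept_local)} readings stay local, "
      f"{len(sent_back)} sent back for training: {sent_back}")
```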
>> Interesting. So now, of course, if I were a fly on the wall in the HPE board meeting, I'd say: okay, HPE is a leading provider of compute; how do you take advantage of that? I know it's the future, but you must be thinking about it and participating in those markets. I know today you have Edgeline and other products, but it seems to me that this is not the general-purpose computing we've known in the past; it's a new type of specialized computing. How are you thinking about participating in that opportunity for your customers?

>> The world will have to have a balance. Today the more common mode is to collect the data from the edge and train at some centralized location, or a number of centralized locations. Going forward, given the proliferation of edge devices, we'll need a balance: we need capability on the cloud side, and it has to be hybrid, and then we need capability on the edge side. We want to build systems that on one hand are edge-adapted: environmentally adapted, because edge devices are often outdoors; packaging-adapted; and also power-adapted, because many of these devices are battery-powered. So you have to build systems that adapt to that. But at the same time they must not be custom; that's my belief. They must use standard processors and standard operating systems so that they can run a rich set of applications. So yes, that is also the insight behind what Antonio announced in 2018: four billion dollars invested over the following four years to strengthen our edge portfolio, our edge product lines, our edge solutions.

>> Dr. Goh, I could go on for hours with you; you're just such a great guest. Let's close: what are you most excited about in the future, certainly of HPE, but of the industry in general?

>> I think the excitement is the customers: the diversity of customers, and the diversity in the ways they have approached their different problems with data strategy. So the excitement is around data strategy. The statement made was profound: Antonio said we are in the age of insight, powered by data. That's the first line. The line that comes after that is: as such, we are becoming more and more data-centric, with data the currency. Now the next step is even more profound. We are going as far as saying that data should not be treated as cost anymore, but instead as an investment in a new asset class called data, with value on our balance sheet. This is a step change in thinking that is going to change the way we look at data and the way we value it. For me, as a CTO for AI, that is the exciting thing, because a machine is only as intelligent as the data you feed it with. Data is the source of a machine learning to be intelligent. So when people start to value data, and say that it is an investment when we collect it, it is very positive for AI, because an AI system gets more intelligent with huge amounts of data and diversity of data. So it would be great if the community values data well.

>> You certainly see it in the valuations of many companies these days.
And I think increasingly you see it on the income statement: data products and people monetizing data services. And maybe eventually you'll see it on the balance sheet. Doug Laney, when he was at Gartner Group, wrote a book about this, and a lot of people are thinking about it. That's a big change, isn't it, Doctor?

>> Yes. The question is the process and the methods of valuation. But I believe we'll get there; we need to get started, and then we'll get there.

>> Dr. Goh, always a pleasure.

>> Yes, and AI will benefit greatly from it.

>> Oh yeah, no doubt. People will better understand how to align some of these technology investments. Dr. Goh, great to see you again. Thanks so much for coming back to theCUBE. It's been a real pleasure.

>> Yes: a system is only as smart as the data you feed it with.

>> Excellent. We'll leave it there. Thank you for spending some time with us, and keep it right there for more great interviews from HPE Discover '21. This is Dave Vellante for theCUBE, the leader in enterprise tech coverage. We'll be right back.
Dr Eng Lim Goh, High Performance Computing & AI | HPE Discover 2021
>>Welcome back to HPD discovered 2021 the cubes virtual coverage, continuous coverage of H P. S H. P. S. Annual customer event. My name is Dave Volonte and we're going to dive into the intersection of high performance computing data and AI with DR Eng limb go who is the senior vice president and CTO for AI at Hewlett Packard enterprise Doctor go great to see you again. Welcome back to the cube. >>Hello Dave, Great to talk to you again. >>You might remember last year we talked a lot about swarm intelligence and how AI is evolving. Of course you hosted the day two keynotes here at discover you talked about thriving in the age of insights and how to craft a data centric strategy and you addressed you know some of the biggest problems I think organizations face with data that's You got a data is plentiful but insights they're harder to come by. And you really dug into some great examples in retail banking and medicine and health care and media. But stepping back a little bit with zoom out on discovered 21, what do you make of the events so far? And some of your big takeaways? >>Mm Well you started with the insightful question, right? Yeah. Data is everywhere then. But we like the insight. Right? That's also part of the reason why that's the main reason why you know Antonio on day one focused and talked about that. The fact that we are now in the age of insight. Right? Uh and and uh and and how to thrive thrive in that in this new age. What I then did on the day to kino following Antonio is to talk about the challenges that we need to overcome in order in order to thrive in this new age. >>So maybe we could talk a little bit about some of the things that you took away in terms I'm specifically interested in some of the barriers to achieving insights when you know customers are drowning in data. What do you hear from customers? What we take away from some of the ones you talked about today? >>Oh, very pertinent question. Dave you know the two challenges I spoke about right now that we need to overcome in order to thrive in this new age. The first one is is the current challenge and that current challenge is uh you know stated is you know, barriers to insight, you know when we are awash with data. So that's a statement right? How to overcome those barriers. What are the barriers of these two insight when we are awash in data? Um I in the data keynote I spoke about three main things. Three main areas that received from customers. The first one, the first barrier is in many with many of our customers. A data is siloed. All right. You know, like in a big corporation you've got data siloed by sales, finance, engineering, manufacturing, and so on, uh supply chain and so on. And uh, there's a major effort ongoing in many corporations to build a federation layer above all those silos so that when you build applications above they can be more intelligent. They can have access to all the different silos of data to get better intelligence and more intelligent applications built. So that was the that was the first barrier we spoke about barriers to incite when we are washed with data. The second barrier is uh, that we see amongst our customers is that uh data is raw and dispersed when they are stored and and uh and you know, it's tough to get tough to to get value out of them. Right? And I in that case I I used the example of uh you know the May 6 2010 event where the stock market dropped a trillion dollars in in tens of ministerial. 
We we all know those who are financially attuned with know about this uh incident But this is not the only incident. There are many of them out there and for for that particular May six event uh you know, it took a long time to get insight months. Yeah before we for months we had no insight as to what happened, why it happened, right. Um and and there were many other incidences like this. And the regulators were looking for that one rule that could, that could mitigate many of these incidences. Um one of our customers decided to take the hard road go with the tough data right? Because data is rolling dispersed. So they went into all the different feeds of financial transaction information. Uh took the took the tough uh took the tough road and analyze that data took a long time to assemble and they discovered that there was court stuffing right? That uh people were sending a lot of traits in and then cancelling them almost immediately. You have to manipulate the market. Um And why why why didn't we see it immediately? Well the reason is the process reports that everybody sees uh rule in there that says all trades. Less than 100 shares don't need to report in there. And so what people did was sending a lot of less than 103 100 100 shares trades uh to fly under the radar to do this manipulation. So here is here the second barrier right? Data could be raw and dispersed. Um Sometimes you just have to take the hard road and um and to get insight And this is 1 1 great example. And then the last barrier is uh is has to do with sometimes when you start a project to to get insight to get uh to get answers and insight. You you realize that all the datas around you but you don't you don't seem to find the right ones To get what you need. You don't you don't seem to get the right ones. Yeah. Um here we have three quick examples of customers. 111 was it was a great example right? Where uh they were trying to build a language translator, a machine language translator between two languages. Right? But not do that. They need to get hundreds of millions of word pairs, you know, of one language compared uh with the corresponding other hundreds of millions of them. They say we are going to get all these word pairs. Someone creative thought of a willing source and a huge, so it was a United Nations you see. So sometimes you think you don't have the right data with you, but there might be another source and a willing one that could give you that data right. The second one has to do with uh there was uh the uh sometimes you you may just have to generate that data, interesting one. We had an autonomous car customer that collects all these data from their cars, right, massive amounts of data, loss of senses, collect loss of data. And uh you know, but sometimes they don't have the data they need even after collection. For example, they may have collected the data with a car uh in in um in fine weather and collected the car driving on this highway in rain and also in stone, but never had the opportunity to collect the car in hale because that's a rare occurrence. So instead of waiting for a time where the car can dr inhale, they build a simulation you by having the car collector in snow and simulated him. So these are some of the examples where we have customers working to overcome barriers, right? You have barriers that is associated the fact that data is silo Federated, it various associated with data. That's tough to get that. They just took the hard road, right? 
And sometimes, thirdly, you just have to be creative to get the right data you need, >>wow, I tell you, I have about 100 questions based on what you just said. Uh, there's a great example, the flash crash. In fact, Michael Lewis wrote about this in his book, The Flash Boys and essentially right. It was high frequency traders trying to front run the market and sending in small block trades trying to get on the front end it. So that's and they, and they chalked it up to a glitch like you said, for months, nobody really knew what it was. So technology got us into this problem. I guess my question is, can technology help us get out of the problem? And that maybe is where AI fits in. >>Yes, yes. Uh, in fact, a lot of analytics, we went in, uh, to go back to the raw data that is highly dispersed from different sources, right, assemble them to see if you can find a material trend, right? You can see lots of trends right? Like, uh, you know, we, if if humans look at things right, we tend to see patterns in clouds, right? So sometimes you need to apply statistical analysis, um math to be sure that what the model is seeing is is real. Right? And and that required work. That's one area. The second area is uh you know, when um uh there are times when you you just need to to go through that uh that tough approach to to find the answer. Now, the issue comes to mind now is is that humans put in the rules to decide what goes into a report that everybody sees in this case uh before the change in the rules. Right? But by the way, after the discovery, the authorities change the rules and all all shares, all traits of different any sizes. It has to be reported. No. Yeah. Right. But the rule was applied uh you know, to say earlier that shares under 100 trades under 100 shares need not be reported. So sometimes you just have to understand that reports were decided by humans and and under for understandable reasons. I mean they probably didn't want that for various reasons not to put everything in there so that people could still read it uh in a reasonable amount of time. But uh we need to understand that rules were being put in by humans for the reports we read. And as such, there are times you just need to go back to the raw data. >>I want to ask, >>albeit that it's gonna be tough. >>Yeah. So I want to ask a question about AI is obviously it's in your title and it's something you know a lot about but and I want to make a statement, you tell me if it's on point or off point. So it seems that most of the Ai going on in the enterprise is modeling data science applied to troves of data >>but >>but there's also a lot of ai going on in consumer whether it's you know, fingerprint technology or facial recognition or natural language processing. Will a two part question will the consumer market has so often in the enterprise sort of inform us uh the first part and then will there be a shift from sort of modeling if you will to more you mentioned autonomous vehicles more ai influencing in real time. Especially with the edge. She can help us understand that better. >>Yeah, it's a great question. Right. Uh there are three stages to just simplify, I mean, you know, it's probably more sophisticated than that but let's simplify three stages. All right. To to building an Ai system that ultimately can predict, make a prediction right or to to assist you in decision making, have an outcome. So you start with the data massive amounts data that you have to decide what to feed the machine with. 
So you feed the machine with this massive chunk of data and the machine uh starts to evolve a model based on all the data is seeing. It starts to evolve right to the point that using a test set of data that you have separately campus site that you know the answer for. Then you test the model uh you know after you trained it with all that data to see whether it's prediction accuracy is high enough and once you are satisfied with it, you you then deploy the model to make the decision and that's the influence. Right? So a lot of times depend on what what we are focusing on. We we um in data science are we working hard on assembling the right data to feed the machine with, That's the data preparation organization work. And then after which you build your models, you have to pick the right models for the decisions and prediction you wanted to make. You pick the right models and then you start feeding the data with it. Sometimes you you pick one model and the prediction isn't that robust, it is good but then it is not consistent right now what you do is uh you try another model so sometimes it's just keep trying different models until you get the right kind. Yeah, that gives you a good robust decision making and prediction after which It is tested well Q eight. You would then take that model and deploy it at the edge. Yeah. And then at the edges is essentially just looking at new data, applying it to the model, you're you're trained and then that model will give you a prediction decision. Right? So uh it is these three stages. Yeah, but more and more uh you know, your question reminds me that more and more people are thinking as the edge become more and more powerful. Can you also do learning at the edge? Right. That's the reason why we spoke about swarm learning the last time, learning at the edge as a swamp, right? Because maybe individually they may not have enough power to do so. But as a swampy me, >>is that learning from the edge or learning at the edge? In other words? Yes. Yeah. Question Yeah. >>That's a great question. That's a great question. Right? So uh the quick answer is learning at the edge, right? Uh and also from the edge, but the main goal, right? The goal is to learn at the edge so that you don't have to move the data that the Edge sees first back to the cloud or the core to do the learning because that would be the reason. One of the main reasons why you want to learn at the edge, right? Uh So so that you don't need to have to send all that data back and assemble it back from all the different edge devices, assemble it back to the cloud side to to do the learning right? With swampland. You can learn it and keep the data at the edge and learn at that point. >>And then maybe only selectively send the autonomous vehicle example you gave us. Great because maybe there, you know, there may be only persisting, they're not persisting data that is inclement weather or when a deer runs across the front and then maybe they they do that and then they send that smaller data set back and maybe that's where it's modelling done. But the rest can be done at the edges. It's a new world that's coming down. Let me ask you a question, is there a limit to what data should be collected and how it should be collected? >>That's a great question again. You know uh wow today, full of these uh insightful questions that actually touches on the second challenge. Right? How do we uh in order to thrive in this new age of inside? The second challenge is are you know the is our future challenge, right? 
What do we do for our future? And and in there is uh the statement we make is we have to focus on collecting data strategically for the future of our enterprise. And within that I talk about what to collect right? When to organize it when you collect and then where will your data be, you know going forward that you are collecting from? So what, when and where for the what data for the what data to collect? That? That was the question you ask. Um it's it's a question that different industries have to ask themselves because it will vary, right? Um let me give you the you use the autonomous car example, let me use that. And you have this customer collecting massive amounts of data. You know, we're talking about 10 petabytes a day from the fleet of their cars. And these are not production autonomous cars, right? These are training autonomous cars collecting data so they can train and eventually deploy commercial cars, right? Um so this data collection cars they collect as a fleet of them collect temporal bikes a day. And when it came to us building a storage system to store all of that data, they realized they don't want to afford to store all of it. Now, here comes the dilemma, right? What should I after I spent so much effort building all these cars and sensors and collecting data, I've now decide what to delete. That's a dilemma right now in working with them on this process of trimming down what they collected. You know, I'm constantly reminded of the sixties and seventies, right? To remind myself 60 and seventies, we call a large part of our D. N. A junk DNA. Today. We realize that a large part of that what we call john has function as valuable function. They are not jeans, but they regulate the function of jeans, you know, So, so what's jump in the yesterday could be valuable today or what's junk today could be valuable tomorrow, Right? So, so there's this tension going on right between you decided not wanting to afford to store everything that you can get your hands on. But on the other hand, you you know, you worry you you you ignore the wrong ones, right? You can see this tension in our customers, right? And it depends on industry here, right? In health care, they say I have no choice. I I want it. All right. One very insightful point brought up by one health care provider that really touched me was, you know, we are not we don't only care. Of course we care a lot. We care a lot about the people we are caring for, right? But you also care for the people were not caring for. How do we find them? Mhm. Right. And that therefore, they did not just need to collect data. That is that they have with from their patients. They also need to reach out right to outside data so that they can figure out who they are not caring for, right? So they want it all. So I tell us them, so what do you do with funding if you want it all? They say they have no choice but to figure out a way to fund it and perhaps monetization of what they have now is the way to come around and find that. Of course they also come back to us rightfully that you know, we have to then work out a way to help them build that system, you know? So that's health care, right? And and if you go to other industries like banking, they say they can't afford to keep them off, but they are regulated, seems like healthcare, they are regulated as to uh privacy and such. Like so many examples different industries having different needs, but different approaches to how what they collect. 
But there is this constant tension between um you perhaps deciding not wanting to fund all of that uh all that you can store, right? But on the other hand, you know, if you if you kind of don't want to afford it and decide not to store some uh if he does some become highly valuable in the future, right? Yeah. >>We can make some assumptions about the future, can't we? I mean, we know there's gonna be a lot more data than than we've ever seen before. We know that we know well notwithstanding supply constraints on things like nand. We know the prices of storage is going to continue to decline. We also know, and not a lot of people are really talking about this but the processing power but he says moore's law is dead okay. It's waning. But the processing power when you combine the Cpus and NP US and GPUS and accelerators and and so forth actually is is increasing. And so when you think about these use cases at the edge, you're going to have much more processing power, you're gonna have cheaper storage and it's going to be less expensive processing And so as an ai practitioner, what can you do with that? >>Yeah, it's highly again, another insightful questions that we touched on our keynote and that that goes up to the why I do the where? Right, When will your data be? Right. We have one estimate that says that by next year there will be 55 billion connected devices out there. Right. 55 billion. Right. What's the population of the world? Of the other? Of 10 billion? But this thing is 55 billion. Right? Uh and many of them, most of them can collect data. So what do you what do you do? Right. Um So the amount of data that's gonna come in, it's gonna weigh exceed right? Our drop in storage costs are increasing computer power. Right? So what's the answer? Right. So, so the the answer must be knowing that we don't and and even the drop in price and increase in bandwidth, it will overwhelm the increased five G will overwhelm five G. Right? Given amount 55 billion of them collecting. Right? So, the answer must be that there might need to be a balance between you needing to bring all that data from the 55 billion devices of data back to a central as a bunch of central Cause because you may not be able to afford to do that firstly band with even with five G. M and and SD when you'll still be too expensive given the number of devices out there. Were you given storage cause dropping will still be too expensive to try and store them all. So the answer must be to start at least to mitigate the problem to some leave both a lot of the data out there. Right? And only send back the pertinent ones as you said before. But then if you did that, then how are we gonna do machine learning at the core and the cloud side? If you don't have all the data you want rich data to train with. Right? Some sometimes you want a mix of the uh positive type data and the negative type data so you can train the machine in a more balanced way. So the answer must be eventually right. As we move forward with these huge number of devices out of the edge to do machine learning at the edge. Today, we don't have enough power. Right? The edge typically is characterized by a lower uh, energy capability and therefore lower compute power. But soon, you know, even with lower energy, they can do more with compute power improving in energy efficiency, Right? Uh, so learning at the edge today, we do influence at the edge. So we data model deploy and you do influence at the age, that's what we do today. 
But more and more, I believe, given a massive amount of data at the edge, you you have to have to start doing machine learning at the edge. And and if when you don't have enough power, then you aggregate multiple devices, compute power into a swamp and learn as a swan, >>interesting. So now, of course, if I were sitting and fly on the wall in HP board meeting, I said, okay, HP is as a leading provider of compute, how do you take advantage of that? I mean, we're going, I know it's future, but you must be thinking about that and participating in those markets. I know today you are you have, you know, edge line and other products. But there's it seems to me that it's it's not the general purpose that we've known in the past. It's a new type of specialized computing. How are you thinking about participating in that >>opportunity for your customers? Uh the world will have to have a balance right? Where today the default, Well, the more common mode is to collect the data from the edge and train at uh at some centralized location or a number of centralized location um going forward. Given the proliferation of the edge devices, we'll need a balance. We need both. We need capability at the cloud side. Right. And it has to be hybrid. And then we need capability on the edge side. Yeah. That they want to build systems that that on one hand, uh is uh edge adapted, right? Meaning the environmentally adapted because the edge different they are on a lot of times on the outside. Uh They need to be packaging adapted and also power adapted, right? Because typically many of these devices are battery powered. Right? Um so you have to build systems that adapt to it, but at the same time they must not be custom. That's my belief. They must be using standard processes and standard operating system so that they can run rich a set of applications. So yes. Um that's that's also the insightful for that Antonio announced in 2018, Uh the next four years from 2018, right, $4 billion dollars invested to strengthen our edge portfolio, edge product lines, right Edge solutions. >>I get a doctor go. I could go on for hours with you. You're you're just such a great guest. Let's close what are you most excited about in the future of of of it? Certainly H. P. E. But the industry in general. >>Yeah I think the excitement is uh the customers right? The diversity of customers and and the diversity in a way they have approached their different problems with data strategy. So the excitement is around data strategy right? Just like you know uh you know the the statement made was was so was profound. Right? Um And Antonio said we are in the age of insight powered by data. That's the first line right? The line that comes after that is as such were becoming more and more data centric with data the currency. Now the next step is even more profound. That is um you know we are going as far as saying that you know um data should not be treated as cost anymore. No right. But instead as an investment in a new asset class called data with value on our balance sheet, this is a this is a step change right in thinking that is going to change the way we look at data the way we value it. So that's a statement that this is the exciting thing because because for for me a city of AI right uh machine is only as intelligent as the data you feed it with. Data is a source of the machine learning to be intelligent. So so that's that's why when when people start to value data right? And and and say that it is an investment when we collect it. 
It is very positive for ai because an Ai system gets intelligent, more intelligence because it has a huge amounts of data and the diversity of data. So it'd be great if the community values values data. Well >>you certainly see it in the valuations of many companies these days. Um and I think increasingly you see it on the income statement, you know data products and people monetizing data services and maybe eventually you'll see it in the in the balance. You know Doug Laney when he was a gardener group wrote a book about this and a lot of people are thinking about it. That's a big change isn't it? Dr >>yeah. Question is is the process and methods evaluation. Right. But uh I believe we'll get there, we need to get started then we'll get their belief >>doctor goes on and >>pleasure. And yeah and then the yeah I will will will will benefit greatly from it. >>Oh yeah, no doubt people will better understand how to align you know, some of these technology investments, Doctor goes great to see you again. Thanks so much for coming back in the cube. It's been a real pleasure. >>Yes. A system. It's only as smart as the data you feed it with. >>Excellent. We'll leave it there. Thank you for spending some time with us and keep it right there for more great interviews from HP discover 21. This is dave a lot for the cube. The leader in enterprise tech coverage right back.
Dr. Eng Lim Goh, Joachim Schultze, & Krishna Prasad Shastry | HPE Discover 2020
>> Narrator: From around the globe, it's theCUBE, covering HPE Discover Virtual Experience, brought to you by HPE. >> Hi everybody. Welcome back. This is Dave Vellante for theCUBE, and this is our coverage of Discover 2020, the virtual experience of HPE Discover. We've done many, many Discovers — usually we're on the show floor; this time theCUBE has been virtualized. We talk a lot at HPE Discover about storage and servers and infrastructure and networking, which is great. But the conversation we're going to have now is really about helping the world solve some big problems. And I'm very excited to welcome back to theCUBE Dr. Eng Lim Goh. He's the senior vice president and CTO for AI at HPE. Hello, Dr. Goh. Great to see you again. >> Hello. Thank you for having us, Dave. >> You're welcome. And then our next guest is Professor Joachim Schultze, who is the Professor for Genomics and Immunoregulation at the University of Bonn, amongst other things. Professor, welcome. >> Thank you. Welcome. >> And then Prasad Shastry is the Chief Technologist for the India Advanced Development Center at HPE. Welcome, Prasad. Great to see you. >> Thank you. Thanks for having me. >> So guys, we have a CUBE first. I don't believe we've ever had three guests in three separate time zones. I'm in a fourth time zone. (guests chuckling) So I'm in Boston; Dr. Goh, you're in Singapore; Professor Schultze, you're in Germany; and Prasad, you're in India. So we've got four different time zones, plus our studio in Palo Alto, which is running this program. So we've actually got five time zones — a CUBE first. >> Amazing. >> Very good. (Prasad chuckles) >> Such is the world we live in. So we're going to talk about some of the big problems. I mean, here's the thing: we're obviously in the middle of this pandemic, we're thinking about the post-isolation economy, et cetera. People compare this, no surprise, to the Spanish flu in the early part of the last century. They talk about the Great Depression. But the big difference this time is technology. Technology has completely changed the way in which we've approached this pandemic, and we're going to talk about that. Dr. Goh, I want to start with you. You've done a lot of work on this topic of swarm learning. If we could, (mumbles) my limited knowledge of this is that we're kind of borrowing from nature. You think about bees looking for a hive as sort of independent agents, but somehow they come together and communicate. Tell us: what do we need to know about swarm learning and how it relates to artificial intelligence? And we'll get into it. >> Oh, Dave, that's a great analogy, using a swarm of bees. That's exactly what we do at HPE. So let's use the example of hospitals here. When deploying artificial intelligence, a hospital does machine learning on its own patient data, which could be biased due to demographics and the types of cases it sees more of. Sharing patient data across different hospitals to remove this bias is limited, given privacy or even sovereignty restrictions, right? Like, for example, across countries in the EU. HPE swarm learning fixes this by allowing each hospital to continue learning locally, but at each cycle we collect the learned weights of the neural networks, average them, and send them back down to all the hospitals. And after a few cycles of doing this, all the hospitals will have learned from each other, removing biases without having to share any private patient data. That's the key.
So, the ability to allow you to learn from everybody without having to share your private patient data — that's swarm learning. >> And part of the key to that privacy is blockchain, correct? I mean, you've been involved in blockchain and invented some things in blockchain, and that's part of the privacy angle, is it not? >> Yes, yes, absolutely. There are different ways of doing this kind of distributed learning, but many of the other distributed learning methods require you to have some central control. Right? So Prasad and the team and us came together, and we have a method where, instead of central control, you use blockchain to do this coordination. So there is no longer a central controller or coordinator — especially important if you want a truly distributed, swarm-type learning system. >> Yeah, no need for a so-called trusted third party or adjudicator. Okay. Professor Schultze, let's go to you. You're essentially the use case of this swarm learning application. Tell us a little bit more about what you do and how you're applying this concept. >> I'm actually by training a physician, although I haven't seen patients for a very long time. I'm interested in bringing new technologies to what we call precision medicine. So, new technologies both from the laboratories and from the computational sciences — we marry them, and that basically enables precision medicine, which is a medicine built on new measurements, many measurements of molecular phenotypes, as we call them. Basically we measure on different levels — for example, the genome, or the genes that are transcribed from the genome. We have thousands of such data points, and we have to make sense out of this. That can only be done by computation. And as we discussed already, one of the hopes for the future is that with the new wave of developments in artificial intelligence and machine learning, we can make more sense out of this huge data that we generate right now in medicine. That's what we're interested in: finding out how we can leverage these new technologies to build new diagnostics and new therapy outcome predictors — so, to know whether the patient benefits from a diagnostic or a therapy or not. That's what we have been doing for the last 10 years. And the most exciting thing I have been through in the last three, four, five years is really when HPE introduced us to swarm learning. >> Okay, and Prasad, you've been helping Professor Schultze actually implement swarm learning for specific use cases that we're going to talk about — COVID — but maybe describe a little bit about your participation in this whole equation. >> Yep, thanks. As Dr. Eng Lim Goh mentioned, we have used blockchain as a backbone to implement the decentralized network. And through that we're enabling a privacy-preserved, decentralized network without having any control points. As the Professor explained, in terms of precision medicine, one of the use cases we are looking at involves blood transcriptomes. Think of different hospitals having different sets of transcriptome data which they cannot share due to the privacy regulations. Now each of those hospitals will train the model on their local data, which is available in that hospital, and share the learnings coming out of that training with the other hospitals. And we repeat that over several cycles to merge all these learnings and then finally get to a global model.
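As a rough sketch of one such cycle — with illustrative names, a rotating merge leader standing in for the blockchain coordination, and a weighted merge that anticipates the skewed hospital data discussed next; this is not HPE's actual API — the loop looks something like this:

```python
import numpy as np

# A rough sketch of one swarm learning cycle as described here: every
# hospital trains locally, only weights travel, and a rotating merge
# leader stands in for the blockchain coordination (no permanent
# central server). All names are illustrative, not HPE's actual API.

def local_training(w, X, y, lr=0.01, epochs=5):
    """Logistic-regression training on one hospital's private data."""
    w = w.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))          # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)          # gradient step
    return w

def merge(weight_sets, counts):
    """Sample-count-weighted average of the shared weights; weighting
    by cohort size is one simple way to soften non-IID skew."""
    total = sum(counts)
    return sum(w * (n / total) for w, n in zip(weight_sets, counts))

rng = np.random.default_rng(0)
hospitals = [(rng.normal(size=(n, 10)), rng.integers(0, 2, n).astype(float))
             for n in (120, 300, 60)]             # deliberately skewed cohorts

w_global = np.zeros(10)
for cycle in range(10):
    leader = cycle % len(hospitals)               # who would merge this round
    local = [local_training(w_global, X, y) for X, y in hospitals]
    w_global = merge(local, [X.shape[0] for X, _ in hospitals])
```

The point to notice is that only `w_global` and the local weight vectors ever cross hospital boundaries; the patient matrices stay where they were collected.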
So, through that we are able to get to a model whose performance is equal to what you would get by collecting all the data into a central repository and training there. And while doing it, we could see there would be multiple kinds of challenges. It's good to do decentralized learning, but what if you have non-IID data? What if there is a dropout in the network connections? What if some of the compute nodes drop out of a cycle, or are not seeing a sufficient amount of data? That's something we tried to build into the swarm learning framework: it will handle the scenarios of having non-IID data — in a simple word, we could call it having biases. An example: one hospital might see, let's say in terms of tumors, a high number of cases, whereas another hospital might have a very low number of cases. So we have implemented some techniques in terms of doing the merging — providing different kinds of weights and tunable parameters — to overcome this set of challenges in swarm learning. >> And Professor Schultze, you've applied this to really try to better understand and attack the COVID pandemic. Can you describe in more detail your goals there and what you've actually done and accomplished? >> Yeah. So, we have actually really done it for COVID. The reason why we were trying to do this already now is that we had generated these transcriptomes from COVID-19 patients ourselves. And we realized that the signature of the disease is so strong and so unique, compared to the other infectious diseases we have looked at in some detail, that we felt the blood transcriptome would be a good starting point to identify patients — but maybe even more important, to identify those with severe disease. If you can identify them early enough, you can basically care for those patients more closely and find particular treatments and therapies for them. And the reason why we could do that is because we also had some other test cases done before. We used the time wisely, with large data sets we had collected beforehand. On those use cases we learned how to apply swarm learning, and we are now basically ready to test it directly with COVID-19. So, this is really a stepwise process — although it was extremely fast, it was still stepwise — guided by data where we had much more knowledge, which was with the leukemias. We had worked on those for years and had collected many data, so we could simulate swarm learning very nicely. And based on all the experience we gained together with Prasad and his team, we could quickly apply that knowledge to the data that are coming now from COVID-19 patients.
>> So, Dr. Goh, it really comes back to how we apply machine intelligence to the data, and this is such an interesting use case. I mean, in the United States we have 50 different states with 50 different policies, different counties. We certainly have differences around the world in terms of how people are approaching this pandemic. And so the data is very rich and varied. Let's talk about that dynamic. >> Yeah. For the listeners or viewers who are new to this, right, the workflow could be: a patient comes in, you take their blood, and you send it through an analysis. DNA is made up of genes, and our genes express, right? They express in two steps: first they transcribe, then they translate. But what we are analyzing is the middle step, the transcription stage. Tens of thousands of these transcripts are produced by the analysis of the blood. The thing is, can we find, among those tens of thousands of items, or biomarkers, a signature that tells us: this is COVID-19, and how serious it is for this patient? Now, the data is enormous, right? For every patient. And then you have a collection of patients in each hospital with a certain demographic, and then you have a number of hospitals around. The point is, how do you get to share all that data in order to have good training of your machine? The issue, of course, is, you know, privacy of data, right? And as such, how do you share that information if privacy restricts you from sharing the data? So in this case, swarm learning shares only the learnings, not the private patient data. We hope this approach will allow all the different hospitals to come together and unite, sharing the learnings and removing biases, so that we have high accuracy in our predictions while at the same time maintaining privacy. >> It's really well explained. And I would like to add, at least for the European Union, that this is extremely important, because the lawmakers and the governments have clearly stated that even under these crisis conditions they will not minimize the rules of the privacy laws — compliance with privacy laws has to stay as high as outside of the pandemic. And I think there's good reason for that: if you lower the bar now, why shouldn't you lower the bar at other times as well? And I think that was a wise decision. If you saw in the medical field how difficult it is to discuss how we share the data fast enough, I think swarm learning is really an amazing solution to that — because this discussion is gone, basically. Now we can discuss how we do learning together, rather than discussing what would be a lengthy procedure to go towards sharing, which is very difficult under the current privacy laws. So, that's why I was so excited when I learned about it in the first place. We can do things faster that otherwise are either not possible or would take forever. And for a crisis, that's key. That's absolutely key. >> And as a byproduct, there's also the fact that all the data stays where it is, at the different hospitals, with no movement. >> Yeah. Yeah. >> Learn locally, but share only the learnings. >> Right. Very important in the EU, of course. Even in the United States, people are debating: what about contact tracing, and using technology and cell phones and smartphones to do that? I don't know what the situation is like in India, but nonetheless, Dr. Goh's point about just sharing the learnings — bubbling them up, trickling just kind of metadata, if you will, back down — protects us. But at the same time, it allows us to iterate and improve the models. And that's a key part of this: the starting point and the conclusions that we draw from the models are going to change. We've seen this with the pandemic — it changes daily, certainly weekly, but even daily. We continuously improve the conclusions and the models, don't we? >> Absolutely, as Dr. Goh explained well. So, we could look at the clinics or the testing centers in remote places, wherever they are. We could collect those data at the time and then run the transcriptome kind of sequencing.
And then, as we learn from these new samples and new pieces of data, all of that local data can participate in swarm learning — not just within a state or a country; it could participate in swarm learning globally — to share everything that is coming up in a new way, and also to implement some kind of continuous learning: picking up the new signals or new insights that come with new sets of data and immediately deploying them back into the inferencing, into the practice of identification. To do this, I think one of the key things we have realized is making it very simple. It's making it simple to convert machine learning models into swarm learning, because we know that our subject matter experts are going to develop these models on their choice of platforms — and also making it simple to integrate into the complete machine learning workflow, from the time of collecting the data and preprocessing, to doing the model training, to putting it into inferencing and looking at performance. We have kept that in mind from the beginning while developing it. So we developed it as pluggable microservices, packaged with containers. The whole library can be delivered as a container, with a kind of decentralized management command and control, which helps to manage the whole swarm network and to start, initiate, and enroll new hospitals or new nodes into the swarm network. At the same time, we also looked at the tasks of the data scientists and tried to make it very, very easy for them to take their existing models and convert them to the swarm learning framework, so that they can enable their models to participate in decentralized learning. So, we have made it a set of callable REST APIs. And I could say that in the examples we are working on with the Professor — either in the case of leukemia or in the COVID kind of things — the neural network model, and we're using a 10-layer neural network there, could be converted into the swarm model with less than 10 lines of code changes. That's the kind of simplicity we are looking at, so that it helps to make it quicker and faster and to get the benefits. >> So, that's the exciting thing here, Dr. Goh: this is not an R&D project. This is something that you're actually implementing in the real world, even though it's a narrow example. And there are so many other examples that I'd love to talk about — but please, you had a comment. >> Yes. The key thing here is that in addition to allowing privacy to be kept at each hospital, you also have the issue of different hospitals having data that's skewed differently. Right? For example, the demographics could be such that this hospital is seeing a lot more younger patients, and another hospital is seeing a lot more older patients. Right? And if you are doing machine learning in isolation, then your machine might be better at recognizing the condition in the younger population but not the older, and vice versa. By using this approach of swarm learning, we have the biases removed, so that both hospitals can detect for the younger and the older population. All right? So this is an important point, right? The ability to remove biases here. And you can see biases in the different hospitals because of the types of cases they see and the demographics. Now, the other point that's very important to re-emphasize is what Prasad mentioned, right?
It's how we made it very easy to implement this. Right? This started out being, for example, each hospital having its own neural network and training on its own. All we do is come in and, as Prasad mentioned, change a few lines of code in the original machine learning model — and now you're part of the collective swarm. That's how easy we wanted it to be to implement, so that we can get, as I like to call it, the hospitals of the world uniting. >> Yeah. >> Without sharing private patient data. >> So, let's double-click on that, Professor. Tell us about sort of your team and how you're taking advantage of this. Dr. Goh just described sort of the simplicity, but what are the skills that you need to take advantage of this? What does your team look like? >> Yeah. So, we actually have a team that goes from physicians to biologists, from medical experts up to computational scientists. We invested early on in having these interdisciplinary research teams so that we can actually span the whole spectrum. So, people know about the medicine, they know about the biological basics, but they also know how to implement such new technology. They are probably a little bit spearheading that, but this is the way to go in the future, and I see many institutions, many other groups, going in this direction, because finally medicine understands that without the computational sciences, without artificial intelligence and machine learning, we will not answer those questions with this large data that we're using. So, we are fine here. But I also realized that when we entered this project, we basically had our model — we had our machine learning model from the leukemias — and it really took almost no effort to get this into the swarm. So, we were really ready to go in a very short time. But I would also like to say — and this goes towards the bias that exists in medicine between different places; Dr. Goh said this very nicely — one aspect is the patients and so on, but there are also the techniques, how we do the clinical assays. We're using different robots, different automation, to do the analysis. And we actually tried to find out what swarm learning does if we introduce such a bias from the prep itself. So, I did the following thing. We know that there are different ways of measuring these transcriptomes, and we simulated that two hospitals had an older technology and a third hospital had a much newer technology — which is good for understanding the biology and the diseases, but the new technology is prone to no longer generating data that can be used to learn and then predict on the old technology. So basically it deteriorates: if you take the new one, build a classifier model, and try it on old data, it doesn't work anymore. That's a very hard challenge. We knew it didn't work anymore in the old way. So we pushed it into swarm learning, and the swarm recognized that and took care of it. It didn't matter anymore, because the results were even better by bringing everything together. I was astonished. I mean, it's absolutely amazing — although we knew about these limitations in that one hospital's data, the swarm basically could deal with it. I think there's more to learn about these advantages. Yeah, and I'm very excited. And it's not only transcriptomes that people do. I hope we can very soon do it with imaging. The DZNE has 10 sites in Germany connected to 10 university hospitals.
There's a lot of imaging data — CT scans and MRIs, radiographs — and this is the next domain in medicine where we would like to apply swarm learning as well. Absolutely. >> Well, it's very exciting being able to bring this to the clinical world and make it a sort of ongoing learning. I mean, you think about, again, coming back to the pandemic: initially we thought putting people on ventilators was the right thing to do; we learned, okay, maybe not so much. The efficacy of vaccines and other therapeutics — it's going to be really interesting to see how those play out. My understanding is that the vaccines coming out of China were built for speed, to get to market fast; it'll be interesting to see if the U.S. maybe tries to build vaccines that are more effective long term. Let's see if that actually occurs — some of those other biases and tests that we can do. That is a very exciting, continuous use case, isn't it? >> Yeah, I think so. Go ahead. >> Yes. In fact, we have another project ongoing to use transcriptome data and other data, like metabolic and cytokine data — all these biomarkers from the blood of volunteers during a clinical trial. The whole idea of looking at all those biomarkers — we're talking tens of thousands of them — is the same thing again: to see if we can streamline clinical trials by looking at that data and training with that data. So again, here you go, right? It's very good that we have many vaccine candidates out there right now; the next long pole in the tent is the clinical trial. And we are working on that also, by applying the same concept — but for clinical trials. >> Right. And then, Prasad, it seems to me that this is a good example of sort of an edge use case. Right? You've got a lot of distributed data. And I know you've spoken in the past about the edge generally — where data lives, moving data back to a sort of centralized model. But of course you don't want to move data if you don't have to: real-time AI inferencing at the edge. So, what are you thinking in terms of other edge use cases where swarm learning can be applied? >> Yeah, that's a great point. We could look at this both in the medical field and in other fields. The Professor just mentioned radiographs — think of using this with medical image data as a scenario in the future. We could have an edge node sitting next to these medical imaging systems, very close, attached to the modality, already built with swarm learning. As the systems produce medical images — it could be an X-ray, a CT scan, or an MRI scan type of thing — the edge node can do the inferencing. And also, with the new sets of data, if it sees some kind of outlier — new images, probably new signals — it could use that new data to initiate another round of swarm learning with all the other involved medical imaging systems across the globe. So, all of this can happen without really sharing any of the raw data outside of the systems — just getting the inferencing, and then making all of these systems come together to build a better model.
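A sketch of what that edge loop might look like — the trigger function, thresholds, and crude novelty test are all hypothetical illustrations, not a real API:

```python
import numpy as np

# A sketch of the edge loop Prasad outlines: a node attached to an
# imaging modality scores each scan locally and, once it has gathered
# enough outliers, asks for a fresh swarm learning round. The trigger,
# thresholds, and novelty test are all hypothetical illustrations.

def request_swarm_round(batch):
    # in a real system this would kick off local training followed by
    # weight sharing; the raw scans themselves still never leave
    print(f"initiating swarm round with {len(batch)} novel samples")

def is_outlier(x, mean, var, z=4.0):
    """Crude novelty check: any feature far outside what's been seen."""
    return bool(np.any(np.abs(x - mean) > z * np.sqrt(var)))

def on_new_scan(x, w, mean, var, pending, batch_size=32):
    score = 1.0 / (1.0 + np.exp(-w @ x))   # inference stays on the edge
    if is_outlier(x, mean, var):
        pending.append(x)                  # hold novelty for retraining
        if len(pending) >= batch_size:
            request_swarm_round(pending)
            pending.clear()
    return score

rng = np.random.default_rng(2)
w, mean, var, pending = np.zeros(8), np.zeros(8), np.ones(8), []
for x in rng.normal(scale=3.0, size=(100, 8)):   # simulated scan features
    on_new_scan(x, w, mean, var, pending)
```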
>> So, the last question. Yeah. >> If I may — we've got to wrap, but I think we first heard about swarm learning, maybe read about it, probably 30 years ago, and then just ignored it and forgot about it. And now here we are today. Blockchain, of course, we first heard about with Bitcoin, and you're seeing all kinds of really interesting examples. But Dr. Goh, start with you: this is really an exciting area, and we're just getting started. Where do you see swarm learning by, let's say, the end of the decade? What are the possibilities? >> Yeah. You could see this being applied in many other industries, right? We've spoken about life sciences and the healthcare industry, but you can imagine the scenario of manufacturing, where a decade from now you have intelligent robots that can learn from watching a craftsman building a product and then replicate it, right? Just by looking, listening, learning. And imagine now you have multiples of these robots, all sharing their learnings across boundaries — across state boundaries, across country boundaries — provided you allow that, without having to share what they are seeing. Right? They can share what they have learned, you see — that's the difference. Without needing to share what they see and hear, they can share what they have learned across all the different robots around the world, all within the community that you allow. You mentioned that timeframe, right? By then, even in manufacturing, you'll get intelligent robots learning from each other. >> Professor, I wonder if, as a practitioner, you could sort of lay out your vision for where you see something like this going in the future. >> I'll stay with the medical field for the time being, although I agree it will be in many other areas. Medicine has two traditions, for sure. One is learning from each other — that's an old tradition in medicine, for thousands of years. But what's interesting, and that's even more so in modern times, is that we have no tradition of sharing data. It's just not really inherent to medicine. So, that's the mindset: yes, learning from each other is fine, but sharing data is not so fine. Swarm learning deals with that — we can still learn from each other, we can help each other by learning, and this time by machine learning. We don't have to actually deal with the data sharing anymore, because the data stays with us. So for me, it's a really perfect situation. Medicine could benefit dramatically from it, because it goes along with the traditions, and that's very often very important for getting adopted. And on top of that, what is also not seen very well in medicine is that there's a hierarchy, in the sense that certain institutions rule others. And swarm learning is exactly helping us there, because it democratizes — onboarding everybody. Even if you're sort of a small entity, or a small institution, or a small hospital, you can become a member in the swarm, and as a member you will be important. And there is no central institution that actually rules everything. This democratization, I really love, I have to say. >> Prasad, we'll give you the final word. I mean, your job is really helping to apply these technologies to solve problems. What's your vision for this? >> Yeah. The Professor mentioned one of the very key points when he said democratization of AI — I'd like to just expand on that a little bit. It has a very profound application. And Dr. Goh mentioned manufacturing.
So, if you look at any field — it could be health science, manufacturing, autonomous vehicles — toward that democratization, and also using the blockchain, we are building a framework to incentivize the people who own certain sets of data to bring the insights from that data to the table for swarm learning. So, we could build some kind of alternative monetization framework, or an incentivization framework, on top of the existing swarm learning stack — which we are working on — to enable the participants to bring their data or insights and then get rewarded accordingly. So eventually, we could completely make this a democratized AI, with a complete monetization and incentivization system built into it, enabling all the parties to seamlessly work together. >> So, I think this is just a fabulous example. We hear a lot in the media about the tech backlash, breaking up big tech, how tech has disrupted our lives. But this is a great example of tech for good — and responsible tech for good. And if you think about this pandemic, if there's one thing that it's taught us, it's that disruptions outside of technology — pandemics or natural disasters or climate change, et cetera — are probably going to be the bigger disruptions than technology. Yet technology is going to help us solve those problems and address those disruptions. Gentlemen, I really appreciate you coming on theCUBE and sharing this great example, and I wish you the best of luck in your endeavors. >> Thank you. >> Thank you. >> Thank you for having me. >> And thank you everybody for watching. This is theCUBE's coverage of HPE Discover 2020, the virtual experience. We'll be right back after this short break. (upbeat music)
Phoebe Goh, NetApp & Paul Stringfellow, Gardner Systems
(electronic music) >> Announcer: Live from Las Vegas, it's theCUBE, covering NetApp Insight 2018. Brought to you by NetApp. >> Welcome back to theCUBE's continuing coverage of NetApp Insight 2018. We are in Mandalay Bay in Las Vegas. I'm Lisa Martin, with Stu Miniman. And we have a couple of guests joining us now from the A-Team — cue the music, right? We've got Phoebe Goh, cloud architect from NetApp, and we've got Paul Stringfellow, one of our CUBE alumni, technical director of Gardner Systems, one of NetApp's partners. Guys, thanks so much for stopping by theCUBE. >> You're welcome. >> In your matching outfits! >> Thank you for having us. >> So first of all, this morning, before the general session started, I saw both of you on camera talking a little bit about the A-Team. For our audience who might not be familiar with that — I know it's been around for five years — Phoebe, talk to us a little bit about the A-Team: who makes it up? Obviously we've got a channel partner here, so it's not just NetAppians, but give our viewers a little bit of an overview of the A-Team. >> NetApp really appreciates our advocates, from channel partners and also from our customers. We really love hearing from them, and we also love giving them back information about what we do and where we're going with our vision and our strategy. So, we have channel partners on the A-Team as well as customers, and technical advisors from NetApp, such as myself. We get together every now and then at events like Insight, and we also bring them to Sunnyvale, where they are given some information about what's coming up with our strategy. >> And this is a small group of maybe about 30 people. Paul, how long have you been part of the A-Team, and what have you learned from some of the other folks that are on that team? >> It's a great question. I've been a part of the team for three years, and it's almost a symbiotic relationship, in that it works both ways. There's lots of value for NetApp in the partnership, in that they get to hear from channel partners on the street about what people actually think of their technology. It also works in that we get to see quite a lot of pre-release information, and it gives us the opportunity to feed back to NetApp directly, from the things that we see out in the channel, about what customers actually want. And we've seen, over the five years of the team, product strategies change and new products come to market because of that direct feedback. Then from our side, when we talk to our customers, there's real value in being able to say that we've got that direct relationship with NetApp — we've got access to their executives and access to their research team. It works really well both ways for us. >> In the keynote this morning, we heard George Kurian talk about digital transformation, and one of those pieces is that hybrid multi-cloud is the de facto IT architecture. Paul, I would love to get your feedback as a channel partner: what does this kind of hybrid multi-cloud mean to your customers, mean to your business?
>> So I think the idea of hybrid is different for a lot of people. In lots of cases, hybrid for some organizations may mean that their entire data center remains on-prem, within their own walls; however, they might be using a software service — an Office 365, a Dropbox. And I don't think that's quite the definition George was talking about this morning when he talked about hybrid cloud. My take on what George talked about with hybrid cloud is that it's about understanding that it exists — understanding that public cloud is a thing, that the Azures, the AWSs, the Googles play a part in the way some organizations are working. That's not necessarily the way your organization wants to work, so understand that it's there, design an architecture that recognizes that, and make sure that if you ever want to use those kinds of services in the future, you'll be able to do so. But it's equally valid to say, actually, public cloud isn't for us. As long as you make that as a decision, and don't just fall into it because you've not really thought about it, that's a perfectly valid strategy. >> I really agree with what you were saying. So often when we talk about hybrid and multi-cloud, we're talking about infrastructure. >> Paul: Yep. >> And there's more than just infrastructure — a thing that I've been saying for a few years is: let's follow the applications, and even more importantly, let's follow the data. I love that we get some international viewpoints here, because sometimes in North America it's like, oh, let's talk only about public cloud, and it seems to be kind of a monolithic thing. Phoebe, I would love to get your viewpoint: what are you hearing from customers when they talk about cloud, what does that mean for them, and how are NetApp and NetApp's channel partners helping them sort through this new future? >> Definitely. Our customers and our channel partners are talking a lot about cloud — creating and adding agility to their business, allowing them to move faster and to be more flexible. And what NetApp is looking to do is really enable that and speed that up, no matter where you are in the globe — whether you're in Australia, or in America, or in Europe — so that you can achieve those business outcomes that you really want. We know that the cloud is going to help us get there, so we really want to help them use the data in the best ways, and use the technology that makes sense for the business, to be able to get to public cloud. >> How are customers embracing that? A lot of the messaging coming out is that NetApp is data-driven, it's the data authority — there's a lot of transformation that NetApp's undergone in its 26-year history. I'd love to get both of your perspectives before we wrap here: how are customers looking to NetApp and its ecosystem partners to help them embrace this hybrid multi-cloud environment in which they live, and to look at NetApp as part of their core cloud strategy, rather than just data management and storage?
>> Actually, I couldn't agree more, I think that what data fabric, what this kind of hybrid cloud model means to our customers, is it opens up a much wider conversation. We're not having a conversation about storage, we're not talking to a partner saying, would you like to buy some NetApp, as a customer, because that can be, that's a yes no, I use something else, I'm not interested in NetApp or I'd love to buy some NetApp. Actually, if we can have a data conversation that talks about how do you want to use this, what are the business outcomes that you'd like to achieve, what is it you are trying to do as a business, let's help data be part of that transformation. >> Guys, thanks so much for stopping by having a quick convo, especially Phoebe since you've been in Vegas for four days already, and your voice is hanging on by a thread. Paul, Phoebe, thanks so much for your time. >> Thank you. >> You're welcome, pleasure, thank you. >> We want to thank you for watching theCUBE, from Las Vegas NetApp Insight 2018, I'm Lisa Martin with Stu Miniman, Stu and I will be right back with our next guest after a short break. (electronic music)
Eng Lim Goh, HPE & Tuomas Sandholm, Strategic Machine Inc. - HPE Discover 2017
>> Announcer: Live from Las Vegas, it's theCUBE, covering HPE Discover 2017, brought to you by Hewlett Packard Enterprise. >> Okay, welcome back everyone. We're live here in Las Vegas for SiliconANGLE's CUBE coverage of HPE Discover 2017. This is our seventh year of covering Discover — now HPE Discover, in its second year. I'm John Furrier, with my co-host Dave Vellante. We've got two great guests — two doctors, PhDs in the house. So Eng Lim Goh, VP and SGI CTO, PhD, and Tuomas Sandholm, Professor of Computer Science at Carnegie Mellon University, who also runs the marketplace lab over there. Welcome to theCUBE, guys — doctors. >> Thank you. >> Thank you. >> So the patient is on the table; it's called machine learning, AI, cloud computing. We're living in a really amazing time. I call it open bar and open source: there are so many new things being contributed to open source, and so much new hardware coming on with HPE, that there's a lot of innovation happening. So I want to get your thoughts first on how you guys are looking at this big trend, where all this new software is coming in along with these new capabilities. What's the vibe, how do you look at this? You must be at Carnegie Mellon thinking, oh, this is an amazing time. Thoughts? >> Yeah, it is an amazing time, and I'm seeing it both on the academic side and the startup side — that, you know, you don't have to invest in your own custom hardware. We are using HPE with the Pittsburgh Supercomputing Center in academia, and using cloud in the startups. So it really makes entry easier, both for academic research and for startups. And on the high end of academic research, you don't have to worry about maintaining and staying up to speed with all of the latest hardware and networking and all that. You know, it kind of... >> Lets you focus on your research. >> Focus on the research, focus on the algorithms, focus on the AI, and the rest is taken care of. >> John: Eng, talk about the supercomputer world that's now here. If you look at the abundance of compute and the intelligent edge, we're seeing genome sequencing done in minutes; the prices are dropping. I mean, high performance computing used to be this magical, special thing that you had to have a lot of money to pay for or get access to. Democratization is pretty amazing — can I just hear your thoughts on what you see happening?
Then you take the next day and feed it in again after 10 thousand tries, what started out as a wild guess becomes an educated guess, and this is how the new way of doing data intensive computing is starting to emerge using machine learning. >> Democratization is a theme I threw that out because I think it truly is happening. But let's get specific now, I mean a lot of science has been, well is climate change real, I mean this is something that is in the news. We see that in today's news cycle around climate change things of that as you mentioned weather. So there's other things, there's other financial models there's other in healthcare, in disease and there's new ways to get at things that were kind of hocus pocus maybe some science, some modeling, forecasting. What are you seeing that's right low hanging fruit right now that's going to impact lives? What key things will HPC impact besides weather? Is healthcare there, where is everyone getting excited? >> I think health and safety immediately right. Health and safety, you mentioned gene sequencing, drug designs, and you also mentioned in gene sequencing and drug design there is also safety in designing of automobiles and aircrafts. These methods have been traditionally using simulation, but more and more now they are thinking while these engines for example, are flying can you collect more data so you can predict when this engine will fail. And also predict say, when will the aircraft lands what sort of maintenance you should be applying on the engine without having to spend some time on the ground, which is unproductive time, that time on the ground diagnosing the problems. You start to see application of data intensive methods increased in order to improve safety and health. >> I think that's good and I agree with that. You could also kind of look at some of the technology perspective as to what kind of AI is going to be next and if you look back over the last five to seven years, deep learning has become a very hot part of machine learning and machine learning is part of AI. So that's really lifted that up. But what's next there is not just classification or prediction, but decision making on top of that. So we'll see AI move up the chain to actual decision making on top of just the basic machine learning. So optimization, things like that. Another category is what we call strategic reasoning. Traditionally in games like chess, or checkers and now Go, people have fallen to AI and now we did this in January in poker as well, after 14 years of research. So now we can actually take real strategic reasoning under imperfect information settings and apply it to various settings like business strategy optimization, automated negotiation, certain areas of finance, cyber security, and so forth. >> Go ahead. >> I'd like to interject, so we are very on it and impressed right. If we look back years ago IBM beat the worlds top chess player right. And that was an expert system and more recently Google Alpha Go beat even a more complex game, Go, and beat humans in that. But what the Professor has done recently is develop an even more complex game in a sense that it is incomplete information, it is poker. You don't know the other party's cards, unlike in the board game you would know right. This is very much real life in business negotiation in auctions, you don't quite know what the other party' thinking. 
>> Democratization is a theme — I threw that out because I think it truly is happening. But let's get specific now. I mean, a lot of the science has been, well, is climate change real — this is something that's in the news; we see that in today's news cycle around climate change, things like that, as you mentioned with weather. So there are other things — other financial models, others in healthcare, in disease — and there are new ways to get at things that were kind of hocus-pocus: maybe some science, some modeling, forecasting. What are you seeing that's ripe, low-hanging fruit right now, that's going to impact lives? What key things will HPC impact besides weather? Is healthcare there? Where is everyone getting excited? >> I think health and safety, immediately, right? Health and safety — you mentioned gene sequencing, drug design, and in gene sequencing and drug design there is also safety, as in the design of automobiles and aircraft. These methods have traditionally been using simulation, but more and more now they are thinking: while these engines, for example, are flying, can you collect more data so you can predict when the engine will fail? And also predict, say, by the time the aircraft lands, what sort of maintenance you should be applying to the engine — without having to spend time on the ground, which is unproductive time, diagnosing the problems. You start to see the application of data-intensive methods increase in order to improve safety and health. >> I think that's good, and I agree with that. You could also look at it from the technology perspective, as to what kind of AI is going to be next. If you look back over the last five to seven years, deep learning has become a very hot part of machine learning, and machine learning is part of AI. So that's really lifted that up. But what's next there is not just classification or prediction, but decision making on top of that. So we'll see AI move up the chain to actual decision making, on top of just the basic machine learning — optimization, things like that. Another category is what we call strategic reasoning. Traditionally, in games like chess, or checkers, and now Go, people have fallen to AI. And now we did this in January in poker as well, after 14 years of research. So now we can actually take real strategic reasoning under imperfect-information settings and apply it to various settings, like business strategy optimization, automated negotiation, certain areas of finance, cyber security, and so forth. >> Go ahead. >> I'd like to interject — we are all very impressed, right? If we look back, years ago IBM beat the world's top chess player, right? And that was an expert system. More recently, Google's AlphaGo took on an even more complex game, Go, and beat humans at that. But what the Professor has done recently is tackle an even more complex game, in the sense that it has incomplete information: it is poker. You don't know the other party's cards, unlike in the board games, where you would know, right? This is very much real life — in business negotiation, in auctions, you don't quite know what the other party is thinking. So I believe now you are looking at ways, I hope, that this poker-playing AI software — which can handle incomplete information, not knowing the other parties, but is still able to play expertly — can be applied in business. >> I want to double down on that — I know Dave's got a question, but I want to just follow this thread through. So the AI — in this case augmented intelligence, not so much artificial, because you're augmenting without the perfect information. It's interesting, because one of the debates in the big data world has been: well, the streaming of all this data is so high-velocity and so high-volume that we don't know what we're missing. Everyone's been trying to get at the perfect information in the streaming of the data. And this is where the machine learning, if I get your point here, can do this meta-reasoning, or this reasoning on top of it — to say, hey, let's not try to solve the world's problems and boil the ocean to understand it all; let's use that as a variable for AI. Did I get that right? >> Kind of, kind of, I would say — in that it's not just a technical barrier to getting the big data, it's also kind of a strategic barrier. Companies — even if I could tell you all of my strategic information, I wouldn't want to. So you have to worry not just about not having all the information, but about whether other guys are explicitly hiding information, misrepresenting — and vice versa, you're taking strategic action as well. Unlike in games like Go or chess, where it's perfect information, you need totally different kinds of algorithms to deal with these imperfect-information games, like negotiation or strategic pricing, where you have to think about the opponent's responses... >> It's a hairy problem. >> ...in advance. >> John: Knowing what you don't know. >> To your point about huge amounts of data: we are talking about looking for a needle in a haystack. But when the data gets so big and the needles get so many, you end up with a haystack of needles. So you need some augmentation to help you deal with it, because the humans would be inundated with the needles themselves.
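For a taste of what regret-based strategic reasoning looks like in code — and only a taste: Libratus-style poker AI layers abstraction, counterfactual regret minimization, and subgame solving on top of ideas like this — here is plain regret matching finding the equilibrium of rock-paper-scissors:

```python
import numpy as np

# Only a taste of regret-based strategic reasoning: Libratus layered
# abstraction, counterfactual regret minimization, and subgame solving
# on top of ideas like this. Here, plain regret matching in self-play
# finds the equilibrium of rock-paper-scissors, a tiny zero-sum game.

PAYOFF = np.array([[ 0, -1,  1],    # rows/cols: rock, paper, scissors
                   [ 1,  0, -1],
                   [-1,  1,  0]])

def strategy(regret):
    """Regret matching: mix in proportion to positive regret."""
    pos = np.maximum(regret, 0.0)
    return pos / pos.sum() if pos.sum() > 0 else np.full(3, 1 / 3)

rng = np.random.default_rng(7)
regret_a, regret_b, avg_a = np.zeros(3), np.zeros(3), np.zeros(3)
for _ in range(20_000):
    sa, sb = strategy(regret_a), strategy(regret_b)
    a, b = rng.choice(3, p=sa), rng.choice(3, p=sb)
    regret_a += PAYOFF[:, b] - PAYOFF[a, b]   # what A wishes it had played
    regret_b += PAYOFF[:, a] - PAYOFF[b, a]   # symmetric game, same matrix
    avg_a += sa

print(avg_a / 20_000)   # drifts toward the (1/3, 1/3, 1/3) equilibrium
```

Both players' average strategies drift toward the uniform equilibrium; in poker the same self-play principle operates over astronomically many information sets.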
>> So is HPE sort of enabling AI, or is AI driving HPC? >> I think it's both. >> Both, yeah. >> Eng: Yeah, that's right, both together. In fact, AI is driving HPC, because it is a new way of using that supercomputing power — not just doing compute-intensive calculation, but also doing data-intensive AI, machine learning. And then we are also driving AI, because our customers are now asking the same question: how do I transition from a compute-intensive approach to a data-intensive one? This is where we come in. >> What are your thoughts on how this affects society, individuals — particularly students coming in? You mentioned Gary Kasparov losing to the IBM supercomputer. But he didn't stop there — he said, I'm going to beat the supercomputer, and he got supercomputers and humans together, and now holds a contest every year. Everybody talks about the impact of machines replacing humans, and that's always happened. But what do you guys see? Where's the future of work, of creativity for young people, and of the economy? What does this all mean? >> You want to go first or second? >> You go ahead first. (Eng and Tuomas laughing) >> They love the fighting. >> This is a fun topic, yeah. There's a lot of worry about AI, of course. But I think of AI as a tool, much like a hammer or a saw. So it's going to make human lives better, and it's already making human lives better. A lot of people don't even understand all the things that already have AI in them that are helping them out. There's this worry that there's going to be a super-species of AI that's going to take over humans. I don't think so — I don't think there's any demand for a super-species of AI. Like a hammer and a saw: a hammer and a saw are better than a hammersaw. So I actually think of AI as being separate tools for separate applications, and that is very important for mankind, and also for nations and the world, in the future. One example is our work on kidney exchange. We run the nationwide kidney exchange for the United Network for Organ Sharing, which saves hundreds of lives. This is an example that not only saves lives, but makes better decisions than humans can. >> In terms of kidney candidates, timing, all of that? >> That's a long story, but basically, when you have willing but incompatible live donors — incompatible with their patients — they can swap their donors: pair A gives to pair B, gives to pair C, gives back to pair A, for example. And we also co-invented this idea of chains, where an altruist donor creates a whole chain through our network. And then there's the question of which combination of cycles and chains is the best solution. >> John: And no manual involvement — your machines take over the heavy lifting? >> It's hard, because the number of possible solutions is bigger than the number of atoms in the universe. So you have to have optimization AI actually make the decisions. So now our AI makes these decisions twice a week for the country — for 66% of the transplant centers in the country, twice a week. >> Dr. Goh, would you add anything to the societal impact of AI? >> Yes, absolutely — to add to the point about the saw and hammer: that's why these AI systems today are very specific. That's why some call them artificial specific intelligence, not general intelligence. Now, whether a hundred years from now you take a hundred of these specific intelligences, combine them, and get an emergent property of general intelligence — that's something else. But for now, what they do is help the analyst, the human, the decision maker. And more and more you will see that, as you train these models, they make a lot of correct decisions. But ultimately there's a difference between a correct decision and, I believe, a right decision. Therefore, there always needs to be a human supervisor there to ultimately make the right decision. Of course, he will listen to the machine learning algorithm suggesting the correct answer, but ultimately the human values have to be applied to decide whether society accepts this decision. >> All models are wrong; some are useful. >> So on this, there are two benefits of AI. One is that it saves time and effort — labor savings, automation. The other is better decision making. We're seeing better decision making now become the more important part, instead of just labor savings or what have you. We're seeing that in the kidney exchange, and now with strategic reasoning: for the first time we can do better strategic reasoning than the best humans in imperfect-information settings. Now it becomes almost a competitive need. You have to have, what I call, strategic augmentation as a business to be competitive.
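A toy version of that cycles decision, on a five-pair instance small enough to brute-force — the real exchange relies on integer programming and far cleverer search, and the compatibility data here is invented:

```python
from itertools import combinations, permutations

# A toy of the matching Sandholm describes: among incompatible
# patient-donor pairs, pick donor-swap cycles (A gives to B, B to C,
# C back to A) covering as many patients as possible. The real UNOS
# exchange solves this with integer programming over an astronomical
# space of cycles and chains; this brute force only shows the idea.

# donor of pair i can give to the patients of these pairs (invented data)
compatible = {0: {1}, 1: {2}, 2: {0, 3}, 3: {4}, 4: {3}}

def is_cycle(c):
    return all(c[(k + 1) % len(c)] in compatible[c[k]] for k in range(len(c)))

# all feasible 2- and 3-cycles (surgical logistics cap cycle length)
cycles = [c for n in (2, 3) for c in permutations(compatible, n)
          if c[0] == min(c) and is_cycle(c)]

best, most = (), 0
for r in range(len(cycles) + 1):              # exhaustive on this tiny case
    for chosen in combinations(cycles, r):
        covered = [p for c in chosen for p in c]
        if len(covered) == len(set(covered)) and len(covered) > most:
            best, most = chosen, len(covered)

print(best, "->", most, "transplants")        # ((0, 1, 2), (3, 4)) -> 5
```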
>> Dr. Goh, would you add anything to the societal impact of AI? >> Yes, absolutely, on Tuomas's point about the saw and hammer. That's why these AI systems today are very specific; that's why some call them artificial specific intelligence, not general intelligence. Now, whether a hundred years from now you take a hundred of these specific intelligences and combine them, whether you get an emergent property of general intelligence, that's something else. But for now, what they do is help the analyst, the human, the decision maker, and more and more you will see that as you train these models, they make a lot of correct decisions. But ultimately there's a difference between a correct decision and, I believe, a right decision. Therefore, there always needs to be a human supervisor there to ultimately make the right decision. Of course, he will listen to the machine learning algorithm suggesting the correct answer, but ultimately human values have to be applied to decide whether society accepts this decision. >> All models are wrong, some are useful. >> So on this, there are two benefits of AI. One is that it saves time and effort, which is labor savings, automation. The other is better decision making. We're seeing better decision making now become the more important part, instead of just labor savings or what have you. We're seeing that in the kidney exchange, and now with strategic reasoning: for the first time we can do better strategic reasoning than the best humans in imperfect-information settings. Now it becomes almost a competitive need. You have to have, what I call, strategic augmentation as a business to be competitive. >> I want to get your final thoughts before we end the segment, this is more of a sharing component. A lot of young folks are coming into computer science and related sciences, and they don't need to be computer science majors per se, but they have all the benefits of this goodness we're talking about here. If both of you could share your opinions and thoughts in reaction to the trend: the question we get all the time is what should young people be thinking about if they're going to be modeling and simulating? A lot of new data scientists are coming in, some more practitioner-oriented, some more hardcore. As this evolution of simulation and modeling at scale changes, what should they know, and what should the best practices be for learning and applying it? >> For me, the key thing is to be comfortable using tools. And for that, I think the young chaps of the world, as they come out of school, are already very comfortable, so I'm actually less worried. It will be a new set of tools, these intelligent tools; leverage them. If you look at the entire world as a single system, what we need to do is move our leveraging of tools up to a level where we become an even more productive society, rather than worrying about jobs going to AI, though of course we must be aware of that and adapt to it. We should move ourselves up to leverage AI to be an even more productive world, and then hopefully distribute that wealth to the entire human race, so everyone becomes more comfortable with the AI. >> Tuomas, your thoughts? >> I think that people should be ready for the unknown, so you've got to be flexible in your education. Get the basics right, because those basics don't change. You know, math, science, get that stuff solid, and then be ready to, instead of thinking I'm going to be this in my career, think I'm going to be this first and then maybe something else, I don't know yet, even. >> John: Don't memorize for the test you don't know you're going to take yet, be more adaptive. >> Yes, creativity is very important, and adaptability, and people should start thinking about that at a young age. >> Doctors, thank you so much for sharing your input. What a great world we live in right now. A lot of opportunities, a lot of challenges that are opportunities to solve with high performance computing, AI, and whatnot. Thanks so much for sharing. This is theCUBE bringing you all the best coverage from HPE Discover. I'm John Furrier with Dave Vellante, we'll be back with more live coverage after this short break. Three days of wall-to-wall live coverage. We'll be right back. >> Thanks for having us.
Drug Discovery and How AI Makes a Difference Panel | Exascale Day
>> Hello everyone. On today's panel, the theme is drug discovery and how Artificial Intelligence can make a difference. On the panel today, we are honored to have Dr. Ryan Yates, principal scientist at The National Center for Natural Products Research, with a focus on botanicals, specifically pharmacokinetics, which is essentially how a drug changes over time in our body, and pharmacodynamics, which is essentially how a drug affects our body. Of particular interest to him is the use of AI in preclinical screening models to identify chemical combinations that can target chronic inflammatory processes such as fatty liver disease, cognitive impairment, and aging. Welcome, Ryan. Thank you for coming. >> Good morning. Thank you for having me. >> Our other distinguished panelist is Dr. Rangan Sukumar, our very own distinguished technologist at the CTO office for High Performance Computing and Artificial Intelligence, with a PhD in AI and 70 publications spanning drug discovery, autonomous vehicles, and social network analysis. Hey Rangan, welcome. Thank you for sparing the time. We also have our distinguished Chris Davidson. He is leader of our HPC and AI Application and Performance Engineering team. His job is to tune and benchmark applications, particularly in weather, energy, financial services, and life sciences. Of particular interest to him is life sciences; he spent 10 years in biotech and medical diagnostics. Hi Chris, welcome. Thank you for coming. >> Nice to see you. >> Well, let's start with you, Chris. You regularly interface with pharmaceutical companies and also worked on the COVID-19 White House Consortium. Let's kick this off: tell us a little bit about your engagement in the drug discovery process. >> Right, and that's a good question. I think really setting the framework for what we're talking about here is to understand what the drug discovery process is. It can be broken down into, I would say, four different areas: the research and development space, the preclinical studies space, clinical trials, and regulatory review. And if you're lucky, hopefully approval. Traditionally this is a slow, arduous process; it costs a lot of money and there's a high amount of error. However, this process by its very nature is highly iterative and has just huge amounts of data, right, it's very data intensive, and it's these characteristics that make this process a great target for new approaches and different ways of doing things. Right, so for the sake of discussion, right, go ahead. >> Oh yes, you mentioned data intensive, which brings to mind Artificial Intelligence. So is Artificial Intelligence making the difference here in this process, is that so?
>> Right, and some of those novel approaches are actually based on Artificial Intelligence, whether it's deep learning, machine learning, et cetera. A prime example: let's just say, for the sake of discussion, there's a brand new virus that causes flu-like symptoms, which shall not be named. If we focus on the R and D phase, our goal is really to identify a target for the treatment and then screen compounds against it to see which ones we take forward. To this end, technologies like cryogenic electron microscopy, cryo-EM, a form of microscopy, can provide us a near-atomic biomolecular map of the samples that we're studying, whether that's a virus, a microbe, the cell it's attaching to, and so on. AI, for instance, has been used in the particle-picking aspect of this process. When you take all these images, there are only certain particles that we want to take and study, whether they have good resolution or not, whether they're in the field of the frame, and image recognition is a huge part of this; it's massive amounts of data, and AI can very easily be used to approach that. Then with docking, you can take the biomolecular maps obtained from cryo-electron microscopy, input them into the docking application, and run multiple iterations to figure out which compound will give you the best fit. AI again, right: this is an iterative process, it's extremely data intensive, and it's an easy way to apply AI and get that best fit, doing something that, done manually, would take humans, or traditional computing, a very long time to do.
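To make the iterative "best fit" search concrete, here is a heavily simplified sketch of a docking loop: generate many candidate poses of a ligand and keep the best-scoring one. The scoring function and pocket location are invented stand-ins; real docking engines evaluate physics-based interaction energies against actual structures.

```python
import math
import random

def toy_score(x, y, z, angle):
    # Stand-in for an interaction-energy score (lower = better fit);
    # pretend the binding pocket sits at (1, 2, 3) with a preferred orientation.
    return math.dist((x, y, z), (1.0, 2.0, 3.0)) + abs(math.sin(angle - 0.5))

best_pose, best_score = None, float("inf")
for _ in range(100_000):  # iterate over random candidate poses
    pose = (random.uniform(-5, 5), random.uniform(-5, 5),
            random.uniform(-5, 5), random.uniform(0, 2 * math.pi))
    score = toy_score(*pose)
    if score < best_score:
        best_pose, best_score = pose, score

print(f"best score {best_score:.3f} at pose {best_pose}")
```

In a real pipeline this inner loop runs per compound across a whole library, which is where the data-intensive, highly parallel character Chris mentions comes from, and where a learned model can stand in for brute-force scoring.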
>> Ryan, you work at the NCNPR, very exciting; after all, at some point in history just about all drugs were from natural products, so it's great to have you here today. Please tell us a little bit about your work with the pharmaceutical companies, especially since it is often drug cocktails, or what they call polypharmacology, that are the answer to complete drug therapy. Please tell us a bit more about your work there. >> Yeah, thank you again for having me here this morning, Dr. Goh, it's a pleasure to be here. As you said, I'm from the National Center for Natural Products Research, you'll hear me refer to it as the NCNPR, here in Oxford, Mississippi on the Ole Miss campus, a beautiful setting here in the South. As you said, historically, what the drug discovery process has been, and it's really not just a drug discovery process but a therapy process, traditional medicine, is that we've looked at natural products from medicinal plants, in these extracts. So where I'd like to begin is talking about the assets that we have here at the NCNPR. One of those prime, unique assets is our medicinal plant repository, which comprises approximately 15,000 different medicinal plants. What that allows us to do is screen and mine that repository for activities: whether you have a disease of interest or a target of interest, you can use this medicinal plant repository to look for actives, in this case active plants. It's really important in today's environment of drug discovery to understand what the actives are in these different medicinal plants, which leads me to the second unique asset here at the NCNPR, and that is what I'll call a plant deconstruction laboratory. Without going into great detail, what that allows us to do, through a high-throughput workstation, is to facilitate rapid isolation and identification of phytochemicals in these different medicinal plants. Things that have historically taken us weeks and sometimes months, think acetylsalicylic acid from salicylic acid as a pain reliever in willow bark, or Taxol as an anti-cancer drug, we can now do with this system in a matter of days or weeks. So now we're talking about going from activity in a plant extract down to phytochemical characterization on a timescale that starts to make sense in modern drug discovery. Now, if you look at these phytochemicals and ask who is interested in them and why: traditional pharmaceutical companies, which I've been working with for over 25 years now, have historically used these natural products as starting points for new drugs. In other words, take this phytochemical and make synthetic chemical modifications in order to achieve a potential drug. But in the context of natural products, unlike the pharmaceutical realm, there is oftentimes a big knowledge gap between a disease and a plant; in other words, I have a plant that has activity, but connecting those dots has been really laborious and time consuming. It took us probably 50 years to go from salicylic acid in willow bark to synthesized acetylsalicylic acid, or aspirin; that just doesn't work in today's environment. So, casting about trying to figure out how to expedite that process, about four years ago I read a really fascinating article in the Los Angeles Times about my colleague and business partner, Dr. Rangan Sukumar, describing all the interesting things he was doing in the area of Artificial Intelligence. And one of my favorite parts of this story is that, basically unannounced, I arrived at his doorstep in Oak Ridge, he was working at Oak Ridge National Labs at the time, and I introduced myself; he didn't know what was coming, didn't know who I was, and I said, hey, you don't know me, you don't know why I'm here, but let me tell you what I want to do with your system. That kicked off a very fruitful collaboration and friendship over the last four years using Artificial Intelligence, and it's culminated most recently in our COVID-19 project, collaborative research between the NCNPR and HP in this case. >> From what I can understand also, as Chris has mentioned, this is highly iterative, especially with these combination mixtures of chemicals in plants that could affect a disease. We need to put in the effort to figure out what the active components are that produce the effect, the combination, in a layman's way of understanding it, and it is therefore iterative and highly data intensive.
And I can see why Rangan can play a hugely significant role here. Rangan, thank you for joining us; it's a nice segue to bring you in, given your work with Ryan over so many years now. I'm also quite interested in knowing a little about how it developed from the first time you met, and the things you worked on together that culminated in the progress at the advanced level today. Please tell us a little bit about that history and also the current work, Rangan. >> So, Ryan, like he mentioned, walked into my office about four years ago and he was like, hey, I'm working on this Omega-3 fatty acid, what can your system tell me about this Omega-3 fatty acid? And I didn't even know how to spell Omega-3 fatty acid; that's the disconnect between the technologist and the pharmacologist, they have terms of their own. Since then we've come a long way. I think I understand his terminologies now, and he understands when I throw out words like knowledge graphs and PageRank and all kinds of weird stuff that he's probably never heard in his life before. So it's been a journey across different domains and terminologies, learning each other's expertise while working together on a collaborative project. I think the core of what Ryan's work and collaboration has led me to understand is what happens with the drug discovery process. When we think about the discovery itself, we're looking at companies that are trying to accelerate the process to market; an average drug takes 12 years to get to market through the process Chris just mentioned. And so companies are trying to adopt in silico simulation and in silico modeling techniques into what was predominantly an in vitro, in vivo environment. The in silico techniques could include molecular docking, Artificial Intelligence, and other data-driven discovery methods, and the essential component of all these discovery workflows is the ability to augment human experts, assisting them with what computers do really, really well. In terms of what we've done, as examples: Ryan walks in and asks me a bunch of questions, and a few come to mind immediately. The first is, hey, you are an Artificial Intelligence expert, can you sift through a database of molecules, the 15,000 compounds that he described, to prioritize a few for the next lab experiments? So that's question number one.
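As a hedged sketch of that first ask, the "sift and prioritize" step, here is one way to rank hypothetical compounds using Lipinski's rule of five, a standard quick screen for drug-likeness. The compound names and property values are invented; a real prioritization would layer on many more computed and learned properties.

```python
compounds = [
    # name, molecular weight, logP, H-bond donors, H-bond acceptors (hypothetical values)
    {"name": "cmpd-001", "mw": 342.4, "logp": 2.1, "donors": 2, "acceptors": 5},
    {"name": "cmpd-002", "mw": 612.8, "logp": 6.3, "donors": 4, "acceptors": 11},
    {"name": "cmpd-003", "mw": 287.3, "logp": 1.4, "donors": 1, "acceptors": 4},
]

def lipinski_violations(c):
    # One point per violated rule: MW <= 500, logP <= 5, donors <= 5, acceptors <= 10.
    return sum([c["mw"] > 500, c["logp"] > 5, c["donors"] > 5, c["acceptors"] > 10])

# Prioritize compounds with the fewest violations for the next lab experiment.
for c in sorted(compounds, key=lipinski_violations):
    print(c["name"], lipinski_violations(c))
```

The output is a short list for the wet lab, exactly the kind of augmentation being described.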
The second question: he's come back into my office and asked, hey, there are 30 million publications in PubMed and I don't have the time to read everything; can you create an Artificial Intelligence system that, once I've picked these few molecules, will tell me everything about the molecule, or everything about the virus, the unknown virus that shows up? Just trying to understand ways in which he can augment his expertise. And then the third question, which I think he described better than I'm going to, was: how can technology connect these dots? Typically the answer to a drug discovery problem doesn't sit in one database; he has to think about UniProt protein data, he has to think about phytochemical and cheminformatics properties, and so forth. Then the phytochemical interactions he talked about are probably in yet another database. So when he is trying to answer a question, specifically in the context of the unknown virus that showed up late last year, the question was, hey, do we know what happened in this particular virus compared to all the previous viruses? Do we know of any substructure that was studied in a different disease that's part of this unknown virus, and can I use that information to go mine these databases and find out whether those interactions can be used as a repurposing hook, say, this drug interacts with a subsequence of a known virus that also seems to be part of this new virus, right? So, to be able to connect those dots. I think the abstraction we are learning from working with pharma companies is that this drug discovery process is complex, it's iterative, and it's a sequence of needle-in-the-haystack search problems. One day, Ryan would be like, hey, I need to match genomes, I need to match protein sequences between two different viruses. Another day it would be, I need to sift through a database of potential compounds and identify side effects and whatnot. Another day it could be, hey, I need to design a new molecule that never existed in the world before, and I'll figure out how to synthesize it later on, but I need a completely new molecule for patentability reasons. So it goes through the entire spectrum. And I think where HP has differentiated, multiple times even in recent weeks, is that the technology infusion into drug discovery leads to several aha moments. And aha moments typically happen in a matter of a few seconds, not the hours, days, and months that Ryan would otherwise have to laboriously work through. What we've learned is that pharma researchers love their aha moments, and it leads to sound, valid, well-founded hypotheses. Isn't that true, Ryan? >> Absolutely. Absolutely.
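As a toy illustration of the sequence-matching task just mentioned, comparing protein sequences between two viruses, here is a k-mer overlap sketch. The sequences are made-up strings; real pipelines use alignment tools such as BLAST over genuine sequence data.

```python
def kmers(sequence, k=4):
    # All overlapping substrings of length k.
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

known_virus = "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSF"   # toy amino-acid string
novel_virus = "MFVFLVLLPLVSSQCVNLITRTQLPPSYTNSF"   # toy variant

shared = kmers(known_virus) & kmers(novel_virus)
similarity = len(shared) / len(kmers(known_virus) | kmers(novel_virus))
print(f"{len(shared)} shared 4-mers, Jaccard similarity {similarity:.2f}")
```

Shared subsequences like these are what let a previously studied drug-target interaction become a repurposing hypothesis for the new virus.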
>> Yeah, at some point I would like to peek at the list of your aha moments; perhaps there's something quite interesting in there for other industries too, but we'll do that another time. Chris, with your regular work with pharmaceutical companies, especially the big pharmas, do you see botanicals being talked about more and more there? >> Yeah, we do, right. Looking at biosimilars and drugs that are already in existence is an important point, and Dr. Yates and Rangan, with your work with databases this is something important to bring up: much of the drug discovery in today's world isn't from going out and finding a brand new molecule per se. It's really looking at all the different databases, all the different compounds that already exist, and sifting through those. Of course, data is mined, and it is gold essentially, so a lot of companies don't want to share their data. But a lot of those botanical data sets are actually open to the public in many cases, and people are wanting more collaborative efforts around those databases, so it's really interesting to see that being picked up more and more. >> Mm, well, and Ryan, that's where the NCNPR hosts many of those datasets. And it's interesting to me: you were describing the traditional way of drug discovery, where you have a target and a compound that can affect that target, very, very specific. But from a botanical point of view, you really say, for example, I have an extract from a plant that contains a combination of chemicals, and somehow it affects this disease, but then you have to reverse engineer what those chemicals are and which are the active ones. Is that very much the issue, the work that has to be put in for botanicals in this area? >> Yes, Dr. Goh, you hit it exactly. >> Now I can understand why it's highly iterative and data intensive, and perhaps that's why, Rangan, you're highly valuable here. So tell us about the challenge, the many-to-many intersection: trying to find what the targets are, given these botanicals that seem to affect the disease, what methods do you use in AI to help with this? >> Fantastic question. I'm going to go a little bit deeper and speak like Ryan in terminology, but here we go. Going back to the start of our conversation: let's say we have a database of molecules on one side, and then we've got the database of potential targets in a particular, could be a virus, could be a bacterium, could be whatever, disease target that you've identified. >> On this process: for example, on a virus you can have a number of targets on the virus itself; some are on the spike protein, some are on the other proteins on the surface, so there are several different targets on a virus itself. A lot of people focus on the spike protein, but there are other targets too on that virus, correct?
>> That is exactly right. So, for example, in the work that we did with Ryan, we realized that the COVID-19 protein sequence has a significant overlap with the previous SARS-CoV-1 virus; not only that, but it overlaps with MERS, and with other bad coronaviruses that were studied before, and so forth. Knowing that, and it's actually broken down into multiple, and Ryan, I'm going to steal your words, non-structural proteins, envelope proteins, S proteins, there's a whole substructure that you can associate an amino acid sequence with. So on the one hand you have different targets, and in fact, since we did the work, it's up to 160 different targets even on the COVID-19 side. And then you have, say, around 36 to 37 million molecules that are potentially synthesizable, and you try to figure out which one of those, or which few of those, actually maps to which one of these targets and actually has the mechanism of action Ryan's looking for, one that will inhibit the symptoms in a human body. So that's the challenge there. The techniques we can unroll go back to how much we know about the target and how much we know about the molecule. If you start a problem with, I don't know anything about the molecule and I don't know anything about the target, you go with the traditional approaches of docking and molecular dynamics simulations and whatnot. But then, if you've done so much docking before on the same database for different targets, you've learned some new things about the ligands, the molecules Ryan's talking about, that can predict potential targets. So can you use that information from previous protein interactions, previous binding to known existing targets, with some of the structures and so forth, to build a model that captures the essence of what we learned from the docking before? That's the second level of how we infuse Artificial Intelligence. The third level is to say, okay, I can do this for a database of molecules, but what if the protein-protein interactions are spread all over the literature, studied for millions of other viruses? How do I connect the dots across different mechanisms of action too? This is where the knowledge graph component that Ryan was talking about comes in. We've put together a database of about 150 billion medical facts from the literature, so that Ryan is able to connect the dots and say, okay, I'm starting with this molecule; what interactions do I know about it? Is there a protein-drug interaction that affects the mechanistic pathway for the symptoms the disease is causing? And then he can go and figure out which protein-protein interaction in the virus could potentially be affected by this drug, so that inhibiting certain activities would stop the progression of the disease. So, like I said, your menu of options is going to depend on how much you know about the target, how much you know about the drug database you have, and how much information you can leverage from previous research as you go down this pipeline. In that sense, we mix and match different methods, and we've actually found that mixing and matching different methods produces better synergies for people like Ryan.
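A toy sketch of the "connect the dots" query such a knowledge graph supports: mined facts become edges, and a repurposing hypothesis is a path from a drug to a disease symptom. Every fact below is invented for illustration; the real graph holds billions of mined statements.

```python
from collections import deque

facts = [  # (subject, relation, object) triples, all hypothetical
    ("drugX", "binds", "protease"),
    ("protease", "part_of", "virusY"),
    ("virusY", "causes", "inflammation"),
    ("drugZ", "binds", "kinase"),
]

graph = {}
for subject, relation, obj in facts:
    graph.setdefault(subject, []).append((relation, obj))

def find_path(start, goal):
    # Breadth-first search over the fact graph for a connecting chain.
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [f"-{relation}->", neighbor]))
    return None

print(find_path("drugX", "inflammation"))
# ['drugX', '-binds->', 'protease', '-part_of->', 'virusY', '-causes->', 'inflammation']
```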
>> Well, synergy, I think, is a really important concept, Rangan, additivity, synergy, however you want to cast it. But it goes back to your initial question, Dr. Goh, which is this idea of polypharmacology: historically, with traditional medicines, there's more than one active, more than one network that's impacted. Remember how I put you on both ends of the spectrum: from the traditional approach, where we really don't know much about the target-ligand interaction, to the completely reductionist side of it, where all we're focused on is a single molecule interacting with a target. Where I'm going with this is that, interestingly enough, pharma has started to migrate back toward the middle. What I mean by that is we have this concept of polypharmacology, and a regulatory pathway of so-called fixed drug combinations. So over the last 20 years you start to see pharmaceutical companies taking known, approved drugs and putting them in different combinations to impact different diseases. And I think there's a really unique opportunity here for Artificial Intelligence, or, as Rangan has taught me, Augmented Intelligence, to give insight into how to combine those approved drugs to come up with unique indications. There is the patentability question, getting back to how it becomes commercially viable for entities like pharmaceutical companies, but at the end of the day what's most interesting to me is that movement back toward the complex mixture, the fixed drug combination, as opposed to the single drug entity, single target approach. I think that opens up some really neat avenues for us. As far as the expansion and applicability of Artificial Intelligence, I'd like to talk briefly about one other aspect: what Rangan and I have talked about is how we take this concept of an active phytochemical and work backwards. In other words, let's say you identify a phytochemical from an in silico screening process, as was done for COVID-19; one of the first publications out of a group, Dr. Jeremy Smith's group at Oak Ridge National Lab, identified a natural product as one of the interesting actives. That raises the question for our botanical guy: okay, where in nature do we find that phytochemical? What plants do I go after to try and source botanical drugs to achieve that particular endpoint? What Rangan's system allows us to do is say, okay, let's take this phytochemical, in this case a flavanone called eriodictyol, and ask, where else in nature is this found? That's a trivial question for an Artificial Intelligence system. But for a guy like me, left to my own devices without AI, I'd spend weeks combing the literature. >> Wow. So this is brilliant, I've learned something here today. If you find a chemical that actually affects and addresses a disease, you can try to go the reverse way and figure out which botanicals can give you those chemicals, as opposed to trying to synthesize them. >> Well, there's that, and there's the other thing, I'm going to steal Rangan's thunder here. He always teaches me: Ryan, don't forget, everything we talk about has properties; plants have properties, chemicals have properties, et cetera. It's really understanding those properties and using them to make those connections, those edges, those interfaces. And so, yes, we can take something like eriodictyol, that example I gave before, and say, okay, based upon the properties of eriodictyol, tell me other phytochemicals, other flavonoids in this case, the phytochemical class that eriodictyol is part of, that match that profile, that have the same properties. It might be more economically viable; in other words, this particular phytochemical is found in a unique Himalayan plant that I've never been able to source, but can we find something similar, or the same thing, growing in, say, a bush found all throughout the Southeast, for example?
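A minimal sketch of that property-profile matching, assuming we represent each phytochemical as a set of features and rank alternatives by Tanimoto similarity to a query such as eriodictyol. The feature sets below are invented placeholders; real work would compute cheminformatics fingerprints from actual structures.

```python
def tanimoto(a, b):
    # |intersection| / |union| of the two feature sets.
    return len(a & b) / len(a | b) if (a | b) else 0.0

# Hypothetical feature profiles, not real chemical annotations.
query = {"flavanone_core", "3_hydroxyls", "ring_B_catechol", "mw_250_300"}
candidates = {
    "hesperetin": {"flavanone_core", "3_hydroxyls", "methoxy", "mw_250_300"},
    "naringenin": {"flavanone_core", "3_hydroxyls", "mw_250_300"},
    "quercetin":  {"flavonol_core", "5_hydroxyls", "ring_B_catechol"},
}

# Rank candidates by similarity to the query profile, most similar first.
for name, features in sorted(candidates.items(),
                             key=lambda kv: tanimoto(query, kv[1]), reverse=True):
    print(f"{name}: {tanimoto(query, features):.2f}")
```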
>> Wow. So, Chris, on the pharmaceutical companies: are they looking at this approach to developing drugs? >> Yeah, absolutely, Dr. Goh, really to what Dr. Yates is talking about: it doesn't help us if we find a plant and that plant lives only on one mountain, on the north side, in the Himalayas; we're never going to be able to create enough of the drug to manufacture and provide to the masses, assuming the disease is widespread or affects a large enough portion of the population. So it's about understanding not only where that botanical or that compound is found, but also the chemical nature of the interaction and the physics of it, which aspect of the compound affects the binding site and actually does the work, if you will, and then being able to make that at scale. If you go to these pharmaceutical companies today, many of them look like breweries, to be honest with you: it's large scale, large vats, everybody's in clean rooms, and they're making the microbes do the work for them, or they have these unique processes, right. So. >> So they're not brewing beer, okay, but drugs instead. (Christopher laughs) >> Not quite, although there are pharmaceutical companies out there that have had a foray into the brewery business, and vice versa, so. >> We should visit one of those, yeah. (chuckles) Right, so what's next? You've described to us the process and how you developed your relationship with Dr. Yates over the years, five years, was it? Culminating in today's many-to-many fast screening methods. What do you think would be the next exciting things you would do, other than letting me peek at your aha moments? What would you say are the next exciting steps you're hoping to take? >> Thinking long term, this is where Ryan and I are working on a long-term project around the fact that we don't know botanicals as well as we know the synthetic molecules. This is a story inspired by Simon Sinek's "Infinite Game" book: if the human population is to survive for a long time, which we've done so far with natural products, we are going to need natural products. So what can we do to help organizations like the NCNPR to catalog the genomes of natural products, to understand their evolution as we go, and to map that to the drugs, and so forth. The vision is huge; it's not something we want to do as a one-off project and then go away. And in the process, just like you are learning today, Dr. Goh, I'm going to be learning quite a bit, having fun with life. So, Ryan, what do you think? >> Ryan, we're learning from you. >> So, my paternal grandfather lived to be 104 years of age. I've got a few years to get there, but back to the "Infinite Game" concept that Rangan mentioned, he and I discuss that quite frequently, and I'd like to throw out a vision for you that's well beyond the sort of time horizon that we have as humans. It's this: our current strategy, and it's understandable, is really treatment centric. In other words, we have a disease, we develop a treatment for that disease. But we all recognize, whether you're a healthcare practitioner, a scientist, or a business person, whatever your occupation, that prevention, the old ounce of prevention is worth a pound of cure, is the goal: how can we use something like Artificial Intelligence to develop preventive strategies, to be able to predict over time? The reason we don't have a preventive treatment approach today is that we can't do a traditional clinical trial and ask, did we prevent type 2 diabetes in an 18-year-old? We can't do that on a reasonable timescale. And the other part of that is, why focus on botanicals? Because, for the most part, and there are exceptions, I want to be very clear, I don't want to paint the picture that botanicals are all safe, that you should just take botanical dietary supplements and you'll be fine, there are exceptions, but for the most part, botanicals, natural products, are in fact safe and have undergone human testing for thousands of years. So how do we connect those dots: a preventive strategy with existing, extant botanicals, to really develop a healthcare system that becomes prevention centric as opposed to treatment centric. If I could wave a magic wand, that's the vision I would figure out how to achieve, and I do think with guys like Rangan and Chris and folks like yourself, Eng Lim, that it's possible.
Maybe it's in my lifetime, I've got 50 years to go to get to my grandfather's age, but you never know, right? >> You bring up two really good points there, Ryan. One, it's really a systems approach: understanding that things aren't just linear, that nothing is impacted in isolation, and taking that systems approach to understand every aspect of how things are being affected. And number two was really the downstream part: we've been discussing the drug discovery process a lot, the preclinical in vitro studies and in vivo models, but once you get to the clinical trial, many drugs just fail, fail miserably, and with botanicals, known to be safe in many instances, you could have a much higher success rate; it would be really interesting to see more of that, at least, growing in the market. >> Well, these are very visionary statements from each of you, especially Dr. Yates: prevention better than cure, being proactive better than being reactive. Reactive is important, but we also need to focus on being proactive. Well, thank you very much, this has been a brilliant panel with brilliant panelists, Dr. Ryan Yates, Dr. Rangan Sukumar, and Chris Davidson. Thank you very much for joining us on this panel and for a highly illuminating conversation. All for the future of drug discovery, which includes botanicals. Thank you very much. >> Thank you. >> Thank you.
Tech for Good | Exascale Day
(plane engine roars) (upbeat music) >> They call me Dr. Goh. I'm Senior Vice President and Chief Technology Officer of AI at Hewlett Packard Enterprise. And today I'm in Munich, Germany, home to one and a half million people. Munich is famous for everything from BMW, to beer, to breathtaking architecture and festive markets. The Bavarian capital is the beating heart of Germany's automobile industry. Over 50,000 of its residents work in automotive engineering, and to date, Munich has allocated around 30 million euros to boost electric vehicles and the infrastructure for them. (upbeat music) >> Hello, everyone, my name is Dr. Jerome Baudry. I am a professor at the University of Alabama in Huntsville. Our mission is to use computational resources to accelerate the discovery of drugs that will be useful and efficient against the COVID-19 virus. On the one hand, there is this terrible crisis. And on the other hand, there is this absolutely unique and rare global effort to fight it. And that, I think, is a very positive thing. I am working with the Cray HPE machine called Sentinel. This machine is so amazing that it can actually mimic the screening of hundreds of thousands, almost millions, of chemicals a day. What would take us weeks, if not months or years, we can do in a matter of a few days. And it's really the key to accelerating the discovery of new drugs, new pharmaceuticals. We are all in this together, thank you. (upbeat music) >> Hello, everyone. I'm so pleased to be here to interview Dr. Jerome Baudry of the University of Alabama in Huntsville. >> Hello, Dr. Goh, I'm very happy to be meeting with you here today. I have a lot of questions for you as well, and I'm looking forward to this conversation between us. >> Yes, yes, and I've got lots of COVID-19 and computational science questions lined up for you too, Jerome. Yeah, so let's interview each other, then. >> Absolutely, let's do that, let's interview each other. I've got many questions for you. We have a lot in common, and yet a lot of things we are addressing from a different point of view, so I'm very much looking forward to your ideas and insights. >> Yeah, especially now, with COVID-19, many of us have had to pivot a lot of our research and development work to address the most current issues. I watched your video and I've seen that you're very much focused on drug discovery using supercomputing, the Sentinel work you did; I'm very excited about that. Can you tell us a bit more about how that works? >> Yes, I'd be happy to. In fact, I watched your video as well, on manufacturing, and it's actually surprisingly close, what we do with drugs, to what other people do with planes or cars or assembly lines. We are calculating forces on molecules, on drug candidates, when they hit parts of the viruses. We essentially try to identify which small molecules will hit the virus, or its components, the hardest, to mess with its function, in a way. And that's not very different from what you describe people in industry or in the transportation industry doing. So that's our problem, so to speak: dealing with a lot of small molecules, calculating a lot of forces. But that's not the main problem. Our main problem is to make intelligent choices about what to calculate: what kind of data should we incorporate in our calculations, and what kind of data should we give to the people who are going to do the testing? And that's really something I would like your help to understand better.
How do you see artificial intelligence helping us put our hands on the right data to start with, in order to produce the right data and accuracy in the end? >> Yeah, that's a great question, and it is a question we've been pondering a lot in our strategy as a company recently, because more and more we realize that the data is being generated at the far-out edge. By edge, I mean something that's outside of the cloud and the data center. For example, in more recent COVID-19 work, there's a lot of cryo-electron microscopy, trying to get high-resolution pictures of the virus at different angles, creating lots of movies under the electron microscope to try and build a 3D model of the virus. And we realized that's the edge, because that's where the microscope is, away from the data center, and massive amounts of data are generated there, terabytes and terabytes of data per day. We had to develop a workflow to get that data off the microscope and provide pre-processing and processing, so that the scientists can achieve results without delay. So we learned quite a few lessons there, especially about getting the edge to be more intelligent, to deal with the onslaught of data coming in from these devices.
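A minimal sketch of the kind of edge pre-processing that workflow implies: shrink and normalize each micrograph movie before it leaves the microscope, so the data center receives frames ready for downstream processing. The frame counts, grouping, and reduction choices are illustrative assumptions, not the actual pipeline.

```python
import numpy as np

def preprocess_movie(frames: np.ndarray) -> np.ndarray:
    """frames: (n_frames, height, width) raw movie from the detector."""
    # Toy reduction: average groups of 5 frames to cut data volume 5x.
    n = (frames.shape[0] // 5) * 5
    averaged = frames[:n].reshape(-1, 5, *frames.shape[1:]).mean(axis=1)
    # Normalize per frame so downstream particle picking sees consistent contrast.
    mean = averaged.mean(axis=(1, 2), keepdims=True)
    std = averaged.std(axis=(1, 2), keepdims=True) + 1e-8
    return ((averaged - mean) / std).astype(np.float32)

raw = np.random.rand(40, 512, 512)  # stand-in for detector output
reduced = preprocess_movie(raw)
print(raw.nbytes // 2**20, "MB ->", reduced.nbytes // 2**20, "MB")
```

The point of pushing this step to the edge is exactly the one made above: by the time the data reaches the supercomputer, it is already in a usable form.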
>> That's fantastic that you're saying that, and that you're using this very example of cryo-EM, because that's the kind of data that feeds our computations. And indeed, we have found that it is very, very difficult to get the right cryo-EM data to us. Now, we've been working with the HPE supercomputer Sentinel, as you may know, for our COVID-19 work, so we have a lot of computational power. But we would be even faster and better, frankly, if we knew what kind of cryo-EM data to focus on. In fact, most of our discussions are based not so much on how to compute the forces on the molecules, which we do quite well on an HPE supercomputer, but again, on which cryo-EM three-dimensional space to look at. It's becoming almost a bottleneck, and we spend a lot of time on it. Do you envision a point where AI will be able to help us make this kind of data almost live, or at least as close to live as possible, as it comes from the edge? How to pack it, and not triage it, but prioritize it for the best possible computations on supercomputers? >> What a visionary question and desire, right? That's exactly the vision we have. Of course, for the ultimate vision you aim for the best, and that would be a real-time stream of processed data coming straight off the microscope, providing what you need. We are not there yet; we are far from there. But that's the aim: the ability to push more and more intelligence forward, so that by the time the data reaches you, it is what you need, without any further processing. And a lot of AI is applied there, particularly in cryo-EM, where they do particle picking. They take a lot of pictures and movies of the virus, rotate the virus a little bit, and then try to figure out, in all the different images in the movies, how to pick out the particles. This is very much image processing, which AI is very good at, and it's applied at many different stages. The key thing is to deal with the data that is flowing at this speed, and to get it to you in the right form, at the right time. So yes, that's the desire. >> It will be a game changer, really. You'll be able to get things in a matter of weeks, instead of a matter of years, to the colleague who will be doing the testing. And if the AI can help me learn from a calculation that didn't exactly turn out the way we wanted, that will be very, very helpful. I can envision AI, live AI, being able to really revolutionize the whole process, not only the discovery, but all the way to the clinical side, to the patient, to the hospital. >> Well, that's a great point. In fact, I caught on to your term, live AI. That's actually what we are trying to achieve, although I have not used that term before; perhaps I'll borrow it for next time. >> Oh please, by all means. >> You see, I've also been doing recent work on gene expression data. In a vaccine clinical trial, they get the blood from the volunteers after the first day, and then run very, very fast AI analytics on the gene expression data, the transcription data, before translation to amino acids. The transcription data is enormous: we're talking 30,000 to 60,000 different transcripts. The question is how to use that high-dimensional data to predict, on day one, whether this volunteer will get an adverse event, or will have a good antibody outcome, for efficacy. So that's exactly what we're trying to achieve: get the blood, run the assay, get the transcripts, and then run the analytics and AI to produce an outcome, very quickly. And I always emphasize that, ultimately, the doctor makes the decision; the AI only suggests, based on the data: this is the likely outcome, based on all the previous data that the machine has learned from.
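A sketch of the day-one prediction problem Dr. Goh describes: tens of thousands of transcript levels per volunteer, far more features than samples, for which a sparse, L1-penalized model is one common choice. The data below is synthetic; the real analysis involves careful normalization and validation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_volunteers, n_transcripts = 200, 30_000
X = rng.normal(size=(n_volunteers, n_transcripts))   # expression matrix
# Synthetic label: outcome driven by a handful of informative transcripts.
y = (X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=n_volunteers) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
model.fit(X_train, y_train)

print("held-out accuracy:", model.score(X_test, y_test))
print("transcripts used:", int((model.coef_ != 0).sum()), "of", n_transcripts)
```

The L1 penalty drives most coefficients to zero, so the model also points at the handful of transcripts carrying the signal, which is exactly what a trial analyst, or the supervising doctor, would want to inspect.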
>> Oh, I agree; we wouldn't want the machine to decide the fate of the patient, but to assist the doctor or nurse making the decision, that will be invaluable. And are you aware of any kind of industry that is already using this kind of live AI? Is there anything in, I don't know, sports, or crowd control, or any other industry? I would be curious to see who is ahead of us in making these kinds of minute-by-minute decisions using AI. >> Yes, in fact, this is a very pertinent question. With COVID-19 there's been lots of effort working on it, and now industries and different countries are starting to work on returning to work: returning to their offices, to the factories, to the manufacturing plants. But the employers need to reassure the employees that appropriate measures are taken for safety, while still maintaining privacy. So our Aruba organization actually developed a solution called contact location tracing inside buildings, inside factories, and they needed a lot of machine learning methods in there to do it very, very well, as you say, live AI, to offer a solution. Let me describe the problem. In certain countries, certain states, certain cities, regulations require that if someone is ill, you actually have to go in and disinfect the area the person has been to; it's a requirement. But if you don't know precisely where the ill person has been, you have to disinfect the whole factory, and if you do that, it becomes impractical and cost prohibitive for the company to keep operating profitably. So what they are doing today with Aruba is carrying this Bluetooth Low Energy tag, about the size of a quarter. The tag is decoupled from the person's identity, and the system tracks everybody, all the employees; we have one company with 10,000 employees tracking everybody with the tag. If a person falls ill, immediately a floor plan is brought up with hotspots, and then you just target the cleaning services there. In the same way, contact tracing is produced automatically: anybody that has come in contact with this person, within two meters and for more than 15 minutes, comes up on the list. And privacy is our focus here: there's a separation between the tag and the person, and only restricted people are allowed to see the association. Things like washrooms are not tracked. So yes, live AI, trying to make very, very quick decisions, because this affects people.
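A toy sketch of the contact rule Dr. Goh describes: flag pairs of tags that spend more than 15 minutes within two meters of each other. The coordinates and sampling interval are invented; a real deployment estimates proximity from Bluetooth signal strength rather than clean positions.

```python
import math
from collections import defaultdict

SAMPLE_SECONDS = 60          # one reading per tag per minute (assumed)
DISTANCE_M = 2.0             # proximity threshold
MIN_CONTACT_SECONDS = 15 * 60

# readings[t] = {tag_id: (x, y)} for each timestamp t (toy data)
readings = {
    t: {"tag_A": (0.0, 0.0), "tag_B": (1.0, 0.5), "tag_C": (30.0, 40.0)}
    for t in range(0, 20 * 60, SAMPLE_SECONDS)
}

proximity_time = defaultdict(int)
for snapshot in readings.values():
    tags = sorted(snapshot)
    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            if math.dist(snapshot[a], snapshot[b]) <= DISTANCE_M:
                proximity_time[(a, b)] += SAMPLE_SECONDS

contacts = [pair for pair, secs in proximity_time.items()
            if secs >= MIN_CONTACT_SECONDS]
print(contacts)  # [('tag_A', 'tag_B')] - candidates for the contact list
```

Note the privacy design carried through: the logic only ever sees tag IDs, and mapping a tag back to a person stays restricted.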
>> Another question I have for you, if you have a minute; it's more a question about hardware, about computer hardware, if I may. We're spending a lot of time number crunching on giant machines like Sentinel, for instance, which is a dream to use and is very good at that; but we also spend a lot of time moving data back and forth, from clouds, from storage, from AI processing to the computing cycles, back and forth, back and forth. Do you envision an architecture that would combine the hardware needed for the massively parallel calculations we are doing with very large storage and fast I/O, to be more AI friendly, so to speak? Do you see, on the horizon, some kind of machine, maybe yet to be determined, where the AI can plan ahead in terms of passing the work to the massively parallel side? Does that make sense? >> Makes a lot of sense. And you ask it, I know, because it is a tough problem to solve. As we always say: computation is growing in capability enormously, but bandwidth you have to pay for, and latency you sweat for, right? >> That's a very good one. >> So moving data is ultimately going to be the problem. >> It is. >> And we move the data many times, right? >> You move it back and forth, so many times. >> Back and forth, back and forth. At the edge, that's where you try to pre-process it before you put it in storage. But then once it arrives in storage, you move it to memory to do some work, bring it back, move it to memory again; that's the HPC side. Then you put it back into storage, and then the AI comes in and you do the learning; and the other way around, also. So, lots of back and forth, a tough problem to solve. But more and more we are looking at a new architecture. Currently this architecture was built for the AI side first, but we're now looking at how we can expand it, and that's the reason we announced the HPE Ezmeral Data Fabric. What it does is take care of the data all the way from the edge: the minute it is ingested at the edge, it is incorporated in the global namespace, so that regardless of where the data eventually lands geographically, or at what temperature, hot data, warm data, or cold data, the Data Fabric tracks everything in a global namespace, in a unified way. That's the first step. So the data is not seen as different pieces in different places; it is a unified view of all the data, from the minute it is ingested, starting from the edge.
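A toy sketch of that global-namespace idea: every file gets one logical path the moment it is ingested at the edge, and the fabric tracks where, and at what temperature, the bytes currently live. This is a conceptual illustration only, not the actual HPE Ezmeral implementation.

```python
class DataFabric:
    def __init__(self):
        self.catalog = {}  # logical path -> (site, tier)

    def ingest(self, logical_path, site, tier="hot"):
        self.catalog[logical_path] = (site, tier)

    def migrate(self, logical_path, site, tier):
        # Bytes move between sites and tiers; the logical path never changes.
        self.catalog[logical_path] = (site, tier)

    def locate(self, logical_path):
        return self.catalog[logical_path]

fabric = DataFabric()
fabric.ingest("/experiments/cryoem/movie_0001", site="edge-microscope")
fabric.migrate("/experiments/cryoem/movie_0001", site="datacenter-1", tier="warm")
print(fabric.locate("/experiments/cryoem/movie_0001"))  # ('datacenter-1', 'warm')
```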
And I talk a lot to people like cryo-EM people or gene expression people; before, I would have just gotten the data and processed it. Now we have a dialogue across the board, in all aspects of industry, science, and society. And I think that is something wonderful that we should keep after we finally fix this bug.

>> Yes, yes, yes.

>> Right?

>> Yes, that's a great point. In fact, it's something I've been thinking about. For employees, things have changed because of COVID-19, but very likely the change will continue, yeah?

>> Right.

>> Yes, because there are a few positive outcomes. COVID-19 is a tough thing, but there is a positive side, like communicating in this way, effectively. So we were part of the consortium that developed a natural language processing AI system, and I can share the link to that website, that allows scientists to do a query. So you say: tell me the latest on the binding energy between the SARS-CoV-2 spike protein and the ACE2 receptor. And it will give you a list of 10 answers, with links to the papers that state those answers. If you key that in today, you'll see answers converging on -13.7 kcal per mole, which is, I think, the general consensus answer, and a few that are highly out of range. And when you go further, you realize those are the earlier papers. So I think this NLP system will be useful.

>> (both chattering) I'm sorry, I didn't mean to interrupt, but I mentioned it yesterday because I have used it, and it's a game changer indeed; it is amazing. Many times, by using this kind of intelligent conceptual analysis, rather than a very direct search, I have found connections between facts, between clinical or pharmaceutical aspects of COVID-19, that I wasn't really aware of. So it's a tool for creativity as well, I find. It builds something. It doesn't just analyze what has been done; it creates connections, it creates a network of knowledge and intelligence.

>> That's the three-to-30-year-old brain, before the pruning stops. (laughs)

>> I know, I know. (laughs) But our children are amazing in that respect: they see things that we don't see anymore, they make connections that we don't necessarily think of, because we're used to seeing things a certain way. And the eyes of a child always bring something new, which I think is what AI could potentially bring here. So look, this is fascinating, really.

>> Yes, yes, the difference between us filtering, being subtractive, and the machine being accumulative. That's why I believe the two working together can have a stronger outcome, if used properly.

>> Absolutely. And I think that's how AI will be a force for good indeed; otherwise we would have missed connections that end up being very important. Well, we are very invested in our quest for drug discovery against COVID-19, and we have been quite successful so far. We have accelerated the process by an order of magnitude, so we now have molecules being tested against the virus; otherwise it would have taken maybe three or four years to get to that point. So first, we have been very fast. But we are also very interested in natural products, the chemicals that come from plants, essentially.
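The pattern Dr. Goh describes, a consensus answer plus a few out-of-range values that turn out to be earlier papers, is essentially an outlier-detection problem over extracted quantities. A small sketch under stated assumptions: the list of values is fabricated for illustration (only the -13.7 kcal/mol consensus figure comes from the conversation), and the median-absolute-deviation rule is one reasonable choice, not necessarily what the consortium's system uses.

```python
# Flag extracted binding-energy values that sit far from the consensus.
import statistics

answers = [  # (kcal/mol, publication year) -- illustrative values only
    (-13.7, 2021), (-13.5, 2021), (-13.9, 2020), (-14.0, 2021),
    (-13.6, 2020), (-7.2, 2020), (-25.0, 2020),  # last two: early outliers
]

values = [v for v, _ in answers]
consensus = statistics.median(values)                      # robust center
mad = statistics.median(abs(v - consensus) for v in values)  # robust spread

for value, year in answers:
    flag = "OUTLIER" if abs(value - consensus) > 5 * mad else "ok"
    print(f"{year}: {value:6.1f} kcal/mol  {flag}")
```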
We found a way to mine, I don't want to say exploit, but leverage, that knowledge from hundreds of years of people documenting, in a very historical way, what plants do against what diseases in different parts of the world. So that has been not only very useful in our work, but a fantastic bridge to our common human history, basically. And second, yes, plants have chemicals, and of course we love chemicals; every living cell has chemicals. The chemicals that are in plants have been fine-tuned by evolution to actually have some biological function. They are not there just to look good; they have a role in the cell. And if we're trying to come up with a new drug from scratch, which is also something we want to do, of course, then we have to engineer a function that evolution hasn't already found a solution for in plants. So in a way, it's also artificial intelligence: we have natural solutions to our problems, so why don't we try to find them and see how they work in ourselves, instead of having to reinvent the wheel each time?

>> Hundreds of millions of years of evolution.

>> Hundreds of millions of years.

>> Many iterations.

>> Yes, yielding millions of different plants with all kinds of chemical diversity. So we have a lot of that at our disposal. If only we find the right way to analyze them and bring them to our supercomputers, then we will really leverage this humongous amount of knowledge. Instead of having to reinvent the wheel each time we want a car, we'll find that there are cars with wheels already that we should be borrowing instead of building one each time. Most of the keys are out there, if we can find them; they're at our disposal.

>> Yeah, nature has done the work after hundreds of millions of years.

>> Yes. (chattering) The work of figuring out which is which, yeah? Exactly, exactly; hence the importance of biodiversity.

>> Yeah, I think this is related to the knowledge graph, right? Where you have two objects and the linking parameter, and then you have hundreds of millions of these: a chemical, an outcome, and the link between them.

>> Yes, that's exactly what it is; absolutely the kind of thing we're pursuing, very much so. And not only building the graph, but building the dynamics of the graph. In the future, if you eat too much creme brulee, or if you don't run enough, or if you don't sleep well, then your cells will have different connections on this graph; they will interact with that molecule in a different way than if you had more sleep, or didn't eat that much creme brulee, or exercised a bit more.

>> So insightful, Dr. Baudry. Your span of knowledge impresses me, and it's been fascinating talking to you. (chattering) Hopefully next time we get together, we'll have a bit of creme brulee together.

>> Yes, let's find out scientifically what it does; we'll have to do it double-blind and try it three times to make sure we get the right statistics.

>> Three phases, three clinical trial phases, right?

>> It's been a pleasure talking to you. As we agreed, for all of COVID-19's problems, the way that people now talk to each other is the thing I want to keep in our post-COVID-19 world. I appreciate your insight very much, and the way you see things is very encouraging. So let's make it happen.

>> We will work together, Dr. Baudry. Hope to see you soon, in person.

>> Indeed, in person, yes. Thank you.

>> Thank you, good talking to you.
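As a closing illustration of the knowledge-graph idea the two discuss, a chemical, an outcome, and the linking parameter, repeated hundreds of millions of times, here is a minimal sketch of such a graph as plain triples with edge attributes. All entries are illustrative placeholders carrying provenance labels, not pharmacological claims, and none of the naming reflects Dr. Baudry's actual system.

```python
# A chemical-to-outcome graph as (subject, predicate, object, attributes)
# triples; the attribute dict plays the role of the "linking parameter."
from collections import defaultdict

triples = [
    ("artemisinin", "documented_against", "malaria",
     {"source": "ethnobotanical record"}),
    ("quercetin",   "documented_against", "inflammation",
     {"source": "ethnobotanical record"}),
    ("quercetin",   "reported_binding",   "ACE2",
     {"source": "NLP-extracted from literature"}),
]

# Index by subject for simple traversal queries.
by_subject = defaultdict(list)
for s, p, o, attrs in triples:
    by_subject[s].append((p, o, attrs))

def links_for(chemical):
    """Return every (predicate, outcome, provenance) edge for a chemical."""
    return [(p, o, a["source"]) for p, o, a in by_subject[chemical]]

print(links_for("quercetin"))
```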