Bob Rogers, Intel, Julie Cordua, Thorn | AWS re:Invent
>> Narrator: Live from Las Vegas, it's theCUBE, covering AWS re:Invent 2017, presented by AWS, Intel, and our ecosystem of partners. >> Hello everyone, welcome to a special CUBE presentation here, live in Las Vegas for Amazon Web Services' AWS re:Invent 2017. This is theCUBE's fifth year here. We've been watching the progression. I'm John Furrier with Justin here as my co-host. Our two next guests are Bob Rogers, the chief data scientist at Intel, and Julie Cordua, who's the CEO of Thorn. Great guests, showing some AI for good. Intel, obviously, good citizen and great technology partner. Welcome to theCUBE. >> Thank you, thanks for having us! >> So, I saw your talk you gave at the Public Sector Breakfast this morning here at re:Invent. Packed house, fire marshal was kicking people out. Really inspirational story. Intel, we've talked at South by Southwest. You guys are really doing a lot of AI for good. That's the theme here. You guys are doing incredible work. >> Julie: Thank you. >> Tell your story real quick. >> Yeah, so Thorn is a nonprofit, we started about five years ago, and we are just specifically dedicated to build new technologies to defend children from sexual abuse. We were seeing that, as, you know, new technologies emerge, there's new innovation out there, how child sexual abuse was presenting itself was changing dramatically. So, everything from child sex trafficking online, to the spread of child sexual abuse material, livestreaming abuse, and there wasn't a concentrated effort to put the best and brightest minds and technology together to be a part of the solution, and so that's what we do. We build products to stop child abuse. >> John: So you're a nonprofit? >> Julie: Yep! >> And you're in that public sector, but you guys have made a great progress. What's the story behind it? How did you get to do so effective work in such a short period of time as a nonprofit? >> Well, I think there's a couple things to that. 
One is, well, we learned a lot really quickly, so what we're doing today is not what we thought we would do five years ago. We thought we were gonna talk to big companies, and push them to do more, and then we realized that we actually needed to be a hub. We needed to build our own engineering teams, we needed to build product, and then bring in these companies to help us, and to add to that, but there had to be some there there, and so we actually have evolved. We're a nonprofit, but we are a product company. We have two products used in 23 countries around the world, stopping abuse every day. And I think the other thing we learned is that we really have to break down silos. So, we didn't, in a lot of our development, we didn't go the normal route of saying, okay, well this is a law enforcement job, so we're gonna go bid for a big government RFE. We just went and built a tool and gave it to a bunch of police officers and they said, "Wow, this works really well, "we're gonna keep using it." And it kinda spread like wildfire. >> And it's making a difference. It's really been a great inspirational story. Check out Thorn, amazing work, real use case, in my mind, a testimonial for how fast you can accelerate. Congratulations. Bob, I wanna get your take on this because it's a data problem that, actually, the technology's applying to a problem that people have been trying to crack the code on for a long time. >> Yeah, well, it's interesting, 'cause the context is that we're really in this era of AI explosion, and AI is really computer systems that can do things that only humans could do 10 years ago. That's kind of my basic way of thinking about it, so the problem of being able to recognize when you're looking at two images of the same child, which is the piece that we solved for Thorn, actually, you know, is a great example of using the current AI capabilities. 
You start with the problem of, if I show an algorithm two different images of the same child, can it recognize that they're the same? And you basically customize your training to create a very specific capability. Not a basic image recognition or facial recognition, but a very specific capability that's been trained with specific examples. I was gonna say something about what Julie was describing about their model. Their model to create that there there has been incredible because it allows them to really focus our energy into the right problems. We have lots of technology, we have lots of different ways of doing AI and machine learning, but when we get a focus on this is the data, this is the exact problem we need to solve, and this is the way it needs to work for law enforcement, for National Center for Missing and Exploited Children. It has really just turned the knob up to 11, so to speak. >> I mean, this is an example where, I mean, we always talk about how tech transformation can make things go faster. It's such an obvious problem. I mean, it's almost everyone kinda looks away because it's too hard. So, I wanna ask you, how do people make this happen for other areas for good? So, for instance, you know, what was the bottlenecks before? What solved the problem, because, I mean, you could really make a difference here. You guys are. >> Well, I think there's a couple things. I think you hit on one, which is this is a problem people turn away from. It's really hard to look at. And the other thing is is there's not a lot of money to be made in using advanced technology to find missing and exploited children, right? So, it did require the development of a nonprofit that said, "We're gonna do this, "and we're gonna fundraise to get it done." But it also required us to look at it from a technology angle, right? I think a lot of times people look at social issues from the impact angle, which we do, but we said, "What if we looked at it "from a different perspective? 
"How can technology disrupt in this area?" And then we made that the core of what we do, and we partnered with all the other amazing organizations that are doing the other work. And I think, then, what Bob said was that we created a hub where other experts could plug into, and I think, in any other issue area that you're working on, you can't just talk about it and convene people. You actually have to build, and when you build, you create a platform that others can add to, and I think that is one of the core reasons why we have seen so much progress, is we started out convening and really realized that wasn't gonna last very long, and then we built, and once we started building, we scaled. >> So, you got in the market quickly with something. >> Yeah. >> So, one of the issues with any sort of criminal enterprise is it tends to end up in a bit of an arms race, so you've built this great technology but then you've gotta keep one step ahead of the bad guys. So, how are you actually doing that? How are you continuing to invest in this and develop it to make sure that you're always one step ahead? >> So, I can address that on a couple of levels. One is, you know, working with Thorn, and I lead a program at Intel called the Safer Children Program, where we work with Thorn and also the National Center for Missing and Exploited Children. Those conversations bring in all of the tech giants, and there's a little bit of sibling rivalry. We're all trying to throw in our best tech. So, I think we all wanna do as well as we can for these partnerships. The other thing is, just in very tactical terms, working with Thorn, we've actually, Thorn and with Microsoft, we've created a capability to crowdsource more data to help improve the accuracy of these deep learning algorithms. So, by getting critical mass around this problem, we've actually now created enough visibility that we're getting more and more data. 
And as you said earlier, it's a data problem, so if you have enough data, you can actually create the models with the accuracy and the capability that you need. So, it starts to feed on itself. >> Julie talked about the business logic, how she attacked that. That's really, 'cause I think one thing notable, good use case, but from a tech perspective, how does the cloud fit in with Intel specifically? Because it really, the cloud is an enabler too. >> Bob: Yeah, absolutely. >> How's that all working with Intel? And you go on about whole new territory you guys are forging in here, it's awesome, but the cloud. >> Right, so, for us, the cloud is an incredible way for us to make our compute capability available to anyone who needs to do computing, especially in this data-driven algorithm era where more and more machine learning, more and more AI, more and more data-driven problems are coming to the fore, doing that work on the cloud and being able to scale your work according to how much data is coming in at any time, it makes the cloud a really natural place for us. And of course, Intel's hardware is a core component of pretty much all the cloud that you could connect to. >> And the compute that you guys provide, and Amazon adds to it, their cloud is impressive. Now, I'd like to know what you guys are gonna be talking about in your session. You have a session here at re:Invent. What's the title of the session, what's the agenda, is it the same stuff here, what's gonna be talked about? >> So, we're talking about life-changing AI applications, and in specific we're gonna talk about, at the end Julie will talk about what Thorn has done with the child-finder and the AI that we and Microsoft built for them. We'll also, I'll start out by talking about Intel's role broadly in the computing and AI space. 
Intel really looks to take all of its different hardware, and networking, and memory assets, and make it possible for anybody to do the kinds of artificial intelligence or machine learning they need to do. And then in the middle, there's a really cool deployment on AWS sandwich that (something) will talk about how they've taken the models and really dialed them up in terms of how fast you can go through this data, so that we can go through millions and millions of images in our searches, and come back with results really, really fast. So, it's a great sort of three piece story about the conception of AI, the deployment at scale and with high performance, and then how Thorn is really taking that and creating a human impact around it. >> So, Bob, I asked you the Intel question because no one calls up Intel and says, "Hey, give me some AI for good." I mean, I wish that would be the case. >> Well, they do now. >> If they do, well, share your strategy, because cloud makes sense. I could see how you could provision easily, get in there, really empowering people to do stuff that's passionable and relevant. But how do you guys play in all of this? 'Cause I know you supply stuff to the cloud guys. Is this a formal program you're doing at Intel? Is this a one-off? >> Yeah, so Safer Children is a formal program. It started with two other folks, Lisa Davis and Lisa Theinai, going to the VP of the entire data center group and saying, "There is an opportunity to make a big impact "with Intel technology, and we'd like to do this." And it started literally because Intel does actually want to do good work for humankind, and frankly, the fact that these people are using our technology and other technology to hurt children, it steams our dumplings, frankly. So, it started with that. >> You've been a team player with Amazon and everyone else. 
>> Exactly, so then, once we've been able to show that we can actually create technology and provide infrastructure to solve these problems, it starts to become a self-fulfilling prophecy where people are saying, "Hey, we've got this "interesting adjacent problem that "this kind of technology could solve. "Is there an opportunity to work together and solve that?" And that fits into our bigger, you know, people ask me all the time, "Why does Intel have a chief data scientist?" We're a hardware company, right? The answer is-- >> That processes a lot of data! >> Yes, that processes a lot of data. Literally, we need to help people know how to get value from their data. So, if people are successful with their analytics and their AI, guess what, they're gonna invest in their infrastructure, and it sort of lifts Intel's boat across the board. >> You guys have always been a great citizen, and great technology provider, and hats off to Intel. Julie, tell a story about an example people can get a feel for some of the impact, because I saw you on stage this morning with Theresa Carlson, and we've been tracking her efforts in the public sector have been amazing, and Intel's been part of that too, congratulations. But you were kind of emotional, and you got a lot of applause. What's some of the impact? Tell a story of how important this really is, and your work at Thorn. >> Yeah, well, I mean, one of the areas we work in is trying to identify children who are being sold online in the US. A lot of people, first of all, think that's happening somewhere else. No, that's here in this country. A lot of these kids are coming out of foster care, or are runaways, and they get convinced by a pimp or a trafficker to be sold into prostitution, basically. So, we have 150,000 escort ads posted every single day in this country, and somewhere in there are children, and it's really difficult to look through that with your eye, and determine what's a child. 
So, we built a tool called Spotlight that basically reads and analyzes every ad as it comes in, and we layer on smart algorithms to say to an officer, "Hey, this is an ad you need to pay attention to. It looks like this could be a child." And we've had over 6,000 children identified over the last year. >> John: That's amazing. >> You know, it happens in a situation where, you know, you have online it says, you know, this girl's 18, and it's actually a 15-year-old girl who met a man who said he was 17, he was actually 30, had already been convicted of sex trafficking, and within 48 hours of meeting this girl, he had her up online for sale. So, that sounds like a unique incident. It is not unique, it happens every single day in almost every city and town across this country. And the work we're doing is to find those kids faster, and stop that trauma. >> Well, I just wanna say congratulations. That's great work. We had a CUBE alumni, founder of Cloudera, Jeff Hammerbacher, good friend of theCUBE. He had a famous quote that he said on theCUBE, then said on the Charlie Rose Show, "The best minds of our generation are thinking about how to make people click ads. That sucks." This is an example where you can really put the best minds on some of the real important things. >> Yeah, we love Jeff. I read that quote all the time. >> It's really a most important quote. Well, thanks so much. Congratulations, great inspiration, great story. Bob, thanks for coming on, appreciate it. CUBE live coverage here at AWS re:Invent 2017, kicking off day one of three days of wall-to-wall coverage here, live in Las Vegas. We'll be right back with more after this short break.
AI for Good Panel - Precision Medicine - SXSW 2017 - #IntelAI - #theCUBE
>> Welcome to the Intel AI Lounge. Today, we're very excited to share with you the Precision Medicine panel discussion. I'll be moderating the session. My name is Kay Eron. I'm the general manager of Health and Life Sciences at Intel. And I'm excited to share with you these three panelists that we have here. First is John Mattison. He is a chief medical information officer and he is part of Kaiser Permanente. We're very excited to have you here. Thank you, John. >> Thank you. >> We also have Naveen Rao. He is the VP and general manager for the Artificial Intelligence Solutions at Intel. He's also the former CEO of Nervana, which was acquired by Intel. And we also have Bob Rogers, who's the chief data scientist at our AI solutions group. So, why don't we get started with our questions. I'm going to ask each of the panelists to talk, introduce themselves, as well as talk about how they got started with AI. So why don't we start with John? >> Sure, so can you hear me okay in the back? Can you hear? Okay, cool. So, I am a recovering evolutionary biologist and a recovering physician and a recovering geek. And I implemented the health record system for the first and largest region of Kaiser Permanente. And it's pretty obvious that most of the useful data in a health record lies in free text. So I started up a natural language processing team to be able to mine free text about a dozen years ago. So we can do things with that that you can't otherwise get out of health information. I'll give you an example. I read an article online from the New England Journal of Medicine about four years ago that said over half of all people who have had their spleen taken out were not properly vaccinated for a common form of pneumonia, and when your spleen's missing, you must have that vaccine or you die a very sudden death with sepsis. In fact, our medical director in Northern California's father died of that exact same scenario. 
So, when I read the article, I went to my structured data analytics team and to my natural language processing team and said please show me everybody who has had their spleen taken out and hasn't been appropriately vaccinated and we ran through about 20 million records in about three hours with the NLP team, and it took about three weeks with a structured data analytics team. That sounds counterintuitive but it actually happened that way. And it's not a competition for time only. It's a competition for quality and sensitivity and specificity. So we were able to identify all of our members who had their spleen taken out, who should've had a pneumococcal vaccine. We vaccinated them and there are a number of people alive today who otherwise would've died absent that capability. So people don't really commonly associate natural language processing with machine learning, but in fact, natural language processing relies heavily and is the first really, highly successful example of machine learning. So we've done dozens of similar projects, mining free text data in millions of records very efficiently, very effectively. But it really helped advance the quality of care and reduce the cost of care. It's a natural step forward to go into the world of personalized medicine with the arrival of a 100-dollar genome, which is actually what it costs today to do a full genome sequence. Microbiomics, that is the ecosystem of bacteria that are in every organ of the body actually. And we know now that there is a profound influence of what's in our gut and how we metabolize drugs, what diseases we get. You can tell in a five year old, whether or not they were born by a vaginal delivery or a C-section delivery by virtue of the bacteria in the gut five years later. 
So if you look at the complexity of the data that exists in the genome, in the microbiome, in the health record with free text and you look at all the other sources of data like this streaming data from my wearable monitor that I'm part of a research study on Precision Medicine out of Stanford, there is a vast amount of disparate data, not to mention all the imaging, that really can collectively produce much more useful information to advance our understanding of science, and to advance our understanding of every individual. And then we can do the mash up of a much broader range of science in health care with a much deeper sense of data from an individual and to do that with structured questions and structured data is very yesterday. The only way we're going to be able to disambiguate those data and be able to operate on those data in concert and generate real useful answers from the broad array of data types and the massive quantity of data, is to let loose machine learning on all of those data substrates. So my team is moving down that pathway and we're very excited about the future prospects for doing that. >> Yeah, great. I think that's actually some of the things I'm very excited about in the future with some of the technologies we're developing. My background, I started actually being fascinated with computation in biological forms when I was nine. Reading and watching sci-fi, I was kind of a big dork which I pretty much still am. I haven't really changed a whole lot. Just basically seeing that machines really aren't all that different from biological entities, right? We are biological machines and kind of understanding how a computer works and how we engineer those things and trying to pull together concepts that learn from biology into that has always been a fascination of mine. As an undergrad, I was in the EE, CS world. Even then, I did some research projects around that. 
I worked in the industry for about 10 years designing chips, microprocessors, various kinds of ASICs, and then actually went back to school, quit my job, got a Ph.D. in neuroscience, computational neuroscience, to specifically understand what's the state of the art. What do we really understand about the brain? And are there concepts that we can take and bring back? Inspiration's always been we want to... We watch birds fly around. We want to figure out how to make something that flies. We extract those principles, and then build a plane. Don't necessarily want to build a bird. And so Nervana really was the combination of all those experiences, bringing it together. Trying to push computation in a new direction. Now, as part of Intel, we can really add a lot of fuel to that fire. I'm super excited to be part of Intel in that the technologies that we were developing can really proliferate and be applied to health care, can be applied to Internet, can be applied to every facet of our lives. And some of the examples that John mentioned are extremely exciting right now and these are things we can do today. And the generality of these solutions are just really going to hit every part of health care. I mean from a personal viewpoint, my whole family are MDs. I'm sort of the black sheep of the family. I don't have an MD. And it's always been kind of funny to me that knowledge is concentrated in a few individuals. Like you have a rare tumor or something like that, you need the guy who knows how to read this MRI. Why? Why is it like that? Can't we encapsulate that knowledge into a computer or into an algorithm, and democratize it. And the reason we couldn't do it is we just didn't know how. And now we're really getting to a point where we know how to do that. And so I want that capability to go to everybody. It'll bring the cost of healthcare down. It'll make all of us healthier. That affects everything about our society. So that's really what's exciting about it to me. 
>> That's great. So, as you heard, I'm Bob Rogers. I'm chief data scientist for analytics and artificial intelligence solutions at Intel. My mission is to put powerful analytics in the hands of every decision maker and when I think about Precision Medicine, decision makers are not just doctors and surgeons and nurses, but they're also case managers and care coordinators and probably most of all, patients. So the mission is really to put powerful analytics and AI capabilities in the hands of everyone in health care. It's a very complex world and we need tools to help us navigate it. So my background, I started with a Ph.D. in physics and I was computer modeling stuff, falling into super massive black holes. And there's a lot of applications for that in the real world. No, I'm kidding. (laughter) >> John: There will be, I'm sure. Yeah, one of these days. Soon as we have time travel. Okay so, I actually, about 1991, I was working on my post doctoral research, and I heard about neural networks, these things that could compute the way the brain computes. And so, I started doing some research on that. I wrote some papers and actually, it was an interesting story. The problem that we solved that got me really excited about neural networks, which have become deep learning, my office mate would come in. He was this young guy who was about to go off to grad school. He'd come in every morning. "I hate my project." Finally, after two weeks, what's your project? What's the problem? It turns out he had to circle these little fuzzy spots on these images from a telescope. So they were looking for the interesting things in a sky survey, and he had to circle them and write down their coordinates all summer. Anyone want to volunteer to do that? No? Yeah, he was very unhappy. So we took the first two weeks of data that he created doing his work by hand, and we trained an artificial neural network to do his summer project and finished it in about eight hours of computing. 
(crowd laughs) And so he was like yeah, this is amazing. I'm so happy. And we wrote a paper. I was the first author of course, because I was the senior guy at age 24. And he was second author. His first paper ever. He was very, very excited. So we have to fast forward about 20 years. His name popped up on the Internet. And so it caught my attention. He had just won the Nobel Prize in physics. (laughter) So that's where artificial intelligence will get you. (laughter) So thanks Naveen. Fast forwarding, I also developed some time series forecasting capabilities that allowed me to create a hedge fund that I ran for 12 years. After that, I got into health care, which really is the center of my passion. Applying health care to figuring out how to get all the data from all those siloed sources, put it into the cloud in a secure way, and analyze it so you can actually understand those cases that John was just talking about. How do you know that that person had had a splenectomy and that they needed to get that pneumovax? You need to be able to search all the data, so we used AI, natural language processing, machine learning, to do that and then two years ago, I was lucky enough to join Intel and, in the intervening time, people like Naveen actually thawed the AI winter and we're really in a spring of amazing opportunities with AI, not just in health care but everywhere, but of course, the health care applications are incredibly life saving and empowering so, excited to be here on this stage with you guys. >> I just want to cue off of your comment about the role of physics in AI and health care. So the field of microbiomics that I referred to earlier, bacteria in our gut. There's more bacteria in our gut than there are cells in our body. There's 100 times more DNA in that bacteria than there is in the human genome. And we're now discovering a couple hundred species of bacteria a year that have never been identified under a microscope just by their DNA. 
So it turns out the person who really catapulted the study and the science of microbiomics forward was an astrophysicist who did his Ph.D. in Stephen Hawking's lab on the collision of black holes and then subsequently, put the other team in a virtual reality, and he developed the first super computing center and so how did he get an interest in microbiomics? He has the capacity to do high performance computing and the kind of advanced analytics that are required to look at a 100 times the volume of 3.2 billion base pairs of the human genome that are represented in the bacteria in our gut, and that has unleashed the whole science of microbiomics, which is going to really turn a lot of our assumptions of health and health care upside down. >> That's great, I mean, that's really transformational. So a lot of data. So I just wanted to let the audience know that we want to make this an interactive session, so I'll be asking for questions in a little bit, but I will start off with one question so that you can think about it. So I wanted to ask you, it looks like you've been thinking a lot about AI over the years. And I wanted to understand, even though AI's just really starting in health care, what are some of the new trends or the changes that you've seen in the last few years that'll impact how AI's being used going forward? >> So I'll start off. There was a paper published by a guy by the name of Tegmark at Harvard last summer that, for the first time, explained why neural networks are efficient beyond any mathematical model we predict. And the title of the paper's fun. It's called Deep Learning Versus Cheap Learning. So there were two sort of punchlines of the paper. One is that the reason that mathematics doesn't explain the efficiency of neural networks is because there's a higher order of mathematics called physics. And the physics of the underlying data structures determined how efficient you could mine those data using machine learning tools. 
Much more so than any mathematical modeling. And so the second thing that was a reveal from that paper is that the substrate of the data that you're operating on and the natural physics of those data have inherent levels of complexity that determine whether or not a 12th layer of neural net will get you where you want to go really fast, because when you do the modeling, for those math geeks in the audience, a factorial. So if there's 12 layers, there's 12 factorial permutations of different ways you could sequence the learning through those data. When you have 140 layers of a neural net, it's a much, much, much bigger number of permutations and so you end up being hardware-bound. And so, what Max Tegmark basically said is you can determine whether to do deep learning or cheap learning based upon the underlying physics of the data substrates you're operating on and have a good insight into how to optimize your hardware and software approach to that problem. >> So another way to put that is that neural networks represent the world in the way the world is sort of built. >> Exactly. >> It's kind of hierarchical. It's funny because, sort of in retrospect, like oh yeah, that kind of makes sense. But when you're thinking about it mathematically, we're like well, anything... The way a neural net can represent any mathematical function, therefore, it's fully general. And that's the way we used to look at it, right? So now we're saying, well actually decomposing the world into different types of features that are layered upon each other is actually a much more efficient, compact representation of the world, right? I think this is actually, precisely the point of kind of what you're getting at. What's really exciting now is that what we were doing before was sort of building these bespoke solutions for different kinds of data. NLP, natural language processing. 
There's a whole field, 25 plus years of people devoted to figuring out features, figuring out what structures make sense in this particular context. Those didn't carry over at all to computer vision. Didn't carry over at all to time series analysis. Now, with neural networks, we've seen it at Nervana, and now part of Intel, solving customers' problems. We apply a very similar set of techniques across all these different types of data domains and solve them. All data in the real world seems to be hierarchical. You can decompose it into this hierarchy. And it works really well. Our brains are actually general structures. As a neuroscientist, you can look at different parts of your brain and there are differences. Something that takes in visual information, versus auditory information is slightly different but they're much more similar than they are different. So there is something invariant, something very common between all of these different modalities and we're starting to learn that. And this is extremely exciting to me trying to understand the biological machine that is a computer, right? We're figuring it out, right? >> One of the really fun things that Ray Kurzweil likes to talk about is, and it falls in the genre of biomimicry, and how we actually replicate biologic evolution in our technical solutions so if you look at, and we're beginning to understand more and more how real neural nets work in our cerebral cortex. And it's sort of a pyramid structure so that the first pass of a broad base of analytics, it gets constrained to the next pass, gets constrained to the next pass, which is how information is processed in the brain.
So we're discovering increasingly that what we've been evolving towards, in terms of architectures of neural nets, is approximating the architecture of the human cortex and the more we understand the human cortex, the more insight we get to how to optimize neural nets, so when you think about it, with millions of years of evolution of how the cortex is structured, it shouldn't be a surprise that the optimization protocols, if you will, in our genetic code are profoundly efficient in how they operate. So there's a real role for looking at biologic evolutionary solutions, vis a vis technical solutions, and there's a friend of mine who worked with George Church at Harvard and actually published a book on biomimicry and they wrote the book completely in DNA so if all of you have your home DNA decoder, you can actually read the book on your DNA reader, just kidding. >> There's actually a start up I just saw in the-- >> Read-Write DNA, yeah. >> Actually it's a... He writes something. What was it? (response from crowd member) Yeah, they're basically encoding information in DNA as a storage medium. (laughter) The company, right? >> Yeah, that same friend of mine who coauthored that biomimicry book in DNA also did the estimate of the density of information storage. So a cubic centimeter of DNA can store an exabyte of data. I mean that's mind blowing. >> Naveen: Highly done soon. >> Yeah that's amazing. Also you hit upon a really important point there, that one of the things that's changed is... Well, there are two major things that have changed in my perception from let's say five to 10 years ago, when we were using machine learning. You could use data to train models and make predictions to understand complex phenomena. But they had limited utility and the challenge was that if I'm trying to build on these things, I had to do a lot of work up front. It was called feature engineering.
I had to do a lot of work to figure out what are the key attributes of that data? What are the 10 or 20 or 100 pieces of information that I should pull out of the data to feed to the model, and then the model can turn it into a predictive machine. And so, what's really exciting about the new generation of machine learning technology, and particularly deep learning, is that it can actually learn from example data those features without you having to do any preprogramming. That's why Naveen is saying you can take the same sort of overall approach and apply it to a bunch of different problems. Because you're not having to fine tune those features. So at the end of the day, the two things that have changed to really enable this evolution are access to more data, and I'd be curious to hear from you where you're seeing data come from, what are the strategies around that. So access to data, and I'm talking millions of examples. So 10,000 examples most times isn't going to cut it. But millions of examples will do it. And then, the other piece is the computing capability to actually take millions of examples and optimize this algorithm in a single lifetime. I mean, back in '91, when I started, we literally would have thousands of examples and it would take overnight to run the thing. So now in the world of millions, and you're putting together all of these combinations, the computing has changed a lot. I know you've made some revolutionary advances in that. But I'm curious about the data. Where are you seeing interesting sources of data for analytics? >> So I do some work in the genomics space and there are more viable permutations of the human genome than there are people who have ever walked the face of the earth. And the polygenic determination of a phenotypic expression translation, what our genome does to us in our physical experience in health and disease is determined by many, many genes and the interaction of many, many genes and how they are up and down regulated.
And the complexity of disambiguating which 27 genes are affecting your diabetes and how are they up and down regulated by different interventions is going to be different than his. It's going to be different than his. And we already know that there's four or five distinct genetic subtypes of type II diabetes. So physicians still think there's one disease called type II diabetes. There's actually at least four or five genetic variants that have been identified. And so, when you start thinking about disambiguating, particularly when we don't know what 95 percent of DNA does still, what actually is the underlying cause, it will require this massive capability of developing these feature vectors, sometimes intuiting it, if you will, from the data itself. And other times, taking what's known knowledge to develop some of those feature vectors, and be able to really understand the interaction of the genome and the microbiome and the phenotypic data. So the complexity is high and because the variation complexity is high, you do need these massive numbers. Now I'm going to make a very personal pitch here. So forgive me, but if any of you have any role in policy at all, let me tell you what's happening right now. The Genetic Information Nondiscrimination Act, so-called GINA, written by a friend of mine, passed a number of years ago, says that no one can be discriminated against for health insurance based upon their genomic information. That's cool. That should allow all of you to feel comfortable donating your DNA to science right? Wrong. You are 100% unprotected from discrimination for life insurance, long term care and disability. And it's being practiced legally today and there's legislation in the House, in markup right now to completely undermine the existing GINA legislation and say that whenever there's another applicable statute like HIPAA, that the GINA is irrelevant, that none of the fines and penalties are applicable at all.
So we need a ton of data to be able to operate on. We will not be getting a ton of data to operate on until we have the kind of protection we need to tell people, you can trust us. You can give us your data, you will not be subject to discrimination. And that is not the case today. And it's being further undermined. So I want to make a plea to any of you that have any policy influence to go after that because we need this data to help the understanding of human health and disease and we're not going to get it when people look behind the curtain and see that discrimination is occurring today based upon genetic information. >> Well, I don't like the idea of being discriminated against based on my DNA. Especially given how little we actually know. There's so much complexity in how these things unfold in our own bodies, that I think anything that's being done is probably childishly immature and oversimplifying. So it's pretty rough. >> I guess the translation here is that we're all unique. It's not just a Disney movie. (laughter) We really are. And I think one of the strengths that I'm seeing, kind of going back to the original point, of these new techniques is it's going across different data types. It will actually allow us to learn more about the uniqueness of the individual. It's not going to be just from one data source. We're collecting data from many different modalities. We're collecting behavioral data from wearables. We're collecting things from scans, from blood tests, from genome, from many different sources. The ability to integrate those into a unified picture, that's the important thing that we're getting toward now. That's what I think is going to be super exciting here. Think about it, right. I can tell you to visualize a coin, right? You can visualize a coin. Not only do you visualize it. You also know what it feels like. You know how heavy it is. You have a mental model of that from many different perspectives.
And if I take away one of those senses, you can still identify the coin, right? If I tell you to put your hand in your pocket, and pick out a coin, you probably can do that with 100% reliability. And that's because we have this generalized capability to build a model of something in the world. And that's what we need to do for individuals is actually take all these different data sources and come up with a model for an individual and you can actually then say what drug works best on this. What treatment works best on this? It's going to get better with time. It's not going to be perfect, because this is what a doctor does, right? A doctor who's very experienced, you're a practicing physician right? Back me up here. That's what you're doing. You basically have some categories. You're taking information from the patient when you talk with them, and you're building a mental model. And you apply what you know can work on that patient, right? >> I don't have clinic hours anymore, but I do take care of many friends and family. (laughter) >> You used to, you used to. >> I practiced for many years before I became a full-time geek. >> I thought you were a recovering geek. >> I am. (laughter) I do more policy now. >> He's off the wagon. >> I just want to take a moment and see if there's anyone from the audience who would like to ask, oh. Go ahead. >> We've got a mic here, hang on one second. >> I have tons and tons of questions. (crosstalk) Yes, so first of all, the microbiome and the genome are really complex. You already hit about that. Yet most of the studies we do are small scale and we have difficulty repeating them from study to study. How are we going to reconcile all that and what are some of the technical hurdles to get to the vision that you want? >> So primarily, it's been the cost of sequencing. Up until a year ago, it's $1000, true cost. Now it's $100, true cost. And so that barrier is going to enable fairly pervasive testing. 
It's not a real competitive market because there's one sequencer that is way ahead of everybody else. So the price is not $100 yet. The cost is below $100. So as soon as there's competition to drive the cost down, and hopefully, as soon as we all have the protection we need against discrimination, as I mentioned earlier, then we will have large enough sample sizes. And so, it is our expectation that we will be able to pool data from local sources. I chair the e-health work group at the Global Alliance for Genomics and Health which is working on this very issue. And rather than pooling all the data into a single, common repository, the strategy, and we're developing our five-year plan in a month in London, but the goal is to have a federation of essentially credentialed data enclaves. That's a formal method. HHS already does that so you can get credentialed to search all the data that Medicare has on people that's been deidentified according to HIPAA. So we want to provide the same kind of service with appropriate consent, at an international scale. And there's a lot of nations that are talking very much about data nationality so that you can't export data. So this approach of a federated model to get at data from all the countries is important. The other thing is blockchain technology is going to be very profoundly useful in this context. So David Haussler of UC Santa Cruz is right now working on a protocol using an open blockchain, public ledger, where you can put out. So for any typical cancer, you may have a half dozen, what are called somatic variants. Cancer is a genetic disease so what has mutated to cause it to behave like a cancer? And if we look at those biologically active somatic variants, publish them on a blockchain that's public, so there's not enough data there to reidentify the patient.
But if I'm a physician treating a woman with breast cancer, rather than say what's the protocol for treating a 50-year-old woman with this cell type of cancer, I can say show me all the people in the world who have had this cancer at the age of 50, with these exact six somatic variants. Find the 200 people worldwide with that. Ask them for consent through a secondary mechanism to donate everything about their medical record, pool that information of the cohort of 200 that exactly resembles the one sitting in front of me, and find out, of the 200 ways they were treated, what got the best results. And so, that's the kind of future where a distributed, federated architecture will allow us to query and obtain a very, very relevant cohort, so we can basically be treating patients like mine, sitting right in front of me. Same thing applies for establishing research cohorts. There's some very exciting stuff at the convergence of big data analytics, machine learning, and blockchain. >> And this is an area that I'm really excited about and I think we're excited about generally at Intel. We actually have something called the Collaborative Cancer Cloud, which is this kind of federated model. We have three different academic research centers. Each of them has a very sizable and valuable collection of genomic data with phenotypic annotations. So you know, pancreatic cancer, colon cancer, et cetera, and we've actually built a secure computing architecture that can allow a person who's given the right permissions by those organizations to ask a specific question of specific data without ever sharing the data. So the idea is my data's really important to me. It's valuable. I want us to be able to do a study that gets the number from the 20 pancreatic cancer patients in my cohort, up to the 80 that we have in the whole group. But I can't do that if I'm going to just spill my data all over the world. And there are HIPAA and compliance reasons for that.
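The federated idea described above, asking a specific question of several sites without the records leaving them, can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: the `Patient`, `Site`, and `federated_count` names, the variant labels, and the synthetic records are assumptions, and this is not the Collaborative Cancer Cloud or any Global Alliance protocol. A real system would also need authentication, consent handling, and protections against re-identification from small counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    """One synthetic record; real enclaves would hold far richer data."""
    cancer_type: str
    age: int
    somatic_variants: frozenset


class Site:
    """Hypothetical data enclave: records stay inside, only counts leave."""

    def __init__(self, name, patients):
        self.name = name
        self._patients = list(patients)

    def count_matches(self, cancer_type, age, variants):
        """Count local patients who carry every queried variant."""
        wanted = frozenset(variants)
        return sum(
            1
            for p in self._patients
            if p.cancer_type == cancer_type
            and p.age == age
            and wanted <= p.somatic_variants
        )


def federated_count(sites, cancer_type, age, variants):
    """Send the same question to every site and collect per-site counts."""
    return {s.name: s.count_matches(cancer_type, age, variants) for s in sites}


# Synthetic example data (made up for this sketch).
site_a = Site("A", [
    Patient("breast", 50, frozenset({"v1", "v2", "v3"})),
    Patient("breast", 50, frozenset({"v1"})),
])
site_b = Site("B", [
    Patient("breast", 50, frozenset({"v1", "v2", "v3", "v4"})),
    Patient("colon", 50, frozenset({"v1", "v2", "v3"})),
])

counts = federated_count([site_a, site_b], "breast", 50, {"v1", "v2", "v3"})
total = sum(counts.values())
print(counts, total)
```

The design choice worth noting is that `federated_count` only ever sees integers, so the requester learns the size of the matching cohort across sites while each site keeps its own records.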
There are business reasons for that. So what we've built at Intel is this platform that allows you to do different kinds of queries on this genetic data. And reach out to these different sources without sharing it. And then, the work that I'm really involved in right now and that I'm extremely excited about... This also touches on something that both of you said is it's not sufficient to just get the genome sequences. You also have to have the phenotypic data. You have to know what cancer they've had. You have to know that they've been treated with this drug and they've survived for three months or that they had this side effect. That clinical data also needs to be put together. It's owned by other organizations, right? Other hospitals. So the broader generalization of the Collaborative Cancer Cloud is something we call the data exchange. And it's a misnomer in a sense that we're not actually exchanging data. We're doing analytics on aggregated data sets without sharing it. But it really opens up a world where we can have huge populations and big enough amounts of data to actually train these models and draw the thread in. Of course, that really then hits home for the techniques that Nervana is bringing to the table, and of course-- >> Stanford's one of your academic medical centers? >> Not for that Collaborative Cancer Cloud. >> The reason I mentioned Stanford is because the reason I'm wearing this FitBit is because I'm a research subject at Mike Snyder's, the chair of genetics at Stanford, IPOP, integrative personal omics profile. So I was fully sequenced five years ago and I get four full microbiomes. My gut, my mouth, my nose, my ears. Every three months and I've done that for four years now. And about a pint of blood. And so, to your question of the density of data, so a lot of the problem with applying these techniques to health care data is that it's basically a sparse matrix and there's a lot of discontinuities in what you can find and operate on.
So what Mike is doing with the IPOP study is much the same as you described. Creating a highly dense longitudinal set of data that will help us mitigate the sparse matrix problem. (low volume response from audience member) Pardon me. >> What's that? (low volume response) (laughter) >> Right, okay. >> John: Lost the school sample. That's got to be a new one I've heard now. >> Okay, well, thank you so much. That was a great question. So I'm going to repeat this and ask if there's another question. You want to go ahead? >> Hi, thanks. So I'm a journalist and I report a lot on these neural networks, a system that's better at reading mammograms than your human radiologists. Or a system that's better at predicting which patients in the ICU will get sepsis. These sort of fascinating academic studies that I don't really see being translated very quickly into actual hospitals or clinical practice. Seems like a lot of the problems are regulatory, or liability, or human factors, but how do you get past that and really make this stuff practical? >> I think there's a few things that we can do there and I think the proof points of the technology are really important to start with in this specific space. In other places, sometimes, you can start with other things. But here, there's a real confidence problem when it comes to health care, and for good reason. We have doctors trained for many, many years. School and then residencies and other kinds of training. Because we are really, really conservative with health care. So we need to make sure that technology's well beyond just the paper, right? These papers are proof points. They get people interested. They even fuel entire grant cycles sometimes. And that's what we need to happen. It's just an inherent problem, it's going to take a while. To get those things to a point where it's like well, I really do trust what this is saying. And I really think it's okay to now start integrating that into our standard of care.
I think that's where you're seeing it. It's frustrating for all of us, believe me. I mean, like I said, I think personally one of the biggest things, I want to have an impact. Like when I go to my grave, is that we used machine learning to improve health care. We really do feel that way. But it's just not something we can do very quickly and as a business person, I don't actually look at those use cases right away because I know the cycle is just going to be longer. >> So to your point, the FDA, for about four years now, has understood that the process that has been given to them by their board of directors, otherwise known as Congress, is broken. And so they've been very actively seeking new models of regulation and what's really forcing their hand is regulation of devices and software because, in many cases, there are black box aspects of that and there's a black box aspect to machine learning. Historically, Intel and others are making inroads into providing some sort of traceability and transparency into what happens in that black box rather than say, overall we get better results but once in a while we kill somebody. Right? So there is progress being made on that front. And there's a concept that I like to use. Everyone knows Ray Kurzweil's book The Singularity Is Near? Well, I like to think that diadarity is near. And the diadarity is where you have human transparency into what goes on in the black box and so maybe Bob, you want to speak a little bit about... You mentioned that, in a prior discussion, that there's some work going on at Intel there. >> Yeah, absolutely. So we're working with a number of groups to really build tools that allow us... In fact Naveen probably can talk in even more detail than I can, but there are tools that allow us to actually interrogate machine learning and deep learning systems to understand, not only how they respond to a wide variety of situations but also where are there biases? 
I mean, one of the things that's shocking is that if you look at the clinical studies that our drug safety rules are based on, 50 year old white guys are the peak of that distribution, which I don't see any problem with that, but some of you out there might not like that if you're taking a drug. So yeah, we want to understand what are the biases in the data, right? And so, there's some new technologies. There's actually some very interesting data-generative technologies. And this is something I'm also curious what Naveen has to say about, that you can generate from small sets of observed data, much broader sets of varied data that help probe and fill in your training for some of these systems that are very data dependent. So that takes us to a place where we're going to start to see deep learning systems generating data to train other deep learning systems. And they start to sort of go back and forth and you start to have some very nice ways to, at least, expose the weakness of these underlying technologies. >> And that feeds back to your question about regulatory oversight of this. And there's the fascinating, but little known origin of why very few women are in clinical studies. Thalidomide causes birth defects. So rather than say pregnant women can't be enrolled in drug trials, they said any woman who is at risk of getting pregnant cannot be enrolled. So there was actually a scientifically meritorious argument back in the day when they really didn't know what was going to happen post-thalidomide. So it turns out that the adverse, unintended consequence of that decision was we don't have data on women and we know in certain drugs, like Xanax, that the metabolism is so much slower, that the typical dosing of Xanax for women should be less than half of that for men. And a lot of women have had very serious adverse effects by virtue of the fact that they weren't studied. So the point I want to illustrate with that is that regulatory cycles...
So people have known for a long time that was like a bad way of doing regulations. It should be changed. It's only recently getting changed in any meaningful way. So regulatory cycles and legislative cycles are incredibly slow. The rate of exponential growth in technology is exponential. And so there's impedance mismatch between the cycle time for regulation and the cycle time for innovation. And what we need to do... I'm working with the FDA. I've done four workshops with them on this very issue. Is that they recognize that they need to completely revitalize their process. They're very interested in doing it. They're not resisting it. People think, oh, they're bad, the FDA, they're resisting. Trust me, there's nobody on the planet who wants to revise these review processes more than the FDA itself. And so they're looking at models and what I recommended is global crowdsourcing and the FDA could shift from a regulatory role to one of doing two things, assuring the people who do their reviews are competent, and assuring that their conflicts of interest are managed, because if you don't have a conflict of interest in this very interconnected space, you probably don't know enough to be a reviewer. So there has to be a way to manage the conflict of interest and I think those are some of the key points that the FDA is wrestling with because there's type one and type two errors. If you underregulate, you end up with another thalidomide and people born without fingers. If you overregulate, you prevent life saving drugs from coming to market. So striking that balance across all these different technologies is extraordinarily difficult. If it were easy, the FDA would've done it four years ago. It's very complicated. >> Jumping on that question, so all three of you are in some ways entrepreneurs, right? Within your organization or started companies.
And I think it would be good to talk a little bit about the business opportunity here, where there's a huge ecosystem in health care, different segments, biotech, pharma, insurance payers, etc. Where do you see is the ripe opportunity or industry, ready to really take this on and to make AI the competitive advantage. >> Well, the last question also included why aren't you using the result of the sepsis detection? We do. There were six or seven published ways of doing it. We did our own data, looked at it, we found a way that was superior to all the published methods and we apply that today, so we are actually using that technology to change clinical outcomes. As far as where the opportunities are... So it's interesting. Because if you look at what's going to be here in three years, we're not going to be using those big data analytics models for sepsis that we are deploying today, because we're just going to be getting a tiny aliquot of blood, looking for the DNA or RNA of any potential infection and we won't have to infer that there's a bacterial infection from all these other ancillary, secondary phenomenon. We'll see if the DNA's in the blood. So things are changing so fast that the opportunities that people need to look for are what are generalizable and sustainable kind of wins that are going to lead to a revenue cycle that are justified, a venture capital world investing. So there's a lot of interesting opportunities in the space. But I think some of the biggest opportunities relate to what Bob has talked about in bringing many different disparate data sources together and really looking for things that are not comprehensible in the human brain or in traditional analytic models. >> I think we also got to look a little bit beyond direct care. We're talking about policy and how we set up standards, these kinds of things. That's one area. That's going to drive innovation forward. I completely agree with that. Direct care is one piece. 
How do we scale out many of the knowledge kinds of things that are embedded into one person's head and get them out to the world, democratize that. Then there's also development. The underlying technologies of medicine, right? Pharmaceuticals. The traditional way that pharmaceuticals are developed is actually kind of funny, right? A lot of it was started just by chance. Penicillin, a very famous story right? It's not that different today unfortunately, right? It's conceptually very similar. Now we've got more science behind it. We talk about domains and interactions, these kinds of things but fundamentally, the problem is what we in computer science called NP-hard, it's too difficult to model. You can't solve it analytically. And this is true for all these kinds of natural sorts of problems by the way. And so there's a whole field around this, molecular dynamics and modeling these sorts of things, that are actually being driven forward by these AI techniques. Because it turns out, our brain doesn't do magic. It actually doesn't solve these problems. It approximates them very well. And experience allows you to approximate them better and better. Actually, it goes a little bit to what you were saying before. It's like simulations and forming your own networks and training off each other. There are these emerging dynamics. You can simulate steps of physics. And you come up with a system that's much too complicated to ever solve. Three pool balls on a table is one such system. It seems pretty simple. You know how to model that, but it actually turns out you can't predict where a ball's going to be once you inject some energy into that table. So something that simple is already too complex. So neural network techniques actually allow us to start making those tractable. These NP-hard problems. And things like molecular dynamics and actually understanding how different medications and genetics will interact with each other is something we're seeing today.
And so I think there's a huge opportunity there. We've actually worked with customers in this space. And I'm seeing it. Like Roche is acquiring a few different companies in this space. They really want to drive it forward, using big data to drive drug development. It's kind of counterintuitive. I never would've thought it had I not seen it myself. >> And there's a big related challenge. Because in personalized medicine, there's smaller and smaller cohorts of people who will benefit from a drug that still takes two billion dollars on average to develop. That is unsustainable. So there's an economic imperative of overcoming the cost and the cycle time for drug development. >> I want to take a go at this question a little bit differently, thinking about not so much where are the industry segments that can benefit from AI, but what are the kinds of applications that I think are most impactful. So if this is what a skilled surgeon needs to know at a particular time to care properly for a patient, this is where most, this area here, is where most surgeons are. They are close to the maximum knowledge and ability to assimilate as they can be. So it's possible to build complex AI that can pick up on that one little thing and move them up to here. But it's not a gigantic accelerator, amplifier of their capability. But think about other actors in health care. I mentioned a couple of them earlier. Who do you think the least trained actor in health care is? >> John: Patients. >> Yes, the patients. The patients are really very poorly trained, including me. I'm abysmal at figuring out who to call and where to go. >> Naveen: You know as much as the doctor right? (laughing) >> Yeah, that's right. >> My doctor friends always hate that. Know your diagnosis, right? >> Yeah, Dr. Google knows.
So the opportunities that I see that are really, really exciting are when you take an AI agent, like sometimes I like to call it contextually intelligent agent, or a CIA, and apply it to a problem where a patient has a complex future ahead of them that they need help navigating. And you use the AI to help them work through. Post operative. You've got PT. You've got drugs. You've got to be looking for side effects. An agent can actually help you navigate. It's like your own personal GPS for health care. So it's giving you the information that you need about you for your care. That's my definition of Precision Medicine. And it can include genomics, of course. But it's much bigger. It's that broader picture and I think that a sort of agent way of thinking about things and filling in the gaps where there's less training and more opportunity, is very exciting. >> Great start up idea right there by the way. >> Oh yes, right. We'll meet you all out back for the next start up. >> I had a conversation with the head of the American Association of Medical Specialties just a couple of days ago. And what she was saying, and I'm aware of this phenomenon, but all of the medical specialists are saying, you're killing us with these stupid board recertification trivia tests that you're giving us. So if you're a cardiologist, you have to remember something that happens in one in 10 million people, right? And they're saying that's irrelevant anymore, because we've got advanced decision support coming. We have these kinds of analytics coming. Precisely what you're saying. So it's human augmentation of decision support that is coming at blazing speed towards health care. So in that context, it's much more important that you have a basic foundation, you know how to think, you know how to learn, and you know where to look. So we're going to be human-augmented learning systems much more so than in the past. And so the whole recertification process is being revised right now.
(inaudible audience member speaking) Speak up, yeah. (person speaking) >> What makes it fathomable is that you can-- (audience member interjects inaudibly) >> Sure. She was saying that our brain is really complex and large and even our brains don't know how our brains work, so... are there ways to-- >> What hope do we have kind of thing? (laughter) >> It's a metaphysical question. >> It circles all the way down, exactly. It's a great quote. I mean basically, you can decompose every system. Every complicated system can be decomposed into simpler, emergent properties. You lose something perhaps with each of those, but you get enough to actually understand most of the behavior. And that's really how we understand the world. And that's what we've learned in the last few years what neural network techniques can allow us to do. And that's why our brain can understand our brain. (laughing) >> Yeah, I'd recommend reading Chris Farley's last book because he addresses that issue in there very elegantly. >> Yeah we're seeing some really interesting technologies emerging right now where neural network systems are actually connecting other neural network systems in networks. You can see some very compelling behavior because one of the things I like to distinguish AI versus traditional analytics is we used to have question-answering systems. I used to query a database and create a report to find out how many widgets I sold. Then I started using regression or machine learning to classify complex situations from this is one of these and that's one of those. And then as we've moved more recently, we've got these AI-like capabilities like being able to recognize that there's a kitty in the photograph. But if you think about it, if I were to show you a photograph that happened to have a cat in it, and I said, what's the answer, you'd look at me like, what are you talking about? I have to know the question. 
So where we're cresting with these connected sets of neural systems, and with AI in general, is that the systems are starting to be able to, from the context, understand what the question is. Why would I be asking about this picture? I'm a marketing guy, and I'm curious about what Legos are in the thing or what kind of cat it is. So it's being able to ask a question, and then take these question-answering systems, and actually apply them so that's this ability to understand context and ask questions that we're starting to see emerge from these more complex hierarchical neural systems. >> There's a person dying to ask a question. >> Sorry. You have hit on several different topics that all coalesce together. You mentioned personalized models. You mentioned AI agents that could help you as you're going through a transitionary period. You mentioned data sources, especially across long time periods. Who today has access to enough data to make meaningful progress on that, not just when you're dealing with an issue, but day-to-day improvement of your life and your health? >> Go ahead, great question. >> That was a great question. And I don't think we have a good answer to it. (laughter) I'm sure John does. Well, I think every large healthcare organization and various healthcare consortiums are working very hard to achieve that goal. The problem remains in creating semantic interoperability. So I spent a lot of my career working on semantic interoperability. And the problem is that if you don't have well-defined, or self-defined data, and if you don't have well-defined and documented metadata, and you start operating on it, it's real easy to reach false conclusions and I can give you a classic example. It's well known, with hundreds of studies looking at when you give an antibiotic before surgery and how effective it is in preventing a post-op infection. Simple question, right? 
So most of the literature done prospectively was done in institutions where they had small sample sizes. So if you pool that, you get a little bit more noise, but you get a more confirming answer. What was done at a very large, not my own, but a very large institution... I won't name them for obvious reasons, but they pooled lots of data from lots of different hospitals, where the data definitions and the metadata were different. Two examples. When did they indicate the antibiotic was given? Was it when it was ordered, dispensed from the pharmacy, delivered to the floor, brought to the bedside, put in the IV, or the IV starts flowing? Different hospitals used a different metric of when it started. When did surgery occur? When they were wheeled into the OR, when they were prepped and draped, when the first incision occurred? All different. And they concluded quite dramatically that it didn't matter when you gave the pre-op antibiotic and whether or not you get a post-op infection. And everybody who was intimate with the prior studies just completely ignored and discounted that study. It was wrong. And it was wrong because of the lack of commonality and the normalization of data definitions and metadata definitions. So because of that, this problem is much more challenging than you would think. If it were so easy as to put all these data together and operate on it, normalize and operate on it, we would've done that a long time ago. It's... Semantic interoperability remains a big problem and we have a lot of heavy lifting ahead of us. I'm working with the Global Alliance for Genomics and Health, for example. There's like 30 different major ontologies for how you represent genetic information. And different institutions are using different ones in different ways in different versions over different periods of time. That's a mess. >> Are all those issues applicable when you're talking about a personalized data set versus a population? 
>> Well, so N of 1 studies and single-subject research is an emerging field of statistics. So there's some really interesting new models like stepped-wedge analytics for doing that on small sample sizes, recruiting people asynchronously. There's single-subject research statistics. You compare yourself with yourself at a different point in time, in a different context. So there are emerging statistics to do that and as long as you use the same sensor, you won't have a problem. But people are changing their remote sensors and you're getting different data. It's measured in different ways with different sensors at different normalization and different calibration. So yes. It even persists in the N of 1 environment. >> Yeah, you have to get started with a large N that you can apply to the N of 1. I'm actually going to attack your question from a different perspective. So who has the data? The millions of examples to train a deep learning system from scratch. It's a very limited set right now. Technology such as the Collaborative Cancer Cloud and The Data Exchange are definitely impacting that and creating larger and larger sets of critical mass. And again, notwithstanding the very challenging semantic interoperability questions. But there's another opportunity. Kay asked about what's changed recently. One of the things that's changed in deep learning is that we now have modules that have been trained on massive data sets that are actually very smart at certain kinds of problems. So, for instance, you can go online and find deep learning systems that actually can recognize, better than humans, whether there's a cat, dog, motorcycle, house, in a photograph. >> From Intel, open source. >> Yes, from Intel, open source. So here's what happens next. Because most of that deep learning system is very expressive. That combinatorial mixture of features that Naveen was talking about, when you have all these layers, there's a lot of features there. 
They're actually very general to images, not just finding cats, dogs, trees. So what happens is you can do something called transfer learning, where you take a small or modest data set and actually reoptimize it for your specific problem very, very quickly. And so we're starting to see a place where you can... On one end of the spectrum, we're getting access to the computing capabilities and the data to build these incredibly expressive deep learning systems. And over here on the right, we're able to start using those deep learning systems to solve custom versions of problems. Just last weekend or two weekends ago, in 20 minutes, I was able to take one of those general systems and create one that could recognize all different kinds of flowers. Very subtle distinctions, that I would never be able to know on my own. But I happen to be able to get the data set and literally, it took 20 minutes and I have this vision system that I could now use for a specific problem. I think that's incredibly profound and I think we're going to see this spectrum of wherever you are in your ability to get data and to define problems and to put hardware in place to see really neat customizations and a proliferation of applications of this kind of technology. >> So one other trend I think, I'm very hopeful about it... So this is a hard problem clearly, right? I mean, getting data together, formatting it from many different sources, it's one of these things that's probably never going to happen perfectly. But one trend I think that is extremely hopeful to me is the fact that the cost of gathering data has precipitously dropped. Building that thing is almost free these days. I can write software and put it on 100 million cell phones in an instant. You couldn't do that five years ago even, right? And so, the amount of information we can gain from a cell phone today has gone up. We have more sensors. We're bringing online more sensors. 
People have Apple Watches and they're sending blood data back to the phone, so once we can actually start gathering more data and do it cheaper and cheaper, it actually doesn't matter where the data is. I can write my own app. I can gather that data and I can start driving the correct inferences or useful inferences back to you. So that is a positive trend I think here and personally, I think that's how we're going to solve it, is by gathering from that many different sources cheaply. >> Hi, my name is Pete. I've very much enjoyed the conversation so far but I was hoping perhaps to bring a little bit more focus into Precision Medicine and ask two questions. Number one, how have you applied the AI technologies that are emerging so rapidly to your natural language processing? I'm particularly interested in, if you look at things like Amazon Echo or Siri, or the other voice recognition systems that are based on AI, they've just become incredibly accurate and I'm interested in specifics about how I might use technology like that in medicine. So where would I find a medical nomenclature and perhaps some reference to a back end that works that way? And the second thing is, what specifically is Intel doing, or making available? You mentioned some open source stuff on cats and dogs and stuff but I'm the doc, so I'm looking at the medical side of that. What are you guys providing that would allow us who are kind of geeks on the software side, as well as being docs, to experiment a little bit more thoroughly with AI technology? Google has a free AI toolkit. Several other people have come out with free AI toolkits in order to accelerate that. There's special hardware now with graphics, and different processors, hitting amazing speeds. And so I was wondering, where do I go in Intel to find some of those tools and perhaps learn a bit about the fantastic work that you guys are already doing at Kaiser? 
>> Let me take that first part and then we'll be able to talk about the MD part. So in terms of technology, this is what's extremely exciting now about what Intel is focusing on. We're providing those pieces. So you can actually assemble and build the application. How you build that application specific for MDs and the use cases is up to you or the one who's filling out the application. But we're going to power that technology from multiple perspectives. So Intel is already the main force behind the data center, right? Cloud computing, all this is already Intel. We're making that extremely amenable to AI and setting the standard for AI in the future, so we can do that from a number of different mechanisms. For somebody who wants to develop an application quickly, we have hosted solutions. Intel Nervana is kind of the brand for these kinds of things. Hosted solutions will get you going very quickly. Once you get to a certain level of scale, where costs start making more sense, things can be brought on premises. We're supplying that. We're also supplying software that makes that transition essentially free. Then taking those solutions that you develop in the cloud, or develop in the data center, and actually deploying them on device. You want to write something on your smartphone or PC or whatever. We're actually providing those hooks as well, so we want to make it very easy for developers to take these pieces and actually build solutions out of them quickly so you probably don't even care what hardware it's running on. You're like here's my data set, this is what I want to do. Train it, make it work. Go fast. Make my developers efficient. That's all you care about, right? And that's what we're doing. We're taking it from that point at how do we best do that? We're going to provide those technologies. In the next couple of years, there's going to be a lot of new stuff coming from Intel. >> Do you want to talk about AI Academy as well? 
>> Yeah, that's a great segue there. In addition to this, we have an entire set of tutorials and other online resources and things we're going to be bringing into the academic world for people to get going quickly. So that's not just enabling them on our tools, but also just general concepts. What is a neural network? How does it work? How does it train? All of these things are available now and we've made a nice, digestible class format that you can actually go and play with. >> Let me give a couple of quick answers in addition to the great answers already. So you're asking why can't we use medical terminology and do what Alexa does? Well, no, you may not be aware of this, but Andrew Ng, who was the AI guy at Google, who was recruited by Baidu, they have a medical chat bot in China today. I don't speak Chinese. I haven't been able to use it yet. There are two similar initiatives in this country that I know of. There's probably a dozen more in stealth mode. But Lumiata and HealthTap are doing chat bots for health care today, using medical terminology. You have the compound problem of semantic normalization within language, compounded across languages. I've done a lot of work with an international organization called SNOMED, which translates medical terminology. So you're aware of that. We can talk offline if you want, because I'm pretty deep into the semantic space. >> Go google Intel Nervana and you'll see all the websites there. It's intel.com/ai or nervanasys.com. >> Okay, great. Well this has been fantastic. I want to, first of all, thank all the people here for coming and asking great questions. I also want to thank our fantastic panelists today. (applause) >> Thanks, everyone. >> Thank you. >> And lastly, I just want to share one bit of information. We will have more discussions on AI next Tuesday at 9:30 AM. Diane Bryant, who is our general manager of the Data Center Group, will be here to do a keynote. So I hope you all get to join that. 
Thanks for coming. (applause) (light electronic music)