Brett McMillen, AWS | AWS re:Invent 2020
>> From around the globe, it's theCUBE, with digital coverage of AWS re:Invent 2020, sponsored by Intel and AWS. >> Welcome back to theCUBE's coverage of AWS re:Invent 2020. I'm Lisa Martin. Joining me next is one of our CUBE alumni, Brett McMillen, back as the Director of US Federal for AWS. Brett, it's great to see you, and I'm glad that you're safe and well. >> Great, it's great to be back. I think last year when we did theCUBE, we were on the convention floor. It feels very different this year here at re:Invent. It's gone virtual, and yet it's still true to what re:Invent has always been: it's a learning conference, and we're releasing a lot of new products and services for our customers. >> Yes, a lot of content, as you say. One thing that's different about this re:Invent is that it's so quiet around us. Normally we're talking loudly over tens of thousands of people on the showroom floor, but it's great that AWS is still able to connect in an even bigger way with its customers. During Theresa Carlson's keynote, she talked about the AWS Open Data Sponsorship Program, and that you're going to be hosting the National Institutes of Health's Sequence Read Archive data. The biologist in me gets really excited about that. Talk to us about it, because especially during the global health crisis that we're in, that sounds really promising. >> It very much is, and I am so happy that we're working with NIH on this and multiple other initiatives. The Sequence Read Archive, or SRA, is essentially a very large data set of sequenced genomic data, and it's a wide variety of genomic data. It's not just human genetic data; all life forms, all branches of life, are in the SRA, including viruses, and that's really important here during the pandemic. It's one of the largest and oldest sequenced genomic data sets out there, and yet it's very modern: it has been designed for next-generation sequencing. So it's growing, it's modern, and it's well used. It's one of the more important data sets out there. One of the reasons this is so important is that to find cures for human ailments, disease, and death, scientists study the genomic code and come up with the answers. And that's what Amazon is doing: we're putting the tools in the hands of the scientists so that they can help cure heart disease, diabetes, cancer, depression, and yes, even viruses that can cause pandemics. >> So making this data available to those scientists worldwide is incredibly important. Talk to us about that. >> Yeah, it is. Within NIH, we're working with NCBI; when you're dealing with NIH, there are a lot of acronyms, and at NIH that's the National Center for Biotechnology Information. We're working with them to make this available as an open data set. Why this is important is that it's all about increasing the speed of scientific discovery. I personally think that in the fullness of time, scientists will come up with cures for just about all of the human ailments that are out there, and it's our job at AWS to put into the hands of the scientists the tools they need to make things happen quickly, in our lifetime. I'm really excited to be working with NIH on that. 
When we start talking about it, there are multiple things the scientists need. One is access to these data sets, and the SRA itself.
When we start talking about it, there's multiple things. The scientists needs. One is access to these data sets and SRA. >>It's a very large data set. It's 45 petabytes and it's growing. I personally believe that it's going to double every year, year and a half. So it's a very large data set and it's hard to move that data around. It's so much easier if you just go into the cloud, compute against it and do your research there in the cloud. And so it's super important. 45 petabytes, give you an idea if it were all human data, that's equivalent to have a seven and a half million people or put another way 90% of everybody living in New York city. So that's how big this is. But then also what AWS is doing is we're bringing compute. So in the cloud, you can scale up your compute, scale it down, and then kind of the third they're. The third leg of the tool of the stool is giving the scientists easy access to the specialized tool sets they need. >>And we're doing that in a few different ways. One that the people would design these toolsets design a lot of them on AWS, but then we also make them available through something called AWS marketplace. So they can just go into marketplace, get a catalog, go in there and say, I want to launch this resolve work and launches the infrastructure underneath. And it speeds the ability for those scientists to come up with the cures that they need. So SRA is stored in Amazon S3, which is a very popular object store, not just in the scientific community, but virtually every industry uses S3. And by making this available on these public data sets, we're giving the scientists the ability to speed up their research. >>One of the things that Springs jumps out to me too, is it's in addition to enabling them to speed up research, it's also facilitating collaboration globally because now you've got the cloud to drive all of this, which allows researchers and completely different parts of the world to be working together almost in real time. So I can imagine the incredible power that this is going to, to provide to that community. So I have to ask you though, you talked about this being all life forms, including viruses COVID-19, what are some of the things that you think we can see? I expect this to facilitate. Yeah. >>So earlier in the year we took the, um, uh, genetic code or NIH took the genetic code and they, um, put it in an SRA like format and that's now available on AWS and, and here's, what's great about it is that you can now make it so anybody in the world can go to this open data set and start doing their research. One of our goals here is build back to a democratization of research. So it used to be that, um, get, for example, the very first, um, vaccine that came out was a small part. It's a vaccine that was done by our rural country doctor using essentially test tubes in a microscope. It's gotten hard to do that because data sets are so large, you need so much computer by using the power of the cloud. We've really democratized it and now anybody can do it. So for example, um, with the SRE data set that was done by NIH, um, organizations like the university of British Columbia, their, um, cloud innovation center is, um, doing research. And so what they've done is they've scanned, they, um, SRA database think about it. They scanned out 11 million entries for, uh, coronavirus sequencing. And that's really hard to do in a typical on-premise data center. Who's relatively easy to do on AWS. 
So by making this available, we can have a larger number of scientists working on the problems that we need to have solved. >> Well, as we all know, in the US, Operation Warp Speed, that term alone really signifies how quickly we all need this to be progressing forward. But this is not the first partnership that AWS has had with NIH. Talk to me about some of the other things you're doing together. >> We've been working with NIH for a very long time. Back in 2012, we worked with NIH on what was called the 1000 Genomes data set. This is another really important data set, a large number of sequenced human genomes, and we moved that into, again, an open data set on AWS. What's happened in the last eight years is that many scientists have been able to compute on it, and the wonderful power of the cloud is that over time we continue to bring out tools that make it easier for people to work. So whether they're computing using our instance types in EC2, or doing some high-performance computing using EMR, Elastic MapReduce, they can do that. And then we've brought out new things that really take it to the next level, like Amazon SageMaker, which makes it really easy for the scientists to launch machine learning algorithms on AWS. So we've done the 1000 Genomes data set, and there are a number of other areas within NIH that we've been working on. For example, over at the National Cancer Institute, we've been providing expert guidance on best practices for how you can architect and work on these COVID-related workloads. NIH does things in collaboration with many different universities, over 2,500 academic institutions, and they do that through grants. So we've been working with the Office of the Director, and they run their grant management applications on AWS, which allows them to scale up and work very efficiently. And then we entered with NIH into this program called STRIDES. STRIDES is a program for, not only NIH, but also all these other institutions that work with NIH, to use the power of the commercial cloud for scientific discovery. And when we started that back in July of 2018, long before COVID happened, it was so great that we had it up and running, because now we're able to help them out through the STRIDES program. >> Right. Can you imagine if... let's not even go there, I was going to say. Okay, so the SRA data is available through the AWS Open Data Sponsorship Program, and you talked about STRIDES. What are some of the other ways that AWS is assisting? >> So STRIDES is wide-ranging across multiple different institutes. For example, over at the National Heart, Lung, and Blood Institute, NHLBI, I said there are a lot of acronyms, they've been working on harmonizing genomic data, and working with the University of Michigan, they've been analyzing it through a program that they call TOPMed. We've also been working with NIH on establishing best practices and making sure everything is secure, so we've been providing AWS Professional Services to show them how to do this.
So one portion of STRIDES is getting the right data sets, the right compute, and the right tools into the hands of the scientists. The other area we've been working on is making sure the scientists know how to use it. We've been developing these cloud learning pathways, we started this quite a while back, and it's been so helpful here during COVID. Scientists can now go on and do self-paced online courses, which has really helped during the pandemic, and they can learn how to maximize their use of cloud technologies through these pathways that we've developed for them. >> Well, education is imperative. You think about all of the knowledge they have within their scientific discipline, and being able to leverage technology in a way that's easy is absolutely imperative given the timing. So let's talk about other data sets. You've got the SRA available; what other data sets are available through this program? >> We have a wide range of data sets that we're making available as open data sets, and in general these data sets are improving the human condition or improving the world in which we live. I've talked about a few things; there are a few more. For example, there's The Cancer Genome Atlas, which we've been working on with the National Cancer Institute as well as the National Human Genome Research Institute. That's a very important data set being computed against throughout the world; within the scientific community, that data set is commonly called TCGA. Then we also have data sets that are focused on certain groups. For example, Kids First is a data set looking at a lot of the challenges and diseases that kids get, everything from very rare pediatric cancers to heart defects, et cetera. >> So we're working with them, but it's not just on the medical side. We have open data sets with, for example, NOAA, the National Oceanic and Atmospheric Administration, to better understand what's happening with climate change and to slow the rate of climate change. Within the Department of the Interior, they have a Landsat database of pictures of the Earth, so we can better understand the world we live in. Similarly, NASA has a lot of data that we put out there, and over in the Department of Energy there are data sets that scientists are researching against to make sure that we have better clean, renewable energy sources. But it's not just government agencies that we work with when we find a data set that's important. >> We also work with nonprofit organizations. Nonprofits are not flush with cash, and they're trying to make every dollar work. So we've worked with organizations like the Child Mind Institute or the Allen Institute for Brain Science, and these are largely neuroimaging data sets that we made available via our open data program. So there's a wide range of things we're doing, and what's great about it is that when we do it, you democratize science and you allow many, many more scientists to work on these problems that are so critical for us.
>> The availability is incredible, but also the breadth and depth of what you just spoke about; it's not just government. For example... you've got about 30 seconds left, so I'm going to ask you to summarize some of the announcements that you think are really critical for federal customers to be paying attention to from re:Invent 2020. >> Yeah. One of the things these federal government customers have been coming to us about is that they've had to find new ways to communicate with their customers, with the public. We have a product that we've had for a while called Amazon Connect, and it's been used very extensively by government customers, and it's used in industry too. We've had a number of announcements this week: Andy Jassy made multiple announcements on enhancements to Amazon Connect and additional services, everything from helping to verify that it's the right person with Amazon Connect Voice ID, to making sure customers get a good experience with Connect Wisdom, to making sure that the managers of these call centers can manage them better. I'm really excited that we're putting into the hands of both government and industry a cloud-based solution to make their connections to the public better. >> It's all about connections these days. I wish we had more time, because I know we could unpack so much more with you, but thank you for joining me on theCUBE today and sharing some of the insights, impacts, and availability that AWS is enabling for the scientific and other federal communities. It's incredibly important, and we appreciate your time. >> Thank you, Lisa. >> For Brett McMillen, I'm Lisa Martin. You're watching theCUBE's coverage of AWS re:Invent 2020.
SUMMARY :
Brett McMillen, Director of US Federal at AWS, joins Lisa Martin for theCUBE's coverage of a virtual AWS re:Invent 2020 to discuss the AWS Open Data Sponsorship Program and AWS's work with NIH. AWS is hosting the NIH Sequence Read Archive (SRA), a 45-petabyte and growing data set of sequenced genomic data covering all branches of life, including viruses, so that scientists worldwide can compute against it in the cloud instead of moving it. The conversation covers earlier collaborations such as the 1000 Genomes data set and the STRIDES program, additional open data sets from agencies like NOAA, the Department of the Interior, NASA, and the Department of Energy, nonprofit partners like the Child Mind Institute and the Allen Institute for Brain Science, and the new Amazon Connect capabilities announced at re:Invent for government and industry contact centers.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
NIH | ORGANIZATION | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Brett McMillan | PERSON | 0.99+ |
Brett McMillen | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
NASA | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
July of 2018 | DATE | 0.99+ |
2012 | DATE | 0.99+ |
Theresa Carlson | PERSON | 0.99+ |
Jasmine | PERSON | 0.99+ |
Lisa | PERSON | 0.99+ |
90% | QUANTITY | 0.99+ |
New York | LOCATION | 0.99+ |
Allen Institute | ORGANIZATION | 0.99+ |
SRA | ORGANIZATION | 0.99+ |
last year | DATE | 0.99+ |
Breton McMillan | PERSON | 0.99+ |
NCBI | ORGANIZATION | 0.99+ |
45 petabytes | QUANTITY | 0.99+ |
SRE | ORGANIZATION | 0.99+ |
seven and a half million people | QUANTITY | 0.99+ |
third leg | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
Intel | ORGANIZATION | 0.99+ |
earth | LOCATION | 0.99+ |
over 2,500 | QUANTITY | 0.99+ |
SRA | TITLE | 0.99+ |
S3 | TITLE | 0.98+ |
pandemic | EVENT | 0.98+ |
first partnership | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
child mind Institute | ORGANIZATION | 0.98+ |
U S | LOCATION | 0.98+ |
this year | DATE | 0.98+ |
pandemics | EVENT | 0.98+ |
national cancer Institute | ORGANIZATION | 0.98+ |
both | QUANTITY | 0.98+ |
national heart lung and blood Institute | ORGANIZATION | 0.98+ |
NOAA | ORGANIZATION | 0.97+ |
national human genomic research Institute | ORGANIZATION | 0.97+ |
today | DATE | 0.97+ |
Landsat | ORGANIZATION | 0.96+ |
first | QUANTITY | 0.96+ |
11 million entries | QUANTITY | 0.96+ |
about 30 seconds | QUANTITY | 0.95+ |
year and a half | QUANTITY | 0.94+ |
AWS connect | ORGANIZATION | 0.93+ |
university of British Columbia | ORGANIZATION | 0.92+ |
COVID | EVENT | 0.91+ |
COVID-19 | OTHER | 0.91+ |
over tens of thousands of people | QUANTITY | 0.91+ |
Swami Sivasubramanian, AWS | AWS Summit Online 2020
>> Narrator: From theCUBE Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation. >> Hello everyone, welcome to this special CUBE interview. We are here at theCUBE Virtual covering AWS Summit Online. These are the Summits that Amazon normally does all around the world; they're doing them virtually now. We are here with the Palo Alto COVID-19 quarantine crew getting all the interviews, with a special guest, Vice President of Machine Learning, Swami, a CUBE alumni who's been involved in not only machine learning but all of the major activity around AWS and how machine learning has evolved, and all the services around machine learning workflows, from Transcribe to Rekognition, you name it. Swami, you've been at the helm for many years, and we've also chatted about that before. Welcome to the virtual CUBE covering AWS Summit. >> Hey, pleasure to be here, John. >> Great to see you. I know times are tough. Everything okay at Amazon? You guys are certainly cloud scale, and not too unfamiliar with working remotely. You do a lot of travel, but what's it like for you guys right now? >> We're actually doing well. We are working hard to make sure we continue to serve our customers, even while working remotely. We had taken measures to prepare, and we are confident that we will be able to meet customer demand for capacity during this time. We're also helping customers react quickly and nimbly to the current challenges; there are various examples of amazing startups working in this area reorganizing themselves to serve customers. We can talk about that in a moment. >> Large scale, you guys have done a great job, and it's been fun watching and chronicling the journey of AWS as it now goes to a whole 'nother level. In the post-pandemic world we're expecting even more surge in everything from VPNs to workspaces, you name it, and all these workloads are going to be under a lot of pressure to deliver more and more value. You've been at the heart of one of the key areas, which is the tooling and the scale around machine learning workflows, and this is where customers are really trying to figure out: what are the adequate tools? How do my teams effectively deploy machine learning? Because now, more than ever, the data is going to start flowing in as the virtualization, if you will, of life is happening. We're going to be in a hybrid world; we're going to be online most of the time. And I think COVID-19 has proven that on this new trajectory of virtualization and virtual work, applications are going to have to flex, adjust, scale, and be reinvented. This is a key thing. What's going on with machine learning, what's new? Tell us what you guys are doing right now. >> Yeah, in AWS we offer the broadest (poor audio capture obscures speech) all the way from expert practitioners, where we offer frameworks and infrastructure-layer support for all popular frameworks, like TensorFlow, Apache MXNet, and PyTorch, (poor audio capture obscures speech) custom chips like Inferentia.
And then, for aspiring ML developers who want to build their own custom machine learning models, we offer SageMaker, which is our end-to-end machine learning service that makes it easy for customers to build, train, tune, and debug machine learning models. It is one of our fastest-growing machine learning services, and many startups and enterprises are starting to standardize their machine learning building on it. And then, the final tier is geared towards application developers who don't want to go into model-building and just want an easy API to build capabilities like transcription, voice recognition, and so forth. And I wanted to talk about one of the new capabilities we are about to launch, an enterprise search service called Kendra, and-- >> So actually, just from a news standpoint, that's GA now, that's being announced at the Summit. >> Yeah. >> That was a big hit at re:Invent, Kendra. >> Yeah. >> A lot of buzz! It's available. >> Yep, so I'm excited to say that Kendra, our new machine learning powered, highly accurate enterprise search service, has been made generally available. If you look at what Kendra is, we have actually reimagined the traditional enterprise search service, which has historically been an underserved market segment, so to speak. The public web search front is a relatively well-served area, whereas enterprise search has been an area where data in the enterprise sits in a huge number of data silos, spread across file systems, SharePoint, Salesforce, and various other places. With a traditional search index, even simple questions like when does the IT desk open, or what is the security policy, have historically been hard for people to answer within an enterprise, let alone in a materials science company like 3M, which was trying to enable collaboration among researchers spread across the world to search their experiment archives and so forth. It has been super hard for them to do these things, and this is one of those areas where Kendra enables something new. Kendra is a deep learning powered search service for enterprises, which breaks down data silos and collects data across various sources, all the way from S3, or a file system, or SharePoint, and various other data sources, and uses state-of-the-art NLP techniques to index them. Then you can query using natural language queries, such as when does my IT desk open, and the answer, it won't just give you a bunch of random links, right? It'll tell you it opens at 8:30 a.m. >> Yeah. >> Or, what is the credit card cashback return for my corporate credit card? It won't give you a long list of links related to it; instead, it'll give you the answer: 2%. So it's that accurate. (poor audio capture obscures speech) >> People who have been in the enterprise search or data business know how hard this is. It's been a super hard problem in the old guard models, because databases were limited to schemas and whatnot. Now, you have a data-driven world, and this becomes interesting.
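As a rough sketch of the kind of natural-language query Swami describes, here is what asking Kendra a question might look like with the AWS SDK for Python. The index ID is a placeholder; the call shapes (query, IndexId, QueryText, ResultItems) follow the Kendra API, but treat this as an illustration rather than a verified, production-ready integration.

```python
import boto3

kendra = boto3.client("kendra", region_name="us-east-1")

# Placeholder index ID; a real index is created in the Kendra console or API.
INDEX_ID = "00000000-0000-0000-0000-000000000000"

response = kendra.query(
    IndexId=INDEX_ID,
    QueryText="When does the IT help desk open?",
)

# Kendra returns typed results: suggested answers, FAQ matches, and documents.
for item in response["ResultItems"]:
    if item["Type"] == "ANSWER":
        print("Suggested answer:", item["DocumentExcerpt"]["Text"])
    elif item["Type"] == "DOCUMENT":
        print("Document:", item["DocumentTitle"]["Text"])
```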
I think the big takeaway I took away from Kendra was not only the new kind of discovery and navigation that's possible, in terms of low latency and getting relevant content, but really the under-the-covers impact. I'd like to get your perspective on this, because it has been an active conversation inside the community at cloud scale: data silos have been a problem. People have built these data silos, and they talk about breaking them down, but it's really hard; there are legacy problems and applications that are tied to them. How do I break my silos down, or how do I leverage those silos? So I think you guys really solve a problem here around data silos and scale. >> Yeah. >> So talk about the data silos, and then I'm going to follow up and get your take on the size of the data, megabytes, petabytes. Talk about data silos and the scale behind it. >> Perfect. If you look at how to set up something like a Kendra search cluster, it's as simple as going to your AWS Management Console, where you'll be able to point Kendra to various data sources, such as Amazon S3, SharePoint, Salesforce, and various others, and say, this is the data I want to index. Kendra automatically pulls in this data, indexes it using its deep learning and NLP models, and then automatically builds a corpus. Then I, as a user of the search index, can start querying it using natural language and don't have to worry where it comes from. Kendra takes care of things like access control, and it uses finely tuned machine learning algorithms under the hood to understand the context of a natural language query and return the most relevant results. I'll give a real-world example of some of the customers who are using Kendra. If you take a look at 3M, 3M is using Kendra to support its materials science R&D by enabling natural language search of their expansive repositories of past research documents that may be relevant to a new product. Imagine what this does for a company like 3M. Instead of researchers spread around the world repeating the same experiments on materials research over and over again, their engineers and researchers can quickly search through documents and innovate faster, instead of literally trying to reinvent the wheel all the time. So it accelerates getting to market. Even in the situation we're in now, one interesting piece of work you might be interested in is that the Semantic Scholar team at the Allen Institute for AI recently opened up a repository of scientific research called the COVID-19 Open Research Dataset. These are expert research articles. (poor audio capture obscures speech) They indexed it using Kendra, and it helps scientists, academics, and technologists quickly find information in a sea of scientific literature. So you can even ask questions like, "Hey, how different is convalescent plasma treatment compared to a vaccine?" and Kendra automatically understands the context and gets the summary answer to these questions for the customers. This is one of the things where, when we talk about breaking down the data silos, Kendra takes care of getting the data and putting it in a central location, understanding the context behind each of these documents, and then being able to quickly answer customers' queries using simple natural language as well.
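To make the setup workflow Swami walks through a bit more concrete, here is a minimal sketch of creating a Kendra index and pointing it at an S3 bucket programmatically. The bucket name and IAM role ARNs are placeholders you would create beforehand; the calls (create_index, create_data_source, start_data_source_sync_job) follow the Kendra API, but this is an illustrative outline, not a complete, hardened configuration.

```python
import boto3

kendra = boto3.client("kendra", region_name="us-east-1")

# Placeholder IAM roles: one that lets the index write logs and metrics, and
# one that lets the data source read the S3 bucket.
INDEX_ROLE_ARN = "arn:aws:iam::123456789012:role/KendraIndexRole"
DATA_SOURCE_ROLE_ARN = "arn:aws:iam::123456789012:role/KendraS3DataSourceRole"
BUCKET_NAME = "example-research-documents"

# 1. Create the index. Index creation is asynchronous; in practice you wait
#    until the index status is ACTIVE before attaching data sources.
index = kendra.create_index(Name="research-search", RoleArn=INDEX_ROLE_ARN)
index_id = index["Id"]

# 2. Attach an S3 data source so Kendra knows what to crawl and index.
data_source = kendra.create_data_source(
    IndexId=index_id,
    Name="research-docs-s3",
    Type="S3",
    RoleArn=DATA_SOURCE_ROLE_ARN,
    Configuration={"S3Configuration": {"BucketName": BUCKET_NAME}},
)

# 3. Kick off a sync job; Kendra pulls the documents, applies its NLP models,
#    and builds the searchable corpus.
kendra.start_data_source_sync_job(Id=data_source["Id"], IndexId=index_id)
```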
>> So what's the scale? Talk about the scale behind this. What are the scale numbers? What are you guys seeing? You always do a good job of running a great announcement and then following up with general availability, which means I know you've got customers using it. What are we talking about in terms of scale? Petabytes? Can you give some insight into the kind of data scale you're talking about here? >> The nice thing about Kendra is that it is easily, linearly scalable. I, as a developer, can keep adding more and more data, and it linearly scales to whatever scale our customers want. That is one of the underpinnings of the Kendra search engine. This is where even customers like PricewaterhouseCoopers are using Kendra to power their regulatory application, to help customers search through regulatory information quickly and easily. So instead of sifting through hundreds of pages of documents manually to answer certain questions, Kendra now allows them to answer natural language questions. I'll give another example, which speaks to the scale. Baker Tilly, a leading advisory, tax, and assurance firm, is using Kendra to index documents. Compared to a traditional SharePoint-based full-text search, they are now using Kendra to quickly search product manuals and so forth, and they're able to get answers up to 10x faster. Look at the kind of impact Kendra has: being able to index a vast amount of data in a linearly scalable fashion, keep adding data on the order of terabytes and keep going, and being able to search 10x faster than a traditional keyword-search-based algorithm is actually a big deal for these customers. They're very excited. >> So what is the main problem that you're solving with Kendra? What's the use case? If I'm the customer, what's my problem that you're solving? Is it just responsiveness to data, whether it's a call center or support, or is it an app? What's the main focus you came out with? What was the vector of the problem you're solving here? >> When we talked to customers before we started building Kendra, one of the things that constantly came back to us was that they wanted the same ease of use and ability they have searching the world wide web when searching within an enterprise. It can be in the form of an internal search, to search HR documents or internal wiki pages and so forth, or it can be to search internal technical documentation or the public documentation to help the contact centers, or the external search for customer support and so forth, or to enable collaboration by sharing knowledge bases and so forth. So we really dissected each of these: why is this a problem? Why is it not being solved by traditional search techniques? One of the things that became obvious was that unlike the external world, where web pages are linked easily with a very well-defined structure, the internal world within an enterprise is very messy. The documents are put in SharePoint, or in a file system, or in a storage service like S3, or in other stores like Box, or various other places. And what customers really wanted was a system that knows how to pull the data from these various data silos, understand the access controls behind them, and enforce them in the search.
And then, understand the real data behind it, and not just do simple keyword search, so that we can build a remarkable search service that really answers queries in natural language. This has been the premise of Kendra, and this is what has started to resonate with our customers. I talked about some of the other examples, even in areas like contact centers. For instance, Magellan Health is using Kendra for its contact centers, so they are able to seamlessly tie member-, provider-, or client-specific information together with other healthcare insights for their agents so that they can quickly resolve the call. It can be used internally, or for things like external search as well. So, very satisfied clients. >> So you guys took the basic concept of discovery and navigation, which is the consumer web, find what you're looking for as fast as possible, but also took advantage of building intelligence around understanding all the nuances, configuration, schemas, and access under the covers, allowing things to be discovered in a new way. So you basically make data discoverable, and then provide an interface >> Yeah. >> for discovery and navigation. So it's a broad use case, then. >> Right, yeah, that sounds about right, except we did one thing more. We didn't just do discovery and make it easy for people to find the information; they are sifting through terabytes or hundreds of terabytes of internal documentation, and sometimes throwing hundreds of links to these documents at them is not good enough. For instance, if I'm trying to find out what the ALS marker is in a healthcare setting, for a particular research project, then I don't want to sift through thousands of links. Instead, I want to be able to correctly pinpoint which document contains the answer to it. So that is the final element: really understanding the context behind each and every document using natural language processing techniques, so that you not only discover the information that is relevant, but you also get highly accurate, precise answers to some of your questions. >> Well, that's great stuff, big fan. I really liked the announcement of Kendra. Congratulations on the GA of that. We'll make some room on our CUBE Virtual site for your team to put more Kendra information up. I think it's fascinating, and I think it's going to be the beginning of how the world changes, certainly with voice activation and API-based applications integrating this in. I just see a ton of activity; this is going to have a lot of headroom. So appreciate that. The other thing I want to get to while I have you here is the news around augmented artificial intelligence, which has been brought out as well. >> Yeah. >> So the GA of that is out. You guys are GA-ing everything, which is right on track with your cadence of AWS launches, I'd say. What is this about? Give us the headline story. What's the main thing to pay attention to with the GA? What have you learned? What's the learning curve, what are the results? >> So the augmented artificial intelligence service, we call it A2I, the Amazon A2I service, we have made generally available. It is a very unique service that makes it easy for developers to augment human intelligence with machine learning predictions, and this has historically been a very challenging problem.
Let me take a step back and explain the general idea behind it. If you look at any developer building a machine learning application, there are use cases where even 99% accuracy in machine learning is not going to be good enough to directly use that result as the response back to the customer. Instead, you want to be able to augment that with human intelligence: hey, if my machine learning model is returning a prediction whose confidence is less than 70%, I would like it to be augmented with human intelligence. A2I makes it super easy for developers to use a human reviewer workflow that comes in between. I can send it either to the public pool using Mechanical Turk, where we have more than 500,000 Turkers, or I can use a private workforce or a vendor workforce. A2I seamlessly integrates with our Textract, Rekognition, or SageMaker custom models. So now, for instance, the NHS has integrated A2I with Textract, and they are building these document processing workflows. In the areas where the machine learning model's confidence is not as high, they are able to augment that with their human reviewer workflows, so that they can build highly accurate document processing workflows as well. This, we think, is a powerful capability. >> This really gets to what I've been feeling in some of the stuff we've worked with you guys on, on our machine learning piece. It's hard for companies to hire machine learning people; that has been a real challenge. So I like this idea of human augmentation, because humans and machines have to have that relationship, and if you build good abstraction layers and abstract away the complexity, which is what you guys do, and that's the vision of cloud, then you're going to need to have that relationship solidified. So at what point do you think we're going to be ready for theCUBE team, or any customer that can't find a machine learning person, or may not want to pay the wages that are required? It's hard to find a machine learning engineer. And when does the data science piece come in, with visualization, the spectrum from pure computer science, math, and machine learning guru to full end-user productivity? Machine learning is where you guys are doing a lot of work. Can you share your opinion on that evolution, of where we are on that? Because people want to get to the point where they don't have to hire machine learning folks >> Yeah. >> and have that kind of support too. >> If you look at the history of technology, I have always believed that many of these highly disruptive technologies started out available only to experts, and then they quickly go through the cycles where they become almost commonplace. I'll give an example with something totally outside the IT space: let's take photography. More than probably 150 years ago, the first professional camera was invented, and it took like three to four years to actually be able to take a really good picture, and there were only very few expert photographers in the world. Fast forward to where we are now: even my five-year-old daughter takes very good portraits and gives them as a gift to her mom for Mother's Day. If you look at Instagram, everyone is a professional photographer. I think the same thing is about to happen in machine learning too.
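Going back to the A2I workflow Swami describes above, here is a minimal sketch of what routing low-confidence predictions to human reviewers might look like when pairing Amazon Textract with A2I. The flow definition ARN, bucket, and document are placeholders (a flow definition tying a worker task template to a workforce is created separately), and the 70% threshold is an illustrative choice, not a prescribed value.

```python
import boto3

textract = boto3.client("textract", region_name="us-east-1")

# Placeholders: an existing S3 document and a pre-created A2I flow definition.
DOCUMENT = {"S3Object": {"Bucket": "example-doc-bucket", "Name": "claim-form.png"}}
FLOW_DEFINITION_ARN = (
    "arn:aws:sagemaker:us-east-1:123456789012:flow-definition/doc-review"
)

# Analyze the document; A2I starts a human review loop automatically when the
# activation conditions configured on the flow definition are met.
response = textract.analyze_document(
    Document=DOCUMENT,
    FeatureTypes=["FORMS"],
    HumanLoopConfig={
        "HumanLoopName": "claim-form-review-001",
        "FlowDefinitionArn": FLOW_DEFINITION_ARN,
    },
)

# As an additional application-level check, flag low-confidence form fields
# for follow-up; 70 here is an illustrative threshold.
for block in response["Blocks"]:
    if block["BlockType"] == "KEY_VALUE_SET" and block.get("Confidence", 100) < 70:
        print("Low-confidence field, review suggested:", block["Id"])
```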
Compared to 2012, when there were very few deep learning experts who could really build these amazing applications, we are now starting to see tens of thousands of customers using machine learning in production on AWS, not just proofs of concept but in production, and this number is rapidly growing. I'll give one example. Internally at Amazon, to help our entire company transform and make machine learning a natural part of the business, six years ago we started a Machine Learning University, and since then we have been training all our engineers with machine learning courses in this ML University. A year ago, we made this coursework available through our Training and Certification platform in AWS, and within 48 hours, more than 100,000 people registered. Think about it, that's a big all-time record. That's why I always like to believe that developers are eager to learn; they're very hungry to pick up new technology, and I wouldn't be surprised if, four or five years from now, machine learning becomes a normal feature of the app, the same way databases are, and it becomes less special. If that day happens, then I would see it as my job being done. >> Well, you've got a lot more work to do, because I know from the conversations I've been having around this COVID-19 pandemic that there's general consensus and validation that the future got pulled forward, and what used to be an inside-industry conversation we used to have around machine learning and some of the visions you're talking about has been accelerated by the pace of the new cloud scale. Now that people recognize virtual and are experiencing it firsthand globally, there is going to be an acceleration of applications. So we believe there's going to be a Cambrian explosion of new applications that have to reimagine and reinvent some of the plumbing and abstractions in cloud to deliver new experiences, because the expectations have changed. And I think one of the things we're seeing is that machine learning combined with cloud scale will create a whole new trajectory, a Cambrian explosion of applications. So this has kind of been validated. What's your reaction to that? Do you see something similar? What are some of the things you're seeing as we come into this world, this virtualization of our lives? It's every vertical; it's not one vertical anymore that's maybe moving faster. I think everyone sees the impact, and they see where the gaps are in this new reality. What are your thoughts? >> Yeah, if you look at the history of machine learning, specifically around deep learning, the technology is really not new; the early deep learning papers were written almost 30 years ago. So why didn't we see deep learning take off sooner? It is because, historically, deep learning technologies have been hungry for compute resources and hungry for huge amounts of data, and the abstractions were not easy enough. As you rightfully pointed out, the cloud has come in and made it super easy to get access to huge amounts of compute and huge amounts of data, and you can literally pay by the hour or by the minute.
And with new tools like SageMaker and all the AI services we are talking about now being made available to developers, there is an explosion of easy-to-use options, and we are starting to see a huge amount of innovation pop up. Unlike traditional disruptive technologies, which you usually see take hold in one or two industry segments, then cross the chasm and go mainstream, with machine learning we are starting to see traction in almost every industry segment: in the financial sector, fintech companies like Intuit are using it to forecast call center volume and for personalization; in the healthcare sector, companies like Aidoc are using computer vision to assist radiologists; and in the public sector, NASA has partnered with AWS to use machine learning for anomaly detection, with algorithms to detect solar flares in space. Examples are plenty. It is because machine learning has become so commonplace that almost every industry segment and every CIO is already looking at how they can reimagine and reinvent their customer experience with machine learning, in the same way Amazon asked itself eight or ten years ago. So, very exciting. >> Well, you guys continue to do the work, and I agree it's not just machine learning by itself, it's the integration and the perfect storm of elements that have come together at this time. Although it's pretty disastrous, I think ultimately we're going to come out of this on a whole 'nother trajectory. Creativity will emerge. You're going to start seeing builders thinking, "Okay, hey, I've got to get out there. I can deliver, solve the gaps we've exposed, solve the problems, create new expectations and new experiences." I think it's going to be great for software developers, I think it's going to change the computer science field, and it's really bringing in the lifestyle aspect of things. Applications have to have a recognition of this convergence, this virtualization of life. >> Yeah. >> The applications are going to have to have that. And remember, virtualization helped Amazon form the cloud. Maybe we'll get some new kinds of virtualization, Swami. (laughs) Thanks for coming on, really appreciate it. Always great to see you. Thanks for taking the time. >> Okay, great to see you too, John. Thank you, thanks again. >> We're with Swami, the Vice President of Machine Learning at AWS, a CUBE alumni who has been on before, sharing his insights around this virtualization and this online event, the AWS Summit, which we're covering with the Virtual CUBE. As we go forward, more important than ever, the data is going to be important: searching it, finding it, and more importantly, having humans use it to build applications. So theCUBE coverage continues for AWS Summit Virtual Online. I'm John Furrier, thanks for watching. (enlightening music)
SUMMARY :
John Furrier interviews Swami Sivasubramanian, Vice President of Machine Learning at AWS, for theCUBE's coverage of AWS Summit Online 2020. Swami walks through AWS's machine learning stack, from framework and infrastructure support for expert practitioners, to SageMaker for developers building custom models, to AI services for application developers, and announces the general availability of Amazon Kendra, a deep learning powered enterprise search service that indexes data across silos like S3, SharePoint, and Salesforce and answers natural language questions, with customers including 3M, PricewaterhouseCoopers, Baker Tilly, Magellan Health, and the Allen Institute for AI's COVID-19 Open Research Dataset. He also covers the general availability of Amazon A2I for adding human review to low-confidence machine learning predictions, and closes with his view that machine learning, like photography before it, is rapidly becoming commonplace across every industry.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
NASA | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
John | PERSON | 0.99+ |
Swami | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
2012 | DATE | 0.99+ |
John Furrier | PERSON | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
Boston | LOCATION | 0.99+ |
99% | QUANTITY | 0.99+ |
three | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
Kendra | ORGANIZATION | 0.99+ |
Aidoc | ORGANIZATION | 0.99+ |
2% | QUANTITY | 0.99+ |
hundreds of pages | QUANTITY | 0.99+ |
Swami Sivasubramanian | PERSON | 0.99+ |
four years | QUANTITY | 0.99+ |
less than 70% | QUANTITY | 0.99+ |
thousands of links | QUANTITY | 0.99+ |
S3 | TITLE | 0.99+ |
10x | QUANTITY | 0.99+ |
more than 100,000 people | QUANTITY | 0.99+ |
CUBE | ORGANIZATION | 0.99+ |
Intuit | ORGANIZATION | 0.99+ |
Mother's Day | EVENT | 0.99+ |
3M | ORGANIZATION | 0.99+ |
six years ago | DATE | 0.99+ |
SharePoint | TITLE | 0.99+ |
Magellan Health | ORGANIZATION | 0.99+ |
hundreds of links | QUANTITY | 0.98+ |
eight | DATE | 0.98+ |
a year ago | DATE | 0.98+ |
each | QUANTITY | 0.98+ |
8:30 a.m. | DATE | 0.98+ |
48 hours | QUANTITY | 0.98+ |
Mechanical Turk | ORGANIZATION | 0.98+ |
PricewaterhouseCoopers | ORGANIZATION | 0.98+ |
one example | QUANTITY | 0.98+ |
Textract | TITLE | 0.97+ |
Amazon Summit | EVENT | 0.97+ |
five-year-old | QUANTITY | 0.97+ |
Salesforce | TITLE | 0.97+ |
ML University | ORGANIZATION | 0.97+ |
hundreds of terabytes | QUANTITY | 0.97+ |
Allen Institute for AI | ORGANIZATION | 0.97+ |
first professional camera | QUANTITY | 0.96+ |
COVID-19 pandemic | EVENT | 0.96+ |
A2I | TITLE | 0.96+ |
One | QUANTITY | 0.95+ |
COVID-19 | OTHER | 0.95+ |
Machine Learning University | ORGANIZATION | 0.95+ |
GA | LOCATION | 0.94+ |
ORGANIZATION | 0.94+ | |
pandemic | EVENT | 0.93+ |
theCUBE Studios | ORGANIZATION | 0.93+ |
COVID | TITLE | 0.93+ |
Baker Tilly | ORGANIZATION | 0.92+ |
AWS Summit | EVENT | 0.92+ |