Hui Xue, National Heart, Lung, and Blood Institute | DockerCon Live 2020
>> Narrator: From around the globe, it's theCUBE, with digital coverage of DockerCon Live 2020. Brought to you by Docker and its ecosystem partners. >> Hi, I'm Stu Miniman, and welcome to theCUBE's coverage of DockerCon Live 2020. Really excited to be part of this online event. We've been involved with DockerCon for a long time, and of course one of my favorite things is always to be able to talk to the practitioners. Of course we remember, for years, Docker exploded onto the marketplace; millions of people downloaded it and used it. So joining me is Hui Xue, who is a Principal Deputy Director of Medical Signal Processing at the National Heart, Lung, and Blood Institute, which is part of the National Institutes of Health. Hui, thank you so much for joining us. >> Thank you for inviting me. >> So let's start. Of course, the name of your institute is very specific; I think anyone in the United States knows the NIH. Tell us a little bit about your role there and the scope of what your team covers. >> So I'm basically a researcher and developer of medical imaging technology. We are the heart, lung, and blood institute, so we work on and focus on imaging the heart. What we exactly do is develop new and novel imaging technology and deploy it to the front line of our clinical work, and Docker plays an essential role in that process. So, yeah, that's what we do at NHLBI. >> Okay, excellent. So research, you know, of course in the medical field, with the global pandemic, gets a lot of attention. So you keyed it up there. Let's understand: where do containerization and Docker specifically play into the work that your team is doing? >> So maybe I'd like to give an example, which will suffice. For example, we're working on magnetic resonance imaging, MRI. Many of us may already have been scanned. So we're using MRI to image the heart. What Docker does is allow us to deploy our imaging technology to the clinical hospital.
So we have a global deployment at around 40 hospitals, a bit more, around the world. If we, for example, develop a new AI-based image analysis for heart images, what we do with Docker is put our model and software into the Docker image, so that our collaboration sites can pull the software that contains the latest technology and then use it for the patients, of course under the research agreement with NIH. Because Docker is so efficient and available globally, we can actually implement continuous integration and testing, and update the framework based on Docker. Then our collaborators always have the latest technology, instead of, you know, the traditional situation in medical imaging, where the iteration of technology is pretty slow. But with the latest technology, like Docker containers, coming into the field (it's actually relatively new; in the past two to three years this whole paradigm has been changing), it's certainly very exciting to us. It gives us flexibility we never had before to reach our customers, to reach other people in the world to help them. They also help us, so it's a very good experience to have. >> Yeah, that's pretty powerful, what you're talking about there, rather than, you know, we install some equipment, who knows how often things get updated, how do you make sure to synchronize between different locations. Obviously the medical field is highly regulated, and you're a government agency; talk a little bit about how you make sure you have the right version control, that security is in place. How do all of those things sort out? >> Yes, that's an essential question. So firstly I want to clarify one thing: it's not NIH who endorses Docker, it's us as researchers. We practice Docker, and we trust its performance. This container technology is efficient, it's globally available, and it's very secure. So all the communication between the container and the imaging equipment is encrypted.
We also have all the paperwork in place to allow us to provide technology to our clinicians. When we post the latest software, every version we put up into Docker goes through an automated integration test system. So every time we make a change, the newer version of the software runs through a rigorous test: something like 200 gigabytes of data runs through it, and we check that everything is still working. So the basic principle is that we don't allow any version of the software to be delivered to a customer without passing the tests in Docker. This container technology is actually automating 100% of this process, which gives us a lot of freedom, so we have a very small team here at NIH. Many people are actually very impressed by how many customers we support with so small a team. The key reason is that we strongly utilize container technology; its automation is unparalleled, certainly much better than anything I had before using containers. So that's actually the key to maintaining the quality and the continuous service to our customers. >> Yeah, absolutely. Automation is something we've been talking about in the industry for a long time, but if we implement it properly it can have a huge impact. Can you bring us inside a little bit: what tools are you using? How is that automation set up and managed? And how does that fit into the Docker environment? >> Let me describe it more specifically. So we are using a continuous testing framework. There are several options; we are using a specific one to build on, which is an open source Python tool, rather small actually. What it does is set up a service, and this service will watch, for example, our GitHub repo.
Whenever I make a change, or someone in the team makes a change, for example fixing a bug, adding a new feature, or maybe updating a new AI model, we push the change to GitHub. Then the continuous building system will notice it and trigger the integration tests, which all run inside the Docker environment. So this is the key. What container technology offers is that we can have a 100% reproducible runtime environment for our customers, as the software provider, because in our particular use case we don't set customers up with uniform hardware. They buy their own servers around the world, so everyone may have slightly different hardware, and we don't want that to get into our software experience. So Docker actually offers us 100% control of the runtime environment, which is very essential if we want to deliver a consistent medical imaging experience, because most applications are rather computationally intensive; we don't want something to run for, say, one minute at one site and maybe three minutes at another site. So what Docker does is run all the integration tests; if everything passes, it packs the Docker image and sends it to Docker Hub. Then all our collaborators around the world have the new image, and we coordinate with them so they can find a proper time to update; then they have the newer technology in time. So that's why Docker is such a useful tool for us. >> Yeah, absolutely. Okay, containerization and Docker really transformed the way a lot of those computational solutions happen. I'm wondering if you can explain a little bit more of the stack that you're using, for people that might not have looked at solutions for a couple of years and think, oh, it's containers, it's stateless architectures, I'm not sure how it fits into my network environment. Can you tell us what you are doing for the storage and the network?
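The push-triggered flow described above (a change lands on GitHub, the integration tests run inside Docker, and only a passing build reaches Docker Hub) can be sketched as below. This is a hedged illustration rather than the team's actual tooling: the image name and test command are hypothetical placeholders, and a real setup would be driven by the Python CI service watching the repository.

```python
import subprocess

# Hypothetical values; the real image name and test entry point would differ.
IMAGE = "example/recon-app:latest"
TEST_CMD = ["python", "-m", "pytest", "/opt/tests"]

def build_pipeline(image=IMAGE, test_cmd=TEST_CMD):
    """Return the docker commands a minimal CI run would execute, in order:
    build the image, run the integration tests inside it, push on success."""
    return [
        ["docker", "build", "-t", image, "."],
        ["docker", "run", "--rm", image] + test_cmd,
        ["docker", "push", image],
    ]

def run_pipeline(dry_run=True):
    for cmd in build_pipeline():
        if dry_run:
            print(" ".join(cmd))  # show what would run, without invoking docker
        else:
            # check=True aborts on the first failing step, so a failed test
            # run means the image is never pushed to the registry.
            subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Because the push step only runs after the test step succeeds, no untested image can reach the registry, which is the gating principle described in the interview.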
>> So we actually have a rather vertical integration in this medical imaging application. We build our own service as the software; its backbone is C++ for higher computational efficiency, and there's lots of Python, because these days AI models are essential. What Docker provides, as I mentioned, is this uniform, always-the-same runtime environment, so we have a fixed GCC version and, if we want to go into that detail, specific versions of the numerical libraries and certain versions of Python. We use PyTorch a lot, so that's our AI backbone. Another way we use Docker is that we deploy the same container into the Microsoft Azure cloud. That's another capability I found in Docker: we never need to change anything in our software development process, and the same container we deliver works everywhere, on the cloud and on site, for our customers. This actually reduces the development cost and also improves our efficiency a lot. Another important aspect is that this improves customer acceptance a lot, because we can go to one customer and tell them the software they are running is actually running on 30 other sites exactly the same, down to, let's say, the hashes; it's bit-by-bit consistent. This actually helps us convince many people. Every time I describe this process, I think most people accept the idea. They actually appreciate the way we deliver software to them, because we can always fall back. So yes, here is another aspect: we have many Docker images in Docker Hub, so if one deployment fails, they can easily fall back. That's actually very important for medical imaging applications, because hospitals need to maintain their continuous level of service.
So even though we want to avoid this completely, yes, occasionally, very occasionally, there will be some function not working, or some new test case never covered before; then we have them fall back. That's actually also our policy, and it's offered by the container technology. >> Yeah, absolutely. You brought up that, as many have said, the container is that atomic unit, a building block, with portability around any platform environment. What about container orchestration? How are you managing these environments you talked about, in the public cloud or in different environments? What are you doing for container orchestration? >> Actually, our setup might be the simplest case. So we basically have a private Docker repo, which we pay for; actually, the Institute has paid. We have something like 50 or 100 private repos, and for every repo we have one specific Docker setup with different software versions: for example, some images are for PyTorch, others for TensorFlow, depending on the application. Maybe some customer has the requirement for a rather small Docker image size; then they get a trimmed-down version of the image. In this process, because it's still a small number, like 20 or 30 active repos, we are actually managing it semi-automatically. So we have the service running to push, pull, and load back images, but we configure this process here at the Institute whenever we feel we have something new to offer to the customer. Managing these Docker images is actually another aspect of medical imaging. So on the customer side, we had a lot of discussion with them about whether we want to set up continuous automated updates, but in the end they decided they'd better have customers involved; better to have some people. So where we finally stopped is that we notify the customer when there is something new to update, then they decide when to update and how to test. So this is another aspect.
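The fallback policy described above (every released version stays in the private registry, so a site that hits a problem simply re-runs the previous tag) comes down to picking the image published just before the current one. A minimal sketch, with hypothetical tag names:

```python
def pick_fallback(tags, current):
    """Given image tags in publication order (oldest first), return the tag
    to fall back to when `current` misbehaves: the one released before it."""
    idx = tags.index(current)
    if idx == 0:
        raise ValueError("no earlier image to fall back to")
    return tags[idx - 1]

# Hypothetical tag history; a real site would list the tags in its private repo.
published = ["v1.0", "v1.1", "v1.2"]
print(pick_fallback(published, "v1.2"))  # -> v1.1
```

Because registries keep every pushed tag, the rollback itself is just pulling and running the earlier image; no rebuild is needed at the hospital.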
Even though we have a very high level of automation using the container technology, we found it's not 100%. At some sites, it's still better to have human supervision to help, because if the goal is to maintain 100% continuous service, then in the end they need some experts in the field to test and verify. So that's the current stage of deployment of these Docker images. We found it's rather lightweight, so even with a few people on our team at NIH, we can manage a rather large network globally; it's really exciting for us. >> Excellent. Great. I guess final question: give us a little bit of a roadmap. You've already talked about leveraging AI in there, the various pieces. What are you looking for from Docker and the ecosystem, and for your solution, for the rest of the year? >> I would say the future definitely is on the cloud. One major direction we are trying to push is for the clinical hospitals to link to and use the cloud as a routine. In the current status, some sites and hospitals may be very conservative; they are afraid of the security, the connection, all kinds of issues related to the cloud. But this scenario is changing rapidly, especially as container technology contributes a lot on the cloud: it makes the whole thing so easy, so reliable. So our next push is to move lots of the applications into the cloud only. The model will be, for example, that we have new AI applications that may be only available on the cloud. If some customer wants to use them, they will have to be willing to connect to the cloud, maybe sending data there and receiving, for example, the AI results from our running Docker image in the cloud. What we need to do is make the Docker builds even more efficient and make the computation 100% stable, so we can utilize the huge computational power in the cloud. Also the price: the key here is the price.
So if we have one setup in the cloud, a data center for example (we currently maintain two data centers, one in Europe, another in the United States), and 50 hospitals are using it every day, then we run the numbers: the average price comes to a few dollars per patient. So if we consider the costs in the medical health care system, the cost of using cloud computing can be truly trivial, but what we can offer to patients and doctors has never been possible before. The computation the cloud can bring is something they never saw before and never experienced. So I believe that's the future. In the old model, everyone has his own computational server, and maintaining that costs a lot of work. Even if Docker makes the software aspects much easier, the hardware, someone still needs to set up. But using the cloud will change all of that. So I think the future is definitely to fully utilize the cloud with the container technology. >> Excellent. Well, we thank you so much. I know everyone appreciates the work your team's doing, and absolutely, if things can be done to allow scalability and lower cost per patient, that would be a huge benefit. Thank you so much for joining us. >> Thank you. >> All right, stay tuned for lots more coverage from theCUBE at DockerCon Live 2020. I'm Stu Miniman, and thank you for watching theCUBE. (gentle music)
Brett McMillen, AWS | AWS re:Invent 2020
>> From around the globe, it's theCUBE, with digital coverage of AWS re:Invent 2020, sponsored by Intel and AWS. >> Welcome back to theCUBE's coverage of AWS re:Invent 2020. I'm Lisa Martin. Joining me next is one of our cube alumni: Brett McMillen is back, the director of US Federal for AWS. Brett, it's great to see you; glad that you're safe and well. >> Great. It's great to be back. I think last year when we did theCUBE, we were on the convention floor. It feels very different this year here at re:Invent, since it's gone virtual, and yet it's still true to how re:Invent has always been: it's a learning conference, and we're releasing a lot of new products and services for our customers. >> Yes, a lot of content. As you say, one of the things that's different about this re:Invent is that it's so quiet around us; normally we're talking loudly over tens of thousands of people on the showroom floor. But it's great that AWS is still able to connect, in actually an even bigger way, with its customers. So during Theresa Carlson's keynote, and I want to get your opinion on this, she talked about the AWS Open Data Sponsorship Program, and that you guys are going to be hosting the National Institutes of Health (NIH) Sequence Read Archive data. The biologist in me gets really excited about that. Talk to us about that, because especially during the global health crisis that we're in, that sounds really promising. >> It very much is. I am so happy that we're working with NIH on this and multiple other initiatives. So the Sequence Read Archive, or SRA: essentially what it is, is a very large data set of sequenced genomic data, and a wide variety of genomic data. It's not just human genomic data; all life forms, all branches of life, are in the SRA, including viruses. And that's really important here during the pandemic.
It's one of the largest and oldest sequenced genomic data sets out there, and yet it's very modern: it has been designed for next-generation sequencing. So it's growing, it's modern, and it's well used; it's one of the more important ones that's out there. One of the reasons this is so important is that we want to find cures for human ailments, disease, and death, and by studying the genomic code, the scientists can come up with the answers. And that's what Amazon is doing: we're putting in the hands of the scientists the tools so that they can help cure heart disease and diabetes and cancer and depression, and yes, even viruses that can cause pandemics. >> So making this data available to those scientists worldwide is incredibly important. Talk to us about that. >> Yeah, it is. And so within NIH, we're working with the NCBI (when you're dealing with NIH, there are a lot of acronyms); that's the National Center for Biotechnology Information. And so we're working with them to make this available as an open data set. Why this is important is that it's all about increasing the speed of scientific discovery. I personally think that in the fullness of time, the scientists will come up with cures for just about all of the human ailments that are out there. And it's our job at AWS to put into the hands of the scientists the tools they need to make things happen quickly, in our lifetime. And I'm really excited to be working with NIH on that. When we start talking about it, there are multiple things the scientists need. One is access to these data sets, and SRA is a very large data set: it's 45 petabytes and it's growing. I personally believe that it's going to double every year, year and a half. So it's a very large data set, and it's hard to move that data around.
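That difficulty of moving the data is exactly what the open-data model avoids: instead of downloading petabytes, researchers read objects in place. Open data sets on S3 are world-readable, so an object can be fetched anonymously over plain HTTPS. A sketch of how such a request URL is formed; the bucket name, region, and key here are hypothetical placeholders, not the real SRA bucket, whose name is listed in the AWS Open Data registry:

```python
def public_url(bucket, key, region="us-east-1"):
    """Build the anonymous HTTPS URL for an object in a public S3 bucket.
    No AWS account or credentials are needed for world-readable open data."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"

# Hypothetical bucket and key, for illustration only.
print(public_url("example-open-data", "sra/SRR000001/SRR000001.sra"))
```

In practice a researcher would more often compute against the bucket from within the same AWS region, so the data never crosses the public internet at all.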
It's so much easier if you just go into the cloud, compute against it, and do your research there in the cloud. And so it's super important. 45 petabytes: to give you an idea, if it were all human data, that's equivalent to about seven and a half million people, or, put another way, 90% of everybody living in New York City. So that's how big this is. But then also what AWS is doing is we're bringing compute: in the cloud, you can scale up your compute and scale it down. And then the third leg of the stool is giving the scientists easy access to the specialized toolsets they need. We're doing that in a few different ways. The people who design these toolsets designed a lot of them on AWS, but we also make them available through something called AWS Marketplace, so they can just go into Marketplace, get a catalog, go in there and say, I want to launch this workload, and it launches the infrastructure underneath. And it speeds the ability of those scientists to come up with the cures they need. So SRA is stored in Amazon S3, which is a very popular object store, not just in the scientific community; virtually every industry uses S3. And by making this available in these public data sets, we're giving the scientists the ability to speed up their research. >> One of the things that jumps out at me too is that, in addition to enabling them to speed up research, it's also facilitating collaboration globally, because now you've got the cloud to drive all of this, which allows researchers in completely different parts of the world to be working together almost in real time. So I can imagine the incredible power that this is going to provide to that community. I have to ask you, though: you talked about this being all life forms, including viruses, COVID-19. What are some of the things that you expect this to facilitate?
>> So earlier in the year, NIH took the genetic code and put it in an SRA-like format, and that's now available on AWS. And here's what's great about it: anybody in the world can go to this open data set and start doing their research. One of our goals here is to bring back a democratization of research. For example, the very first vaccine that came out, for smallpox, was done by a rural country doctor using essentially test tubes and a microscope. It's gotten hard to do that, because data sets are so large and you need so much compute; by using the power of the cloud, we've really democratized it, and now anybody can do it. So for example, with the SRA data set from NIH, organizations like the University of British Columbia, through their cloud innovation center, are doing research. What they've done is scan the SRA database. Think about it: they scanned 11 million entries for coronavirus sequencing. That's really hard to do in a typical on-premises data center, but it's relatively easy to do on AWS. So by making this available, we can have a larger number of scientists working on the problems that we need to have solved. >> Well, as we all know, in the US, Operation Warp Speed, that 'warp speed' alone, really signifies how quickly we all need this to be progressing forward. But this is not the first partnership that AWS has had with the NIH. Talk to me about some of the other things you're doing together. >> We've been working with NIH for a very long time. Back in 2012, we worked with NIH on what was called the 1000 Genomes data set. This is another really important data set: a large number of sequenced human genomes.
And we moved that into, again, an open data set on AWS, and what's happened in the last eight years is that many scientists have been able to compute on it. And the other wonderful power of the cloud is that over time we continue to bring out tools to make it easier for people to work. So whether they're computing using our instance types (we call it Elastic Compute Cloud) or doing some high performance computing using EMR, Elastic MapReduce, they can do that. And then we've brought out new things that really take it to the next level, like Amazon SageMaker.
So the SRA data is available through the AWS open data sponsorship program. You talked about strides. What are some of the other ways that AWS system? >>Yeah, no. So strides, uh, is, uh, you know, wide ranging through multiple different institutes. So, um, for example, over at, uh, the national heart lung and blood Institute, uh, do di NHL BI. I said, there's a lot of acronyms and I gel BI. Um, they've been working on, um, harmonizing, uh, genomic data. And so working with the university of Michigan, they've been analyzing through a program that they call top of med. Um, we've also been working with a NIH on, um, establishing best practices, making sure everything's secure. So we've been providing, um, AWS professional services that are showing them how to do this. So one portion of strides is getting the right data set and the right compute in the right tools, in the hands of the scientists. The other areas that we've been working on is making sure the scientists know how to use it. And so we've been developing these cloud learning pathways, and we started this quite a while back, and it's been so helpful here during the code. So, um, scientists can now go on and they can do self-paced online courses, which we've been really helping here during the, during the pandemic. And they can learn how to maximize their use of cloud technologies through these pathways that we've developed for them. >>Well, not education is imperative. I mean, there, you think about all of the knowledge that they have with within their scientific discipline and being able to leverage technology in a way that's easy is absolutely imperative to the timing. So, so, um, let's talk about other data sets that are available. So you've got the SRA is available. Uh, what are their data sets are available through this program? 
>>What about along a wide range of data sets that we're, um, uh, doing open data sets and in general, um, these data sets are, um, improving the human condition or improving the, um, the world in which we live in. And so, um, I've talked about a few things. There's a few more, uh, things. So for example, um, there's the cancer genomic Atlas that we've been working with, um, national cancer Institute, as well as the national human genomic research Institute. And, um, that's a very important data set that being computed against, um, uh, throughout the world, uh, commonly within the scientific community, that data set is called TCGA. Um, then we also have some, uh, uh, datasets are focused on certain groups. So for example, kids first is a data set. That's looking at a lot of the, um, challenges, uh, in diseases that kids get every kind of thing from very rare pediatric cancer as to heart defects, et cetera. >>And so we're working with them, but it's not just in the, um, uh, medical side. We have open data sets, um, with, uh, for example, uh, NOAA national ocean open national oceanic and atmospheric administration, um, to understand what's happening better with climate change and to slow the rate of climate change within the department of interior, they have a Landsat database that is looking at pictures of their birth cell, like pictures of the earth, so we can better understand the MCO world we live in. Uh, similarly, uh, NASA has, um, a lot of data that we put out there and, um, over in the department of energy, uh, there's data sets there, um, that we're researching against, or that the scientists are researching against to make sure that we have better clean, renewable energy sources, but it's not just government agencies that we work with when we find a dataset that's important. >>We also work with, um, nonprofit organizations, nonprofit organizations are also in, they're not flush with cash and they're trying to make every dollar work. 
And so we've worked with them, um, organizations like the child mind Institute or the Allen Institute for brain science. And these are largely like neuro imaging, um, data. And we made that available, um, via, um, our open data set, um, program. So there's a wide range of things that we're doing. And what's great about it is when we do it, you democratize science and you allowed many, many more science scientists to work on these problems. They're so critical for us. >>The availability is, is incredible, but also the, the breadth and depth of what you just spoke. It's not just government, for example, you've got about 30 seconds left. I'm going to ask you to summarize some of the announcements that you think are really, really critical for federal customers to be paying attention to from reinvent 2020. >>Yeah. So, um, one of the things that these federal government customers have been coming to us on is they've had to have new ways to communicate with their customer, with the public. And so we have a product that we've had for a while called on AWS connect, and it's been used very extensively throughout government customers. And it's used in industry too. We've had a number of, um, of announcements this weekend. Jasmine made multiple announcements on enhancement, say AWS connect or additional services, everything from helping to verify that that's the right person from AWS connect ID to making sure that that customer's gets a good customer experience to connect wisdom or making sure that the managers of these call centers can manage the call centers better. And so I'm really excited that we're putting in the hands of both government and industry, a cloud based solution to make their connections to the public better. 
>> It's all about connections these days. I wish we had more time, because I know we could unpack so much more with you, but thank you for joining me on theCUBE today, sharing some of the insights, some of the impact, and the availability that AWS is enabling for the scientific and other federal communities. It's incredibly important, and we appreciate your time. >> Thank you, Lisa. >> For Brett McMillan, I'm Lisa Martin. You're watching theCUBE's coverage of AWS re:Invent 2020.
Seema Haji, Splunk | Splunk .conf19
>> Live from Las Vegas, it's theCUBE covering Splunk .conf19. Brought to you by Splunk. >> Welcome back, everyone, to theCUBE's live coverage here in Las Vegas for Splunk .conf19, the 10th anniversary, 10 years of Splunk doing their big customer show, and theCUBE's seventh year of covering Splunk. I'm John Furrier, host of theCUBE. Our next guest is a CUBE alumni, Seema Haji, senior director and head of platform and industry marketing at Splunk. She knows the business; we last talked in 2014. Great to see you. >> Good to see you again, John. You've been busy. >> I have. It's been a busy time at Splunk. >> You have been in the data business; we've been following your career for years. DataStax, now Splunk, and other endeavors. But you've been in the data business, even swimming in it. You've seen cloud scale, you understand open source, you understand the big dynamics. Splunk has a full enabling data platform. It started out with logs and keeps moving along, through the companies they buy and integrate. But this platform concept of enabling value for customers has been a big part of the success, and it continues to yield success every year. When people ask what a successful data platform is, everyone wants to own the data layer, because we all want to get value from the data. So, as a product marketing person, what is the data platform? >> It's really a good question, and you hit the nail on the head when you said we've been talking about the data platform for years, decades almost. If you think about the data platform way back when, and I'm dating myself, when I graduated from college people were looking for insights: give me a report, give me a dashboard. We went from databases to data warehouses. Now, if you think about the data platform, or the data-to-everything platform as we at Splunk call it, it has five critical elements in my mind.
You know, the first is: how do you get all of your information, the data that's coming in from networks, logs, and applications? People, you and I, generate a ton of data; how do we get it all together into a single place so you can get insights on it? One may think that's pretty easy, but the truth is we've been struggling with it as an industry for decades. What's super unique about Splunk is that you can bring in any of the data. One of the challenges customers have had in the past is that we forced them to structure the data before they could ask questions of it. With Splunk it's free-form: you can bring in any information and then structure it when you're ready to ask that question. So element number one is flexibility in the way you bring your data in. The second, and you know this being in the business, is getting real-time insights: alerts on your phone, real-time decisioning. Then you have operating in different ways, on cloud, on premises, in hybrid environments; that's the third. And I think the fourth and the fifth are probably the most important, and they're interrelated: a good data platform caters to everyone in the org, from your most non-technical business user to the most technical data admin, IT guy, or security analyst, giving them the same information but allowing them to view it in many different ways and ask different questions of it. From a product marketing and business standpoint, we refer to it as many lenses on your same data. Good data platforms do that while empowering different users. So those are the five in my mind. >> Love kicking off on platform conversations, Seema; we could talk for hours, but I know you're busy. I want to ask you: all successful platforms in this modern era of architecture, when you get cloud scale and massive data volumes coming in, need key building blocks.
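The "bring the data in free-form, structure it when you're ready to ask the question" idea Seema describes is often called schema-on-read. A minimal sketch of the pattern, with an invented log format and field names (this is an illustration, not Splunk's implementation):

```python
import re

# Raw events are stored as-is: no schema is imposed at ingest time.
raw_events = [
    "2019-10-22T10:01:07 level=ERROR service=checkout msg=payment_timeout",
    "2019-10-22T10:01:09 level=INFO service=search msg=query_ok",
    "2019-10-22T10:02:30 level=ERROR service=checkout msg=card_declined",
]

def search(events, **criteria):
    """Extract key=value fields at query time, then filter on them."""
    results = []
    for line in events:
        fields = dict(re.findall(r"(\w+)=(\w+)", line))
        if all(fields.get(k) == v for k, v in criteria.items()):
            results.append(fields)
    return results

# Structure is applied only when the question is asked.
errors = search(raw_events, level="ERROR", service="checkout")
print(len(errors))  # 2
```

The same raw events can later be asked a completely different question (say, by service or by message) without re-ingesting anything, which is the point of the flexibility Seema calls element number one.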
Take me through your view on why Splunk has been successful as a platform, because you've got to enable value from the dorm room to the boardroom, so you've got to have that use-case breadth. What are the key building blocks of the Splunk data platform? >> Great question, and we've kind of figured this out as we've been building out these building blocks with our most critical customers. You start with the core, the index, if you will; Splunk started with bringing all your logs together, and it's your single go-to place. Then, working with customers, they need massive data engines, so what we just announced today is the general availability of Data Stream Processor and Data Fabric Search. That gives you those two massive engines: how do I bring my streaming data in, and can I do massive-scale processing? The other element is around machine learning, because in a world where we're moving to automation, that's super critical to success. And then you have the way users consume insights: if you think about you and I and the amount of time we spend on our phones, how do we make it easy for people to act on their information? So those are your core platform building blocks: you have the index, you have your data engines, you have AI and ML, you have your business analytics, and then you have your portfolios on top, which are use-case specific, if you will, for IT, for security, and for DevOps. >> That's awesome. Let's get into the news; you were in the product keynote today. >> Yes, it was opening day. >> I want to read the headline from the Splunk press release and get your reaction to it: "Splunk Enterprise expands data access with Data Fabric Search and Data Stream Processor, powers users with context and collaboration." Key words: context and collaboration. Search is a hard problem; so is discovery.
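The stream-engine building block described here, processing events on the way in, before they ever reach the index, can be pictured as a small pipeline of generator stages. This is purely a toy sketch of the concept, not the Data Stream Processor product; the record format and the redaction rule are invented:

```python
def parse(stream):
    # Turn raw wire records into structured events.
    for raw in stream:
        ts, _, rest = raw.partition(" ")
        yield {"ts": ts, "body": rest}

def redact(stream):
    # Process events "on the wire", before they reach the index.
    for event in stream:
        event["body"] = event["body"].replace("ssn=123-45-6789", "ssn=<masked>")
        yield event

def to_index(stream, index):
    for event in stream:
        index.append(event)

wire = ["t1 login user=ann", "t2 update ssn=123-45-6789"]
index = []
to_index(redact(parse(wire)), index)
print(index[1]["body"])  # update ssn=<masked>
```

Because each stage is a generator, events flow through one at a time; nothing has to be landed in storage first, which is the "streaming data in, massive-scale processing" idea in miniature.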
We've seen carnage with people trying things. You guys do a lot of data; lots of diverse data has been a big theme here, right? Your customers have grown, with more data coming in. Why are these two features important? What's the key behind Data Fabric Search and the Data Stream Processor: is it the real time, is it the data acceleration? What are some of the key value points people should know about? >> So let me start with the Data Stream Processor. With DSP, what we're really doing is looking at streaming data. When you think about real-time customers, IoT sensor data, anything that's coming in on the wire, Data Stream Processor lets you bring that into Splunk. Now, the uniqueness of Data Stream Processor is that you don't have to bring it into Splunk; you can actually process it live on the wire, and it works just as well. Now, Data Fabric Search, and you alluded to this earlier, is how you search across your massive data lakes and warehouses without having to bring it all into one place. In the product keynote demo today we showed a really cool demo of a business user solving a business problem while searching across S3, Hadoop, and data sitting in Splunk, and with Data Fabric Search you can also do massive, federated, global-scale searches. On the context and collaboration: once you have all this data in Splunk, how do your users consume it? That's the mobile connected experiences, as well as Phantom and VictorOps really activating this data and automating on it. >> I want to get your thoughts on something we've been seeing on theCUBE, and that I've been kind of promoting for about a year now; it goes back to the early days of Hadoop and big data. In those days, getting diverse data was hard.
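The federated idea just described, asking one question across several stores without first centralizing the data, reduces to fanning a query out and merging only the matches. A toy sketch with hypothetical backends (this is not the actual Data Fabric Search API):

```python
# Each "backend" searches its own data in place and returns only matches.
def make_backend(name, records):
    def query(term):
        return [(name, r) for r in records if term in r]
    return query

backends = [
    make_backend("s3_archive", ["error: disk full", "ok: backup done"]),
    make_backend("hadoop_lake", ["error: oom in job 7", "ok: job 8 done"]),
    make_backend("local_index", ["error: login failed"]),
]

def federated_search(term):
    # Fan the query out; only results travel, never the raw stores.
    hits = []
    for query in backends:
        hits.extend(query(term))
    return hits

hits = federated_search("error")
print(len(hits))  # 3
```

The design point is that the data never moves: each store answers from where it sits, and only the (small) result set crosses the wire.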
And so, because of different formats, the database schemas and unstructured data, the way databases were designed, in a way, hampered and hindered that capability. We've been saying that diverse data makes machine learning better; machine learning makes AI better, and AI provides business benefits. This flywheel is really important. Can you give an example of where that's playing out at Splunk? Because that seems to be the magic right now: getting the data together, knowing what data it is, no blind spots as much as possible, and getting that flywheel going. Better diverse data, better machine learning, better AI, better business value. >> I think it comes down to the word diverse, right? When you're looking at data coming in from many different sources, you get a holistic perspective on what's going on in your business. You get insight into how your customers are engaging with your business, you get insight into how your infrastructure is performing, and you can optimize everything from the ops and operations to how customers are working and interacting with your business. The other piece is, when you think about machine learning and AI, as you automate this it's a lot easier when you have the holistic context, right? So diverse data means more context; more context means better insight into what you're trying to get to. It just rounds out the perspective. I often refer to it as adding a new dimension to something you already know. >> And it opens up a whole other conversation around the practitioner's role. It's not just a database administrator setting up databases, because, as you're getting at, context is important. What's the data about the data? What do I keep? What should be addressable for an application?
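The "diverse data means more context" point can be made concrete with a toy enrichment step. All the source names and fields below are invented for illustration; the idea is simply that each source alone gives a narrow view, while the merged record gives a model the holistic context Seema describes:

```python
# Each source alone gives a narrow view of the same customer.
weblogs = {"cust-42": {"last_page": "/cancel", "visits_today": 9}}
crm     = {"cust-42": {"plan": "enterprise", "tenure_years": 6}}
support = {"cust-42": {"open_tickets": 3}}

def enrich(cust_id, *sources):
    """Merge every source's fields into one holistic record."""
    record = {"id": cust_id}
    for source in sources:
        record.update(source.get(cust_id, {}))
    return record

profile = enrich("cust-42", weblogs, crm, support)
# A churn model now sees usage, contract and support signals together.
print(profile["plan"], profile["open_tickets"])  # enterprise 3
```

Any single source here would miss the story (a long-tenured enterprise customer, on the cancel page, with three open tickets); the diverse, merged view is what makes the downstream ML signal useful.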
Is this relevant content? Some data is more valuable than other data at any given time, so addressability becomes a big thing. What's your vision around this idea of data addressability for applications? >> So, just going back to what you said about the administrators: the doers, we call them. They're the innovators, right? They're the people building the cool stuff. And so when you can bring these elements in for them, you really are giving them the ability to innovate, and that accessibility into the information lets them build the best that they can. So, you know, we've been saying "turn data into doing," and it really is true. These are the architects of what's happening, and they're the people taking all this diverse data, taking the machine learning, taking the technology building blocks, and turning it into real doing. >> It's interesting: as markets change, it actually changes the role of the database person, makes them broader, more powerful. >> Yes, because they're the ones fueling the business. >> Thanks for coming on; I really appreciate the insight. I wish we had more time. One personal question: what's exciting you in the industry these days, what are you exploring? Companies continue to grow from startup to IPO, to massive growth, now to a whole other level of market leadership to defend, and you've put some good products out there. What are you getting excited about these days from a tech standpoint? >> You know, I think it's that we're finally getting it; we're finally getting what being a data-to-everything platform means. For example, right after the keynote I had more than a few people come up to me and say, "Well, you know, that made sense," right? When we think about Splunk as the data-to-everything platform, and what data platforms are meant to do and how they should operate.
So I think the industry is finally getting there. What's exciting me next is, if you look around us, all the industry traction that we're seeing: taking technology and data further and really enabling businesses, from financial services to healthcare to manufacturers, to do more. The businesses that traditionally have maybe not adopted technology as fast as software companies, we're now seeing that, and that's super exciting. >> You know, I always get into these kinds of philosophical debates with people, either on theCUBE or off theCUBE, about what platform success looks like. I always say, and I want to get your reaction to this, that a successful platform has applications or things being enabled, value on top of it, and a healthy ecosystem. Do you agree with that statement, and if so, what are the proof points for Splunk on those two things? What is defining what a successful platform looks like? >> I do agree with you. When I think about a successful platform, it's when I look around this room and see how New York-Presbyterian is using Splunk, and what we heard from Dell and Intel today. When you see that spectrum of customers using Splunk across a variety of successes, that's super exciting to me; it tells me it really is everything when you say data-to-everything. >> All right, we've got fun jobs these days. >> We do; it's great to be here. >> Great to see you. Thanks for coming back on theCUBE; I'm looking forward to catching up. That's Seema Haji, CUBE alumni from 2014, now at Splunk leading the platform and industry marketing efforts. I'm John Furrier here on theCUBE. Thanks for watching; we'll be right back after this short break.
Ed Walsh and Eric Herzog, IBM | CUBE Conversation July 2017
(upbeat digital music) >> Hi, welcome to a CUBE Conversation with Wikibon. I'm Peter Burris, the chief research officer of Wikibon, and our goal with these CUBE Conversations is to try to bring you some of the finest minds in the technology industry to talk about some of the most pressing problems facing digital businesses as they transform in an increasingly chaotic world. We're very lucky today to have a couple of great thinkers, both from IBM. Ed Walsh is the general manager of storage at IBM, and Eric Herzog runs product management for the storage group at IBM. Welcome to the CUBE Conversation today. >> It's always nice; thank you for having us. >> So, guys, you've been running around Silicon Valley today telling your story, and we've got a couple of questions. Wikibon likes to talk about the relationship between data and digital business. A lot of people wonder what digital business is; we say that the difference between a digital business and a business is how you use your data assets. Now, that's a stance that I think is becoming a little bit more in vogue in the marketplace today, but it means that storage has a slightly different role to play when we think about how we protect, secure, and sustain those data assets. Do you subscribe to this? Is that how you're looking at it? And is that relevant to the conversations that you're having with customers? >> I haven't heard it put that way, but it actually makes a lot of sense, and you can jump in as well, Eric. However you look at your data, if a digital business is leveraging their data, it makes a lot of sense. We use different metaphors; one would be that your data assets are your oil: he who refines it gets value, because you get insights from it. If you're not using that, you are putting yourself at a disadvantage.
We also see a lot of, what I'll say, established companies getting disrupted by true disrupters using technology and the insights of data to disrupt incumbents. "The Uber of my business," it's almost like a verb these days, "is disrupting me; they're using technology against me." So the best defense is actually using technology, getting insights, and then driving new business. But data alone isn't enough; you need the right infrastructure, either on-prem or in the cloud, and to put the right analytics and insight on top of it. So I would agree completely, and I would also say, think about it: eighty percent of data is behind the firewall; it's not searchable by the web. It's about how to leverage your data assets in combination with other things, outside data, different things on AI, to get true insights, and then map them into your business. So I would agree with that; I hadn't heard it put that way, but it's a good definition of digital business. >> Well, what we're seeing is, for companies that are really leveraging the data, it's their lifeblood, and the issue is that data is not small anymore; it's oceans of data, whether that be things from the Internet of Things. For example, all the telcos have sensors all over their assets, and they're trying to keep the telco up and going. And it doesn't just have to be a giant telco; small companies have reams and reams of data. It's an ocean, and if they're not mining that ocean, if they're not swimming through that ocean correctly, the next thing you know a competitor disrupts them. That is their power: the ability to harness these oceans of data and use that data in a way that allows them to gain competitive advantage. So people think of storage as just a place to put your data, but storage can be an active part of how you increase the value of that data and gain insights, as Ed was pointing out.
>> Well, we totally agree with you, by the way. In fact, the observation we've made is that the "data as fuel" metaphor sometimes falls down, or the way I understand it, it isn't a good enough metaphor, because unlike fuel, data can be reused multiple times. >> Ed: Good point. >> And it makes the whole point you're bringing up, Eric, about the idea that you combine insights from a lot of different places with your data, and storage has to play an active role in that process. But it also says something about the idea of storage as something you put over there, standalone. It used to be that we worried about systems integration a lot; then open systems kind of changed that, and we just presumed it was all going to come together. Now, IBM has been around for a long time and has lived in both worlds. What do you think the role of systems integration is going to be as we think about storage, and the need to do a better job at protecting and sustaining our data assets, especially given the speed and uncertainty with which the world is changing and the dependency it has on data these days? >> Ed: You want to take that first? >> Well, let me give you a real-world example. One of the things IBM introduced last week was a very powerful new mainframe, and one of the key tenets of that mainframe is the ability to secure data end to end, from the moment the transaction starts, with no impact. So while they're doing transactions, millions and billions of transactions on the server farm, it's encrypted from day one, but it eventually ends up on storage, and storage has to extend that encryption: when you put the data at rest while you're analyzing it, it's encrypted; when you pull it back, because you run analytics multiple times, the data is encrypted.
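The end-to-end pattern Eric describes, data that is never stored in the clear, is commonly built as envelope encryption: a per-object data key encrypts the payload, and a master key wraps the data key, so only ciphertext and a wrapped key ever hit storage. The sketch below illustrates that flow with a SHAKE-256 keystream XOR; it is a toy for illustration only (no authentication, keystream reuse hazards), not IBM Z pervasive encryption and not production-grade cryptography:

```python
import hashlib, secrets

def xor_stream(key: bytes, data: bytes) -> bytes:
    # Derive a keystream from the key with SHAKE-256 and XOR it in.
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(stream, data))

master_key = secrets.token_bytes(32)        # held by the server / HSM
data_key = secrets.token_bytes(32)          # per-object key

record = b"patient=jane;status=stable"
ciphertext = xor_stream(data_key, record)       # what rests on storage
wrapped_key = xor_stream(master_key, data_key)  # only the wrapped key is stored

# At analysis time the key is unwrapped and the record decrypted.
unwrapped = xor_stream(master_key, wrapped_key)
print(xor_stream(unwrapped, ciphertext) == record)  # True
```

The property worth noticing is that compromising the stored ciphertext plus the wrapped key yields nothing without the master key, which is what lets the data stay encrypted across server, primary storage, and archive.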
Eventually, certain data sets, take finance or healthcare, do end up in archive. But guess what: they still need to be encrypted. So that's an example of complete systems integration, from the server, through primary storage, through to archive. It's just one example of how storage plays a critical role in extending everything across this entire matrix of systems integration; not one point product, but an integrated solution. And of course, in this case it's secure transactions, it's analysis of incredible amounts of data, and with the IBM Z mainframe, incredible power and speed, yet at the same time keeping that data safe while it's doing all the analytics. So that's a very strong story, but it's just one example of how storage plays a critical role in this complete integration of data with a full systems infrastructure. >> And maybe I could add to that. That's a good example on-prem that can also be hosted in the cloud. But if you think of systems integration, your data is critical; you need access to it to actually run the analytic workloads, the cognitive workloads, on top of it. It can be on-prem, or in the cloud, or actually split between them, so you do need to know you're relying on your cloud infrastructure to give you that enterprise-class performance and availability. And the infrastructure does matter, how they get that performance; but it's no longer you as an individual company putting that together. It also matters for security and protection, which is where IBM's cloud comes in. >> Well, it's interesting to us: it's almost natural to expect that the proper cloud companies are going to do deep integration; I mean, they're talking about going all the way down to FPGAs, as long as they are able to provide a set of interfaces that are natural and reasonable from an overall workload standpoint.
I would expect that we'll see the same type of thing happen in a lot of different on-premises systems too. So the notion of integration, I think you guys agree, is an important trend where it's appropriate and adding value, and it should not be discounted just because it doesn't comply with some definition of "open" this, that, or the other, as it has in the past. >> Oh, agreed, yeah. In end-to-end systems, especially when you're looking at availability and performance, your asset is your data and the insights you get from it. If it's just sitting there, it's not very valuable; in fact, you could say it's actually exposure. But if you're leveraging it, getting insights, and driving your business, it's very valuable, right? So you just need to make sure the infrastructure, whether hybrid cloud or in the cloud, allows you to do that. But security is becoming more and more of a big issue. So I would agree. >> Well, that raises the next question. Again, as long as we're focused on the data as the asset, and not the underlying hardware as the asset, then I think we're in good shape. But it does raise the next question: as we think about converged infrastructure and hyperconverged infrastructure, with storage, compute, network, and other elements coming together successfully, what will be the role of storage in the future? I mean, storage is not just that thing that sits over there with the data on it. It is playing a much more active role: in encryption, in compression, in deduplication, in how it prepares data to be used by any number of different applications. How do you foresee the role of storage evolving over the next few years? >> I'm sure I can jump in; do you want to take a shot? >> Well, yeah, I think one of the key things you've got to realize is that the role of storage is to offload some things from the primary CPU.
So, for example, if you've got oceans of data, what if we can track all that metadata for you? When the system or the cloud looks for data, it can search everything, whether that's 20 million lung cancer pictures, whether that be MRI or the old-style X-ray going back 20 years. If all that metadata is attached, then the server CPU running the analytics workloads is offloaded, and the storage is performing a valuable function: tracking all of that metadata. So when the server does its analytics and has to iterate several times, for example, IBM Watson analyzes, learns, analyzes, learns, and keeps going, and it's used in oncology and in financial services, if you can offload that metadata analysis to the storage, where it acts almost as a sub-compute element, then more of Watson's time is spent looking at the financial data, looking at the medical data. Storage can become a very valuable resource in this future world of intense data analytics, machine learning, and artificial intelligence that systems are going to provide, on premises and through cloud infrastructure. That's just one example of how storage as an intelligent vehicle offloads work from the CPU, or from the cloud, onto the storage, helping it become more productive and the data become more valuable that much faster. >> I would agree, and I think storage has always been evolving, right? Storage has gravity; it has value. If you think of storage as where you store data, it's going to change architecturally. You mentioned hyperconverged, you mentioned converged, you mentioned cloud, and we talked about what we can do with the mainframe; it's all about how you get the right accessibility and performance, but it will change.
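The metadata-offload idea Eric describes, letting the storage layer answer "which objects match?" so the compute tier never scans the payloads, can be sketched as a small catalog. The fields and IDs below are invented for illustration:

```python
# The storage layer keeps a small metadata record per (large) object.
catalog = [
    {"id": "img-001", "modality": "MRI",   "organ": "lung",  "year": 2016},
    {"id": "img-002", "modality": "X-ray", "organ": "lung",  "year": 1999},
    {"id": "img-003", "modality": "MRI",   "organ": "heart", "year": 2017},
]

def find(**criteria):
    """Answer the search from metadata alone; no payload is read."""
    return [m["id"] for m in catalog
            if all(m.get(k) == v for k, v in criteria.items())]

# The analytics tier (a Watson-style workload, say) asks the storage
# layer for candidates, then fetches only those objects.
candidates = find(modality="MRI", organ="lung")
print(candidates)  # ['img-001']
```

With the catalog living next to the data, the expensive analytics CPU touches one multi-gigabyte image instead of twenty million, which is the offload Eric is pointing at.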
It will change rather dramatically. Just think of what's going to go on with, we'll say, modernizing traditional workloads, what you do with VMware: the arrays are getting much more complex, and you can also do software-defined arrays, which gives you more flexibility in deployment. But in the new workloads, where you're doing high-performance data analytics, or things where you can expand out and leverage the cloud, it becomes much more of a software-only play. It's still storage; the bits and bytes are going to be, typically, on flash in my opinion, both on-prem and off-prem. But how do you move that data? How do you keep accessibility? How do you secure that data? How do you make sure you have it in the right place, where you can actually get the right performance? That's where storage is always going to evolve. It doesn't matter if it's in an array, in a file system, in what we call a big storage array, or in the cloud; it's about how you monitor and manage that data through its full life cycle. >> So it sounds like you're suggesting, and again, I think we agree, that storage used to be the place where you put stuff, and it's increasingly becoming where you run data-related services, whether those services are associated with security, or prepping data, or protecting data, or moving data as effectively as possible. Increasingly, the storage resources are becoming the mechanism by which we handle these strategic data services. Is that right? >> Yeah. Think of it this way: in the old model, storage was somewhat passive, a place where you store the data. In the new model, storage is actually active: active in moving the data, in helping analyze the data, like in that metadata example I just gave. So storage is not a passive device anymore.
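The full-lifecycle management Ed sketches, data moving between tiers as it ages and cools, reduces in practice to a placement policy. A hedged toy version, with thresholds and tier names invented for illustration (real policies would weigh cost, SLAs, and compliance):

```python
def place(age_days: int, last_read_days: int) -> str:
    """Pick a tier for an object from simple, illustrative rules."""
    if last_read_days <= 7:
        return "flash"            # hot: on-prem all-flash array
    if age_days <= 365:
        return "object-onprem"    # warm: software-defined object store
    return "cloud-archive"        # cold: cheap cloud tier

print(place(age_days=3,   last_read_days=1))    # flash
print(place(age_days=90,  last_read_days=40))   # object-onprem
print(place(age_days=900, last_read_days=400))  # cloud-archive
```

A policy engine like this is what lets the same logical data live in an array, a file system, or the cloud over its life, while the user keeps one consistent way to find and access it.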
Storage is an active element of the entire analytic, machine learning, artificial intelligence process, so you can get real insights. If you just relied on the CPU to do that, it's not going to happen, so the storage is now an active participant in this end-to-end solution that extends from on premises into the cloud, as you guys have called it, the true private cloud. >> Right. >> Right, from Wikibon. The storage is active in that, versus being just a passive tool. Now it's very active, and the intelligence, and some of the things we've done with cognitive storage at IBM, allow the data, like our Spectrum Scale product, which is heavily involved in giant, hundreds-of-petabytes analytic workloads today, in production, in major enterprises across the globe as well as in high-performance computing environments, to extend from on premises onto cloud. But that storage is active, not passive as it was in the old days. >> So, you mentioned cloud. We're pretty strong believers in this notion of true private cloud, which is the idea that instead of the industry's architecture moving all the data to the cloud, increasingly cloud services are going to be moved down to the data and things will be done differently, and that seems to be resonating with folks. The question that I have then is, when we think about that, where the data is going to be located is going to have a major effect on where the workloads actually run.
I've had three conversations with three different CIOs in the last six weeks, and they all said: I'm thinking differently, and instead of thinking about moving data up to the cloud, I'm now thinking about how I ensure that I always have control over my data, even if it's running in the cloud, because I'm afraid that if I move everything into the cloud, when I do have to bring it back, it's going to be such a huge capital expense that everybody is going to say no and I can't do it. So, it's almost like, maybe I'll do some stuff in the cloud, but I'll do backup, restore, or have protection on site. What do you think the role of storage is going to be as we think about multi-cloud and being able to do end-to-end development and put various applications in various places? >> So, you brought up a couple of topics there, right. Your concept and your research on true private cloud actually, I find, resonates amazingly well with clients. In fact, a lot of clients are trying to figure out how to leverage cloud when they have a lot of data on premises that they want to leverage. The way I explain it to clients: everyone wants to do everything they can do in the public cloud, all the agility, all the consumption model, all the DevOps models, and they just want to do that on premises. So, it's really an agility statement, but then extended to have the right workloads running in the right hybrid cloud on demand. But that brings a whole bunch of things. So, the best use case, and now I'll get into multi-cloud, but the one use case with all of these companies, why did you end up going to Amazon or whatnot? What it gets down to is developers. Developers were able to put their credentials in, swipe a credit card, do one line of code to spin up an environment, one line of code to spin down an environment, or they'd use Chef and Puppet, which would do the API calls, and they were able to do things very quickly.
Try that in the enterprise. I mean literally, they would have to go do a ticket, talk to Joe in IT, which they don't want to do; it takes a lot of time, best case about a week, four to five days, and worst case up to three weeks, to provision that environment. If you're doing agile development, that literally breaks the process of doing anything agile. So, you're not going to do it; you're absolutely forced to go away. So, what we're doing is an investment on prem to bring exactly that agility. For example, for the idea of swiping a credit card, we have a software product, an API automation layer, across all of our storage, that gives you the last mile. How do you literally give API templates to your developers so that with one line of code they can spin it up, and with one line of code spin it down, and have that work across all our storage devices? It took investment, another layer of API automation that the storage team sets up as templates: hey, gold, silver, bronze, provision your own storage, but in the enterprise way. Or, like a developer, or a gold DBA: hey, spin up an environment for test-dev. What we're able to do is a simple line of code will spin up a system, which could be, let's say, four or five servers, from the last good snapshot from production that's been data-masked the way you need it to be. 'Cause you don't just give developers the whole database. But then literally, that becomes a template that, with role-based access and credentials, the developer, or Chef or Puppet natively, can use to literally, with one line of code, spin up an environment, and with one line of code, spin it down. The benefit is, on premises you actually have your data.
So, unlike in Amazon, where you're spinning things up and spinning things down but it's not really running on what your production data looks like, you're literally able to keep that current to last night's data, or the weekend before, but again with all the data masking. So, our investment thesis is that we need to work on the next level of automation to allow people to truly do everything they can do in the public cloud on private, and we're making a lot of investment to do that. It's actually one of our biggest investment theses, and it really plays out well with clientele. You mentioned the next thing, and you can jump in on both of these, but the next thing is: a true private cloud allows you to easily extend to these different clouds. Well, then how do you keep track of where that is? Each one of the different clouds will have their own SLAs, but how do you manage it? How do you think through security? How do you know you're getting the right SLAs? And where do you put the right things in the right places? There are management stacks that do that; with software-defined storage, which all of our products allow you to do, we can run an extension of your device in any of the major public clouds and manage that securely. And I can add a couple more, but do you want to jump in? >> Yeah. I think the key thing here is that in a true private cloud, the enterprise is mimicking what an Amazon or the IBM cloud division does, right? Except they're doing it within their own walls, on their own premises, which may be spread across the world if it's a global enterprise, but it's, if you will, their version of IBM Cloud. But they want to be able to burst out. So, all of our software-defined storage, and even our array storage, is designed so that if they need to move data from on premises to IBM Cloud, from on premises to Azure, from on premises to Amazon, they can transparently move that data.
In fact, we can set it up so that they automatically tier the data: when the data gets cold, boom, they dump it off to IBM Cloud. Now, the data that's in the private cloud on premises, if you will, but a private cloud that they configure, is there for them to use, and they take their access to those workloads, and by the way, talking to the chief security officer and the chief legal officer, they figure out what workloads it's okay to put out there in IBM Cloud. That way they have total control, but they have the flexibility of going out to the cloud, all done with the storage in an automated fashion. I think the key thing from a true private cloud perspective is that storage, as well as network and server infrastructure, should be as automated as possible. There was the big downturn in 2008; yes, IT spend is back up, head count is back up, but when you look inside the envelope of head count, there aren't forty storage guys at XYZ Global Enterprise, there are twenty. They may have gotten forty people back, but the other twenty went to test and dev. They are not doing storage now. So, those twenty guys need to be fully automated to support all these extra developers in a global enterprise, and even smaller shops now need that. So the true private cloud mimics IBM Cloud, mimics Azure, mimics Amazon, and all those public cloud providers will tell you they make their business by making sure it's automated; otherwise they won't make any money. So, the private cloud does the same thing. >> And those twenty guys are now, as you said earlier, managing oceans of data where the business has no specific visibility into how that data is going to create value in the future. It's an extremely complex arena. So, with that in mind, you guys have been invited to speak to the board of directors of one of your large enterprise clients about the value that storage will play in a digital business. What are some of the things that you tell them?
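The automatic tiering Eric mentions (when data gets cold, dump it off to the cloud tier) comes down to a last-access policy. The threshold and tier names below are invented for illustration; real products expose these as configurable policy rules rather than hard-coded values:

```python
# Illustrative sketch of a cold-data tiering policy: data untouched for
# longer than the policy window is placed on a cloud object tier, hot data
# stays on on-prem flash. Thresholds and tier names are hypothetical.
from datetime import datetime, timedelta

COLD_AFTER = timedelta(days=90)  # assumed policy window for the example

def assign_tier(last_access, now, cold_after=COLD_AFTER):
    """Return 'on-prem-flash' for hot data, 'cloud-object' once it goes cold."""
    return "cloud-object" if now - last_access > cold_after else "on-prem-flash"

now = datetime(2017, 7, 1)
datasets = {
    "q2-financials": datetime(2017, 6, 28),  # touched days ago: hot
    "2015-archive":  datetime(2016, 1, 10),  # untouched for a year: cold
}
placement = {name: assign_tier(ts, now) for name, ts in datasets.items()}
```

In a real system the scan would run continuously on the storage side, which is exactly the "active, not passive" role described above: the mover, not the server CPU, decides and executes the migration.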
>> So, let me take that one first. >> Sure. >> I think a couple of things. First of all, storage is not passive the way it used to be; you need to think of it as an active element in your cloud strategy, to keep your data whole, to keep your data secure, and most importantly, to make sure your data offers value. So, for example, you need to use all-flash. Why? Because it needs to be instantaneous. It needs to connect right into that CPU as fast as possible to suck the data in so you can analyze it, and the guys who analyze the data faster win; for example, in dark trading in financials, if you're slower, you lose ten million dollars, or a hundred million dollars. So storage is critical in that. You want to, A, let the board of directors know that storage is a critical component, because it's not just passive; like we said before, it's active. Storage is intelligent, not dumb, and people have historically viewed storage as dumb. So: storage is active, storage is intelligent, storage is a critical element of your infrastructure, both in your private cloud, but also, to cut costs, when you do go to the public cloud for certain workloads. You need to view storage as a more holistic part of how you handle your data, how you harvest the value of those oceans. Okay, if you're going to be fishing, you'd better make sure you get a lot of fish if you're going to feed the populace, and the more you do, the more you've got to have it all protected, and you want to be able to secure everything; you can't do that if storage is just dumb and passive. So, the board of directors needs to see that data is your life blood, data is your gold; you have to mine that data, and storage helps you do that. It's not just a place you stick it. It's not a vault to stick the gold in later. It's helping you mine the gold, refine the gold, get the value out of that gold. How do you tell 24 karat versus 18 versus 14? What do you charge for that?
Storage can actually help you do all that analysis, because it's an active element. >> Peter: What would you say, Ed? >> I would agree with everything you said, and I would actually play it back to how you started this conversation, which is, you know, that a digital business is he who uses his data right. So, I'd probably start there, and I'd use the classic metaphor that your data is oil and he who refines it gets the value of it. I agree it's not a perfect metaphor, but it's really about getting insight and leveraging that insight, and that does translate to a couple of things. So, it does matter that you have it secure, but it also matters that you have the right performance, either on premises or in the cloud, and get the right insights. Typically, the right insights come from leveraging the data behind your firewall, which is your proprietary data; eighty percent of the data in the world is just not available to a public search engine, it's behind the firewall. And by the way, when you're looking at your business, you might want to combine it with different things. We talk a lot about our Watson, our ability to, let's say, if you're in healthcare, bring up oncology, so Watson and oncology can help you with your data; or the Weather Channel, we can bring the weather into a lot of different applications. So, you want to leverage other data sets that are publicly available, but also your private data, and get unique insights from it, and you want to work with someone where those insights are actually yours, which is really where IBM differentiates its cloud from everything else. So, you want to bring in AI or cognitive, but we actually have cognitive by industry; we've actually trained it. The thing with cognitive versus AI is you actually have to train cognitive; it actually has to learn. But once it learns, it's able to give you very interesting, you know, insights into your data.
We do it by industry, which is a very compelling way to deal with data. The other thing is, you want to protect your data on prem; it's not only protection in the sense of coming back up and running if you have a failure, so recovery and resiliency, but just as much security, so you need to secure it throughout. And then the other thing I'd highlight is compliance. Nobody wants to talk about compliance, but the price of compliance is nothing compared to the price if you get audited and have to get compliant again, and prove it; just do it right from day one. And you need to be looking at the data you have on premises or in the cloud, especially multi-cloud; you need to keep compliance and ownership of the data, because it is a highly regulated environment and you're seeing new things coming out in Europe. >> Peter: Absolutely. >> You really need to be on top of it, because the cost of that compliance might seem like a lot, but it's nothing compared to coming back from a lawsuit or something after the fact. That's what I would normally talk to a board about. >> So, Ed, you've been back at IBM now for a while, it's about a year. >> Sure, yeah. >> About five quarters or so, something like that? >> Four quarters. >> Four quarters. And you've had a chance to look at the assets that IBM has. Now, IBM has obviously been a leader in the tech industry and is going to remain so for a long time. But what will IBM be as a leader in the storage industry? What does leadership mean to IBM? It's kind of the one IBM-specific question I'm asking, but I think it's important: what is IBM leadership going to be in storage? >> So I think, and maybe it gets to the hypothesis of why I came to IBM: to be honest, I think IBM helps people get from where they are to where they want to get to, and it helps them do that in what I'll call risk-reduced steps.
Very few companies have the breadth of portfolio or capabilities in cloud and cognitive that IBM has. I also think storage as an industry is going through a major change. The next era may be about data, but as far as the storage industry goes, it's in a lot of change. So, I use the term big boy game, because it's not just about doing the next array, which we do; it's as much about applying the right analytics, understanding the true flow of data, and getting the right security to do it effectively. When I looked at coming to IBM, I kind of looked at four things. I think it does play to where our vision is, right. I actually think it is changing, and our clients are being disrupted, and they are looking for a partner to help them. And it's not just disruption from technology or consolidation or price pressures; they're being disrupted by these, you know, "the Uber of my business is XYZ"; it's a verb now, so I keep on saying that, but clients in every industry are getting disrupted. So, if they're hesitant, if they are on their heels, they're not able to lean in, and that's the worst thing they could do with technology. What they need is a partner that knows, and has the right vision and capabilities, to lean forward and move forward with confidence. IBM has a history of going era to era with clients, that's the first thing, and we calmly do it, and clients trust that we know where we're going. That has a lot to do with our primary research, looking out there. Second thing, I think we have the right vision, the cloud and cognitive vision; no one argues with me about how you get the insight to your data, and that matters. Your definition of a digital business is right on. He who uses their data to their advantage is really a digital business, and is at an advantage because of that.
Three, it's the broad portfolio. In storage we have the broadest portfolio in the industry, and you need that because as we help clients, it's not about helping them with the next storage array; it's: here's your business, and it's different for everyone, here's where you want to get to with your infrastructure and transformation, and we help you get there over time. That takes a broad portfolio, not only in storage, but overall: the right services, the right software. Analytics becomes a big thing; we're the number one company in analytics, and that comes to bear for all our clients, along with the right services capabilities going forward. And then, fourth, I actually think that where IBM storage allows you to lean in is really the biggest thing. We're going to help you simplify so you can lean in with confidence, because that's what everyone is looking for: a partner to help you get there. And very few companies are positioned as well as IBM storage to do that. I know I'm taking credit for a lot of IBM pieces, but that's a strength, because that's the leverage of using an overall company, industry by industry, with industry vertical knowledge, to really help you lean in with confidence, so you can grow your business and transform. >> Well, let me build on that, because at the end of the day, your ability to make these kinds of commitments to your customers is a function of your ability to make these commitments inside IBM, and of IBMers' history of keeping the commitments they make to each other. So, IBM as a culture, and I've been around for a long time, worked with a lot of clients on these things, up and down, good and bad at a product level, but you're absolutely right, IBM has a track record of saying: here's where we're going, and if you want to come with us, we're going to get you there. And during periods of significant disruption, that's not a bad type of partner to have. >> I'd use the term people sometimes use: it's trust.
They trust us to get there, and I think their trust is well placed; again, I came from the outside a year ago. We're the last company with primary research, right, and so you have to ask: where is it going? We actually do primary research. There's a reason we've been able to go era to era as a company for a hundred-plus years: it's because we actually do that and help people go era to era. Sometimes IBM downplays it; I actually think it's a strength. >> Well, the Watson Research Center in many respects is creating the new eras, and has for many years, and is doing so today too. >> We help clients through those eras without leaving them behind, which is something that's rare; you don't see it, our competitors don't have that, and I think that's a big thing. >> Alright, so I'm going to close it here. Ed Walsh, GM of storage at IBM. Eric Herzog, who runs product marketing for the storage group at IBM. I want to thank you both very much for being part of this CUBE conversation. >> Yeah, thank you. >> As we try to bring you the experts that matter and that are going to have a consequential impact on how the industry evolves. Thank you very much for joining us for this Wikibon CUBE conversation. I'm Peter Burris, until we talk again. (upbeat digital music)