Doug Laney, Caserta | MIT CDOIQ 2020


 

>> Announcer: From around the globe, it's theCUBE, with digital coverage of the MIT Chief Data Officer and Information Quality Symposium, brought to you by SiliconANGLE Media.

>> Hi everybody. This is Dave Vellante, and welcome back to theCUBE's coverage of the MIT CDOIQ 2020 event. Of course, it's gone virtual. We wish we were all together in Cambridge. They were going to move into a new building this year; for years they've done this event at the Tang Center, and they were moving into a new facility, but unfortunately they're going to have to wait at least a year, we'll see. But we've got a great guest nonetheless. Doug Laney is here. He's a business value strategist, a bestselling author, an analyst, a consultant, and a longtime CUBE friend. Doug, great to see you again. Thanks so much for coming on.

>> Dave, great to be with you again as well.

>> So can I ask you? You have been an advocate for, obviously, measuring the value of data and the CDO role. Don't take this the wrong way, but I feel like the last 150 days have done more to accelerate people's attention on the importance of data and the value of data than all the great work that you've done. What do you think? (laughing)

>> It's always great when organizations actually take advantage of some of these concepts of data value. You may be speaking specifically about the situation with United Airlines and American Airlines, where they have basically collateralized their customer loyalty data, their customer loyalty programs, to the tune of several billion dollars each. And one of the things that's very interesting about that is that the third-party valuations of their customer loyalty data resulted in numbers that were larger than the companies themselves. So basically the value of their data, which, as we've discussed previously, is off the balance sheet, is more valuable than the market cap of those companies themselves, which is just incredibly fascinating.

>> Well, and of course, all you have to do is look to the Trillionaire's Club.
And now, of course, Apple pushing two trillion, to really see the value that the market places on data. But the other thing is, of course, COVID. Everybody talks about the COVID acceleration. How have you seen it impact the awareness of the importance of data, whether it applies to business resiliency or even new monetization models? If you're not digital, you can't do business. And digital is all about data.

>> I think the major challenge that most organizations are seeing from a data and analytics perspective due to COVID is that their traditional trend-based forecast models are broken. If you're a company that's only forecasting based on your own historical data and not taking into consideration, or even identifying, what the leading indicators of your business are, then COVID and the economic shutdown have entirely broken those models. So it's raised the awareness of companies to say, "Hey, how can we predict our business now? We can't do it based on our own historical data. We need to look externally at what are those external, maybe global, indicators, or other kinds of markets, that precede our own forecasts or our own activity." And so the conversion from trend-based forecast models to what we call driver-based forecast models isn't easy for a lot of organizations to do. And one of the more difficult parts is identifying what are those external data factors, from suppliers, from customers, from partners, from competitors, from complementary products and services, that are leading indicators of your business, and then recasting those models and executing on them.

>> And that's a great point. If you think about COVID and how it's changed things, everything's changed, right? The ideal customer profile has changed, your value proposition to those customers has completely changed. You've got to rethink that.
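Doug's contrast between trend-based and driver-based forecast models can be sketched in a few lines of Python. This is only an illustration: the revenue series, the external "mobility index" driver, and the one-period lag are invented assumptions, not anything from the interview.

```python
# Sketch: trend-based vs. driver-based forecasting (illustrative only --
# the data, the single external driver, and the 1-period lag are assumptions).

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

# A company's own monthly revenue: a smooth upward trend.
revenue = [100, 104, 108, 112, 116, 120]

# Trend-based model: extrapolate revenue from time alone.
a, b = fit_line(list(range(len(revenue))), revenue)
trend_forecast = a + b * len(revenue)          # next month

# Driver-based model: regress revenue on a leading external indicator
# (say, an industry mobility index) lagged by one period.
mobility = [50, 52, 54, 56, 58, 20]            # collapses in the last month
a2, b2 = fit_line(mobility[:-1], revenue[1:])  # revenue[t] ~ mobility[t-1]
driver_forecast = a2 + b2 * mobility[-1]       # uses the collapsed reading

print(round(trend_forecast, 1))   # keeps climbing, blind to the shock
print(round(driver_forecast, 1))  # falls, because the driver fell
```

The point of the sketch is the last two lines: the trend model can only extrapolate the company's own history, while the driver-based model reacts when the external indicator collapses.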
And of course, it's very hard to predict, even when this thing eventually comes back, some kind of hybrid mode. You used to be selling to people in an office environment; that's obviously changed, and there's a lot that's permanent there. And data is potentially at least the forward indicator, the canary in the coal mine.

>> Right. It also is the product and service. So not only can it help you improve your forecasting models, but it can become a product or service that you're offering. Look at us right now. We would generally be face to face and person to person, but we're using video technology to transfer this content. And then one of the things that I... It took me a while to realize, but a couple of months after the COVID shutdown, it occurred to me that even as a consulting organization, Caserta focuses on North America, but the reality is that every consultancy is now a global consultancy, because we're all doing business remotely. There are no particular or really strong localization issues for doing consulting today.

>> So we've talked a lot over the years about the role of the CDO, how it's evolved, how it's changed. Of course, in the early, pre-title days it was coming out of a data quality world, and it's still vital. Of course, as we heard today from the keynote, it's much more public, much more exposed, with different public data sources. But the role has certainly evolved, initially into regulated industries like financial, healthcare and government, but now many, many more organizations have a CDO. My understanding is that you're giving a talk on the business case for the CDO. Help us understand that.

>> Yeah. So one of the things that we've been doing here for the last couple of years is running an ongoing study of how organizations are impacted by the role of the CDO. And really it's more of a correlation, looking at what are some of the qualities of organizations that have a CDO or don't have a CDO.
So some of the things we found are that organizations with a CDO nearly twice as often mention the importance of data and analytics in their annual report. Organizations with a C-level CDO, meaning a true executive, are four times more likely to be using data to transform the business. And when we're talking about using data and advanced analytics, we found that organizations with a CIO, not a CDO, responsible for their data assets are only half as likely to be doing advanced analytics in any way. So there are a number of interesting things that we found about companies that have a CDO and how they operate a bit differently.

>> I want to ask you about that. You mentioned the CIO, and we're increasingly seeing lines of reporting and peer reporting shift; the sands are shifting a little bit. In the early days the CDO was, and still predominantly is, I think, an independent organization. We've seen a few cases, and an increasing number, where they're reporting into the CIO. We've seen the same thing, by the way, with the Chief Information Security Officer, which used to be considered the fox watching the hen house. So we're seeing those shifts. We've also seen the CDO become more aligned with a technical role, and sometimes even emerging out of that technical role.

>> Yeah, I think... I don't know, what I've seen more is that the CDOs are emerging from the business. Companies are realizing that data is a business asset, not an IT asset. There was a time when data was tightly coupled with applications and technologies, but today data is very easily decoupled from those applications and usable in a wider variety of contexts. And for that reason, as data gets recognized as a business asset, not an IT asset, you want somebody from the business responsible for overseeing that asset.
Yes, a lot of CDOs still report to the CIO, but increasingly you're seeing more CDOs, and I think you'll see some other surveys from other organizations this week, where the CDOs are reporting up to the CEO level, meaning they're true executives. I've long advocated for the bifurcation of the IT organization into separate I and T organizations. Again, there's no reason, other than for historical purposes, to keep the data and technology sides of the organization so intertwined.

>> Well, it makes sense that the Chief Data Officer would have an affinity with the lines of business. And you're seeing a lot of organizations really trying to streamline their data pipeline, their data life cycles, bringing that together, infusing intelligence into that, but also taking a systems view and really having the business be intimately involved in, if not even owning, the data. You see a lot of emphasis on self-serve. What are you seeing in terms of that data pipeline or the data life cycle, if you will, which used to be wonky, hardcore techies, but now really involves a lot more constituents?

>> Yeah. Well, the data life cycle used to be somewhat short. The data life cycles are longer now, and they're more data networks than a life cycle or a supply chain. And the reason is that companies are finding alternative uses for their data, not just using it for a single operational purpose or perhaps a reporting purpose, but finding that there are new value streams that can be generated from data. There are value streams that can be generated internally. There are a variety of value streams that can be generated externally. So we work with companies to identify what that variety of value streams is, and then test their feasibility: are they ethically feasible? Are they legally feasible? Are they economically feasible? Can they scale? Do you have the technology capabilities? And so we'll run through a process of assessing the ideas that are generated.
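The feasibility screen Doug describes after idea generation (ethical, legal, economic, scale, technology) can be pictured as a simple all-gates filter. The criteria names mirror his list; the pass/fail scoring and the example ideas are hypothetical, not from the interview.

```python
# Sketch: a trivially simple feasibility gate over monetization ideas.
# Criteria follow Doug's list; the ideas and boolean scoring are invented.

CRITERIA = ["ethical", "legal", "economic", "scalable", "tech_ready"]

def feasible(idea: dict) -> bool:
    """An idea moves forward only if it passes every gate."""
    return all(idea.get(c, False) for c in CRITERIA)

ideas = [
    {"name": "sell raw customer data", "ethical": False, "legal": False,
     "economic": True, "scalable": True, "tech_ready": True},
    {"name": "referral offers to partners", "ethical": True, "legal": True,
     "economic": True, "scalable": True, "tech_ready": True},
]

shortlist = [i["name"] for i in ideas if feasible(i)]
print(shortlist)  # only the idea that clears all five gates survives
```

In a real workshop the gates would be graded rather than boolean, but the shape of the exercise is the same: generate broadly, then cull against explicit criteria.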
But the bottom line is that companies are realizing that data is an asset. It needs to be not just measured as one and managed as one, but also monetized as an asset. And as we've talked about previously, data has these unique qualities: it can be used over and over again, it generates more data when you use it, and it can be used simultaneously for multiple purposes. So companies like, as you mentioned, Apple and others have built business models based on these unique qualities of data. But I think it's really incumbent upon any organization today to do so as well.

>> But when you observe those companies that we talk about all the time, data is at the center of their organization. They maybe put people around that data. That's got to be one of the challenges for many of the incumbents; if we talk about the data silos, the different standards, different data quality, that's got to be a fairly major blocker for people becoming a "data-driven organization."

>> It is, because some organizations were developed as people-driven, product-driven, brand-driven, or other things, and to try to convert to becoming data-driven takes a high degree of data literacy or fluency. And I think there'll be a lot of talk about that this week; I'll certainly mention it as well. And so getting the organization to become data fluent, and appreciate data as an asset, and understand its possibilities and the art of the possible with data, it's a long road. So the culture change that goes along with it is really difficult. We're working with a 150-year-old consumer brand right now that wants to become more data-driven, and they're very product driven. And we hear the CIO say, "We want people to understand that we're a data company that just happens to produce this product. We're not a product company that generates data. And once we realize that and start behaving in that fashion, then we'll be able to really win and thrive in our marketplace."
>> So one of the key roles of a Chief Data Officer is to understand how data affects the monetization of an organization. Obviously there are for-profit companies, or your healthcare organization saving lives, obviously being profitable as well, or at least staying within the budget, depending upon the structure of the organization. But a lot of people, I think, oftentimes misunderstand. It's like, "Okay, do I have to become a data broker? Am I selling data directly?" But I think, as you've pointed out many times, and you just did, that unlike oil (that's why we don't like the data-as-the-new-oil analogy), data is so much more valuable: it can be reused, and its value doesn't come from scarcity. But what are you finding just in terms of people's application of that notion of monetization? Cutting costs, increasing revenue, what are you seeing in the field? What does that spectrum look like?

>> So one of the things I've done over the years is compile a library of hundreds and hundreds of examples of how organizations are using data and analytics in innovative ways. And I have a book in process that hopefully will be out this fall, sharing a number of those inspirational examples. So that's the thing that organizations need to understand: there are a variety of great examples out there, and they shouldn't just necessarily look to their own industry. There are inspirational examples from other industries as well. Many clients come to me and ask, "What are others in my industry doing?" And my flippant response to that is, "Why do you want to be in second place or third place? Why not take an idea from another industry, perhaps a digital product company, and apply that to your own business?" But like you mentioned, there are a variety of ways to monetize data. It doesn't necessarily involve selling it. You can deliver analytics, you can report on it, you can use it internally to generate improved business process performance.
And as long as you're measuring how data's being applied and what its impact is, then you're in a position to claim that you're monetizing it. But if you're not measuring the impact of data on business processes or on customer relationships or partner and supplier relationships or anything else, then it's difficult to claim that you're monetizing it. But one of the more interesting ways that we've been working with organizations to monetize their data, certainly in light of GDPR and the California Consumer Privacy Act, where I can't sell you my data anymore, is that we've identified ways to monetize customer data in a couple of ways. One is to synthesize the data: create synthetic data sets that retain the original statistical anomalies or features of the data, but don't actually share any PII. But another interesting way that we've been working with organizations to monetize their data is what I call inverted data monetization, where, again, I can't share my customer data with you, but I can share information about your products and services with my customers, and take a referral fee or a commission based on that. So let's say I'm a hospital and I can't sell you my patient data, of course, due to a variety of regulations, but I know who my diabetes patients are, and I can introduce them to your healthy meal plans, to your gym memberships, to your at-home glucose monitoring kits, and again, take a referral fee or a cut of that action. So we're working with customers in the financial services industry and in the healthcare industry on just those kinds of examples.

>> Interesting. Doug, because you're a business value strategist, where in the S-curve do you see you're able to have the biggest impact? I doubt that you enter organizations where you say, "Oh, they've got it all figured out.
They can't use my advice." But as well, sometimes in the early stages you may not be able to have as big an impact, because there's not top-down support or whatever, there's too much technical debt, et cetera. Where are you finding you can have the biggest impact, Doug?

>> Generally we don't come in and run those kinds of data monetization or information innovation exercises unless there's some degree of executive support. I've never done that at a lower level. But certainly there are lower-level, more immediate and vocational opportunities for data to deliver value through simple analytics. One of the simple examples I give is: I sold a home recently, and when you put your house on the market, everybody comes out of the woodwork: the fly-by-night mortgage companies, the moving companies, the box companies, the painters, the landscapers. They all know you're moving, because your data is in the U.S. in the MLS directory. And it was interesting; the only company that didn't reach out to me was my own bank. And so they lost the opportunity to introduce me to a mortgage, retain me as a client, introduce me to my new branch, print me new checks, move the stuff in my safe deposit box, all of that. They missed a simple opportunity. And I'm thinking, this doesn't require rocket science to figure out which of your customers are moving; the MLS database, whether you harvest it from Zillow or other sites, is basically public domain data. And I was just thinking, how stupid simple would it have been for them to hire a high school programmer, give him a can of Red Bull and say, "Listen, match our customer database to the MLS database to let us know who's moving on a daily or weekly basis." Some of these solutions are pretty simple.

>> So is that part of what you do, come in with just hardcore tactical ideas like that? Are you also doing strategy? Tell me more about how you're spending your time.
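The "stupid simple" match Doug described a moment ago, a bank's customer file against new public listings, might look something like this. The listing feed, the field names, and the crude address normalization are hypothetical stand-ins for real MLS data, which would need fuzzier matching in practice.

```python
# Sketch: flag bank customers whose address shows up in new for-sale
# listings. All data and field names here are invented for illustration.

def normalize(addr: str) -> str:
    """Crude address key: lowercase, strip punctuation and extra spaces."""
    cleaned = "".join(ch for ch in addr.lower() if ch.isalnum() or ch == " ")
    return " ".join(cleaned.split())

customers = [
    {"id": 1, "name": "A. Smith", "address": "12 Oak St., Dayton"},
    {"id": 2, "name": "B. Jones", "address": "9 Elm Ave, Dayton"},
]
new_listings = ["12 OAK ST DAYTON", "400 Main St Dayton"]

listing_keys = {normalize(a) for a in new_listings}
movers = [c for c in customers if normalize(c["address"]) in listing_keys]

print([c["id"] for c in movers])  # customer 1 appears to be moving
```

A daily or weekly run over a fresh listings feed is the whole job; the hard part in reality is address standardization, not the join itself.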
>> I try to take more of a broad approach, where we look at the data itself, and, again, people have said, "If you torture the data enough, what will it tell us?" We just take that angle. We look at examples of how other organizations have monetized data, and think about how to apply and adapt those ideas to the company's own business. We look at key business drivers, internally and externally. We look at edge cases for their customers' businesses. We run through hypothesis-generating activities. There are a variety of different kinds of activities that we do to generate ideas. And most of the time when we run these workshops, which last a week or two, we'll end up generating anywhere from 35 to 50 pretty solid ideas for generating new value streams from data. So when we talk about monetizing data, that's what we mean: generating new value streams. But like I said, the next step is to go through that feasibility assessment and determine which of those ideas you actually want to pursue.

>> So you're of course a longtime industry watcher as well; as a former Gartner analyst, you have to be. My question is, if I think back... I've been around a while. If I think back to the peak of Microsoft's prominence in the PC era, it was like Windows 95, and you felt like, "Wow, Microsoft is just so strong." And then of course Linux comes along, and a lot of open source changes, and lo and behold, a whole new set of leaders emerges. And you see the same thing today with the Trillionaire's Club, and you feel like, "Wow, even COVID has been a tailwind for them." But you think about, "Okay, where could the disruption come to these large players that own huge clouds and have all the data?" Is data potentially a disruptor for what appear to be insurmountable odds against the newbies?

>> There's always people coming up with new ways to leverage data, or new sources of data to capture.
So yeah, they're certainly not going to be around forever. But it's been really fascinating to see the transformation of some companies. I think nobody really exemplifies it more than IBM, where they emerged from originally selling meat slicers; the Dayton Meat Slicer was their original product. And then they evolved into manual business machines, and then electronic business machines, and they dominated that. Then they dominated the mainframe software industry. Then they dominated the PC industry. Then they dominated the services industry to some degree. And now they're starting to get into data. And I think following that trajectory is something that really any organization should be looking at: when do you actually become a data company, not just a product company or a service company?

>> We have Inderpal Bhandari as one of our guests here. He's the Chief--

>> Sure.

>> --Data Officer of IBM; you know him well. And he talks about the journey that he's undertaken to transform the company into a data company. I think a lot of people don't really realize what's actually going on behind the scenes, whether it's financially oriented or revenue opportunities. But one of the things he stressed to me in our interview was that, on average, they're reducing the end-to-end cycle time from raw data to insights by 70%. For a company that size, that's just an enormous cost savings or revenue-generating opportunity.

>> There's no doubt that the technology behind data pipelines is improving, and the process of moving data from those pipelines directly into predictive or diagnostic or prescriptive output is a lot more accelerated than in the early days of data warehousing.

>> Is the skills barrier acute? It seems like it's lessened somewhat; in the early Hadoop days you needed... even data scientists... Is it still just a massive skill shortage, or are we starting to attack that?
>> Well, I think companies are figuring out a way around the skill shortage by doing things like self-service analytics and focusing on easier-to-use, mainstream AI or advanced analytics technologies. But there's still very much a need for data scientists in organizations, and there's difficulty in finding people who are true data scientists. There's no real certification, and so really anybody can call themselves a data scientist, but I think companies are getting good at interviewing and determining whether somebody's got the goods or not. But there are other types of skills that we don't really focus on, like data engineering skills; there's still a huge need for data engineering. Data doesn't self-organize. There are some augmented analytics technologies that will automatically generate analytic output, but there really aren't technologies that automatically self-organize data. And so there's a huge need for data engineers. And then, as we talked about, there's a large interest in external data, and harvesting that and then ingesting it, and even identifying what external data is out there. So one of the emerging roles that we're seeing, if not the sexiest role of the 21st century, is the role of the data curator: somebody who acts as a librarian, identifying external data assets that are potentially valuable, testing them, evaluating them, negotiating, and then figuring out how to ingest that data. So I think that's a really important role for an organization to have. Most companies have an entire department that procures office supplies, but they don't have anybody who's procuring data supplies. And when you think about which is more valuable to an organization, how do you not have somebody who's dedicated to identifying the world of external data assets that are out there? There are 10 million data sets published by governments, organizations and NGOs. There are thousands and thousands of data brokers aggregating and sharing data.
There's web content that can be harvested, there's data from your partners and suppliers, there's data from social media. So to not have somebody who's on top of all that demonstrates gross negligence by the organization.

>> That is such an enlightening point, Doug. My last question is, I wonder if you can share with us how the pandemic has affected your business personally. As a consultant, you're on the road a lot; obviously you're not on the road so much now, you're doing a lot of chalk talks, et cetera. How have you managed through this, and how have you been able to maintain your efficacy with your clients?

>> Most of our clients, given that they're in the digital world a bit already, made the switch pretty quickly. Some of them took a month or two, and some things went on hold, but we're still seeing the same level of enthusiasm for data and doing things with data. In fact some companies have taken our (mumbles) that data to be their best defense in a crisis like this. It's affected our business, and it's enabled us to do much more international work more easily than we used to. And I probably spend a lot less time on planes, so it gives me more time for writing and speaking and actually doing consulting. So that's been nice as well.

>> Yeah, there's that bonus. Obviously theCUBE, yes, we're not doing physical events anymore, but hey, we've got two studios operating. And Doug Laney, really appreciate you coming on. (Doug mumbles) Always a great guest. Thanks for sharing your insights, and have a great MIT CDOIQ.

>> Thanks, you too, Dave. Take care. (mumbles)

>> Thanks, Doug. All right, and thank you everybody for watching. This is Dave Vellante for theCUBE. Our continuous coverage of the MIT Chief Data Officer conference, MIT CDOIQ, will be right back, right after this short break. (bright music)

Published Date : Sep 3 2020


Joe Caserta & Doug Laney, Caserta | MIT CDOIQ 2019


 

>> Announcer: From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media.

>> Hi everybody. We're back in Cambridge, Massachusetts at the MIT Chief Data Officer and Information Quality event, hashtag MITCDOIQ. I'm Dave Vellante; he's Paul Gillin. Day one of our two-day coverage of this event. This is theCUBE, the leader in live tech coverage. Joe Caserta is here; he is the president of Caserta, and Doug Laney, who is principal data strategist at Caserta, both CUBE alum guys. Great to see you again, Joe. What, did you pick up this guy? How did that happen? He came on here a couple of years ago and we had a great conversation. I read the book, loved it. So congratulations, a nice pickup.

>> We're very fortunate to have him.

>> Thanks. I'm fortunate to be here.

>> So, okay, well, what attracted you to Caserta?

>> Oh, Joe's got a tremendous reputation. His team of consultants has a great reputation. We both felt there was an opportunity to build some data strategy competency on top of that, and leverage some of those infonomics ideas that I've been working on over the years.

>> Great. Well, congratulations. And so, Joe, you and I have talked many times, and the reason I like talking to you is because you know what's going on in the marketplace. You can siphon what's real, what's hype. So what do you see as the big trends in this data space? And then we'll get into it.

>> Yeah, sure. The chief data officer role has been evolving over the last couple of years. You know, when we started doing this several years ago, there was just a handful of people, maybe 30, 40 people. Now there's 450 people here today, and it's been evolving. People are still trying to find their feet: exactly what the chief data officer should be doing, where they are in the hierarchy. Should they report to the CEO, the CIO, or the other CDO, which is the chief digital officer?
So I think, hierarchically, they're still figuring it out. Politically, they're figuring it out. But technically also, they're still trying to figure it out. What's been happening over the past three years is the evolution of data going from traditional data warehousing and business intelligence, where getting insight out of data just isn't working anymore. So we've been evolving that, moving it forward to the more modern data engineering we've been doing for the past couple of years with quote-unquote big data, and that's not working anymore either, right? Because it's been evolving so fast. So now we're on, like, maybe Data 3.0, and now we're talking about automating everything. We have to automate everything. And we have to change your mindset from having an output of a data solution to an outcome of a data solution. And that's why I hired Doug, because we have to figure out not only how to get this data and look at it and analyze it, but really how to monetize it. It's becoming a revenue stream for your business if you're doing it right, and Doug is the leader in the industry on how to figure that out.

>> A key premise of your book was that you've got to start valuing data; it's fundamental. You put forth a number of approaches and techniques and examples of companies doing that. Since you published "Infonomics," Microsoft, Apple, Amazon, Google and Facebook are the top five companies by market value. They've surpassed all the financial services guys, all the ExxonMobils, and any manufacturer, automobile makers. And they're all data companies, right? Absolutely. But intrinsically we know there's value there. Are we any closer to the prescription that you put forth?

>> Yeah, it's really no surprise. We found that data companies have a market-to-book value that's nearly 33 times the market average, so Apple and others are much higher than that.
But on average, if you look at the data product companies, they're valued much higher than other companies, probably because data can be reused in multiple ways. That's one of the core tenets of infonomics: that data is a non-depletable, regenerative, reusable asset, and that companies that get that, and architect businesses based on those economics of information, can really perform well. And not just data companies, but--

>> Any company. That was a key takeaway of the book. The data doesn't conform to the laws of scarcity. Everybody says data is the new oil. It's like, no, it's more valuable than that. So what are some examples, from writing your book and from customers that you work with, where do you see companies outside of these big data-driven firms breaking new ground in uses of data?

>> I think the biggest opportunity is really not with the big giant companies. Most of our most valuable clients are small companies with large volumes of data. And the reason why they can remain small companies with large volumes of data, the thing that holds back the big giant enterprises, is that the enterprises have so much technical debt. It's very hard. They're like trying to, you know, raise the Titanic, right? You can't really; it's not agile enough. You need something that's small and agile in order to pivot, because it is changing so fast: every time there's a solution created, it's obsolete, and we have to create the new solution. And when you have big old processes, big old technologies, big old mindsets and big old cultures, it's very hard to be agile.
And what we find, through the wealth of examples of companies that have benefited in significant ways from data and analytics, is that most of those solutions are very vocational. They're very functionally specific. They're not enterprise-class, yada-yada kinds of projects. They're focused on a particular business problem, or monetizing or leveraging data in a very specific way, and they're generating millions of dollars of value. But again, they tend to be very, very functionally specific. >> The other trend we're seeing is that the technology, and the end result of what you're doing with your data, is one thing. But really, in order to make that shift, your big enterprise's culture has to really change: all of the people within the organization need to migrate from being a conventional-wisdom-run company to being a data- and analytics-driven company, and that takes a lot of change management, a lot of what we call data therapy. We actually launched a new practice within the organization, one that Doug and I are collaborating on to really mature, because that is the next wave. We figured out the data part. We figured out the technology part, but now it's the people part. The people part is really why, even though we're way ahead of where we were a couple of years ago, we're not as far as we should be. Culturally, it's very, very challenging, and we need to address that head-on. >> And is that a skills issue, that they're sort of locked into their existing skill sets and processes? Or is it fear of the unknown? What about FOMO? >> Yeah, well, I mean, there are people jumping in, dying to do this, right? So there is that part, and that's an exciting part of it. But there's also just fear, you know, fear of the unknown, and, you know, that's part of what we're trying to do.
And that's why we're trying to push Doug's book: not for sales, but really just to share the knowledge, remove the mystery, and let people see what they can actually do with this data. >> Yeah, it's more than just data literacy. There's a lot of talk in the industry about data literacy programs, educating business people on the data and educating data people on the business. And that's obviously important. But what Joe is talking about is something bigger than that. It's really cultural, and it's a change to the company's DNA. >> So where do you attack that problem? Does it have to go from the top down, or do you go into the middle? >> It has to be from the top down. It has to be, because "my boss said to do it," all right? >> Well, otherwise they might do it, but... >> If you do it as a grassroots movement, only the folks who are excited, right, the FOMO people, they're the ones who are gonna be excited, but they're going to evolve and adopt anyway. It's the rest of the organization, and that needs to be a top-down approach. >> It was interesting hearing this morning's keynote speakers; they sort of threw top-down under the bus, but I had the same reaction: you can't do it without that executive buy-in. And of course we defined, I guess, in the session what that was. Amazon has an interesting concept for any initiative: every initiative that's funded has to have what they call a threaded leader, in other words some kind of... and if they don't have a threaded leader, there's like an incentive system to dime on the initiative and kill it. It kind of forces top-down, you know. >> So when we interview our clients, we have a litmus test, and the litmus test is kind of a readiness test: do you have the executive leadership to actually make this project successful?
And in a lot of cases, they don't, and we'll have to say, call us when you're ready. Another part of the litmus test is: is this IT-driven? If it's IT-driven, it's gonna be very tough to get it embraced by the rest of the business. So we really need to have that executive leadership from the business to say this is something that we need >> to do to survive. Yeah, and without the top-down support, you could play small ball, but if you're playing the Yankees, you're not gonna win. >> One of the reasons why it's very challenging when it's IT-driven is because the people part is a different budget from the IT budget. And when we start talking about data therapy, and human resources, and training and education on just culture and data literacy, which is not necessarily technical, that becomes a challenge internally: figuring out how to pay for it and how to get it done within the corporate politics. >> So the CDO crowd should definitely be adopting parts of your book, because to me their main job is: how does data support the monetization of my organization? Raising revenue, cutting costs, improving productivity, saving lives; you call it value. And so that seems to be the starting point. At the same time, this conference grew out of the ashes of back-room information quality, exploded through the big data hype, and has kind of gone full circle. But I wonder, is the CDO crowd still focused on that monetization? Certainly I think we all agree they should be, but they're getting sucked back into a governance role. Can they do both, I guess, is >> my question. Well, governance has been a big issue the past few years, with all of the new compliance regulations and the focus on ensuring compliance with them. But there's often a pendulum swing back, and I think there's a swing back to adding business value.
And so we're seeing a lot of opportunities to help companies monetize their data broadly, in a variety of ways, as you mentioned, not just in one way. And again, those need to be driven from the top. We have a process we go through to generate ideas, and that's wonderful; generating ideas is fairly straightforward. But then we run them through kind of a feasibility gauntlet, starting with: do you have the executive support? Is it technologically feasible, managerially feasible, ethically feasible, and so forth. So we kind of run them through that gauntlet next. >> One of my concerns is the chief data officer and the level of involvement that he or she has in these digital initiatives. Maybe digital initiatives are a field of dreams, but everywhere you go the CEO is trying to get digital right, and it seems like the chief data officer is not necessarily front and center in those, certainly AI projects, which are skunkworks; it's the chief digital officer that's driving it. So how do you see those roles playing off each other? >> In the last panel where I just spoke, a very similar question was asked. And again, we're trying to figure out the hierarchy of where the CDO should live in an organization. I find that the place it typically fails most is when it rolls up to a CIO, right? If you think data is a technical issue, you're wrong. Data is a business issue. And I also think that for any company to survive today, they have to have a digital presence. And digital presence is so tightly coupled to data that I find the best success is when the chief data officer reports directly to the chief digital officer. The chief digital officer has a vision for the user experience for the customer, to figure out: how do we get that customer engaged? And that directly depends on insight, right, on analytics.
You know, if the four of us were to open up any application on our phones, even for the same product, we'd have four different experiences based on who we are, who our peers are, what we bought in the past; that's all based on analytics. So the business application of the digital presence is tightly coupled to analytics, which is driven by the chief data officer. >> That's the first time I've heard that. I think that's the right organizational structure: the chief digital officer is going to be sort of the driver, right, the strategy, that's where the budget's gonna go, and the chief data officer is gonna have that supporting role that's vital, the enabler. Is the chief data officer a long-term play? Will we still have a lot of chief data officers 10 years from now? >> I think that data is not a fad. I think data just becomes more and more important. And will they ultimately leapfrog the chief digital officer and report to the CEO? Maybe someday, but for now, I think that's where they belong. >> You know, when companies started managing their labor and workforces as an actual asset, even though it's not a balance sheet asset, for obvious reasons, back in the 1960s, that gave rise to the chief human resource officer, which we still see today. And as companies start to recognize information as an asset, you need an executive leader to oversee and be responsible for that asset. >> Conceptually, it's always been that data is an asset and a liability, and we've always thought about balancing those terms. Your book sort of put forth a formula for actually formalizing that. Do you think it's gonna happen in our lifetime, what you put forth in your book, in terms of organizations actually valuing data specifically on the balance sheet? >> That's an accounting question, and one that you leave to the accounting professionals.
But there have been discussion papers published by the accounting standards bodies to discuss that issue. We're probably at least 10 years away. But I think, irrespective of whether data is a balance sheet asset or not, it's imperative for organizations to behave as if it is one. >> That was your point: it's probably not gonna happen soon, but you've got to figure out terms so that you can understand the value. >> Because it comes back to: you can't manage what you don't measure. If you're not measuring the value, or potential value, or quality of your information, or what data you have, you're in a poor position to manage it like an asset. And if you're not managing it like an asset, then you're probably not able to leverage it like one. >> Give us a little commercial for... >> I do want to say that I do think in our lifetime we will see it become an asset. There are lots of intangible assets that are on the books: intellectual property, contracts. I think data that supports both of those things is equally important, and they will see the light. Why are those five companies huge market-cap winners that have surpassed every other valuation of a business? Because the data that they have is considered, right? So it should be part of the assets on the books. >> All right, we've gotta wrap, but give us the Caserta commercial. >> Well, Caserta is a consultancy that does essentially three things. We do data advisory work, which Doug is heading up. We do data architecture and strategy, and we also do implementation of solutions, everything from data engineering to data architecture and data science. >> Well, you made a good bet on data. Thanks for coming on, you guys. Great to see you again. Thank you. That's a wrap on day one. Paul and I will be back tomorrow for day two of the MIT CDOIQ. Thanks for watching. We'll see you then.

Published Date : Jul 31 2019



Dr. Eng Lim Goh, HPE | HPE Discover 2021


 

>> Welcome back to HPE Discover 2021, theCUBE's virtual coverage, continuous coverage of HPE's annual customer event. My name is Dave Vellante, and we're going to dive into the intersection of high-performance computing, data and AI with Dr. Eng Lim Goh, who is the senior vice president and CTO for AI at Hewlett Packard Enterprise. Dr. Goh, great to see you again. Welcome back to theCUBE. >> Hello Dave, great to talk to you again. >> You might remember last year we talked a lot about swarm intelligence and how AI is evolving. Of course, you hosted the day two keynote here at Discover, and you talked about thriving in the age of insight and how to craft a data-centric strategy. And you addressed some of the biggest problems I think organizations face with data: data is plentiful, but insights are harder to come by. And you really dug into some great examples in retail, banking, medicine and health care, and media. But stepping back a little bit, to zoom out on Discover '21, what do you make of the event so far, and what are some of your big takeaways? >> Well, you started with an insightful question. Data is everywhere, but we lack the insight. That's the main reason why Antonio, on day one, focused on and talked about the fact that we are now in the age of insight, and how to thrive in this new age. What I then did in the day two keynote, following Antonio, was to talk about the challenges that we need to overcome in order to thrive in this new age. >> So maybe we could talk a little bit about some of the things that you took away; I'm specifically interested in some of the barriers to achieving insights when customers are drowning in data. What do you hear from customers? What do we take away from the ones you talked about today? >> Oh, very pertinent question.
Dave, you know, the two challenges I spoke about are what we need to overcome in order to thrive in this new age. The first one is the current challenge, and that current challenge, as stated, is the barriers to insight when we are awash with data. How do we overcome those barriers? What are the barriers to insight when we are awash in data? In the data keynote, I spoke about three main areas that we've seen from customers. The first barrier, for many of our customers, is that data is siloed. You know, like in a big corporation, you've got data siloed by sales, finance, engineering, manufacturing, supply chain and so on. And there's a major effort ongoing in many corporations to build a federation layer above all those silos, so that when you build applications above it, they can be more intelligent: they can have access to all the different silos of data to get better intelligence, and more intelligent applications can be built. So that was the first barrier we spoke about, barriers to insight when we are awash with data. The second barrier we see amongst our customers is that data is raw and dispersed when stored, and it's tough to get value out of it. In that case, I used the example of the May 6, 2010 event, where the stock market dropped a trillion dollars in tens of minutes. Those of us who are financially attuned know about this incident, but it is not the only incident; there are many of them out there. And for that particular May 6 event, it took a long time to get insight, months. For months we had no insight as to what happened and why it happened.
And there were many other incidences like this, and the regulators were looking for that one rule that could mitigate many of them. One of our customers decided to take the hard road, to go with the tough data, because data is raw and dispersed. So they went into all the different feeds of financial transaction information, took the tough road, and analyzed that data; it took a long time to assemble. And they discovered that there was quote stuffing: people were sending a lot of trades in and then cancelling them almost immediately, to manipulate the market. And why didn't we see it immediately? Well, the reason is that the processed reports everybody sees had a rule in them saying that all trades of less than 100 shares need not be reported. And so what people did was send a lot of less-than-100-share trades to fly under the radar and do this manipulation. So here is the second barrier: data can be raw and dispersed, and sometimes you just have to take the hard road to get insight; this is one great example. And then the last barrier has to do with the fact that sometimes, when you start a project to get answers and insight, you realize that all the data is around you, but you don't seem to find the right data to get what you need. Here we have three quick examples of customers. One was a great example, where they were trying to build a machine language translator between two languages. Now, to do that, they needed hundreds of millions of word pairs: words in one language compared with the corresponding words in the other. They asked, how are we going to get all these word pairs? Someone creative thought of a willing source, and that source was the United Nations, you see.
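The quote-stuffing pattern described in this exchange, floods of sub-100-share orders cancelled almost immediately, which the old reporting rule filtered out entirely, can be illustrated with a toy detection sketch. The order feed, the field names, and the 90% cancel-rate threshold below are all fabricated for illustration; none of this comes from the actual regulatory analysis.

```python
# Toy sketch: flag a feed where nearly all small (sub-100-share)
# orders are placed and then cancelled, the pattern hidden by the
# pre-2010 "under 100 shares need not be reported" rule.
# All data and thresholds here are made up for illustration.

def small_order_cancel_ratio(events, size_cutoff=100):
    """Fraction of sub-cutoff orders that were cancelled."""
    placed = sum(1 for e in events
                 if e["size"] < size_cutoff and e["action"] == "place")
    cancelled = sum(1 for e in events
                    if e["size"] < size_cutoff and e["action"] == "cancel")
    return cancelled / placed if placed else 0.0

# Fabricated raw feed: 20 small orders placed, 19 cancelled right away,
# plus one large order that the old processed reports would have shown.
feed = (
    [{"size": 50, "action": "place"} for _ in range(20)]
    + [{"size": 50, "action": "cancel"} for _ in range(19)]
    + [{"size": 500, "action": "place"}]
)

suspicious = small_order_cancel_ratio(feed) > 0.9  # abnormal cancel rate
```

The point of the sketch is the one Doug makes: the signal only appears when you go back to the raw, unfiltered feed rather than the processed report.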
So sometimes you think you don't have the right data with you, but there might be another source, and a willing one, that could give you that data. The second example has to do with sometimes just having to generate that data; an interesting one. We had an autonomous car customer that collects all this data from their cars: massive amounts of data, lots of sensors collecting lots of data. But sometimes they don't have the data they need even after collection. For example, they may have collected data with a car in fine weather, and collected the car driving on the highway in rain and also in snow, but never had the opportunity to collect the car in hail, because that's a rare occurrence. So instead of waiting for a time when the car could drive in hail, they built a simulation, by having the car collect in snow and simulating the hail. So these are some of the examples of customers working to overcome barriers. You have barriers associated with the fact that data is siloed: they federate it. Barriers associated with data that's tough to get at: they just take the hard road. And sometimes, thirdly, you just have to be creative to get the right data you need. >> Wow, I'll tell you, I have about 100 questions based on what you just said. That's a great example, the flash crash. In fact, Michael Lewis wrote about this in his book Flash Boys, and essentially it was high-frequency traders trying to front-run the market, sending in small block trades trying to get in on the front end of it. And they chalked it up to a glitch, like you said; for months, nobody really knew what it was. So technology got us into this problem. I guess my question is, can technology help us get out of the problem? And that maybe is where AI fits in. >> Yes, yes.
In fact, a lot of analytics went into going back to the raw data, which is highly dispersed across different sources, and assembling it to see if you can find a material trend. You can see lots of trends, right? Like, if humans look at things, we tend to see patterns in clouds. So sometimes you need to apply statistical analysis and math to be sure that what the model is seeing is real, and that requires work. That's one area. The second area is that there are times when you just need to go through that tough approach to find the answer. The issue that comes to mind now is that humans put in the rules that decide what goes into a report that everybody sees, and in this case, before the change in the rules. By the way, after the discovery, the authorities changed the rules, and now all trades of any size have to be reported. But the rule, as applied earlier, said that trades under 100 shares need not be reported. So sometimes you just have to understand that reports were decided by humans, for understandable reasons: they probably didn't want, for various reasons, to put everything in there, so that people could still read it in a reasonable amount of time. But we need to understand that rules were put in by humans for the reports we read, and as such, there are times you just need to go back to the raw data. >> I want to ask... >> It's gonna be tough. >> Yeah. So I want to ask a question about AI; it's obviously in your title, and it's something you know a lot about. I want to make a statement; you tell me if it's on point or off point.
It seems that most of the AI going on in the enterprise is modeling, data science applied to troves of data, but there's also a lot of AI going on in consumer, whether it's fingerprint technology or facial recognition or natural language processing. A two-part question: will the consumer market, as has so often happened, inform the enterprise? That's the first part. And then, will there be a shift from modeling, if you will, to, as you mentioned with autonomous vehicles, more AI inferencing in real time, especially at the edge? Can you help us understand that better? >> Yeah, it's a great question. There are three stages, to simplify; it's probably more sophisticated than that, but let's simplify to three stages of building an AI system that ultimately can make a prediction, or assist you in decision-making toward an outcome. You start with massive amounts of data, and you have to decide what to feed the machine with. So you feed the machine with this massive chunk of data, and the machine starts to evolve a model based on all the data it's seeing. It evolves to the point that, using a test set of data that you have separately kept aside, that you know the answer for, you then test the model after you've trained it with all that data, to see whether its prediction accuracy is high enough. And once you are satisfied with it, you then deploy the model to make the decisions, and that's the inference. So a lot of the time, depending on what we are focusing on, we in data science are working hard on assembling the right data to feed the machine with; that's the data preparation and organization work. And after that, you build your models: you have to pick the right models for the decisions and predictions you want to make, and then you start feeding the data to them.
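The three stages described here, prepare the data, train and validate against a held-out test set, then deploy for inference, can be sketched in miniature. The threshold "model" below, the numbers, and the 0.9 accuracy gate are all invented for illustration; a real pipeline would use a proper training framework, not this toy.

```python
# Minimal sketch of the train / validate / deploy flow: learn a
# threshold from labeled samples, check accuracy on held-out data
# the model never saw, and only then treat it as deployable.
# All samples and the 0.9 accuracy bar are fabricated.

def train(samples):
    """Stage 2: learn a threshold halfway between the class means."""
    pos = [x for x, label in samples if label == 1]
    neg = [x for x, label in samples if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(threshold, x):
    """Stage 3: inference on new data."""
    return 1 if x >= threshold else 0

def accuracy(threshold, samples):
    hits = sum(predict(threshold, x) == label for x, label in samples)
    return hits / len(samples)

train_set = [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1)]  # stage 1: prepared data
test_set = [(1.5, 0), (2.5, 0), (7.5, 1), (8.5, 1)]           # held-out, answers known

model = train(train_set)                        # class means 2 and 8 -> threshold 5.0
deployable = accuracy(model, test_set) >= 0.9   # gate deployment on test accuracy
```

As in the interview: if the held-out accuracy were not high enough, you would go back, try another model, and repeat before deploying.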
Sometimes you pick one model and the prediction isn't that robust; it is good, but it is not consistent. What you do then is try another model, so sometimes you just keep trying different models until you get the right kind, one that gives you good, robust decision-making and prediction, after which it is tested well and QA'd. You would then take that model and deploy it at the edge. And at the edge, it is essentially just looking at new data, applying it to the model that you have trained, and that model will give you a prediction, a decision. So it is these three stages. But more and more, your question reminds me, as the edge becomes more and more powerful, people are thinking: can you also do learning at the edge? That's the reason why we spoke about swarm learning the last time, learning at the edge as a swarm, because maybe individually they may not have enough power to do so, but as a swarm they may. >> Is that learning from the edge, or learning at the edge, in other words? >> Yes. >> Yeah, I understand the question. >> That's a great question. So the quick answer is learning at the edge, and also from the edge, but the main goal is to learn at the edge, so that you don't have to move the data the edge sees back to the cloud or the core to do the learning. That's one of the main reasons why you want to learn at the edge: so that you don't need to send all that data back and assemble it from all the different edge devices at the cloud side to do the learning. With swarm learning, you can keep the data at the edge and learn at that point.
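The swarm idea discussed here, each node learning from its own local data and sharing only model parameters, never the raw data, can be illustrated with a toy merge. HPE's actual swarm learning is a far more sophisticated peer-to-peer protocol; the scalar "models" below are an invented simplification for illustration only.

```python
# Toy sketch of parameter merging across edge nodes: raw data stays
# local, only each node's learned parameter (here, just a mean) and
# its sample count are exchanged. The weighted merge reproduces what
# central training on the pooled data would have computed.

def local_train(local_data):
    """Each node's 'model' is just the mean of its local samples."""
    return sum(local_data) / len(local_data)

def swarm_merge(params, counts):
    """Weighted average of node parameters, weighted by sample count."""
    total = sum(counts)
    return sum(p * n for p, n in zip(params, counts)) / total

node_data = [[1.0, 2.0, 3.0], [10.0, 20.0], [4.0]]  # raw data never leaves a node
params = [local_train(d) for d in node_data]         # shared: parameters only
counts = [len(d) for d in node_data]                 # shared: sample counts only
global_model = swarm_merge(params, counts)           # equals the global mean
```

The design point matches the interview: the bandwidth-heavy raw data stays at the edge, and only the tiny learned parameters travel.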
>> And then maybe you only selectively send data back. The autonomous vehicle example you gave is great, because maybe they're only persisting data from inclement weather, or when a deer runs across the front, and then they send that smaller data set back, and maybe that's where the modeling is done, but the rest can be done at the edge. It's a new world that's coming. Let me ask you a question: is there a limit to what data should be collected and how it should be collected? >> That's a great question again; wow, today is full of these insightful questions. That actually touches on the second challenge. In order to thrive in this new age of insight, the second challenge is our future challenge: what do we do for our future? And in there, the statement we make is that we have to focus on collecting data strategically for the future of our enterprise. Within that, I talk about what to collect, when to organize it as you collect it, and where your data will be going forward as you collect it. So: what, when and where. For the what, what data to collect, that was the question you asked. It's a question that different industries have to ask themselves, because it will vary. Let me use the autonomous car example. We have this customer collecting massive amounts of data; we're talking about 10 petabytes a day from the fleet of their cars. And these are not production autonomous cars; these are training autonomous cars, collecting data so they can train and eventually deploy commercial cars. So these data collection cars, as a fleet, collect 10 petabytes a day, and when it came to us building a storage system to store all of that data, they realized they didn't want to afford to store all of it.
Now here comes the dilemma: after I've spent so much effort building all these cars and sensors and collecting data, should I now decide what to delete? That's a dilemma. In working with them on this process of trimming down what they collected, I'm constantly reminded of the sixties and seventies. In the sixties and seventies, we called a large part of our DNA "junk DNA." Today we realize that a large part of what we called junk has function, valuable function. They are not genes, but they regulate the function of genes. So what's junk yesterday could be valuable today, and what's junk today could be valuable tomorrow. So there's this tension going on, between deciding you can't afford to store everything you can get your hands on, and on the other hand worrying that you've ignored the wrong data. You can see this tension in our customers, and it depends on the industry. In health care, they say: I have no choice, I want it all. One very insightful point brought up by one health care provider that really touched me was: we don't only care, of course, we care a lot, about the people we are caring for; we also care about the people we are not caring for. How do we find them? And therefore they don't just need to collect the data they have from their patients; they also need to reach out to outside data, so that they can figure out who they are not caring for. So they want it all. So I ask them, what do you do about funding if you want it all? They say they have no choice but to figure out a way to fund it, and perhaps monetization of what they have now is the way to fund it.
Of course, they also rightfully come back to us saying that we then have to work out a way to help them build that system. So that's health care. If you go to other industries like banking, they say they can't afford to keep it all, yet they are regulated, just as health care is regulated, as to privacy and such. So there are many examples of different industries with different needs and different approaches to what they collect, but always this constant tension between deciding not to fund storing everything you could store and, on the other hand, worrying that some of what you decide not to store becomes highly valuable in the future.
>> We can make some assumptions about the future, can't we? We know there's going to be a lot more data than we've ever seen before. We know, notwithstanding supply constraints on things like NAND, that the price of storage is going to continue to decline. We also know, and not a lot of people are talking about this, that processing power, even if Moore's Law is waning, is actually increasing when you combine the CPUs and NPUs and GPUs and accelerators and so forth. So when you think about these use cases at the edge, you're going to have much more processing power, cheaper storage, and less expensive processing. As an AI practitioner, what can you do with that?
>> Another insightful question, and it touches on the "where": where will your data be? We have one estimate that says that by next year there will be 55 billion connected devices out there. The population of the world is on the order of 10 billion, but this is 55 billion, and most of them can collect data. So the amount of data that's going to come in will far exceed the drop in storage costs and the increase in compute power. What's the answer? Even the drop in price and the increase in bandwidth will be overwhelmed; 5G will be overwhelmed, given 55 billion devices collecting.
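A quick back-of-envelope check makes the scale argument concrete. The per-device rate below is an invented, deliberately conservative assumption:

```python
# Sanity check of the claim above with invented, conservative numbers:
# even a tiny per-device rate swamps any central collection strategy.

devices = 55 * 10**9              # the 55 billion connected devices quoted above
bytes_per_device_day = 10**6      # assume just 1 MB per device per day

total_per_day = devices * bytes_per_device_day
print(total_per_day / 1e15, "PB/day")          # 55.0 PB/day
print(total_per_day * 365 / 1e18, "EB/year")   # 20.075 EB/year
```

At 1 MB a day per device, already a gross underestimate for a camera or a car, the fleet produces tens of exabytes a year; hence the argument for leaving most of it at the edge.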
So the answer must be a balance, because you may not be able to afford to bring all the data from those 55 billion devices back to a central location or a number of central locations. Bandwidth-wise, even with 5G and SD-WAN, it will still be too expensive given the number of devices out there; and even with storage costs dropping, it will still be too expensive to try to store it all. So the answer, at least to mitigate the problem, must be to leave a lot of the data out at the edge and only send back the pertinent data, as you said before. But if you do that, how are we going to do machine learning at the core and the cloud side without all the data? You want rich data to train with; sometimes you want a mix of positive-type and negative-type data so you can train the machine in a more balanced way. So eventually, as we move forward with this huge number of devices at the edge, the answer must be to do machine learning at the edge. Today we don't have enough power there; the edge is typically characterized by lower energy capability and therefore lower compute power. But soon, with energy efficiency improving, edge devices will be able to do more with their compute power. Today we do inference at the edge: we train the model centrally, deploy it, and do inference at the edge. But more and more, I believe, given the massive amount of data at the edge, you will have to start doing machine learning at the edge, and when one device doesn't have enough power, you aggregate the compute power of multiple devices into a swarm and learn as a swarm.
>> Oh, interesting. So now, of course, if I were a fly on the wall in an HPE board meeting, I'd say, okay, HPE, as a leading provider of compute, how do you take advantage of that?
I know it's the future, but you must be thinking about that and participating in those markets. I know today you have Edgeline and other products. But it seems to me that this isn't the general-purpose computing we've known in the past; it's a new type of specialized computing. How are you thinking about participating in that opportunity for your customers?
>> The world will have to have a balance. Today the more common mode is to collect the data from the edge and train at some centralized location, or a number of centralized locations. Going forward, given the proliferation of edge devices, we'll need a balance: we need capability on the cloud side, and it has to be hybrid, and we need capability on the edge side. We want to build systems that on one hand are edge-adapted, meaning environmentally adapted, because the edge is different, a lot of the time these systems sit outside; they also need to be packaging-adapted and power-adapted, because many of these devices are battery-powered. So you have to build systems that adapt to that. But at the same time, they must not be custom; that's my belief. They must use standard processors and standard operating systems so that they can run a rich set of applications. That's also the insight behind what Antonio announced in 2018: $4 billion invested over the following four years to strengthen our edge portfolio, our edge product lines, our edge solutions.
>> Dr. Goh, I could go on for hours with you; you're just such a great guest. Let's close: what are you most excited about in the future of IT? Certainly HPE, but the industry in general.
>> I think the excitement is the customers, right?
The diversity of customers, and the diversity in the ways they have approached their different problems with data strategy. So the excitement is around data strategy. The statement made was profound: Antonio said we are in the age of insight, powered by data. That's the first line. The line that comes after it is that, as such, we are becoming more and more data-centric, with data the currency. Now, the next step is even more profound. We are going as far as saying that data should not be treated as cost anymore, but instead as an investment in a new asset class called data, with value on our balance sheet. This is a step change in thinking that is going to change the way we look at data and the way we value it. That's the exciting thing, because for me, as CTO for AI, a machine is only as intelligent as the data you feed it with; data is the source of machine learning's intelligence. So when people start to value data, and say that it is an investment when we collect it, that is very positive for AI, because an AI system gets more intelligent as it has larger amounts and greater diversity of data. It would be great if the community valued data well.
>> You certainly see it in the valuations of many companies these days. And I think increasingly you see it on the income statement, with data products and people monetizing data services, and maybe eventually you'll see it on the balance sheet. Doug Laney, when he was at Gartner Group, wrote a book about this, and a lot of people are thinking about it. That's a big change, isn't it, Doctor?
>> Yes. The question is the process and methods of valuation. But I believe we'll get there; we need to get started, and then we'll get there.
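On that "process and methods of valuation" point: one standard approach in the infonomics literature is the income approach, discounting the cash flows a data asset is expected to generate. A toy sketch, with invented figures:

```python
# Income-approach sketch for valuing a data asset: discount the future cash
# flows the data is expected to generate. All figures are invented.

def present_value(cash_flows, rate):
    """Net present value of year-end cash flows at the given discount rate."""
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows, start=1))

expected_cash_flows = [100.0, 100.0, 100.0]   # e.g. $100k/year from a data product
value = present_value(expected_cash_flows, rate=0.10)
print(round(value, 2))  # 248.69
```

Cost and market approaches are the other usual candidates; the hard part, as Dr. Goh says, is agreeing on the method, not the arithmetic.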
>> Dr. Goh, it's been a pleasure.
>> And AI will benefit greatly from it.
>> Oh yeah, no doubt people will better understand how to align some of these technology investments. Dr. Goh, great to see you again. Thanks so much for coming back on theCUBE. It's been a real pleasure.
>> Yes. A system is only as smart as the data you feed it with.
>> Excellent. We'll leave it there. Thank you for spending some time with us, and keep it right there for more great interviews from HPE Discover 21. This is Dave Vellante for theCUBE, the leader in enterprise tech coverage. We'll be right back.

Published Date : Jun 23 2021

Dr Eng Lim Goh, High Performance Computing & AI | HPE Discover 2021


 

>> Welcome back to HPE Discover 2021, theCUBE's virtual coverage, continuous coverage of HPE's annual customer event. My name is Dave Vellante, and we're going to dive into the intersection of high-performance computing, data, and AI with Dr. Eng Lim Goh, who is the Senior Vice President and CTO for AI at Hewlett Packard Enterprise. Dr. Goh, great to see you again. Welcome back to theCUBE.
>> Hello Dave, great to talk to you again.
>> You might remember last year we talked a lot about swarm intelligence and how AI is evolving. Of course, you hosted the day-two keynote here at Discover. You talked about thriving in the age of insights and how to craft a data-centric strategy, and you addressed some of the biggest problems I think organizations face with data: data is plentiful, but insights are harder to come by. And you dug into some great examples in retail, banking, medicine and health care, and media. But stepping back a little bit, zooming out on Discover 21, what do you make of the event so far, and what are some of your big takeaways?
>> Well, you started with an insightful question. Data is everywhere, but we lack the insight. That's the main reason why Antonio, on day one, focused on the fact that we are now in the age of insight, and how to thrive in this new age. What I then did in the day-two keynote following Antonio was to talk about the challenges that we need to overcome in order to thrive in this new age.
>> So maybe we could talk a little bit about some of the things that you took away. I'm specifically interested in some of the barriers to achieving insights when customers are drowning in data. What do you hear from customers?
>> Oh, very pertinent question.
Dave, the two challenges I spoke about are the ones we need to overcome in order to thrive in this new age. The first one is the current challenge, and that current challenge, as stated, is the barriers to insight when we are awash with data. What are those barriers? In the data keynote I spoke about three main areas we hear about from customers. The first barrier is that with many of our customers, data is siloed. In a big corporation you've got data siloed by sales, finance, engineering, manufacturing, supply chain, and so on, and there's a major effort ongoing in many corporations to build a federation layer above all those silos, so that the applications built on top can access all the different silos of data, get better intelligence, and be more intelligent applications. The second barrier we see among our customers is that data is raw and dispersed when stored, and it's tough to get value out of it. There I used the example of the May 6, 2010 event, where the stock market dropped a trillion dollars in tens of minutes. Those of us who are financially attuned know about this incident, but it is not the only one; there are many others out there. And for that particular May 6 event, it took a long time to get insight: for months we had no insight as to what happened or why it happened. And there were many other incidents like this.
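Returning to the first barrier for a moment: a federation layer's job, answering a question across silos without first copying them into one store, can be sketched minimally. The silo names and records below are invented:

```python
# Illustrative sketch (not an HPE product API): a minimal "federation layer"
# that answers a query across departmental data silos in place, joining on a
# shared key instead of copying everything into one store.

SILOS = {
    "sales":   [{"customer": "acme", "revenue": 120}, {"customer": "zenith", "revenue": 75}],
    "support": [{"customer": "acme", "open_tickets": 4}, {"customer": "zenith", "open_tickets": 0}],
}

def federated_view(customer):
    """Join one customer's records from every silo into a single view."""
    view = {"customer": customer}
    for silo, records in SILOS.items():
        for rec in records:
            if rec["customer"] == customer:
                # Merge the silo's fields, excluding the join key itself.
                view.update({k: v for k, v in rec.items() if k != "customer"})
    return view

print(federated_view("acme"))  # {'customer': 'acme', 'revenue': 120, 'open_tickets': 4}
```

Real federation layers add query pushdown, access control, and schema mapping, but the join-on-a-shared-key shape is the core of it.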
And the regulators were looking for that one rule that could mitigate many of these incidents. One of our customers decided to take the hard road and go with the tough data, because data is raw and dispersed. They went into all the different feeds of financial transaction information, took the tough road, and analyzed that data, which took a long time to assemble, and they discovered that there was quote stuffing: people were sending in a lot of trades and then cancelling them almost immediately, in order to manipulate the market. And why didn't we see it immediately? The reason is that the processed reports everybody sees had a rule in them saying that trades of less than 100 shares didn't need to be reported. So what the manipulators did was send a lot of trades of under 100 shares each, to fly under the radar. So here is the second barrier: data can be raw and dispersed, and sometimes you just have to take the hard road to get insight, and this is one great example. And then the last barrier has to do with the times when you start a project to get answers and insight, and you realize that all the data is around you, but you don't seem to find the right data to get what you need. Here we have three quick examples from customers. One was a great example: they were trying to build a machine language translator between two languages. To do that, they needed hundreds of millions of word pairs: text in one language lined up against the corresponding text in the other. Where were they going to get all those word pairs? Someone creative thought of a willing source, and a huge one: the United Nations.
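The pattern that analysis uncovered, many small orders cancelled almost immediately, is easy to express once the raw feed has been assembled. A sketch over an invented feed, with the old sub-100-share reporting cutoff as the size threshold:

```python
# Illustrative sketch of the pattern scan described above: flag orders that
# were both small (under the old 100-share reporting cutoff) and cancelled
# almost immediately. The trade feed and thresholds are invented.

trades = [
    {"id": 1, "shares": 50,  "placed": 0.000, "cancelled": 0.002},  # stuffed
    {"id": 2, "shares": 500, "placed": 0.100, "cancelled": None},   # real fill
    {"id": 3, "shares": 10,  "placed": 0.150, "cancelled": 0.151},  # stuffed
    {"id": 4, "shares": 80,  "placed": 0.300, "cancelled": 2.500},  # slow cancel
]

def flag_quote_stuffing(feed, max_shares=100, max_lifetime=0.01):
    """Return ids of sub-cutoff orders cancelled within max_lifetime seconds."""
    flagged = []
    for t in feed:
        if t["cancelled"] is None:
            continue
        lifetime = t["cancelled"] - t["placed"]
        if t["shares"] < max_shares and lifetime <= max_lifetime:
            flagged.append(t["id"])
    return flagged

print(flag_quote_stuffing(trades))  # [1, 3]
```

The hard part, as Dr. Goh says, was not this scan but assembling the dispersed raw feeds it runs over.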
So sometimes you think you don't have the right data, but there might be another source, and a willing one, that can give it to you. The second example: sometimes you just have to generate the data. An interesting one. We had an autonomous car customer that collects all this data from their cars, lots of sensors collecting lots of data, but sometimes they don't have the data they need even after collection. For example, they may have collected data from a car in fine weather, and from the car driving on the highway in rain and in snow, but never had the opportunity to collect data from the car in hail, because that's a rare occurrence. So instead of waiting for a time when the car can drive in hail, they built a simulation, starting from the data the car collected in snow, and simulated hail. So these are some examples of customers working to overcome barriers: barriers associated with data being siloed, so federate it; barriers associated with data that's tough to get value from, so take the hard road; and thirdly, sometimes you just have to be creative to get the right data you need.
>> Wow, I have about 100 questions based on what you just said. The flash crash is a great example. In fact, Michael Lewis wrote about this in his book Flash Boys, and essentially it was high-frequency traders trying to front-run the market by sending in small block trades to get on the front end of it. And they chalked it up to a glitch, like you said; for months, nobody really knew what it was. So technology got us into this problem. I guess my question is, can technology help us get out of the problem? And maybe that's where AI fits in.
>> Yes, yes.
In fact, a lot of analytics went into going back to the raw data, which is highly dispersed across different sources, and assembling it to see if you can find a material trend. You can see lots of apparent trends: when humans look at things, we tend to see patterns in clouds. So sometimes you need to apply statistical analysis and math to be sure that what the model is seeing is real, and that requires work. That's one area. The second area is that there are times when you just need to go through that tough approach to find the answer. The issue that comes to mind here is that humans put in the rules that decide what goes into the reports everybody sees, in this case the rules as they stood before the change. By the way, after the discovery, the authorities changed the rules, and now all trades, of any size, have to be reported. But earlier the rule said that trades under 100 shares need not be reported. So sometimes you just have to recognize that reports were designed by humans, and for understandable reasons: they probably didn't want to put everything in, so that people could still read a report in a reasonable amount of time. But we need to understand that the rules behind the reports we read were put in by humans, and as such, there are times you just need to go back to the raw data.
>> I want to ask...
>> Albeit that it's going to be tough.
>> Yeah. So I want to ask a question about AI. Obviously it's in your title, and it's something you know a lot about. I want to make a statement; you tell me if it's on point or off point.
It seems that most of the AI going on in the enterprise is modeling: data science applied to troves of data. But there's also a lot of AI going on in consumer, whether it's fingerprint technology or facial recognition or natural language processing. A two-part question: will the consumer market, as it so often has, inform the enterprise? That's the first part. And will there be a shift from modeling, if you will, to more, you mentioned autonomous vehicles, AI inferencing in real time, especially with the edge? Can you help us understand that better?
>> Yeah, it's a great question. To simplify, and it's probably more sophisticated than this, there are three stages to building an AI system that can ultimately make a prediction or assist you in decision making. You start with massive amounts of data, and you have to decide what to feed the machine. You feed the machine this massive chunk of data, and the machine starts to evolve a model based on all the data it is seeing. It evolves to the point where you test it, using a test set of data that you kept aside separately, data whose answers you already know, to see whether the model's prediction accuracy is high enough after training. Once you are satisfied with it, you deploy the model to make decisions, and that's the inference. So, depending on what we are focusing on, we in data science may be working hard on assembling the right data to feed the machine with, which is the data preparation and organization work. After that you build your models: you have to pick the right models for the decisions and predictions you want to make, and then you start feeding them the data.
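The three stages Dr. Goh lays out, train on data, validate against a held-out test set with known answers, then deploy for inference, can be shown end to end with a toy one-parameter model:

```python
# A minimal sketch of the three stages described: train a model on data,
# check its accuracy on a held-out test set whose answers are known, then
# deploy it for inference on new points. The "model" is a toy 1-D threshold.

train = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
test  = [(2.5, 0), (7.5, 1)]  # kept aside; never shown during training

def fit_threshold(data):
    """Stage 1: learn a split point as the midpoint between the class means."""
    lo = [x for x, y in data if y == 0]
    hi = [x for x, y in data if y == 1]
    return (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2

def accuracy(threshold, data):
    """Stage 2: test on held-out data with known answers."""
    correct = sum(1 for x, y in data if (x > threshold) == (y == 1))
    return correct / len(data)

threshold = fit_threshold(train)           # train
assert accuracy(threshold, test) == 1.0    # validate before deploying
predict = lambda x: int(x > threshold)     # Stage 3: deployed inference
print(threshold, predict(6.0))  # 5.0 1
```

Swapping the toy threshold for a real model changes nothing about the shape: train, test against known answers, then deploy the frozen model for inference.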
Sometimes you pick one model and the prediction isn't that robust; it is good, but it is not consistent. So you try another model, and sometimes you just keep trying different models until you find the one that gives you robust, consistent decision making and prediction. After the model is well tested and QA'ed, you take it and deploy it at the edge. At the edge, essentially, you take new data, apply it to the model you trained, and the model gives you a prediction or a decision. So it is these three stages. But more and more, and your question reminds me of this, as the edge becomes more and more powerful, people are asking: can you also do learning at the edge? That's the reason we spoke about swarm learning the last time, learning at the edge as a swarm, because individually the devices may not have enough power to do so, but as a swarm they might.
>> Is that learning from the edge, or learning at the edge?
>> That's a great question. The quick answer is learning at the edge, and also from the edge, but the main goal is to learn at the edge, so that you don't have to move the data the edge sees back to the cloud or the core to do the learning. That's one of the main reasons: you don't want to have to send all that data back, and assemble it from all the different edge devices on the cloud side, in order to learn. With swarm learning, you can keep the data at the edge and learn right at that point.
That's a big change isn't it? Dr >>yeah. Question is is the process and methods evaluation. Right. But uh I believe we'll get there, we need to get started then we'll get their belief >>doctor goes on and >>pleasure. And yeah and then the yeah I will will will will benefit greatly from it. >>Oh yeah, no doubt people will better understand how to align you know, some of these technology investments, Doctor goes great to see you again. Thanks so much for coming back in the cube. It's been a real pleasure. >>Yes. A system. It's only as smart as the data you feed it with. >>Excellent. We'll leave it there. Thank you for spending some time with us and keep it right there for more great interviews from HP discover 21. This is dave a lot for the cube. The leader in enterprise tech coverage right back.

Published Date : Jun 17 2021



Dr Eng Lim Goh, Vice President, CTO, High Performance Computing & AI


 

(upbeat music) >> Welcome back to HPE Discover 2021, theCube's virtual coverage, continuous coverage of HPE's annual customer event. My name is Dave Vellante and we're going to dive into the intersection of high-performance computing, data and AI with Dr. Eng Lim Goh, who's a Senior Vice President and CTO for AI at Hewlett Packard Enterprise. Dr. Goh, great to see you again. Welcome back to theCube. >> Hey, hello, Dave. Great to talk to you again. >> You might remember last year we talked a lot about swarm intelligence and how AI is evolving. Of course you hosted the Day 2 keynotes here at Discover, and you talked about thriving in the age of insights and how to craft a data-centric strategy, and you addressed some of the biggest problems I think organizations face with data. And look, data is plentiful, but insights are harder to come by, and you really dug into some great examples in retail, banking, medicine and healthcare, and media. But stepping back a little bit, we'll zoom out on Discover '21. What do you make of the event so far, and what are some of your big takeaways? >> Hmm, well, you started with an insightful question. Data is everywhere, but we lack the insight. That's a main reason why Antonio, on Day 1, focused on the fact that we are now in the age of insight and how to thrive in this new age. What I then did in the Day 2 keynote, following Antonio, was to talk about the challenges that we need to overcome in order to thrive in this new age. >> So maybe we could talk a little bit about some of the things you took away. I'm specifically interested in some of the barriers to achieving insights when customers are drowning in data. What do you hear from customers? What were your takeaways from some of the ones you talked about today? >> Very pertinent question, Dave.
You know, there are two challenges I spoke about that we need to overcome in order to thrive in this new age. The first one is the current challenge, and that current challenge is stated as: barriers to insight when we are awash with data. So that's the statement; how do we overcome those barriers? On the barriers to insight when we are awash in data, in the Day 2 keynote I spoke about three main areas that we see from customers. The first barrier is that with many of our customers, data is siloed. Like in a big corporation, you've got data siloed by sales, finance, engineering, manufacturing, supply chain, and so on. And there's a major effort ongoing in many corporations to build a federation layer above all those silos, so that when you build applications above it, they can be more intelligent. They can have access to all the different silos of data to get better intelligence and build more intelligent applications. So that was the first barrier we spoke about. The second barrier we see amongst our customers is that data is raw and dispersed when it is stored, and it's tough to get value out of it. In that case I used the example of the May 6, 2010 event, where the stock market dropped a trillion dollars in tens of minutes. Those of us who are financially attuned know about this incident, but it is not the only one; there are many of them out there. And for that particular May 6 event, it took a long time to get insight. For months we had no insight as to what happened or why it happened. There were many other incidents like this, and the regulators were looking for that one rule that could mitigate many of them. One of our customers decided to take the hard road and go with the tough data, because data is raw and dispersed.
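The federation-layer idea Dr. Goh describes can be sketched in a few lines. This is a minimal illustration, not HPE's implementation: the silo names, record fields, and join key below are all hypothetical.

```python
# Minimal sketch of a federation layer over departmental data silos.
# Each silo keeps its own records; the federation layer presents one
# unified view keyed by a shared identifier (here, a customer id).

sales_silo = {
    "cust-001": {"region": "EMEA", "last_order": 1250.0},
    "cust-002": {"region": "AMER", "last_order": 310.0},
}
finance_silo = {
    "cust-001": {"credit_limit": 5000.0},
    "cust-003": {"credit_limit": 750.0},
}

def federated_view(*silos):
    """Union records from every silo into one view keyed by customer id."""
    view = {}
    for silo in silos:
        for key, fields in silo.items():
            view.setdefault(key, {}).update(fields)
    return view

view = federated_view(sales_silo, finance_silo)
print(view["cust-001"])
```

An application built on `federated_view` sees one record per customer regardless of which department originally held each field, which is the "more intelligent applications" point made above.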
So they went into all the different feeds of financial transaction information, took the tough road, and analyzed that data, which took a long time to assemble. And they discovered that there was quote stuffing: people were sending a lot of trades in and then canceling them almost immediately, in order to manipulate the market. And why didn't we see it immediately? Well, the reason is that the processed reports everybody sees had a rule in there saying all trades of less than 100 shares don't need to be reported. And so what people did was send a lot of less-than-100-share trades to fly under the radar and do this manipulation. So here is the second barrier: data can be raw and dispersed, and sometimes you just have to take the hard road to get insight. This is one great example. And then the last barrier has to do with the fact that sometimes, when you start a project to get answers and insight, you realize that all the data is around you, but you don't seem to find the right data to get what you need. Here we have three quick examples from customers. One was a great example where they were trying to build a machine language translator between two languages. But in order to do that, they needed hundreds of millions of word pairs of one language compared with the corresponding hundreds of millions in the other. They said, "Where am I going to get all these word pairs?" Someone creative thought of a willing and huge source: it was the United Nations. So sometimes you think you don't have the right data with you, but there might be another source, and a willing one, that could give you that data. The second example: sometimes you may just have to generate that data. An interesting one. We had an autonomous car customer that collects all this data from their cars. Massive amounts of data, lots of sensors collecting lots of data.
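The quote-stuffing pattern described above is simple enough to sketch: many sub-100-share orders placed and canceled almost immediately. This is a toy reconstruction, not the customer's actual analysis; the field names and the 50 ms lifetime threshold are illustrative assumptions.

```python
# Toy reconstruction of the quote-stuffing pattern: orders that are both
# small enough to escape the old sub-100-share reporting rule and
# canceled within milliseconds of placement.

orders = [
    {"id": 1, "shares": 90,  "placed_ms": 0,  "canceled_ms": 10},
    {"id": 2, "shares": 80,  "placed_ms": 5,  "canceled_ms": 12},
    {"id": 3, "shares": 500, "placed_ms": 20, "canceled_ms": None},  # real trade, never canceled
    {"id": 4, "shares": 60,  "placed_ms": 30, "canceled_ms": 35},
]

def flag_quote_stuffing(orders, max_shares=100, max_life_ms=50):
    """Flag orders under max_shares that were canceled within max_life_ms."""
    flagged = []
    for o in orders:
        if o["canceled_ms"] is None:
            continue  # order actually executed or is still live
        lifetime = o["canceled_ms"] - o["placed_ms"]
        if o["shares"] < max_shares and lifetime <= max_life_ms:
            flagged.append(o["id"])
    return flagged

print(flag_quote_stuffing(orders))  # ids 1, 2, 4
```

The point of the example mirrors the interview: the pattern is invisible in the filtered reports, but trivial to surface once you work from the raw order feed.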
But sometimes they don't have the data they need even after collection. For example, they may have collected data with a car in fine weather, and collected the car driving on the highway in rain and also in snow, but never had the opportunity to collect the car in hail, because that's a rare occurrence. So instead of waiting for a time when the car can drive in hail, they built a simulation, taking the data collected in snow and simulating hail. So these are some of the examples where we have customers working to overcome barriers: barriers associated with data silos, which they federated; barriers associated with data that's tough to get at, where they just took the hard road; and thirdly, sometimes you just have to be creative to get the right data you need. >> Wow, I tell you, I have about 100 questions based on what you just said. The flash crash is a great example. In fact, Michael Lewis wrote about this in his book, the "Flash Boys," and essentially it was high-frequency traders trying to front-run the market by sending in small block trades. And they chalked it up to a glitch; like you said, for months nobody really knew what it was. So technology got us into this problem. I guess my question is, can technology help us get out of the problem? And maybe that's where AI fits in. >> Yes, yes. In fact, a lot of analytics work went in to go back to the raw data, which is highly dispersed from different sources, and assemble it to see if you can find a material trend. You can see lots of trends. When humans look at things, we tend to see patterns in clouds. So sometimes you need to apply statistical analysis and math to be sure that what the model is seeing is real, and that required work. That's one area. The second area is that there are times when you just need to go through that tough approach to find the answer.
Now, the issue that comes to mind is that humans put in the rules to decide what goes into a report that everybody sees. By the way, after the discovery, the authorities changed the rules so that all trades, of any size, have to be reported. But earlier, the rule said that trades under 100 shares need not be reported. So sometimes you just have to understand that reports were decided by humans, and for understandable reasons. They probably wanted, for various reasons, not to put everything in there so that people could still read the report in a reasonable amount of time. But we need to understand that the rules behind the reports we read were put in by humans, and as such, there are times we just need to go back to the raw data. >> I want to ask you-- >> Albeit that it's going to be tough to get there. >> Yeah, so I want to ask you a question about AI, as obviously it's in your title and it's something you know a lot about, and I'm going to make a statement. You tell me if it's on point or off point. It seems that most of the AI going on in the enterprise is modeling, data science applied to troves of data. But there's also a lot of AI going on in consumer, whether it's fingerprint technology or facial recognition or natural language processing. So, a two-part question: will the consumer market, as it has so often, sort of inform the enterprise? That's the first part. And then, will there be a shift from modeling, if you will, to more, you mentioned autonomous vehicles, more AI inferencing in real time, especially with the Edge? I think you can help us understand that better. >> Yeah, this is a great question. To simplify, there are three stages, I mean, it's probably more sophisticated than that, but let's just simplify, three stages to building an AI system that ultimately can make a prediction.
Or assist you in decision-making, to have an outcome. You start with the data, massive amounts of data, and you have to decide what to feed the machine with. So you feed the machine with this massive chunk of data, and the machine starts to evolve a model based on all the data it's seeing. To the point that, using a test set of data that you have kept aside separately, that you know the answer for, you test the model after you've trained it with all that data, to see whether its prediction accuracy is high enough. And once you are satisfied with it, you then deploy the model to make the decision, and that's the inference. So a lot of times, depending on what we are focusing on, we in data science are working hard on assembling the right data to feed the machine with. That's the data preparation and organization work. After which, you build your models; you have to pick the right models for the decisions and predictions you want to make. You pick the right models and then you start feeding the data to them. Sometimes you pick one model and the prediction isn't that robust. It is good, but it is not consistent. So what you do is try another model. Sometimes you just keep trying different models until you get the right kind that gives you good, robust decision-making and prediction. After which, if it tests well, you will then take that model and deploy it at the Edge. And at the Edge you are essentially just looking at new data, applying it to the model that you have trained, and then that model will give you a prediction or a decision. So it is these three stages. But more and more, your question reminds me, people are thinking: as the Edge becomes more and more powerful, can you also do learning at the Edge? That's the reason why we spoke about swarm learning the last time, learning at the Edge as a swarm.
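The three stages described above — prepare the data, keep trying models against a held-out test set, then deploy the winner for inference — can be shown in miniature. The models and data below are toy stand-ins, not a real HPE pipeline.

```python
# Stage 1: data preparation — training data plus a test set kept aside.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]   # (x, y) pairs
test  = [(5, 10.1), (6, 11.8)]                      # known answers, held out

def fit_linear(data):
    # Fit y ≈ a*x by least squares through the origin.
    a = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
    return lambda x: a * x

def fit_mean(data):
    # Baseline model: always predict the mean of y.
    m = sum(y for _, y in data) / len(data)
    return lambda x: m

def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Stage 2: keep trying models until one is robust enough on the test set.
candidates = {"linear": fit_linear(train), "mean": fit_mean(train)}
best_name, best_model = min(candidates.items(), key=lambda kv: mse(kv[1], test))

# Stage 3: "deploy" the winner and run inference on new data.
prediction = best_model(7)
print(best_name, round(prediction, 2))
```

The `min(..., key=...)` line is the "sometimes you just keep trying different models" step: whichever candidate scores best on the held-out answers is the one that gets deployed.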
Because individually they may not have enough power to do so, but as a swarm, they may. >> Is that learning from the Edge or learning at the Edge? In other words, is it-- >> Yes. >> Yeah, you understand my question, yeah. >> That's a great question. The answer is learning at the Edge, and also from the Edge, but the main goal is to learn at the Edge so that you don't have to move the data the Edge sees back to the Cloud or the core to do the learning. That is one of the main reasons why you want to learn at the Edge: so that you don't have to send all that data back and assemble it from all the different Edge devices on the Cloud side to do the learning. With swarm learning, you can keep the data at the Edge and learn at that point. >> And then maybe only selectively send data back. The autonomous vehicle example you gave is great, because maybe they're only persisting certain data. They're not persisting data from inclement weather, or when a deer runs across the front, and then maybe they do persist that, and they send that smaller data set back, and maybe that's where the modeling is done, but the rest can be done at the Edge. It's a new world that's coming. Let me ask you a question: is there a limit to what data should be collected and how it should be collected? >> That's a great question again; today is full of these insightful questions. That actually touches on the second challenge, how to thrive in this new age of insight. The second challenge is our future challenge: what do we do for our future? And in there, the statement we make is that we have to focus on collecting data strategically for the future of our enterprise. And within that, I talk about what to collect, when to organize it as you collect, and where the data you are collecting will be going forward.
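The swarm-learning exchange above — keep the raw data at each Edge device, share only model parameters — can be sketched as follows. This is a hedged illustration of the parameter exchange only; real swarm learning (for example HPE's) also handles coordination, security, and privacy, none of which appears here, and the node names and data are hypothetical.

```python
# Each edge node fits a model on its own private data; only the fitted
# parameter — never the raw data — is shared and averaged by the swarm.

node_data = {
    "edge-1": [(1, 2.0), (2, 4.1)],
    "edge-2": [(1, 1.9), (3, 6.3)],
    "edge-3": [(2, 3.8), (4, 8.2)],
}

def local_fit(data):
    """Fit y ≈ a*x on one node's local data; return only the coefficient."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

# The only thing that crosses the network is one number per node.
local_params = {name: local_fit(data) for name, data in node_data.items()}
swarm_param = sum(local_params.values()) / len(local_params)
print(round(swarm_param, 3))
```

Averaging parameters instead of shipping data is the point of the exercise: each node's training set never leaves the device, yet every node can adopt the swarm-averaged model.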
So what, when, and where. For what data to collect, that was the question you asked. It's a question that different industries have to ask themselves, because it will vary. Let me use the autonomous car example. You have this customer collecting massive amounts of data. We're talking about 10 petabytes a day from a fleet of their cars, and these are not production autonomous cars. These are training autonomous cars, collecting data so they can train and eventually deploy commercial cars. As a fleet, these data collection cars collect 10 petabytes a day. And when it came to us building a storage system to store all of that data, they realized they can't afford to store all of it. Now here comes the dilemma: after I've spent so much effort building all these cars and sensors and collecting data, I now have to decide what to delete. That's a dilemma. In working with them on this process of trimming down what they collected, I'm constantly reminded of the 60s and 70s. In the 60s and 70s, we called a large part of our DNA junk DNA. Today we realize that a large part of what we called junk has valuable function. They are not genes, but they regulate the function of genes. So what was junk yesterday could be valuable today, and what's junk today could be valuable tomorrow. So there's this tension between deciding you can't afford to store everything you can get your hands on and, on the other hand, worrying that you'll ignore the wrong ones. You can see this tension in our customers, and it depends on the industry. In healthcare they say, I have no choice, I want it all. Why? One very insightful point brought up by one healthcare provider that really touched me was this: we don't only care about the people we are caring for, though of course we care a lot about them.
But we also care for the people we are not caring for. How do we find them? Therefore, they don't just need to collect the data they have from their patients; they also need to reach out to outside data so that they can figure out who they are not caring for. So they want it all. I asked them, "So what do you do about funding if you want it all?" They say they have no choice; they'll figure out a way to fund it, and perhaps monetization of what they have now is the way to fund that. Of course, they also come back to us, rightfully, saying we then have to work out a way to help them build such a system. So that's healthcare. If you go to other industries like banking, they say they can afford to keep it all, but they are regulated, same as healthcare; they are regulated as to privacy and such. So many examples, different industries having different needs and different approaches to what they collect. But there is this constant tension: on the one hand, you perhaps decide you don't want to fund storing all that you could store; on the other hand, if you decide not to store some of it, maybe that data becomes highly valuable in the future. You worry. >> Well, we can make some assumptions about the future, can't we? We know there's going to be a lot more data than we've ever seen before. We know, notwithstanding supply constraints on things like NAND, that the price of storage is going to continue to decline. We also know, and not a lot of people are really talking about this, but the processing power, everybody says Moore's Law is dead. Okay, it's waning, but the processing power, when you combine the CPUs and NPUs and GPUs and accelerators and so forth, actually is increasing. And so when you think about these use cases at the Edge, you're going to have much more processing power.
You're going to have cheaper storage and less expensive processing. And so as an AI practitioner, what can you do with that? >> Yeah, again another insightful question, one that we touched on in our keynote, and that gets to the where. Where will your data be? We have one estimate that says that by next year, there will be 55 billion connected devices out there. 55 billion. What's the population of the world? On the order of 10 billion, but this is 55 billion, and most of them can collect data. So what do you do? The amount of data that's going to come in is going to way exceed our drop in storage costs and our increasing compute power. So what's the answer? Even the drop in price and the increase in bandwidth won't keep up; it will overwhelm 5G, given 55 billion devices collecting. So the answer must be a balance, because you may not be able to afford to bring all the data from the 55 billion devices back to a set of central locations. Firstly, bandwidth, even with 5G, will still be too expensive given the number of devices out there; and even with storage costs dropping, it will still be too expensive to try and store it all. So the answer, at least to mitigate the problem, must be to leave a lot of the data out there and only send back the pertinent data, as you said before. But then if you did that, how are we going to do machine learning at the core and the Cloud side if you don't have all the data? You want rich data to train with. Sometimes you want a mix of the positive type of data and the negative type of data, so you can train the machine in a more balanced way.
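The "leave most of the data at the Edge, send back only the pertinent data" strategy can be sketched, with one twist from the discussion above: also sample a few normal readings so central training stays balanced between positive and negative examples. The thresholds and sampling rate below are illustrative assumptions, not values from the interview.

```python
# Edge-side filter: upload anomalous readings, plus every Nth normal
# reading so the central training set keeps negative examples too.

def select_for_upload(readings, low=10.0, high=30.0, sample_every=4):
    upload = []
    for i, value in enumerate(readings):
        anomalous = value < low or value > high
        sampled_normal = (not anomalous) and i % sample_every == 0
        if anomalous or sampled_normal:
            upload.append((i, value, "anomaly" if anomalous else "normal"))
    return upload

readings = [22.0, 21.5, 35.2, 20.0, 19.8, 8.1, 23.3, 22.9, 24.0]
print(select_for_upload(readings))
```

Here nine readings shrink to five uploads, and the uploaded set still contains both anomalies and normals, which is the "train the machine in a more balanced way" concern addressed at the collection point.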
So the answer must be that eventually, as we move forward with this huge number of devices at the Edge, we do machine learning at the Edge. Today we don't have enough power. The Edge typically is characterized by lower energy capability and therefore lower compute power. But soon, even with low energy, they can do more, with compute power improving in energy efficiency. So for learning at the Edge: today we do inference at the Edge. We take data, build a model, deploy it, and do inference at the Edge. That's what we do today. But more and more, I believe, given the massive amount of data at the Edge, you have to start doing machine learning at the Edge. And when you don't have enough power, you aggregate multiple devices' compute power into a swarm and learn as a swarm. >> Oh, interesting. So now of course, if I were a fly on the wall in an HPE Board meeting, I'd say, "Okay, HPE is a leading provider of compute. How do you take advantage of that?" I know it's the future, but you must be thinking about that and participating in those markets. I know today you have, you know, Edgeline and other products, but it seems to me that it's not the general-purpose computing that we've known in the past. It's a new type of specialized computing. How are you thinking about participating in that opportunity for your customers? >> The world will have to have a balance. Today the default, well, the more common mode, is to collect the data from the Edge and train at some centralized location, or a number of centralized locations. Going forward, given the proliferation of Edge devices, we'll need a balance. We need both. We need capability on the Cloud side, and it has to be hybrid, and then we need capability on the Edge side. We need to build systems that on one hand are Edge-adapted, meaning environmentally-adapted, because the Edge environment is different.
A lot of times they are on the outside; they need to be packaging-adapted and also power-adapted, because typically many of these devices are battery-powered. So you have to build systems that adapt to that. But at the same time, they must not be custom. That's my belief. They must use standard processors and standard operating systems so that they can run a rich set of applications. So yes, that's also the insight behind what Antonio announced in 2018: over the next four years from 2018, $4 billion invested to strengthen our Edge portfolio, our Edge product lines, our Edge solutions. >> Dr. Goh, I could go on for hours with you. You're just such a great guest. Let's close. What are you most excited about in the future of, certainly HPE, but the industry in general? >> Yeah, I think the excitement is the customers. The diversity of customers, and the diversity in the way they have approached their different problems with data strategy. So the excitement is around data strategy. The statement made for us was profound. Antonio said we are in the age of insight, powered by data. That's the first line. The line that comes after that is: as such, we are becoming more and more data-centric, with data the currency. Now the next step is even more profound. That is, we are going as far as saying that data should not be treated as cost anymore, no, but instead as an investment in a new asset class called data, with value on our balance sheet. This is a step change in thinking that is going to change the way we look at data and the way we value it. So that's a statement. This is the exciting thing because, for me, as a CTO for AI, a machine is only as intelligent as the data you feed it with. Data is the source for machine learning to be intelligent.
So that's why, when people start to value data and say that it is an investment when we collect it, it is very positive for AI, because an AI system gets more intelligent when it has huge amounts of data and a diversity of data. So it would be great if the community values data. >> Well, you certainly see it in the valuations of many companies these days. And I think increasingly you see it on the income statement: data products and people monetizing data services. And maybe eventually you'll see it on the balance sheet. I know Doug Laney, when he was at Gartner Group, wrote a book about this, and a lot of people are thinking about it. That's a big change, isn't it, Dr. Goh? >> Yeah, yeah. The question is the process and methods of valuation. But I believe we'll get there. We need to get started, and then we'll get there, I believe. >> Dr. Goh, it's always my pleasure. >> And then the AI will benefit greatly from it. >> Oh yeah, no doubt. People will better understand how to align some of these technology investments. Dr. Goh, great to see you again. Thanks so much for coming back on theCUBE. It's been a real pleasure. >> Yes, a system is only as smart as the data you feed it with. (both chuckling) >> Well, excellent, we'll leave it there. Thank you for spending some time with us, and keep it right there for more great interviews from HPE Discover '21. This is Dave Vellante for theCUBE, the leader in enterprise tech coverage. We'll be right back. (upbeat music)

Published Date : Jun 10 2021


Dr Eng Lim Goh, Vice President, CTO, High Performance Computing & AI


 

(upbeat music) >> Welcome back to HPE Discover 2021, theCUBE's virtual, continuous coverage of HPE's annual customer event. My name is Dave Vellante, and we're going to dive into the intersection of high-performance computing, data and AI with Dr. Eng Lim Goh, who's a Senior Vice President and CTO for AI at Hewlett Packard Enterprise. Dr. Goh, great to see you again. Welcome back to theCUBE. >> Hello, Dave, great to talk to you again. >> You might remember last year we talked a lot about swarm intelligence and how AI is evolving. Of course, you hosted the Day 2 keynotes here at Discover, and you talked about thriving in the age of insight and how to craft a data-centric strategy. And you addressed some of the biggest problems I think organizations face with data: data is plentiful, but insights are harder to come by. >> Yeah. >> And you really dug into some great examples in retail, banking, medicine, healthcare and media. But stepping back a little bit, zooming out on Discover '21, what do you make of the event so far, and what are some of your big takeaways? >> Hmm, well, we started with the insight question, right? Data is everywhere, but we lack the insight. That's a main reason why Antonio, on day one, focused on the fact that we are now in the age of insight, and how to thrive in that new age. What I then did in the Day 2 keynote following Antonio was to talk about the challenges that we need to overcome in order to thrive in this new age.
You know, the two challenges I spoke about that we need to overcome in order to thrive in this new age: the first one is the current challenge, and that current challenge, as stated, is barriers to insight when we are awash with data. So that's the statement; how do you overcome those barriers? What are the barriers to insight when we are awash in data? In the Day 2 keynote I spoke about three main things, three main areas that we hear from customers. The first barrier is that with many of our customers, data is siloed. In a big corporation, you've got data siloed by sales, finance, engineering, manufacturing, supply chain and so on, and there's a major effort ongoing in many corporations to build a federation layer above all those silos, so that the applications you build above it can be more intelligent: they can have access to all the different silos of data to get better intelligence, and more intelligent applications get built. So that was the first barrier we spoke about: barriers to insight when we are awash with data. The second barrier we see amongst our customers is that data is raw and dispersed when stored, and it's tough to get at and tough to get value out of. In that case I used the example of the May 6, 2010 event, where the stock market dropped a trillion dollars in a matter of minutes. Those who are financially attuned know about this incident, but this is not the only incident; there are many of them out there. And for that particular May 6 event, it took a long time to get insight. For months we had no insight as to what happened, or why it happened. There were many other incidences like this, and the regulators were looking for that one rule that could mitigate many of these incidences.
One of our customers decided to take the hard road, to go with the tough data, right? Because the data is raw and dispersed, they went into all the different feeds of financial transaction information; they took the tough road and analyzed that data, which took a long time to assemble. And they discovered that there was quote stuffing: people were sending a lot of trades in and then canceling them almost immediately, to manipulate the market. And why didn't we see it immediately? Well, the reason is that in the processed reports that everybody sees, there was a rule that said all trades of less than a hundred shares don't need to be reported. And so what people did was send a lot of less-than-a-hundred-share trades, to fly under the radar while doing this manipulation. So here is the second barrier, right? Data can be raw and dispersed. Sometimes you just have to take the hard road to get insight, and this is one great example. And then the last barrier has to do with the fact that sometimes, when you start a project to get answers and insight, you realize that all the data is around you, but you don't seem to find the right data to get what you need. Here we have three quick examples from customers. One was a great example, where they were trying to build a machine language translator between two languages. To do that, they needed to get hundreds of millions of word pairs, of one language paired with the corresponding other. They said, well, how am I going to get all these word pairs? Someone creative thought of a willing and huge source: it was the United Nations. You see? So sometimes you think you don't have the right data with you, but there might be another source, and a willing one, that could give you that data. The second one has to do with the fact that sometimes you may just have to generate that data.
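The quote-stuffing pattern described here, floods of small orders placed and cancelled almost immediately, each under the old 100-share reporting threshold, can be illustrated with a toy screen over raw order events. This is a minimal sketch; the event format, function name and thresholds are assumptions for illustration, not the customer's actual analysis:

```python
def flag_quote_stuffing(events, max_lifetime=0.05):
    """Return order ids that fit the quote-stuffing pattern:
    orders under 100 shares (the old reporting threshold) that
    were cancelled within `max_lifetime` seconds of placement.

    `events` is a chronological list of
    (order_id, action, shares, timestamp) tuples, where action
    is 'new' or 'cancel'. This toy format is an assumption."""
    open_orders = {}  # order_id -> (shares, time placed)
    flagged = set()
    for order_id, action, shares, ts in events:
        if action == "new":
            open_orders[order_id] = (shares, ts)
        elif action == "cancel" and order_id in open_orders:
            placed_shares, t0 = open_orders.pop(order_id)
            if placed_shares < 100 and ts - t0 <= max_lifetime:
                flagged.add(order_id)
    return flagged

# 'a' is a 50-share order cancelled after 10 ms: flagged.
# 'b' is large and 'c' lives a full second: neither is flagged.
events = [
    ("a", "new", 50, 0.000), ("a", "cancel", 50, 0.010),
    ("b", "new", 500, 0.000), ("b", "cancel", 500, 0.010),
    ("c", "new", 80, 0.000), ("c", "cancel", 80, 1.000),
]
print(flag_quote_stuffing(events))  # -> {'a'}
```

The point of the sketch is Dr. Goh's barrier: this signal only exists in the raw feed, because the filtered reports dropped exactly the sub-100-share events the screen needs.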
Interesting one: we had an autonomous car customer that collects all this data from their cars, right? Massive amounts of data, lots of sensors. But sometimes they don't have the data they need even after collection. For example, they may have collected data with the car in fine weather, and collected the car driving on the highway in rain and also in snow, but never had the opportunity to collect the car in hail, because that's a rare occurrence. So instead of waiting for a time when the car can drive in hail, they built a simulation, taking the data collected in snow and simulating hail. So these are some of the examples where we have customers working to overcome barriers. Barriers associated with data silos: they federated them. Barriers associated with data that's tough to get at: they just took the hard road. And sometimes, thirdly, you just have to be creative to get the right data you need. >> Wow! I tell you, I have about a hundred questions based on what you just said. (Dave chuckles) And as a great example, the Flash Crash: in fact, Michael Lewis wrote about this in his book, Flash Boys. And essentially it was high-frequency traders trying to front-run the market by sending in small block trades, (Dave chuckles) trying to get sort of front-ended. And they chalked it up to a glitch; like you said, for months nobody really knew what it was. So technology got us into this problem. (Dave chuckles) I guess my question is, can technology help us get out of the problem? And maybe that's where AI fits in? >> Yes, yes. In fact, a lot of analytics work went in to go back to the raw data, which is highly dispersed from different sources, and assemble it to see if you can find a material trend. You can see lots of trends, right? When humans look at things, we tend to see patterns in clouds, right?
So sometimes you need to apply statistical analysis, math, to be sure that what the model is seeing is real. That's one area. The second area is, there are times when you just need to go through that tough approach to find the answer. Now, the issue that comes to mind is that humans put in the rules to decide what goes into a report that everybody sees. That was the case before the change in the rules; by the way, after the discovery, the authorities changed the rules, and now all trades, of any size, have to be reported. >> Right. >> Right, yeah? But when the rule was applied, as I said earlier, trades under a hundred shares did not need to be reported. So sometimes you just have to understand that reports were decided by humans, and for understandable reasons. They probably didn't want, for various reasons, to put everything in there, so that people could still read it in a reasonable amount of time. But we need to understand that rules were put in by humans for the reports we read, and as such, there are times we just need to go back to the raw data. >> I want to ask you... >> Oh, but it could be, it's going to be tough, yeah. >> Yeah, I want to ask you a question about AI, as obviously it's in your title and it's something you know a lot about. And I'm going to make a statement; you tell me if it's on point or off point. It seems that most of the AI going on in the enterprise is modeling, data science applied to troves of data. But there's also a lot of AI going on in consumer, whether it's fingerprint technology or facial recognition or natural language processing. So, a two-part question: will the consumer market, as it so often has, inform the enterprise? That's the first part.
And then, will there be a shift from modeling, if you will, to more, you mentioned the autonomous vehicles, more AI inferencing in real time, especially at the Edge? Could you help us understand that better? >> Yeah, this is a great question. There are three stages, just to simplify; it's probably more sophisticated than that, but let's simplify to three stages of building an AI system that ultimately can make a prediction, or assist you in decision-making toward an outcome. You start with massive amounts of data, and you have to decide what to feed the machine with. You feed the machine with this massive chunk of data, and the machine starts to evolve a model based on all the data it's seeing. It evolves to the point that, using a test set of data you have kept aside separately, data you know the answer for, you test the model after training to see whether its prediction accuracy is high enough. And once you are satisfied with it, you then deploy the model to make the decisions, and that's the inference, right? So a lot of times, depending on what we are focusing on, we in data science are working hard on assembling the right data to feed the machine with. That's the data preparation and organization work. And then after that, you build your models. You have to pick the right models for the decisions and predictions you need to make. You pick the right models, and then you start feeding the data in. Sometimes you pick one model and the prediction isn't that robust; it is good, but it is not consistent. Then what you do is try another model. So sometimes you keep trying different models until you get the right kind that gives you robust decision-making and prediction. After that, if it's tested well, QA, you will then take that model and deploy it at the Edge.
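The three stages just described, feed the machine data, train and pick a model, test it against held-out data, then deploy it for inference, can be sketched in miniature in plain Python. A nearest-centroid classifier stands in here for whatever model family is being iterated over; all names and data are illustrative, not HPE's pipeline:

```python
import statistics

def train(samples):
    """Stages 1-2: feed the machine labelled data and evolve a model.
    Here the 'model' is just a centroid (mean point) per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: tuple(statistics.fmean(col) for col in zip(*rows))
            for label, rows in by_label.items()}

def predict(model, features):
    """Stage 3 (inference): apply the deployed model to new data
    by picking the label with the nearest centroid."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def accuracy(model, held_out):
    """Test the trained model on data it has never seen; only a
    model whose accuracy is high enough would get deployed."""
    hits = sum(predict(model, feats) == label for feats, label in held_out)
    return hits / len(held_out)

train_set = [((0, 0), "low"), ((1, 0), "low"),
             ((9, 9), "high"), ((10, 9), "high")]
test_set = [((0, 1), "low"), ((9, 10), "high")]  # kept aside separately
model = train(train_set)
print(accuracy(model, test_set))  # -> 1.0
```

If the held-out accuracy came back too low or inconsistent, the loop Dr. Goh describes repeats: swap in a different model family and train again before anything is deployed.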
Yeah, and then at the Edge, essentially, you're just looking at new data and applying it to the model that you have trained. That model then gives you a prediction or a decision. So it is these three stages, yeah. But more and more, your question reminds me, people are thinking, as the Edge becomes more and more powerful: can you also do learning at the Edge? >> Right. >> That's the reason why we spoke about swarm learning the last time: learning at the Edge as a swarm, right? Because individually they may not have enough power to do so, but as a swarm, they may. >> Is that learning from the Edge or learning at the Edge? In other words, is that... >> Yes. >> Yeah. You do understand my question. >> Yes. >> Yeah. (Dave chuckles) >> That's a great question. The quick answer is learning at the Edge, and also from the Edge, but the main goal is to learn at the Edge so that you don't have to move the data the Edge sees back to the Cloud or the core to do the learning. That would be one of the main reasons why you want to learn at the Edge: so that you don't need to send all that data back and assemble it from all the different Edge devices at the Cloud side to do the learning. You can keep the data at the Edge and learn at that point, yeah. >> And then maybe only selectively send. >> Yeah. >> The autonomous vehicle example you gave is great, 'cause maybe they're only persisting data from inclement weather, or when a deer runs across the front. They send that smaller data set back, and maybe that's where the modeling is done, but the rest can be done at the Edge. It's a new world that's coming through. Let me ask you a question: is there a limit to what data should be collected and how it should be collected?
>> That's a great question again, yeah. We're full of these insightful questions today. (Dr. Goh chuckles) That actually touches on the second challenge: in order to thrive in this new age of insight, the second challenge is our future challenge. What do we do for our future? And the statement we make there is that we have to focus on collecting data strategically for the future of our enterprise. Within that, I talked about what to collect, when to organize it as you collect, and where your data will be going forward. So: what, when, and where. For what data to collect, the question you asked, it's a question different industries have to ask themselves, because it will vary. Let me use your autonomous car example. We do have this customer collecting massive amounts of data; we're talking about 10 petabytes a day from a fleet of their cars. And these are not production autonomous cars; these are training autonomous cars, collecting data so they can train and eventually deploy commercial cars. These data-collection cars, as a fleet, collect 10 petabytes a day. And when they came to us to build a storage system to store all of that data, they realized they couldn't afford to store all of it. Now here comes the dilemma: after I've spent so much effort building all these cars and sensors and collecting data, I now have to decide what to delete. That's a dilemma. In working with them on this process of trimming down what they collected, I'm constantly reminded of the '60s and '70s. In the '60s and '70s, we called a large part of our DNA junk DNA. >> Yeah. (Dave chuckles) >> Ah! Today, we realize that a large part of what we called junk has function, valuable function.
They are not genes, but they regulate the function of genes. So what was junk yesterday could be valuable today, and what's junk today could be valuable tomorrow. So there's this tension going on, between deciding you can't afford to store everything you can get your hands on, and on the other hand worrying that you've ignored the wrong ones. You can see this tension in our customers, and it depends on the industry. In healthcare they say: I have no choice, I want it all. Oh, one very insightful point brought up by one healthcare provider that really touched me was this: of course we care a lot about the people we are caring for, but who cares for the people we are not caring for? How do we find them? >> Uh-huh. >> Right, and for that, they don't just need to collect the data they have from their patients; they also need to reach out to outside data so that they can figure out who they are not caring for. So they want it all. So I asked them: what do you do about funding if you want it all? They say they have no choice but to figure out a way to fund it, and perhaps monetization of what they have now is the way to fund that. Of course, they also came back to us, rightfully, to work out a way to help them build such a system. So that's healthcare. And if you go to other industries, like banking, they say they can afford to keep it all. >> Yeah. >> But they are regulated; like healthcare, they are regulated as to privacy and suchlike. So there are many examples of different industries having different needs and different approaches to what they collect. But there is this constant tension between perhaps deciding not to fund all of that, all that you could store, right?
But on the other hand, if you decide you can't afford it and choose not to store some, maybe that some becomes highly valuable in the future, right? (Dr. Goh chuckles) You worry. >> Well, we can make some assumptions about the future, can't we? I mean, we know there's going to be a lot more data than we've ever seen before. We know that, notwithstanding supply constraints in things like NAND, the price of storage is going to continue to decline. We also know, and not a lot of people are really talking about this, that processing power, though some say Moore's Law is dead, okay, it's waning, is actually increasing when you combine CPUs, NPUs, GPUs, accelerators and so forth. So when you think about these use cases at the Edge, you're going to have much more processing power, cheaper storage, and less expensive processing. So as an AI practitioner, what can you do with that? >> Yeah, again, another insightful question that we touched on in our keynote. And that goes to the where: where will your data be? We have one estimate that says that by next year there will be 55 billion connected devices out there. 55 billion, right? What's the population of the world, on the order of 10 billion? But this is 55 billion. (Dave chuckles) And many of them, most of them, can collect data. So what do you do? The amount of data that's going to come in is going to way exceed the drop in storage costs and the increase in compute power. >> Right. >> Right. So what's the answer? The answer must be that, even with a drop in price and an increase in bandwidth, it will overwhelm 5G, given the 55 billion of them collecting.
So the answer must be that there needs to be a balance between bringing all of that data from the 55 billion devices back to a central place, or a bunch of central places, because you may not be able to afford to do that. Firstly, bandwidth: even with 5G, it will still be too expensive given the number of devices out there. And given storage costs, even though they're dropping, it will still be too expensive to try and store it all. So the answer must be, at least to mitigate, to leave most of the data out there and only send back the pertinent data, as you said before. But then, if you did that, how are we going to do machine learning at the core and the Cloud side if you don't have all the data? You want rich data to train with; sometimes you want to mix the positive-type data and the negative-type data, so you can train the machine in a more balanced way. So the answer must be, eventually, as we move forward with these huge numbers of devices at the Edge, to do machine learning at the Edge. Today we don't even have the power: the Edge typically is characterized by lower energy capability and therefore lower compute power. But soon, even with low energy, they can do more, with compute power improving in energy efficiency. So, learning at the Edge: today we do inference at the Edge. We take data, train a model, deploy it, and do inference at the Edge. That's what we do today. But more and more, I believe, given the massive amount of data at the Edge, you have to start doing machine learning at the Edge. And when you don't have enough power, you aggregate multiple devices' compute power into a swarm, and learn as a swarm, yeah. >> Oh, interesting. So now, of course, if I were a fly on the wall in an HPE board meeting, I'd say, okay, HPE is a leading provider of compute. How do you take advantage of that?
I mean, I know it's the future, but you must be thinking about that and participating in those markets. I know today you have Edgeline and other products, but it seems to me that it's not the general-purpose computing we've known in the past. It's a new type of specialized computing. How are you thinking about participating in that opportunity for the customers? >> Hmm, we'll have to have a balance. Today the default, the more common mode, is to collect the data from the Edge and train at some centralized location, or a number of centralized locations. Going forward, given the proliferation of Edge devices, we'll need a balance; we need both. We need capability on the Cloud side, and it has to be hybrid. And then we need capability on the Edge side, where we need to build systems that on one hand are Edge-adapted, meaning environmentally adapted, because the Edge environment is different, a lot of times on the outside. They need to be packaging-adapted and also power-adapted, because typically many of these devices are battery-powered. So you have to build systems that adapt to that. But at the same time, they must not be custom; that's my belief. They must use standard processors and standard operating systems so that they can run a rich set of applications. So yes, that's also the insight behind what Antonio announced in 2018: for the next four years from 2018, $4 billion invested to strengthen our Edge portfolio. >> Uh-huh. >> Edge product lines. >> Right. >> Uh-huh, Edge solutions. >> I could go on for hours with you, Dr. Goh. You're just such a great guest. Let's close. What are you most excited about in the future of, certainly, HPE, but the industry in general? >> Yeah, I think the excitement is the customers, the diversity of customers and the diversity in the way they have approached different problems of data strategy.
So the excitement is around data strategy. The statement made for us was profound: Antonio said we are in the age of insight, powered by data. That's the first line. The line that comes after that is: as such, we are becoming more and more data-centric, with data the currency. Now the next step is even more profound. We are going as far as saying that data should not be treated as cost anymore. No. Instead, it is an investment in a new asset class called data, with value on our balance sheet. This is a step change in thinking that is going to change the way we look at data, the way we value it. (Dr. Goh chuckles) This is the exciting thing, because for me, a CTO for AI, a machine is only as intelligent as the data you feed it with. Data is the source of a machine learning to be intelligent. (Dr. Goh chuckles) So that's why, when people start to value data and say that it is an investment when we collect it, it is very positive for AI, because an AI system gets more intelligent when it has huge amounts of data and a diversity of data. >> Yeah. >> So it would be great if the community values data. >> Well, you certainly see it in the valuations of many companies these days. And I think increasingly you see it on the income statement, with data products and people monetizing data services. And maybe eventually you'll see it on the balance sheet. I know Doug Laney, when he was at Gartner Group, wrote a book about this, and a lot of people are thinking about it. That's a big change, isn't it? >> Yeah, yeah. >> Dr. Goh... (Dave chuckles) >> The question is the process and methods of valuation, right? >> Yeah, right. >> But I believe we will get there. We need to get started, and then we'll get there, I believe, yeah. >> Dr. Goh, it's always my pleasure.
>> And then the AI will benefit greatly from it. >> Oh yeah, no doubt. People will better understand how to align some of these technology investments. Dr. Goh, great to see you again. Thanks so much for coming back on theCUBE. It's been a real pleasure. >> Yes, a system is only as smart as the data you feed it with. (Dave chuckles) (Dr. Goh laughs) >> Excellent. We'll leave it there. Thank you for spending some time with us, and keep it right there for more great interviews from HPE Discover '21. This is Dave Vellante for theCUBE, the leader in enterprise tech coverage. We'll be right back. (upbeat music)
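The swarm idea Dr. Goh closes on, devices learning locally and pooling only model parameters while the raw data stays at the Edge, can be sketched as federated averaging on a toy one-parameter model. This is an illustration of the general technique under simplified assumptions, not HPE's Swarm Learning implementation:

```python
def local_step(w, samples, lr=0.01):
    """One gradient-descent step on a single edge device's private
    data, for the toy model y = w * x with squared loss. The raw
    samples never leave the device."""
    grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
    return w - lr * grad

def swarm_round(w, devices, lr=0.01):
    """One swarm round: each device updates the shared parameter on
    its own data, then only the parameters are averaged. No raw
    data crosses the network, just one number per device."""
    return sum(local_step(w, d, lr) for d in devices) / len(devices)

# Three edge devices each hold private samples of the same
# underlying process y = 3x; no single device sees all the data.
devices = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(4.0, 12.0), (5.0, 15.0)],
]
w = 0.0
for _ in range(200):
    w = swarm_round(w, devices)
print(round(w, 4))  # -> 3.0
```

The swarm converges on the slope 3 that none of the devices could learn as reliably alone, which is the trade Dr. Goh describes: shared intelligence without shipping 10 petabytes a day back to the core.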

Published Date : Jun 8 2021


Robert Abate, Global IDS | MIT CDOIQ 2019


 

>> From Cambridge, Massachusetts, it's theCUBE. Covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. (futuristic music) >> Welcome back to Cambridge, Massachusetts everybody. You're watching theCUBE, the leader in live tech coverage. We go out to the events and we extract the signal from the noise. This is day two, we're sort of wrapping up the Chief Data Officer event. It's MIT CDOIQ, it started as an information quality event, and with the ascendancy of big data the CDO emerged and really took center stage here. And it's interesting to note that it's kind of come full circle back to information quality. People are realizing all this data we have, you know the old saying, garbage in, garbage out. So the information quality world and this chief data officer world have really come colliding together. Robert Abate is here, he's the Vice President and CDO of Global IDS and also the co-chair of next year's, the 14th annual MIT CDOIQ. Robert, thanks for coming on. >> Oh, well thank you. >> Now you're a CDO by background, give us a little history of your career. >> Sure, sure. Well I started out with an Electrical Engineering degree and went into applications development. By 2000, I was leading Ralph Lauren's IT, and I realized when Ralph Lauren hired me, he was getting ready to go public. And his problem was he had hired eight different accounting firms to do eight different divisions. And each of those eight divisions was reporting a number, but the big number didn't add up, so he couldn't go public. So he searched the industry to find somebody who could figure out the problem. Now I was, at the time, working in applications and had built this system called Service Oriented Architectures, a way of integrating applications. And I said, "Well I don't know if I could solve the problem, but I'll give it a shot."
And what I did was, just by taking each silo as its own problem, which was what each accounting firm had done, I was able to figure out that one of Ralph Lauren's policies was if you buy a garment, you can return it anytime, anywhere, forever, however long you own it. And he didn't think about that, but what that meant is somebody could go to a Bloomingdale's, buy a garment and then go to his outlet store and return it. Well, the cross channels were different systems. So the outlet stores were his own business, retail was a different business, each one was completely different, each had its own AS/400, its own data. So what I quickly learned was, the problem wasn't the systems, the problem was the data. And it took me about two months to figure it out, and he offered me a job. He said, well, I was a consultant at the time, he says, "I'm offering you a job, you're going to run my IT." >> Great user experience but hard to count. >> (laughs) Hard to count. So that's when I, probably 1999 was when that happened. I went into data and started researching-- >> Sorry, so how long did it take you to figure that out? You said a couple of months? >> A couple of months, I think it was about two months. >> 'Cause jeez, it took Oracle what, 10 years to build Fusion with SOA? That's pretty good. (laughs) >> This was a little bit of luck. When we started integrating the applications we learned that the messages that we were sending back and forth didn't match, and we said, "Well that's impossible, it can't not match." But what didn't match was it was coming from one channel and being returned in another channel, and the returns showed here didn't balance with the returns on this side. So it was a data problem. >> So a forensics showdown. So what did you do after? >> After that I went into ICICI Bank, which was a large bank in India that was trying to integrate their systems, and again, this was a data problem.
But they heard me giving a talk at a conference on how SOA had solved the data challenge, and they said, "We're a bank with a wholesale, a retail, and other divisions, and we can't integrate the systems, can you?" I said, "Well yeah, I'd build a website and make them web services, and now what'll happen is each of those'll kind of communicate." And I was at ICICI Bank for about six months in Mumbai, and finished that, which was a success, came back and started consulting because now a lot of companies were really interested in this concept of Service Oriented Architectures. Back then, when we first published on it, myself, Peter Aiken, and a gentleman named Joseph Burke published on it in 1996. The publisher didn't accept the book, it was a really interesting thing. We wrote the book called "Services Based Architectures: A Way to Integrate Systems." And the way Wiley & Sons, or most publishers, work is, they'll have three industry experts read your book, and if they don't think what you're saying has any value, they forget about it. So one guy said this is brilliant, one guy says, "These guys don't know what they're talking about," and the third guy says, "I don't even think what they're talking about is feasible." So they decided not to publish. Four years later they came back and said, "We want to publish the book," and Peter said, "You know what, they lost their chance." We were ahead of them by four years, they didn't understand the technology. So that was kind of cool. So from there I went into consulting, eventually took a position as the Head of Enterprise and Director of Enterprise Information Architecture with Walmart. And Walmart, as you know, is a huge entity, almost the size of the federal government. So to build an architecture that integrates Walmart would've been a challenge, a behemoth challenge, and I took it on with a phenomenal team. >> And when was this, like what timeframe?
>> This was 2010, and by the end of 2010 we had presented an architecture to the CIO and the rest of the organization, and they came back to me about a week later and said, "Look, everybody agrees what you did was brilliant, but nobody knows how to implement it. So we're taking you away, you're no longer Director of Information Architecture, you're now Director of Enterprise Information Management. Build it. Prove that what you say you could do, you could do." So we built something called the Data CAFE, and CAFE was an acronym, it stood for: Collaborative Analytics Facility for the Enterprise. What we did was we took data from one of the divisions, because you didn't want to take on the whole beast, boil the ocean. We picked Sam's Club and we worked with their CFO, and because we had information about customers we were able to build a room with seven 80-inch monitors that surrounded anyone in the room. And in the center was the Cisco telecommunications so you could be a part of a meeting. >> The TelePresence. >> TelePresence. And we built one room in one facility, and one room in another facility, and we labeled the monitors, one red, one blue, one green, and we said, "There's got to be a way where we can build data science so it's interactive, so somebody, an executive, could walk into the room, touch the screen, and drill into features. And in another room the features would be changing simultaneously." And that's what we built. The room was brought up on Black Friday of 2013, and we were able to see the trends of sales on the East Coast, and the executives in the room, and these are the CEO of Walmart and the heads of Sam's Club and the like, were quickly able to change the distribution in the Mountain Time Zone and west time zones, because the sales on the East Coast gave them the idea, well, these things are going to sell, and these things aren't. And they saw a tremendous increase in productivity.
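The early-region pattern described here — an item trending on the East Coast before stores farther west close — can be sketched as a simple share-versus-baseline check. This is a hypothetical illustration, not Walmart's actual logic; the item names, unit figures, and the 1.5x lift threshold are all invented:

```python
# Sketch: flag items whose share of sales in an early-closing region (East Coast)
# runs well above that region's normal share of all sales, so stock in later
# time zones can be rebalanced before close. All data below is invented.
from collections import defaultdict

def early_region_signal(sales, region="east", lift_threshold=1.5):
    """sales: list of (region, item, units). Returns items whose share of units
    sold in `region` exceeds the region's overall share by `lift_threshold`."""
    by_item = defaultdict(int)          # total units per item, all regions
    by_item_region = defaultdict(int)   # units per item in the early region
    region_total = 0
    grand_total = 0
    for r, item, units in sales:
        by_item[item] += units
        grand_total += units
        if r == region:
            by_item_region[item] += units
            region_total += units
    if grand_total == 0:
        return []
    baseline = region_total / grand_total  # the region's normal share of sales
    trending = []
    for item, units in by_item.items():
        share = by_item_region[item] / units
        if baseline > 0 and share / baseline >= lift_threshold:
            trending.append(item)
    return sorted(trending)

sales = [
    ("east", "space heater", 900), ("west", "space heater", 100),
    ("east", "blender", 300), ("west", "blender", 310),
    ("west", "sunscreen", 2000),
]
print(early_region_signal(sales))  # → ['space heater']
```

Space heaters sell 90% in the east against an overall eastern share of about 33%, so they get flagged; blenders sell roughly evenly and do not.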
We received the 2014, my team received the 2014 Walmart Innovation Project of the Year. >> And that's no slouch. Walmart has always been heavily data-oriented. I don't know if it's urban legend or not, but the famous story in the '80s of the beer and the diapers, right? Walmart would position beer next to diapers, why would they do that? Well the father goes in to buy the diapers for the baby, picks up a six pack while he's on the way, so they just move those proximate to each other. (laughs) >> In terms of data, Walmart really learned that there's an advantage to understanding how to place items in places that, a path that you might take in a store, and knowing that path, they actually have a term for it, I believe it's called, I'm sorry, I forgot the name but it's-- >> Selling more stuff. (laughs) >> Yeah, it's selling more stuff. It's the way you position items on a shelf. And Walmart had the brilliance, or at least I thought it was brilliant, that they would make their vendors the data champion. So the vendor, let's say Procter & Gamble's a vendor, and they sell this one product the most. They would then be the champion for that aisle. Oh, it's called planogramming. So the planogramming, the way the shelves were organized, would be set up by Procter & Gamble for that entire area, working with all their other vendors. And so Walmart would give the data to them and say, "You do it." And what I was purporting was, well, we shouldn't just be giving the data away, we should be using that data. And that was the advent of that. From there I moved to Kimberly-Clark, I became Global Director of Enterprise Data Management and Analytics. Their challenge was they had different teams, there were four different instances of SAP around the globe. One for Latin America, one for North America called the Enterprise Edition, one for EMEA, Europe, Middle East, and Africa, and one for Asia-Pacific. 
Well when you have four different instances of SAP, that means your master data doesn't exist, because the same thing that happens in this facility is different here. And every company faces this challenge. If they implement more than one instance of a system, the specialty fields get used by different companies in different ways. >> The gold standard, the gold version. >> The golden version. So I built a team by bringing together all the different international teams, and created one team that was able to integrate best practices and standards around data governance, data quality. Built BI teams for each of the regions, and then a data science and advanced analytics team. >> Wow, so okay, so that makes you uniquely qualified to coach here at the conference. >> Oh, I don't know about that. (laughs) There are some real, there are some geniuses here. >> No but, I say that because these are your peeps. >> Yes, they are, they are. >> And so, you're a practitioner, this conference is all about practitioners talking to practitioners, it's content-heavy. There's not a lot of fluff. Lunches aren't sponsored, there's no lanyard sponsor, and it's not like, you know, there's very subtle sponsor desks, you have to have sponsors 'cause otherwise the conference's not enabled, and you've got costs associated with it. But it's a very intimate event and I think you guys want to keep it that way. >> And I really believe you're dead-on. When you go to most industry conferences, the industry conferences, the sponsors, you know, change the format or are heavily into the format. Here you have industry thought leaders from all over the globe. CDOs of major Fortune 500 companies who are working with their peers and exchanging ideas. I've had conversations with a number of CDOs, and the thought leadership at this conference, I've never seen this type of thought leadership in any conference.
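The beer-and-diapers placement story mentioned a little earlier rests on basket affinity: how often two items land in the same transaction compared with what independence would predict. A toy sketch of that lift calculation, with invented baskets:

```python
# Sketch of pairwise "lift" from market-basket data. Lift compares the observed
# co-occurrence rate of two items with the rate expected if they were bought
# independently; lift > 1 suggests an affinity worth acting on (e.g. shelf
# placement). The baskets below are invented for illustration.
from itertools import combinations
from collections import Counter

def pair_lift(baskets):
    """baskets: list of sets of item names. Returns {(a, b): lift} with a < b."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)
        item_counts.update(items)
        pair_counts.update(combinations(sorted(items), 2))
    lifts = {}
    for (a, b), together in pair_counts.items():
        # lift = P(a and b) / (P(a) * P(b))
        lifts[(a, b)] = (together / n) / ((item_counts[a] / n) * (item_counts[b] / n))
    return lifts

baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"milk", "chips"},
    {"milk", "bread"},
]
lifts = pair_lift(baskets)
print(round(lifts[("beer", "diapers")], 2))  # → 2.0
```

Here beer and diapers each appear in half the baskets, but always together, so their lift is 2.0 — twice what independent purchasing would predict.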
>> Yeah, I mean the percentage of presentations by practitioners, even when there's a vendor name, they have a practitioner, you know, internal practitioner presenting so it's 99.9% which is why people attend. We're moving venues next year, I understand. Just did a little tour of the new venue, so, going to be able to accommodate more attendees, so that's great. >> Yeah it is. >> So what are your objectives in thinking ahead a year from now? >> Well, you know, I'm taking over from my current peer, Dr. Arka Mukherjee, who just did a phenomenal job of finding speakers. People who are in the industry, who are presenting challenges, and allowing others to interact. So I hope could do a similar thing which is, find with my peers people who have real world challenges, bring them to the forum so they can be debated. On top of that, there are some amazing, you know, technology change is just so fast. One of the areas like big data I remember only five years ago the chart of big data vendors maybe had 50 people on it, now you would need the table to put all the vendors. >> Who's not a data vendor, you know? >> Who's not a data vendor? (laughs) So I would think the best thing we could do is, is find, just get all the CDOs and CDO-types into a room, and let us debate and talk about these points and issues. I've seen just some tremendous interactions, great questions, people giving advice to others. I've learned a lot here. >> And how about long term, where do you see this going? How many CDOs are there in the world, do you know? Is that a number that's known? >> That's a really interesting point because, you know, only five years ago there weren't that many CDOs to be called. And then Gartner four years ago or so put out an article saying, "Every company really should have a CDO." 
Not just for the purpose of advancing your data, and to Doug Laney's point that data is being monetized, there's a need to have someone responsible for information 'cause we're in the Information Age. And a CIO really is focused on infrastructure, making sure I've got my PCs, making sure I've got a LAN, I've got websites. The focus on data has really, because of the Information Age, has turned data into an asset. So organizations realize, if you utilize that asset, let me reverse this, if you don't use data as an asset, you will be out of business. I heard a quote, I don't know if it's true, "Only 10 years ago, 250 of the Fortune 10 no longer exists." >> Yeah, something like that, the turnover's amazing. >> Many of those companies were companies that decided not to make the change to be data-enabled, to make data decision processing. Companies still use data warehouses, they're always going to use them, and a warehouse is a rear-view mirror, it tells you what happened last week, last month, last year. But today's businesses work forward-looking. And just like driving a car, it'd be really hard to drive your car through a rear-view mirror. So what companies are doing today are saying, "Okay, let's start looking at this as forward-looking, "a prescriptive and predictive analytics, "rather than just what happened in the past." I'll give you an example. In a major company that is a supplier of consumer products, they were leading in the industry and their sales started to drop, and they didn't know why. Well, with a data science team, we were able to determine by pulling in data from the CDC, now these are sources that only 20 years ago nobody ever used to bring in data in the enterprise, now 60% of your data is external. So we brought in data from the CDC, we brought in data on maternal births from the national government, we brought in data from the Census Bureau, we brought in data from sources of advertising and targeted marketing towards mothers. 
Pulled all that data together and said, "Why are diaper sales down?" Well, they were targeting the large regions of the country and putting ads on TV stations in New York and California, big population centers. Birth rates in population centers have declined. Birth rates in certain other regions, like the south, and the Bible Belt, if I can call it that, have increased. So by changing the marketing, their product sales went up. >> Advertising to Texas. >> Well, you know, and that brings up one of the points, I heard a lecture today about ethics. We made it a point at Walmart that if you ran a query that reduced a result to less than five people, we wouldn't allow you to see the result. Because, think about it, I could say, "What is my neighbor buying? What are you buying?" So there's an ethical component to this as well. But that, you know, data is not political. Data is not chauvinistic. It doesn't discriminate, it just gives you facts. It's the interpretation of that that is hard for CDOs, because we have to say to someone, "Look, this is the fact, and your 25 years of experience in the business, granted, is tremendous and it's needed, but the facts are saying this, and that would mean that the business would have to change its direction." And it's hard for people to do, so it requires that. >> So whether it's called the chief data officer, whatever the data czar rubric is, the head of analytics, there's obviously the data quality component there, whatever that is, this is the conference for, as I called them, your peeps, for that role in the organization. People often ask, "Will that role be around?" I think it's clear, it's solidifying. Yes, you see the chief digital officer emerging and there's a lot of tailwinds there, but the information quality component, the data architecture component, it's here to stay. And this is the premier conference, the premier event, that I know of anyway.
There are a couple of others, perhaps, but it's great to see all the success. When I first came here in 2013 there were probably about 130 folks here. Today, I think there were almost 500 people registered. Next year, I think 600 is kind of the target, and I think it's very reasonable with the new space. So congratulations on all the success, and thank you for stepping up to the co-chair role, I really appreciate it. >> Well, let me tell you, I thank you guys. You provide a voice at these IT conferences that we really need, and that is the ability to get the message out. That people do think and care, the industry is not thoughtless and heartless. With all the data breaches and everything going on there's a lot of fear, fear, loathing, and anticipation. But having your voice, kind of like ESPN and a sports show, gives the technology community, which is getting larger and larger by the day, a voice, and we need that, so thank you. >> Well thank you, Robert. We appreciate that, it was great to have you on. Appreciate the time. >> Great to be here, thank you. >> All right, and thank you for watching. We'll be right back with our next guest as we wrap up day two of MIT CDOIQ. You're watching theCUBE. (futuristic music)
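An aside on the ethics point Abate raises in this interview: the five-person rule — withholding any query result that narrows down to fewer than five people — is a classic small-cell suppression control, a close cousin of k-anonymity. A minimal sketch of the idea; the threshold value and record layout are illustrative only:

```python
# Sketch of small-cell suppression: a count query is answered only when it
# covers at least MIN_GROUP_SIZE people, so a query can't single out an
# individual ("what is my neighbor buying?"). Records here are invented.
MIN_GROUP_SIZE = 5

def answer_count_query(records, predicate):
    """Return the count of matching records, or None (suppressed) when the
    result would identify fewer than MIN_GROUP_SIZE people."""
    matches = [r for r in records if predicate(r)]
    if len(matches) < MIN_GROUP_SIZE:
        return None  # suppressed: too few people to report safely
    return len(matches)

records = ([{"zip": "02139", "item": "beer"}] * 12
           + [{"zip": "02139", "item": "rare_item"}] * 2)
print(answer_count_query(records, lambda r: r["item"] == "beer"))       # → 12
print(answer_count_query(records, lambda r: r["item"] == "rare_item"))  # → None
```

A production system would also have to guard against differencing attacks (subtracting two broad counts to recover a small one), which is where techniques like differential privacy come in; this sketch shows only the basic threshold.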

Published Date : Aug 1 2019


Gokula Mishra | MIT CDOIQ 2019


 

>> From Cambridge, Massachusetts, it's theCUBE covering MIT Chief Data Officer and Information Quality Symposium 2019 brought to you by SiliconANGLE Media. (upbeat techno music) >> Hi everybody, welcome back to Cambridge, Massachusetts. You're watching theCUBE, the leader in tech coverage. We go out to the events. We extract the signal from the noise, and we're here at the MIT CDOIQ Conference, Chief Data Officer Information Quality Conference. It is the 13th year here at the Tang building. We've outgrown this building and have to move next year. It's fire marshal full. Gokula Mishra is here. He is the Senior Director of Global Data and Analytics and Supply Chain-- >> Formerly. Former, former Senior Director. >> Former! I'm sorry. It's former Senior Director of Global Data Analytics and Supply Chain at McDonald's. Oh, I didn't know that. I apologize my friend. Well, welcome back to theCUBE. We met when you were at Oracle doing data. So you've left that, you're on to your next big thing. >> Yes, thinking through it. >> Fantastic, now let's start with your career. You've had, so you just recently left McDonald's. I met you when you were at Oracle, so you cut over to the dark side for a while, and then before that, I mean, you've been a practitioner all your life, so take us through sort of your background. >> Yeah, I mean my beginning was really with a company called Tata Burroughs. Those days we did not have a lot of work getting done in India. We used to send people to U.S. so I was one of the pioneers of the whole industry, coming here and working on very interesting projects. But I was lucky to be working on mostly data analytics related work, joined a great company called CS Associates. I did my Master's at Northwestern. In fact, my thesis was intelligent databases. So, building AI into the databases and from there on I have been with Booz Allen, Oracle, HP, TransUnion, I also run my own company, and Sierra Atlantic, which is part of Hitachi, and McDonald's. 
>> Awesome, so let's talk about use of data. It's evolved dramatically as we know. One of the themes in this conference over the years has been sort of, I said yesterday, the Chief Data Officer role emerged from the ashes of sort of governance, kind of back office information quality compliance, and then ascended with the tailwind of the Big Data meme, and it's kind of come full circle. People are realizing actually to get value out of data, you have to have information quality. So those two worlds have collided together, and you've also seen the ascendancy of the Chief Digital Officer who has really taken a front and center role in some of the more strategic and revenue generating initiatives, and in some ways the Chief Data Officer has been a supporting role to that, providing the quality, providing the compliance, the governance, and the data modeling and analytics, and a component of it. First of all, is that a fair assessment? How do you see the way in which the use of data has evolved over the last 10 years? >> So to me, primarily, the use of data was, in my mind, mostly around financial reporting. So, anything that companies needed to run their company, any metrics they needed, any data they needed. So, if you look at all the reporting that used to happen it's primarily around metrics that are financials, whether it's around finances around operations, finances around marketing effort, finances around reporting if it's a public company reporting to the market. That's where the focus was, and so therefore a lot of the data that was not needed for financial reporting was what we call nowadays dark data. This is data we collect but don't do anything with it. 
Then, as the capability of the computing, and the storage, and new technologies, and new techniques evolve, and are able to handle more variety and more volume of data, then people quickly realize how much potential they have in the other data outside of the financial reporting data that they can utilize too. So, some of the pioneers leverage that and actually improved a lot in their efficiency of operations, came out with innovation. You know, GE comes to mind as one of the companies that actually leverage data early on, and number of other companies. Obviously, you look at today data has been, it's defining some of the multi-billion dollar company and all they have is data. >> Well, Facebook, Google, Amazon, Microsoft. >> Exactly. >> Apple, I mean Apple obviously makes stuff, but those other companies, they're data companies. I mean largely, and those five companies have the highest market value on the U.S. stock exchange. They've surpassed all the other big leaders, even Berkshire Hathaway. >> So now, what is happening is because the market changes, the forces that are changing the behavior of our consumers and customers, which I talked about which is everyone now is digitally engaging with each other. What that does is all the experiences now are being captured digitally, all the services are being captured digitally, all the products are creating a lot of digital exhaust of data and so now companies have to pay attention to engage with their customers and partners digitally. Therefore, they have to make sure that they're leveraging data and analytics in doing so. The other thing that has changed is the time to decision to the time to act on the data inside that you get is shrinking, and shrinking, and shrinking, so a lot more decision-making is now going real time. 
Therefore, you have a situation now, you have the capability, you have the technology, you have the data now, you have to make sure that you convert that in what I call programmatic kind of data decision-making. Obviously, there are people involved in more strategic decision-making. So, that's more manual, but at the operational level, it's going more programmatic decision-making. >> Okay, I want to talk, By the way, I've seen a stat, I don't know if you can confirm this, that 80% of the data that's out there today is dark data or it's data that's behind a firewall or not searchable, not open to Google's crawlers. So, there's a lot of value there-- >> So, I would say that percent is declining over time as companies have realized the value of data. So, more and more companies are removing the silos, bringing those dark data out. I think the key to that is companies being able to value their data, and as soon as they are able to value their data, they are able to leverage a lot of the data. I still believe there's a large percent still not used or accessed in companies. >> Well, and of course you talked a lot about data monetization. Doug Laney, who's an expert in that topic, we had Doug on a couple years ago when he, just after, he wrote Infonomics. He was on yesterday. He's got a very detailed prescription as to, he makes strong cases as to why data should be valued like an asset. I don't think anybody really disagrees with that, but then he gave kind of a how-to-do-it, which will, somewhat, make your eyes bleed, but it was really well thought out, as you know. But you talked a lot about data monetization, you talked about a number of ways in which data can contribute to monetization. Revenue, cost reduction, efficiency, risk, and innovation. Revenue and cost is obvious. I mean, that's where the starting point is. Efficiency is interesting. 
I look at efficiency as kind of a doing more with less but it's sort of a cost reduction, but explain why it's not in the cost bucket, it's different. >> So, it is first starts with doing what we do today cheaper, better, faster, and doing more comes after that because if you don't understand, and data is the way to understand how your current processes work, you will not take the first step. So, to take the first step is to understand how can I do this process faster, and then you focus on cheaper, and then you focus on better. Of course, faster is because of some of the market forces and customer behavior that's driving you to do that process faster. >> Okay, and then the other one was risk reduction. I think that makes a lot of sense here. Actually, let me go back. So, one of the key pieces of it, of efficiency is time to value. So, if you can compress the time, or accelerate the time and you get the value that means more cash in house faster, whether it's cost reduction or-- >> And the other aspect you look at is, can you automate more of the processes, and in that way it can be faster. >> And that hits the income statement as well because you're reducing headcount cost of your, maybe not reducing headcount cost, but you're getting more out of different, out ahead you're reallocating them to more strategic initiatives. Everybody says that but the reality is you hire less people because you just automated. And then, risk reduction, so the degree to which you can lower your expected loss. That's just instead thinking in insurance terms, that's tangible value so certainly to large corporations, but even midsize and small corporations. Innovation, I thought was a good one, but maybe you could use an example of, give us an example of how in your career you've seen data contribute to innovation. >> So, I'll give an example of oil and gas industry. If you look at speed of innovation in the oil and gas industry, they were all paper-based. 
I don't know how much you know about drilling. A lot of the assets that goes into figuring out where to drill, how to drill, and actually drilling and then taking the oil or gas out, and of course selling it to make money. All of those processes were paper based. So, if you can imagine trying to optimize a paper-based innovation, it's very hard. Not only that, it's very, very by itself because it's on paper, it's in someone's drawer or file. So, it's siloed by design and so one thing that the industry has gone through, they recognize that they have to optimize the processes to be better, to innovate, to find, for example, shale gas was a result output of digitizing the processes because otherwise you can't drill faster, cheaper, better to leverage the shale gas drilling that they did. So, the industry went through actually digitizing a lot of the paper assets. So, they went from not having data to knowingly creating the data that they can use to optimize the process and then in the process they're innovating new ways to drill the oil well cheaper, better, faster. >> In the early days of oil exploration in the U.S. go back to the Osage Indian tribe in northern Oklahoma, and they brilliantly, when they got shuttled around, they pushed him out of Kansas and they negotiated with the U.S. government that they maintain the mineral rights and so they became very, very wealthy. In fact, at one point they were the wealthiest per capita individuals in the entire world, and they used to hold auctions for various drilling rights. So, it was all gut feel, all the oil barons would train in, and they would have an auction, and it was, again, it was gut feel as to which areas were the best, and then of course they evolved, you remember it used to be you drill a little hole, no oil, drill a hole, no oil, drill a hole. >> You know how much that cost? >> Yeah, the expense is enormous right? >> It can vary from 10 to 20 million dollars. >> Just a giant expense. 
So, now today fast-forward to this century, and you're seeing much more sophisticated-- >> Yeah, I can give you another example in pharmaceuticals. They develop new drugs, it's a long process. So, one of the initial processes is to figure out what molecules to explore in the next step, and you could have a thousand different combinations of molecules that could treat a particular condition. Now, with digitization and data analytics, they're able to do this in a virtual world, kind of creating a virtual lab where they can test out thousands of molecules. And then, once they can bring it down to a few, the physical aspect of that starts. Think about innovation really shrinking their processes. >> All right, well I want to ask you about cloud. You asked in your keynote how many people out there think cloud is cheaper, or maybe you even said cheap, but I inferred cheaper than on-prem, and it was a loaded question, so nobody put their hand up, they were afraid, but I put my hand up because we don't have any IT. We used to have IT. It was a nightmare. So, for us it's better, but in your experience, I think I'm inferring correctly that you had meant cheaper than on-prem, and certainly we talk to many practitioners who have large systems, and when they lift and shift to the cloud, they don't change their operating model, they don't really change anything, they get a bill at the end of the month, and they go "What did this really do for us?" And I think that's what you mean-- >> So what I mean, let me make it clear, is that there are certain use cases where cloud is cheaper, and as you saw, people did raise their hands saying "Yeah, I have use cases where cloud is cheaper." I think you need to look at the whole thing. Cost is one aspect. The flexibility and agility of being able to do things is another aspect.
For example, if you have a situation where your stakeholders want to do something for three weeks, and they need five times the computing power, plus the data that they are buying from outside to do that experiment. Now, imagine doing that in a physical world. It's going to take a long time just to procure and get the physical boxes, and then you'll be able to do it. In the cloud, you can enable that, you can get GPUs depending on what problem you are trying to solve. That's another benefit. You can get a fit-for-purpose computing environment for that, so there's a lot of flexibility, agility, all of that. It's a new way of managing it, so people need to pay attention to the cost, because it will add to the cost. The other thing I will point out is that the public cloud is cheaper because they have hundreds and thousands of these canned configurations: this much computing power, this much memory, this much disk, this much connectivity, and they build thousands of them, and that's why it's cheaper. Well, if your need is something that's very unique and they don't have it, that's when it becomes a problem. You'd need more of those, and the cost will be higher. So, now we are getting to the IoT world. The volume of data is growing so much, and the type of processing that you need to do is becoming more real-time, and you can't just move all this bulk of data, and then bring it back, and move the data back and forth. You need a special type of computing at the edge, what Amazon calls edge computing. And the industry is kind of trying to design it. So, that is an example of hybrid computing evolving out of the cloud, or out of the necessity that you need a special-purpose computing environment to deal with new situations, and all of it can't be in the cloud. >> I mean, I would argue, well I guess Microsoft with Azure Stack was kind of the first, although not really.
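The three-week, five-times-the-compute experiment described above is essentially a rent-versus-buy calculation. A toy sketch, where the hourly rate and server price are entirely made-up assumptions:

```python
# Rent-vs-buy for a short burst workload. All prices are illustrative assumptions.
HOURS_PER_WEEK = 7 * 24

def cloud_burst_cost(instances: int, weeks: int, hourly_rate: float) -> float:
    """Pay-as-you-go cost of renting instances only while the experiment runs."""
    return instances * weeks * HOURS_PER_WEEK * hourly_rate

def on_prem_cost(servers: int, price_per_server: float) -> float:
    """Capital cost of buying the boxes outright (ignoring power, space, staff)."""
    return servers * price_per_server

# Five times the usual compute for a three-week experiment.
rent = cloud_burst_cost(instances=50, weeks=3, hourly_rate=1.50)
buy = on_prem_cost(servers=50, price_per_server=20_000)

print(f"cloud burst:  ${rent:,.0f}")  # gear goes away when the experiment ends
print(f"buy outright: ${buy:,.0f}")   # gear idles once the three weeks are over
```

The point is not that one number always beats the other; it is that a burst workload only pays for the hours it runs, while purchased hardware keeps costing money after the experiment ends, which is the flexibility-and-agility aspect Gokula separates from raw cost.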
Now, they're there, but I would say Oracle, your former company, was the first one to say "Okay, we're going to put the exact same infrastructure on prem as we have in the public cloud." Oracle, I would say, was the first to truly do that-- >> They were doing hybrid computing. >> You now see Amazon with Outposts has done the same, Google has a similar approach to Azure, and so it's clear that hybrid is here to stay, at least for some period of time. I think the cloud guys probably believe that ultimately it's all going to go to the cloud. We'll see; it's going to be a long, long time before that happens. Okay! I'll let you give your last thoughts on this conference. You've been here before? Or is this your first one? >> This is my first one. >> Okay, so your takeaways, your thoughts, things you might-- >> I am very impressed. I'm a practitioner, and I'm finding so many practitioners here from so many different backgrounds and industries. It's very, very enlightening to listen to their journey, their story, their learnings in terms of what works and what doesn't work. It is really invaluable. >> Yeah, I tell you this, it's always a highlight of our season, and Gokula, thank you very much for coming on theCUBE. It was great to see you. >> Thank you. >> You're welcome. All right, keep it right there everybody. We'll be back with our next guest. This is Dave Vellante; Paul Gillin is in the house. You're watching theCUBE from MIT. Be right back! (upbeat techno music)

Published Date : Aug 1 2019



Dave Russell, Veeam | VeeamON 2018


 

>> Narrator: Live from Chicago, Illinois. It's theCUBE, covering VeeamON 2018. Brought to you by Veeam. >> We're back in Chicago at VeeamON 2018, #VeeamON. My name is Dave Vellante with my cohost Stu Miniman. You're watching theCUBE, our exclusive live coverage of VeeamON. We go out to the events, we extract the signal from the noise. CUBE alum Dave Russell is here. He's the newly minted VP of enterprise strategy at Veeam. Dave, it's great to see you again, thanks for coming back on. >> Thanks for having me, guys, what a difference the year makes. >> Yeah, so newly minted. Last year we had you on as a Gartner analyst, we've followed your work for years. I've personally followed you for, actually many decades. Going back to your IBM days. So, let's start. How'd you end up at Veeam? >> Unexpectedly, very similar to the IBM-to-Gartner transition. Wasn't looking to make a change. Opportunity came literally out of the blue. So, this transition was also equally out of the blue. Some emails, phone calls started taking place over one weekend. Actually on a Sunday, so towards the end of the weekend. And, after a little bit of discussion of a couple of opportunities, and kind of looking at where I might be the best fit, and realizing that I really didn't think I was in a position to relocate. You know, me move the family, even though no one else said that I had to do that, I just felt like to do another position justice, you really have to be there. And in the situation with Veeam, I didn't think that was the case. I also thought I could jump in there, and they've got lots of other great people, I mean, Danny Allen is one of many examples. So I ultimately, from Sunday morning to, I guess, the following Saturday evening, some things were sort of in flight, and they landed where they did. >> So, California, of course as you know, doesn't have non-competes. People leave companies all the time. You were in a position at Gartner. You saw everything, from everybody.
You did the Magic Quadrants for years, and years, and years, you had visibility on companies' plans. And now you're here, many of the folks that were customers, or people that you were advising, are now competitors. How do you, as an analyst, and now a professional at Veeam, draw that line between what you can and cannot share? >> I don't want to make it sound too simple, but it's actually not that hard. And what I mean by that is, it's not hard, I think, for anyone that does an analyst job role, to understand how to compartmentalize. Right, you can't go and talk to a three-letter company, and then talk to a two-letter company, and mix the two conversations. And the same with transitioning jobs. When I came from IBM, after 15 plus years, to Gartner, there were a lot of things I knew, of course, a lot of things that I knew about even the backup space that I was focused on. I was the technical strategist for that product, development manager for that product. Even in adjacent areas like storage, meaning storage arrays, my colleagues, new colleagues at Gartner would say, "I wonder what the road map is for that." And I would say, well I know what the road map is for that-- >> Keep wondering. >> But you know, I'm not going to say anything. And no one asked me to, it was never that kind of situation. The same, I think, is true right now. No one's asked me, "So what do you know?" In fact, I probably over-rotated, in that I literally shredded everything that I had, and took pictures of me shredding documents that I had. I literally took every drive that I have and overwrote it, not just deleted it, but overwrote it with multiple patterns, and took pictures of that. Semi-ironically, I guess I'll just give this example, but kind of leave it at a high level. For a couple of days it looked like I was going to the West Coast. And so I shredded everything but their Magic Quadrant response, and then when I realized that wasn't the case I shredded that Magic Quadrant response.
So, I had to reach out to Veeam and said, hey you know the thing I just shredded? Your marketing plan that you gave me in person two weeks ago, the Magic Quadrant response that I had already printed off and highlighted and done notes on. Can you resend that to me again, I need to re-read that. (laughing) >> Okay, so now, you've clearly had a choice of places to go, you're sought after, you've had an impact on the road map and strategy of many, many of these companies. Why Veeam? >> Well. I don't know if Veeam is going to love me saying this, but I thought there were two great opportunities, and I'm not kidding, I only looked at two, there were a couple more, but I looked only seriously at two, and for very different reasons. The reason that I really liked Veeam was that their problem set, or what I thought I could offer, was really different. It wasn't, hey we need someone to really focus on strategy and to navigate through, going through a financial transaction or an IPO situation, and what happens after that. It was more operational. It was more, we already are in the enterprise, but we need to go big in the enterprise. We already have some strategy people, but we need enterprise strategy. So it was more of an augmentation play. And I thought that was really interesting. I thought where Veeam is in its life cycle was interesting, not that a younger startup isn't also equally as compelling, but when I looked at where I thought I could be of value, and ultimately what was right for the family, it was, I thought, the best decision. >> Dave, you've been covering backup for a long time, but would it be safe to say that it's one of the hottest times in this space that you've seen, and why is that? >> I'm a homer, so I'm going to say, I think I've been saying for 28 years there's never been a time like this in backup. But I actually think there's evidence to support that that's true. So let me give you a couple of cases, or examples. Case in point.
Every year I ask the question, are you more or less willing to switch backup vendors, is essentially the gist of it, and that was through my Gartner days. And there's kind of a scale, are you somewhat more willing to augment the solution, are you far more willing to augment the solution, all the way to, are you somewhat more willing to completely replace it, or far more willing to completely replace it. Long story short, the heat index, or I'm far more willing to completely replace the solution, is on the rise. And that kind of flies in the face of the myth that people don't switch backup solutions. The other thing that was interesting is, also drawing from my Gartner heritage, last December at a conference, did onstage polling, you could ask people questions, and one of them was: one year from now who do you think will be your strategic backup vendor? The top response is: we won't have a strategic backup vendor. That was 23% of the audience. 22% said it would be Veeam. And then you went down the list for organizations or vendors that have far more market share than Veeam. So, the fact that the majority of people say, basically out with everybody, and then the second highest response is: we're going to choose number four in market, based on market share. That's pretty, I don't want to say, can we say damning? Is that okay to say on here? Okay that's a pretty damning indictment of the state of the industry. >> So, I know you don't see the stuff, or maybe you do, some of it, but the stuff that the Wikibon research guys do. And they've just done some work, and I want to run it by you, and just sort of stink test it, if you will. Clearly we've been talking all day that data protection is moving up in the minds of CXOs. I mean, that's kind of well-known. But, they discovered a dichotomy between the business and IT with respect to the degrees of automation. In other words the business expects that there's far more automation than actually exists. 
And that's leading, in their conclusion, to what you were saying before is, a lot of opportunities for customer churn. It seems to be very churn-ripe environment. And the other piece that I'd love your comment on is, the Global 2000 generally, specifically, really, the Fortune 1000, is leaving billions of dollars on the table over, let's say, a three or four year period in either inadequate data protection or poorly architected data protection. So do some of those findings jive with your experience and your knowledge of the marketplace? >> Yeah they really do, because the last three years at Gartner, one of the fun things I got to do, it was a little more horizontal, was participate in CIO level research. And there was like 4:15 a.m. phone calls for me, but it was still fun to do, because there was, I think 3700 CIOs participated from around the world. So if you look at the big takeaways from there, the short story is, CIOs think that they are much further along on their journey than they actually are. I don't think it's because these men and women are blind, it's just they're thinking that we've been talking about this for so long, haven't we automated more? Aren't we more virtualized? Aren't we more into the cloud? And haven't we done more of our objectives that we set out to do? The sad reality is, the case is often no. And if you look at backup and recovery in particular, I totally agree with you, I mean for the amount of money that's being spent in this industry, our rate of return is not so great. It's not a spending problem, to your point, you're spending billions and billions of dollars, on software and then you're spending even more billions on hardware, and you're obviously spending human capital to go and manage this stuff, and professional services, what have you. So how come we can't restore the file? >> Right. And essentially many parts of that business are failing. So we can do better, is your point. >> I wanted to ask you about the value of data. 
One of your former colleagues at Gartner, Doug Laney, wrote a great book. I got an advance copy, Doug's been on theCUBE many times. Infonomics is the name of the book, really talking about a methodology to understand the value of data. Do you feel like organizations, especially in this digital world, have a good understanding of the value of their data, and if not, how does that affect their data protection decisions? >> I'll give you the short, not so great answer, which is no I don't think that they do. But to elaborate on that, I think someone or some people do. I don't think that's distributed around the whole enterprise so for example, if I'm the backup person, I think I know what I need to go and protect. You might be the Cassandra administrator, and you say, no this is the future of our business that I'm actually instantiating this new application right here. Meanwhile I'm not doing anything to protect that whatsoever. So if I'm operating under an independent view, that doesn't align with the business, then we're in trouble, and I, unfortunately, think that's too typically the case. That all parts of the business aren't interlocked. >> Yeah, back to your point about some of the transitions happening in the market. There's a number of players that are putting forth primarily appliances, even though they are software based, and Veeam is 100% pure software, how do you see that dynamic playing the market right now? >> Well I don't think there are any wrong answers, I know that sounds like a weasel cop-out, so let me double click on that, >> Stu: You're no longer an analyst you can't say, "It depends." >> There you go, yeah, there are 16 shades of gray actually. So the part that I think is very positive, on an appliance delivery model is that solves initial deployment challenges, that solves proof of concept challenges, that's a wonderful thing to be able to say, "Dave I want you to go take this box and just try it." 
And then you say, "You know what, I do like that." Great, you can actually keep the car you just test-drove. We can cut a PO for you right now. So there's actually a value in my mind for that hardware delivery model. Then you get other customers, that are on the other end of the spectrum, right? I don't want to spend more money on your server, that you're going to charge me for when I actually have more buying power, if I'm a large size organization, I can go to name your server company and buy it for cheaper than you can. And what I've found is, what I used to do, at Gartner, ask questions of, what is your purchasing intention around backup and recovery? It literally became kind of right down the middle. Some people were moving away from appliances towards software base, some people were doing the opposite. Others were kind of of open mind somewhere in the middle. So at net net, I think anyone, whether you're a startup like, so let's just name names, Rubrik and Cohesity, they're today primarily sales motion of a hardware appliance, but obviously they offer a virtual appliance as well. You take the other end of the spectrum, someone like Veeam, where Ratmir's made it very clear, we are not in the hardware business. And you look around and you see, there are a lot of hardware partners. So at the end of the day, whether you own it or enable it, I'm not convinced is 100% the point. I think it's just really offering the choice. But more importantly, what's the experience of that choice? People don't want to be integrators, so that favors appliances, you would think, but maybe people don't want to be integrators, and if they have a tightly coupled solution. Where they don't feel like they're assembling it, but they also don't have to just buy whatever Veeam says is going to be the controller this year, then maybe that's positive too. 
>> What's your point of view, and it may not be Veeam's sweet spot, but I wanted to get your thoughts on this, when you look at an Oracle environment, and you see how Oracle approaches data protection. Obviously there's RMAN in there, but it seems like the database and the application take more responsibility for recovery in particular, and it seems to work quite well, but it's expensive. And it's probably overkill for most applications. Do you see that as a trend, or is that a sort of an isolated tip of the pyramid? >> I would've said years ago that I thought it was a trend. Because the notion of either a hypervisor, or an application being more aware of recoverability, or availability, would make a lot of sense to me. Because they understand more about what's going on in that particular system. The reality is, and Oracle does a number of great things, RMAN is wonderful, ASM is wonderful, they have a couple of different appliances, but I'll just leave it at the fact that that's not the predominant Oracle protection mechanism today even for Fortune 500, means that there's some sort of feeling that maybe that's not answering all of the issues. >> Is that you feel like that's an opportunity for Veeam then? I infer from your response. >> I do, and honestly, to be fair, I think it's an opportunity for others besides Veeam, But absolutely I think it's an opportunity for Veeam, because Veeam is trying to go in and further penetrate that space. Oracle is forever going to be vitally important. I don't think we're ever going to see a day where SAP running on Oracle on on-prem server goes to zero. >> Right. >> Dave on the keynote stage this morning you said you want to be a builder again, what do we expect to see from you in the next coming year? >> Well I think the big thing is, I have had the luxury of being able to listen to and advise people, and that's a, I was going to say blessing, that sounds corny, but it's a privilege. 
But I miss the direct connect, it'd be great to be able to really go to product groups and say here's what I think we need to do in the next rev of the solution. Or here's my rationale from talking to, either Veeam customers or Veeam prospects, about why they're not choosing us for some workloads, maybe it's high end work, Oracle. And be able to effect change. I really was serious on stage when I said, I view this as my last stop. This is my third switch or second switch and third company so hopefully I'm here for 10 or 12 years, otherwise that's a little premature of a switch. >> Well, Dave, congratulations on the move, and the new role at Veeam. Your reputation is impeccable, and I really appreciate you coming on theCUBE. >> Always good to see you guys, thanks for having me. >> Alright, you're welcome. Alright keep it right there everybody, Stu and I will be back with our next guest. This is theCUBE, we're live from VeeamOn 2018 in Chicago. We'll be right back. (upbeat music)

Published Date : May 15 2018



Paul Barth, Podium Data | The Podium Data Marketplace


 

(light techno music) >> Narrator: From the SiliconANGLE Media office in Boston, Massachusetts, it's theCUBE. Now here's your host, Stu Miniman. >> Hi, I'm Stu Miniman and welcome to theCUBE conversation here in our Boston area studio. Happy to welcome back to the program, Paul Barth, who's the CEO of Podium Data, also a Boston area company. Paul, great to see you. >> Great to see you, Stu. >> Alright, so we last caught up with you, it was a fun event that we do at MIT talking about information, data quality, kind of understand why your company would be there. For our audience that doesn't know, just give us a quick summary, your background, what was kind of the why of Podium Data back when it was founded in 2014. >> Oh that's great Stu, thank you. I've spent most of my career in helping large companies with their data and analytic strategies, next generation architectures, new technologies, et cetera, and in doing this work, we kept stumbling across the complexity of adopting new technologies. And around the time that big data and Hadoop was getting popular and lots of hype in the marketplace, we realized that traditional large businesses couldn't manage data on this because the technology was so new and different. So we decided to form a software company that would automate a lot of the processing, manage a catalog of the data, and make it easy for nontechnical users to access their data. >> Yeah, that's great. You know when I think back to when we were trying to help people understand this whole big data wave, one of the pithy things we did, it was turning all this glut of data from a problem to an opportunity, how do we put this in to the users. But a lot of things kind of, we hit bumps in the road as an industry. Did studies it was more than 50 percent of these projects fail. You brought up a great point, tooling is tough, changing processes is really challenging. But that focus on data is core to our research, what we talk about all the time. 
But now it's automation and AIML, choose your favorite acronym of the day. This is going to solve all the ills that the big data wave didn't get right. Right, Paul? So maybe you can help us connect the dots a little bit, because I hear a lot connecting that foundation, that big data trend, to kind of the automation and AI thing. So you're maybe just a little ahead of your time. >> Well thanks, I saw an opportunity before there was anything in the marketplace that could help companies really corral their data, get some of the benefits of consolidation, some oversight and management through an automated catalog and the like. As AI has started to emerge as the next hype wave, what we're seeing consistently from our partners like DataRobot and others who have great AI technology is that they're starved for good information. You can't learn automatically, or even as a human, if you're given inconsistent information, data that's not conformed or ready or consistent, so that you can look at a lot of different events and start to build correlations. So we believe that we're still a central part of large companies building out their analytics infrastructure. >> Okay, help us kind of look at how your users and how you fit into this changing ecosystem. We all know things are just changing so fast. From 2014 to today, Cloud is so much bigger, the big waves of IoT keep coming. Everybody's got some kind of machine learning initiative. So what're the customers looking for, how do you fit in some of those different environments? >> I think when we formed the company we recognized that the cost-performance differential between the open-source data management platforms like Hadoop, and now Spark, and the traditional databases and data warehouses was so dramatically better that we could transform the business process of how you get data from raw to ready.
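What getting data "from raw to ready" might mean in practice can be sketched with a toy profiler: land a raw file, infer structure and quality metadata, and record it in a catalog so nontechnical users can find and trust the data. The field names and inference rules here are illustrative assumptions, not Podium's actual catalog schema:

```python
import csv
import io

def profile(raw_csv: str) -> dict:
    """Profile a landed file: infer column types and count nulls -- the kind of
    metadata an automated catalog records so raw data becomes 'ready' data."""
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    entry = {"row_count": len(rows), "columns": {}}
    for col in rows[0]:
        values = [r[col] for r in rows]
        non_null = [v for v in values if v != ""]
        numeric = all(v.replace(".", "", 1).lstrip("-").isdigit() for v in non_null)
        entry["columns"][col] = {
            "inferred_type": "numeric" if numeric else "text",
            "null_count": len(values) - len(non_null),
        }
    return entry

# A raw landed file with a missing value in one column.
raw = "customer,balance\nacme,120.50\nglobex,\ninitech,99"
print(profile(raw))
```

A real product does far more (lineage, security classification, statistical profiling at scale), but the shape is the same: automation produces the catalog entry so no IT project is needed per data set.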
And that's a consistent problem for large companies: they have data in legacy formats, on mainframes, in relational databases, in flat files, in the Cloud, behind the firewall, and these silos continue to grow. A consistent view of your business, your customers, your processes, your operations, is central to optimizing and automating the business today. So our business users are looking for a couple of things. One thing they are looking for is some manageability and a consistent view of their data no matter where it lives, and our catalog can create that automatically in days or weeks, depending on how big or how broadly we go. They're looking for that visibility, but they're also looking for productivity enhancements, which means they can start leveraging that data without a big IT project. And finally they're looking for agility, which means there's self-service, an ability to access data that you know is trusted and secured and safe for end users to use without having to call IT and have a programmer spin something up. So they're really looking for a totally new paradigm of data delivery. >> I tell you, that hits on so many things that we've been seeing, and a challenge that we've seen in the marketplace. In my world, talking about people, they had their data centers, and if I look at my data and I look at my applications, it's this heterogeneous nightmare. We call it hybrid or multi-cloud these days, and it shows the promise of making me faster and all this stuff. But as you said, my data is all over the place, my applications are getting spun up, and maybe I'm moving them and federating things and all that. But my data is one of the most critical components of my business. Maybe explain a little bit how that works. Where do the customers come in and say, oh my gosh, I've got a challenge, and Podium Data's helping, and the marketplace and all that?
>> Sure, first of all we targeted from the start large regulated businesses, financial services, pharmaceuticals, healthcare, and we've broadened since then. But these companies' data issues really came down to pressure from both ends. One was a compliance pressure. They needed to develop regulatory reports that could be audited and proven correct. If your data is in many silos and it's compiled manually using spreadsheets, that's not only incredibly expensive and nonreproducible, it's really not auditable. So a lot of these folks were pressured to prove that the data they were reporting was accurate. On the other side, it's the opportunity cost. Fintech companies are coming into their space offering loans and financial products, without any human interaction, without any branches. They knew that data was central to that. The only way you can make an offer to someone for a financial product is if you know enough about them that you understand the risk. So the use and leverage of data had reached critical mass. There was good money to invest in it, and they also saw that the old ways of doing this just weren't working. >> Paul, does your company help with the incoming GDPR challenges that are being faced? >> Sure, last year we introduced a PII detector and protection scheme. That may not sound like such a big deal, but in the Hadoop open-source world it is. At the end of the day this technology, while cheap and powerful, is incredibly immature. So when you land data, for example, into these open data platforms like S3 out in the Cloud, Podium takes the time to analyze that data and tell you what the structures of the data are, where you might have issues with sensitive data, and has the tooling like obfuscation and encryption to protect the data so you can create safe-to-use data. I'd say our customers right now, they started out behind the firewall. Again, these regulated businesses were very nervous about breaches.
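The detect-then-protect flow Paul describes, scan landed data for sensitive fields, then obfuscate them before publishing a "safe to use" copy, could be sketched roughly like this. The regex patterns, the salted-hash masking, and all the names here are illustrative assumptions, not Podium's actual implementation:

```python
import hashlib
import re

# Illustrative value-shape patterns for spotting likely-PII fields.
# (A real detector would also weigh field names and profiling statistics;
# these two patterns are just examples.)
PII_PATTERNS = {
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}

def detect_pii(field_name, sample_values, threshold=0.8):
    """Tag a field as PII if most sampled values match a known pattern."""
    for tag, pattern in PII_PATTERNS.items():
        hits = sum(1 for v in sample_values if pattern.match(v))
        if sample_values and hits / len(sample_values) >= threshold:
            return tag
    return None

def obfuscate(value, salt="demo-salt"):
    """One-way hash: the field stays joinable but is no longer readable."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]

records = [
    {"customer_email": "alice@example.com", "region": "NE"},
    {"customer_email": "bob@example.org", "region": "SW"},
]
samples = [r["customer_email"] for r in records]
if detect_pii("customer_email", samples):
    # Mask the flagged field before publishing a safe-to-use copy.
    for r in records:
        r["customer_email"] = obfuscate(r["customer_email"])
```

Hashing with a salt, rather than deleting the field, preserves the ability to join records on the masked value, which is one common reason teams prefer obfuscation over outright removal.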
They're looking and realizing they need to get to the Cloud, 'cause frankly not only is it a better platform for them from a cost basis and scalability, it's actually where the data comes from these days; their data suppliers are in the Cloud. So we're helping them catalog their data and identify the sensitive data and prepare data sets to move to the Cloud, and then migrate it to the Cloud and manage it there. >> Such a critical piece. I lived in the storage world for about a decade. There was a little acquisition that they made of a company called Pi, P-I. It was Paul Maritz, who a lot of people know; Paul had a great career at Microsoft and went on to run VMware for a number of years. But the vision you talk about reminds me of what I heard Paul Maritz talking about. Gosh, that was a decade ago. Information, so much sensitivity. Expand a little bit on the security aspect there. When I looked through your website, you're not a security company per se, but are there partnerships? How do you help customers with I want to leverage data but I need to be secure, all the GRC and security things; that's super challenging. >> In this space, to achieve agility and scale on a new technology, you have to be enterprise ready. So in version one of our product, we had security features that included field-level encryption and protection, but also integration with LDAP and Kerberos and other enterprise standard mechanisms and systems that would protect data. We can interoperate with Protegrity and other kinds of encryption and protection algorithms with our open architecture. But it's kind of table stakes to get your data in a secured, monitorable infrastructure if you're going to enable this agility and self-service. Otherwise you restrict the use of the new data technologies to sandboxes. The failures you hear about are not in the sandboxes, in the exploration; they're in getting those to production.
I had one of my customers talk about how before Podium they had 50 different projects on Hadoop, and all of them were in code red and none of them could go to production. >> Paul, you mentioned catalogs, give us the update. What's the newest from Podium Data? Help explain that a little bit more. >> So we believe that the catalog has to help operationalize the data delivery process. So one of the things we did from the very start was say let's use the analytical power of big data technologies, Spark, Hadoop, and others, to analyze the data on its way into the platform and build a metadata catalog out of that. So we have over 100 profiling statistics that we automatically calculate and maintain for every field of every file we ever load. It's not something you do as an afterthought or selectively. We knew from our experience that we needed to do that, data validation, and then bring in inferences such as this field looks like PII data and tag that in the metadata. That process of taking in data, and this even applies to legacy mainframe data coming in a VSAM format: it gets converted and landed to a usable format automatically. But the most important part is the catalog gets enriched with all this statistical profiling information, validation, all of the technical information, and we interoperate as well as have a GUI to help with business tagging, business definitions, and the like. >> Paul, just a little bit of a broader industry question. We talked about the value of data; I think everybody understands how important it is. How are we doing in understanding the value of that data though, is that a monetization thing? You've got academia in your background, there's debates, we've talked to some people at MIT about this. How do you look at data value as an industry in general, is there anything from Podium Data that you help people identify, are we leveraging it, are we doing the most, what are your thoughts around that?
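The per-field profiling Paul describes above, computing statistics for every field of every file on ingest, amounts to building summary metadata as data lands. A minimal sketch follows; Podium claims over 100 statistics per field, and the handful below are just representative stand-ins:

```python
from collections import Counter

def profile_field(values):
    """Compute a few representative profiling statistics for one field.
    (A production catalog like the one described computes far more.)"""
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "count": len(values),
        "null_ratio": 1 - len(non_null) / len(values) if values else 0.0,
        "distinct": len(counts),
        "max_len": max((len(str(v)) for v in non_null), default=0),
        "top_value": counts.most_common(1)[0][0] if counts else None,
    }

def build_catalog(table):
    """table: dict mapping field name -> list of values.
    Returns field name -> profile, i.e. the metadata catalog entry."""
    return {field: profile_field(vals) for field, vals in table.items()}

# Toy data set; field names and values are hypothetical.
catalog = build_catalog({
    "state": ["MA", "MA", "NY", None],
    "zip": ["02139", "02139", "10001", "10001"],
})
```

Because the statistics are computed on the way in rather than as an afterthought, every data set in the lake carries this profile from day one, which is what later enables inferences like "this field looks like PII."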
>> So I'd say for someone who's looking for a good framework to think about this, I'd recommend Doug Laney's book on Infonomics; we've collaborated for a while, and he's doing a great job there. But there's also just a blocking and tackling, which is what data is getting used, or a common one for our customers is where do I have data that's duplicate or comes from the same source but isn't exactly the same. That often causes reconciliation issues in finance, or in forecasting, in sales analysis. So what we've done with our data catalog, with all these profiling statistics, is start to build some analytics that identify similar data sets that don't have to be exactly the same, to say you may have a version of the data that you're trying to load here already available. Why don't you look at that data set and see if that one is preferred, and the data governance community really likes this. For one of our customers there were literally millions of dollars in savings from eliminating duplication, but the more important thing is the inconsistency, when people are using similar but not the same data sets. So we're seeing that as a real driver. >> I want to give you the final word. Just what are you seeing out in the industry these days, biggest opportunities, biggest challenges from users you're talking to? >> Well, what I'd say is when we started this it was very difficult for traditional businesses to use Hadoop in production, and they needed an army of programmers, and I think we solved that. Last year we started on our work to move to a post-Hadoop world, so the first thing we've done is open up our cataloging tools so we can catalog any data set in any source and allow the data to be brought into an analytical environment or production environment more on demand, rather than the idea that you're going to build a giant data lake with everything in it and replicate everything.
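The similar-but-not-identical data set detection Paul mentions earlier in this answer could, plausibly, work by comparing the catalog's profile signatures rather than raw data. A hedged sketch, assuming a simple distinct-value overlap score (the approach and all names here are illustrative, not Podium's actual analytics):

```python
def profile_signature(values):
    """Reduce a column to a small signature: distinct count and value set."""
    distinct = set(values)
    return {"distinct": len(distinct), "values": distinct}

def similarity(sig_a, sig_b):
    """Jaccard overlap of two columns' distinct values, in [0, 1]."""
    inter = len(sig_a["values"] & sig_b["values"])
    union = len(sig_a["values"] | sig_b["values"])
    return inter / union if union else 0.0

# Two hypothetical feeds derived from the same source, drifting apart.
finance_customers = ["acme", "globex", "initech", "umbrella"]
sales_customers = ["acme", "globex", "initech", "hooli"]

score = similarity(profile_signature(finance_customers),
                   profile_signature(sales_customers))
# A high score (here 3 shared of 5 total distinct values = 0.6) flags
# likely duplicates worth reconciling before they cause the kind of
# inconsistent reports described above.
```

The point of comparing signatures instead of full data sets is cost: the catalog already holds the profiles, so candidate duplicates can be surfaced without rescanning the lake.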
That's become really interesting, because you can build the catalog in a few weeks and then actually use the analysis and all the contents to drive the strategy. What do I prioritize, where do I put things? The other big initiative is, of course, Cloud. As I mentioned earlier, you have to protect and make Cloud-ready data behind your firewall, and then you have to know where it's used and how it's used externally. We automate a lot of that process and make that transition something that you can manage over time, and that is now going to be extended into multi cloud, multi lake types of technologies. >> Multi cloud, multi lake, alright. Well Paul Barth, I appreciate getting the update on everything happening with Podium Data. theCUBE has so many events this year, so be sure to check out thecube.net for all the upcoming events and all the existing interviews. I'm Stu Miniman, thanks for watching theCUBE. (light techno music)

Published Date : Apr 26 2018


Gene Leganza, Forrester Research | IBM CDO Strategy Summit 2017


 

>> Announcer: Live from Boston, Massachusetts, it's theCube, covering IBM Chief Data Officer's Summit, brought to you by IBM. (upbeat music) >> Welcome back to theCUBE's live coverage of the IBM CDO Strategy Summit here in Boston, Massachusetts. I'm your host, Rebecca Knight, along with my co-host, Dave Vellante. >> Hey, hey. We are joined by Gene Leganza, he is the vice president and research director at Forrester Research. Thanks so much for coming on theCUBE. >> Pleasure, thanks for having me. >> So, before the cameras were rolling, we were talking about this transformation, putting data at the front and center of an organization, and you were saying how technology is a piece of the puzzle, a very important piece of the puzzle, but so much of this transformation involves these cultural, social, organizational politics issues that can be just as big and as onerous as the technology, and maybe bigger. >> Bigger in the sense that they can be intractable without any clear path forward. I was just in a session, at a breakout session at the conference, and as I was saying before, we could have had the same discussion 15 or 20 years ago in terms of how do you get people on board for things like data governance, things that sound painful and onerous to business people, things that sound like IT should take care of them, that this is not something a business person should get involved in. But the whole notion of the value of data as an asset to drive an organization forward, to do things you couldn't do before, to be driven by insights and, if you're even more advanced, AI and cognitive sorts of things, really advancing your organization forward, data's obviously very critical. And the things that you can do should be getting business people excited, but they're still making the same complaints as 20 years ago, that this is something somebody should do for me.
So, clearly the message is not getting through to the whole organization that data is a new and fascinating thing that they should care about. There's a disconnect for a lot of organizations, I think. >> So, from your perspective, what is the push back? I mean, as you said, the fact that data is this asset should be getting the business guys' eyes lighting up. What do you see as sort of the biggest obstacle and stumbling block here? >> I think it's easy to characterize the people we talk about. I came from IT myself, so the business is always the guys that don't get it, and in this case, the people who are not on board are somehow out of it, they're really bad corporate citizens, they're just not on board in some way that characterizes them as missing something. But I think what no one who's in the position of trying to sell the value of data, data processes, and data capabilities ever acknowledges is the fact that these folks are all doing their best to do their job. I mean, nobody thinks about that, right? They just think they're intractable, they like doing things the way they've always done them, they don't like change, and they're going to resist everything I try to do. But the fact is, from their perspective, they know how to be successful, and they know when a change is going to introduce risk, and they don't want to go there. It's unjustifiable risk. So the missing link is that no one's made that light bulb go off, to say, there is actually a good reason to change the way you've done things, right? And it's like, maybe it's in your best interest to do things differently, and to care more about something that sounds like IT stuff, like data governance and data quality. So, that's why I think the chief data officer role, whether it's that title or chief analytics officer, or there's actually a chief artificial intelligence officer at the conference this time around, someone has to be the evangelist who can tell really meaningful stories.
I mean, you know, 20 years ago, when IT was trying to convince the business that they should care more about data, data architects and DBAs could talk till they were blue in the face about why data was important. No one wanted to hear it. People get turned off even faster now than they did before, because they have a shorter attention span than they used to. The fact is that somebody with a lot of credibility on the business side, people who really believe it's capable of driving the business forward, has to have a very meaningful message, not a half-hour rap on why data is good for you, but what, specifically, can change in your business that you should want to change. I mean, basically, if you can't put it in terms of what's in it for me, why should they listen to you, right? And so yeah, you know, we've got this thing goin' on, it's really important, and everybody's behind it, and I can give you a list of people whose job title begins with C who really think that this is a really important idea, but when you get right down to it, if it's not going to make their area of the business work better, or more efficiently, or, especially with, you know, top-line growth sorts of issues, they're not going to be that interested. And so it's the job of the person who's trying to evangelize these things to put it in those terms. And it might take some research, and it certainly would take some in-depth business knowledge about what happens in that area of the business; you can't just give an example from another industry or even another company. You've got to go around and find out what's broken, and talk about what can be fixed; you have to have some really good ideas about what can be innovative in very material terms.
One of the breakout sessions I had earlier today, well, they're all around how you define new data products and get innovative, and it was very interesting to hear some of the techniques from the folks who'd been successful there, down to, you know, it was somebody's job to go around, and when I say somebody, I don't mean a flunky, I mean a chief analytics officer sort of person, talking to people about, you know, what did they hate about their job. Finding, collecting all the things that are broken, and thinking about what could be my best path forward to fix something that's going to get a lot of attention, that I can actually build a marketing message about why everybody should care about this. And so, the missing link is really not seeing the value in changing behaviors. >> So one of the things that I've always respected about George Colony is he brings people into Forrester that care about social, cultural, organizational issues, not just technology. One of your counterparts, Doug Laney, just wrote a book called Infonomics. You might have seen it on Twitter, there's a little bit of noise going around it. The premise of the book is essentially that organizations shouldn't wait for the accounting industry to tell them how to value data. They should take it upon themselves, and he went into a lot of very detailed, you know, kind of mind-numbing calculations and ways to apply it. But there's a real cultural issue there. First of all, do you buy the premise, and what are you seeing in your client base in terms of the culture of data first, data value, and understanding data value? >> Really good question, really good question. And I do follow what Doug Laney does. Actually, Peter Burris, who you folks know, said a long time ago, when he was at Forrester, "You know what Doug Laney is doing? We better be doing that sort of thing." So he brought my attention to it a long time ago.
I'm really glad he's working on that area, and I've been in conversations with him at other conferences, where people get into those mind-numbing discussions about the details and how to measure the value of data and so on, and it's a really good thing that that is going on; those discussions have to happen. To link my answer to that to the second part of your question, about what am I seeing in our client base: I'm not seeing a technical answer about how to value data on the books, in a spreadsheet, in some accounting rules, being the differentiator. The missing link has not been that we haven't had the right rules in place to take X terabytes of data and turn it into X dollars of assets on the books. To me, the problem with that point of view is just that there is data that will bring you gold, and there's data that'll sit there, and it's valuable, but it's not really all that valuable. You know, it's a matter of what do you do with it. You know, I can have a hunk of wood on this table, and it's a hunk of wood, and how much is it, you know, what kind of wood is it and how much does it cost. If I make something out of it that's really valuable to somebody else, it'll cost something completely different based on what its function is, or its value as an art piece or whatever it might be. So, it's so much the product end of it. It's like, what do you do with it, and whether there's an asset value in terms of how it supports the business, in terms of you've got some regular reporting, but where all the interest is these days, and why there's a lot of interest in it, is like, okay, what are we missing about our business model that can be different, because now that everything's digitized, there are products people aren't thinking of.
There are, you know, things that we can sell that may be related to our business, or somehow it's not even related to our business; it's just that we now have this data, and it's unique to us, and there's something we can do with it. So the value is very much in terms of who would care about this, and what can I do with it to make it into an analytics product, or, you know, at the very least I've got valuable data. I think this is how people tend to think of monetizing data: I've got valuable data, maybe I can put it somewhere people will download it and pay me for it. It's more that I can take this, and then from there do something really interesting with it and create a product or a service, whether it's on an app, on a phone, or on a website, or it's something that you deliver in person, but it's giving somebody something they didn't have before. >> So what would you say, from your perspective, what are the companies that are being the most innovative at creating new data products, monetizing, creating new analytics products? What are they doing? What are the best practices of those companies from your perspective? >> You know, I think the best practice of those companies is that they've got people who are actively trying to answer the question of, what can I do with this that's new, and interesting, and innovative. I'd say, in the examples I've seen, there have been more small to medium companies doing interesting things than really, really huge companies. Or if they're huge companies, they're pockets of huge companies. It's kind of very hard to institutionalize at the enterprise level. It's when you have somebody who gets it about the value of data, working to understand the business at a detailed enough level to understand what might be valuable to somebody in that business if I have a product, that the magic can potentially happen.
And what I've heard people doing are things like hackathons, where in order to surface these ideas, you get a bunch of folks who get technology and data together with folks who get the business. And they play around with stuff, and they're matching the data to the business problem, comin' up with really kind of cool ideas. Those kinds of things tend to happen on a smaller scale. You don't have a hackathon, as far as I can tell, with a couple thousand people in a room. It's usually a smaller sort of operation, where people are digging this up. So, it's folks who kind of get it, because they've been working to find the value in analytics, and it's where there's pockets of people working together with the business to make it happen. The profile is such that it's organizations that tend to be more mature about data. They're not complaining that data is something IT should take care of for me. They were there 10 years ago, or five years ago even, and they've gotten to a point where they actually want to move forward from defense and do some offensive playing. They're looking for those kinds of cool things to do. So, they're more mature, certainly, than folks who aren't doing it. They're more agile and nimble, I think, than your typical organization, in the sense that they can build cross-disciplinary teams to make this happen, and that's really where the magic happens. You don't get a genius in the room to come up with this, you get this combination of technical skills, and data knowledge, and data engineering skills, and business smarts all in the same room, and that might be four or five different people brainstorming until they come up with this. And so the folks who recognize that problem, and make that happen, regardless of the industry, regardless of the size of the company, are where it's actually happening.
>> I know we have to go, but I wanted to ask you, what about the IBM scorecard, in terms of how they're doing in that regard? >> You know, I want to talk to them more. From what they said, you know, in a day you hear a lot of talk, and it's been a long day of hearing people talk about this. It sounds pretty amazing, you know, and actually, we had a half-hour session with Inderpal after his keynote. I'm going to get together with him more, and hear more about what's going on under the covers, 'cause it sounds like they're being very effective in making this happen at the enterprise level. And I think that's the unusual thing. I mean, IBM is a huge, huge place. So the notion that you can take these cool ideas and make them work in pockets is one thing. Trying to make it an enterprise-class, scalable, cognitive-driven organization, with all the right wheels in motion around the data, and analytics, and process, and business change, and operating model change, is kind of amazing. From what I've heard so far, they're actually making it happen. And if it's really, really true, it's really amazing. So it makes me want to hear more, certainly; I have no reason to doubt that what they're saying is happening is happening, I just would love to hear some more of the story. >> Yeah, you're making us all want to hear more. Well, thanks so much, Gene. It's been a pleasure-- >> Not a problem. >> having you on the show. >> A pleasure. >> Thanks. >> Thank you. >> I'm Rebecca Knight, for Dave Vellante, we will have more from the CDO Summit just after this. (upbeat music)

Published Date : Oct 24 2017
