Don DeLoach, Midwest IoT Council | PentahoWorld 2017


 

>> Announcer: Live, from Orlando, Florida, it's TheCUBE, covering PentahoWorld 2017. Brought to you by Hitachi Vantara. >> Welcome back to sunny Orlando everybody. This is TheCUBE, the leader in live tech coverage. My name is Dave Vellante and this is PentahoWorld, #PWorld17. Don DeLoach here, he's the co-chair of the Midwest IoT Council. Thanks so much for coming on TheCUBE. >> Good to be here. >> So you've just written a new book. I've got it right here, hot off the presses, in my hands: The Future of IoT: Leveraging the Shift to a Data-Centric World. Can you see that okay? Alright, great, how's that, you got that? Well congratulations on getting the book done. >> Thanks. >> It's like, the closest a male can come to having a baby, I guess. But, so, it's fantastic. Let's start with sort of the premise of the book. What, why'd you write it? >> Sure, I'll give you the short version, 'cause that in and of itself could go on forever. I'm a data guy by background. And for the last five or six years, I've really been passionate about IoT. And the two converged with a focus on data, but it was kind of ahead of where most people in IoT were, because they were mostly focused on sensor technology and communications, and to a limited extent, the workflow. So I kind of developed this thesis around where I thought the market was going to go. And I would have this conversation over and over and over, but it wasn't really sticking, and so I decided maybe I should write a book to talk about it. And it took me forever to write the book, 'cause fundamentally I didn't know what I was doing. Fortunately, I was able to eventually bring on a couple of co-authors, and collectively we were able to get the book written, and we published it in May of this year. >> And give us the premise, how would you summarize? 
>> So the central thesis of the book is that the market is going to shift from a focus on IoT enabled products like a smart refrigerator or a low-fat fryer or a turbine in a factory or a power plant or whatever. It's going to shift from the IoT enabled products to the IoT enabled enterprise. If you look at the Harvard Business Review article that Jim Heppelmann and Michael Porter did in 2014, they talked about the progression from products to smart products to smart, connected products, to product systems, to system of systems. We've largely been focused on smart, connected products, or as I would call IoT enabled products. And most of the technology vendors have focused their efforts on helping the lighting vendor or the refrigerator vendor or whatever IoT enable their product. But when that moves to mass adoption of IoT, if you're the CIO or the CEO of SeaLand or Disney or Walmart or whatever, you're not going to want to be a company that has 100,000 IoT enabled products. You're going to want to be an IoT enabled company. And the difference is really all around data primacy and how that data is treated. So, right now, most of the data goes from the IoT enabled product to the product provider. And they tell you what data you can get. But that, if you look at the progression, it's almost mathematically impossible that that is sustainable because company, organizations are going to want to take my, like let's just say we're talking about a fast food restaurant. They're going to want to take the data from the low-fat fryer and the data from the refrigerator or the shake machine or the lighting system or whatever, and they're going to want to look at it in the context of the other data. 
And they're going to also want to combine it with their point-of-sale or crew scheduling, or inventory and then if they're smart, they'll start to even pull in external data, like pedestrian traffic or street traffic or microweather or whatever, and they'll create a much richer signature. And then, it comes down to governance, where I want to create this enriched data set, and then propagate it to the right constituent in the right time in the right way. So you still give the product provider back the data that they want, and there's nothing that precludes you from doing that. And you give the low-fat fryer provider the data that they want, but you give your regional and corporate offices a different view of the same data, and you give the FDA or your supply chain partner, it's still the same atomic data, but what you're doing is you're separating the creation of the data from the consumption of the data, and that's where you gain maximum leverage, and that's really the thesis of the book. >> It's data, great summary by the way, so it's data in context, and the context of the low-fat fryer is going to be different than the workflow within that retail operation. >> Yeah, that's right and again, this is where, the product providers have initially kind of pushed back because they feel like they have stickiness and loyalty that's bred out of that link. But, first of all, that's going to change. So if you're Walmart or a major concern and you say, "I'm going to do a lighting RFP," and there's 10 vendors that say, "Hey, we want to compete for this," and six of 'em will allow Walmart to control the data, and four say, "No, we have to control the data," their list just went to six. They're just not going to put up with that. >> Dave: Period, the end, absolutely. >> That's right. So if the product providers are smart, they're going to get ahead of this and say, "Look, I get where the market's going. 
"We're going to need to give you control of the data, "but I'm going to ask for a contract that says "I'm going to get the data I'm already getting, "'cause I need to get that, and you want me to get that. "But number two, I'm going to recognize that "they can give, Walmart can give me my data back, "but enrich it and contextualize it "so I get better data back." So everybody can win, but it's all about the right architecture. >> Well, and the product guys are going to have the Trojan horse strategy of getting in when nobody was really looking. >> Don: That's right. >> And okay, so they've got there. Do you envision, Don, a point at which the Walmart might say, "No, that's our data "and you don't get it." >> Um, not really- >> or is there going to be a quid pro quo? >> and here's why. The argument that the product providers have made all along is, almost in a condescending way sometimes, although not intentionally condescending, it's been, look, we're selling you this low-fat fryer for your fast food restaurant. And you say you want the data, but you know, we had a team of people who are experts in this. Leave that to us, we'll analyze the data and we'll give you back what you need. Now, there's some truth to the fact that they should know their products better than anybody, and if I'm the fast food chain, I want them to get that data so that they can continually analyze and help me do my job better. They just don't have to get that data at my expense. There are ways to cooperatively work this, but again, it comes back to just the right architecture. So what we call the first receiver is in essence, setting up an abstraction close to the point of the ingestion of all this data. Upon which it's cleansed, enriched, and then propagated again to the right constituent in the right time in the right way. 
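The first-receiver flow described here — ingest once near the point of creation, cleanse and enrich, then propagate a governed view of the same atomic data to each constituent — might look roughly like the minimal Python sketch below. Every name, field, and view in it is an illustrative assumption, not something from the book:

```python
# Minimal sketch of the "first receiver" idea: ingest a reading once,
# enrich it once, then hand each constituent its own governed view of
# the SAME atomic data. All names and fields here are assumptions.

def enrich(reading):
    # Cleanse and contextualize the raw sensor reading.
    enriched = dict(reading)
    enriched["valid"] = reading["value"] is not None
    enriched["store_context"] = {"store_id": reading.get("store_id"),
                                 "shift": "evening"}
    return enriched

# Governance: each constituent gets a different projection of the data.
VIEWS = {
    # The product vendor still gets the device data it needs...
    "product_vendor": lambda r: {k: r[k] for k in ("device_id", "value", "timestamp")},
    # ...corporate gets a contextualized view of the same readings...
    "corporate":      lambda r: {k: r[k] for k in ("store_context", "value", "timestamp")},
    # ...and a regulator gets only what compliance requires.
    "regulator":      lambda r: {k: r[k] for k in ("timestamp", "valid")},
}

def first_receiver(reading):
    enriched = enrich(reading)
    # Propagate the right view to the right constituent.
    return {who: view(enriched) for who, view in VIEWS.items()}

out = first_receiver({"device_id": "fryer-7", "store_id": "s42",
                      "value": 182.5, "timestamp": "2017-10-27T10:00:00Z"})
```

The point of the sketch is the separation Don describes: creation happens once at ingestion, and consumption is a set of governed projections, so adding a new constituent never touches the producing device.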
And by the way, I would add, with the right security considerations, and with the right data privacy considerations, 'cause like, if you look around the market now, things like GDPR in Europe and what we've seen in the US just in the wake of the elections and everything around how data is treated, privacy concerns are going to be huge. So if you don't know how to treat the data in the context of how it needs to be leveraged, you're going to lose that leverage of the data. >> Well, plus the widget guys are going to say "Look, we have to do predictive maintenance "on those devices and you want us to do that." You know, they say follow the money. Let's follow the data. So, what's the data flow look like in your mind? You got these edge devices. >> Yep, physical or virtual. Doesn't have to be a physical edge. Although, in a lot of cases, there are good reasons why you'd want a physical edge, but there's nothing technologically that says you have to have a physical edge. >> Elaborate on that, would you? What do you mean by virtual? >> Sure, so let's say I have a server inside a retail outfit. And it's collecting all of my IoT data and consolidating it and persisting it into a data store and then propagating it to a variety of constituents. That would be creating the first receiver in the physical edge. There's nothing that says that that edge device can't grab that data, but then persist it in a distributed Amazon cloud instance, or a Rackspace instance or whatever. It doesn't actually need to be persisted physically on the edge, but there's no reason it can't either. >> Okay, now I understand that now. So the guys at Wikibon, which is a sort of sister company to TheCUBE, have envisioned this three tiered data model where you've got the devices at the edge where real-time activity's going on, real-time analytics, and then you've got this sort of aggregation point, I guess call it a gateway. 
And then you've got, and that's as I say, aggregation of all these edge devices. And then you've got the cloud where the heavy modeling is done. It could be your private cloud or your public cloud. So does that three tier model make sense to you? >> Yeah, so what you're describing as the first tier is actually the sensor layer. The gateway layer that you're describing, in the book would be characterized as the first receiver. It's basically an edge tier that is augmented to persist and enrich the data and then apply the proper governance to it. But what I would argue is, in reality, I mean, your reference architecture is spot-on. But if you actually take that one step further, it's actually an n-tier architecture. Because there's no reason why the data doesn't go from the ten franchise stores, to the regional headquarters, to the country headquarters, to the corporate headquarters, and every step along the way, including the edge, you're going to see certain types of analytics and computational work done. I'll put a plug for my friends at Hitachi Lumada in on this, you know, there's like 700 horizontal IoT platforms out there. There aren't going to be 700 winners. There's going to be probably eight to 10, and that's only because the different specific verticals will provide for more winners than it would be if it was just one like a search engine. But, the winners are going to have to have an extensible architecture that is, will ultimately allow enterprises to do the very things I'm talking about doing. And so there are a number out there, but one of the things, and Rob Tiffany, who's the CTO of Lumada, I think has a really good handle on his team on an architecture that is really plausible for accomplishing this as the market migrates into the future. >> And that architecture's got to be very flexible, not just elastic, but sometimes we use the word plastic, plasticity, being able to go in any direction. 
>> Well, sure, up to and including the use of digital twins and avatars and the logic that goes along with that and the ability to spin something up and spin something down gives you that flexibility that you as an enterprise, especially the larger the enterprise, the more important that becomes, need. >> How much of the data, Don, at that edge do you think will be persisted, two part question? It's not all going to be persisted, is it? Isn't that too expensive? Is it necessary to persist all of that data? >> Well, no. So this is where, you'll hear the notion of data exhaust. What that really means is, let's just say I'm instrumenting every room in this hotel and each room has six different sensors in it and I'm taking a reading once a second. The ratio of inconsequential to consequential data is probably going to be over 99 to one. So it doesn't really make sense to persist that data and it sure as hell doesn't make sense to take that data and push it into a cloud where I spend more to reduce the value of the payload. That's just dumb. But what will happen is that, there are two things, one, I think people will see the value in locally persisting the data that has value, the consequential data, and doing that in a way that's stored at least for some period of time so you can run the type of edge analytics that might benefit from having that persisted store. The other thing that I think will happen, and this is, I don't talk much, I talk a little bit about it in the book, but there's this whole notion where when we get to the volumes of data that we really talk about where IoT will go by like 2025, it's going to push the physical limitations of how we can accommodate that. 
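The edge filtering Don alludes to — persisting only the consequential one-in-a-hundred readings rather than shipping the whole exhaust stream to a cloud — is often done with something as simple as a deadband threshold. A hedged sketch, where the 0.5-unit deadband is a made-up assumption:

```python
# Edge-filter sketch: persist a reading only when it moves "enough" from
# the last persisted value (a deadband), so the ~99:1 inconsequential
# exhaust never leaves the edge. The 0.5 deadband is an assumption.
DEADBAND = 0.5

def filter_exhaust(readings, deadband=DEADBAND):
    persisted = []
    last = None
    for r in readings:
        # Keep the first reading, and any reading that jumps past the deadband.
        if last is None or abs(r - last) >= deadband:
            persisted.append(r)
            last = r
    return persisted

# A near-constant hotel-room sensor stream: only the step change survives.
stream = [20.0, 20.1, 20.0, 20.1, 25.0, 25.1, 25.0]
kept = filter_exhaust(stream)  # -> [20.0, 25.0]
```

Real deployments would add time-based heartbeats and per-sensor thresholds, but the economics are the same: spend storage only on data whose value exceeds its cost.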
So people will begin to use techniques like developing statistical metadata models that are a highly accurate metadata representation of the entirety of the data set, but probably in about one percent of the space that's queryable and suitable for machine learning where it's going to enable you to do what you just physically couldn't do before. So that's a little bit into the future, but there are people doing some fabulous work on that right now and that'll creep into the overall lexicon over time. >> Is that a lightweight digital twin that gives you substantially the same insight? >> It could augment the digital twin in ways that allow you to stand up digital twins where you might not be able to before. The thing that, the example that most people would know about are, like in the Apache ecosystem, there are toolsets like SnappyData that are basically doing approximation, but they're doing it via sampling. And that is a step in that direction, but what you're looking for is very high value approximation that doesn't lose the outlier. So like in IoT, one of the things you normally are looking for is where am I going to pick up on anomalous behavior? Well if I'm using a sample set, and I'm only taking 15%, I by definition am going to lose a lot of that anomalous behavior. So it has to be a holistic representation of the data, but what happens is that that data is transformed into statistics that can be queryable as if it was the atomic data set, but what you're getting is a very high value approximation in a fraction of the space and time and resources. >> Ok, but that's not sampling. >> No, it's statistical metadata. There are, there's a, my last company had developed a thing that we called approximate query, and it was based on that exact set of patents around the formation of a statistical metadata model. It just so happens it's absolutely suited for where IoT is going. It's kind of, IoT isn't really there yet. 
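The contrast Don draws — a 15% sample can simply drop the anomaly, while a summary built from every row keeps it in a fraction of the space — can be illustrated in a few lines. This is only a toy; it is not the patented statistical-metadata approach he references:

```python
import random

# Toy contrast between sampling and a full-pass statistical summary.
# NOT the approximate-query technology referenced in the interview,
# just an illustration of why a summary built from ALL rows retains
# outliers that a 15% sample can miss entirely.

readings = [20.0] * 997 + [85.0]  # one anomalous spike in ~1000 readings

# Sampling: a 15% sample contains the single spike only 15% of the time.
random.seed(7)
sample = random.sample(readings, int(len(readings) * 0.15))

# Statistical metadata: a tiny, queryable summary touching every row,
# so the extreme value always survives.
summary = {
    "count": len(readings),
    "mean": sum(readings) / len(readings),
    "min": min(readings),
    "max": max(readings),
}
```

Querying `summary["max"]` answers "did anything anomalous happen?" from a handful of numbers, whereas the same question asked of `sample` is a coin flip weighted against you.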
People are still trying to figure out the edge in its most basic forms, but the sheer weight of the data and the progression of the market is going to force people to be innovative in how they look at some of these things. Just like, if you look at things like privacy, right now, people think in terms of anonymization. And that's, basically, I'm going to de-link data contextually where I'm going to effectively lose the linkages to the context in order to conform with data privacy. But there are techniques, like if you look at GDPR, there are techniques, within certain safe harbors, that allow you to pseudonymize the data where you can actually relink it under certain conditions. And there are some smart people out there solving these problems. That's where the market's going to go, it's just going to get there over time. And what I would also add to this equation is, at the end of the day, right now, the concepts that are in the book about the first receiver and the create, the abstraction of the creation of the data from the consumption of the data, look, it's a pretty basic thing, but it's the type of shift that is going to be required for enterprises to truly leverage the data. The things about statistical metadata and pseudonymization, pseudonymization will come before the statistical metadata. But the market forces are going to drive more and more into those areas, but you got to walk before you run. Right now, most people still have silos, which is interesting, because when you think about the whole notion of the internet of things, it infers that it's this exploitation of understanding the state of physical assets in a very broad based environment. And yet, the funny thing is, most IoT devices are silos that emulate M2M, sort of peer to peer networks just using the internet as a communication vehicle. But that'll change. >> Right, and that's really again, back to the premise of the book. 
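The pseudonymization idea mentioned here — unlike anonymization, the linkage can be restored under controlled conditions — is commonly built on a keyed hash. A minimal sketch, where the secret key and record shape are assumptions for illustration:

```python
import hmac
import hashlib

# Sketch of pseudonymization: a keyed hash yields a stable pseudonym
# that only the key-holder can regenerate (and so re-link) under
# controlled conditions. Plain anonymization, by contrast, discards
# the linkage entirely. Key and identifiers here are assumptions.

SECRET_KEY = b"held-by-the-data-controller-only"

def pseudonymize(customer_id: str) -> str:
    digest = hmac.new(SECRET_KEY, customer_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]

# The same input always maps to the same pseudonym, so analytics can
# still join records across data sets; without the key, the mapping
# cannot be reversed or rebuilt.
p1 = pseudonymize("customer-123")
p2 = pseudonymize("customer-123")
assert p1 == p2
```

This is why pseudonymized data stays analytically useful (joins and longitudinal analysis still work) while the re-linking step remains gated behind whoever holds the key.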
We're going from these individual products, where all the data is locked into the product silo, to this digital fabric, that is an enterprise context, not a product context. >> That's right and if you go to the toolsets that Pentaho offers, the analytic toolsets. Let's just say, now that I've got this rich data set, assuming I'm following basic architectural principles so that I can leverage the maximum amount of data, that now gives me the ability to use these type of toolsets to do far better operational analytics to know what's going on, far better forensic analysis and investigative analytics to mine through the data and do root cause analysis, far better predictive analytics and prescriptive analytics to figure out what will go on, and ultimately feed the machine learning algorithms ultimately to get to in essence, the living organism, the adaptive systems that are continuously changing and adapting to circumstances. That's kind of the Holy Grail. >> You mentioned Hitachi Vantara before. I'm curious what your thoughts are on the Hitachi, you know, two years ago, we saw the acquisition, said, okay, now what? And you know, on paper it sounded good, and now it starts to come together, it starts to make more sense. You know, storage is going to the cloud. HDS says, alright, well we got this Hitachi relationship. But what do you make of that? How do you assess it, and where do you see it going? >> First of all, I actually think the moves that they've done are good. And I would not say that if I didn't think it. I'd just find a politically correct way not to say that. But I do think it's good. So they created the Hitachi Insight Group about a year and a half ago, and now that's been folded into Hitachi Vantara, alongside HDS and Pentaho and I think that it's a fairly logical set of elements coming together. I think they're going down the right path. 
In full disclosure, I worked for Hitachi Data Systems from '91 til '94, so it's not like I'm a recent employee of them, it's 25 years ago, but my experience with Hitachi corporate and the way they approach things has been unlike a lot of really super large companies, who may be super large, but may not be the best engineers, or may not always get everything done so well, Hitachi's a really formidable organization. And I think what they're doing with Pentaho and HDS and the Insight Group and specifically Lumada, is well thought out and I'm optimistic about where they're going. And by the way, they won't be the only winner in the equation. There's going to be eight or nine different key players, but they'll, I would not short them whatsoever. I have high hopes for them. >> The TAM is enormous. Normally, Hitachi eventually gets to where it wants to go. It's a very thoughtful company. I've been watching them for 30 years. But to a lot of people, the Pentaho and the Insight's play make a lot of sense, and then HDS, you used to work for HDS, lot of infrastructure still, lot of hardware, but a relationship with Hitachi Limited, that is quite strong, where do you see that fit, that third piece of the stool? >> So, this is where there's a few companies that have unique advantages, with Hitachi being one of them. Because if you think about IoT, IoT is the intersection of information technology and operational technology. So it's one thing to say, "I know how to build a database." or "I can build machine learning algorithms," or whatever. It's another thing to say, "I know how to build trains "or CAT scans or smart city lighting systems." And the domain expertise married with the technology delivers a set of capabilities that you can't match without that domain expertise. And, I mean, if you even just reduce it down to artificial intelligence and machine learning, you get an expert ML or AI guy, and they're only as good as the limits of their domain expertise. 
So that's why, and again, that's why I go back to the comparison to search engines, where there's going to be like, there's Google and maybe Yahoo. There's probably going to be more platform winners because the vertical expertise is going to be very, very important, but there's not going to be 700 of 'em. But Hitachi has an advantage that they bring to the table, 'cause they have very deep roots in energy, in medical equipment, in transportation. All of that will manifest itself in what they're doing in a big way, I think. >> Okay, so, but a lot of the things that you described, and help me understand this, are Hitachi Limited. Now of course, Hitachi Data Systems started as, National Advanced Systems was a distribution arm for Hitachi IT products. >> Don: Right, good for you, not many people remember. >> I'm old. So, like I said, I had a 30 year history with this company. Do you foresee that that, and by the way, interestingly, was often criticized back when you were working for HDS, it was like, it's still a distribution hub, but in the last decade, HDS has become much more of a contributor to the innovation and the product strategy and so forth. Having said that, it seems to me advantageous if some of those things you discussed, the trains, the medical equipment, can start flowing back through HDS. I'm not sure if that's explicitly the plan. I didn't necessarily hear that, but it sort of has to, right? >> Well, I'm not privy to those discussions, so it would be conjecture on my part. >> Let's opine, but right, doesn't that make sense? >> Don: It makes perfect sense. >> Because, I mean HDS for years was just this storage silo. And then storage became a very uninteresting business, and credit to Hitachi for pivoting. But it seems to me that they could really, and they probably have a, I had Brian Householder on earlier I wish I had explored this more with him. 
But it just seems, the question for them is, okay, how are you going to tap those really diverse businesses. I mean, it's a business like a GE or a Siemens. I mean, it's very broad based. >> Well, again, conjecture on my part, but one way I would do it would be to start using Lumada in the various operations, the domain-specific operations right now with Hitachi. Whether they plan to do that or not, I'm not sure of. I've heard that they probably will. >> That's a data play, obviously, right? >> Well it's a platform play. And it's enabling technology that should augment what's already going on in the various elements of Hitachi. Again, I'm, this is conjecture on my part. But you asked, let's just go with this. I would say that makes a lot of sense. I'd be surprised if they don't do that. And I think in the process of doing that, you start to crosspollinate that expertise that gives you a unique advantage. It goes back to if you have unique advantages, you can choose to exploit them or not. Very few companies have the set of unique advantages that somebody like Hitachi has in terms of their engineering and massive reach into so many, you know, Hitachi, GE, Siemens, these are companies that have big reach to the extent that they exploit them or not. One of the things about Hitachi that's different than almost anybody though is they have all this domain expertise, but they've been in the technology-specific business for a long time as well, making computers. And so, they actually already have the internal expertise to crosspollinate, but you know, whether they do it or not, time will tell. >> Well, but it's interesting to watch the big whales, the horses in the track, if you will. Certainly GE has made a lot of noise, like, okay, we're a software company. And now you're seeing, wow, that's not so easy, and then again, I'm sanguine about GE. I think eventually they'll get there. And then you see IBM's got their sort of IoT division. They're bringing in people. 
Another company with a lot of IT expertise. Not a lot of OT expertise. And then you see Hitachi, who's actually got both. Siemens I don't know as well, but presumably, they're more OT than IT and so you would think that if you had to evaluate the companies' positions, that Hitachi's in a unique position. Certainly have a lot of software. We'll see if they can leverage that in the data play, obviously Pentaho is a key piece of that. >> One would assume, yeah for sure. No, I mean, I again, I think, I'm very optimistic about their future. I think very highly of the people I know inside that I think are playing a role here. You know, it's not like there aren't people at GE that I think highly of, but listen, you know, San Ramon was something that was spun up recently. Hitachi's been doing this for years and years and years. You know, so different players have different capabilities, but Hitachi seems to have sort of a holistic set of capabilities that they can bring together and to date, I've been very impressed with how they've been going about it. And especially with the architecture that they're bringing to bear with Lumada. >> Okay, the book is The Future of IoT, leveraging the shift to a data-centric world. Don DeLoach, and you had a co-author here as well. >> I had two co-authors. One is Wael Elrifai from Pentaho, Hitachi Vantara and the other is Emil Berthelsen, a Gartner analyst who was with Machina Research and then Gartner acquired them and Emil has stayed on with them. Both of them great guys and we wouldn't have this book if it weren't for the three of us together. I never would have pulled this off on my own, so it's a collective work. >> Don DeLoach, great having you on TheCUBE. Thanks very much for coming on. Alright, keep it right there buddy. We'll be back. This is PentahoWorld 2017, and this is TheCUBE. Be right back.

Published Date : Oct 27 2017


Bill Schmarzo, Dell EMC | DataWorks Summit 2017


 

>> Voiceover: Live from San Jose in the heart of Silicon Valley, it's The Cube covering DataWorks Summit 2017. Brought to you by: Hortonworks. >> Hey, welcome back to The Cube. We are live on day one of the DataWorks Summit in the heart of Silicon Valley. I'm Lisa Martin with my co-host Peter Burris. Not only is this day one of the DataWorks Summit, this is the day after the Golden State Warriors won the NBA Championship. Please welcome our next guest, the CTO of Dell EMC, Bill Schmarzo. And Cube alumni, clearly sporting the pride. >> Did they win? I don't even remember. I just was-- >> Are we breaking news? (laughter) Bill, it's great to have you back on The Cube. >> The Division III All-American from-- >> Coe College. >> 1947? >> Oh, yeah, yeah, about then. They still had the peach baskets. You make a basket, you have to climb up this ladder and pull it out. >> They're going rogue on me. >> It really slowed the game down a lot. (laughter) >> All right so-- And before we started they were analyzing the game, it was actually really interesting. But, kick things off, Bill: as the volume and the variety and the velocity of data are changing, organizations know there's a tremendous amount of transformational value in this data. How is Dell EMC helping enterprises extract and maximize that as the economic value of data's changing? 
How do I, ya know-- All the things that are designed to drive value from a value perspective. Let's go back to, ya know, Tom Peters kind of thinking, right? I guess Michael Porter, right? His value creation processes. So, we find that when we have a conversation around the business and what the business is trying to accomplish that provides the framework around which to have this digital transformation conversation. >> So, well, Bill, it's interesting. The volume, velocity, variety; three V's, really say something about the value of the infrastructure. So, you have to have infrastructure in place where you can get more volume, it can move faster, and you can handle more variety. But, fundamentally, it is still a statement about the underlying value of the infrastructure and the tooling associated with the data. >> True, but one of the things that changes is not all data is of equal value. >> Peter: Absolutely. >> Right? So, what data, what technologies-- Do I need to have Spark? Well, I don't know, what are you trying to do, right? Do I need to have Kafka or Ioda, right? Do I need to have these things? Well, if I don't know what I'm trying to do, then I don't have a way to value the data and I don't have a way to figure out and prioritize my investment and infrastructure. >> But, that's what I want to come to. So, increasingly, what business executives, at least the ones who we're talking to all the time, are make me more money. >> Right. >> But, it really is, what is the value of my data? And, how do I start pricing data and how do I start thinking about investing so that today's data can be valuable tomorrow? Or the data that's not going to be valuable tomorrow, I can find some other way to not spend money on it, etc. >> Right. >> That's different from the variety, velocity, volume statement which is all about the infrastructure-- >> Amen. >> --and what an IT guy might be worried about. 
>> So, I've done a lot of work on data value, you've done a lot of work in data value. We've coincided a couple times. Let's pick that notion up of, ya know, digital transformation is all about what you do with your data. So, what are you seeing in your clients as they start thinking this through? >> Well, I think one of the first times it was sort of an "aha" moment to me was when I had a conversation with you about Adam Smith. The difference between value in exchange versus value in use. A lot of people when they think about monetization, how do I monetize my data, are thinking about value in exchange. What is my data worth to somebody else? Well, most people's data isn't worth anything to anybody else. And the way that you can really drive value is not data in exchange or value in exchange, but it's value in use. How am I using that data to make better decisions regarding customer acquisition and customer retention and predictive maintenance and quality of care and all the other oodles of decisions organizations are making? The valuation of that data comes from putting it into use to make better decisions. If I know then what decision I'm trying to make, now I have a process not only in deciding what data's most valuable but, you said earlier, what data is not important but may have liability issues with it, right? Do I keep a data set around that might be valuable but if it falls into the wrong hands through cyber security sort of things, do I actually open myself up to all kinds of liabilities? And so, organizations are rushing to this EVD (economic value of data) conversation, not only from a data valuation perspective but also from a risk perspective. 'Cause you've got to balance those two aspects. >> But, this is not a pure-- This is not really doing an accounting in a traditional accounting sense. We're not doing double-entry bookkeeping with data. What we're really talking about is understanding how your business uses its data.
Number one today, understand how you think you want your business to be able to use data to become a more digital corporation and understand how you go from point "a" to point "b". >> Correct, yes. And, in fact, the underlying premise behind driving economic value of data, you know people say data is the new oil. Well, that's a BS statement because it really misses the point. The point is, imagine if you had a barrel of oil; a single barrel of oil that can be used across an infinite number of vehicles and it never depleted. That's what data is, right? >> Explain that. You're right but explain it. >> So, what it means is that data-- You can use data across an endless number of use cases. If you go out and get-- >> Peter: At the same time. >> At the same time. You pay for it once, you put it in the data lake once, and then I can use it for customer acquisition and retention and upsell and cross-sell and fraud and all these other use cases, right? So, it never wears out. It never depletes. So, I can use it. And what organizations struggle with, if you look at data from an accounting perspective, accounting tends to value assets based on what you paid for it. >> Peter: And how you can apply them uniquely to a particular activity. A machine can be applied to this activity and it's either that activity or that activity. A building can be applied to that activity or that activity. A person's time to that activity or that activity. >> It has a transactional limitation. >> Peter: Exactly, it's an "or." >> Yeah, so what happens now is instead of looking at it from an accounting perspective, let's look at it from an economics and a data science perspective. That is, what can I do with the data? What can I do as far as using the data to predict what's likely to happen? To prescribe actions and to uncover new monetization opportunities. So, the entire approach of looking at it from an accounting perspective, we just completed that research at the University of San Francisco.
Where we looked at, how do you determine economic value of data? And we realized that using an accounting approach grossly undervalued the data's worth. So, instead of using an accounting approach, we started with an economics perspective. The multiplier effect, marginal propensity to consume, all that kind of stuff that we all forgot about once we got out of college really applies here because now I can use that same data over and over again. And if I apply data science to it to really try to predict, prescribe, and monetize; all of a sudden the economic value of your data just explodes. >> Precisely because you're connecting a source of data, which has a particular utilization, to another source of data that has a particular utilization, and you can combine them, create new utilizations that might in and of themselves be even more valuable than either of the original cases. >> They genetically mutate. >> That's exactly right. So, think about-- I think it's right. So, congratulations, we agree. Thank you very much. >> Which is rare. >> So, now let's talk about this notion of, as we move forward with data value, how does an organization have to start translating some of these new ways of thinking about the value of data into investments in data, so that you have the data where you want it, when you want it, and in the form that you need it. >> That's the heart of why you do this, right? If I know what the value of my data is, then I can make decisions regarding what data am I going to try to protect, enhance? What data am I going to get rid of and put on cold storage, for example? And so we came up with a methodology for how we tie the value of data back to use cases. Everything we do is use case based, so if you're trying to increase same-store sales at a Chipotle, one of my favorite places; if you're trying to increase it by 7.1 percent, that's worth about 191 million dollars.
And the use cases that support that, like increasing local event marketing or increasing new product introduction effectiveness, increasing customer cross-sell or upsell. If you start breaking those use cases down, you can start tying financial value to those use cases. And if I know what data sets, what three, five, seven data sets are required to help solve that problem, I now have a basis against which I can start attaching value to data. And as I look across at a number of use cases, now the value of the data starts to increment. It grows exponentially; not exponentially but it does increment, right? And it gets more and more-- >> It's non-linear, it's super linear. >> Yeah, and what's also interesting-- >> Increasing returns. >> From an ROI perspective, what you're going to find is that as you go down these use cases, the financial value of that use case may not be really high. But, when the denominator of your ROI calculation starts approaching zero because I'm reusing data at zero cost, I can reuse data at zero cost. When the denominator starts going to zero, ya know what happens to your ROI? It goes to infinity, it explodes. >> Last question, Bill. You mentioned The University of San Francisco and you've been there a while teaching business students how to embrace analytics. One of the things that was talked about this morning in the keynote was Hortonworks' dedication to the open-source community from the beginning. And they kind of talked about there, with kids in college these days, they have access to this open-source software that's free. I'd just love to get, kind of the last word, your take on what are you seeing in university life today where these business students are understanding more about analytics? Do you see them as kind of, helping to build the next generation of data scientists since that's really kind of the next leg of the digital transformation? >> So, the premise we have in our class is we probably can't turn business people into data scientists. In fact, we don't think that's valuable. What we want to do is teach them how to think like a data scientist. What happens, if we can get the business stakeholders to understand what's possible with data and analytics and then you couple them with a data scientist that knows how to do it, we see exponential impact. We just did a client project around customer attrition. The industry benchmark in customer attrition, it was published, I won't name the company, but they had a 24 percent identification rate. We had a 59 percent. We two X'd the number. Not because our data scientists are smarter or our tools are smarter, but because our approach was to leverage and teach the business people how to think like a data scientist, and they were able to identify variables and metrics they want to test. And when our data scientists tested them they said, "Oh my gosh, that's a very highly predictive variable." >> And trust what they said. >> And trust what they said, right. So, how do you build trust? On the data science side, you fail. You test, you fail, you test, you fail, you're never going to get to 100 percent accuracy. But have you failed enough times that you feel comfortable and confident that the model is good enough? >> Well, what a great spirit of innovation that you're helping to bring there. Your keynote, we should mention, is tomorrow. >> That's right. >> So, you can, if you're watching the livestream or you're in person, you can see Bill's keynote. Bill Schmarzo, CTO of Dell EMC, thank you for joining Peter and I. Great to have you on the show. A show where you can talk about the Warriors and Chipotle in one show. I've never seen it done, this is groundbreaking. Fantastic. >> Psycho donuts too. >> And psycho donuts and now I'm hungry. (laughter) Thank you for watching this segment. Again, we are live on day one of the DataWorks Summit in San Jose for Bill Schmarzo and Peter Burris, my co-host. I am Lisa Martin. Stick around, we will be right back. (music)
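Schmarzo's use-case-based valuation can be sketched in a few lines of code. This is a hypothetical illustration, not his actual methodology or figures: the dataset names and the way each use case's value is split are invented (only the roughly 191-million-dollar same-store-sales target echoes the conversation), and each use case's value is attributed evenly across the datasets it depends on.

```python
# Hedged sketch of use-case-based data valuation, as discussed above.
# Dataset names and value splits are hypothetical illustrations.

use_cases = {
    # use case: (financial value in $M, datasets it depends on)
    "local event marketing":        (40, {"pos_transactions", "loyalty", "weather"}),
    "new product introduction":     (60, {"pos_transactions", "social", "loyalty"}),
    "customer cross-sell / upsell": (91, {"pos_transactions", "loyalty", "clickstream"}),
}

# Attribute each use case's value evenly across its datasets, then sum
# per dataset: data reused in more use cases accretes more value.
data_value = {}
for value, datasets in use_cases.values():
    for ds in datasets:
        data_value[ds] = data_value.get(ds, 0) + value / len(datasets)

print(sorted(data_value.items(), key=lambda kv: -kv[1]))

# ROI side: the acquisition cost (denominator) is paid once, so each
# added use case grows the numerator while per-use-case cost falls.
acquisition_cost = 10.0  # $M, one-time, hypothetical
for n in (1, 2, 3):
    total_value = sum(v for v, _ in list(use_cases.values())[:n])
    print(n, round(total_value / acquisition_cost, 1))  # 191 / 10 = 19.1 at n=3
```

The second loop is the "denominator approaching zero" point from the conversation: the data is paid for once, so ROI grows super-linearly as the same datasets are reused across additional use cases.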

Published Date : Jun 13 2017


Show Wrap - Data Platforms 2017 - #DataPlatforms2017


 

>> Announcer: Live from the Wigwam in Phoenix, Arizona. It's theCUBE. Covering Data Platforms 2017. Brought to you by Qubole. >> Hey welcome back everybody. Jeff Frick here with theCUBE along with George Gilbert from Wikibon. We've had a tremendous day here at DataPlatforms 2017 at the historic Wigwam Resort, just outside of Phoenix, Arizona. George, you've been to a lot of big data shows. What's your impression? >> I thought we're at the, we're sort of at the edge of what could be a real bridge to something new, which is, we've built big data systems out of traditional software for deployment on traditional infrastructure. Even if you were going to put it in a virtual machine, it's still not a cloud. You're still dealing with server abstractions. But what's happening with Qubole is, they're saying, once you go to the cloud, whether it's Amazon, Azure, Google or Oracle, you're going to be dealing with services. Services are very different. It greatly simplifies the administrative experience, the developer experience, and more than that, they're focused on turning Qubole the product into Qubole the service, so that they can automate the management of it. And we know that big data has been choking itself on complexity. Both admin and developer complexity. And they're doing something unique, both on sort of the big data platform management, but also data science operations. And their point, their contention, which we still have to do a little more homework on, is that the vendors who started with software on-prem can't really make that change very easily without breaking what they've done on-prem. Cuz they have traditional perpetual-license physical software as opposed to services, which is what is in the cloud. >> The question is, are people going to wait for them to figure it out.
I talked to somebody in the hallway earlier this morning and we were talking about their move to put all their data into, it was S3, on their data lake. And he said, it's part of a much bigger transformational process that we're doing inside the company. And so, this move, from "is the public cloud viable" to "tell me, give me a reason why it shouldn't go to the cloud," has really kicked in big time. And we hear over and over and over that speed and agility, not just in deploying applications, but in operating as a company, is the key to success. And we hear over and over how many, how short the tenure is on the Fortune 500 now, compared to what it used to be. So if you're not speedy and agile, for which you pretty much have to use cloud, and software-driven automated decision-making >> Yeah. >> that's powered by machine learning, to eat >> Those two things. >> a huge percentage of your transaction and decision-making, you're going to get smoked by the person that is. >> Let's, let's sort of peel that back. I was talking to Monte Zweben who is the co-founder of Splice Machine, one of the most advanced databases that has sort of come out of nowhere over the last couple of years. And it's now, I think, in closed beta on Amazon. He showed me, like, a couple of screens for spinning it up and configuring it on Amazon. And he said, if I were doing that on-prem, he goes, I'd need a Hadoop cluster with HBase. It would take me like four-plus months. And that's an example of software versus services. >> Jeff: Right. >> And when you said, when you pointed out that automated decision-making, powered by machine learning, that's the other part, which is these big data systems ultimately are in the service of creating machine learning models that will inform ever better decisions with ever greater speed, and the key then is to plug those models into existing systems of record. >> Jeff: Right. Right.
>> Because we're not going to-- >> We're not going to rip those out and rebuild them from scratch. >> Right. But as you just heard, you can pull the data out that you need, run it through a new age application. >> George: Yeah. >> And then feed it back into the old system. >> George: Yes. >> The other thing that came up, it was Oskar, I have to look him up, Oskar Austegard from Gannett was on one of the panels. We always talk about the flexibility to add capacity very easily in a cloud-based solution. But he talked about, in the separation of storage and compute, that they actually have times where they turn off all their compute. It's off. Off. >> And that was-- If you had to boil down the fundamental compatibility break between on-prem and in the cloud, the Qubole folks, both the CEO and CMO, said, look, you cannot reconcile what's essentially server SAN, where the storage is attached to the compute node, the server, with cloud, where you have storage separate from compute, allowing you to spin it down completely. He said those are just fundamentally incompatible. >> Yeah, yeah. And also, Andretti, one of the founders, in his talk, he talked about the big three trends, which we just kind of talked about, he summarized them. First is serverless: this continual push towards smaller and smaller units >> George: Yeah. >> of storage and compute. And the increasing speed of networks is one, from virtual servers to just no servers, to just compute. The second one is automation, you've got to move to automation. >> George: Right. If you're not, you're going to get passed by your competitor that is. Or the competitor that you don't even know exists that's going to come out from over your shoulder. And the third one was the intelligence, right. There is a lot of intelligence that can be applied. And I think the other cusp that we're on is this continuing crazy increase in compute horsepower. Which just keeps going.
That the speed and the intelligence of these machines is growing at an exponential curve, not a linear curve. It's going to be bananas in the not too distant future. >> We're soaking up more and more of that intelligence with machine learning. The training part of machine learning, where the datasets to train a model are immense. Not only are the datasets large, but the amount of time to sort of chug through them to come up with the, just the right mix of variables and values for those variables. Or maybe even multiple models. So that we're going to see in the cloud. And that's going to chew up more and more cycles. Even as we have >> Jeff: Right. Right. >> specialized processors. >> Jeff: Right. But in the data ops world, in theory yes, but I don't have to wait to get it right. Right? I can get it 70% right. >> George: Yeah. >> Which is better than not right. >> George: Yeah. >> And I can continue to iterate over time. That, I think, was the genius of dev-ops. To stop writing PRDs and MRDs. >> George: Yeah. >> And deliver something. And then listen and adjust. >> George: Yeah. >> And within the data ops world, it's the same thing. Don't try to figure it all out. Take the data you know, have some hypothesis. Build some models and iterate. That's really tough to compete with. >> George: Yeah. >> Fast, fast, fast iteration. >> We're doing actually a fair amount of research on that. On the Wikibon side. Which is, if you build, if you build an enterprise application that has, that is reinforced or informed by models in many different parts, in other words, you're modeling more and more digital entities within the business. >> Jeff: Right. >> Each of those has feedback loops. >> Jeff: Right. Right. >> And when you get the whole thing orchestrated and moving or learning in concert, then you have essentially what Michael Porter many years ago called competitive advantage.
Which is when each business process reinforces all the other business processes in service of delivering a value proposition. And those models represent business processes, and when they're learning and orchestrated all together, you have a, what Trump called, a fine-tuned machine. >> I won't go there. >> Leaving out that it was bigly and it was a finely-tuned machine. >> Yeah, yeah. But at the end of the day, if you're using resources and effort to improve a different resource and effort, you're getting a multiplier effect. >> Yes. >> And that's really the key part. Final thought as we go out of here. Are you excited about this? Do you see, they showed the picture of the NASA headquarters with the big giant Snowball truck loading up? Do you see more and more of this big enterprise data going into S3, going into Google Cloud, going into Microsoft Azure? >> You're asking-- >> Is this the solution for the data lake swamp issue that we've been talking about? >> You're asking the 64 dollar question. Which is, companies, we sensed a year ago at the Hortonworks DataWorks Summit, it was in June, down in San Jose last year. That was where we first got the sense that people were sort of throwing in the towel on trying to build large-scale big data platforms on-prem. And what changes now is, are they now evaluating Hortonworks versus Cloudera versus MapR in the cloud, or are they widening their consideration as Qubole suggests. Because now they want to look not only at Cloud Native Hadoop, but they actually might want to look at Cloud Native Services that aren't necessarily related to Hadoop. >> Right. Right. And we know as-a-service wins. It continues. PaaS is a service. Software is a service. Time and time again, as-a-service either eats a lot of share from the incumbent or knocks the incumbent out. So, Hadoop as a service, regardless of your distro, via one of these types of companies on Amazon, it seems like it's got to win, right. It's going to win.
>> Yeah but the difference is, so far, so far, the Clouderas and the MapRs and the Hortonworks of the world are more software than service when they're in the cloud. They don't hide all the knobs. You still need a highly trained admin to get them up-- >> But not if you buy it as a service, in theory, right. It's going to be packaged up by somebody else and they'll have your knobs all set. >> They're not designed yet that way. >> HD Insight. >> Then, then, then, then-- They better be careful, cuz it might be a new as-a-service distro of the Hadoop system. >> My point, which is what this is. >> Okay, very good, we'll leave it at that. So George, thanks for spending the day with me. Good show as always. >> And I'll be in a better mood next time when you don't steal my candy bars. >> All right. He's George Gilbert. I'm Jeff Frick. You're watching theCUBE. We're at the historic, 99 years young, Wigwam Resort, just outside of Phoenix, Arizona. DataPlatforms 2017. Thanks for watching. It's been a busy season. It'll continue to be a busy season. So keep it tuned. SiliconAngle.TV or YouTube.com/SiliconAngle. Thanks for watching.
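The storage/compute separation discussed above, where storage persists while compute can be turned off entirely, can be made concrete with a toy cost model. All rates below are hypothetical placeholders, not any vendor's actual pricing; the point is only the shape: when storage is attached to the compute node, the cluster can never be powered off without losing the data, while decoupled object storage lets the compute term drop to zero, as in the Gannett example.

```python
# Toy monthly cost model: coupled vs. decoupled storage and compute.
# All hourly rates are hypothetical, chosen only to show the shape.

HOURS_PER_MONTH = 730

def coupled_cost(nodes, node_rate=1.50):
    # Storage lives on the compute nodes, so the cluster must run
    # around the clock to keep the data available: pay every hour.
    return nodes * node_rate * HOURS_PER_MONTH

def decoupled_cost(tb_stored, compute_hours, storage_rate=0.03, compute_rate=2.00):
    # Object storage persists on its own; compute is billed only while
    # a cluster exists, and compute_hours can legitimately be zero.
    return tb_stored * storage_rate * HOURS_PER_MONTH + compute_hours * compute_rate

always_on = coupled_cost(nodes=10)
bursty = decoupled_cost(tb_stored=50, compute_hours=120)  # ~4 hrs/day of jobs
print(always_on, bursty)
```

With `compute_hours=0` the decoupled bill reduces to storage alone, which is the "turn off all their compute, it's off, off" scenario from the panel.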

Published Date : May 26 2017


Tom Davenport, Babson College - #MITCDOIQ - #theCUBE


 

>> Announcer: In Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium. Now, here are your hosts, Stu Miniman and George Gilbert. >> You're watching theCUBE, SiliconANGLE Media's flagship program. We go out to lots of technology shows and symposiums like this one here to help extract the signal from the noise. I'm Stu Miniman, joined by George Gilbert from the Wikibon research team, and really thrilled to have on the program the keynote speaker from this MIT event, Tom Davenport, who's a professor at Babson and author of some books, including a new one that just came out. Thank you so much for joining us. >> My pleasure, great to be here. >> All right, so, you know, so many things in your morning keynote that I know George and I want to dig into. I guess I'll start with, you talked about the, you know, four eras of, you called it data today, used to be-- information, sorry. But you said you started with, when it was, three eras of analytics, and now you've come to information. So I'm just curious, we, you know, we get caught up sometimes on semantics, but is there a reason why you switched from, you know, analytics to information now? >> Well, I'm not sure it's a permanent switch, I just did it for this occasion. But, you know, I think that it's important for even people who aren't, who don't have as their job doing something with analytics, to realize that analytics are how we turn data into information. So kind of on a whim I changed it from four eras of analytics to four eras of information, to kind of broaden it out in a sense and make people realize that the whole world is changing, it's not just about analytics. >> Ya know, it resonated with me because, you know, in the tech industry so much we get caught up on the latest tool, George will be talking about how Hadoop is moving to Spark, and, you know, right, if we step back and look from a longitudinal view, you know, data is something that's been around for a long time, but as, as you said from Peter Drucker's quote, when we endow that with relevance and purpose, you know, that, that's when we get information. >> So, yeah, and that's why I got interested in analytics a year ago or so. It was because we weren't thinking enough about how we endowed data with relevance and purpose, turning it into knowledge. And knowledge management was one of those ways, and I did that for a long time, but the people who were doing stuff with analytics weren't really thinking about any of the human mechanisms for adding value to, to data, so that moved me in the analytics direction. >> Okay, so, so Tom, you've been at this event before, you know, you, you've taught and written, and, you know, written books about this, about this whole space, so-- >> I'm old. >> No, no, it's, you've got a great perspective. Okay, so bring us, what's exciting you these days, what are some of our big challenges and big opportunities that we're facing as, kind of, kind of, humanity and in an industry? >> Yeah, well, I think for me the most exciting thing is there are all these areas where there's just too much data and too much analysis for humans to, to do it anymore. You know, when I first started working with analytics, the idea was some human analyst would have a hypothesis about what's going on in the data, and you'd gather some data and test that hypothesis, and so on. It could take weeks if not months. And now, you know, we need to make decisions in milliseconds on way too much data for a human to absorb. Even in areas like health care, we have 400 different types of cancer, hundreds of genes that might be related to cancer, hundreds of drugs to administer. You know, we have these decisions that have to be made by technology now, and so it's very interesting to think about what's the remaining human role, how do we make sure those decisions are good, how do we review them and understand them. All sorts of fascinating new issues, I think. >> Along those lines, come, you know, at a primitive level in the big data realm, the tools are kind of still emerging, and we want to keep track of every time someone's touched it or transformed it. But when you talk about something as serious as cancer, and let's say we're modeling how we decide to, or how we get to, a diagnosis, do we need a similar mechanism, so that it's not either/or, either the doctor or, you know, some sort of machine, machine learning model or cognitive model, some way for the model to say here's how I arrived at that conclusion, and then for the doctor to say, you know, to the patient, here's my thinking along those lines? >> Yeah, I mean, I think one can-- like, or just like Watson, it was being used for a lot of these, I mean, Watson's being used for a lot of these oncology-oriented projects, and the good thing about Watson in that context is it does kind of presume a human asking a question in the first place, and then a human deciding whether to take the answer. The answers in most cases still have confidence intervals, you know, confidence levels, associated with them. So, and in health care it's great that we have this electronic medical record, where the physician's decision, or the clinician's decision, about how to treat that patient is recorded. In a lot of other areas of business, we don't really have that kind of system of record to say, you know, what, what decision did we make, and why did we make it, and so on. So in a way, I think health care, despite being very backward in a lot of areas, is kind of better off than, than a lot of areas of business. The other thing I often say about healthcare is, if they're treating you badly and you die, at least there will be a meeting about it in a healthcare institution. In business, you know, we screw up a decision, we push it under the rug, nobody ever, nobody ever considers it. >> What about-- 30 years ago, I think it was, with Porter's second book, you know, and the concept of the value chain, and sort of remaking the understanding of strategy. And you're talking about the, you know, the API economy and, and the data flows within that. Can you help tie your concept, you know, the data flows, the data value chain, and the APIs that connect them, with the Porter's value chain across companies? >> Well, it's an interesting idea. I think, you know, companies are just starting to realize that we are in this API economy, you don't have to do it all yourself. The smart ones have, without kind of modeling it in any systematic way like the Porter value chain, have said, you know, we, we need to have other people linking to our information through APIs. Google is fairly smart, I think, in saying we'll even allow that for free for a while, and if it looks like there's money to be made, we'll start charging for access to those APIs. So, you know, building the access and then thinking about the, the revenue from it is one of the new principles of this approach. But I haven't seen it; I think it would be a great idea for a paper to say, how do we translate the sort of value chain ideas of Michael Porter, which were, I don't know, 30 years ago, into something for the API-oriented world that we live in today? >> Which, do you think, would you think that might be appropriate for the sort of platform economics model of thinking that's emerging? >> That's an interesting question. I mean, the platform people are quite interested in inter-organizational connections. I don't hear them as talking as much about, you know, the new rules of the API economy; it's more about how do two-sided and multi-sided platforms work, and so on. Michael Porter was a sort of industrial economist, and a lot of those platform people are economists, so from that sense it's the same kind of overall thinking, but lots of opportunity there to exploit, I think. >> So, tell me, I want to bring it back to kind of the chief data officer, one of the main themes of the symposium here. I really liked, you talked about kind of, there needs to be a balance of offense and defense, because so much, at least in the last couple of years we've been covering this, you know, governance seems to be kind of a central piece of it, but it's such an exciting subject-- >> It's an exciting subject, but you
know you you put that purely in defense on and you know we get excited the companies that are you know building new products you know either you know saving or making more money with with data Kenny can you talk a little bit about kind of as you see how this chief data officer needs to be how that fits into your kind of four arrows yeah yeah well I don't know if I mentioned it in my talk but I went back and confirmed my suspicions that the sama Phi odd was the world's first chief data officer at Yahoo and I looked at what Osama did at Yahoo and it was very much data product and offense or unity established yahoo research labs you know not everything worked out well at Yahoo in retrospect but I think they were going in the direction of what interesting data products can can we create and so I think we saw a lot of kind of what I call to point o companies in the in the big data area in Silicon Valley sing it's not just about internal decisions from data it's what can we provide to customers in terms of data not just access but things that really provide value that means data plus analytics so you know linkedin they attribute about half of their membership to the people you may know data product and everybody else as a people you may know now well we these companies haven't been that systematic about how you build them and how do you know which one to actually take the market and so on but I think now more and more companies even big industrial companies are realizing that this is a distinct possibility and we oughta we ought to look externally with our data for opportunities as much as supporting internal and I guess for you talk to you know companies like Yahoo some of the big web companies the whole you know Big Data meme has been about allowing you know tools and processes to get to a broader you know piece of the economy you know the counterbalance that a little bit you know large public clouds and services you know how much can you know a broad spectrum of 
companies out there you know get the skill set and really take advantage of these tools versus you know or is it going to be something that I'm going to still going to need to go to some outside chores for some of this well you know I think it's all being democratized fairly rapidly and I read yesterday the first time the quote nobody ever got fired for choosing amazon web services that's a lot cheaper than the previous company in that role which was IBM where you had to build up all these internal capabilities so I think the human side is being democratized they're over a hundred company over 100 universities in the US alone that have analytics oriented degree programs so i think there's plenty of opportunity for existing companies to do this it's just a matter of awareness on the part of the management team I think that's what's lacking in most cases they're not watching your shows i guess and i along the lines of the you know going back 30 years we had a preference actually a precedent where the pc software sort of just exploded onto the scene and it was i want control over my information not just spreadsheets you know creating my documents but then at the same time aighty did not have those guardrails to you know help help people from falling off you know their bikes and getting injured what are the what tools and technologies do we have for both audiences today so that we don't repeat that mistake ya know it's a very interesting question and I think you know spreadsheets were great you know the ultimate democratization tool but depending on which study you believe 22 eighty percent of them had errors in them and there was some pretty bad decisions that were made sometimes with them so we now have the tools so that we could tell people you know that spreadsheet is not going to calculate the right value or you should not be using a pie chart for that visual display I think vendors need to start building in those guardrails as you put it to say here's how you use 
this product effectively in addition to just accomplishing your basic task but you wouldn't see those guardrails extending all the way back because of data that's being provisioned for the users well I think ultimately if we got to the point of having better control over our data to saying you should not be using that data element it's not you know the right one for representing you know customer address or something along those lines we're not there yet and the vast majority of companies I've seen a few that have kind of experimented with data watermarks or something to say yes this is the one that you're allowed to to use has been certified as the right one for that purpose but we need to do a lot more in that regard yeah all right so Tommy you've got a new book that came out earlier this year only humans need apply winners and losers in the age of smart machines so ask you the same question we asked eric donaldson and Auntie McAfee when they wrote the second Machine Age you know are we all out of job soon well I think big day and I have become a little more optimistic as we look in some depth at at the data I mean one there are a lot of jobs evolving working with these technologies and you know it's just somebody was telling me the other day that is that I was doing a radio interview from my book and the guy was hung who said you know I've made a big transition into podcasting he said but the vast majority of people in radio have not been able to make that transition so if you're willing to kind of go with the flow learn about new technologies how they work I think there are plenty of opportunities the other thing to think about is that these transitions tend to be rather slow I mean we had about in the United States in 1980 about half a million bank tellers since then we've had ATMs online banking etc give so many bank tellers we have in 2016 about half a million it's rather shocking i think i don't know exactly what they're all doing but we're pretty slow in 
making these transitions so i think those of us sitting here today or even watching her probably okay we'll see some job loss on the margins but anybody who's willing to keep up with new technologies and add value to the smart machines that come into the workplace i think is likely to be okay okay do you have any advice for people that either are looking at becoming you know chief data officers well yeah as I as you said balanced offense and defense defense is a very tricky area to inhabit as a CDO because you if you succeed and you prevent you know breaches and privacy problems and security issues and so on nobody gives you necessarily any credit for it or even knows that it's helps of your work that you were successful and if you fail it's obviously very visible and bad for your career too so I think you need to supplement defense with offense activities are analytics adding valued information digitization data products etc and then I think it's very important that you make nice with all the other data oriented c-level executives you know you may not want to report to the CIO or if you have a cheap analytics officer or chief information security officer chief digitization officer chief digital officer you gotta present a united front to your organization and figure out what's the division of labor who's going to do what in too many of these organizations some of these people aren't even talking to each other and it's crazy really and very confusing to the to the rest of the organization about who's doing what yeah do you see the CDO role but you know five years from now being a standalone you know peace in the organization and you know any guidance on where that should sit is structurally compared to say the CIO yeah I don't you know I I've said that ideally you'd have a CIO or somebody who all of these things reported to who could kind of represent all these different interests of the rest of the organization that doesn't mean that a CDO shouldn't engage with 
the rest of the business I think CIO should be very engaged with the rest of the business but i think this uncontrolled proliferation has not been a good thing it does mean that information and data are really important to organization so we need multiple people to address it but they need to be coordinated somehow in a smart CEO would say you guys get your act together and figure out sort of who does what tell me a structure I think multiple different things can work you can have it inside of IT outside of IT but you can at least be collaborating okay last question I've got is you talked about these errors and you know that they're not you know not one dies in the next one comes and you talked about you know we know how slow you know people especially are to change so what happened to the company that are still sitting in the 10 or 20 era as we see more 30 and 40 companies come yeah well it's not a good place to be in general and I think what we've seen is this in many industries the sophisticated companies with regard to IT are the ones that get more and more market share the the late adopters end up ultimately going out of business I mean you think about in retail who's still around Walmart was the most aggressive company in terms of Technology Walmart is the world's largest company in moving packages around the world FedEx was initially very aggressive with IT UPS said we better get busy and they did it to not too much left of anybody else sending packages around the world so I think in every industry ultimately the ones that embrace these ideas tend to be the ones who who prosper all right well Tom Davenport really appreciate this morning's keynote and sharing with our audience everything that's happening in the space will be back with lots more coverage here from the MIT CDO IQ symposium you're watching the q hi this is christopher

Published Date : Jul 14 2016

**Summary and sentiment analysis are not shown because of an improper transcript**
