Bill Schmarzo, Hitachi Vantara | CUBE Conversation, August 2020

>> Announcer: From theCUBE studios in Palo Alto, in Boston, connecting with thought leaders all around the world. This is a CUBE conversation. >> Hey, welcome back everybody, Jeff Frick here with theCUBE. We are still getting through the year of 2020. It's still the year of COVID and there's no end in sight, I think, until we get to a vaccine. That said, we're really excited to have one of our favorite guests. We haven't had him on for a while. I haven't talked to him for a long time. I think he used to have the record for the most CUBE appearances of probably any CUBE alumni. We're excited to have him joining us from his house in Palo Alto. Bill Schmarzo, you know him as the Dean of Big Data, but he's got more titles. He's the chief innovation officer at Hitachi Vantara. We used to call him the Dean of Big Data kind of for fun, but then Bill went out and wrote a bunch of books. And now he teaches at the University of San Francisco, School of Management as an executive fellow. He's an honorary professor at NUI Galway. I think he just likes to go to that side of the pond. And he's a many-time author now, go check him out. His author profile on Amazon has the "Big Data MBA," "The Art of Thinking Like A Data Scientist" and another Big Data workbook. Bill, great to see you. >> Thanks, Jeff, you know, I miss my time on theCUBE. These conversations have always been great. We've always kind of poked around the edges of things. A lot of our conversations have always been, I thought, very leading edge, and the title Dean of Big Data is courtesy of theCUBE. You guys were the first ones to give me that name, at one of the very first Strata Conferences, where you dubbed me the Dean of Big Data because I taught a class there called the Big Data MBA. And look what's happened since then. >> I love it. >> It's all on you guys. >> I love it, and we've outlasted Strata, Strata doesn't exist as a conference anymore.
So, you know, part of that I think is because Big Data is now everywhere, right? It's not a standalone thing. But there's a topic, and I'm holding in my hands a paper that you worked on with a colleague, Dr. Sidaoui, talking about what is the value of data? What is the economic value of data? And this is a topic that's been thrown around quite a bit. I think you list a total of 28 reference sources in this document, so it's a well-researched piece of material, but it's a really challenging problem. So before we kind of get into the details, you know, from your position, having done this for a long time, and I don't know what you're doing today, but you used to travel every single week to go out and visit customers and actually do implementations and really help people think these through. When you think about the value, the economic value, how did you start to frame that to make sense and make it a manageable problem to attack? >> So, Jeff, the research project was eye-opening for me. One of the advantages of being a professor is you have access to all these very smart, very motivated, very free research sources. And one of the problems that I've wrestled with as long as I've been in this industry is, how do you figure out what data is worth? So what I did is I took these research students and I stuck them on this problem. I said, "I want you to do some research. Let me understand, what is the value of data?" I've seen all these different papers and analysts and consulting firms talk about it, but nobody's really got this thing licked. And so we launched this research project at USF, professor Mouwafac Sidaoui and I together, and we were bumping along the same old path that everyone else had taken, which hinged on, how do we get data on our balance sheet? That was always the motivation, because as a company we're worth so much more because our data is so valuable, and how do I get it on the balance sheet?
So we're headed down that path, trying to figure out how do you get it on the balance sheet? And then one of my research students comes up to me and she says, "Professor Schmarzo," she goes, "Data is kind of an unusual asset." I said, "Well, what do you mean?" She goes, "Well, think about data as an asset. It never depletes, it never wears out. And the same dataset can be used across an unlimited number of use cases at a marginal cost equal to zero." And when she said that, it's like, "Holy crap." The light bulb went off. It's like, "Wait a second. I've been thinking about this entirely wrong for the last 30-some years of my life in this space. I've had the wrong frame. I kept thinking about this as an accounting conversation. And accounting determines valuation based on what somebody is willing to pay for it." So if you go back to Adam Smith, 1776, "Wealth of Nations," he talks about valuation techniques. And one of the valuation techniques he talks about is valuation in exchange. That is, the value of an asset is what someone's willing to pay you for it. So the value of this bottle of water is what someone's willing to pay you for it. Everybody fixates on this asset valuation-in-exchange methodology. That's how you put it on the balance sheet. That's how you run depreciation schedules; that dictates everything. But Adam Smith also talked about in that book another valuation methodology, which is valuation in use, which is an economics conversation, not an accounting conversation. And when I realized that, my frame was wrong. Yeah, I had the right book. I had Adam Smith, I had "Wealth of Nations." I had all that good stuff, but I hadn't read the whole book. I had missed this whole concept about economic value, where value is determined not by how much someone's willing to pay you for it, but by the value you can drive by using it.
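Bill's two lenses can be sketched in a few lines of Python. The numbers below are invented purely for illustration, not taken from the paper: value in exchange is a one-time sale price, while value in use accumulates across every use case the same dataset fuels at near-zero marginal cost.

```python
# Hypothetical figures for illustration only.
ACQUISITION_COST = 250_000   # one-time cost to capture and curate the dataset
SALE_PRICE = 1_000_000       # what a buyer might pay for it (value in exchange)
use_case_values = [400_000, 350_000, 500_000, 275_000]  # value driven per internal use case

def value_in_exchange(sale_price, cost):
    """Adam Smith's first lens: value is what someone is willing to pay you for the asset."""
    return sale_price - cost

def value_in_use(case_values, cost):
    """The economics lens: the same dataset fuels every use case at zero marginal
    cost, so its value is the sum of the value it drives, minus the one-time cost."""
    return sum(case_values) - cost

print(value_in_exchange(SALE_PRICE, ACQUISITION_COST))  # 750000
print(value_in_use(use_case_values, ACQUISITION_COST))  # 1275000
```

With these made-up figures, using the data is worth more than selling it, and the gap only widens as use cases are added, since each reuse is nearly free.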
So, Jeff, when that person made that comment, the entire research project, and I've got to tell you, my entire life, did a total 180, right? A total 180-degree change of how I was thinking about data as an asset. >> Right, well, Bill, it's funny though, I always think of finance versus accounting, right? And you're right on accounting. And we learn a lot of things in accounting; basically we learn that there's more we don't know. But it's really hard to put data in an accounting framework, because as you said, it's not like a regular asset. You can use it a lot of times, you can use it across lots of use cases, it doesn't degrade over time. In fact, it used to be a liability, 'cause you had to buy all this hardware and software to maintain it. But if you look at the finance side, if you look at the pure-play internet companies like Google, like Facebook, like Amazon, and you look at their valuation, right? We used to have this thing, we still have this thing, called goodwill, which was kind of this gap between what the market established the value of the company to be and what was reflected when you summed up all the assets on the balance sheet. You had this leftover thing, and you could just plug in goodwill. And I would hypothesize that for these big giant tech companies, the market has baked in the value of the data, has kind of put a present value on it for a long period of time over multiple projects. And we see it captured probably in goodwill, versus being called out as an individual balance sheet item. >> So I don't know accounting. I'm not an accountant, thank God, right? And I know that goodwill, if I remember from my MBA program, is something that when you buy a company and you look at the value you paid versus what it was worth, the difference gets stuck into this category called goodwill, because no one knew how to figure it out.
So the company at book value was a billion dollars, but you paid five billion for it. Well, you're not an idiot, so that four billion extra you paid must be in goodwill, and they'd stick it in goodwill. And I think there's actually a way that goodwill gets depreciated as well. So it could be that, but I'm totally away from the accounting framework. I think that's distracting; trying to work within the GAAP rules is more of an inhibitor. And we talk about the Googles of the world and the Facebooks of the world and the Netflixes of the world and the Amazons, companies that are great at monetizing data. Well, they're great at monetizing it because they're not selling it, they're using it. Google is using their data to dominate search, right? Netflix is using it to be the leader in on-demand videos. And it's how they use all the data, how they use the insights about their customers, their products, and their operations, to really drive new sources of value. So to me, when you start thinking about it from an economics perspective, for example, why is the same car that I buy and an Uber driver buys more valuable to the Uber driver than it is to me? Well, the bottom line is, Uber drivers are going to use that car to generate value, right? That $40,000 car they bought is worth a lot more, because they're going to use it to generate value. For me it sits in the driveway and the birds poop on it. So, right, it's this value-in-use concept. And by the way, most organizations really struggle with this. They struggle with this value-in-use concept. When you talk to them about data monetization, they immediately think about selling data, the chief data officer knocking on doors, shaking their tin cup, saying, "Buy my data." No, no one wants your data. Your data is more valuable for how you use it to drive your operations than it is to sell to somebody else. >> Right, right.
Well, one of the other things that's really important from an economics concept is scarcity, right? A whole lot of economics is driven around scarcity, and how do you price for scarcity so that the market evens out and the price matches up to the supply? What's interesting about the data concept is, there is no scarcity anymore. And you know, you've outlined, and everyone has, giant numbers going up and to the right in terms of the quantity of data there is and is going to be. But what you point out very eloquently in this paper is the scarcity is around the resources to actually do the work on the data to get the value out of it. And I think there's just this interesting step function between raw data, which has really no value in and of itself, right? Until you start to apply some concepts to it, you start to analyze it, and most importantly, you have some context by which you're doing all this analysis to then drive that value. And I thought it was a really interesting part of this paper, which is, get beyond the arguing that we're kind of discussing here and get into some specifics where you can measure value around a specific business objective. And not only that, but then the investment of the resources on top of the data to be able to extract the value to then drive your business process. So it's a really different way to think about scarcity, not on the data per se, but on the ability to do something with it. >> You're spot on, Jeff, because organizations don't fail because of a lack of use cases. They fail because they have too many. So how do you prioritize? Scarcity is not an issue on the data side, but it is an issue on the people-resources side; you don't have unlimited data scientists, right? So how do you prioritize and focus on those opportunities that are most important? I'll tell you, that's not a data science conversation, that's a business conversation, right?
And figuring out how you align organizations to identify and focus on those use cases that are most important. Like in the paper, we go through several different use cases using Chipotle as an example. The reason why I picked Chipotle is because, well, I like Chipotle, so I could go there and I could write it off as research. But think about the number of use cases where a company like Chipotle, or any other company, can leverage its data to drive its key business initiatives and its key operational use cases. It's almost unbounded, which by the way, is a huge challenge. In fact, I think part of the problem we see with a lot of organizations is, because they do such a poor job of prioritizing and focusing, they try to solve the entire problem in one fell swoop, right? It's like the old ERP big-bang projects. Well, I'm just going to spend $20 million to buy this analytic capability from company X, and I'm going to install it, and then magic is going to happen, right? And magic never happens. We get crickets instead, because the biggest challenge isn't around how do I leverage the data, it's about where do I start? What problems do I go after? And how do I make sure the organization is bought in to basically, use case by use case, build out your data and analytics architecture and capabilities? >> Yeah, and you start backwards from really specific business objectives in the use cases that you outline here, right? I want to increase my average ticket by X. I want to increase my frequency of visits by X. I want to increase the amount of items per order from X to 1.2X, or 1.3X. So from there you get a nice kind of big revenue hit that you can plan around, and then work backwards into the amount of effort that it takes, and then you can come up with, "Is this a good investment or not?" So it's a really different way to get back to the value of the data.
And more importantly, the analytics and the work to actually pull out the information. >> The data and analytic technologies available to us, the very composable nature of these, allow us to take this use case by use case approach. I can build out my data lake one use case at a time. I don't need to stuff 25 data sources into my data lake and hope there's something valuable in there. I can use the first use case to say, "Oh, I need these three data sources to solve that use case. I'm going to put those three data sources in the data lake. I'm going to go through the entire curation process of making sure the data has been transformed and cleansed and aligned and enriched, the metadata, all the other governance, all that kind of stuff that goes on. But I'm going to do that use case by use case, 'cause a use case can tell me which data sources are most important for that given situation. And I can build up my data lake, and I can build up my analytics, one use case at a time." And there is a huge impact then, a huge impact, when I build out use case by use case, that you don't get otherwise. Let me throw out something that's not really covered in the paper, but is very much covered in the new book that I'm working on, which is: in knowledge-based industries, the economies of learning are more powerful than the economies of scale. Now think about that for a second. >> Say that again, say that again. >> Yeah, the economies of learning are more powerful than the economies of scale. And what that means is, what I learned on the first use case that I build out, I can apply to the second use case, to the third use case, to the fourth use case. So when I put my data into my data lake for my first use case, and the paper covers this, well, once it's in my data lake, the cost of reusing that data in the second, third and fourth use cases is basically, you know, a marginal cost of zero.
So I get this ability to learn about which data sets are most important and to reapply that learning across the organization. So this learning concept: I learn use case by use case. I don't have to take a big economies-of-scale approach and start with 25 datasets, of which only three or four might be useful, incurring the overhead for all those other unimportant data sets because I didn't take the time to figure out what my most important use cases are and what data I need to support them. >> I mean, should people even think of the data per se, or should they really readjust their thinking around the application of the data? Because the data in and of itself means nothing, right? 55, is that fast or slow? Is that old or young? Well, it depends on a whole lot of things. Am I walking or am I in a brand new Corvette? It's just funny to me that the data in and of itself really doesn't have any value and doesn't really provide any direction into a decision, or a higher-order predictive analytic, until you start to manipulate the data. So is data even the right discussion? Or should we really be talking about the capabilities to do stuff with it, and really get people focused on that? >> So Jeff, there's so many points to hit on there. The application of data is where the value is, and theCUBE, you guys used to be famous for saying, "Separating noise from the signal." >> Signal from the noise. >> Signal from the noise, right. Well, how do you know in your dataset what's signal and what's noise? The use case will tell you. If you don't know the use case, you have no way of figuring out what's important. One of the things I still rail against, and it happens still: somebody will walk up to my data science team and say, "Here's some data, tell me what's interesting in it." Well, how do you separate signal from noise if you don't know the use case? So I think you're spot on, Jeff.
The way to think about this is, don't become data-driven, become value-driven, and value is driven from the use case, or the application, or the use of the data to solve that particular use case. So organizations get fixated on being data-driven. I hate the term data-driven. It's as if there's some sort of frigging magic from having data. No, data has no value. It's how you use it to derive customer, product and operational insights that drive value. >> Right, so there's an interesting step function, and we talk about it all the time. You're out in the weeds, working with Chipotle lately, increasing their average ticket by 1.2X. We talk more here kind of conceptually, and one of the great conceptual holy grails within a data-driven economy is working up this step function. And you've talked about it here: it's from descriptive, to diagnostic, to predictive, and then the holy grail, prescriptive. We're way ahead of the curve; this comes into tons of stuff around unscheduled maintenance. And you know, there's a lot of specific applications, but do you think we spend too much time shooting for that fourth order of great impact, instead of focusing on the small wins? >> Well, you certainly have to build your way there. I don't think you can get to prescriptive without doing predictive, and you can't do predictive without doing descriptive and such. But let me throw a real one at you, Jeff. I think there's even one beyond prescriptive, one we're talking more and more about: autonomous analytics, right? And one of the things the paper talked about that didn't click with me at the time was this idea of orphaned analytics. You and I kind of talked about this before the call here.
And one thing we noticed in the research was that a lot of these very mature organizations, who had advanced from the retrospective analytics of BI to the descriptive, to the predictive, to the prescriptive, were building one-off analytics to solve a problem and getting value from it, but never reusing those analytics over and over again. They were done one-off and then thrown away, and these organizations were so good at data science and analytics that it was easier for them to just build from scratch than to dig around and try to find something that was never actually built to be reused. And so I have this whole idea of orphaned analytics, right? It didn't really occur to me, it didn't make any sense to me, until I read this quote from Elon Musk. Elon Musk made this statement. He says, "I believe that when you buy a Tesla, you're buying an asset that appreciates in value, not depreciates, through usage." I was thinking, "Wait a second, what does that mean?" He didn't actually say "through usage." He said he believes you're buying an asset that appreciates, not depreciates, in value. And of course the first response I had was, "Oh, it's like a 1964-and-a-half Mustang. It's rare, so everybody is going to want these things. So buy one, stick it in your garage, and 20 years later you're bringing it out and it's worth more money." No, no, there's 600,000 of these things roaming around the streets; they're not rare. What he meant is that he is building an autonomous asset. The more that it's used, the more valuable it's getting, the more reliable, the more efficient, the more predictive, the more safe this asset's getting. So there is this level beyond prescriptive, where we can think about, "How do we leverage artificial intelligence, reinforcement learning, deep learning, to build these assets that, the more they are used, the smarter they get?" That's beyond prescriptive.
That's an environment where these things are learning, in many cases with minimal or no human intervention. That's the real aha moment. That's what I missed with orphaned analytics, and why it's important to build analytics that can be reused over and over again. Because every time you use these analytics in a different use case, they get smarter, they get more valuable, they get more predictive. To me that's the aha moment that blew my mind. I realized I had missed that in the paper entirely, and it took me basically two years to realize, d'oh, I missed the most important part of the paper. >> Right, well, it's an interesting take really on why the valuation, I would argue, is reflected in Tesla, which is a function of the data. And there's a phenomenal video, if you've never seen it, where they have their autonomous vehicle day; it might be a year or so old. And he's got his number one engineer from, I think, the microprocessor group, the computer vision group, as well as the autonomous driving group. And there's a couple of really great concepts I want to follow up on from what you said. One is that they have this thing called the fleet. To your point, there's hundreds of thousands of these things, if they haven't hit a million, that are calling home, reporting in every day, as to exactly how everyone took the northbound 101 on-ramp off of University Avenue. How fast did they go? What line did they take? What G-forces did they take? And every one of those cars feeds into the system, so that when they do the autonomous update, not only are they using all the regular things that they would use to map out that 101 northbound entry, but they've got all the data from all the cars that have been doing it. And you know, when that other car, the autonomous car, hit the pedestrian a couple years ago, I think in Phoenix, which is not good, sad, killed a person, a dark, tough situation.
But you know, we were doing an autonomous vehicle show, and a guy there made a really interesting point, right? That when something like that happens, typically if I was in a car wreck or you're in a car wreck, hopefully not, I learn, the person that we hit learns, and maybe a couple of witnesses learn, maybe the inspector. >> But nobody else learns. >> But nobody else learns. But now with the autonomy, every single person can learn from every single experience, with every vehicle contributing data within that fleet. To your point, it's just an order of magnitude different way to think about things. >> Think about a 1% improvement compounded 365 times; it equals, I think, a 38X improvement. The power of 1% improvements over these 600,000-plus cars that are learning. By the way, even when the autonomous FSD, the full self-driving module, isn't turned on, it runs in shadow mode. So it's learning from the human drivers, the human overlords; it's constantly learning. And by the way, not only are they collecting all this data, I did a little research, I pulled out some of their job search ads, and they've built a giant simulator, right? And they're basically every night simulating billions and billions of more driven miles because of the simulator. He's going to have a simulator not only for driving, but think about all the data he's capturing as these cars are riding down the road. By the way, they don't use Lidar, they use video, right? So he's driving by malls; he knows how many cars are in the mall. He's driving down roads; he knows how old the cars are and which ones should be replaced. I mean, he's sitting on this incredible wealth of data. If anybody could simulate what's going on in the world and figure out how to get out of this COVID problem, it's probably Elon Musk and the data he's captured, courtesy of all those cars. >> Yeah, yeah, it's really interesting, and we're seeing it now.
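As an aside, the compounding arithmetic Bill cites checks out. A one-line calculation (nothing Tesla-specific, just the math of 1% daily improvement over a year):

```python
# A 1% improvement compounded once a day for 365 days
improvement = 1.01 ** 365
print(round(improvement, 1))  # 37.8 -> roughly the "38X" cited above
```

The exact value is about 37.8, which rounds to the 38X figure quoted in the conversation.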
There's a new autonomous drone out, the Skydio, and they just announced their commercial product. And again, it completely changes the way you think about how you use that tool, because you've just eliminated the complexity of driving. I don't want to drive that, I want to tell it what to do. And so there's this whole application space of companies around things like measuring piles of coal and measuring these huge assets that are volumetrically measured, that these things can go and map out, and farming, et cetera, et cetera. So the autonomy piece, that's really insightful. I want to shift gears a little bit, Bill, and talk about, you had some theories in here about thinking of data as an asset, data as a currency, data as monetization. I mean, how should people think of it? 'Cause I don't think currency is very good; it's really not kind of an exchange of value that we're doing, like a classic asset. I think the data-as-oil metaphor is horrible, right? To your point, data doesn't get burned up once and become unusable. It can be used over and over and over. It's basically like feedstock for all kinds of stuff, but the feedstock never goes away. So again, is that even the right way to think about it? Do we really need to shift our conversation and get past the idea of data and get much more into the idea of information, and actionable information, and useful information that, oh by the way, happens to be powered by data under the covers? >> Yeah, good question, Jeff. Data is an asset in the same way that a human is an asset. But just having humans in your company doesn't drive value, it's how you use those humans. And so it's really, again, the application of the data around the use cases. So I still think data is an asset, but I'm not fixated on putting it on my balance sheet. When people talk about putting it on a balance sheet, I immediately put the blinders on. It inhibits what I can do.
I want to think about this as an asset that I can use to drive value, value to my customers. So I'm trying to learn more about my customers' tendencies and propensities and interests and passions, and trying to learn the same things about my cars' behaviors and tendencies and my operations' tendencies. So I do think data is an asset, but it's a latent asset, in the sense that it has potential value but actually has no value per se in putting it into a balance sheet. So I think it's an asset; I worry about the accounting concept immediately hijacking what we can do with it. To me the value of data becomes how it interacts with, maybe, other assets. So maybe data itself is not so much an asset as it is fuel for driving the value of assets. You know, it fuels my use cases. It fuels my ability to retain and get more out of my customers. It fuels my ability to predict when my products are going to break down, and even to have products that self-monitor, self-diagnose and self-heal. So data is an asset, but it's only a latent asset, in the sense that it sits there and it doesn't have any value until you actually put something to it and shock it into action. >> So let's shift gears a little bit and start talking about the data and talk about the human factors. 'Cause you said one of the challenges is people trying to bite off more than they can chew. And we have the role of chief data officer now, and to your point, maybe that mucks things up more than it helps. But in all the customer cases that you've worked on, is there a consistent pattern of behavior, personality, types of projects that enables some people to grab those resources to apply to their data to have successful projects? Because to your point there's too much data and there's too many projects, and you talk a lot about prioritization.
But there are a lot of assumptions in the prioritization model: that you know a whole lot of things, especially if you're comparing project A over in group A with project B in group B, and the two may not really know the economics across that. But for an individual person who sees the potential, what advice do you give them? What kind of characteristics do you see, either in the type of the project, the type of the boss, or the type of the individual, that really lend themselves to a higher probability of a successful outcome? >> So first off, you need to find somebody who has a vision for how they want to use the data, not just collect it, but how they're going to try to change the fortunes of the organization. So it always takes a visionary. It may not be the CEO; it might be somebody who's the head of marketing or the head of logistics, or it could be a CIO, it could be a chief data officer as well. But you've got to find somebody who says, "We have this latent asset we could be doing more with, and we have a series of organizational problems and challenges against which I could apply this asset. And I need to be the matchmaker that brings these together." Now, the most powerful tool I've found for marrying the latent capabilities of data with all the revenue-generating opportunities on the application side, because there's a countless number of them, is design thinking. Now, the reason why I think design thinking is so important is because one of the things design thinking does a great job of is giving everybody a voice in the process of identifying, validating, valuing, and prioritizing the use cases you're going to go after. Let me say that again. The challenge organizations have is identifying, validating, valuing, and prioritizing the use cases they want to go after.
Design thinking is a marvelous tool for driving organizational alignment around where we're going to start, what's going to be next, why we're going to start there, and how we're going to bring everybody together. Big data and data science projects don't die because of technology failure. Most of them die because of passive-aggressive behaviors in the organization, because you didn't bring everybody into the process. Everybody's voice didn't get a chance to be heard. And that one person whose voice didn't get a chance to be heard, they're going to get you. They may own a certain piece of data. They may own something, but they're just lying there waiting for their chance to come up and snag it. So what you've got to do is proactively bring these people together. This is part of our value engineering process. We have a value engineering process around envisioning, where we bring all these people together. We help them to understand how data in itself is a latent asset, but how it can be used, from an economics perspective, to drive all this value. We get them all fired up on how it can solve any one of these use cases. But you've got to start with one, and you've got to embrace this idea that I can build out my data and analytic capabilities one use case at a time. And the first use case I go after and solve makes my second one easier, makes my third one easier, right? When you start going use case by use case, two really magical things happen. Number one, your marginal costs flatten. That is, because you're building out your data lake one use case at a time, and you're bringing all the important data into that data lake one use case at a time, at some point in time you've got most of the important data you need, and you don't need to add another data source. You've got what you need, so your marginal costs start to flatten.
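The flattening effect described above can be sketched in a few lines: each use case names the data sources it needs, and only the sources not already curated into the lake incur any onboarding cost. The use cases and the per-source cost below are hypothetical, loosely modeled on the Chipotle examples in the conversation.

```python
# Hypothetical use cases, each listing the data sources it needs.
use_cases = {
    "increase average ticket":  {"pos_transactions", "menu", "loyalty"},
    "increase visit frequency": {"pos_transactions", "loyalty", "weather"},
    "reduce food waste":        {"pos_transactions", "inventory", "menu"},
    "optimize staffing":        {"pos_transactions", "labor", "weather"},
}
ONBOARDING_COST = 50  # assumed cost units to curate one new source into the lake

lake = set()  # data sources already curated into the data lake
for name, sources in use_cases.items():
    new = sources - lake        # only sources not already in the lake cost anything
    lake |= new                 # the lake grows one use case at a time
    print(f"{name}: {len(new)} new sources, marginal cost {len(new) * ONBOARDING_COST}")
```

Running this shows the first use case onboarding three sources, while each later use case needs only one new source: the marginal cost curve flattens exactly as described.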
And by the way, if you build your analytics as composable, reusable, continuous-learning analytic assets, not as orphaned analytics, pretty soon you have all the analytics you need as well. So your marginal costs flatten. But effect number two is that, because you have the data and the analytics, I can accelerate time to value and de-risk projects as I go, use case by use case. And so the biggest challenge becomes not the data and the analytics; it's getting all the business stakeholders to agree on the roadmap we're going to go after. This one's first, and it's going first because it helps drive the value of the second and third ones. And then this one drives that one, and you create a whole roadmap of how the data and analytics ripple through to drive value across all these use cases, at a marginal cost approaching zero. >> So should we have chief design thinking officers instead of chief data officers to really move the data process along? I first heard about design thinking years ago, actually interviewing Dan Gordon from Gordon Biersch. He had just hired a couple of Stanford grads, which I think is where they pioneered it, and they were doing some work around introducing, I think it was a new apple-based alcoholic beverage, an apple cider, and they talked a lot about it. It's pretty interesting. But are you seeing design thinking proliferate into the organizations that you work with, either formally as design thinking or as some derivation of it that pulls in some of those attributes you highlighted that are so key to success? >> So I think we're seeing the birth of this new role that marries the capabilities of design thinking with the capabilities of data and analytics. And they're calling this dude or dudette the chief innovation officer. Surprise. >> Title for someone we know. >> And I've got to tell a little story.
So I have a very experienced design thinker on my team. All of our data science projects have a design thinker on them. Every one of our data science projects has a design thinker, because the nature of how you build and successfully execute a data science project models almost exactly how design thinking works. I've written several papers on it; design thinking and data science are different sides of the same coin. But my respect for design thinking took a major shot in the arm, a major boost, when the design thinker on my team, whose name is John Morley, introduced me to a senior data scientist at Google. I bought him a coffee, this was back before I even joined Hitachi Vantara, and I said, "So tell me the secret to Google's data science success. You guys are marvelous, you're doing things that no one else was even contemplating. What's your key to success?" And he giggles and laughs and goes, "Design thinking." I go, "What the hell is that? Design thinking? I've never even heard of the thing before." He goes, "I'll make a deal with you. Friday afternoon, let's pop over to Stanford's B-school and I'll teach you about design thinking." So I went with him on a Friday to the d.school, the design school over at Stanford, and I was blown away, not just by how design thinking was used to ideate and explore, but by how powerful that concept is when you marry it with data science. What is data science in its simplest sense? Data science is about identifying the variables and metrics that might be better predictors of performance. It's that "might" phrase that's the real key. And who are the people who have the best insights into what variables or metrics or KPIs you might want to test? It ain't the data scientists; it's the subject matter experts on the business side.
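A minimal sketch of that screening step: take the candidate metrics the subject matter experts propose and rank them by how strongly they track the performance outcome. All metric names and numbers below are invented for illustration, and correlation is only a first filter, not proof of prediction.

```python
# Rank SME-proposed candidate metrics by absolute Pearson correlation
# with the performance outcome, to find which *might* be better predictors.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

performance = [10, 12, 15, 18, 20]          # outcome metric (invented)
candidates = {
    "sleep_hours": [6, 7, 7, 8, 9],         # tracks the outcome (by construction)
    "shoe_size":   [10, 9, 11, 10, 9],      # noise
}

ranked = sorted(candidates,
                key=lambda m: abs(pearson(candidates[m], performance)),
                reverse=True)
print(ranked[0])  # the metric that *might* be the better predictor
```

In practice the candidate list comes out of the design-thinking sessions with the business side; the data scientists then test which candidates actually hold up.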
And when you use design thinking to bring those subject matter experts together with the data scientists, all kinds of magical stuff happens. It's unbelievable how well it works. All of our projects leverage design thinking. Our whole value engineering process is built around marrying design thinking with data science: around this prioritization, around the concepts that all ideas are worthy of consideration and all voices need to be heard, and around how you embrace ambiguity and diversity of perspectives to drive innovation. It's marvelous. But I feel like I'm a lone voice out in the wilderness, crying out, "Yeah, Tesla gets it, Google gets it, Apple gets it, Facebook gets it." Most other organizations in the world don't think like that. They think design thinking is this woo-woo thing: oh yeah, you're going to bring people together and sing Kumbaya. It's like, "No, I'm not singing Kumbaya. I'm picking their brains, because they're going to help make the data science team much more effective in knowing what problems we're going to go after and how we're going to measure success and progress." >> Maybe that's the next Dean title for the next 10 years, the Dean of Design Thinking instead of data science, and who knew they're one and the same? Well, Bill, that's super insightful. I mean, it's so validated and supported by the trends that we see all over the place, just in terms of democratization, right? Democratization of the tools, more people having access to data, more opinions, more perspectives, more people having the ability to manipulate the data and basically experiment, does drive better business outcomes. And it's so consistent. >> If I could add one thing, Jeff, I think what's really powerful about design thinking is, when I think about what's happening with artificial intelligence, there are all these conversations about, "Oh, AI is going to wipe out all these jobs. It's going to take all these jobs away."
And what we're actually finding is that if we think about machine learning driven by AI, and human empowerment driven by design thinking, we're seeing the opportunity to exploit these economies of learning at the front lines, where every customer engagement, every operational execution, is an opportunity to gather not only more data but more learnings: to empower the humans at the front lines of the organization to constantly be seeking, to try different things, to explore, and to learn from each of these engagements. AI to me is incredibly powerful, and I think about it as a source of driving continuous learning in a continuously adapting organization, where it's not just the machines doing this, but the humans who've been empowered to do it. And chapter nine in my new book, Jeff, is all about team empowerment, because nothing you do with AI is going to matter squat if you don't have empowered teams who know how to take and leverage that continuous learning opportunity at the front lines of customer and operational engagement. >> Bill, I couldn't say it better. I think we'll leave it there, that's a great close. When is the next book coming out? >> So today I do my second-to-last review. Then it goes back to the editor, he does a review, and we start looking at formatting. So I think we're probably four to six weeks out. >> Okay, well, thank you so much, and congratulations on all the success. I just love how the Dean is really the Dean now, teaching all over the world, sharing the knowledge, and attacking some of these big problems. And like all great economics problems, often the answer is not economics at all; it's completely twisting the lens and not thinking of it in that construct. >> Exactly. >> All right, Bill. Thanks again and have a great week. >> Thanks, Jeff. >> All right. He's Bill Schmarzo, I'm Jeff Frick. You're watching theCUBE.
Thanks for watching, we'll see you next time. (gentle music)

Published Date : Aug 3 2020


StrongyByScience Podcast | Bill Schmarzo Part Two


 

so two points max first off ideas aren't worth a damn ever he's got ideas all right I could give a holy hoot about about ideas I mean I I I got people throw ideas at me all the friggin time you know I don't give a shit I just truly told give a shit right I want actions show me how I'm gonna turn something into an action how am I gonna make something better right and I I want to know ahead of time what that something is am I trying to improve customer attention trying to improve recovery time for an athlete who's got back-to-back games right III I know what I'm trying to do and I want to focus on that where ideas become great and you said it really well max is ideas are something I want to test so but I know what I want to test these of the event what outcome I'm trying to drive so it isn't just it is an ideation for the I eat for the sake of ideation its ideation around the idea that I need to drive an outcome I need to have athletes that are better prepare for the next game who can recover faster who are stronger and can you know it can play through a longer point of the season here we are in March Madness and we know that by the way that the teams that tend to rise to the top are the teams that have gone through a more rigorous schedule played tougher teams right they're better prepared for this and it's really hard for a mid-major team to get better prepared because they're playing a bunch of lollipop teams in their own conference so it's it's ideas really don't excite me ideation does around an environment that allows me to test ideas quickly fail fast in order to find those you know variables or metrics those data sources it just might be better predictors of performance yeah I like the idea of acting quickly failing quickly and learning quickly right you have this loop and what happens is and then I think every strand coach in the world is probably guilty of this is we get an idea and we just apply it you go home you know I think eccentric trainings this 
great idea and we're going to do an eccentric training block and I just apply it to my athletes and you don't know what the hell happened because you don't have any contextual metrics that you base your test on to actually learn from so you at the day go I think it worked you know they jump high but you're not comparing that to anything right they jump they've been the weight room for three months my god I hope they jump higher I hope they're stronger like I can sit in the weight room probably get stronger for three months and my thought is but let's have context and it's um I call them anchor data points they were always reflecting back on so for example if I have a key performance metric where I want to jump high I'll always track jumping high but then I can apply different interventions eccentric training power training strength training and I can see the stress response of these KPIs so now I've set an environment that we have our charter still there my charter being I'm going to improve my athletic development and that's my goal I'm basing that charter on the KPI of jumping high so key performance indicator of jumping high now I can apply different blocks and interventions with that anchor point over and over again and the example I give is I don't come home and ask my girlfriend how she's doing once every month I ask her every day and that's my anchor point right and I might try different things I might try cookie and I might try making dinner I might do the dishes I might stop forgetting our dates I might actually buy groceries for once well maybe she gets happier then I'll continue to buy groceries maybe I'll remember it's her birthday March 30th I remember that that's my put it on there right and so but the idea is we have in life the way life works we have these modular points where we call anchor points where we were self-reflect and we reflect off of others and we understand our progress in our own life environment based on these anchor points and we 
progress and we apply different interventions I want this job maybe I'll try having this idea outside of here maybe I'll play in a softball league and we're always reflecting it's not making me happier is that making me feel fulfilled and I don't understand why we don't take what we do every day and like subconsciously and apply it into the sports science world but lava is because it happens unconsciously because that's how our body has learned to evolve we have anchor points I want to survive I want to have kids lots of kids strong kids and I and I die so my kids can have my food and that's what we want as a body right your bison care about anything else and so that's why you walk with a limp after you get hurt you don't want perfect again it's a waste of energy to walk perfect right you can still have kids with a limp I hate to break it to you right we're not running from animals anymore and so we have all these anchor points in life let's apply that same model now and like you said it's like design thinking and actually having that architecture to outline it whether it's in that hypothesis canvas to force us to now consciously do it because we're not just interacting with ourselves now we're interacting with other systems other nodes of information to now have to work together in use in to achieve our company's charter interesting max there's a lot of a lot of key points in there the one that strikes me is measurement John Smail at Procter & Gamble I was there you still I say you are what you measure and you measure what you reward that was his way of saying as an organization that the compensation systems are critical and the story just walked through about what Kelsey right and what you guys are doing and how you increase your your happiness level right now here's the damnest your work I mean that is that is how you're rewarded right if you are rewarded by happiness and so you you learn to measure if you're smart right that you don't miss birthdays that you do 
dishes you you you help up around the house you do things and when you do those things the happiness meter goes up and when you don't do those things happiness meter goes down and you know because you're you're you're probably pulling not just once a day but as you walk by her throughout the day are on a weekend you're you're constantly knowing right if if you're liking your mom you know when mom's not happy you don't need to be a day to sign this and know mom's not happy and so then you you know you re engineer about okay what did I do wrong that causes unhappiness right and so life is a lot of there's a lot of life lessons that we can learn that we can apply to either our business our operations or sports whatever it might be that your your profession is in about the importance of capturing the right metrics and understanding how those metrics really drive you towards a desired outcome and the rewards you're gonna receive from those outcomes yeah and with those it's the right metrics right that's what not metrics the right metrics if I want to know if someone was happy I wouldn't go look at the weather I wouldn't you know check gas prices especially if I'm curious they're happy with me well maybe they might reflect if they're happy in general if they're happy with me right now I'm contextualizing I'm actually trying to look at I know a little bit more about what I should look at I don't know everything and so you might have metrics that you say you know I know science says this metric is good this metric is good maybe we want to explore of these couple of metrics over here because we think that either aid they're related to one of these metrics or they related to the main outcome itself and that gives you a way to then I have these key and core metrics that's not stacking the deck but it's no one you're gonna get insights out of it and then I have these exploratory metrics over here but you're gonna allow me then to dive and explore elsewhere and if you're a 
company those can be trade secrets they can be proprietary information if you're a trainer it can be ways to learn how different athletes adapt to make yourself better and again we're talking about a company and we're talking about trainer there's no difference when it comes to trade secrets right trainers keep their trade secrets and companies keep their trade secrets and as we talk about this it's really easy to see how these two environments where they're talking about company athletic development sports science personal training health and wellness are really universally governed by the same concepts because life itself is typically governed by these concepts and when we're playing those kind of home iterations to it you can really begin to quickly learn what's going on and whether or not those metrics that you we're good ARCA and whether or not you can learn new metrics and from that max you raise an interesting question or made a point here that's I might be very different in the sports world than it is in the business world and that is the ability to test and what I mean by that is you know the business world is full of concepts like a bee testing and see both custody and simulations and things like that when you're dealing with athletes individually I would imagine it's really hard to test athlete a with one technique and athlete B with another technique when both these athletes are trying to maximize their performance capabilities in order to maximize you know the money there can they can they can generate how do you deal with that so yes no one wants to get the shitty program yes that's correct yeah for the most part people don't and this I'll take people don't test like that and but here's my solution to us I think being a critic without solutions called being an asshole my solution to that is making it very agile and so we're not going to be able to you know test group a versus group B but what you can do if you're a coach and you have faith in because 
there are a lot of programs coaches use coaches probably use you know every offseason they might try a new program so there's no real difference in all honesty to try a new program on you know these seven athletes versus and then try a different one that you also trust on these seven athletes and part of that comes from the fact that we have science and evidence to show that both these programs are really good right but there's no one's actually broken down the minutiae of it and so yes you probably could do a and B testing because you have faith in both programs so it's not like either athletes getting the wrong program they're both getting programs that are going to probably elicit an outcome of performing better but who wants to perform the best the second asks the second aspect would be what kind of longitudinal data that you can collect very easily to understand typical progression of athletes for example if you coach and you coach for eight years you'll have you know eight different freshman classes theoretically and you'll begin to understand how a freshman typically progresses to a sophomore in what their key performance indicators typically trend ass and so you can now say okay last year we did this this year we do this I'm gonna see if my freshman class responds differently is this going to give us the perfect answer absolutely not no but without data you're just another person with an opinion that's not my quote I stole that quote but it's true because if we don't try and audit ourselves and try to understand the process of how is someone developing then we're just strictly relying on confirmation bias I mean my program was great you know Pat some guys in the back that jumped higher and we did awesome if we're truly into understanding what's best then we'll actually try and you know measure some of these progress some of this some of these KPIs over time in the example I give and it's unfortunate and fortunate I don't mean anything bad by this either 
we're on a salary right and so what happens when you're on a salary is no matter really what happens assuming you're doing your job you're gonna keep your job but if you look at a start-up a startup has one option and that's to make money or go out of business right they don't really have the luxury of oh we're just gonna you know hang out and not saying coaches hang up or not we're just gonna you know keep this path we're going on as a coach you know how do I apply a similar model well I start up the bank my startup is you can go from worth zero dollars to worth a hundred you know million two billion dollars in one year at the coach we don't have that same environment because we're not producing something tangible which doesn't always it doesn't have the same capitalistic Drive right the invisible hand pushing us the same way the free market does with you know devices and so we don't always follow the same path that these startups have done yet that same path and same model might provide better insights so max you've hit something I found very interesting confirmation bias if if you don't take the time before you execute a test understand the variables that you're gonna test what happens is if you after the test is over you go back and try to triage what the drivers were that impact and confirmation bias and revisionist history and all these other things that make humans really poor decision-makers get in the way and so but before as a coach I would imagine before as a coach what you'd want to do is is set up ahead of time we're gonna test the following things to see if they have impact by thoroughly like the hypothesis development canvas right they'll really understand against what you're really going to test and then when you've done that test you you will you would have much more confidence in the results of that test versus trying to say wow Jimmy Jimmy jumped two inches higher this year thank God what did he do let's figure out and revision it wasn't what he 
ate was it where he slept oh he played a lot of video games that must be it he is the video games made him jump higher right so it's I think a lot of sports in particular even more than the business for a lot of sports is based on on heuristics and gut feel it's run by a priesthood of former athletes who are were great because of their own skills and capabilities and it maybe had very little do with her development and I don't want to pick on Michael Jordan but no Michael Jordan was notoriously a poor coach and a poor judge of talent he made some of the most industries when the worst draft choices industry has ever seen and that's because he mistakenly thought that everybody was like him that he revision history about well what made me great were the following thing so I'm gonna look for people like that instead of reversing the course and saying okay let's figure out ahead of time what makes what will make you a better plant player and then trying these tests across a number of different players to figure out okay which of these things actually had impact so sports I think has gotten much better Moneyball sort of opened that people's eyes to it now we're seeing now more and more team who are realizing that that data science is as a discipline it's not something you apply after the fact but in order to really uncover what's the real drivers of performance you have to sit down before you do the test to really understand what it is you're testing because then you can learn from the tests and and let's be honest right learning is a process of exploring and failing and if you don't try and fail enough times if you don't have enough might moments you'll never have any break to a moment and I think what people don't understand is they hear the word fail and assumed oh we did a six-month program and failed nope failure can occur in one day and that's okay right you can use for example I'm going to use this piece of technology as motivation for biofeedback to increase my 
athletes and tint and the amount of effort they put into the weight room that's right hypothesis you can test that in one day you print out that piece of technology the athletes don't respond well you'd have learned something now okay that technology didn't bring about the motivation I thought why was that you can do reflect and that revision because you had the infrastructure beforehand on maybe notes that you may have taken and scribbled down on your pad or observations from the coaches I am I but you know what the athletes weren't very invested because the technology took too long to set up right it wasn't the technology's fault it was the process of given technology available to act and utilize on so maybe you retest again with it set up beforehand or a piece of technology that's much easier to use and the intent increases so now you say okay it's not the technology's fault it's the application of how we're using the technology at the same time we hear a lot of things like I'm gonna take a little bit of pivot not too far though is in the baseball world you see technology being more used more and more as a tool and it's helping guide immediate actions on the field whether it's not it's a you know spin rates its arm velocities with accelerometers or some sort of measurement they decide to use but that's not necessarily collecting data that's using technology as a performance tool and I think there's a distinction between the two the two are not mutually exclusive you can still use it as a performance tool but that performance data if the infrastructure is not there to store a file and reflect and analyze it's only being used one-sided and so people think oh we're doing sports science we're doing data science because we're collecting data well that's not I can go count ants that's collecting data but that's not you know I don't unless I count ants every day and say oh my game populations decreasing right and kind of a here's a really easy way to think of it in my 
opinion you have cookies in the fridge right and every day I go and every week will say my mom makes cookies this doesn't happen I wish it did be very cool but I love your mom and we didn't eat cookies every week but in the fridge I go when I count how many cookies there were right and using data I'd say oh twelve cookies if there's any cookies at all I can eat right that's using technology and that moment but doing data Sciences well you know what she's gonna make you know twelve and a couple of days and I have two days left and there's six cookies I can eat three today and three tomorrow because now you're doing prescriptive analytics right because you are prescribing an action based on the information you collected it's based on historical data because you know that every seventh day the cookies are coming no I just take it as I'm using technology as a tool I might only eat one cookie and forever be leaving six cookies on the table right and so there's hid don't want to do that no we don't but we trick ourselves I think we see that not saying baseball does is but I'm saying we've see that in all domains where we use technology we say oh technology good we had someone use technology that's data science no that's not data science that's using technology to help Tripp augment training using data Sciences understand the information that happened during the training process looking at it contextually to them prescribed saying I'm going to do this exercise or this exercise based on the collection and maturation of the information so instead of cookies here I eat one cookie it's a historic Lee I know there's going to be twelve cookies every seven days I have two days left I can eat three cookies now I can hide two and tell my sister Amelia oh there's only one left very weird I don't know who ate data - well let max let me let me let me wrap up with a very interesting challenge that I think all all data scientists face wellmaybe all citizens of data science face and I 
>> Bill: Let me wrap up with an interesting challenge that I think all data scientists face — well, maybe all citizens of data science. By "citizens of data science" I mean people who understand how to use the results of data science, not necessarily the people creating it. Here's the challenge: if you make your decisions based on the numbers alone, you're likely to end up with suboptimal results, because there are lots of outside variables that have huge influence — especially when it comes to humans, and even machines to a certain extent. Let me give you an example. Baseball is infatuated with sabermetrics and numbers. Everybody is making decisions with them — we're seeing it in the current offseason in who's signing contracts and who's being given money. Organizations use the numbers to show how much a player is really worth, and they're getting surgical in their ability to figure out that a player isn't worth, say, a six-year contract for 84 million dollars — he's worth a two-year contract for 36, and that's how I'll pay while minimizing my risk. The numbers really drive and allow that. But it isn't just big data that helps make decisions. I would argue the insight carried in the small data is equally important, especially in sports, and I think this is a challenge in other parts of the business too: the data by itself doesn't tell the full story. In particular, think about how an organization leverages the small data — the observed data — to make a better decision. In this offseason, teams became infatuated with using numbers to decide who to offer contracts to, for how much, and for how long, and in most cases the contracts shrank in value and length, because people compare the numbers and say, "So-and-so only got this, so you're only going to get this." Numbers are great, but they miss the smaller aspects that differentiate good athletes from great athletes — things like fortitude, heart, effort, resilience. You can't find those in the numbers. Take a closer who goes out in the eighth inning, has an awful outing, gets beat up all over the place — does that person have the guts, the fortitude, to go back out there after a bad eighth inning and do it again? Who can fight through when they're tired late in the game? It's a 48-minute game, you've played forty minutes already with hardly a break, you're down by two, the ball's in your hands and a three-pointer wins it — what are you going to do? The numbers don't measure that. There are other metrics out there, like fortitude and heart, that you can actually start to measure, but they don't show up in the box score — they come from subject matter experts who say, "That person has fight." In fact, there's one pro team that, in the minor leagues, deliberately puts players into almost no-win situations, because they want to see what they'll do: give up, or fight back? Batting average can't tell you whether somebody is going to get up or give up in the ninth inning when they think they've lost — and if they give up, I don't want that person out there. So think about how, in sports, you complement the data coming off devices with what an experienced coach can see: that person has something extra, the fight, the fortitude, the resilience — when they're down they keep battling, they don't give up. From playing and coaching, I know who's going to give up, and I don't want them on the court, even if they're the best player from a numbers perspective. If that were the case, Carmelo Anthony would be an All-Star every time — his numbers are always great, but the guy lacks heart; he doesn't know how to win. So as a sporting organization, use the metrics to give you a baseline, but don't forget the soft metrics, the observable things, that tell you somebody has something special.

>> Max: That's an awesome way to bring this together, because subject matter experts are the people who have been in the trenches and see it firsthand. Data is here to augment you in your decisions — it's not here to override you, it's not here to take your place. When coaches fear data, it's the silliest thing ever, because data is just more ammo for a gunslinger. It's not going to win the battle by itself; it's just the bullets — you still have to aim and fire. When we look at performance and athletic development, all these numbers will never be 100% perfect — but neither will you. What we're trying to do is help your decisions with more information, information you might otherwise not be able to quantify. It's giving the paintbrush not just the color red but all the colors, so you can make whatever painting you want and you're not constrained by the things you can't measure yourself.

>> Bill: I'd add one point, Max, to build on that: data won't make a shitty coach good, but it will make a good coach great.

>> Max: Couldn't agree more. Well, Dad, thank you for being on here — I really appreciate it. For everyone listening, this is going out right at March Madness time, so I pulled the Dean of Big Data away from March Madness. He made his bracket on Google Cloud using AI, and only he would — I won't say take the fun out of it, but — try to grind out the family bracket with all the augmented decision-making he possibly can.

>> Bill: Like Max said, the data won't make somebody shitty good — and I'm still not good. Google Cloud couldn't help me; I'm still at the bottom of the family pool.

>> Max: It's great to have you on, and I guess every minute here is worth double, it being March Madness time.

>> Bill: Thanks, Max, for the opportunity. It's been a fun conversation.

>> Max: All right, thank you guys for listening — really appreciate it.

Published Date : Mar 25 2019


StrongyByScience Podcast | Bill Schmarzo Part One


 

>> Announcer: Produced from theCUBE Studios, this is StrongByScience — in-depth conversations about science-based training, sports performance, and all things health and wellness. Here's your host, Max.

>> Max: All right, thank you guys for tuning in. Today I have the one and only Dean of Big Data — the man, the myth, the legend, Bill Schmarzo, who is also my dad. He's the CTO of IoT and Analytics at Hitachi Vantara, and he has a very interesting background: he's known as the Dean of Big Data, but in our household he's also the king of the court and all things basketball. Unlike most people in the data world — and I use "most" as an umbrella term — Bill had an illustrious sports career playing at Coe College, the Harvard of the Midwest and my alma mater as well. Having a background that isn't just computer science, where multiple disciplines are involved — the jazz career, the basketball career, and obviously the career you're in now — plays a huge role in being able to interpret across multiple domains and bring them together into one. So thank you for being here, Dad.

>> Bill: Yeah, thanks, Max. That's a great introduction — I appreciate that.

>> Max: It's wonderful to have you. For our listeners who aren't aware: Bill is my dad, but calling him Dad the whole time would drive me crazy, so I'll call him Bill. Bill has a mind that doesn't think like most — he sees things not just in terms of a single trajectory but across multiple domains, both vertically and horizontally. Data is so commonly brought up in sports, performance, and athletic development, and "big data" is probably one of the biggest catchphrases people throw around nowadays, but it doesn't always carry much meaning, because a lot of the time we say "big data" and then get no action out of it. Bill's specialty is not just big data but getting action out of big data. So going forward, I think we'll talk about how to utilize big data, how to organize data in general, and how to put yourself in a situation to get actionable insights. Just to start off, could you talk a little about your background, some of the things you've done, and how you developed the insights you have?

>> Bill: Thanks, Max. I've been doing data analytics a long time, and I was very fortunate — one of those Forrest Gump moments in life. In the late 1980s I ran a project at Procter & Gamble where we brought Walmart's point-of-sale data for the first time into what we would now call a data warehouse. For many, that became the launching point of the data warehouse and BI marketplace — you can trace the origins of many of the BI players to that project at Procter & Gamble in '87 and '88. I spent a big chunk of my life as a big believer in business intelligence and data warehousing — amassing data and using it to report on what's going on and derive insights. I did that for 20, 25 years until, as you probably remember, Max, I was recruited out of Business Objects — where I was the vice president of analytic applications — by Yahoo. Yahoo had a very interesting problem: they needed to build analytics for their advertisers to help them optimize their spend across the Yahoo ad network. What I learned there — in fact, what I unlearned there — was that everything I had known about BI and data warehousing, about being schema-centric, about everything revolving around tabular data, worked entirely differently at Yahoo. It was my first introduction to Hadoop and the concept of a data lake, and my first real introduction to data science — how to do predictive and prescriptive analytics. It was such a huge change for me that I was asked to come back to TDWI — The Data Warehousing Institute, where I had taught for many years — to do a keynote after being at Yahoo for a year or so, to share what I'd observed and learned. I stood up in front of about 600 people and started my presentation by saying, "Everything I've taught you for the past 20 years is wrong." I didn't get invited back for 10 years, so that probably tells you something. But it really was about unlearning a lot of what I had learned before. And Max, one of the aha moments for me was this: BI was very focused on understanding the questions people were trying to ask and answer; data science is about understanding the decisions they're trying to take action on. Questions, by their very nature, are informative, but decisions are actionable. So at Yahoo, to help our advertisers optimize their spend across the ad network, we focused on identifying the decisions the media planners and buyers and the campaign managers had to make around running a campaign — how much money to allocate to which sites, how many conversions do I want, how many impressions do I want. We built predictive analytics around all those decisions so we could deliver prescriptive actions to those two classes of stakeholders — people who had no aspirations of being analysts; they were trying to be the best digital marketing executives they could possibly be. And that leads to where I am today: my teaching, my books, my blogs — everything I do is about how we take data and analytics and help organizations become more effective. Everything since then — the books I've written, the teaching at the University of San Francisco and, next week, at the National University of Ireland in Galway, and all the clients I work with — is really about helping organizations leverage data and analytics to drive the decisions that optimize their business and operational models.

>> Max: So how would you define the difference between a question someone's trying to answer and a decision they're trying to be better informed on?

>> Bill: Here's how I put it. I call it the SAM test: is it Strategic, is it Actionable, is it Material? You can ask questions that are provocative but not strategic to the problems you're trying to solve. You may not be able to ask questions that are actionable, in the sense that you'd know what to do with the answer. And you don't necessarily ask questions that are material, in the sense that the value of answering the question is greater than the cost of answering it. So when I apply the SAM test in data science, I first go through a process to identify, validate, value, and prioritize the decisions, so I know which decisions are most important. Then, when I start digging through the data — structured and unstructured, across a number of different data sources — I'm trying to codify the patterns and relationships buried in that data, and I apply the SAM test against those insights: is it strategic to the problem I'm trying to solve, can I actually act on it, and is it material — is acting on it more valuable than the cost of creating the action around it? That, to me, is the big difference: decisions, by their very nature, are about taking action.
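The SAM test Bill describes can be sketched as a simple filter over candidate insights. Everything below — the fields, the example insights, the values — is illustrative, not from any real engagement:

```python
from dataclasses import dataclass

@dataclass
class Insight:
    description: str
    strategic: bool    # tied to the problem we're actually trying to solve?
    actionable: bool   # do we know what to do with it?
    value: float       # estimated value of acting on it
    cost: float        # cost of producing and acting on the answer

def passes_sam_test(i: Insight) -> bool:
    """Strategic AND Actionable AND Material (value exceeds cost)."""
    material = i.value > i.cost
    return i.strategic and i.actionable and material

insights = [
    Insight("rest star player on back-to-backs", True, True, 50.0, 5.0),
    Insight("provocative but tied to no decision", False, True, 10.0, 1.0),
    Insight("valid but costs more than it's worth", True, True, 3.0, 9.0),
]
kept = [i.description for i in insights if passes_sam_test(i)]
print(kept)
```

Only the first insight survives: it is strategic, actionable, and its value outweighs its cost — the other two fail exactly the way Bill describes provocative-but-immaterial questions failing.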
You're trying to make a decision so you can take an action. Questions, by their nature, are informative, interesting, even provocative — questions have an important role — but they don't necessarily lead to actions.

>> Max: So if I'm a sports coach running a professional basketball team, some of the decisions I'm trying to make are: what program best develops my players, and what metrics will help me decide who the best prospect is. Is that the right way of looking at it?

>> Bill: Yeah. We did an exercise at USF where the students worked out what decisions Steve Kerr needs to make over the next two games, especially in-game decisions. Rotations: how often are you going to play somebody, how long are they going to play, what are the right combinations, what kinds of offensive plays are you going to run? There's a whole set of decisions that Steve Kerr, as coach of the Warriors, needs to make in a game — not only to try to win, but also to minimize wear and tear on his players. And that's a really good point about decisions: good decisions are always a conflict of other ideas. Win the game while minimizing wear and tear on my players. All the important decisions in life have two, three, or four variables that don't point the same way, and that's where data science comes in: it looks across those three or four metrics — the things you're going to measure success against — and tries to figure out the right balance given the situation you're in. Going back to the playing-time decision, think about all the data you might want in order to optimize it: when's the next game, how far into the season are we, where do we currently sit in the rankings, how many minutes per game has player X been playing, what does the past few years' history say, what's their maximum? There aren't actually that many decisions people are trying to make, and the beauty of decisions is that they really haven't changed in years. What's changed is not the decisions — it's the answers. The answers have changed because we have this great abundance of data available to us — in-game performance, health data, DNA data, all kinds of other data — and all these great advanced analytic techniques — neural networks, unsupervised and supervised machine learning — that can help us uncover the relationships and patterns buried in the data and use them to individualize those decisions. One last point: when people talk about big data, they get fixated on the "big" part, the volume. It's not the volume of big data that you monetize — it's the granularity. What I mean is that I now have the ability to build very detailed profiles. Going back to our basketball example, I can build a very detailed performance profile on every one of the players on the Warriors: what's their optimal playing time, how much time should they spend on the court before a break, what are the right combinations of players to generate the best offense or the best defense. I can build these very detailed individual profiles and then start mashing them together to find the right combinations. So when we talk about "big," it's not the volume that's interesting — it's the granularity.
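Bill's granularity point — individual profiles rather than one team-level aggregate — can be sketched roughly like this. The players and stats are invented for illustration; a real profile would fold in far more signals:

```python
from collections import defaultdict

# Raw game logs: one row per player per game (made-up numbers).
game_logs = [
    {"player": "A", "minutes": 34, "points": 28},
    {"player": "A", "minutes": 30, "points": 22},
    {"player": "B", "minutes": 18, "points": 6},
    {"player": "B", "minutes": 22, "points": 10},
]

def build_profiles(logs):
    """Roll raw logs up into a per-player (granular) profile."""
    acc = defaultdict(lambda: {"games": 0, "minutes": 0, "points": 0})
    for row in logs:
        p = acc[row["player"]]
        p["games"] += 1
        p["minutes"] += row["minutes"]
        p["points"] += row["points"]
    return {
        name: {
            "avg_minutes": s["minutes"] / s["games"],
            "points_per_minute": s["points"] / s["minutes"],
        }
        for name, s in acc.items()
    }

profiles = build_profiles(game_logs)
print(profiles["A"])
```

The same logs could be summed into one team average, but it's the per-player profiles that let you search for the right combinations — the "mashing together" Bill mentions.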
>> Max: Gotcha. What's interesting from my world: in marketing and business, whether you're a company trying to find out more about your customers or a startup trying to learn what product you should develop, there are tons of unknowns, and from my understanding big data can help you find patterns within customers and figure out how to market — in your book you talk about increasing sales at Chipotle because you understand X, Y, and Z about what's around the store. In the sports science world we have a friend called science, and science has already identified certain metrics that are very important and correlated with different physiological outcomes. That almost gives us a shortcut, because in the big data world, when you're trying to understand customer decisions, each customer is an individual and you're compiling everything together to find patterns — nobody is doing lab science on that the way someone studies muscle protein synthesis and the nutrients you need to recover from training. So in my position I have pillars that already exist where I can begin my search, but there's still a bunch of unknowns. In that kind of environment, do you take a different approach, or do you still collect everything you can and siphon through it afterwards? Maybe I'm totally wrong — I'll let you take it away.

>> Bill: No, that's a good question, and what's interesting, Max, is that the human body is governed by a series of laws — kinesiology, the physics you've talked about. Humans as buyers, shoppers, travelers — we have propensities; we don't have laws. I have a propensity to fly United because I get easier upgrades, but I might fly Southwest because of schedule or convenience. So you have laws working to your advantage. What's interesting about laws is that they take you into the world of IoT and this concept called digital twins, which are governed by the laws of physics. If I have a compressor or a chiller or an engine with a bunch of components engineered together, I can apply those laws — I can run simulations against my digital twin to understand exactly when something is likely to break, what the remaining useful life of that product is, and how severe the maintenance I need to do on it is. The human body, unlike the human psyche, is governed by laws. Human behaviors are really hard — Las Vegas is built on the fact that human behaviors are so flawed — but body physics, like the physics that runs these devices, lets you actually build models and run simulations to figure out the wear and tear and what the body can sustain.

>> Max: Gotcha. In our world you start looking at subsystems: this is your muscular system, this is your autonomic nervous system, this is your central nervous system — these are ways we can begin to measure. I wrote a blog on this, a stress response model, where you understand these systems — inferentially, for the most part — then apply a stress and see how the body responds. You determine that the body can only respond in a certain number of ways: compensatory, returning to baseline, or maladaptation. When you look at a cell at the individual level, there are only so many ways that cell can respond, and it's the aggregation of all those cellular responses that manifests as a change in a subsystem, and that subsystem can be measured inferentially through the technology we have. But I also think we make a huge leap, and that leap is the word "inference." We're making an assumption, and sometimes those assumptions are very dangerous — if the assumption is unknown and we're wrong about it, our whole projection sways and misses. So I like the idea of looking at patterns and at the probabilistic nature of it. I've actually changed my view a little recently — at first I was much more hardwired on laws, but now I think of it as a law with some level of variation, a standard deviation, so we have guardrails instead. Is that on the right track, or how would you approach it?

>> Bill: Actually there are a lot of similarities, Max. Your description of the human body as subsystems — when we talk to organizations about things like smart cities, smart malls, or smart hospitals, a smart city is made up of a series of subsystems: water and wastewater, traffic, safety, local development, and so on. Each of those subsystems is comprised of a series of decisions — clusters of decisions, or use cases — around what you're trying to optimize. If one of my subsystems is traffic flow, there's a bunch of use cases: where do I do maintenance, where do I expand the roads, where do I put HOV lanes? You take the smart city apart into subsystems, and the subsystems into use cases, and that puts you in a really good position. Here's something we did recently with a client trying to build the theme park of the future: how do we make certain we have a holistic view of the use cases to go after? It's really easy to identify the use cases within your own four walls, but digital transformation in particular happens outside the four walls of an organization. So what we're doing is building journey maps for all their key stakeholders — a journey map for the customer, for operations, for partners, and so on. You build these journey maps and start thinking about, for example: at some point my guest gets the itch to go do something, to go on vacation. At that point the theme park is competing not only against all the other theme parks but against Major League Baseball, against going to the beach on Sanibel Island, against just hanging around. If you only start engaging customers once they've actually contacted you, you've missed a huge part of the market — a huge chance to influence that person's agenda. I don't know how this applies to your space, Max, but as we think about smart entities, we use design thinking and customer journey maps to make certain we're not fooling ourselves by only looking within the four walls of our organization — we knock those walls down, make them very porous, and look at what happens before somebody engages with us and even afterwards. Going back to the theme park: once guests leave, they're probably posting on social media about the fun they had or didn't have, making plans for next year, talking to friends. There's a bunch of stuff — we call it afterglow — that happens after the event that you want to be part of influencing. So when you combine the data science of use cases and decisions with the design thinking of journey maps, I don't know what that might mean for your business, but for us, in thinking about smart cities, it's opened up all kinds of possibilities, and most importantly it's opened up all kinds of new areas where our customers can create new sources of value.

>> Max: Anyone listening should understand that when the word "client" or "customer" is used, it can be substituted for "athlete." What's really important is the amount of infrastructure you build around an idea before you approach a situation — something that, in my opinion, sports science truly lacks across multiple domains. What happens is we get a piece of technology and someone says, "Go do science," whereas you take the approach of actually thinking out what you're doing beforehand: determine your key performance indicators, understand the journey this piece of technology is going to take with the athlete — how the athlete will interact with it throughout their four years. In the private sector, that afterglow effect might be what you'd call client retention: their ability to come back over and over and spread your word for you. With student athletes, maybe it's athletes talking highly about your program to help with recruiting — understanding that developing athletes makes that college, program, or organization more enticing. What really stood out is that you have this infrastructure built beforehand. The example I give — and I've spoken with a good number of organizations and teams about data utilization — is this: if you were dropped in the middle of the woods and someone said, "Go build a cabin," in a giant forest you could use as much wood as you wanted, chopping down trees until you had a shelter of some sort. But if someone said, "You have three trees to cut down to make a cabin," you'd become very efficient — you'd think about each chop, each piece of wood, and how it would be used. So when we look at athlete development, client retention, or general health and wellness, it's not just "this is a great idea, we want to make the world's greatest training facility" — it's what infrastructure and steps you need to take. And you said stakeholders: which individuals am I working with? Am I talking with the physical therapist, the athletic trainer, the skill coach? How does the skill coach want the data presented? Maybe that's different from how the athletic trainer wants it. Maybe the sport coach doesn't want to see the data unless a red flag comes up. You have all these different entities, just like developing that customer journey through the theme park — making sure the experience is memorable, causes an afterglow, and has meaning. How can we take data and apply it the same way, getting the most value out of the granular aspects of the data and turning it into something valuable?

>> Bill: Max, you said something really important. Let me share one of the many horror stories that comes up in my daily life: somebody walks up to me and says, "Hey, I've got a client — here's their data, go do some science on it." Well, what the heck, right? So we created this thing called the hypothesis development canvas. Our sales teams hate it, because of the time it takes; our data science teams love it, because we do all this pre-work. We make sure we understand the problem we're going after, the decision they're trying to make, the KPIs against which we're going to measure success and progress, the operational and financial business benefits, and the data sources we want to consider. And here's something, by the way, that I wish Boeing had thought more about: what are the costs of false positives and false negatives? Do you really understand where your risk points are? False positives and false negatives are really important in data science, because data science makes predictions, and by virtue of making predictions we are never 100% certain we're right. Predictions are built on "good enough" — and when is good enough good enough? A lot of that determination comes down to the cost of false positives and false negatives. Think about a professional athlete: the ramifications of overtraining a Kevin Durant or a Steph Curry so that they're out for the playoffs carry huge financial implications for them personally and for the organization. You really need to understand exactly what the cost of being wrong is. So with the hypothesis development canvas, we do a lot of this work before we ever put science to the data.

>> Max: Yeah, it's something that's lacking not just in sports science but in many fields. You referred to the hypothesis canvas: it's a piece of paper that provides a common language. For listeners who aren't aware, the hypothesis canvas is something Bill has worked on and developed with his team. It's about 13 different squares and boxes, and you can adapt it to your own profession and what you're diving into, but essentially it walks through the infrastructure you need to have set up for a hypothesis, idea, or decision to actually be worth a damn.
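Bill's "when is good enough good enough?" point — weighing the cost of false positives against false negatives — can be sketched as picking the prediction threshold that minimizes expected cost. The costs and validation scores below are invented for illustration (a false "overtrained" flag needlessly rests a healthy player; a missed flag risks losing a star for the playoffs):

```python
COST_FP = 1.0   # needlessly resting a healthy player
COST_FN = 10.0  # playing a fatigued player who then breaks down

# (model risk score, actually_at_risk) pairs -- made-up validation data
scored = [(0.9, True), (0.8, True), (0.6, False),
          (0.4, True), (0.3, False), (0.1, False)]

def expected_cost(threshold: float) -> float:
    """Total cost of errors if we flag everyone at or above threshold."""
    cost = 0.0
    for score, at_risk in scored:
        flagged = score >= threshold
        if flagged and not at_risk:
            cost += COST_FP   # false positive
        elif not flagged and at_risk:
            cost += COST_FN   # false negative
    return cost

# Sweep thresholds 0.0 .. 1.0 and keep the cheapest one.
best = min((t / 10 for t in range(11)), key=expected_cost)
print(best, expected_cost(best))
```

Because a false negative costs ten times a false positive here, the cheapest threshold flags aggressively; flip the cost ratio and the optimum moves. That asymmetry is exactly why "good enough" is a business question, not just a model accuracy number.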
And what I mean by that is that so many times, and I hate this, but I'm going to go on a little bit of a rant and I apologize, people think, "Oh, I've got an idea," and they think of Thomas Edison, as if he just had an idea and made a light bulb. Thomas Edison is famous for saying, you know, I didn't just make a light bulb, I learned 9,000 ways to not make a light bulb. And what I mean by that is he set up an environment that allowed for failure and allowed for learning. But what often happens is people think, "Oh, I have an idea," and they believe the idea comes not just in a flash, because it usually doesn't, it might come from some research, but they also believe it comes with legs, that it comes with the infrastructure to support it already built around it. That's kind of the same way I see a lot of the data aspect going in our field: we get an idea, we immediately implement it, and we hope it works, as opposed to setting up a learning environment that allows you to say, okay, here's what I think might happen, here's my hypothesis, here's how I'm going to apply it. And now if I fail, because I have the infrastructure pre-mapped out, I can look at my infrastructure and say, you know what, that support beam, that individual box itself, was the weak link, and we made a mistake here, but we can go back and fix it.
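The cost-of-being-wrong reasoning in this exchange lends itself to a small worked example. The sketch below picks a "good enough" risk threshold by weighing the cost of false positives against the cost of false negatives; all the risk scores and cost values are invented for illustration and are not figures from the conversation.

```python
# Hedged sketch: choosing a "good enough" decision threshold by weighing
# the cost of false positives against the cost of false negatives.
# All costs, scores, and outcomes below are hypothetical illustration values.

def expected_cost(threshold, predictions, fp_cost, fn_cost):
    """Total cost of acting on predictions at a given threshold.

    predictions: list of (predicted_risk, actually_injured) pairs.
    fp_cost: cost of resting a healthy athlete (lost playing time).
    fn_cost: cost of playing an at-risk athlete (injury, lost playoffs).
    """
    cost = 0.0
    for risk, injured in predictions:
        flagged = risk >= threshold
        if flagged and not injured:
            cost += fp_cost      # false positive: benched needlessly
        elif not flagged and injured:
            cost += fn_cost      # false negative: injury we failed to prevent
    return cost

# Hypothetical scored athletes: (model risk score, did an injury occur?)
data = [(0.9, True), (0.8, False), (0.6, True), (0.4, False),
        (0.3, False), (0.2, True), (0.1, False)]

# A false negative (losing a star for the playoffs) costs far more than
# a false positive (one cautious rest day), so the best threshold drops low.
best = min((expected_cost(t / 10, data, fp_cost=1, fn_cost=20), t / 10)
           for t in range(1, 10))
print(best)  # prints (3.0, 0.2): lowest expected cost at threshold 0.2
```

With the false-negative cost set 20 times higher than the false-positive cost, the cheapest threshold flags nearly everyone; flip the two costs and the threshold climbs. That asymmetry is the "when is good enough good enough" question the canvas forces you to answer before any modeling starts.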

Published Date : Mar 25 2019

**Summary and Sentiment Analysis are not shown because of an improper transcript**

ENTITIES

| Entity | Category | Confidence |
| --- | --- | --- |
| Steve Kerr | PERSON | 0.99+ |
| Kevin Durant | PERSON | 0.99+ |
| Procter & Gamble | ORGANIZATION | 0.99+ |
| Steph Curry | PERSON | 0.99+ |
| Yahoo | ORGANIZATION | 0.99+ |
| Sanibel Island | LOCATION | 0.99+ |
| 10 years | QUANTITY | 0.99+ |
| Procter & Gamble | ORGANIZATION | 0.99+ |
| Chipotle | ORGANIZATION | 0.99+ |
| Walmart | ORGANIZATION | 0.99+ |
| three | QUANTITY | 0.99+ |
| a year | QUANTITY | 0.99+ |
| 9000 ways | QUANTITY | 0.99+ |
| Boeing | ORGANIZATION | 0.99+ |
| Hitachi van Tara | ORGANIZATION | 0.99+ |
| Bill Schmarzo | PERSON | 0.99+ |
| two | QUANTITY | 0.99+ |
| 100% | QUANTITY | 0.99+ |
| four | QUANTITY | 0.99+ |
| Becky | PERSON | 0.99+ |
| Thomas Edison | PERSON | 0.99+ |
| IOC | ORGANIZATION | 0.99+ |
| each piece | QUANTITY | 0.99+ |
| Warriors | ORGANIZATION | 0.99+ |
| University of San Francisco | ORGANIZATION | 0.99+ |
| Hadoop | TITLE | 0.99+ |
| each | QUANTITY | 0.99+ |
| each chop | QUANTITY | 0.99+ |
| next year | DATE | 0.98+ |
| Thomas Edison | PERSON | 0.98+ |
| four years | QUANTITY | 0.98+ |
| first | QUANTITY | 0.98+ |
| next week | DATE | 0.98+ |
| today | DATE | 0.98+ |
| bill | PERSON | 0.98+ |
| late 1980s | DATE | 0.98+ |
| Forrest Gump | PERSON | 0.98+ |
| 20 25 years | QUANTITY | 0.97+ |
| first time | QUANTITY | 0.97+ |
| two classes | QUANTITY | 0.97+ |
| Harvard | ORGANIZATION | 0.97+ |
| first introduction | QUANTITY | 0.96+ |
| four different variables | QUANTITY | 0.96+ |
| single | QUANTITY | 0.94+ |
| Coe College | ORGANIZATION | 0.94+ |
| each customer | QUANTITY | 0.94+ |
| two games | QUANTITY | 0.94+ |
| both | QUANTITY | 0.94+ |
| Dean | PERSON | 0.93+ |
| about 600 people | QUANTITY | 0.93+ |
| years | QUANTITY | 0.92+ |
| USF | ORGANIZATION | 0.92+ |
| ta world Institute | ORGANIZATION | 0.92+ |
| one | QUANTITY | 0.91+ |
| one of my subsystems | QUANTITY | 0.9+ |
| about 13 different squares | QUANTITY | 0.89+ |
| a day | QUANTITY | 0.88+ |
| Galway | LOCATION | 0.86+ |
| 88 | DATE | 0.86+ |
| National University of Ireland | ORGANIZATION | 0.85+ |
| StrongyByScience | TITLE | 0.82+ |
| Bill | PERSON | 0.81+ |
| Southwest | LOCATION | 0.81+ |
| TD WI | ORGANIZATION | 0.81+ |
| tons of unknowns | QUANTITY | 0.81+ |
| Sam test | TITLE | 0.8+ |
| bill Schwarz | PERSON | 0.8+ |
| lot of times | QUANTITY | 0.78+ |
| 87 | DATE | 0.78+ |
| three trees | QUANTITY | 0.78+ |
| boxes | QUANTITY | 0.77+ |
| many times | QUANTITY | 0.74+ |
| United | ORGANIZATION | 0.72+ |
| one last point | QUANTITY | 0.7+ |
| one of the things | QUANTITY | 0.68+ |
| past 20 years | DATE | 0.67+ |
| Part One | OTHER | 0.67+ |
| other metrics | QUANTITY | 0.65+ |
| Iran | ORGANIZATION | 0.65+ |
| four walls | QUANTITY | 0.63+ |
| past few years | DATE | 0.62+ |
| max | PERSON | 0.62+ |

Bill Schmarzo, Dell EMC | DataWorks Summit 2017


 

>> Voiceover: Live from San Jose in the heart of Silicon Valley, it's The Cube covering DataWorks Summit 2017. Brought to you by: Hortonworks. >> Hey, welcome back to The Cube. We are live on day one of the DataWorks Summit in the heart of Silicon Valley. I'm Lisa Martin with my co-host Peter Burris. Not only is this day one of the DataWorks Summit, this is the day after the Golden State Warriors won the NBA Championship. Please welcome our next guest, the CTO of Dell EMC, Bill Schmarzo. And Cube alumni, clearly sporting the pride. >> Did they win? I don't even remember. I just was-- >> Are we breaking news? (laughter) Bill, it's great to have you back on The Cube. >> The Division III All-American from-- >> Coe College. >> 1947? >> Oh, yeah, yeah, about then. They still had the peach baskets. You make a basket, you have to climb up this ladder and pull it out. >> They're going rogue on me. >> It really slowed the game down a lot. (laughter) >> All right so-- And before we started they were analyzing the game, it was actually really interesting. But, to kick things off, Bill, as the volume and the variety and the velocity of data are changing, organizations know there's a tremendous amount of transformational value in this data. How is Dell EMC helping enterprises extract and maximize it as the economic value of data changes? >> So, the thing that we find is most relevant is most of our customers don't give a hoot about the three V's of big data. Especially on the business side. We like to jokingly say they care about the four M's of big data: make me more money. So, when you think about digital transformation and how it might take an organization from where they are today to sort of embed digital capabilities around data and analytics, it's really about, "How do I make more money?" What processes can I eliminate or reduce? How do I improve my ability to market and reach customers?
How do I, ya know-- All the things that are designed to drive value from a value perspective. Let's go back to, ya know, Tom Peters kind of thinking, right? Or, I guess, Michael Porter, right? His value creation processes. So, we find that when we have a conversation around the business and what the business is trying to accomplish, that provides the framework around which to have this digital transformation conversation. >> So, well, Bill, it's interesting. The volume, velocity, variety; the three V's really say something about the value of the infrastructure. So, you have to have infrastructure in place where you can get more volume, it can move faster, and you can handle more variety. But, fundamentally, it is still a statement about the underlying value of the infrastructure and the tooling associated with the data. >> True, but one of the things that changes is not all data is of equal value. >> Peter: Absolutely. >> Right? So, what data, what technologies-- Do I need to have Spark? Well, I don't know, what are you trying to do, right? Do I need to have Kafka or Ioda, right? Do I need to have these things? Well, if I don't know what I'm trying to do, then I don't have a way to value the data and I don't have a way to figure out and prioritize my investment in infrastructure. >> But, that's what I want to come to. So, increasingly, what business executives, at least the ones who we're talking to all the time, are saying is: make me more money.
So, I've done a lot of work on data value, you've done a lot of work in data value. We've coincided a couple times. Let's pick that notion up of, ya know, digital transformation is all about what you do with your data. So, what are you seeing in your clients as they start thinking this through? >> Well, I think one of the first times it was sort of an "aha" moment to me was when I had a conversation with you about Adam Smith. The difference between value in exchange versus value in use. A lot of people when they think about monetization, how do I monetize my data, are thinking about value in exchange. What is my data worth to somebody else? Well, most people's data isn't worth anything to anybody else. And the way that you can really drive value is not data in exchange or value in exchange, but it's value in use. How am I using that data to make better decisions regarding customer acquisition and customer retention and predictive maintenance and quality of care and all the other oodles of decisions organizations are making? The valuation of that data comes from putting it into use to make better decisions. If I know then what decision I'm trying to make, now I have a process not only in deciding what data's most valuable but, you said earlier, what data is not important but may have liability issues with it, right? Do I keep a data set around that might be valuable but, if it falls into the wrong hands through cyber security sort of things, actually opens me up to all kinds of liabilities? And so, organizations are wrestling with this EVD conversation, not only from a data valuation perspective but also from a risk perspective. 'Cause you've got to balance those two aspects. >> But, this is not a pure-- This is not really doing an accounting in a traditional accounting sense. We're not doing double-entry bookkeeping with data. What we're really talking about is understanding how your business used its data.
Number one today, understand how you think you want your business to be able to use data to become a more digital corporation and understand how you go from point "a" to point "b". >> Correct, yes. And, in fact, the underlying premise behind driving economic value of data, you know people say data is the new oil. Well, that's a BS statement because it really misses the point. The point is, imagine if you had a barrel of oil; a single barrel of oil that can be used across an infinite number of vehicles and it never depleted. That's what data is, right? >> Explain that. You're right but explain it. >> So, what it means is that data-- You can use data across an endless number of use cases. If you go out and get-- >> Peter: At the same time. >> At the same time. You pay for it once, you put it in the data lake once, and then I can use it for customer acquisition and retention and upsell and cross-sell and fraud and all these other use cases, right? So, it never wears out. It never depletes. So, I can use it. And what organizations struggle with, if you look at data from an accounting perspective, accounting tends to value assets based on what you paid for it. >> Peter: And how you can apply them uniquely to a particular activity. A machine can be applied to this activity and it's either that activity or that activity. A building can be applied to that activity or that activity. A person's time to that activity or that activity. >> It has a transactional limitation. >> Peter: Exactly, it's an "or." >> Yeah, so what happens now is instead of looking at it from an accounting perspective, let's look at it from an economics and a data science perspective. That is, what can I do with the data? What can I do as far as using the data to predict what's likely to happen? To prescribe actions and to uncover new monetization opportunities. So, the entire approach of looking at it from an accounting perspective, we just completed that research at the University of San Francisco.
Where we looked at, how do you determine economic value of data? And we realized that using an accounting approach grossly undervalued the data's worth. So, instead of using an accounting approach, we started with an economics perspective. The multiplier effect, marginal propensity to consume, all that kind of stuff that we all forgot about once we got out of college really applies here because now I can use that same data over and over again. And if I apply data science to it to really try to predict, prescribe, and monetize; all of a sudden the economic value of your data just explodes. >> Precisely, because you're connecting a source of data, which has a particular utilization, to another source of data that has a particular utilization, and you can combine them, create new utilizations that might in and of themselves be even more valuable than either of the original cases. >> They genetically mutate. >> That's exactly right. So, think about-- I think it's right. So, congratulations, we agree. Thank you very much. >> Which is rare. >> So, now let's talk about this notion of, as we move forward with data value, how does an organization have to start translating some of these new ways of thinking about the value of data into investments in data so that you have the data where you want it, when you want it, and in the form that you need it. >> That's the heart of why you do this, right? If I know what the value of my data is, then I can make decisions regarding what data am I going to try to protect, enhance? What data am I going to get rid of and put on cold storage, for example? And so we came up with a methodology for how we tie the value of data back to use cases. Everything we do is use case based so if you're trying to increase same-store sales at a Chipotle, one of my favorite places; if you're trying to increase it by 7.1 percent, that's worth about 191 million dollars.
And the use cases that support that, like increasing local event marketing, or increasing new product introduction effectiveness, or increasing customer cross-sell or upsell. If you start breaking those use cases down, you can start tying financial value to those use cases. And if I know what data sets, what three, five, seven data sets are required to help solve that problem, I now have a basis against which I can start attaching value to data. And as I look across at a number of use cases, now the value of the data starts to increment. It grows exponentially; not exponentially but it does increment, right? And it gets more and more-- >> It's non-linear, it's super linear. >> Yeah, and what's also interesting-- >> Increasing returns. >> From an ROI perspective, what you're going to find is that as you go down these use cases, the financial value of that use case may not be really high. But, when the denominator of your ROI calculation starts approaching zero, because I'm reusing data at zero cost, ya know what happens to your ROI? It goes to infinity, it explodes. >> Last question, Bill. You mentioned The University of San Francisco and you've been there a while teaching business students how to embrace analytics. One of the things that was talked about this morning in the keynote was Hortonworks dedication to the open-source community from the beginning. And they kind of talked about there, with kids in college these days, they have access to this open-source software that's free. I'd just love to get, kind of the last word, your take on what are you seeing in university life today where these business students are understanding more about analytics? Do you see them as kind of, helping to build the next generation of data scientists since that's really kind of the next leg of the digital transformation? >> So, the premise we have in our class is we probably can't turn business people into data scientists.
In fact, we don't think that's valuable. What we want to do is teach them how to think like a data scientist. What happens is, if we can get the business stakeholders to understand what's possible with data and analytics and then you couple them with a data scientist that knows how to do it, we see exponential impact. We just did a client project around customer attrition. The industry benchmark in customer attrition, as it was published, I won't name the company, but they had a 24 percent identification rate. We had a 59 percent. We two-X'd the number. Not because our data scientists are smarter or our tools are smarter but because our approach was to leverage and teach the business people how to think like a data scientist and they were able to identify variables and metrics they wanted to test. And when our data scientists tested them they said, "Oh my gosh, that's a very highly predictive variable." >> And trust what they said. >> And trust what they said, right. So, how do you build trust? On the data science side, you fail. You test, you fail, you test, you fail; you're never going to achieve 100 percent accuracy. But have you failed enough times that you feel comfortable and confident that the model is good enough? >> Well, what a great spirit of innovation that you're helping to bring there. Your keynote, we should mention, is tomorrow. >> That's right. >> So, if you're watching the livestream or you're in person, you can see Bill's keynote. Bill Schmarzo, CTO of Dell EMC, thank you for joining Peter and me. Great to have you on the show. A show where you can talk about the Warriors and Chipotle in one show. I've never seen it done, this is groundbreaking. Fantastic. >> Psycho donuts too. >> And psycho donuts, and now I'm hungry. (laughter) Thank you for watching this segment. Again, we are live on day one of the DataWorks Summit in San Francisco for Bill Schmarzo and Peter Burris, my co-host. I am Lisa Martin. Stick around, we will be right back. (music)
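The ROI argument Schmarzo makes in this conversation, that reusing the same data across use cases drives the cost denominator toward zero, can be sketched with a toy calculation. All dollar figures and use-case names below are invented for illustration; they are not from the interview.

```python
# Hedged sketch of the "data never depletes" ROI argument: the first use case
# pays the full cost of acquiring and landing a data set, while every later
# use case reuses it at near-zero marginal cost, so cumulative ROI takes off.
# All dollar figures are hypothetical.

acquisition_cost = 100_000      # one-time cost to land the data in the lake
marginal_reuse_cost = 1_000    # small incremental cost per added use case

# (use case, financial value) -- invented example numbers
use_cases = [
    ("customer acquisition", 250_000),
    ("customer retention",   180_000),
    ("cross-sell / upsell",  120_000),
    ("fraud detection",       90_000),
]

total_value = 0
total_cost = 0
for i, (name, value) in enumerate(use_cases):
    total_value += value
    # only the first use case bears the acquisition cost
    total_cost += acquisition_cost if i == 0 else marginal_reuse_cost
    roi = (total_value - total_cost) / total_cost
    print(f"{name:22s} cumulative ROI: {roi:6.2f}x")
```

Because the data set is paid for once and never "wears out," each added use case grows the numerator while barely touching the denominator, which is the super-linear effect described above.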

Published Date : Jun 13 2017



ENTITIES

| Entity | Category | Confidence |
| --- | --- | --- |
| Lisa Martin | PERSON | 0.99+ |
| Peter Burris | PERSON | 0.99+ |
| Peter | PERSON | 0.99+ |
| Bill Shmarzo | PERSON | 0.99+ |
| Michael Porter | PERSON | 0.99+ |
| Bill Schmarzo | PERSON | 0.99+ |
| Chipotle | ORGANIZATION | 0.99+ |
| three | QUANTITY | 0.99+ |
| Tom Peters | PERSON | 0.99+ |
| Golden State Warriors | ORGANIZATION | 0.99+ |
| 7.1 percent | QUANTITY | 0.99+ |
| San Jose | LOCATION | 0.99+ |
| Adam Smith | PERSON | 0.99+ |
| Silicon Valley | LOCATION | 0.99+ |
| Bill | PERSON | 0.99+ |
| five | QUANTITY | 0.99+ |
| 100 percent | QUANTITY | 0.99+ |
| 59 percent | QUANTITY | 0.99+ |
| University of San Francisco | ORGANIZATION | 0.99+ |
| two aspects | QUANTITY | 0.99+ |
| 24 percent | QUANTITY | 0.99+ |
| tomorrow | DATE | 0.99+ |
| Cole College | ORGANIZATION | 0.99+ |
| San Francisco | LOCATION | 0.99+ |
| today | DATE | 0.99+ |
| 1947 | DATE | 0.99+ |
| zero | QUANTITY | 0.99+ |
| DataWorks Summit | EVENT | 0.99+ |
| about 191 million dollars | QUANTITY | 0.98+ |
| one | QUANTITY | 0.98+ |
| Dell AMC | ORGANIZATION | 0.98+ |
| Cube | ORGANIZATION | 0.98+ |
| Dell EMC | ORGANIZATION | 0.97+ |
| first times | QUANTITY | 0.97+ |
| One | QUANTITY | 0.97+ |
| DataWorks Summit 2017 | EVENT | 0.97+ |
| day one | QUANTITY | 0.96+ |
| one show | QUANTITY | 0.96+ |
| four M's | QUANTITY | 0.92+ |
| zero cost | QUANTITY | 0.91+ |
| Hortonworks | ORGANIZATION | 0.91+ |
| NBA Championship | EVENT | 0.89+ |
| CTO | PERSON | 0.86+ |
| single barrel | QUANTITY | 0.83+ |
| The Cube | ORGANIZATION | 0.82+ |
| once | QUANTITY | 0.8+ |
| two X | QUANTITY | 0.75+ |
| three V | QUANTITY | 0.74+ |
| seven data sets | QUANTITY | 0.73+ |
| Number one | QUANTITY | 0.73+ |
| this morning | DATE | 0.67+ |
| double entry | QUANTITY | 0.65+ |
| Kafka | ORGANIZATION | 0.63+ |
| Spark | ORGANIZATION | 0.58+ |
| Hortonworks | PERSON | 0.55+ |
| III | ORGANIZATION | 0.46+ |
| Division | OTHER | 0.38+ |
| Ioda | ORGANIZATION | 0.35+ |
| American | OTHER | 0.28+ |

Greg Benson, SnapLogic | SnapLogic Innovation Day 2018


 

>> Narrator: From San Mateo, California, it's theCUBE, covering SnapLogic Innovation Day 2018. Brought to you by SnapLogic. >> Welcome back, Jeff Frick here with theCUBE. We're at the Crossroads, that's 92 and 101 in the Bay Area if you've been through it, you've had time to take a minute and look at all the buildings, 'cause traffic's usually not so great around here. But there's a lot of great software companies that come through here. It's interesting, I always think back to the Siebel Building that went up and now that's Rakuten, who we all know from the Warrior jerseys, the very popular Japanese retailer. But that's not why we're here. We're here to talk to SnapLogic. They're doing a lot of really interesting things; they have been in data, and now they're doing a lot of interesting things in integration. And we're excited to have a many-time CUBE alum. He's Greg Benson, let me get that title right, chief scientist at SnapLogic and of course a professor at the University of San Francisco. Greg, great to see you. >> Great to see you, Jeff. >> So I think the last time we saw you was at Fleet Forward. Interesting open-source project, data, ad moves. The open-source technologies and the technologies available for you guys to use just continue to evolve at a crazy breakneck speed. >> Yeah, it is. Open source in general, as you know, has really revolutionized all of computing, starting with Linux and what that's done for the world. And, you know, in one sense it's a boon, but it introduces a challenge, because how do you choose? And then even when you do choose, do you have the expertise to harness it? You know, the early social companies really leveraged off of Hadoop and Hadoop technology to drive their business and their objectives. And now we've seen a lot of that technology be commercialized and have a lot of service around it. And SnapLogic is doing that as well.
We help reduce the complexity and make a lot of this open-source technology available to our customers. >> So, I want to talk about a lot of different things. One of the things is Iris. So Iris is your guys' leverage of machine learning and artificial intelligence to help make integration easier. Did I get that right? >> That's correct, yeah. Iris is the umbrella term for everything that we do with machine learning and how we use it to enhance the user experience. And one way to think about it is when you're interacting with our product: we've made the SnapLogic designer a web-based UI, a drag-and-drop interface to construct these integration pipelines. We connect these things called Snaps. It's like building with Legos to build out these transformations on your data. And when you're doing that, when you're interacting with the designer, we would like to believe that we've made it one of the simplest interfaces to do this type of work, but even with that, there are many times you have to make decisions, like what type of transformation do you do next? How do you configure that transformation if you're talking to an Oracle database? How do you configure it? What are your credentials if you talk to SalesForce? If I'm doing a transformation on data, which fields do I need? What kind of operations do I need to apply to those fields? So as you can imagine, there's lots of situations as you're building out these data integration pipelines where you have to make decisions. And one way to think about Iris is that Iris is there to help reduce the complexity, help reduce the kinds of decisions you have to make at any point in time. So it's contextually aware of what you're doing at that moment in time, based on mining our thousands of existing pipelines and scenarios in which SnapLogic has been used. We leverage that to train models to help make recommendations so that you can speed through whatever task you're trying to do as quickly as possible.
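A minimal version of the next-step recommendation described here can be sketched as a simple bigram-frequency model over historical pipelines. To be clear, this is not SnapLogic's actual Iris implementation, which uses far richer models; the Snap names and training pipelines below are made up for illustration.

```python
# Hedged sketch: recommend the next pipeline step from historical sequences.
# A toy bigram-frequency model, not SnapLogic's actual Iris implementation;
# the Snap names and training pipelines are hypothetical.
from collections import Counter, defaultdict

training_pipelines = [
    ["File Reader", "CSV Parser", "Filter", "Mapper", "Redshift Write"],
    ["File Reader", "CSV Parser", "Mapper", "Snowflake Write"],
    ["Salesforce Read", "Filter", "Mapper", "Redshift Write"],
    ["File Reader", "JSON Parser", "Mapper", "Redshift Write"],
]

# Count which Snap historically follows which.
follows = defaultdict(Counter)
for pipeline in training_pipelines:
    for current, nxt in zip(pipeline, pipeline[1:]):
        follows[current][nxt] += 1

def recommend_next(current_snap, k=3):
    """Top-k most frequent next Snaps, given the one just placed."""
    return [snap for snap, _ in follows[current_snap].most_common(k)]

print(recommend_next("File Reader"))  # -> ['CSV Parser', 'JSON Parser']
print(recommend_next("CSV Parser"))   # -> ['Filter', 'Mapper']
```

Continuous retraining, as described later in the conversation, would amount to updating the `follows` counters as each new pipeline is saved, so the next user immediately benefits from what the last one built.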
>> It's such an important piece of information, because if I'm doing an integration project using the tool, I don't have the experience of the vast thousands and thousands, and actually you're doing now, what, a trillion document moves last month? I just don't have that expertise. You guys have the expertise, and truth be told, as unique as I think I am, and as unique as I think my business processes are, probably, a lot of them are pretty much the same as a lot of other people that are hooking up SalesForce to Oracle or hooking up Marketo to their CRM. So you guys have really taken advantage of that using the AI and ML to help guide me along, which is probably a pretty high-probability prediction of what my next move's going to be. >> Yeah, absolutely, and you know, back in the day, we used to consider, like, wizards or these sorts of things that would walk you through it. And really that was, it seemed intelligent, but it wasn't really intelligence or machine learning. It was really just hard-coded facts or heuristics that hopefully would be right for certain situations. The difference today is we're using real data, gigabytes of metadata that we can use to train our models. The nice thing about that is it's not hard-coded, it's adaptive. It's adaptive both for new customers but also for existing customers. We have customers that have hundreds of people that just use SnapLogic to get their business objectives done. And as they're building new pipelines, as they are putting in new expressions, we are learning that for them within their organization. So, like, their coworkers can come in the next day and get the advantages: all the intellectual work that was done to figure something out will be learned and then made available through Iris. >> Right. I love this idea of operationalizing machine learning and the augmented intelligence. So how do you apply it?
Don't just talk about it, don't give it a name of some dead smart person, but actually apply it to an application where you can start to see the benefit. And that's really what Iris is all about. So what's changed the most in the last year since you launched it? >> You know, one thing I'll say: The most interesting thing that we discovered when we first launched Iris, and I should say one of the first Iris technologies that we introduced was something called the integration assistant. And this was an assistant that would make recommendations of the next Snap as you're building out your pipeline, so the next transformation or the next connector. And before we launched it, we did lots of experimentation with different machine learning models. We did different training to get the best accuracy possible. And what we really thought was that this was going to be most useful for the new user, somebody who hasn't really used the product. And it turns out, when we looked at our data, and we looked at how it got used, it turns out that yes, new users did use it, but existing or very skilled users were using it just as much if not more, 'cause it turned out that it was so good at making recommendations that it was like a shortcut. Like, even if they knew the product really well, it's still actually a little more work to go through our catalog of 400 plus Snaps and pick something out when it's just sitting right there and saying, "Hey, the next thing you need to do," you don't even have to think. You just have to click, and it's right there. Then it just speeds up the expert user as well. That was an interesting sort of revelation about machine learning and our application of it. In terms of what's changed over the last year, we've done a number of things. Probably the biggest is operationalizing it, so that instead of training off of a snapshot, we're now training on a continuous basis, so that we get that adaptive learning that I was talking about earlier.
The other thing that we have done, and this is kind of getting into the weeds, we were using a decision tree model, which is a type of machine learning algorithm, and we switched to neural nets now, so now we use neural nets to achieve higher accuracy, and also a more adaptive learning experience. The neural net allowed us to bring in sort of like this organizational information so that your recommendations would be more tailored to your specific organization. The other thing we're just on the cusp of releasing is, in the integration assistant, we're working on sort of a, sort of, from beginning-to-end type recommendation, where you were kind of working forward. But what we found is, in talking to people in the field, and our customers who use the product, is there's all kinds of different ways that people interact with a product. They might know where they want the data to go, and then they might want to work backwards. Or they might know that the most important thing I need this to do is to join some data. So like when you're solving a puzzle with the family, you either work on the edges or you put some clumps in the middle and work to get to. And that puzzle-solving metaphor is where we're moving the integration assistant, so that you can fill in the pieces that you know, and then we help you work in any direction to make the puzzle complete. That's something that we've been adding to. We recently started recommending, based on your context, the most common sources and destinations you might need, but we're also about to introduce this idea of working backwards and then also working from the inside out.
>> We just had Gaurav on, and he's talking about the next iteration of the vision is to get to autonomous, to get to where the thing not only can guess what you want to do, has a pretty good idea, but it actually starts to basically do it for you, and I guess it would flag you if there's some strange thing or it needs an assistant, and really almost full autonomy in this integration effort. It's a good vision. >> I'm the one who has to make that vision a reality. The way I like to explain it is that customers or users have a concept of what they want to achieve. And that concept is a thought in their head, and the goal is how to get that concept or thought into something that is machine executable. What's the pathway to achieve that? Or if somebody's using SnapLogic for a lot of their organizational operations or for their data integration, we can start looking at what you're doing and make recommendations about other things you should or might be doing. So it's kind of like this two-way thing where we can give you some suggestions, but people also know what they want to do conceptually; how do we make that realizable as something that's executable? So I'm working on a number of research projects that are getting us closer to that vision. And one that I've been very excited about is we're working a lot with NLP, Natural Language Processing, like many companies and other products are investigating. For our use in particular is in a couple of different ways. To be sort of concrete, we've been working on a research project in which, rather than, you know, having to know the name of a Snap. 'Cause right now, you get this thing called a Snap catalog, and like I said, 400 plus Snaps. To go through the whole list, it's pretty long. You can start to type a name, and yeah, it'll limit it, but you still have to know exactly what that Snap is called.
What we're doing is we're applying machine learning in order to allow you to either speak or type what the intention is of what you're looking for. I want to parse a CSV file. Now, we have a file reader, and we have a CSV parser, but if you just typed, parse a CSV file, it may not find what you're looking for. But we're trying to take the human description and then connect that with the actual Snaps that you might need to complete your task. That's one thing we're working on. I have two more. The second one is a little bit more ambitious, but we have some preliminary work that demonstrates this idea of actually saying or typing what you want an entire pipeline to do. I might say I want to read data from SalesForce, I want to filter out only records from the last week, and then I want to put those records into Redshift. And if you were to just say or type what I just said, we would give you a pipeline that maybe isn't entirely complete, but working and allows you to evolve it from there. So you didn't have to go through all the steps of finding each individual Snap and connecting them together. So this is still very early on, but we have some exciting results. And then the last thing we're working on with NLP is, in SnapLogic, we have a nice view eye, and it's really good. A lot of the heavy lifting in building these pipelines, though, is in the actual manipulation of the data. And to actually manipulate the data, you need to construct expressions. And expressions in SnapLogic, we have a JavaScript expression language, so you have to write these expressions to do operations, right. One of our next goals is to use natural language to help you describe what you want those expressions to do and then generate those expressions for you. To get at that vision, we have to chisel. We have to break down the barriers on each one of these and then collectively, this will get us closer to that vision of truly autonomous integration. 
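The "describe what you want and find the Snap" idea can be approximated with simple token-overlap matching between a request and catalog descriptions. This is only an illustrative sketch under invented assumptions: the Snap catalog entries below are made up, and SnapLogic's actual NLP models are far more sophisticated than a Jaccard score.

```python
# Hedged sketch: match a natural-language request to the closest Snap by
# token overlap (Jaccard similarity). The catalog entries are hypothetical;
# this is not SnapLogic's actual Iris NLP implementation.

snap_catalog = {
    "File Reader":    "read a file from a filesystem or url",
    "CSV Parser":     "parse csv comma separated values data into records",
    "JSON Parser":    "parse json documents into records",
    "Filter":         "keep only records matching a condition",
    "Redshift Write": "write records to an amazon redshift table",
}

def tokenize(text):
    """Lowercase a phrase and split it into a set of word tokens."""
    return set(text.lower().replace(",", " ").split())

def find_snap(request):
    """Return the Snap whose description best overlaps the request."""
    req = tokenize(request)
    def score(item):
        desc = tokenize(item[1])
        return len(req & desc) / len(req | desc)   # Jaccard similarity
    return max(snap_catalog.items(), key=score)[0]

print(find_snap("parse csv data"))     # -> CSV Parser
print(find_snap("write to redshift"))  # -> Redshift Write
```

A real system would also need synonym handling and learned embeddings ("comma separated" vs. "CSV", "load" vs. "write"), which is exactly where trained language models earn their keep over raw token overlap.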
>> What's so cool about it, and again, you say autonomous and I can't help but think autonomous vehicles. We had a great interview, he said, if you have an accident in your car, you learn, the person you had an accident learns a little bit, and maybe the insurance adjuster learns a little bit. But when you have an accident in an autonomous vehicle, everybody learns, the whole system learns. That learning is shared orders of magnitude greater, to greater benefit of the whole. And that's really where you guys are sitting in this cloud situation. You've got all this integration going on with customers, you have all this translation and movement of data. Everybody benefits from the learning that's gained by everybody's participation. That's what is so exciting, and why it's such a great accelerator to how things used to be done before by yourself, in your little company, coding away trying to solve your problems. Very very different kind of paradigm, to leverage all that information of actual use cases, what's actually happening with the platform. So it puts you guys in a pretty good situation. >> I completely agree. Another analogy is, look, we're not going to get rid of programmers anytime soon. However, programming's a complex, human endeavor. However, the Snap pipelines are kind of like programs, and what we're doing in our domain, our space, is trying to achieve automated programming so that, you're right, as you said, learning from the experience of others, learning from the crowd, learning from mistakes and capturing that knowledge in a way that when somebody is presented with a new task, we can either make it very quick for them to achieve that or actually provide them with exactly what they need. So yeah, it's very exciting. >> So we're running out of time. Before I let you go, I wanted to tie it back to your professor job. How do you leverage that? How does that benefit what's going on here at SnapLogic? 
'Cause you've obviously been doing that for a long time, it's important to you. Bill Schmarzo, great fan of theCUBE, I deemed him the dean of big data a couple of years ago, he's now starting to teach. So there's a lot of benefits to being involved in academe, so what are you doing there in academe, and how does it tie back to what you're doing here in SnapLogic? >> So yeah, I've been a professor for 20 years at the University of San Francisco. I've long done research in operating systems and distributed systems, parallel computing programming languages, and I had the opportunity to start working with SnapLogic in 2010. And it was this great experience of, okay, I've done all this academic research, I've built systems, I've written research papers, and SnapLogic provided me with an opportunity to actually put a lot of this stuff in practice and work with real-world data. I think a lot of people on both sides of the industry academia fence will tell you that a lot of the real interesting stuff in computer science happens in industry because a lot of what we do with computer science is practical. And so I started off bringing in my expertise in working on innovation and doing research projects, which I continue to do today. And at USF, we happened to have a vehicle already set up. All of our students, both undergraduates and graduates, have to do a capstone senior project or master's project in which we pair up the students with industry sponsors to work on a project. And this is a time in their careers where they don't have a lot of professional experience, but they have a lot of knowledge. And so we bring the students in, and we carve out a project idea. And the students under my mentorship and working with the engineering team work toward whatever project we set up. Those projects have resulted in numerous innovations now that are in the product. The most recent big one is Iris came out of one of these research projects. >> Oh, it did? 
>> It was a machine learning project about, started around three years ago. We continuously have lots of other projects in the works. On the flip side, my experience with SnapLogic has allowed me to bring sort of this industry experience back to the classroom, both in terms of explaining to students and understanding what their expectations will be when they get out into industry, but also being able to make the examples more real and relevant in the classroom. For me, it's been a great relationship that's benefited both those roles. >> Well, it's such a big and important driver to what goes on in the Bay Area. USF doesn't get enough credit. Clearly Stanford and Cal get a lot, they bring in a lot of smart people every year. They don't leave, they love the weather. It is really a significant driver. Not to mention all the innovation that happens and cool startups that come out. Well, Greg thanks for taking a few minutes out of your busy day to sit down with us. >> Thank you, Jeff. >> All right, he's Greg, I'm Jeff. You're watching theCUBE from SnapLogic in San Mateo, California. Thanks for watching.

Published Date : May 21 2018

SUMMARY :

Brought to you by SnapLogic. and look at all the buildings, So I think the last time we see you was at Fleet Forward. And then even when you do choose, and artificial intelligence to help make integration easier. to help make recommendations so that you can So you guys have really taken advantage of that Yeah, absolutely, and you know, and the augmented intelligence. "Hey, the next thing you need to do," and I guess it would flag you if there's some strange thing and the goal is how to get that concept or thought the person you had an accident learns a little bit, and what we're doing in our domain, our space, and how does it tie back to of the industry academia fence will tell you that We continuously have lots of other projects in the works. and cool startups that come out. SnapLogic in San Mateo, California.

SENTIMENT ANALYSIS :

ENTITIES

EntityCategoryConfidence
JeffPERSON

0.99+

Bill SchmarzoPERSON

0.99+

Greg BensonPERSON

0.99+

Jeff FrickPERSON

0.99+

GregPERSON

0.99+

2010DATE

0.99+

StanfordORGANIZATION

0.99+

20 yearsQUANTITY

0.99+

SnapLogicORGANIZATION

0.99+

USFORGANIZATION

0.99+

San Mateo, CaliforniaLOCATION

0.99+

CalORGANIZATION

0.99+

Bay AreaLOCATION

0.99+

OneQUANTITY

0.99+

last weekDATE

0.99+

OracleORGANIZATION

0.99+

last yearDATE

0.99+

both sidesQUANTITY

0.99+

LegosORGANIZATION

0.99+

bothQUANTITY

0.99+

RakutenORGANIZATION

0.99+

thousandsQUANTITY

0.98+

two-wayQUANTITY

0.98+

101LOCATION

0.98+

last monthDATE

0.98+

400 plus SnapsQUANTITY

0.98+

LinuxTITLE

0.98+

IrisTITLE

0.98+

SnapLogic Innovation Day 2018EVENT

0.97+

firstQUANTITY

0.97+

oneQUANTITY

0.97+

second oneQUANTITY

0.97+

University of San FranciscoORGANIZATION

0.97+

SnapLogicTITLE

0.97+

todayDATE

0.96+

NLPORGANIZATION

0.95+

Siebel BuildingLOCATION

0.95+

SnapShotTITLE

0.95+

GauravPERSON

0.95+

hundreds of peopleQUANTITY

0.95+

Fleet ForwardORGANIZATION

0.94+

92LOCATION

0.93+

JavaScriptTITLE

0.93+

next dayDATE

0.92+

couple of years agoDATE

0.91+

one wayQUANTITY

0.9+

WarriorORGANIZATION

0.9+

each oneQUANTITY

0.87+

one thingQUANTITY

0.86+

SalesForceORGANIZATION

0.86+

MarkettaORGANIZATION

0.85+

each individualQUANTITY

0.84+

IrisPERSON

0.84+

IrisORGANIZATION

0.84+

Natural Language ProcessingORGANIZATION

0.83+

around three years agoDATE

0.81+

SnapORGANIZATION

0.79+

Greg Benson, SnapLogic | SnapLogic Innovation Day 2018


 

>> Narrator: From San Mateo, California, it's theCUBE, covering SnapLogic Innovation Day 2018. Brought to you by SnapLogic. >> Welcome back, Jeff Frick here with theCUBE. We're at the Crossroads, that's 92 and 101 in the Bay Area if you've been through it, you've had time to take a minute and look at all the buildings, 'cause traffic's usually not so great around here. But there are a lot of great software companies that come through here. It's interesting, I always think back to the Siebel Building that went up and now that's Rakuten, who we all know from the Warrior jerseys, the very popular Japanese retailer. But that's not why we're here. We're here to talk to SnapLogic. They're doing a lot of really interesting things, and they have been in data, and now they're doing a lot of interesting things in integration. And we're excited to have a many-time CUBE alum. He's Greg Benson, let me get that title right, chief scientist at SnapLogic and of course a professor at the University of San Francisco. Greg, great to see you. >> Great to see you, Jeff. >> So I think the last time we saw you was at Fleet Forward. Interesting open-source project, data, ad moves. The open-source technologies and the technologies available for you guys to use just continue to evolve at a crazy breakneck speed. >> Yeah, it is. Open source in general, as you know, has really revolutionized all of computing, starting with Linux and what that's done for the world. And, you know, in one sense it's a boon, but it introduces a challenge, because how do you choose? And then even when you do choose, do you have the expertise to harness it? You know, the early social companies really leveraged off of Hadoop and Hadoop technology to drive their business and their objectives. And now we've seen a lot of that technology be commercialized and have a lot of service around it. And SnapLogic is doing that as well. 
We help reduce the complexity and make a lot of this open-source technology available to our customers. >> So, I want to talk about a lot of different things. One of the things is Iris. So Iris is your guys' leverage of machine learning and artificial intelligence to help make integration easier. Did I get that right? >> That's correct, yeah. Iris is the umbrella terms for everything that we do with machine learning and how we use it to enhance the user experience. And one way to think about it is when you're interacting with our product, we've made the SnapLogic designer a web-based UI, drag-and-drop interface to construct these integration pipelines. We connect these things called Snaps. It's like building with Legos to build out these transformations on your data. And when you're doing that, when you're interacting with the designer, we would like to believe that we've made it one of the simplest interfaces to do this type of work, but even with that, there are many times we have to make decisions, like what type of transformation do you do next? How do you configure that transformation if you're talking to an Oracle database? How do you configure it? What's your credentials if you talk to SalesForce? If I'm doing a transformation on data, which fields do I need? What kind of operations do I need to apply to those fields? So as you can imagine, there's lots of situations as you're building out these data integration pipelines to make decisions. And one way to think about Iris is Iris is there to help reduce the complexity, help reduce what kind of decision you have to make at any point in time. So it's contextually aware of what you're doing at that moment in time, based on mining our thousands of existing pipelines and scenarios in which SnapLogic has been used. We leverage that to train models to help make recommendations so that you can speed through whatever task you're trying to do as quickly as possible. 
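The recommendation idea described here, mining thousands of existing pipelines to suggest a likely next step, can be sketched at its simplest as a transition-frequency model. This is only an illustrative stand-in, not SnapLogic's actual Iris implementation, and the Snap names below are invented:

```python
from collections import Counter, defaultdict

class NextSnapRecommender:
    """Toy next-step recommender: learns, from historical pipelines,
    which step tends to follow which, then ranks candidates by
    observed frequency for the step the user just placed."""

    def __init__(self):
        self.transitions = defaultdict(Counter)

    def train(self, pipelines):
        # Each pipeline is an ordered list of step names.
        for pipeline in pipelines:
            for current, following in zip(pipeline, pipeline[1:]):
                self.transitions[current][following] += 1

    def recommend(self, current_step, k=3):
        # Most frequently observed successors, best first.
        return [step for step, _ in self.transitions[current_step].most_common(k)]

history = [
    ["File Reader", "CSV Parser", "Filter", "Redshift Write"],
    ["File Reader", "CSV Parser", "Mapper", "Salesforce Write"],
    ["REST Get", "JSON Parser", "Filter", "Redshift Write"],
]
rec = NextSnapRecommender()
rec.train(history)
print(rec.recommend("CSV Parser"))  # → ['Filter', 'Mapper']
```

Because the model retrains as new pipelines arrive, it naturally adapts to each organization, which is the "adaptive, not hard-coded" property discussed above, even though the production system uses far richer context than a single previous step.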
>> It's such an important piece of information, because if I'm doing an integration project using the tool, I don't have the experience of the vast thousands and thousands, and actually you're doing now, what, a trillion document moves last month? I just don't have that expertise. You guys have the expertise, and truth be told, as unique as I think I am, and as unique as I think my business processes are, probably a lot of them are pretty much the same as a lot of other people that are hooking up SalesForce to Oracle or hooking up Marketo to their CRM. So you guys have really taken advantage of that, using the AI and ML to help guide me along, which is probably a pretty high-probability prediction of what my next move's going to be. >> Yeah, absolutely, and you know, back in the day, we used to consider, like, wizards or these sorts of things that would walk you through it. And really, that seemed intelligent, but it wasn't really intelligence or machine learning. It was really just hard-coded facts or heuristics that hopefully would be right for certain situations. The difference today is we're using real data, gigabytes of metadata that we can use to train our models. The nice thing about that is it's not hard-coded, it's adaptive. It's adaptive both for new customers but also for existing customers. We have customers that have hundreds of people that just use SnapLogic to get their business objectives done. And as they're building new pipelines, as they are putting in new expressions, we are learning that for them within their organization. So, like their coworkers, the next day, they can come in and get the advantages: all the intellectual work that was done to figure something out will be learned and then made available through Iris. >> Right. I love this idea of operationalizing machine learning and the augmented intelligence. So how do you apply it? 
Don't just talk about it, don't give it a name of some dead smart person, but actually apply it to an application where you can start to see the benefit. And that's really what Iris is all about. So what's changed the most in the last year since you launched it? >> You know, one thing I'll say: The most interesting thing that we discovered when we first launched Iris, and I should say one of the first Iris technologies that we introduced was something called the integration assistant. And this was an assistant that would make, make recommendations of the next Snap as you're building out your pipeline, so the next transformation or the next connector, and before we launched it, we did lots of experimentation with different machine learning models. We did different training to get the best accuracy possible. And what we really thought was that this was going to be most useful for the new user, somebody who hasn't really used the product and it turns out, when we looked at our data, and we looked at how it got used, it turns out that yes, new users did use it, but existing or very skilled users were using it just as much if not more, 'cause it turned out that it was so good at making recommendations that it was like a shortcut. Like, even if they knew the product really well, it's still actually a little more work to go through our catalog of 400 plus Snaps and pick something out when if it's just sitting right there and saying, "Hey, the next thing you need to do," you don't even have to think. You just have to click, and it's right there. Then it just speeds up the expert user as well. That was an interesting sort of revelation about machine learning and our application of it. In terms of what's changed over the last year, we've done a number of things. Probably the operationalizing it so that instead of training off of SnapShot, we're now training on a continuous basis so that we get that adaptive learning that I was talking about earlier. 
The other thing that we have done, and this is kind of getting into the weeds, we were using a decision tree model, which is a type of machine learning algorithm, and we switched to neural nets now, so now we use neural nets to achieve higher accuracy, and also a more adaptive learning experience. The neural net allowed us to bring in sort of like this organizational information so that your recommendations would be more tailored to your specific organization. The other thing we're just on the cusp of releasing is, in the integration assistant, we're working on sort of a, sort of, from beginning-to-end type recommendation, where you were kind of working forward. But what we found is, in talking to people in the field, and our customers who use the product, is there's all kinds of different ways that people interact with a product. They might know know where they want the data to go, and then they might want to work backwards. Or they might know that the most important thing I need this to do is to join some data. So like when you're solving a puzzle with the family, you either work on the edges or you put some clumps in the middle and work to get to. And that puzzle solving metaphor is where we're moving integration assistance so that you can fill in the pieces that you know, and then we help you work in any direction to make the puzzle complete. That's something that we've been adding to. We recently started recommending, based on your context, the most common sources and destinations you might need, but we're also about to introduce this idea of working backwards and then also working from the inside out. 
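A minimal sketch of the "working backwards" direction: if you learn which steps typically precede a given step, a user who only knows the destination can be offered likely predecessors. Again, this is a toy frequency model with invented step names, not the neural-net approach the product actually uses:

```python
from collections import Counter, defaultdict

def build_reverse_model(pipelines):
    """Learn which steps typically PRECEDE a given step, so that a
    user who only knows the destination can be offered likely
    predecessors and work backwards from there."""
    preceded_by = defaultdict(Counter)
    for pipeline in pipelines:
        for prev, nxt in zip(pipeline, pipeline[1:]):
            preceded_by[nxt][prev] += 1
    return preceded_by

history = [
    ["Salesforce Read", "Filter", "Redshift Write"],
    ["File Reader", "CSV Parser", "Filter", "Redshift Write"],
    ["REST Get", "JSON Parser", "Redshift Write"],
]
model = build_reverse_model(history)
# Starting from the known destination and working backwards:
print(model["Redshift Write"].most_common(2))  # → [('Filter', 2), ('JSON Parser', 1)]
```

Combining a forward model and a reverse model like this is one way to fill in a pipeline "puzzle" from the edges inward, in the spirit of the metaphor above.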
>> We just had Gaurav on, and he's talking about the next iteration of the vision is to get to autonomous, to get to where the thing not only can guess what you want to do, has a pretty good idea, but it actually starts to basically do it for you, and I guess it would flag you if there's some strange thing or it needs an assistant, and really almost full autonomy in this integration effort. It's a good vision. >> I'm the one who has to make that vision a reality. The way I like to explain is that customers or users have a concept of what they want to achieve. And that concept is as a thought in their head, and the goal is how to get that concept or thought into something that is machine executable. What's the pathway to achieve that? Or if somebody's using SnapLogic for a lot of their organizational operations or for their data integration, we can start looking at what you're doing and make recommendations about other things you should or might be doing. So it's kind of like this two-way thing where we can give you some suggestions but people also know what they want to do conceptually but how do we make that realizable as something that's executable. So I'm working on a number of research projects that is getting us closer to that vision. And one that I've been very excited about is we're working a lot with NLP, Natural Language Processing, like many companies and other products are investigating. For our use in particular is in a couple of different ways. To be sort of concrete, we've been working on a research project in which, rather than, you know, having to know the name of a Snap. 'Cause right now, you get this thing called a Snap catalog, and like I said, 400 plus Snaps. To go through the whole list, it's pretty long. You can start to type a name, and yeah, it'll limit it, but you still have to know exactly what that Snap is called. 
What we're doing is we're applying machine learning in order to allow you to either speak or type what the intention is of what you're looking for. I want to parse a CSV file. Now, we have a file reader, and we have a CSV parser, but if you just typed, parse a CSV file, it may not find what you're looking for. But we're trying to take the human description and then connect that with the actual Snaps that you might need to complete your task. That's one thing we're working on. I have two more. The second one is a little bit more ambitious, but we have some preliminary work that demonstrates this idea of actually saying or typing what you want an entire pipeline to do. I might say I want to read data from SalesForce, I want to filter out only records from the last week, and then I want to put those records into Redshift. And if you were to just say or type what I just said, we would give you a pipeline that maybe isn't entirely complete, but working, and allows you to evolve it from there. So you didn't have to go through all the steps of finding each individual Snap and connecting them together. So this is still very early on, but we have some exciting results. And then the last thing we're working on with NLP is, in SnapLogic, we have a nice UI, and it's really good. A lot of the heavy lifting in building these pipelines, though, is in the actual manipulation of the data. And to actually manipulate the data, you need to construct expressions. And expressions in SnapLogic, we have a JavaScript expression language, so you have to write these expressions to do operations, right. One of our next goals is to use natural language to help you describe what you want those expressions to do and then generate those expressions for you. To get to that vision, we have to chisel away. We have to break down the barriers on each one of these, and then collectively, this will get us closer to that vision of truly autonomous integration. 
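The Snap-search idea, matching a typed intent like "parse a CSV file" against a catalog, can be caricatured with simple bag-of-words overlap. A production system would use trained language models rather than this, and the catalog entries below are invented for illustration:

```python
def match_snaps(request, catalog):
    """Score each catalog entry by word overlap between the user's
    request and the entry's descriptive keywords, best match first.
    This bag-of-words version just shows the shape of the problem."""
    request_words = set(request.lower().split())
    scored = []
    for name, keywords in catalog.items():
        overlap = len(request_words & keywords)
        if overlap:
            scored.append((overlap, name))
    return [name for _, name in sorted(scored, reverse=True)]

catalog = {
    "File Reader": {"read", "file", "open"},
    "CSV Parser": {"parse", "csv", "delimited"},
    "Redshift Write": {"write", "redshift", "load"},
}
print(match_snaps("parse a csv file", catalog))  # → ['CSV Parser', 'File Reader']
```

Note that the literal-name lookup described above would miss "parse a CSV file" unless the Snap were named exactly that; matching on descriptive keywords (or, in the real system, on learned representations of meaning) is what closes that gap.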
>> What's so cool about it, and again, you say autonomous and I can't help but think autonomous vehicles. We had a great interview, he said, if you have an accident in your car, you learn, the person you had an accident learns a little bit, and maybe the insurance adjuster learns a little bit. But when you have an accident in an autonomous vehicle, everybody learns, the whole system learns. That learning is shared orders of magnitude greater, to greater benefit of the whole. And that's really where you guys are sitting in this cloud situation. You've got all this integration going on with customers, you have all this translation and movement of data. Everybody benefits from the learning that's gained by everybody's participation. That's what is so exciting, and why it's such a great accelerator to how things used to be done before by yourself, in your little company, coding away trying to solve your problems. Very very different kind of paradigm, to leverage all that information of actual use cases, what's actually happening with the platform. So it puts you guys in a pretty good situation. >> I completely agree. Another analogy is, look, we're not going to get rid of programmers anytime soon. However, programming's a complex, human endeavor. However, the Snap pipelines are kind of like programs, and what we're doing in our domain, our space, is trying to achieve automated programming so that, you're right, as you said, learning from the experience of others, learning from the crowd, learning from mistakes and capturing that knowledge in a way that when somebody is presented with a new task, we can either make it very quick for them to achieve that or actually provide them with exactly what they need. So yeah, it's very exciting. >> So we're running out of time. Before I let you go, I wanted to tie it back to your professor job. How do you leverage that? How does that benefit what's going on here at SnapLogic? 
'Cause you've obviously been doing that for a long time, it's important to you. Bill Schmarzo, great fan of theCUBE, I deemed him the dean of big data a couple of years ago, he's now starting to teach. So there's a lot of benefits to being involved in academe, so what are you doing there in academe, and how does it tie back to what you're doing here in SnapLogic? >> So yeah, I've been a professor for 20 years at the University of San Francisco. I've long done research in operating systems and distributed systems, parallel computing programming languages, and I had the opportunity to start working with SnapLogic in 2010. And it was this great experience of, okay, I've done all this academic research, I've built systems, I've written research papers, and SnapLogic provided me with an opportunity to actually put a lot of this stuff in practice and work with real-world data. I think a lot of people on both sides of the industry academia fence will tell you that a lot of the real interesting stuff in computer science happens in industry because a lot of what we do with computer science is practical. And so I started off bringing in my expertise in working on innovation and doing research projects, which I continue to do today. And at USF, we happened to have a vehicle already set up. All of our students, both undergraduates and graduates, have to do a capstone senior project or master's project in which we pair up the students with industry sponsors to work on a project. And this is a time in their careers where they don't have a lot of professional experience, but they have a lot of knowledge. And so we bring the students in, and we carve out a project idea. And the students under my mentorship and working with the engineering team work toward whatever project we set up. Those projects have resulted in numerous innovations now that are in the product. The most recent big one is Iris came out of one of these research projects. >> Oh, it did? 
>> It was a machine learning project that started around three years ago. We continuously have lots of other projects in the works. On the flip side, my experience with SnapLogic has allowed me to bring sort of this industry experience back to the classroom, both in terms of explaining to students and understanding what their expectations will be when they get out into industry, but also being able to make the examples more real and relevant in the classroom. For me, it's been a great relationship that's benefited both those roles. >> Well, it's such a big and important driver to what goes on in the Bay Area. USF doesn't get enough credit. Clearly Stanford and Cal get a lot, they bring in a lot of smart people every year. They don't leave, they love the weather. It is really a significant driver. Not to mention all the innovation that happens and cool startups that come out. Well, Greg, thanks for taking a few minutes out of your busy day to sit down with us. >> Thank you, Jeff. >> All right, he's Greg, I'm Jeff. You're watching theCUBE from SnapLogic in San Mateo, California. Thanks for watching.

Published Date : May 18 2018



Prakash Nanduri, Paxata | BigData NYC 2017


 

>> Announcer: Live from midtown Manhattan, it's theCUBE covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. (upbeat techno music) >> Hey, welcome back, everyone. Here live in New York City, this is theCUBE from SiliconANGLE Media Special. Exclusive coverage of the Big Data World at NYC. We call it Big Data NYC in conjunction also with Strata Hadoop, Strata Data, Hadoop World all going on kind of around the corner from our event here on 37th Street in Manhattan. I'm John Furrier, the co-host of theCUBE with Peter Burris, Head of Research at SiliconANGLE Media, and General Manager of WikiBon Research. And our next guest is one of our famous CUBE alumni, Prakash Nanduri, co-founder and CEO of Paxata, who launched his company here on theCUBE at our inaugural Big Data NYC event in 2013. Great to see you. >> Great to see you, John.
Ever since, we have gone on to do something really exciting and new for our customers every year. In '14, we came in with the first Apache Spark-based platform that allowed business analysts to do data preparation at scale interactively. Every year since, last year we did enterprise grade and we talked about how Paxata is going to be delivering our self-service data preparation solution in a highly-scalable enterprise grade deployment world. This year, what's super exciting is in addition to the recent announcements we made on Paxata running natively on the Microsoft Azure HDI Spark system. We are truly now the only information platform that allows business consumers to turn data into information in a multi-cloud hybrid world for our enterprise customers. In the last few years, I came and I talked to you and I told you about work we're doing and what great things are happening. But this year, in addition to the super-exciting announcements with Microsoft and other exciting announcements that you'll be hearing. You are going to hear directly from one of our key anchor customers, Standard Chartered Bank. 150-year-old institution operating in over 46 countries. One of the most storied banks in the world with 87,500 employees. >> John: That's not a start up. >> That's not a start up. (John laughs) >> They probably have a high bar, high bar. They got a lot of data. >> They have lots of data. And they have chosen Paxata as their information fabric. We announced our strategic partnership with them recently and you know that they are going to be speaking on theCUBE this week. And what started as a little experiment, just like our experiment in 2013, has actually mushroomed now into Michael Gorriz, and Shameek Kundu, and the entire leadership of Standard Chartered choosing Paxata as the platform that will democratize information in the bank across their 87,500 employees. We are going in a very exciting way, a very fast way, and now delivering real value to the bank. 
And you can hear all about it on our website-- >> Well, he's coming on theCUBE so we'll drill down on that, but banks are changing. You talk about a transformation. What is a teller? An Internet of Things device. The watch potentially could be a terminal. So, the Internet of Things of people changes the game. Are the ATMs going to go away and become like broadcast points? >> Prakash: And you're absolutely right. And really what it is about is, it doesn't matter if you're a Standard Chartered Bank or if you're a pharma company or if you're the leading healthcare company, what it is is that everyone of our customers is really becoming an information-inspired business. And what we are driving our customers to is moving from a world where they're data-driven. I think being data-driven is fine. But what you need to be is information-inspired. And what does that mean? It means that you need to be able to consume data, regardless of format, regardless of source, regardless of where it's coming from, and turn it into information that actually allows you to get inside in decisions. And that's what Paxata does for you. So, this whole notion of being information-inspired, I don't care if you're a bank, if you're a car company, or if you're a healthcare company today, you need to have-- >> Prakash, for the folks watching that might not know our history as you launched on theCUBE in 2013 and have been successful every year since. You guys have really deploying the classic entrepreneurial success formula, be fast, walk the talk, listen to customers, add value. Take a minute quickly just to talk about what you guys do. Just for the folks that don't know you. >> Absolutely, let's just actually give it in the real example of you know, a customer like Standard Chartered. Standard Chartered operates in multiple countries. They have significant number of lines of businesses. 
And whether it's in risk and compliance, whether it is in their marketing department, whether it's in their corporate banking business, what they have to do is, a simple example could be: I want to create a customer list to be able to go and run a marketing campaign. And the customer list in a particular region is not something easy for a bank like Standard Chartered to come up with. They need to be able to pull from multiple sources. They need to be able to clean the data. They need to be able to shape the data to get that list. And if you look at what is really important, the people who understand the data are actually not the folks in IT but the folks in business. So, they need to have a tool and a platform that allows them to pull data from multiple sources, to be able to massage it, to be able to clean it-- >> John: So, you sell to the business person? >> We sell to the business consumer. The business analyst is our consumer. And the person who supports them is the chief data officer and the person who runs the Paxata platform on their data lake infrastructure. >> So, IT sets up the data lake and you guys just let the business guys go to town on the data. >> Prakash: Bingo. >> Okay, what's the problem that you solve? If you can summarize the problem that you solve for the customers, what is it? >> We take data and turn it into information that is clean, that's complete, that's consumable and that's contextual. The hardest problem in every analytical exercise is actually taking data and cleaning it up and getting it ready for analytics. That's what we do. >> It's the prep work. >> It's the prep work. >> As companies gain experience with Big Data, John, what they need to start doing increasingly is move more of the prep work, or have more of the prep work flow, closer to the analyst. And the reason's actually pretty simple. It's because of that context.
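The preparation workflow Prakash describes above, pulling records from multiple sources, cleaning them, and shaping a campaign-ready list, can be sketched in a tool-agnostic way. This is a hypothetical illustration, not Paxata's product; the field names (`email`, `region`, `segment`) and source shapes are invented for the example.

```python
# Hypothetical sketch of self-service data prep: pull from two sources,
# normalize the join key, clean bad rows, and shape the output columns.

def build_campaign_list(accounts, crm):
    """Join two customer sources on email, clean, and shape a list."""
    # Index the CRM source by a normalized email so records line up.
    by_email = {}
    for rec in crm:
        by_email[rec["email"].strip().lower()] = rec

    out, seen = [], set()
    for rec in accounts:
        email = rec["email"].strip().lower()
        match = by_email.get(email)
        # Clean: skip unmatched rows, rows missing a region, and duplicates.
        if match is None or not rec.get("region") or email in seen:
            continue
        seen.add(email)
        # Shape: keep only the fields the campaign needs.
        out.append({"email": email, "region": rec["region"],
                    "segment": match["segment"]})
    return out

accounts = [
    {"email": "A@x.com ", "region": "APAC"},
    {"email": "b@x.com", "region": "EMEA"},
    {"email": "c@x.com", "region": None},
]
crm = [
    {"email": "a@x.com", "segment": "retail"},
    {"email": "b@x.com", "segment": "corporate"},
]
print(build_campaign_list(accounts, crm))
```

The point of the business-consumer pitch is that analysts get this merge-clean-shape loop interactively instead of writing code like the above or waiting on IT.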
Because the analyst knows more about what they're looking for and is a better evaluator of whether or not they get what they need. Otherwise, you end up in this strange cycle-time problem between people on the back end who are trying to generate the data that they think they want. And so, by making the whole concept of data preparation simpler, more straightforward, you're able to have the people who actually consume the data and need it do a better job of articulating what they need and how they need it, and making it presentable to the work that they're performing. >> Exactly, Peter. What does that say about how roles are starting to merge together? 'Cause you've got to be at the vanguard of seeing how some of these mature organizations are working. What do you think? Are we seeing roles start to become more aligned? >> Yes, I do think so. First and foremost, I think what's happening is there is no such thing as having just one group that's doing data science and another group consuming. I think what you're going to be going into is a world where data and information are all-consuming, and that's everybody's role. Everybody has a role in that. And everybody's going to consume. So, if you look at a business analyst that was spending 80% of their time living in Excel or working with self-service BI tools like our partners' Tableau and Power BI from Microsoft, and others. What you find is these people today are living in a world where either they have to live in coding and scripting hell or they have to rely on IT to get them the real data. So, for the role of a business analyst or a subject matter expert, first and foremost, the fact that they work with data and they need information, that's a given. There is no business role today where you don't deal with data.
And they're very, very reliant on these people to turn that data into something that is regarded as consumable elsewhere. So, you're trying to make them much more productive. >> Exactly. So, four years ago, when we launched on theCUBE, the whole premise was that in order to be able to really drive towards a world where you can make information- and data-driven decisions, you need to ensure that the business analyst community, or what I like to call the business consumer, needs to have the power of being able to, A, get access to data, B, make sense of the data, and then turn that data into something that's valuable for her or for him. >> Peter: And others. >> And others, and others. Absolutely. And that's what Paxata is doing. In a collaborative, 21st Century world where I don't work in a silo, I work collaboratively. And the tool and the platform that helps me do that is actually a 21st Century platform. >> So, John, at the beginning of the session you and Jim were talking about what is going to be one of the themes here at the show. And we observed that it used to be that people were talking about setting up the hardware, setting up the clusters, getting Hadoop to work, and Jim talked about going up the stack. Well, this is one of the indicators that, in fact, people are starting to go up the stack, because they're starting to worry more about the data, what it can do, the value of how it's going to be used, and how we distribute more of that work so that we get more people using data that's actually good and useful to the business. >> John: And drives value. >> And drives value. >> Absolutely. And if I may, just to put a chronological aspect to this: when we launched the company we said the business analyst needs to be in charge of the data and turning the data into something useful. Then right at that time, the world of creating data lakes came in thanks to our partners like Cloudera, Hortonworks, MapR, and others.
In the recent past, the move from on-premise data lakes to hybrid, multicloud data lakes has become reality. Our partners at Microsoft, at AWS, and others are having customers come in and build cloud-based data lakes. So, today what you're seeing is, on one hand, this complete democratization within the business, like at Standard Chartered, where all these business analysts are getting access to data. And on the other hand, the data infrastructure moving into a hybrid multicloud world. And what you need is a 21st Century information management platform that serves the needs of the business and makes that data relevant, and information, and ready for their consumption. While at the same time we should not forget that enterprises need governance. They need lineage. They need scale. They need to be able to move things around depending on what their business needs are. And that's what Paxata is driving. That's why we're so excited about our partnership with Microsoft, with AWS, with our customer partnerships such as Standard Chartered Bank, rolling this out in an enterprise-- >> This is the democratization that you were referring to with your customers. We see this-- >> Everywhere. >> When you free the data up, good things happen, but you don't want to have IT be the constraint, you want to let them enable-- >> Peter: And IT doesn't want to be the constraint. >> They don't. >> This is one of the biggest problems that they have on a daily basis. >> They're happy to let it go free as long as it's, in their minds, DevOps-related; this is cool for them. >> Well, they're happy to let it go with policy and security in place. >> Our customers, our most strategic customers, the folks who are running the data lakes, the folks who are managing the data lakes, they are the first ones that say that we want business to be able to access this data, and to be able to go and make use of this data in the right way for the bank.
And not have us be the impediment, not have us be the roadblock. While at the same time we still need governance. We still need security. We still need all those things that are important for a bank or a large enterprise. That's what Paxata is delivering to the customers. >> John: So, what's next? >> Peter: Oh, I'm sorry. >> So, really quickly. An interesting observation. People talk about data being the new fuel of business. That really doesn't work because, as Bill Schmarzo says, it's not the new fuel of business, it's the new sunlight of business. And the reason why is because fuel can only be used once. >> Prakash: That's right. >> The whole point of data is that it can be used a lot, in a lot of different ways, and in a lot of different contexts. And so, in many respects, what we're really trying to facilitate, or when someone in the business asks someone who runs a data lake, "Well, how do you create value for the business?" The more people, the more users, the more contexts that they're serving out of that common data, the more valuable the resource that they're administering. So, they want to see more utilization, more contexts, more data being moved out. But again, governance and security have to be in place. >> You bet, you bet. And using that analogy of data, I've heard this term about data being the new oil, etc. Well, if data is the oil, information is really the refined fuel, or sunlight as we like to call it. >> Peter: Yeah. >> John: Well, you're riffing on semantics, but the point is it's not a one-trick pony. Data is part of the development. I wrote a blog post in 1997, I mean 2007, that said data's the new development kit. And it was kind of riffing on this notion of the old days. >> Prakash: You bet. >> Here's your development kit, SDK, or whatever, was how people did things back then. Enter the cloud. >> Prakash: That's right. >> And boom, there it is. The data now is in the refinery process the developers wanted.
The developers want the data libraries. Whatever that means. That's where I see it. And that is the democratization where data is available to be integrated into apps, into feeds, into ... >> Exactly, and so it brings me to our point about the exciting new product innovation announcement we made today about Intelligent Ingest. You want to be able to access data in the enterprise regardless of where it is, regardless of the cloud where it's sitting, regardless of whether it's on-premise or in the cloud. You don't need to, as a business, worry about whether that is a JSON file or whether that's an XML file or that's a relational file. That's irrelevant. What you want is: do I have access to the right data? Can I take that data, can I turn it into something valuable, and then can I make a decision out of it? I need to do that fast. At the same time, I need to have the governance and security, all of that. That's, at the end of the day, the objective that our customers are driving towards. >> Prakash, thanks so much for coming on and being a great member of our community. >> Fantastic. >> You're part of our smart network of great people out there and the entrepreneurial journey continues. >> Yes. >> Final question. Just an observation. As you pinch yourself and you go down the journey, you guys are walking the talk, adding new products. We're in a global landscape. You're seeing a lot of new stuff happening. Customers are trying to stay focused. A lot of distractions, whether security or data or app development. What's your state of the industry? How do you view the current market, from your perspective, and also how might the customer see it from their impact? >> Well, the first thing is that I think in the last four years we have seen significant maturity both among the providers of software technology and solutions, and also amongst the customers.
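The "don't worry about the format" idea behind Intelligent Ingest, described above, can be sketched generically: sniff the incoming payload, route it to the right parser, and hand back uniform records either way. This is a toy illustration under invented assumptions, not Paxata's implementation; a real ingest layer would also handle XML, relational sources, encodings, and schema inference.

```python
# Illustrative format-agnostic ingest: JSON and CSV payloads both come
# back as the same list-of-dicts record shape.
import csv
import io
import json

def ingest(payload: str):
    """Return a list of dict records from JSON or CSV text."""
    text = payload.strip()
    if text.startswith(("[", "{")):  # looks like JSON
        data = json.loads(text)
        return data if isinstance(data, list) else [data]
    # Fall back to CSV with a header row.
    return list(csv.DictReader(io.StringIO(text)))

json_src = '[{"id": "1", "amount": "42"}]'
csv_src = "id,amount\n2,17"
records = ingest(json_src) + ingest(csv_src)
print(records)  # both sources arrive in the same record shape
```

The business consumer only ever sees the uniform records; the dispatch-by-format step is exactly the part the platform is meant to hide.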
I do think that going forward, what is really going to make a difference is, one, really driving towards business outcomes by leveraging data. We've talked about a lot of this over the last few years. What real business outcomes are you delivering? What we are super excited about is when we see our customers, each one of them, actually subscribe to Paxata, we're a SaaS company, they subscribe to Paxata not because they're doing a science experiment but because they're trying to deliver real business value. What is that? Whether that is a risk and compliance solution which is going to drive towards real cost savings. Or whether that's a top-line benefit because they know what their customer 360 is and how they can go and serve their customers better, or how they can improve supply chains, or how they can optimize their entire efficiency in the company. I think if you take it from that lens, what is going to be important right now, as there's lots of new technologies coming in, is how is it going to drive towards those top three business drivers that I have today for the next 18 months? >> John: So, that's foundational. >> That's foundational. Those are the building blocks-- >> That's what is happening. Don't jump... If you're a customer, it's great to look at new technologies, etc. There's always innovation projects-- >> R&D, POCs, whatever. Kick the tires. >> But now, if you are really going to talk the talk about saying I'm going to be, call your word, data-driven, information-driven, whatever it is. If you're going to talk the talk, then you better walk the walk by delivering the real kind of tools and capabilities that your business consumers can adopt. And they better adopt that fast. If they're not up and running in 24 hours, something is wrong. >> Peter: Let me ask one question before you close, John.
So, your argument, which I agree with, suggests that one of the big changes in the next 18 months to three years, as this whole thing matures and gets more consistent in its application of the value that it generates, is that we're going to see an explosion in the number of users of these types of tools. >> Prakash: Yes, yes. >> Correct? >> Prakash: Absolutely. >> 2X, 3X, 5X? What do you think? >> I think we're just at the cusp. I think it's going to grow at least 10X and beyond. >> Peter: In the next two years? >> In the next, I would give it the next three to five years. >> Peter: Three to five years? >> Yes. And we're on the journey. We're just at the tip of the high curve taking off. That's what I feel. >> Yeah, and there's going to be a lot more consolidation. You're going to start to see people who are winning. It's becoming clear as the fog lifts. It's a cloud game, a scale game. It's democratization, community-driven. It's open source software. Just solve problems, outcomes. I think outcomes are going to come much faster. I think outcomes as a service will be a model that we'll probably be talking about in the future. You know, real-time outcomes. Not eight-month projects or year-long projects. >> Certainly, we started writing research about outcome-based management. >> Right. >> Wikibon Research... Prakash, one more thing? >> I also just want to say that in addition to this business outcome thing, I think in the last five years I've seen a lot of shift in our customers' world, where the initial excitement about analytics, predictive, AI, machine learning to get to outcomes has all come to the realization that none of that is possible if you're not able to first get a grip on your data, and then be able to turn that data into something meaningful that can be analyzed. So, that is also a major shift. That's why you're seeing the growth we're seeing-- >> John: 'Cause it's really hard. >> Prakash: It's really hard. >> I mean, it's a cultural mindset. You have the personnel.
It's an operational model. I mean this is not like, throw some pixie dust on it and it magically happens. >> That's why I say, before you go into any kind of BI, analytics, or AI initiative, stop, think about your information management strategy. Think about how you're going to democratize information. Think about how you're going to get governance. Think about how you're going to enable your business to turn data into information. >> Remember, you can't do AI without IA. You can't do AI without information architecture. >> There you go. That's a great point. >> And I think this all points to why Wikibon's research and all the analysts got it right with true private cloud, because people have got to take care of their business here to have a foundation for the future. And you can't just jump to the future. There's too much to just come in and use at scale, too many cracks in the foundation. You've got to take your medicine now. And do the homework and lay down a solid foundation. >> You bet. >> All right, Prakash. Great to have you on theCUBE. Again, congratulations. And again, it's great for us. I totally get a great vibe when I see you. Thinking about how you launched on theCUBE in 2013, and how far you continue to climb. Congratulations. >> Thank you so much, John. Thanks, Peter. That was fantastic. >> All right, live coverage continuing day one of three days. It's going to be a great week here in New York City. Weather's perfect and all the players are in town for Big Data NYC. I'm John Furrier with Peter Burris. Be back with more after this short break. (upbeat techno music)

Published Date : Sep 27 2017



Nathan Trueblood, DataTorrent | CUBEConversations


 

(techno music) >> Hey welcome back everybody, Jeff Frick here with The CUBE. We're having a cube conversation in the Palo Alto studio. It's a different kind of format of CUBE. Not in the context of a big show. Got a great guest here lined up who we just had on at a show recently. He's Nathan Trueblood, he's the vice president of product management for DataTorrent. Nathan great to see you. >> Thanks for having me. >> We just had you on The CUBE at Hadoop, or Data Works now, >> That's right. >> not Hadoop Summit anymore. So just a quick follow up on that, we were just talking before we turned the cameras on. You said that was a pretty good show for you guys. >> Yeah it was a really great show. In fact as a software company one of the things you really want to see at shows is a lot of customer flow and a lot of good customer discussions, and that's definitely what happened at Data Works. It was also really good validation for us that everyone was coming and talking to us about what can you do from a real time analytics perspective? So that was also a good strong signal that we're onto something in this marketplace. >> It's interesting, I heard your quote from somewhere, that really the streaming and the real time streaming in the big data space is really grabbing all the attention. Obviously we do Spark Summit. We did Flink Forward. So we're seeing more and more activity around streaming and it's so logical that now that we have the compute horsepower, the storage horsepower, the networking horsepower, to enable something that we couldn't do very effectively before but now it's opening up a whole different way to look at data. >> Yeah it really is and I think as someone who's been working the tech world for a while, I'm always looking for simplifying ways to explain what this means. 'Cause people say streaming and real time and all of that stuff. 
For us what it really comes down to is: the faster I can make decisions, or the closer to when something happens I can make a decision, that gives me competitive advantage. And so if you look at the whole big data evolution, it's always been towards how quickly can we analyze this data so that we can respond to what it's telling us? And in many ways that means being more responsive to my customer. So a lot of this came out of course originally from very large scale systems at some of the big internet companies like Yahoo, where Hadoop was born. But really it all comes down to: if I'm more responsive to my customer, I'm more competitive and I win. And I think what a lot of customers are saying across many different verticals is real time means more responsiveness, and that means competitive advantage. >> Right, and we hear all the time about moving into a predictive model, and then even to a prescriptive model, where you're offloading a lot of the grunt work of the decision making, letting the machine do a lot more of that, and so really it's the higher value stuff that finally gets to the human at the end of the interaction who's got to make a judgment. >> That's exactly right, that's right. And so to me all the buzz about streaming is really representative of just, this is now the next evolution of where big data architecture has been going, which is towards moving away from a batch-oriented world into something where we're making decisions as close to the time of data creation as possible. >> So you've been involved in not only tech for a long time but Hadoop specifically and Big Data specifically. And one of the knocks, I remember the first time I ever heard about Hadoop, was actually from Bill Schmarzo at EMC, the dean of Big Data. And I was talking to a friend about it and he goes, yeah, but what Bill didn't tell you: there's not enough people.
You know Hadoop's got all this great promise, there just aren't enough people for all the enterprises, at the individual company level, to implement this stuff. A huge part of the problem. And now you're at DataTorrent and, as we talked about before, there's an interesting shift in strategy, going to really an application-focused strategy as opposed to more of a platform-focused strategy, so that you can help people at companies solve problems faster. >> That's right, we've definitely focused, especially recently, on more of an application strategy. But to kind of peel that back a little bit, you need a platform with all the capabilities that a platform has to be able to deliver large-scale, operable streaming analytics. But customers aren't looking for platforms, they're looking for: please solve my business problem, give me that competitive advantage. I think it's a long-standing problem in technology, and particularly in Big Data, where you build a tremendous platform but there's only a handful of people who know how to actually construct the applications to deliver that value. And I think increasingly in big data, but also across all of tech, customers are looking for outcomes now, and the way for us to deliver outcomes is to deliver applications that run on our platform. So we've built a tremendous platform and now we are working with customers and delivering applications for that platform so that it takes a lot of the complexity out of the equation for them. And we kind of think of it like, if in the past it required sort of an architect-level person in order to construct an application on our platform, now we're gearing towards a much larger segment of developers in the enterprise who are tremendously capable but don't have that deep Big Data experience that they need to build an application from scratch.
>> And it's pretty interesting too 'cause another theme we see over and over and over and over, especially around the innovation theme is the democratization of the access to the data, the democratization of the tools to access the data so that anyone in the company or a much greater set of individuals inside the company have the opportunity to have a hypothesis, to explore the hypothesis, to come back with solutions. And so by kind of removing this ivory tower, either the data scientists or the super smart engineer who's the only one that has the capability to play with the data and the tools. That's really how you open up innovation is democratizing access and ability to test and try things. >> That's right, to me I look at it very simply, when you have large scale adoption of a technology, usually it comes down to simplifying abstractions of one kind or another. And the big simplifying abstraction really of Big Data is providing the ability to break up a huge amount of data and make some sense of it, using of course large scale distributed computing. The abstraction we're delivering at DataTorrent now is building on all that stuff, on all those layers, we've obscured all of that and now you can download with our software an application that produces an outcome. So for example one of the applications we're shipping shortly is a Omni-Channel credit card fraud prevention application. Now our customers in the past have already constructed applications like this on our platform. But now what we're doing like you said is democratizing access to those kinds of applications by providing an application that works out of the box. And that's a simplifying abstraction. Now truthfully there's still a lot of complexity in there but we are providing the pattern, the foundational application that then the customer can focus on customizing to their particular situation, their integrations, their fraud rules and so forth. 
And so that just means getting you closer to that outcome much more quickly. >> Watching your video from Data Works, one of the interesting topics you brought up is really speed, and how faster, better, cheaper, which is innovative for a little while, becomes the new norm. And as soon as you reset the bar on speed, then they just want it, well, can you go faster? So whether you went from a week to a day, a day to an hour, there's just this relentless pressure to be able to get the data, analyze the data, make a decision faster and faster and faster. And you've seen this just changing by leap years right over time. >> Right, and I literally started my career in the days of ETL, extracting data from tape, that was data produced weeks or months ago, down to now we're analyzing data at volumes that were inconceivable and producing insight in less than a second, which is kind of mind boggling. And I think the interesting thing that's happening when we think about speed, and I've had a few discussions with other folks about this, they say well, speed really only matters for some very esoteric applications. It's one of the things that people bring up. But no one has ever said, well, I wish my data was less fresh or my insight was not as current. And so when you start to look at the kinds of customers that want to bring real time data processing and analytics, it turns out that nearly every vertical that we look at has a whole host of applications where if you could bring real time analytics you could be more responsive to what your customer's doing. >> Right right. >> Right, and that can be, certainly that's the case in retail, but we see it in industrial automation and IoT. The way I think of IoT is as a way to sense what's going on in the world, bring that data in, get insight and take action from it. And so real time analytics is a huge part of that, which, you know, again, healthcare, insurance, banking, all these different places have use cases.
And so what we're aiming to do at DataTorrent is make it easy for the businesses in those different verticals to really get the outcome they're looking for, not produce a platform and say imagine what you could do, but produce an application that actually delivers on a particular problem they have. >> It's funny too, the speed equation. You saw it in Flash, to shift gears a little bit into the hardware space, right? People said, well, it's only super low latency, super high volume transactions, financial services, that's the only benefit we're going to get from Flash.
That's an important distinction there because the distinction between detection and prevention is the competitive advantage of real time. Because what we deliver in DataTorrent is the ability to process massive amounts of data in very very low time frame. Sub seconds time frames. And so that's the kind of fundamental capability you need in order to do something like respond to some kind of fraud event. And what we see in the market is that fraud is becoming a greater and greater problem. The market itself is expanding. But I think as we see fraud is also evolving in terms of the ways it can take place across e-commerce and point of sale and so forth. And so merchants and processors and everyone in the whole spectrum of that market is facing a massive problem and an evolving problem. And so that's where we're focused in one of our first I would say vertically oriented business applications is it's really easy to be able to take in new sources of data with our application but also to be able to process all that data and then run it through a decision engine to decide if something is fraudulent or not in a short period of time. So you need to be able to take in all that data to be able to make a good decision. And you need to be able to decide quickly if it's going to matter. And you also need to be able to have a really strong model for making decisions so that you avoid things like false positives which are as big a problem as preventing fraud itself if you deliver bad customer experience. And we've all had that experience as well which is your card gets shut down for what you think is a legitimate activity. >> It's just so ironic that false positives are the biggest problem with credit card fraud. >> Yeah it's one of yeah. >> You would think we would be thankful for a false positive but all you hear over and over and over is that false positive and the customer experience. It shows that we're so good at it is the thing that really irks people. 
>> Well if you think about that, having an application that allows you to make better decisions more quickly and prevent those false positives and take care of fraud is a huge competitive advantage for all the different players in that industry. And it's not just for the credit card companies of course, it's for the whole spectrum of people from the merchant all the way to the bank that are trying to deal with this problem. And so that's why it's one of the applications that we think of as a key example where we see a lot of opportunity. And certainly people that are looking at credit card fraud have been thinking about this problem for a while. But there's the complexity, like we were discussing earlier, of finding the talent, of being able to deliver these kinds of applications, finding the technology that can actually scale to the processing volume. And so by delivering Omni-Channel fraud prevention as a Big Data application, that just puts our customers so much closer to the outcome that they want. And it makes it a lot easier to adopt. >> So let's shift gears a little bit. Putting on your VP of product hat, there's a huge wide world of opportunity in front of you. We talked about IoT a little bit, obviously fraud, you've talked about Omni-Channel retail. How are you guys going to figure out where you want to go next? How are you prioritizing the world, and as you build up more of these applications is it going to be vertically focused, horizontally focused, what are your thoughts as you start down the application journey? >> So a few thoughts on that. Certainly one of the key indicators for me as a product manager when I look at where to go next and what applications we should build next, it comes down to what signal are the customers giving us? As we mentioned earlier, we built a platform for real time analytics and decision making, and one of the things that we see is broad adoption across a lot of different verticals. 
So I mentioned industrial IoT and financial services fraud prevention and advertising technology, and, and, and. We have a company that we're working with in GPS geofencing. So the possibilities are pretty interesting. But when it comes to prioritizing those different applications we have to also look at what are the economics involved for the customer and for us. So certainly one of the reasons we chose fraud prevention is that the economics are pretty obvious for our customers. Some of these other things are going to take a little bit longer for the economics to show up when it comes to the applications. So you'll certainly see us focusing on vertically oriented business applications because again the horizontals tend to be more like a platform and it's not close enough to delivering an outcome for a customer. But it's worth noting one of the things we see is that while we will deliver vertically oriented applications, oftentimes switching from one vertical app to another is really not a lot more than changing the kind of data we're analyzing, and changing the decision engine. But the fundamental idea of processing data in a pipeline at very high volume with fault tolerance and low latency, that remains the same in every case. So we see a lot of opportunity essentially, as we solve an application in one vertical, to re-skin it into another. >> So you can say you're tweaking the dials and tweaking the UI. >> Tweaking the data and the rules that you apply to that data. So if you think about Omni-Channel fraud prevention, well it's not that big of a leap to look at healthcare fraud or to look at all the other kinds of fraud in different verticals that you might see. >> Do you ever see that you'll potentially break out the algorithm, I forget which one we're at, people are talking about algorithms as a service. Or is that too much of a bit, does there need to be a little bit more packaging? 
>> No I mean I think there will be cases where we will have an algorithm out of the box that provides some basics for the decision support. But as we see a huge market springing up around AI and machine learning and machine scoring and all of that, there's a whole industry that's growing up around essentially, we provide you the best way to deliver that algorithm or that decision engine, that you train on your data and so forth. So that's certainly an area where we're looking from a partnership perspective. Where we already today partner with some of the AI vendors for what I would say is some custom applications that customers have deployed. But you'll see more of that in our applications coming up in the future. But as far as algorithms as a service, I think that's already here in the form of being able to query against some kind of AI with a question, you know, essentially a model, and then getting an answer back. >> Right well Nathan, exciting times, and your Big Data journey continues. >> It certainly does, thanks a lot Jeff. >> Thanks Nathan Trueblood from DataTorrent. I'm Jeff Frick, you're watching The CUBE, we'll see you next time, thanks for watching. (techno music)
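Nathan's earlier point, that moving from one vertical to another is mostly a matter of swapping the data parser and the decision rules while the pipeline shape stays fixed, can be sketched like this (all names here are illustrative assumptions, not DataTorrent's API):

```python
# Sketch of the "same pipeline, different vertical" idea: the pipeline
# shape (parse -> decide) is fixed; only the parser and rules are swapped.
def run_pipeline(events, parse, decide):
    """Generic high-level pipeline: parse each raw event, then decide on it."""
    for raw in events:
        record = parse(raw)
        yield record, decide(record)

# Vertical 1: payments fraud
parse_payment = lambda raw: {"amount": float(raw)}
flag_payment = lambda r: r["amount"] > 500

# Vertical 2: healthcare claims fraud -- same pipeline, new parser and rules
parse_claim = lambda raw: {"billed": float(raw)}
flag_claim = lambda r: r["billed"] > 10000

print(list(run_pipeline(["120", "900"], parse_payment, flag_payment)))
# -> [({'amount': 120.0}, False), ({'amount': 900.0}, True)]
```

The real system adds the hard parts the sketch omits (massive volume, fault tolerance, sub-second latency), but the reuse argument is visible even at toy scale: only the two swapped functions know anything about the vertical.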

Published Date : Jul 21 2017

SUMMARY :

Nathan Trueblood of DataTorrent explains how the company is moving beyond a real-time streaming analytics platform to vertically oriented business applications, starting with Omni-Channel fraud prevention. Sub-second processing and a strong decision model let customers prevent fraud rather than merely detect it, while keeping false positives down, and the same high-volume, fault-tolerant pipeline can be re-skinned for other verticals by swapping the data sources and the decision engine.

ENTITIES

Entity | Category | Confidence
Jeff Frick | PERSON | 0.99+
Bill Schmarzo | PERSON | 0.99+
Jeff | PERSON | 0.99+
Nathan Trueblood | PERSON | 0.99+
Nathan | PERSON | 0.99+
Yahoo | ORGANIZATION | 0.99+
EMC | ORGANIZATION | 0.99+
a week | QUANTITY | 0.99+
Bill | PERSON | 0.99+
DataTorrent | ORGANIZATION | 0.99+
first app | QUANTITY | 0.99+
Data Works | ORGANIZATION | 0.99+
Palo Alto | LOCATION | 0.99+
first application | QUANTITY | 0.99+
a day | QUANTITY | 0.99+
less than a second | QUANTITY | 0.99+
second order | QUANTITY | 0.98+
one | QUANTITY | 0.98+
Hadoop | ORGANIZATION | 0.97+
today | DATE | 0.97+
third order | QUANTITY | 0.97+
an hour | QUANTITY | 0.97+
Big Data | ORGANIZATION | 0.96+
first | QUANTITY | 0.96+
first time | QUANTITY | 0.95+
Flash | TITLE | 0.94+
Hadoop | PERSON | 0.92+
Hadoop | TITLE | 0.91+
weeks | DATE | 0.85+
one vertical | QUANTITY | 0.83+
Hadoop Summit | EVENT | 0.81+
The CUBE | ORGANIZATION | 0.79+
one of the applications | QUANTITY | 0.77+
Flink | ORGANIZATION | 0.72+
Omni-Channel | ORGANIZATION | 0.72+
UDI | ORGANIZATION | 0.7+
Summit | EVENT | 0.66+
CUBE | ORGANIZATION | 0.57+
CUBEConversations | ORGANIZATION | 0.47+
Spark | ORGANIZATION | 0.46+
months | QUANTITY | 0.43+

David Lyle, Informatica - DataWorks Summit 2017


 

>> Narrator: Live from San Jose, in the heart of Silicon Valley, it's the Cube, covering DataWorks Summit 2017. Brought to you by Hortonworks. >> Hey, welcome back to the Cube, I'm Lisa Martin with my co-host, Peter Burris. We are live on day one of the DataWorks Summit in Silicon Valley. We've had a great day so far, talking about innovation across different, different companies, different use cases, it's been really exciting. And now, please welcome our next guest, David Lyle from Informatica. You are driving business transformation services. >> Yes. >> Lisa: Welcome to the Cube. >> Well thank you, it's good to be here. >> It's great to have you here. So, tell us a little about Informatica World, Peter you were there with the Cube. Just recently some of the big announcements that came out of there, Informatica getting more aggressive with cloud movement, extending your master data management strategy, and you also introduced a set of AI capabilities around meta-data. >> David: Exactly. >> So, looking at those three things, and your customer landscape, what's going on with Informatica customers, where are you seeing these great new capabilities come to fruition? >> Absolutely, well one of the areas that is really wonderful that we're using in every other aspect of our life is using the computer to do the logical things it should, and could, be doing to help us out. So, in this announcement at Informatica World, we talked about the central aspect of meta-data finally being the true center of Informatica's universe. So bringing in meta-data-- >> And customer's universes. >> Well, and customer's universes, so the, not seeing it as something that sits over here that's not central, but truly the thing that is where you should be focusing your attention on. 
And so Informatica has some card carrying PhD artificial intelligence machine learning engineers, scientists, that we have hired, that have been working for several years, that have built this new capability called CLAIRE. That's the marketing term for it, but really what it is, it's helping to apply artificial intelligence against that meta-data, to use the computer to do things for the developer, for the analyst, for the architect, for the business people, whatever, that are dealing with these complex data transformation initiatives that they're doing. Where in the past what's been happening is whatever product you're using, the product is basically keeping track of all the things that the scientist or analyst does, but isn't really looking at that meta-data to help suggest the things, that they, that maybe has already been done before. Or domains of data. Why, how come you have to tell the system that this is an address? Can't the system identify that when data looks like this, it's an address already? We think about Shazam and all these other apps that we have on our phones that can do these fantastic things with music. How come we can't do those same things with data? Well, that's really what CLAIRE can actually do now is discover these things and help. >> Well, I want to push now a little bit. >> David: Sure, sure. >> So, historically meta-data was the thing that you created in the modeling activity. >> David: Right. >> And it wasn't something that you wanted to change, or was expected to change frequently. >> In fact, in the world of transaction processing, you didn't want to change. >> Oh, yeah. And especially you get into finance apps, and things like that, you want to keep that slow. >> Exactly. >> Yeah. >> And meta-data became one of those things that often had to be secured in a different way, and was one of those reasons why IT was always so slow. >> Yeah. >> Because of all these concerns about what's the impact on meta-data. >> Yeah. 
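The domain-discovery idea David describes, where the system recognizes on its own that a column "looks like" an address or a phone number instead of being told, can be sketched as pattern-matching over sample values. This is a toy illustration; the patterns, domain names, and threshold are assumptions, not CLAIRE's actual classifier:

```python
import re

# Minimal sketch of data-domain inference: label a column by the first
# domain whose pattern matches most of its sample values.
DOMAIN_PATTERNS = {
    "us_phone": re.compile(r"^\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$"),
    "zip_code": re.compile(r"^\d{5}(-\d{4})?$"),
    "email":    re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}

def infer_domain(samples, min_match=0.8):
    """Return the first domain matching at least min_match of the samples."""
    for domain, pattern in DOMAIN_PATTERNS.items():
        hits = sum(1 for s in samples if pattern.match(s))
        if hits / len(samples) >= min_match:
            return domain
    return "unknown"

print(infer_domain(["94301", "02134", "10001-2345"]))  # -> zip_code
```

A production classifier would use far richer signals (dictionaries, statistics, machine learning over many data sets), but the payoff is the same: nobody has to tell the system that the column holds ZIP codes.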
>> We move into this big data world, and we're bringing forward many of the same perspectives on how we should treat meta-data, and what you guys are doing is saying "that's fine, keep the meta-data of that data, but do a better job of revealing it, and how it connects-- >> David: Exactly. >> and how it could be connected." And we talked about this with Bill Schmarzo just recently-- >> Good friend of mine. >> Yeah, the data that's in that system can also be applied to that system. >> Yeah. >> It doesn't have to be a silo. And what CLAIRE is trying to do is remove some of the artificial barriers-- >> Exactly. >> Of how we get access to data that are founded by organization, or application, or system. >> David: Right. >> And make it easier to find that data, use that data, and trust the data. >> Exactly. >> Peter: I got that right? >> You've totally got that right. So, if we think about all these systems in an organization as this giant complex hairball, that in the past we may have had pockets of meta-data here and there that weren't really exposed, or controlled in the right way in the first place. But now bringing it together. >> But also valuable in the context of the particular database or system-- >> Yep. >> that was running. It wasn't the meta-data that was guarded as valuable-- >> Right. that just provided documentation for what was in the data. >> Exactly, exactly. So, but now with this ability to see it, really for the first time, and understand how it connects and impacts with other systems, that are exchanging data with this, or viewing data with this. We can understand then if I need, occasionally, to make a change to the general ledger, or something, I can now understand what the impact is on different KPIs, and the calculation streams of Tableau, Business Objects, Cognos, MicroStrategy, Qlik, whatever. That, what else do I need to change? What else do I need to test? That's something computers are good at. 
Something that humans have had to do manually up to this point. And that, that's what computers are for. >> Right. >> So questions for you on the business side. Since we look at-- >> Yeah. >> Businesses are demanding real time access to data to make real time decisions, manage costs, be competitive, and that's driving cloud, it's driving IoT, it's driving big data and analytics. You talked about CLAIRE, and the implications of it across different people within an organization. >> Right. Meta-data, how does a C-suite, or a senior manager care-- >> David: Good point. >> About meta-data? >> They don't, and that's why we don't talk about the word architecture. Typically with C-suite folks we don't use the word meta-data. With C-suite folks, instead we talk about things like solving the problem of time, to get the application, or information that you need, reducing that time by being able to see and change and retest the things that need to be. So we just change the discussion to either dollars, or time, or of course those are really equivalent. >> But really facilitated by this-- >> Exactly. >> Artificial intelligence. >> It's facilitated by this artificial intelligence. It can also then lead to the, when we get into data lakes, ensuring that those data lakes are, understood better, trusted better, that people are being able to see what other people are actually using. And in other words we kind of bring, somewhat, the Amazon.com website model to the data lake, so that people know, okay, if I'm looking for a product, or data set, that looks like this for my, our, processing data science utility, or what I want to do. Then these are the data sets that are out there, that may be useful. This is how many people have used them, or who those other people are, and are those people kind of trusted, valid, people that have done similar stuff to what I want to do before? 
Anyway, all that information we're used to when we buy products from Amazon, we bring that now to the data lake that you're putting together, so that you can actually prevent it, kind of, from being a swamp and actually get value at it. Once again, it's the meta-data that's the key to that, of getting the value out of that data. >> Have you seen historically that, you're working with customers that, have or are already using hadoop. >> David: That's right. >> They've got data lakes. >> Oh yeah. >> Have you seen that historically they haven't really thought about meta-data as driving this much value before, is this sort of a, not a new problem, but are you seeing that it's not been part of their-- >> It's a new. >> strategic approach. >> That's right, it's a new solution. I think you talk to anybody, and they knew this problem was coming. That with a data lake, and the speed that we're talking about, if you don't back that up with the corresponding information that you need to really digest, you can create a new mess, a new hairball, faster than you ever created the original hairball you're trying to fix in the first place. >> Lisa: Nobody likes a hairball. >> Nobody likes a hairball, exactly. >> Well it also seems as though, for example at the executive level, do I have a question? Can I get this question answered? How do I get this question answered? How can I trust the answer that I get? In many respects that's what you guys are trying to solve. >> David: Exactly, exactly. >> So, it's not, hey what you need to do is invest a whole bunch in the actual data, or copying data, or moving a bunch of data around. You're just starting with the prob, with the observation, with the proposition. Yes, you can answer this question, here's how you're going to do it, and you can trust it because of this trail-- >> David: Exactly. >> Of activities based on the meta-data. >> Exactly, exactly. 
So, it's about helping to, hate to use the phrase again, but "detangle" that hairball, so that, or at least manage it a bit, so that we can begin to move faster and solve these problems with a hell of a lot more confidence. So we have-- >> Can we switch gears? >> Absolutely. >> Certainly. >> Let's switch gears and talk about transformations. >> Yeah. >> I know that's something that is near and dear to your heart, and something you're spending a lot of time with clients in. >> Yeah. >> How, how do you approach, when a customer comes to you, how are they approaching the transformation, and what are they, what's the conversation that you're having with them? >> Well, it's interesting that the phrase has, and I'm even thinking of changing our group's title to digital transformation services, not just because it's hot, but because, frankly, the fluid or the thing, the glue, that really makes that happen is data in these different environments. But the way that we approach it is by, well understanding what the business capabilities are that are affected by the transformation that is being discussed. Looking at and prioritizing those capabilities based upon the strategic relevance of that capability, along with the opportunity to improve, and multiplying those together, we can then take those and rank those capabilities, and look at it in conjunction with, what we call a business view of the company. And from that we can understand what the effects are on the different parts of the organization, and create the corresponding plans, or roadmaps that are necessary to do this digital transformation. We actually bought a little stealth acquisition of a company two years ago, that's kind of the underpinnings of what my team does, that is extremely helpful in being able to drive these kinds of complex transformations. 
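The prioritization David outlines, scoring each business capability as strategic relevance times opportunity to improve and then ranking the results into a roadmap, is simple to sketch. The capability names and 1-to-5 scores below are invented for illustration:

```python
# Sketch of the capability-ranking step: score = relevance x opportunity,
# then rank descending to get the transformation roadmap ordering.
capabilities = [
    {"name": "billing",          "relevance": 5, "opportunity": 3},
    {"name": "customer support", "relevance": 4, "opportunity": 5},
    {"name": "reporting",        "relevance": 2, "opportunity": 3},
]

for cap in capabilities:
    cap["score"] = cap["relevance"] * cap["opportunity"]

roadmap = sorted(capabilities, key=lambda c: c["score"], reverse=True)
print([c["name"] for c in roadmap])
# -> ['customer support', 'billing', 'reporting']
```

The multiplicative score captures the point of the method: a capability that is highly strategic but already works well (or easy to improve but irrelevant) ranks below one that is both strategic and improvable.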
In fact, big companies, a lot, several in this room in a way, are going through the transformation of moving from a traditional software license sale transaction with the customer to a subscription, monthly transaction. That changes marketing. That changes sales. That changes customer support. That changes R&D. Everything Changes. >> Everything, yeah. >> How do you coordinate that? What is the data that you need in order to calculate a new KPI for how I judge how well I'm doing in my company? Annual recurring revenue, or something. It's a, these are all, they get into data governance. You get into all these different aspects, and that's what our team's tool and approach is actually able to credibly go in, and lay out this road map for folks that is shocking, kind of, in how it's making complex problems manageable. Not necessarily simple. Actually it was Bill Schmarzo, on the, he told me this 15 years ago. Our problem is not to make simple problems mundane, our problem, or what we're trying to do, is make complex problems manageable. I love that. >> Sounds like something-- >> I love that. >> Bill would say. >> That's an important point though about not saying "we're going to make it simple-- >> No. >> we're going to make it manageable." >> David: Exactly. >> Because that's much more realistic. >> David: Right. >> Don't you think? >> David: Exactly, exactly. The fact-- >> I dunno, if we can make them simple, that's good too. >> That would be nice. >> Oh, we'd love that >> Yeah. >> Oh yeah. >> When it happens, it's beautiful. >> That's art. >> Right, right. >> Well, your passion and your excitement for what you guys have just announced is palpable. So, obviously just coming off that announcement, what's next? We look out the rest of the calendar year, what's next for Informatica and transforming digital businesses? 
>> I think it is, you could say the first 20 years, almost, of Informatica's existence was building that meta-data center of gravity, and allowing people to put stuff in, I guess you could say. So going forward, the future is getting value out. It's continually finding new ways to use, in the same way, for instance, Apple is trying to improve Siri, right? And each release they come out with more capabilities. Obviously Google and Amazon seem to be working a little better, but nevertheless, it's all about continuous improvement. Now, I think, the things that Informatica is doing, is moving that power of using that meta-data also towards helping our customers more directly with the business aspect of data in a digital transformation. >> Excellent. Well, David, thank you so much for joining us on the Cube. We wish you continued success, and I'm sure the Cube will be back with Informatica in the next round. >> Excellent. >> Thanks for sharing your passion and your excitement for what you guys are doing. Like I said, it was very palpable, and it's always exciting to have that on the show. So, thank you for watching. I'm Lisa Martin, for my co-host Peter Burris, we thank you for watching the Cube again. And we are live on day one of the DataWorks Summit from San Jose. Stick around, we'll be right back.

Published Date : Jun 13 2017

SUMMARY :

David Lyle of Informatica discusses CLAIRE, announced at Informatica World, which applies machine learning to meta-data so the system can discover data domains on its own, trace the impact of a change across connected systems, and keep data lakes from becoming swamps. He also describes Informatica's business transformation services, which rank business capabilities by strategic relevance and opportunity to improve in order to build digital transformation roadmaps.

ENTITIES

Entity | Category | Confidence
David | PERSON | 0.99+
Peter Buress | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
David Lyle | PERSON | 0.99+
Bill Schmarzo | PERSON | 0.99+
Lisa | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Peter | PERSON | 0.99+
Informatica | ORGANIZATION | 0.99+
Peter Burress | PERSON | 0.99+
Google | ORGANIZATION | 0.99+
Apple | ORGANIZATION | 0.99+
Silicon Valley | LOCATION | 0.99+
Siri | TITLE | 0.99+
San Jose | LOCATION | 0.99+
Amazon.com | ORGANIZATION | 0.99+
CLAIRE | PERSON | 0.99+
first time | QUANTITY | 0.99+
one | QUANTITY | 0.98+
DataWorks Summit 2017 | EVENT | 0.98+
Bill | PERSON | 0.97+
first 20 years | QUANTITY | 0.97+
DataWorks Summit | EVENT | 0.97+
two years ago | DATE | 0.97+
Informatica World | ORGANIZATION | 0.96+
Dataworks | EVENT | 0.96+
each release | QUANTITY | 0.96+
Cube | COMMERCIAL_ITEM | 0.95+
first place | QUANTITY | 0.93+
day one | QUANTITY | 0.92+
three things | QUANTITY | 0.92+
Hortonworks | ORGANIZATION | 0.9+
Informatica - DataWorks Summit 2017 | EVENT | 0.89+
15 years ago | DATE | 0.88+
Cube | ORGANIZATION | 0.88+
Cube | TITLE | 0.53+
Shazam | ORGANIZATION | 0.51+
years | QUANTITY | 0.47+

Day 3 Kickoff - Dell EMC World 2017


 

>> Announcer: Live, from Las Vegas, it's theCUBE covering Dell EMC World 2017. Brought to you by Dell EMC. >> Okay, welcome back everyone, we're live here, day three of three days of coverage of theCUBE at Dell EMC World 2017. I'm John Furrier with my co-host Paul Gillin and special guest on our day-three opening, Peter Burris, head of research of SiliconANGLE Media, general manager of wikibon.com research. Guys, good to see you on day three. We're goin' strong. I mean, I think I feel great, a lot of activity. So many story lines to talk about. Obviously the big one is the combination, not merger, I slipped yesterday, or acquisition, the combination of equals, Dell, EMC. Some will question did EMC acquire Dell or Dell acquire EMC? Certainly Michael Dell's still captain of the ship. But that's the top story. But a lot of product line conversations. Not a lot of overlap. Peter, you've been at all the analyst sessions. We had David Floyer on yesterday, teasing it up, but I'd like to get you, your perspective and reaction to your thoughts as you look at the giants in the industry. Michael Dell bought EMC for a record 60 billion plus. You've been around the block. You've seen many waves. You've analyzed many generations of the computer industry. What does this actually mean. Where are they, what's your thoughts and reaction? >> So John, I'll give you three different story lines here, right? The meta-picture, the good, and the what the hell's goin' on kind of picture. The first one, the meta-picture is, and SiliconANGLE said this, it was a really well written article, you might have even written it Paul, that there has never really been a successful mega-merger in the tech industry. And historically I think that's because, well here's the bottom line. This one may actually work. And it may actually work nicely. And the reason is that most of the other mergers or combinations were companies with problems and companies that didn't have problems. 
Or companies with problems and companies with problems. And if you take a look at Dell and EMC, neither of them had problems. They weren't buying each other's problems. It was a nice combination, and complementary in that Dell had a great consumer business, great channel business, and had a pretty strong financial position. And EMC had a great enterprise business, great, you know-- >> Sales organizations. >> Great sales organization. And they had, they were strong in where the industry's going around how do you handle data and how do you handle storage. So it's got, what we're seeing here is everybody singing out of the same hymnal. I'm not seeing any tension. And that is an indication that this one may actually go well. I think it's a very, very good early sign. >> Paul, you and I were talking on the day one open and also, we kind of hit it a little bit yesterday with David Floyer, talking about this mega-merger. Compare and contrast that to HPE, which has been kind of, being de-positioned by some of the Dell executives. They don't actually call 'em out by name, but HP Enterprise is taking a different approach. They're taking a, you know, smaller is better approach. Obviously, Michael Dell has a completely different philosophy. We're still going to analyze that as well. We've got HPE Discover coming up as well. Thoughts on the compare and contrast, guys, reaction to the strategies of HPE, smaller, faster, as they say. Or Dell, bigger, more powerful. >> I think both are viable strategies. It's just a matter of if they can pull it off. I mean, HP, you talk about bad mergers, Peter, I mean you think of HP Compaq, HP Autonomy, this is a company that has had a terrible track record of big mergers. Although they've had some successful ones certainly. >> But Meg Whitman inherited those. >> Yes. >> Prior to Meg Whitman coming on board. >> Oh she was a board member for some of them. >> Okay, so she was at the table. Now, we don't know, okay but your thoughts, continue. 
>> But Dell, clearly going the other direction. They, I mean, they're building sort of an IBM-like model, the way IBM was in the '80s when it dominated every market that it played in. And it played at even more markets than Dell does now. So I think that the model makes sense. I think Peter's absolutely right, I'm not sensing any tension at this conference. There seems to be, the most important thing is there seems to be a lot of communication going on. The executives are spending a lot of time with each other and they're talking a lot to the people. And when you look back, and I live, and Peter, you remember the DEC, you know, the fiasco with DEC being purchased by Compaq. That was clearly a takeover. And that was Compaq came in, took over the company and didn't tell anybody anything. And the DEC people were living in the dark and it was clear that they had no value to the acquiring company. That, clearly, they're not making those mistakes here. >> For the younger, for the younger audience, DEC is Digital Equipment Corporation which was a behemoth winner in the micro, mini-computer era and then now defunct company. >> Except the one, one thing I'd add to that, Paul, is that, and this is why, it's why this first sign is so important. That they are seem to be, that executives here seem to be collaborating and working together. DEC had been one of those mini-computer companies dominated by an OEM business, which means you had a common set of components and then everybody was competing for customers with how you put those components together. So there was, it was a, it was a maelstrom of internal competition at DEC. When Compaq got ahold of DEC, that DEC sense of internal competition took over Compaq. And then when Compaq, when HP acquired Compaq, that maelstrom and internal competition took over HP. >> They didn't know what they were getting into. >> We used to call it the red-blue wars and it was ugly. And that's not happening here. That's a first sign. 
>> Yeah, I would agree Peter. I want to get your thoughts to all that. I would agree that this is, I've been tryin' to sniff out where the wind's blowin' on this for a year and to my knowledge, and my insight and sources, it's not going bad at all. It's going great. The numbers are performing, they're winning some deals, but let's compare to HP because I asked Mark Heard at their Oracle media event last week, cause they were touting number one in every market. So I said, "Well, there's a digital transformation "going on, a whole new way to do business "for the next 33 years, "not looking back at the past 33 years." Which metrics are you using? Everyone's claiming to be number one at something. So, the question is, maybe HP does have it right. Maybe their strategy will work. What are the, what are going to be those metrics for this next generation? If cloud becomes the connective tissue to data, value of data, and that apps are going to be very agile. Maybe this decentralized approach from HP might be a better strategy for the growth. Thoughts. >> Well, look, let's, so let's, I want to get back to the, what's good about what we're seeing and some other things that probably need to be worked on, but, but here's what I'd say, John. And this is what Wikibon believes. That customers is always going to be the most important metric. So, the first metric is, is HP gaining customers? Is HP losing customers? Is Dell gaining customers? Or is Dell losing customers? That's the number one most important metric. Always will be as far as I'm concerned. But the second one is, and this, and I'll pre-say something I'm going to talk about in a little bit. The second one is, I'll call it data under management. If we think about, if we think about this notion of data as an asset, data as a source of value, how much does HP, through it's customers, how much data does, does HP have under management? How much data does Dell/EMC have under management? 
And I think that's going to be an important way of thinking about the intensity of the relationships, which relationships are going to steer towards which types of environments. Is it going to be a procurement relationship or a real strategic relationship? By procurement, I mean it's fundamentally focused on driving cost out of the deal. Strategic, I mean it's fundamentally focused on jointly creating value. So this notion of data under management, to me, is going to be something we're going to be talking about in five years. >> So, Bill Schmarzo, friend of both of ours, came by the set before we came on here. He's the dean of big data as coined by theCUBE, but now he's takin' it on his own, like he's actually a dean now teaching big data. We were talking about some of the research that you're doing and taking a stand on, and it's important, I want to put a plug in for the Wikibon research team that you're leading, on the business value of data. >> Peter: Oh absolutely. >> And that you're looking at data as a valuation mechanism, not an accounting, compliance thing. And this is something, I think, that is way ahead of the curve. So props to you guys for puttin' the stake in the ground. To your point, the new metric might just be the valuation of how they use data, whether that's customer data, product and services data, or application development concepts, to reconfiguring how they do business. >> And it's the reconfigure that's the smart, that's the absolutely right word. So, from our perspective, John, the difference between a business and a digital business is a business uses data one way, a digital business uses data another way. A business uses data as something to just handle coordination and administration. >> Paul: Bookkeeping. >> Yeah, exactly. A digital business uses data as a strategic asset to differentiate how to engage markets. That's where the industry's going, and that's what we want to talk about. 
>> And by the way, in previous business constructs or business books people might have read over the years, certainly, you know, the Peter Druckers and so on, management consultants never actually factored data into the value chains of-- >> Oh they did, they did, they did. So Drucker, for example, did. >> John: Digital data? >> Oh, he talked about information and the role that information played. >> John: I stand corrected. >> Herbert Simon talked about this kind of stuff 50 years ago. Unfortunately it all got lost when we went through things like, jeez, you know, there was a very famous economist who said in the late '80s, "Information technology shows up everywhere but in the productivity numbers." So, you old guys would-- >> I remember that, I remember that quote. >> So, the idea ultimately is we now have to get very discrete and very specific about what that means. And that's a challenge. But let's come back to at least what we think is really working here, if I may. >> John: Absolutely, go ahead. >> So the first thing is, at a more tactical level, number one is the hyperconverged story is exciting. And it's starting to come together. And again, we're not seeing tension between the folks that are selling servers and the folks that are doing hyperconverged. Both are introducing new technology that is going to create new opportunities for customers, and as your good friend Michael Dell said a couple times over the past year here in theCUBE, "We are not going to artificially constrain any of our businesses." And, as Amazon said at re:Invent, "If you're going to do it at scale, eventually you're going to put in hardware." And he wants to demonstrate that with all this great software stuff that's happening, ultimately Dell's going to be the leader at designing these new capabilities into the hardware, and he wants to show how that's going to show up in all his product lines. 
>> That's a great point. I think the most interesting dynamic I've been seeing out of the interviews we've been doing the last two days is that the problem Dell has to struggle with now, and it'll be interesting to watch how they figure this out, is all of their, what used to be called the Federation, now I think they're called the Strategic Business Alliances. You know, the VMwares, the RSAs, the Pivotals, how are they going to make sense of those in the context of this bigger whole? On the one hand, they've got some competing priorities here. Dell has a very strong relationship with Microsoft; VMware is a competitor to Microsoft. So you've got to figure out how to make sense of those different alliances. Pivotal is potentially a competitor to Microsoft. >> Potentially? >> Well, Microsoft is in the PaaS business, yeah. >> No, it is, yeah, it's going to compete. >> So you've got some paradoxes here in the businesses that Dell has acquired. I sense they still haven't made sense of what they're going to do with them. >> Yeah, great point. I mean, first of all, you guys are pros and we have a historical view here of the collective intelligence of all of us old guys here. We've seen a lot of waves. But Rob Hof, our Editor-in-Chief, who's also an industry veteran and journalist himself, wrote an article on SiliconANGLE after the Oracle media event, and the headline reads, "In Oracle's Cloud Pitch to Enterprises, an Echo of a Bygone Tech Era." And his point with this story is, and I want to get your reaction to this, 'cause I think we're seeing a trend here that you guys are teasing out, we're kind of going back down to the old tech days. You were the Editor-in-Chief of Computerworld back in the day with the mainframe world and then the minis. Seeing Marius Haas on here using words like "Single pane of glass." "One throat to choke." "End to end." 
We're almost seeing the bygone era coming back again, where maybe they might have the rights to it. Certainly Oracle saying, "Hey, you know, reorganize our sales force." So the question: is the cloud the decentralized mainframe? Is it now the new centralized, with the intelligent edge? Are we going back to the old ways, in a way, not fully, but unifying the sales forces? >> So, the computing industry-- >> Thoughts. >> Has been on an inexorable march to greater utilization of public infrastructure. What an economist would say is we've always found ways to reduce asset specificities. I buy something, and I apply it to one purpose. I can't apply it to another purpose. Software changes that. Commodity pricing and hardware changes that. Public infrastructure changes that. So we're going to continue to see that inexorable march to the use of public infrastructure or somethin' that looks like public infrastructure. And that's going to continue. And the industry's always been very, very good at that. That does not mean, however, that we're going to have one supplier. So what we're seeing is a lot of FUD right now. Amazon FUD, Dell FUD, Oracle FUD. There is a real tension in the model, and the real tension is, more than likely, the future is going to be composites of services operating on multiple different cloud-like instances, including on premises. And who's going to offer the best end-to-end control plane? >> Paul, I want to get your thoughts. 'Cause you remember, goin' back to the days, IBM had the SNA network stack, DEC had DECnet, they had proprietary stacks. Cloud, Azure Stack, this stack, that. Are we seeing this again? Your thoughts. 
There's more vendors, there's more piece parts, and they've got to fit them all together, and it sucks. And so they want someone to simplify this. Now, cloud vendors simplify it on one level. But software-defined, on another level. We've been talking here about software-defined storage, about software-defined networking, massive virtualization. And that's on an open source or at least an open API-based model. Which I think is the twist here. Are we going back to the days of IBM? Yeah. But the IBM may actually be software-defined. >> Or five different companies that look like IBM. >> I know what you're saying, Paul, and I'm not going to disagree with you. But here's the opposite-- >> But you disagree with him. >> No, no, I'm going to put a slightly different spin on it. It used to be that the most valuable asset in an IT organization was the mainframe. And the entire organization was organized, and the interactions with the business were organized, and put in place to handle the value of that mainframe. We are not going back to a day where the IT organization, the way business uses IT, is organized around the mainframe as an asset. Or even around the provision of infrastructure as an asset. We are going to start seeing organizations and frameworks that are fundamentally built around this idea of data as an asset. And that is going to be a lot more complex, with a lot more buyers and a lot more opportunities for differentiation and creating value. So we will see more complexity in IT at the software and use case level, less complexity at the infrastructure levels. >> Which is why machine learning and automation gets a lot of hype. But, Paul, I'm going to get your point and tie Peter's point together and introduce Jeff Bezos' comment last week on NDC. He mentioned that most things take 10 years to bake out in terms of getting things right. A ten-year kind of horizon. Kind of an order of magnitude. 
But he says, "All these startups say they have "disruptive technology, it's not their technology that's "disruptive, it's what's the customer is disrupted." So we're talkin' about customers being disrupted. It's not some company having disruptive technologies. >> And disrupting. >> So are we saying that customers are being disrupted by reconfiguring their businesses, hence with the mainframe disrupted, a new way to do things, we're seeing clouded-data as a new way to do things. So, that's causing some reconfiguration and disruption, allows them to say, "Shit, just when I thought it was simple "it got more complex." >> But the disruptive element is the data as Peter says. >> I mean the machines are becoming, the machines are already a commodity. The, with open source, the platforms are a commodity. What's disruptive is how you use the data in different ways. And to your point Peter, yes, it's going to be a much more complex world. >> Peter: Much more. >> Because there's a lot more data and there's a lot more things we can do with data. >> And data can, that's exactly right. We can do so much more with data. So again, let's go back to the fundamental metric that at least I suggested. Who gets more customers? There are going to be more buyers of this stuff in five years than there are today. More buyers in the sense that within an organization, there's going to be more people involved in the decision and there's going to be more businesses. Because if this stuff actually works, the transaction costs are going to go down and you can then organize your businesses, institutionalize how you do work differently so you can have more partnerships. All that means that fundamentally, what we're talkin' about here is going to lead to greater complexity in business, greater opportunity therefore, but what I've always said, and I don't know if you've heard this Paul, but I know you have John, and I've said it on theCUBE. 
That the fundamental demarcation is that the first 50 years of this industry featured known process, unknown technology. And what did we focus on? The technology. What's the next 50 years? Unknown process, known technology. What are we going to focus on? How to build that software, how to handle those data assets. What are we going to focus less attention on? The technology. What does everybody want to talk about at this show? >> The technology. >> Technology. That's a disconnect. So going to one of the things that we now have to think about from a Dell/EMC standpoint: where's the story about how Dell is going to appreciate the value of your data assets over time? We need more of that. >> And let me point out, you know, you didn't mention IBM, but one company that is doing that well right now, though they aren't getting the business benefit for it yet, is IBM. They are not talking technology, I mean they don't talk about POWER8 anymore. They talk about Watson, they talk about what you can do with analytics, they talk about a smarter planet. They haven't been able to turn this into a successful business yet, but they're doing, I think, exactly what you're talking about. >> Well, they have some product challenges. I mean, so let's get back down to the customer thing. I like that angle. You got to have the customer, you got to have the products that customers will be buying. That's the value exchange that customers will value, and hence buy your service or product. Andy Jassy and Pat Gelsinger, when they did the Amazon deal with VMware. Andy Jassy, CEO of AWS, said to me, "We are customer focused." So I believe that you're right on this 100%. Whoever can get the customers. And this is not about who's the better stack; if the customers like it, they're going to buy it. >> And very importantly, John, they are going to invest in it to make it valuable in their business. And that's what you want. 
You want to see your customers become a centerpiece of value creation in your ecosystem. >> And I think Amazon Web Services proves that the dark horse could come out of nowhere and be the behemoth that they are because they served the customers. >> So that's the second thing that I'm missing at this show, and I think I know why: where are the additional details, even a little bit more, about VMware and AWS? Now, I know that they're going to wait for VMworld, that's the story. >> They showed a little preview in the keynote, it's still baking out. >> Yeah, but it would be nice to have a little bit more. >> That's one of those tough relationships they need to manage, right? >> Yeah, exactly right. >> I mean, VMware and IBM also have an alliance. They are allied with their foes now through the acquisition. The point about the value of data, you know, I think Amazon has done a good job of building platforms that are very flexible for customers to use, but they abstract a lot of the underlying complexity. >> Alright, so with the data, I want to just double down on that for a second and get your reaction and thoughts. Obviously, one of the themes here is IoT, and we heard Michael Dell saying it's going to be centralized, pushed out to the edge, and you've got research from Wikibon on the intelligent edge. You and David Floyer and the rest of the team are doing some real amazing work at Wikibon.com. Check it out, subscription required. What's the edge strategy? What does that actually mean for IT practitioners out there? Certainly we heard from Bask Iyer, the CIO of Dell, who said, "Most CIOs are conservative and don't usually jump on these waves." They missed mobile, they missed some other waves. His mandate was: CIOs, don't miss the IoT wave. So what does IoT, this edge-of-the-network thing, mean for a CIO? >> Well, the first thing is, in hardcore circumstances, many CIOs aren't even involved in the edge. 
So if you take a look, if you go into where a lot of the edge domains are really crucial, you see a plant manager that's more responsible for what's going on at the edge than the CIO. The CIO is handling the corporate systems. The plant manager is handling what's actually happening at the edge. The operational technology stuff. So the first thing is we're going to see a slow circling of the IT and OT organizations about who's going to win-- >> OT meaning Operational Technology. >> Operational Technology. Just as we saw a slow circling back in the 1990s when TCP/IP came in and blew away DEC, and blew away everybody, and started blowing away the telecom divisions, or telecom functions, inside large enterprises. >> So you think that IoT is going to be as disruptive as TCP/IP was in standardizing the network layer. >> Oh absolutely, absolutely. It's going to have an enormous impact because there's so many new sources. How to think about the data, and that was the second point I was going to make, John, is we do not currently have architectural standards in place for thinking about how this stuff is going to come together. And it's something that David Floyer and I and the Wikibon team are working on, and I hope to come out with some research, actually probably next month, on what we call automation zones or data zones or probably edge zones. Which is, just as we think about security zones today, how do we think about edge zones? Where the edge zone is defined by a moment, an automation moment, and you cannot have data outside of that zone. And that needs to become an architectural principle where OT and IT can work together and say, "What data has to be in that zone? I'll make sure my data gets there, you make sure your data gets there. We'll figure out how control happens, and that's how we drive this thing forward." 
>> Well, just to give you a prop here on theCUBE: Wikibon was right about flash, they were right about hyperconvergence and converged infrastructure. Big bets early on that were kind of like, people were like, "What?" And certainly vSAN and Server SAN, although some people will disagree with this. >> They were right about the edge. >> Now, I think you're way right on the edge and you're way right on the value of data. >> Yeah. >> I think those are two stands that you're taking that will be-- >> And let's give great props to David Floyer, who was a catalyst for thinking many of these things through. >> Alright Paul, final word from you. Obviously, you know, as a veteran, you've covered it all. Okay, what's your take? How's the wind blowing, what's your instinct tell you of what's happening? >> I think it's generally good, but it's hard to tell from conferences. As you know, John, the reason most conferences are so boring is that there's no tension, there's no conflict. It's all good, everybody's happy and everybody's doin' a great job. That's the very same thing that we're seeing here. >> Rah rah, Kool-Aid injection. >> One thing I can't help noticing is that if you look at the keynote agenda for the three days, there's not a single customer on the keynote agenda. Which I think is a problem. I don't think that says good things about where Dell is really focusing its message right now. At most big company conferences, there are lots and lots of customers who come up on stage. I think Dell is still thinking about, I mean, it's a technology-focused company. They're thinking about technology integration right now. >> So speeds and feeds. >> Yeah, you hear a lot of speeds and feeds. >> Everybody wants to be the most important thing in the enterprise, and they still want hardware to be the most important thing. 
>> Well, I would agree with you 100%, but I just think, in this acquisition, I mean, sorry, merger of equals, they have a lot of herding of cats going on right now. There's a lot of herding of the portfolio and not a lot of overlap, but I can see them kind of making room on the stage for that. But I do agree, customers do tell the best story. >> And in the long run, as Peter said, that is what is going to make the difference. Are the customers happy? >> Guys, amazing exchange. Thanks so much, Peter, for comin' out and takin' some time out of your busy schedule to come on theCUBE and share your insight. And Paul, as always, great commentary, great analysis. We're in the third day of our three days of coverage here on theCUBE, with more live coverage from day three of Dell EMC World 2017. We'll be right back, stay with us, we'll be right back after this short break.

Published Date : May 10 2017

