Gokula Mishra | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's theCUBE covering MIT Chief Data Officer and Information Quality Symposium 2019 brought to you by SiliconANGLE Media. (upbeat techno music) >> Hi everybody, welcome back to Cambridge, Massachusetts. You're watching theCUBE, the leader in tech coverage. We go out to the events. We extract the signal from the noise, and we're here at the MIT CDOIQ Conference, Chief Data Officer Information Quality Conference. It is the 13th year here at the Tang building. We've outgrown this building and have to move next year. It's fire marshal full. Gokula Mishra is here. He is the Senior Director of Global Data and Analytics and Supply Chain-- >> Formerly. Former, former Senior Director. >> Former! I'm sorry. It's former Senior Director of Global Data Analytics and Supply Chain at McDonald's. Oh, I didn't know that. I apologize my friend. Well, welcome back to theCUBE. We met when you were at Oracle doing data. So you've left that, you're on to your next big thing. >> Yes, thinking through it. >> Fantastic, now let's start with your career. You've had, so you just recently left McDonald's. I met you when you were at Oracle, so you cut over to the dark side for a while, and then before that, I mean, you've been a practitioner all your life, so take us through sort of your background. >> Yeah, I mean my beginning was really with a company called Tata Burroughs. Those days we did not have a lot of work getting done in India. We used to send people to U.S. so I was one of the pioneers of the whole industry, coming here and working on very interesting projects. But I was lucky to be working on mostly data analytics related work, joined a great company called CS Associates. I did my Master's at Northwestern. In fact, my thesis was intelligent databases. So, building AI into the databases and from there on I have been with Booz Allen, Oracle, HP, TransUnion, I also run my own company, and Sierra Atlantic, which is part of Hitachi, and McDonald's. 
>> Awesome, so let's talk about use of data. It's evolved dramatically as we know. One of the themes in this conference over the years has been sort of, I said yesterday, the Chief Data Officer role emerged from the ashes of sort of governance, kind of back office information quality compliance, and then ascended with the tailwind of the Big Data meme, and it's kind of come full circle. People are realizing actually to get value out of data, you have to have information quality. So those two worlds have collided together, and you've also seen the ascendancy of the Chief Digital Officer who has really taken a front and center role in some of the more strategic and revenue generating initiatives, and in some ways the Chief Data Officer has been a supporting role to that, providing the quality, providing the compliance, the governance, and the data modeling and analytics, and a component of it. First of all, is that a fair assessment? How do you see the way in which the use of data has evolved over the last 10 years? >> So to me, primarily, the use of data was, in my mind, mostly around financial reporting. So, anything that companies needed to run their company, any metrics they needed, any data they needed. So, if you look at all the reporting that used to happen it's primarily around metrics that are financials, whether it's around finances around operations, finances around marketing effort, finances around reporting if it's a public company reporting to the market. That's where the focus was, and so therefore a lot of the data that was not needed for financial reporting was what we call nowadays dark data. This is data we collect but don't do anything with it. 
Then, as the capability of the computing, and the storage, and new technologies, and new techniques evolved, and were able to handle more variety and more volume of data, people quickly realized how much potential they have in the other data outside of the financial reporting data that they can utilize too. So, some of the pioneers leveraged that and actually improved a lot in their efficiency of operations, came out with innovation. You know, GE comes to mind as one of the companies that actually leveraged data early on, and a number of other companies. Obviously, you look at today, data is defining some of the multi-billion dollar companies, and all they have is data. >> Well, Facebook, Google, Amazon, Microsoft. >> Exactly. >> Apple, I mean Apple obviously makes stuff, but those other companies, they're data companies. I mean largely, and those five companies have the highest market value on the U.S. stock exchange. They've surpassed all the other big leaders, even Berkshire Hathaway. >> So now, what is happening is because the market changes, the forces that are changing the behavior of our consumers and customers, which I talked about, which is everyone now is digitally engaging with each other. What that does is all the experiences now are being captured digitally, all the services are being captured digitally, all the products are creating a lot of digital exhaust of data, and so now companies have to pay attention to engage with their customers and partners digitally. Therefore, they have to make sure that they're leveraging data and analytics in doing so. The other thing that has changed is the time to decision, the time to act on the data insight that you get, is shrinking, and shrinking, and shrinking, so a lot more decision-making is now going real time.
Therefore, you have a situation now, you have the capability, you have the technology, you have the data now, you have to make sure that you convert that into what I call programmatic kind of data decision-making. Obviously, there are people involved in more strategic decision-making. So, that's more manual, but at the operational level, it's going to more programmatic decision-making. >> Okay, I want to talk-- By the way, I've seen a stat, I don't know if you can confirm this, that 80% of the data that's out there today is dark data or it's data that's behind a firewall or not searchable, not open to Google's crawlers. So, there's a lot of value there-- >> So, I would say that percent is declining over time as companies have realized the value of data. So, more and more companies are removing the silos, bringing that dark data out. I think the key to that is companies being able to value their data, and as soon as they are able to value their data, they are able to leverage a lot of the data. I still believe there's a large percent still not used or accessed in companies. >> Well, and of course you talked a lot about data monetization. Doug Laney, who's an expert in that topic, we had Doug on a couple years ago, just after he wrote Infonomics. He was on yesterday. He's got a very detailed prescription; he makes strong cases as to why data should be valued like an asset. I don't think anybody really disagrees with that, but then he gave kind of a how-to-do-it, which will, somewhat, make your eyes bleed, but it was really well thought out, as you know. But you talked a lot about data monetization, you talked about a number of ways in which data can contribute to monetization. Revenue, cost reduction, efficiency, risk, and innovation. Revenue and cost is obvious. I mean, that's where the starting point is. Efficiency is interesting.
I look at efficiency as kind of a doing more with less but it's sort of a cost reduction, but explain why it's not in the cost bucket, it's different. >> So, it first starts with doing what we do today cheaper, better, faster, and doing more comes after that because if you don't understand, and data is the way to understand how your current processes work, you will not take the first step. So, to take the first step is to understand how can I do this process faster, and then you focus on cheaper, and then you focus on better. Of course, faster is because of some of the market forces and customer behavior that's driving you to do that process faster. >> Okay, and then the other one was risk reduction. I think that makes a lot of sense here. Actually, let me go back. So, one of the key pieces of it, of efficiency, is time to value. So, if you can compress the time, or accelerate the time and you get the value, that means more cash in house faster, whether it's cost reduction or-- >> And the other aspect you look at is, can you automate more of the processes, and in that way it can be faster. >> And that hits the income statement as well because you're reducing headcount cost of your, maybe not reducing headcount cost, but you're getting more out of your headcount, you're reallocating them to more strategic initiatives. Everybody says that but the reality is you hire less people because you just automated. And then, risk reduction, so the degree to which you can lower your expected loss. That's just thinking in insurance terms, that's tangible value, certainly to large corporations, but even midsize and small corporations. Innovation, I thought was a good one, but maybe you could use an example of, give us an example of how in your career you've seen data contribute to innovation. >> So, I'll give an example of the oil and gas industry. If you look at speed of innovation in the oil and gas industry, they were all paper-based.
I don't know how much you know about drilling. A lot of the assets that go into figuring out where to drill, how to drill, and actually drilling and then taking the oil or gas out, and of course selling it to make money. All of those processes were paper based. So, if you can imagine trying to optimize a paper-based innovation, it's very hard. Not only that, it's very, very siloed by itself because it's on paper, it's in someone's drawer or file. So, it's siloed by design and so one thing that the industry has gone through, they recognized that they have to optimize the processes to be better, to innovate, to find, for example, shale gas was a result of digitizing the processes because otherwise you can't drill faster, cheaper, better to leverage the shale gas drilling that they did. So, the industry went through actually digitizing a lot of the paper assets. So, they went from not having data to knowingly creating the data that they can use to optimize the process and then in the process they're innovating new ways to drill the oil well cheaper, better, faster. >> In the early days of oil exploration in the U.S., go back to the Osage Indian tribe in northern Oklahoma, and they brilliantly, when they got shuttled around, they were pushed out of Kansas and they negotiated with the U.S. government that they maintain the mineral rights and so they became very, very wealthy. In fact, at one point they were the wealthiest per capita individuals in the entire world, and they used to hold auctions for various drilling rights. So, it was all gut feel, all the oil barons would train in, and they would have an auction, and it was, again, it was gut feel as to which areas were the best, and then of course they evolved, you remember it used to be you drill a little hole, no oil, drill a hole, no oil, drill a hole. >> You know how much that cost? >> Yeah, the expense is enormous, right? >> It can vary from 10 to 20 million dollars. >> Just a giant expense.
So, now today fast-forward to this century, and you're seeing much more sophisticated-- >> Yeah, I can give you another example in pharmaceutical. They develop new drugs, it's a long process. So, one of the initial processes is to figure out what molecules they should be exploring in the next step, and you could have a thousand different combinations of molecules that could treat a particular condition, and now, with digitization and data analytics, they're able to do this in a virtual world, kind of creating a virtual lab where they can test out thousands of molecules. And then, once they can bring it down to a few, then the physical aspect of that starts. Think about innovation really shrinking their processes. >> All right, well I want to say this about cloud. You made the statement in your keynote, how many people out there think cloud is cheaper, or maybe you even said cheap, but I inferred cheaper than on-prem, and so it was a loaded question so nobody put their hand up, they're afraid, but I put my hand up because we don't have any IT. We used to have IT. It was a nightmare. So, for us it's better, but in your experience, I think I'm inferring correctly that you had meant cheaper than on-prem, and certainly we talked to many practitioners who have large systems that when they lift and shift to the cloud, they don't change their operating model, they don't really change anything, they get a bill at the end of the month, and they go "What did this really do for us?" And I think that's what you mean-- >> So what I mean, let me make it clear, is that there are certain use cases where cloud is cheaper and, as you saw, people did raise their hand saying "Yeah, I have use cases where cloud is cheaper." I think you need to look at the whole thing. Cost is one aspect. The flexibility and agility of being able to do things is another aspect.
For example, if you have a situation where your stakeholders want to do something for three weeks, and they need five times the computing power, and the data that they are buying from outside to do that experiment. Now, imagine doing that in a physical world. It's going to take a long time just to procure and get the physical boxes, and then you'll be able to do it. In cloud, you can enable that, you can get GPUs depending on what problem we are trying to solve. That's another benefit. You can get the fit-for-purpose computing environment for that, and so there is a lot of flexibility, agility, all of that. It's a new way of managing it so people need to pay attention to the cost because it will add to the cost. The other thing I will point out is that if you go to the public cloud, they make it cheaper because they have hundreds and thousands of this canned CPU: this much computing power, this much memory, this much disk, this much connectivity, and they build thousands of them, and that's why it's cheaper. Well, if your need is something that's very unique and they don't have it, that's when it becomes a problem. Either you need more of those and the cost will be higher. So, now we are getting to the IoT world. The volume of data is growing so much, and the type of processing that you need to do is becoming more real-time, and you can't just move all this bulk of data, and then bring it back, and move the data back and forth. You need a special type of computing, which is at the, what Amazon calls it, edge computing. And the industry is kind of trying to design it. So, that is an example of hybrid computing evolving out of a cloud or out of the necessity that you need a special purpose computing environment to deal with new situations, and all of it can't be in the cloud. >> I mean, I would argue, well I guess Microsoft with Azure Stack was kind of the first, although not really.
Now, they're there but I would say Oracle, your former company, was the first one to say "Okay, we're going to put the exact same infrastructure on prem as we have in the public cloud." Oracle, I would say, was the first to truly do that-- >> They were doing hybrid computing. >> You now see Amazon with Outposts has done the same, Google kind of has a similar approach to Azure, and so it's clear that hybrid is here to stay, at least for some period of time. I think the cloud guys probably believe that ultimately it's all going to go to the cloud. We'll see, it's going to be a long, long time before that happens. Okay! I'll give you last thoughts on this conference. You've been here before? Or is this your first one? >> This is my first one. >> Okay, so your takeaways, your thoughts, things you might-- >> I am very impressed. I'm a practitioner and finding so many practitioners coming from so many different backgrounds and industries. It's very, very enlightening to listen to their journey, their story, their learnings in terms of what works and what doesn't work. It is really invaluable. >> Yeah, I tell you this, it's always a highlight of our season and Gokula, thank you very much for coming on theCUBE. It was great to see you. >> Thank you. >> You're welcome. All right, keep it right there everybody. We'll be back with our next guest. I'm Dave Vellante. Paul Gillin is in the house. You're watching theCUBE from MIT. Be right back! (upbeat techno music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Hitachi | ORGANIZATION | 0.99+ |
Apple | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Doug Laney | PERSON | 0.99+ |
five times | QUANTITY | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Kansas | LOCATION | 0.99+ |
TransUnion | ORGANIZATION | 0.99+ |
Paul Gillin | PERSON | 0.99+ |
HP | ORGANIZATION | 0.99+ |
three weeks | QUANTITY | 0.99+ |
India | LOCATION | 0.99+ |
10 | QUANTITY | 0.99+ |
Sierra Atlantic | ORGANIZATION | 0.99+ |
Gokula Mishra | PERSON | 0.99+ |
Doug | PERSON | 0.99+ |
hundreds | QUANTITY | 0.99+ |
Berkshire Hathaway | ORGANIZATION | 0.99+ |
five companies | QUANTITY | 0.99+ |
80% | QUANTITY | 0.99+ |
U.S. | LOCATION | 0.99+ |
Booz Allen | ORGANIZATION | 0.99+ |
Tata Burroughs | ORGANIZATION | 0.99+ |
first step | QUANTITY | 0.99+ |
Gokula | PERSON | 0.99+ |
next year | DATE | 0.99+ |
thousands | QUANTITY | 0.99+ |
McDonald's | ORGANIZATION | 0.99+ |
one aspect | QUANTITY | 0.99+ |
Cambridge, Massachusetts | LOCATION | 0.99+ |
SiliconANGLE Media | ORGANIZATION | 0.99+ |
first | QUANTITY | 0.99+ |
yesterday | DATE | 0.99+ |
thousands of molecules | QUANTITY | 0.99+ |
first one | QUANTITY | 0.99+ |
One | QUANTITY | 0.98+ |
GE | ORGANIZATION | 0.98+ |
northern Oklahoma | LOCATION | 0.98+ |
today | DATE | 0.97+ |
CS Associates | ORGANIZATION | 0.97+ |
20 million dollars | QUANTITY | 0.97+ |
one | QUANTITY | 0.96+ |
First | QUANTITY | 0.96+ |
Global Data and Analytics and Supply Chain | ORGANIZATION | 0.95+ |
MIT CDOIQ Conference | EVENT | 0.95+ |
13th year | QUANTITY | 0.94+ |
U.S. government | ORGANIZATION | 0.93+ |
two worlds | QUANTITY | 0.92+ |
Azure Stack | TITLE | 0.91+ |
one thing | QUANTITY | 0.9+ |
one point | QUANTITY | 0.9+ |
Northwestern | ORGANIZATION | 0.9+ |
couple years ago | DATE | 0.89+ |
MIT Chief Data Officer and Information Quality Symposium 2019 | EVENT | 0.87+ |
this century | DATE | 0.85+ |
Tang building | LOCATION | 0.85+ |
Global Data Analytics and | ORGANIZATION | 0.83+ |
Chief Data Officer Information Quality Conference | EVENT | 0.81+ |
MIT | ORGANIZATION | 0.78+ |
theCUBE | ORGANIZATION | 0.77+ |
thousand different combination of molecules | QUANTITY | 0.74+ |
last | DATE | 0.67+ |
years | DATE | 0.66+ |
U.S. | ORGANIZATION | 0.66+ |
billion dollar | QUANTITY | 0.65+ |
themes | QUANTITY | 0.65+ |
Osage Indian | OTHER | 0.64+ |
Kevin Bates, Fannie Mae | Corinium Chief Analytics Officer Spring 2018
>> From the Corinium Chief Analytics Officer Conference Spring San Francisco, it's The Cube. >> Hey, welcome back, Jeff Frick with The Cube. We're in downtown San Francisco at the Corinium Chief Analytics Officer Spring event. We go to Chief Data Officer, this is Chief Analytics Officer. There's so much activity around big data and analytics and this one is really focused on the practitioners. Relatively small event, and we're excited to have another practitioner here today and it's Kevin Bates. He's the VP of Enterprise Data Strategy Execution for Fannie Mae. Kevin, welcome. >> It's a mouthful. Thank you. >> You've got it all. You've got strategy, which is good, and then you've got execution. And you've been at Fannie Mae for 15 years according to your LinkedIn, so you've seen a lot of changes. Give us kind of your perspective as this train keeps rolling down the tracks. >> OK. Yeah, so it's been a wild ride. I've been there, like you say, for 15 years. When I started off there I was writing code, working on their underwriting systems. And I've been in different divisions including the credit loss division, which had a pretty exciting couple of years back around 2008. >> More exciting than you care to-- >> Well, there was certainly a lot going on. Data's been sort of a consistent theme throughout my career, so the data, at Fannie Mae, not unlike most companies, is really the blood that keeps the entire organism functioning. So over the past few years I've actually moved into the Enterprise Data Division of the company where I have responsibility for delivery, operations, platforms, the whole 9 yards. And that's really given me the unique view of what the company does. It's given me the opportunity to touch most of the different business areas and learn a lot about what we need to do better. >> So how has the perspective changed around the data? Before, data was almost a liability because you had to store it, keep it, manage it, and take good care of it.
Now it's a core asset and we see the valuations up and down. It's probably the number one driver of some of the crazy valuations that you see in a lot of the companies. So how has that attitude changed, and what have you done to take advantage of that shift in attitude? >> Sure, it's a great question. So I think the data has always been the life blood and key ingredient to success for the company, but the techniques of managing the data have changed for sure, and with that the culture has to change and how you think about the data has to change. If you go back 10 years ago all of our data was stored in our data center, which means that we had to pay for all of those servers, and every time data kept getting bigger we had to buy more servers and it almost became like a bad thing. >> That's what I said, almost like a liability. >> That's right. And as we've certainly started adopting the cloud and technologies associated with the cloud you may step into that thinking "OK, now I don't have to manage my own data center, I'll let Amazon or whoever do it for me." But it's much more fundamental than that because as you start embracing the cloud, now storage is no longer a limitation and compute is no longer a limitation, and the number of tools that you use is no longer really a limitation. So as an organization you have to change your way of thinking from "I'm going to limit the number of business intelligence tools that my users can take advantage of" to "How can I support them to use whatever tools they want?" So the mentality around the data I think really goes to how can I make sure the right data is available at the right time with the right quality checks so that everybody can say "yep, I can hang my hat on that data" but then get out of the way and let them self serve from there. It's very challenging, there's a lot of new tools and technologies involved.
>> And that's a huge piece of the old innovation game to have the right data for the right people with the right tools and let more people play with it. But you've got this other pesky thing like governance. You've got a lot of legal restrictions and regulations and compliances. So how do you fold that into opening up the goodies, if you will. >> So I think one effort we have is we're building a platform we call the Enterprise Data Infrastructure so for that 85 percent of data at Fannie Mae what we do is loans, we create securities from the loans. And there's liabilities. There's a pretty finite set of data areas that are pretty much consistent at Fannie Mae and everybody uses those data sets. So taking those and calling them enterprise data sets that will be centralized they will be presented to our customers in a uniform way with all of the data quality checks in place. That's the big effort. It means that you're standardizing your data. You're performing a consistent data quality approach on that data and then you're making it available through any number of consumption patterns so that can be applications needed, so I'm integrating applications. It could be warehousing analytics. But it's the same data and it comes from that promise that we've tagged it enterprise data and we've done that good stuff to make sure that it's good, that it's healthy. That we know where we stand so if it's not a good data set we know how to tag it and make it such. For all the other data around we have to let our business partners be accountable for how they're enriching that data and innovating and so forth. But governance is not a - I think in the past another part of your question, governance used to be more of a, slow everybody down but if we can incorporate governance and have implied governance in the platform and then allow the customers to self serve off of that platform, governance becomes really that universal good. 
That thing that allows you to be confident that you can take the data and innovate with that data. >> So I'm curious how much of the value add now comes from the not enterprise data. The outside the core which you've had forever. What's the increasing importance and overlay of that exterior data to your enterprise data to drive more value out of your enterprise data? >> So that enterprise data, like I say, may be the 85%, it's just the facts. These are the loans we brought in. Here's how we can aggregate risk or how we can aggregate what we call UPB, or the value of our loans. That is pretty generic and it's intended to be. The third party data sets that our business partners may bring in, that they bump up against that data, can give them strategic advantages. Also the data that those businesses generate, that our business lines generate within their local applications, which we would not call enterprise data, that's very much their special sauce. That's something that the broader organization doesn't need. Those things are all really what our data scientists and our business people combine to create the value added reports that they use for decisioning and so forth. >> And then I'm curious how the big data and the analytics environment has changed from the old day where you had some PhDs and some super bright guys that ran super hard algorithms and it was on Mahogany Row and you put in the request and maybe from on high someday you'll get your request, versus really trying to enable a broader set of analysts to have access to that data with a much broader set of tools, enabling a bunch of tools versus picking the one or two winners that are very expensive, you've got to limit the seats, et cetera. How has that changed the culture of the company as well as the way that you are able to deliver products and deliver new applications if you will? >> So I think that's a work in progress. We still have all the PhDs and they still really call the shots.
They're the ones that get the call from the Executive Vice President and they want to see something today that tells them what decision they should make. We have to enable them. They were enabled in the past by having people basically hustle to get them what they need. The big change we're trying to make now is to present the data in a common platform where they really can take it and run with it, so there is a change in how we're delivering our systems to make sure we have the lowest level of granularity. That we have real time data. There's no longer waiting. And the technology tools that have come out in the past 10 years have enabled that. It's not just about implementing that, making it available to all those PhDs. There's another population of analysts that is now empowered where they were not before. The guys that suffered just using Excel or Access databases, that were, I would call them, not the power users but the empowered analysts. The ones who know the data, know how to query data, but they're not hard core quants and they're not developers. Those guys have access to a plethora of tools now that were never available before that allow them to wrangle data from 20 different data sets, align it, ask questions of it. And they're really focused on operations and running our systems in a smoother, lower cost way. So I think the granularity, the timing, and support for that explosion of tools. We'll still have the big, heavy SAS and R users that are the quants. I think that's the combination; everything has to be supported and we'll support it better with higher quality, with more recent data, but the culture change isn't going to happen even in a few years. It will be a longer term path for larger organizations to really see maybe possibilities where they can restructure themselves based on technology. Right now the technologies are early enough and young enough that I think they're going to wait and see.
>> Obviously you have a ton of legacy systems, you have all these tools. You have that core set, your enterprise data that doesn't really change that much. What's the objective down the road? Are you looking to expand on that core set? Is it such a fixture that you can't do anything with it in terms of flexibility? Where do you go from here? If we were to sit down three years from now what are we going to be talking about? >> So two things. One, I hope I'll be looking back with excitement at my huge success at transforming those legacy systems. In particular we have what we call the legacy warehouses that have been around well over 20 years, that are limited and have not been updated because we've been trying to retire them for many years. Folding all of that into my core enterprise data infrastructure that will be fully aligned on terminology, on near-real time, all those things. That will be a huge success, I'll be looking back and glowing about how we did that and how we've empowered the business with that core data set that is uniquely available on this platform. They don't need to go anywhere else to find it. The other thing I think we'll see is enabling analysts to utilize cloud-based assets and really be successful working both with our on-premises data center, our own data center-supported applications, but also starting to move their heavy running quantitative modeling and all the sorts of things they do into the data lake, which will be cloud based, and really enabling that as a true kind of empowerment for them so they can use a different set of tools. They can move all that heavy lifting, and the servers they sometimes bring down right now, into an environment where they can really manage their own performance. I think those are going to be the two big changes three years from now that will feel like we're in the next generation. >> All right. Kevin Bates, projecting the future so we look forward to that day.
Thanks for taking a few minutes out of your day. >> Thank you. >> All right, thanks. He's Kevin, I'm Jeff. You're watching The Cube from the Corinium Chief Analytics Officer Event in San Francisco. Thanks for watching. (music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jeff Frick | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Kevin Bates | PERSON | 0.99+ |
Corinium | ORGANIZATION | 0.99+ |
Fannie Mae | ORGANIZATION | 0.99+ |
Jeff | PERSON | 0.99+ |
15 years | QUANTITY | 0.99+ |
Kevin | PERSON | 0.99+ |
one | QUANTITY | 0.99+ |
85% | QUANTITY | 0.99+ |
85 percent | QUANTITY | 0.99+ |
Fannie Mae | PERSON | 0.99+ |
excel | TITLE | 0.99+ |
20 different data sets | QUANTITY | 0.99+ |
9 yards | QUANTITY | 0.99+ |
San Francisco | LOCATION | 0.99+ |
ORGANIZATION | 0.99+ | |
two things | QUANTITY | 0.99+ |
Spring 2018 | DATE | 0.99+ |
One | QUANTITY | 0.99+ |
both | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
10 years ago | DATE | 0.98+ |
three years | QUANTITY | 0.97+ |
two big | QUANTITY | 0.96+ |
over 20 years | QUANTITY | 0.95+ |
Corinium Chief Analytics Officer | EVENT | 0.92+ |
two winners | QUANTITY | 0.92+ |
Executive Vice President | PERSON | 0.91+ |
2008 | DATE | 0.89+ |
Mahogany Row | TITLE | 0.88+ |
Corinium Chief Analytics Officer Conference | EVENT | 0.87+ |
The Cube | ORGANIZATION | 0.85+ |
years | DATE | 0.76+ |
Corinium Chief | EVENT | 0.73+ |
past 10 years | DATE | 0.72+ |
one effort | QUANTITY | 0.72+ |
The Cube | TITLE | 0.7+ |
Fannie | ORGANIZATION | 0.7+ |
Officer | EVENT | 0.69+ |
past | DATE | 0.59+ |
couple of years | DATE | 0.59+ |
Mae | PERSON | 0.55+ |
Enterprise | ORGANIZATION | 0.52+ |
Spring | EVENT | 0.51+ |
Officer | PERSON | 0.48+ |
Data | OTHER | 0.39+ |
Prakash Nanduri, Paxata | Corinium Chief Analytics Officer Spring 2018
(techno music) >> Announcer: From the Corinium Chief Analytics Officer Conference Spring San Francisco. It's theCUBE. >> Hey, welcome back everybody. Jeff Frick here with theCUBE. We're in downtown San Francisco at the Parc 55 Hotel at the Corinium Chief Analytics Officer Spring 2018 event, about 100 people, pretty intimate affair. A lot of practitioners here talking about the challenges of Big Data and the challenges of Analytics. We're really excited to have a very special Cube guest. I think he was the first guy to launch his company on theCUBE. It was Big Data New York City 2013. I remember it distinctly. It's Prakash Nanduri, the co-founder and CEO of Paxata. Great to see you. >> Great seeing you. Thank you for having me back. >> Absolutely. You know we got so much mileage out of that clip. We put it on all of our promotional materials. You going to launch your company? Launch your company on theCUBE. >> You know it seems just like yesterday but it's been a long ride and it's been a fantastic ride. >> So give us just a quick general update on the company, where you guys are now, how things are going. >> Things are going fantastic. We continue to grow. If you recall, when we launched, we launched the whole notion of democratization of information in the enterprise with self service data prep. We have gone on to now deliver real value to some of the largest brands in the world. We're very proud that 2017 was the year when a massive amount of adoption of Paxata's adaptive information platform was taken across multiple industries, financial services, retail, CPG, high tech, in the IoT space. So, we just keep growing and it's the usual challenges of managing growth and managing, you know, the change in the company as you grow from being a small start-up to now being a real company. >> Right, right. There's good problems and bad problems. Those are the good problems. >> Yes, yes. 
>> So, you know, we do so many shows and there's two big themes over and over and over like digital transformation which gets way over used and then innovation and how do you find a culture of innovation. In doing literally thousands of these interviews, to me it seems pretty simple. It is about democratization. If you give more people the data, more people the tools to work with the data, and more people the power to do something once they find something in the data, and open that up to a broader set of people, they're going to find innovations, simply the fact of doing it. But the reality is those three simple steps aren't necessarily very easy to execute. >> You're spot on, you're spot on. I like to say that when we talk about digital transformation the real focus should be on the deed. And it really centers around data and it centers around the whole notion of democratization, right? The challenge always in large enterprises is democratization without governance becomes chaos. And we always need to focus on democratization. We need to focus on data because as we all know data is the new oil, all of that, and governance becomes a critical piece too. But as you recall, when we launched Paxata, the entire vision from day one has been while the entire focus around digitization covers many things right? It covers people processes. It covers applications. It's a very large topic, the whole digital transformation of enterprise. But the core foundation to digital transformation, data democratization governance, but the key issue is the companies that are going to succeed are the companies that turn data into information that's relevant for every digital transformation effort. >> Right, right. >> Because if you do not turn raw data into information, you're just dealing with raw data which is not useful >> Jeff: Right >> And it will not be democratized. 
>> Jeff: Right >> Because the business will only consume the information that is contextual to their need, the information that's complete and the information that is clean. >> Right, right. >> So that's really what we're driving towards. >> And that's interesting 'cause the data, there's so many more sources of data, right? There's data that you control. There's structured data, unstructured data. You know, I used to joke, just the first question when you'd ask people "Where's your data?", half the time they couldn't even, they couldn't even get beyond that step. And that's before you start talking about cleaning it and making it ready and making it available. Before you even start to get into governance and rights and access so it's a really complicated puzzle to solve on the backend. >> I think it starts with first focusing on what are the business outcomes we are driving with digital transformation. When you double-click on digital transformation and then you start focusing on data and information, there's a few things that come to fore. First of all, how do I leverage information to improve productivity in my company? There's multiple areas, whether it is marketing or supply chain or whatever. The second notion is how do I ensure that I can actually transform the culture in my company and attract the brightest and the best by giving them the environment where democratization of information is actually reality, where people feel like they're empowered to access data and turn it into information and then be able to do really interesting things. Because people are not interested in being subservient to somebody who gives them the data. They want to be saying "Give it to me. "I'm smart enough. "I know analytics. "I think analytically and I want to drive my career forward." So the second thing is the cultural aspect to it. 
And the last thing, which is really important is every company, regardless of whether you're making toothpicks or turbines, you are looking to monetize data. So it's about productivity. It's about cultural change and attracting of talent. And it's about monetization. And when it comes to monetization of data, you cannot be satisfied with only covering enterprise data which is sitting in my enterprise systems. You have to be able to focus on, oh, how can I leverage the IoT data that's being generated from my products or widgets. How can I generate social and mobile? How can I consume that? How can I bring all of this together and get the most complete insight that I need for my decision-making process? >> Right. So, I'm just curious, how do you see it in your customers? So this is the chief analytics officer, we go to chief data officer, I mean, there's all these chief something officers that want to get involved in data and marketing is much more involved with it. Forget about manufacturing. So when you see successful cultural change, what drives that? Who are the people that are successful and what is the secret to driving the cultural change that we are going to be data-driven, we are going to give you the tools, we are going to make the investment to turn data which historically was even arguably a liability 'cause you had to buy a bunch o' servers to stick it on, into that now being an asset that drives actionable outcomes? >> You know, recently I was having this exact discussion with the CEO of one of the largest financial institutions in the world. This gentleman is running a very large financial services firm, is dealing with all the potential disruption where they're seeing completely new types of fintech products coming in, the whole notion of blockchain et cetera coming in. Everything is changing. Everything looks very dramatic. And what we started talking about is the first thing as the CEO that we always focus on is do we have the right people? 
And do we have the people that are motivated and driven to basically go and disrupt and change? For those people, you need to be able to give them the right kind of tools, the right kind of environment to empower them. This doesn't start with lip service. It doesn't start with us saying "We're going to be on a digital transformation journey" but at the same time, your data is completely in silos. It's locked up. There are 15,000 checks and balances before I can even access a simple piece of data and third, even when I get access to it, it's too little, too late or it's garbage in, garbage out. And that's not the culture. So first, it needs to be CEO-driven, top down. We are going to go through digital transformation which means we are going to go through a democratization effort which means we are going to look at data and information as an asset and that means we are not only going to be able to harness these assets, but we're also going to monetize these assets. How are we going to do it? It depends very much on the business you're in, the vertical industry you play in, and your strengths and weaknesses. So each company has to look at it from their perspective. There's no one size fits all for everyone. >> Jeff: Right. >> There are some companies that have fantastic cultures of empowerment and openness but they may not have the right innovation or the right kind of product innovation skills in place. So it's about looking at data across the board. First from your culture and your empowerment, second about democratization of information which is where a company like Paxata comes in, and third, along with democratization, you have to focus on governance because we are for-profit companies. We have a fiduciary responsibility to our customers and our regulators and therefore we cannot have democratization without governance. >> Right, right >> And that's really what our biggest differentiation is. 
>> And then what about just in terms of the political play inside the company. You know, on one hand, used to be if you held the information, you had the power. And now that's changed really 'cause there's so much information. It's really, if you are the conduit of information to help people make better decisions, that's actually a better position to be. But I'm sure there's got to be some conflicts going through digital transformation where I, you know, I was the keeper of the kingdom and now you want to open that up. Conversely, it must just be transformational for the people on the front lines that finally get the data that they've been looking for to run the analysis that they want to rather than waiting for the weekly reports to come down from on high. >> You bet. You know what I like to say is that if you've been in a company for 10, 15 years and if you felt like a particular aspect, purely selfishly, you felt a particular aspect was job security, that is exactly what's going to likely make you lose your job today. What you thought 10 years ago was your job security, that's exactly what's going to make you lose your job today. So if you do not disrupt yourself, somebody else will. So it's either transform yourself or not. Now this whole notion of politics and you know, struggle within the company, it's been there for as long as, humans generally go towards entropy. So, if you have three humans, you have all sort of issues. >> Jeff: Right, right. >> The issue starts frankly with leadership. It starts with the CEO coming down and not only putting an edict down on how things will be done but actually walking the walk with talking the talk. If, as a CEO, you're not transparent, if you're not trusting your people, if you're not sharing information which could be confidential, but you mention that it's confidential but you have to keep this confidential. If you trust your people, you give them the ability to, I think it's a culture change thing. 
And the second thing is incentivisation. You have to be able to focus on giving people the ability to say "by sharing my data, "I actually become a hero." >> Right, right. >> By giving them the actual credit for actually delivering the data to achieve an outcome. And that takes a lot of work. But if you do not actually drive the cultural change, you will not drive the digital transformation and you will not drive the democratization of information. >> And have you seen people try to do it without making the commitment? Have you seen 'em pay the lip service, spend a few bucks, start a project but then ultimately they, they hamstring themselves 'cause they're not actually behind it? >> Look, I mean, there's many instances where companies start on digital transformation or they start jumping into cool terms like AI or machine-learning, and there's a small group of people who are kind of the elites that go in and do this. And they're given all the kind of attention et cetera. Two things happen. Because these people who are quote, unquote, the elite team, either they are smart but they're not able to scale across the organization or many times, they're so good, they leave. So that transformation doesn't really get democratized. So it is really important from day one to start a culture where you're not going to have a small group of exclusive data scientists. You can have those people but you need to have a broader democratization focus. So what I have seen is many of the siloed, small, tight, mini science projects end up failing. They fail because number one, either the business outcome is not clearly identified early on or two, it's not scalable across the enterprise. >> Jeff: Right. >> And a majority of these exercises fail because the whole information foundation that is taking raw data turning it into clean, complete, potential consumable information, to feed across the organization, not just for one siloed group, not just one data science team. 
But how do you do that across the company? That's what you need to think from day one. When you do these siloed things, these departmental things, a lot of times they can fail. Now, it's important to say "I will start with a couple of test cases" >> Jeff: Right, right. >> "But I'm going to expand it across "from the beginning to think through that." >> So I'm just curious, your perspective, is there some departments that are the ripest for being that leading edge of the digital transformation in terms of, they've got the data, they've got the right attitude, they're just a short step away. Where have you seen the great place to succeed when you're starting on kind of a smaller PLC, I don't know if you'd say PLC, project or department level? >> So, it's funny but you will hear this, it's not rocket science. Always they say, follow the money. So, in a business, there are three incentives, making more money, saving money, or staying out of jail. (laughs) >> Those are good. I don't know if I'd put them in that order but >> Exactly, and you know what? Depending on who you are, you may have a different order but staying out of jail is pretty high on my list. >> Jeff: I'm with you on that one. >> So, what are the ambiants? Risk and compliance. Right? >> Jeff: Right, right. >> That's one of those things where you absolutely have to deliver. You absolutely have to do it. It's significantly high cost. It's very data and analytic centric and if you find a smart way to do it, you can dramatically reduce your cost. You can significantly increase your quality and you can significantly increase the volume of your insights and your reporting, thereby achieving all the risk and compliance requirements but doing it in a smarter way and a less expensive way. >> Right. >> That's where incentives have really been high. Second, in making money, it always comes down to sales and marketing and customer success. Those are the three things, sales, marketing, and customer success. 
So most of our customers who have been widely successful, are the ones who have basically been able to go and say "You know what? "It used to take us eight months "to be able to even figure out a customer list "for a particular region. "Now it takes us two days because of Paxata "and because of the data prep capabilities "and the governance aspects." That's the power that you can deliver today. And when you see one person who's a line of business person who says "Oh my God. "What used to take me eight months, "now it's done in half a day". Or "What used to take me 22 days to create a report, "is now done in 45 minutes." All of a sudden, you will not have a small kind of trickle down, you will have a tsunami of democratization with governance. That's what we've seen in our customers. >> Right, right. I love it. And this is just so classic too. I always like to joke, you know, back in the day, you would run your business based on reports from old data. Now we want to run your business with stuff you can actually take action on now. >> Exactly. I mean, this is public, Shameek Kundu, the chief data officer of Standard Chartered Bank and Michael Gorriz who's the global CIO of Standard Chartered Bank, they have embraced the notion that information democratization in the bank is a foundational element to the digital transformation of Standard Chartered. They are very forward thinking and they're looking at how do I democratize information for all our 87,500 employees while we maintain governance? And another major thing that they are looking at is they know that the data that they need to manipulate and turn into information is not sitting only on premise. >> Right, right. >> It's sitting across a multi-cloud world and that's why they've embraced the Paxata information platform to be their information fabric for a multi-cloud hybrid world. And this is where we see successes and we're seeing more and more of this, because it starts with the people. 
It starts with the line of business outcomes and then it starts with looking at it from scale. >> Alright, Prakash, well always great to catch up and enjoy really watching the success of the company grow since you launched it many moons ago in New York City. >> Yes. Fantastic. Always a pleasure to come back here. Thank you so much. >> Alright. Thank you. He's Prakash, I'm Jeff Frick. You're watching theCUBE from downtown San Francisco. Thanks for watching. (techno music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jeff | PERSON | 0.99+ |
Michael Gorriz | PERSON | 0.99+ |
Prakash Nanduri | PERSON | 0.99+ |
eight months | QUANTITY | 0.99+ |
Standard Chartered Bank | ORGANIZATION | 0.99+ |
22 days | QUANTITY | 0.99+ |
Jeff Frick | PERSON | 0.99+ |
Paxata | ORGANIZATION | 0.99+ |
Shameek Kundu | PERSON | 0.99+ |
two days | QUANTITY | 0.99+ |
New York City | LOCATION | 0.99+ |
Prakash | PERSON | 0.99+ |
Second | QUANTITY | 0.99+ |
thousands | QUANTITY | 0.99+ |
87,500 employees | QUANTITY | 0.99+ |
45 minutes | QUANTITY | 0.99+ |
PINTEC | ORGANIZATION | 0.99+ |
Standard Chartered | ORGANIZATION | 0.99+ |
2017 | DATE | 0.99+ |
10 | QUANTITY | 0.99+ |
third | QUANTITY | 0.99+ |
half a day | QUANTITY | 0.99+ |
15,000 checks | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
First | QUANTITY | 0.99+ |
Spring 2018 | DATE | 0.99+ |
first question | QUANTITY | 0.99+ |
each company | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
second thing | QUANTITY | 0.98+ |
nium | ORGANIZATION | 0.98+ |
three things | QUANTITY | 0.98+ |
10 years ago | DATE | 0.98+ |
yesterday | DATE | 0.98+ |
three simple steps | QUANTITY | 0.98+ |
two big themes | QUANTITY | 0.98+ |
first guy | QUANTITY | 0.98+ |
second | QUANTITY | 0.97+ |
three humans | QUANTITY | 0.97+ |
about 100 people | QUANTITY | 0.97+ |
two | QUANTITY | 0.97+ |
Cori | PERSON | 0.97+ |
Two things | QUANTITY | 0.96+ |
Paxata | PERSON | 0.96+ |
day one | QUANTITY | 0.96+ |
15 years | QUANTITY | 0.95+ |
three incentives | QUANTITY | 0.94+ |
today | DATE | 0.94+ |
theCUBE | ORGANIZATION | 0.94+ |
second notion | QUANTITY | 0.93+ |
first thing | QUANTITY | 0.92+ |
one person | QUANTITY | 0.92+ |
Cube | ORGANIZATION | 0.89+ |
Parc 55 Hotel | LOCATION | 0.88+ |
San Francisco | LOCATION | 0.87+ |
2013 | DATE | 0.85+ |
Corinium Chief Analytics Officer | EVENT | 0.82+ |
double- | QUANTITY | 0.8+ |
downtown San Francisco | LOCATION | 0.79+ |
Chief Analytics Officer | PERSON | 0.78+ |
Corinium Chief Analytics Officer Conference | EVENT | 0.77+ |
group | QUANTITY | 0.74+ |
one data | QUANTITY | 0.69+ |
Paxata | TITLE | 0.66+ |
many moons ago | DATE | 0.61+ |
couple | QUANTITY | 0.61+ |
theCUBE | TITLE | 0.57+ |
Spring | EVENT | 0.5+ |
Seth Dobrin, IBM | Big Data SV 2018
>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE's continuing coverage of our own event, Big Data SV. I'm Lisa Martin, with my cohost Dave Vellante. We're in downtown San Jose at this really cool place, Forager Eatery. Come by, check us out. We're here tomorrow as well. We're joined by, next, one of our CUBE alumni, Seth Dobrin, the Vice President and Chief Data Officer at IBM Analytics. Hey, Seth, welcome back to theCUBE. >> Hey, thanks for having me again. Always fun being with you guys. >> Good to see you, Seth. >> Good to see you. >> Yeah, so last time you were chatting with Dave and company was about in the fall at the Chief Data Officers Summit. What's kind of new with you in IBM Analytics since then? >> Yeah, so the Chief Data Officers Summit, I was talking with one of the data governance people from TD Bank and we spent a lot of time talking about governance. Still doing a lot with governance, especially with GDPR coming up. But really started to ramp up my team to focus on data science, machine learning. How do you do data science in the enterprise? How is it different from doing a Kaggle competition, or someone getting their PhD or Masters in Data Science? >> Just quickly, who is your team composed of in IBM Analytics? >> So IBM Analytics represents, think of it as our software umbrella, so it's everything that's not pure cloud or Watson or services. So it's all of our software franchise. >> But in terms of roles and responsibilities, data scientists, analysts. What's the mixture of-- >> Yeah. So on my team I have a small group of people that do governance, and so they're really managing our GDPR readiness inside of IBM in our business unit. And then the rest of my team is really focused on this data science space. 
And so this is set up from the perspective of we have machine-learning engineers, we have predictive-analytics engineers, we have data engineers, and we have data journalists. And that's really focused on helping IBM and other companies do data science in the enterprise. >> So what's the dynamic amongst those roles that you just mentioned? Is it really a team sport? I mean, initially it was the data science on a pedestal. Have you been able to attack that problem? >> So I know a total of two people that can do that all themselves. So I think it absolutely is a team sport. And it really takes a data engineer or someone with deep expertise in there, that also understands machine-learning, to really build out the data assets, engineer the features appropriately, provide access to the model, and ultimately to what you're going to deploy, right? Because the way you do it as a research project or an activity is different than using it in real life, right? And so you need to make sure the data pipes are there. And when I look for people, I actually look for a differentiation between machine-learning engineers and optimization. I don't even post for data scientists because then you get a lot of data scientists, right? People who aren't really data scientists, and so if you're specific and ask for machine-learning engineers or decision optimization, OR-type people, you really get a whole different crowd in. But the interplay is really important because most machine-learning use cases you want to be able to give information about what you should do next. What's the next best action? And to do that, you need decision optimization. >> So in the early days of when we, I mean, data science has been around forever, right? We always hear that. But in the, sort of, more modern use of the term, you never heard much about machine learning. It was more like stats, math, some programming, data hacking, creativity. And then now, machine learning sounds fundamental. 
Is that a new skillset that the data scientists had to learn? Did they get them from other parts of the organization? >> I mean, when we talk about math and stats, what we call machine learning today has been what we've been doing since the first statistics for years, right? I mean, a lot of the same things we apply in what we call machine learning today I did during my PhD 20 years ago, right? It was just with a different perspective. And you applied those types of, they were more static, right? So I would build a model to predict something, and it was only for that. It really didn't apply it beyond, so it was very static. Now, when we're talking about machine learning, I want to understand Dave, right? And I want to be able to predict Dave's behavior in the future, and learn how you're changing your behavior over time, right? So one of the things that a lot of people don't realize, especially senior executives, is that machine learning creates a self-fulfilling prophecy. You're going to drive a behavior so your data is going to change, right? So your model needs to change. And so that's really the difference between what you think of as stats and what we think of as machine learning today. So what we were looking for years ago is all the same we just described it a little differently. >> So how fine is the line between a statistician and a data scientist? >> I think any good statistician can really become a data scientist. There's some issues around data engineering and things like that but if it's a team sport, I think any really good, pure mathematician or statistician could certainly become a data scientist. Or machine-learning engineer. Sorry. >> I'm interested in it from a skillset standpoint. You were saying how you're advertising to bring on these roles. I was at the Women in Data Science Conference with theCUBE just a couple of days ago, and we hear so much excitement about the role of data scientists. It's so horizontal. 
People have the opportunity to make impact in policy change, healthcare, etc. So the hard skills, the soft skills, mathematician, what are some of the other elements that you would look for or that companies, enterprises that need to learn how to embrace data science, should look for? Someone that's not just a mathematician but someone that has communication skills, collaboration, empathy, what are some of those, openness, to not lead data down a certain, what do you see as the right mix there of a data scientist? >> Yeah, so I think that's a really good point, right? It's not just the hard skills. When my team goes out, because part of what we do is we go out and sit with clients and teach them our philosophy on how you should integrate data science in the enterprise. A good part of that is sitting down and understanding the use case. And working with people to tease out, how do you get to this ultimate use case because any problem worth solving is not one model, any use case is not one model, it's many models. How do you work with the people in the business to understand, okay, what's the most important thing for us to deliver first? And it's almost a negotiation, right? Talking them back. Okay, we can't solve the whole problem. We need to break it down in discrete pieces. Even when we break it down into discrete pieces, there's going to be a series of sprints to deliver that. Right? And so having these soft skills to be able to tease that in a way, and really help people understand that their way of thinking about this may or may not be right. And doing that in a way that's not offensive. And there's a lot of really smart people that can say that, but they can come across as being offensive, so those soft skills are really important. >> I'm going to talk about GDPR in the time we have remaining. We talked about in the past, the clock's ticking, in May the fines go into effect. 
The relationship between data science, machine learning, GDPR, is it going to help us solve this problem? This is a nightmare for people. And many organizations aren't ready. Your thoughts. >> Yeah, so I think there's some aspects that we've talked about before. How important it's going to be to apply machine learning to your data to get ready for GDPR. But I think there's some aspects that we haven't talked about before here, and that's around what impact does GDPR have on being able to do data science, and being able to implement data science. So one of the aspects of the GDPR is this concept of consent, right? So it really requires consent to be understandable and very explicit. And it allows people to be able to retract that consent at any time. And so what does that mean when you build a model that's trained on someone's data? If you haven't anonymized it properly, do I have to rebuild the model without their data? And then it also brings up some points around explainability. So you need to be able to explain your decision, how you used analytics, how you got to that decision, to someone if they request it. To an auditor if they request it. Traditional machine learning, that's not too much of a problem. You can look at the features and say these features, this contributed 20%, this contributed 50%. But as you get into things like deep learning, this concept of explainable or XAI becomes really, really important. And there were some talks earlier today at Strata about how you apply machine learning, traditional machine learning to interpret your deep learning or black box AI. So that's really going to be important, those two things, in terms of how they affect data science. >> Well, you mentioned the black box. I mean, do you think we'll ever resolve the black box challenge? Or is it really that people are just going to be comfortable that what happens inside the box, how you got to that decision is okay? >> So I'm inherently both cynical and optimistic. 
(chuckles) But I think there's a lot of things we looked at five years ago and we said there's no way we'll ever be able to do them that we can do today. And so while I don't know how we're going to get to be able to explain this black box as a XAI, I'm fairly confident that in five years, this won't even be a conversation anymore. >> Yeah, I kind of agree. I mean, somebody said to me the other day, well, it's really hard to explain how you know it's a dog. >> Seth: Right (chuckles). But you know it's a dog. >> But you know it's a dog. And so, we'll get over this. >> Yeah. >> I love that you just brought up dogs as we're ending. That's my favorite thing in the world, thank you. Yes, you knew that. Well, Seth, I wish we had more time, and thanks so much for stopping by theCUBE and sharing some of your insights. Look forward to the next update in the next few months from you. >> Yeah, thanks for having me. Good seeing you again. >> Pleasure. >> Nice meeting you. >> Likewise. We want to thank you for watching theCUBE live from our event Big Data SV down the street from the Strata Data Conference. I'm Lisa Martin, for Dave Vellante. Thanks for watching, stick around, we'll be right back after a short break.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Seth | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Seth Dobrin | PERSON | 0.99+ |
20% | QUANTITY | 0.99+ |
50% | QUANTITY | 0.99+ |
TD Bank | ORGANIZATION | 0.99+ |
San Jose | LOCATION | 0.99+ |
two people | QUANTITY | 0.99+ |
tomorrow | DATE | 0.99+ |
IBM Analytics | ORGANIZATION | 0.99+ |
two things | QUANTITY | 0.99+ |
SiliconANGLE Media | ORGANIZATION | 0.99+ |
one model | QUANTITY | 0.99+ |
five years | QUANTITY | 0.98+ |
20 years ago | DATE | 0.98+ |
Big Data SV | EVENT | 0.98+ |
five years ago | DATE | 0.98+ |
GDPR | TITLE | 0.98+ |
theCUBE | ORGANIZATION | 0.98+ |
one | QUANTITY | 0.98+ |
Strata Data Conference | EVENT | 0.97+ |
today | DATE | 0.97+ |
first statistics | QUANTITY | 0.95+ |
CUBE | ORGANIZATION | 0.94+ |
Women in Data Science Conference | EVENT | 0.94+ |
both | QUANTITY | 0.94+ |
Chief Data Officers Summit | EVENT | 0.93+ |
Big Data SV 2018 | EVENT | 0.93+ |
couple of days ago | DATE | 0.93+ |
years | DATE | 0.9+ |
Forager Eatery | ORGANIZATION | 0.9+ |
first | QUANTITY | 0.86+ |
Watson | TITLE | 0.86+ |
Officers Summit | EVENT | 0.74+ |
Data Officer | PERSON | 0.73+ |
SV | EVENT | 0.71+ |
President | PERSON | 0.68+ |
Strata | TITLE | 0.67+ |
Big Data | ORGANIZATION | 0.66+ |
earlier today | DATE | 0.65+ |
Silicon Valley | LOCATION | 0.64+ |
years | QUANTITY | 0.6+ |
Chief | EVENT | 0.44+ |
Kaggle | ORGANIZATION | 0.43+ |