Show Wrap | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media.
>> Welcome back. We're here to wrap up the MIT Chief Data Officer and Information Quality symposium. It's hashtag MITCDOIQ. You're watching theCUBE. I'm Dave Vellante, and Paul Gillin is my co-host. This is two days of coverage, and we're wrapping up with our analysis of what's going on here. Paul, let me kick it off. When we first started here, and we talked about this in our open, we saw the chief data officer role emerge from the back office, from the information quality role. In 2013, when we asked the CDOs we talked to what their scope was, we heard things like, oh, it's very wide: it involves analytics and data science. Some CDOs even said, oh yes, security is actually part of our purview because of all the cyber data. So, a very, very wide scope. In some cases, even some of the digital initiatives were sort of being claimed; the CDOs were staking their claim. The reality was that the CDO also emerged out of highly regulated industries: financial services, healthcare, government. And it really was this kind of wonky back-office role. And compliance, that's what it's become again. We're seeing that CDOs are largely not involved in a lot of the emerging AI initiatives. That's what we heard, sort of anecdotally, talking to various folks. At the same time, I feel as though the CDO role has been more solidified than it was before. We used to ask, is this role going to be around anymore? We had CIOs tell us that the CDO role was going to disappear, so you had both ends of the spectrum. But I feel that whatever it's called (CDO, data czar, chief analytics officer, head of data, analytics and governance), that role is here to stay, at least for a fair amount of time, and increasingly, issues of privacy and governance,
and at least the periphery of security, are going to be supported by that CDO role. So that's takeaway number one. Let me get your thoughts.
>> I think there's a maturity process going on here. What we saw really in 2016 through 2018 was sort of a celebration of the arrival of the CDO: we're here, we've got power now, we've got an agenda. And that was a natural outcome of all this growth and 90% of organizations putting CDOs in place. I think what you're seeing now is a realization that, oh my God, this is a mess. What I heard this year was a lot less of this crowing about the ascendance of CDOs and more about: we've got a big integration problem, a big data cleansing problem, and we've got to get our hands down into the nitty-gritty. And, as you said, we didn't hear so much this year about strategic initiatives, about artificial intelligence, about getting involved in digital business or customer experience transformation. What we heard this year was about cleaning up data: finding the data that you've got, organizing it, applying metadata to it, getting it in shape to do something with it. There's nothing wrong with that. I just think it's part of the natural maturation process. Organizations now have to go through the dirty process of cleaning up this data before they can get to the next stage, which is a couple or three years out for most of them.
>> The second big theme, of course, and we heard this from the former head of analytics at GSK in the opening keynote, is that the traditional methods have failed. The enterprise data warehouse, and we've actually studied this a lot: my analogy is often a snake swallowing a basketball, having to build cubes. EDW practitioners always used to call it chasing the chips. Intel would come out with a new chip:
oh, we need that, because we've got to run faster, because it's taking us hours and hours, days, weeks to run these analytics. So that really was not agile. It was a rear-view-mirror-looking thing. And Sarbanes-Oxley saved the EDW business, because reporting became part of compliance. On the master data management piece, we heard consistently, and Mike Stonebraker, who's obviously a technology visionary, was right on: it doesn't scale. This notion of de-duping everything just doesn't work, and manually creating rules is just not the right approach. We also heard that the top-down enterprise data model doesn't work. It's too complicated; you can't operationalize it. So what do they do? They kick the can to governance. Hadoop was kind of a sidecar there, big data that failed to live up to its promises. And so it's a big question as to whether or not AI will bring that level of automation. We heard from KPMG, certainly; Mike Stonebraker again said it; we heard this as well from Andy Palmer. They're using technology to automate and scale that number one data science problem, which is that they spend all their time wrangling data. We'll see if that actually lives up to its promise.
>> Something we did hear today from several of our guests was about the promise of machine learning to automate this data cleanup process. Mark Ramsay kicked off the conference saying that all of these efforts to standardize data have failed in the past. He then showed how GSK had used some of the tools that were represented here, using machine learning to actually clean up the data at GSK. So there is.
And I heard today a lot of optimism from the people we talked to, Chris, for example, about the capability of machine learning to bring some order, to solve this scale problem. Because really, organizing data and creating enterprise data models is a scale problem, and the only way you can solve that is with automation. Mike Stonebraker is right on top of that. So there was optimism at this event. There was kind of a dismay at seeing all the data problems they have to clean up, but also a promise that tools are on the way that could do that.
>> Yeah. The reason I'm an optimist about this role is because data is such a hard problem. And while there is a feeling of, wow, this is really a challenge, there are a lot of smart people here who are up for the challenge and have the DNA for it. So: the role, that whole 360 thing; we talked about the traditional methods kind of failing; and then the third piece that you touched on, which is really bringing machine intelligence to the table. We hadn't heard that as much at this event; it's now front and center. It's just another example of AI injecting itself into virtually every aspect, every corner of the industry. And again, I often joke: same wine, new bottle. Our industry has a habit of doing that. It's cyclical, but we seem to be making consistent progress.
>> And the machine learning, I thought, was interesting. Several guests spoke to machine learning being applied to the plumbing projects right now, to cleaning up data. Those are really self-contained projects. You can manage those. You can determine and test outcomes. You can vet the quality of the algorithms. It's not like you're putting machine learning out there in front of the customer, where it could potentially do some real damage. They're vetting, they're burning in machine learning in an environment that they control.
>> Right. So, anyway, two solid days here.
I think that this conference has really grown. When we first started here it was about 130 people, I think, and now it was 500 registrants this year. I think 600 is sort of the goal for next year, moving venues. theCUBE has been covering this all but one year since 2013. We hope to continue to do that. Paul, it was great working with you. Always great work. I hope we can do more together. We heard Vertica is bringing back its conference; you put that together. So we had Colin Mahony; we had the Vertica rock stars on, which was fun. Colin Mahony, Mike Stonebraker, Andy Palmer and Chris Lynch all kind of weighed in, which was great, to get their perspectives on the days of MPP and how that's evolved, improving on the traditional relational database. And now you have Stonebraker applying all this machine intelligence, and the same thing at AtScale with Chris Lynch. So it's fun to watch those guys, all Boston-based East Coast folks. Some news: we just saw the news hit of President Trump holding up the JEDI contract. As we've talked about, we've been following that story very closely, and I've got some concerns over that. I think it's largely because he doesn't like Bezos and The Washington Post. Exactly. You know, here's this "America first"; the Pentagon says they need this to be competitive with China
>> And AI.
>> There's maybe some, you know, where there's smoke there's fire there, so
>> It's more important to stick it in
>> the eye. That's what it seems like. So we're watching that story very closely. I think it's a bad move for the executive branch to be involved in those types of decisions. But what do I know? Well, anyway, Paul, awesome working with you. Guys, thanks. Appreciate you flying out, Sal. Good job, Alex, Mike. Great. All right, we're wrapping up. So thank you for watching. Go to siliconangle.com for all the news. youtube.com/siliconangle is where we house our playlists.
But thecube.net is the main site where we have all the events. It will show you what's coming up next. We've got a bunch of stuff going on straight through the summer. And then, of course, VMworld is the big kickoff for the fall season. Go to wikibon.com for all the research. We're out. Thanks for watching. This is Dave Vellante for Paul Gillin. We'll see you next time.
Bob Parr & Sreekar Krishna, KPMG US | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media.
>> Welcome back to Cambridge, Massachusetts, everybody. You're watching theCUBE, the leader in live tech coverage. We're here covering the MIT CDOIQ conference, day two, wrapping up. Bob Parr is here. He's a partner and principal at KPMG, and he's joined by Sreekar Krishna, who is the managing director of data science, AI and innovation at KPMG. Gents, welcome to theCUBE.
>> Thank you.
>> Let's start with your roles. So, Bob, where do you focus?
>> My focus: within KPMG, we've got three main business lines: audit, tax, and advisory. And so I'm the advisory chief data officer. So I'm more focused on how we use data competitively in the market, more the offense side of our focus. So, you know, how do we make sure that our teams have the data they need to deliver value? As much as possible, I work in concert with the enterprise CDO, who's more focused on our infrastructure, our standards, security, privacy and those things.
>> So you're focused on making KPMG better, as opposed to clients?
>> Exactly.
>> OK.
>> I also have a second hat: I also serve financial services CDOs as well.
>> Okay, so you've got a dual role. I gotcha. Sreekar, what's your role?
>> Yeah. You know, I focus a lot on data science, artificial intelligence and overall innovation. So, in my role, I actually represent a center of excellence within KPMG that focuses on AI, machine learning, and natural language processing. And I work with Bob's division to actually advance the data side of the story, because all the AI needs data. Without data, there are no algorithms. So we're focusing a lot on how we use AI to make data better. Think about data quality. Think about data lineage. Think about all of the problems that data has. How can we make it better using algorithms?
And I focus a lot on that, working with Bob. But no, it's customers and internal. I mean, we're a horizontal within the firm. So we help customers, we help internal, and we focus a lot on the market.
>> So, Bob, you mentioned using data offensively. So 10, 12 years ago, data was a liability. You had to get rid of it, keep it no longer than you had to, because you were going to get sued. So email archives came in, and obviously things flipped after big data. But what are you seeing in terms of that shift from the defense side of data to the offense?
>> Yeah, and let me define offense versus defense. On the defense side, historically, that's where most CDOs have played. That's risk, regulatory reporting, privacy, even litigation support, those types of activities. And really, until about a year and a half ago, we saw most CDOs still really anchored in that. I run a forum with a number of CDOs in financial services, and every year we get them together and ask them the same set of questions. This was the first year where they said, you know what, my primary focus now is growth. It's bringing efficiencies, it's trying to generate value on the offensive side. It's not like the regulatory work is going away, certainly in the face of some of the pending privacy regulation. But it's a sign that the volume of use cases, as the investments in their digital transformations are starting to kick in, as well as the volumes of data that are available (the raw material that's available to them in terms of third-party data, and just the general volumes that are streaming into the organization), and the overall literacy in the business units are creating this massive demand. And so they're having to respond.
>> Are they getting a handle on the data? Are they actually finding
where it is, categorizing it?
>> Yeah, organizing that is still a challenge. I think it's better when you have a very narrow scope of critical data elements, going back to the structured data that we're talking about with the regulatory reporting. When you start to get into the offense (generating value, getting to the customer experience, really exploring that side of it), there's a ton of new muscle that has to be built: new muscle in terms of data quality, new muscle in terms of a really more scalable operating model. I think that's a big issue right now with CDOs. We're used to that limited swath of CDEs, and they've got stewardship networks that are very labor-intensive. A lot of manual processes still. And they have some good basic technology, but a lot of it is rules-based. And when you think about how that constrained model is going to scale when you have all of this demand, when you look at the customer experience analytics that they want to do, when you look at just AI applied to things like operations, the demand on the focus there is going to start to create a fundamental shift.
>> Sreekar, one of the things that I have seen, and maybe it's just my small observation space, but I wonder if you could comment: it seems like many CDOs are not directly involved in the AI initiatives. Clearly, the chief digital officer is involved, but the CDOs are kind of in the background still. Do you see that?
>> That's a fantastic question, and I think this is where we're seeing some of the cutting-edge change that is happening in the industry. When Bob presented the idea that we can look at data offensively, this is what it is: CDOs for a long time have been more reactive in their roles. And that is starting to come to the forefront now.
So a lot of institutions we're working with are asking: what's the next-generation role of a CDO? Why are they in the background, and why are they not in the foreground? And this is when you become more offensive and proactive with data. The digital officers are obviously focused on the transformation that has to happen, but the CDOs are their backbone in order to make the transformation real. And if the CDOs start to think about their data as an asset, data as a product, data as a service, the digital officers are right there, because that's the real data they're living on. So the CDO can really go from being a back office to becoming a business line.
>> Who have you seen taking the reins in machine learning projects at the companies you work with? Who is driving that?
>> Yeah, great question. So we are seeing different kinds; I would put them in buckets, right? There is no one model that fits all. We're seeing different generations within the companies. Some of the ones who are just testing out the market are still keeping it in their technology space, in their back office, in IT or, let me call it, forward IT, where they are starting to experiment with this. But you see the mature organizations on the other end of the spectrum: they are integrating machine learning and AI right into the business line, because they want to see executives having the technology right by their side so they can leverage AI and machine learning right for the business, right there. And that is where we're seeing some of the new models come up.
>> I think the big shift from a CDO perspective is using AI to prep data for AI. That's fundamentally where, you know, where the data science was distributed.
Some of that data science has to come back in for the integration, for the data quality, for data prepping, because you've got all this data, third-party and other, from customers, streaming into the organization. And the work that you're doing around anomaly detection transcends developing the rules, doing the profiling, doing the rules. The very manual, very labor-intensive process: you've got to get away from that.
>> So in order for this to scale, you use algos and AI to figure out which algos to apply?
>> To clean, to prepare the data, to see what algorithms we can use. So it's basically what we're calling AI for data, rather than just data leading into AI. I mean, we developed a technology for one of our clients, a pretty large financial services firm. They were getting close to, like, a billion data points every day, and there was no way you could manually go through the same quality controls and all of those processes. So we automated it through algorithms, and these algorithms are learning the behavior of data as it flows into the organization, and they're able to proactively tell where problems are starting. And this is the new phase that we see in the industry: you cannot scale traditional data governance using manual processes. We have to go to the next generation, with AI and natural language processing. And think about unstructured data, right? I mean, like 90% of the organization's data is unstructured, and we have not talked about data quality, we have not talked about data governance, for a lot of these sources of information. Now is the time. AI can do it.
>> And I think that raises a great question. If you look at unstructured data, a lot of the data sources, as you start to take more of an offensive stance, will be unstructured. And what it means to apply data quality there isn't the profiling and the rules generation the way you would with standard data.
So the teams, the skills that CDOs have in their organizations, have to change. You have to start to, and, you know, it's a great example where you guys were ingesting documents and there was handwriting all over the documents, and
>> Yeah, that's a great example, Bob. We would ask the client, is this document going to be scanned into the system so my algorithm can run? And they're like, yeah, everything is good, the data is there. But when you then start scanning it, you realize there's handwriting, and the information is in the handwriting. So all the algorithms break down.
>> Tribal knowledge.
>> Exactly. So that's what we're seeing. If we talk about the digital transformation of data in the CDO organization, it is this idea that nothing is left unseen. Some algorithm or some technology has seen everything that is coming into the organization and has a [unclear], so it can tell you where the problems are. And this is what algorithms do. They scale beautifully.
>> So the data quality approaches are evolving, sort of changing. So rather than the heavy emphasis on masking or de-duplication and things like that, which you would traditionally think of, not that that goes away, it's got to evolve to use machine intelligence.
>> Exactly.
>> What kind of skill sets do people need to achieve that? Is it the same people, or do we need to retrain them or bring in new skills?
>> Yeah, great question. And I can talk from the perspective of AI, which is disrupting every industry now, as we know, right? When you look at what skills are required, all of the AI, including natural language processing and machine learning, still requires a human in the loop. And that is the training that goes in there. And who are the people who have that knowledge? It is the business analysts.
It's the data analysts who are the knowledge bearers. The C-suite and the CDOs are able to make decisions, but the day-to-day is still with the data analysts.
>> Those SMEs.
>> Those SMEs. So we have to upskill them to really start interacting with these new technologies, where they are the leaders, rather than just waiting for answers to come through. And when that happens, as a data scientist, my job is easy, because the SMEs are there. I deploy the technology; the SMEs train the algorithms on a regular basis. Then it is a fully fungible model which is evolving with the business. And no longer am I spending time re-architecting my rules and, you know, what masking capabilities I need to have. It is evolving with us.
>> Does that change the number one problem that you hear from data scientists, which is the 80% of the time spent on wrangling and cleaning data, with only 10, 15, 20% left over?
>> Do you run into SMEs being concerned that they're going to be replaced by the machine they're training?
>> I actually see them being really enabled now. Where they were spending 80% of the time doing the boring job of looking at data, now they're spending 90% of their time looking at the elements that are truly creative, which require human intelligence to say, hey, this is different because of X,
So if you look at a financial service is organization as an example today, you're gonna have, you know, really a I driving a lot of the behind the scenes on the customer experience. It's, you know, today with your credit card company. It's behind the scenes doing fraud detection. You know, that's that's going to continue. So it's take the critical functions that were more data. It makes better models that, you know, that that's just going to explode. And I think they're really you can look across all the functions, from finance to to marketing to operations. I mean, it's it's gonna be pervasive across, you know all of that. >> So if I may, I don't top award. While Bob was saying, I think what's gonna what What our clients are asking is, how can I exhilarate the decision making? Because at the end of the day on Lee, all our leaders are focused on making decisions, and all of this data science is leading up to their decision, and today you see like you know what you brought up, like 80% of the time is wasted in cleaning the data. So only 20% time was spent in riel experimentation and analytics. So your decision making time was reduced to 20% off the effort that I put in the pipeline. What if now I can make it 80% of the time? They're I put in the pipeline, better decisions are gonna come on the train. So when I go into a meeting and I'm saying like, Hey, can you show me what happened in this particular region or in this particular part of the country? Previously, it would have been like, Oh, can you come back in two weeks? I will have the data ready, and I will tell you the answer. But in two weeks, the business has ran away and the CDO know or the C Street doesn't require the same answer. But where we're headed as as the data quality improves, you can get to really time questions and decisions. >> So decision, sport, business, intelligence. Well, we're getting better. Isn't interesting to me. Six months to build a cube, we'd still still not good enough. 
Moving too fast. As the saying goes, data is plentiful; insights aren't. In your view, will machine intelligence finally close that gap and get us closer to real-time decision making?
>> It will eventually. But there's so much that our industry needs to understand first, and really ingrain. And today there are still fundamental trust issues with AI. We've done a lot of work
>> Well, black box is a part of it.
>> Part of it. I think, you know, the research we've done, and some of this is nine countries, 2,400 senior executives: we asked a lot of questions around their data and trusted analytics, and 92% of them came back saying they have some fundamental trust issues with their data and their analytics, and they feel like there's reputational risk, material reputational risk. This isn't getting one little number wrong on one of the reports.
>> It's much more of an issue.
>> You know, we also do a CEO study, and we've done this many years in a row. Going back to 2017, we started asking them: okay, a lot of companies say they're data driven, right? When it comes to
>> What they say they're doing. Well, they say they're data driven.
>> That's the point. At the end of the day, when you're making strategic decisions where you have an insight that's not intuitive, do you trust your gut or go with the analytics? Back then, 67% said they go with their gut. So okay, this is 2017. This industry's moving quickly; there's tons and tons of investment. Look at 2018: did it go down? No, it went up, to 78%. So it's not awareness of this issue; there is something fundamentally wrong, and you hit on it. Part of it is the black box, part of it is the data quality, and part of it is bias. And there are all of these things flowing around it.
And so when we dug into that, we said, well, okay, if that exists, how are we going to help organizations get their arms around this issue and start digging into that trust issue? And really, the front part is exactly what we're talking about in terms of data quality, both structured, with more traditional approaches, and unstructured, using the handwriting example and those types of techniques. But then you get into the models themselves, and the critical things you have to worry about are, first, lineage. So from an integrity perspective, where's the data coming from? What are the sources? What are the change controls on some of that? Then they need to look at explainability, again the black box part: can you tell me the inferences and decisions, and are those documented? And this is important for the SME, the human in the loop, to get confidence in the algorithm, as well as for that executive group, so they understand there's a structured set of processes around it.
>> The Moneyball problem is actually pretty confined. It's pretty straightforward. I don't know, 32 teams, throw in the minor leagues, but the data model is pretty consistent. The problem with organizations is that no data model is consistent within the organization. You mentioned risk, Bob.
>> The other problem is organizational inertia. If they don't trust it, what does a P&L manager do when he or she wants to preserve their existing position? They attack the data. You know, "I don't believe that."
>> Well, which is a fundamental point, which is culture. Yes. I mean, you can have all the data science and all the governance that you want, but if you don't work on culture in parallel with all this, it's not going to stick. And I think a lot of the leading organizations are starting to really dig into this. We hear a lot about literacy. We hear a lot about top-down support. What does that really mean?
It means, you know, senior executives are placing bets around data and demonstrably linking the data, and the role of data as an asset, into their strategies, and then messaging it out and being specific around the types of investments that are going to reinforce that business strategy. So that's absolutely critical. And then literacy is absolutely fundamental as well, because it's not just the executives and the data scientists that have to get this. It's the guy in ops that you're trying to reach. They need to understand, you know, not only the tools; it's less about the tools and more about the techniques, so that the approaches being used are more transparent, and so that they're starting to also understand the issues of privacy and data usage rights. That's also something that we can't leave at the curb with all this innovation. >> It's also believing that there's an imperative. I mean, for all the talk about digital transformation, you hear it everywhere. Everybody's trying to get digital right. But there's still a lot of complacency in the organization, in the lines of business, in operations, to say: we're actually doing really well. You know, we're in financial services; health care really hasn't been disrupted. It's, oh, it's coming, it's coming. But there's still a lot of I'll-be-retired-by-then, or hanging on. >> Actually, it's also the fact that, you know, in the previous generation, if I had to go shopping, I would go into a shop, and if I wanted to buy an insurance product, I would call my insurance agent. But today, in the new world, it's just a tap on my screen. I just have to go from Amazon to some other app. And this is really what is happening to all of our clients. Previously, they saw their customers and pocketed them in different experience buckets. It's not that anymore. It's right there in front of them.
So if you don't get into digital transformation, a customer is not going to discount you by saying, oh, you're not Amazon, so I'm not going to expect that. You're still on my phone, and you're only two taps away, so you have to become really digital. >> It's a little surprising that you said you see the next stage as being decision support rather than customer experience, because we hear that for CEOs, customer experience is top of mind right now. >> No, naturally. There are two differences, right? One is external facing, which is absolutely the customer. Internal facing, it's absolutely the decision making, because that's how they're separating the internal versus the external. And, you know, in most of the meetings that we go to, customer insight is the first place where analytics is starting, where data is being cleaned up. The questions being asked are: can I master my customer records? Can I do a good master of my vendor list? That is where they start. But all of that leads to good decision making to support the customers. So it's like that external-towards-internal view. >> Well, back to the offense versus defense and the shift: I mean, it absolutely is on the offense side. So it is with the customer, and that's more directly tied to the business strategy. So that's the area that's getting the money and the support, and people feel like they're making an impact with it there. When it's down here in some admin area, it's below the water line, and, you know, even though it's important and it flows up here, it doesn't get the visibility. >> Great conversation. Thanks for coming on. We've got to leave it there. Thank you for watching. We'll be right back with our next guest. Dave Vellante and Paul Gillin from MIT CDOIQ. You're watching theCUBE.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Barbara | PERSON | 0.99+ |
KPMG | ORGANIZATION | 0.99+ |
Bob | PERSON | 0.99+ |
20% | QUANTITY | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
90% | QUANTITY | 0.99+ |
80% | QUANTITY | 0.99+ |
Bob Parr | PERSON | 0.99+ |
2017 | DATE | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
Dave Lot | PERSON | 0.99+ |
2018 | DATE | 0.99+ |
67% | QUANTITY | 0.99+ |
nine countries | QUANTITY | 0.99+ |
92% | QUANTITY | 0.99+ |
Cambridge, Massachusetts | LOCATION | 0.99+ |
2400 senior executives | QUANTITY | 0.99+ |
Six months | QUANTITY | 0.99+ |
three offense | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
Paul Gillen | PERSON | 0.99+ |
Lee | PERSON | 0.99+ |
today | DATE | 0.99+ |
78% | QUANTITY | 0.99+ |
Sreekar Krishna | PERSON | 0.99+ |
two types | QUANTITY | 0.99+ |
One | QUANTITY | 0.98+ |
32 teams | QUANTITY | 0.98+ |
second hat | QUANTITY | 0.98+ |
Three years | QUANTITY | 0.98+ |
two differences | QUANTITY | 0.98+ |
10 | DATE | 0.98+ |
both | QUANTITY | 0.97+ |
two | QUANTITY | 0.97+ |
two weeks | QUANTITY | 0.97+ |
this week | DATE | 0.96+ |
one | QUANTITY | 0.95+ |
M I t CDO | EVENT | 0.95+ |
C Street | ORGANIZATION | 0.93+ |
M I t CEO Day | EVENT | 0.93+ |
Streetcar Krishna | PERSON | 0.92+ |
about a year and | DATE | 0.91+ |
2019 | DATE | 0.9+ |
Cuban | OTHER | 0.9+ |
CBO | ORGANIZATION | 0.88+ |
first year | QUANTITY | 0.88+ |
Si Dios | ORGANIZATION | 0.87+ |
12 years ago | DATE | 0.86+ |
10 | QUANTITY | 0.84+ |
Risk | PERSON | 0.81+ |
1,000,000,000 data points | QUANTITY | 0.8+ |
CDO | TITLE | 0.8+ |
Parr | PERSON | 0.79+ |
Cube | ORGANIZATION | 0.79+ |
1/2 ago | DATE | 0.78+ |
CDO | ORGANIZATION | 0.78+ |
tons and | QUANTITY | 0.76+ |
dual | QUANTITY | 0.72+ |
15 | QUANTITY | 0.71+ |
Dono | ORGANIZATION | 0.7+ |
one little number | QUANTITY | 0.69+ |
MIT | ORGANIZATION | 0.67+ |
three | QUANTITY | 0.64+ |
500 | OTHER | 0.63+ |
box | TITLE | 0.61+ |
M I T. | EVENT | 0.6+ |
Cube Bob | ORGANIZATION | 0.59+ |
Andy Palmer, TAMR | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to MIT, everybody. You're watching theCUBE, the leader in live tech coverage. We're here at day two of the MIT Chief Data Officer and Information Quality conference. I'm Dave Vellante with Paul Gillin. Andy Palmer is here. He's the co-founder and CEO of Tamr. Good to see you again. >> It's great to see you. >> So I didn't ask this of Mike. I could kind of infer it from some of his answers. But why did you guys start Tamr? >> Well, it really started with an academic project that Mike was doing over at MIT, and I was over at Novartis at the time as the chief data officer there. And what we really found was that there were a lot of companies really suffering from data mastering as the primary bottleneck in their company. They'd used great new tech like the Vertica system that we'd built and, you know, automated a lot of their warehousing and such. But the real bottleneck was getting lots of data integrated and mastered really, really quickly. >> Yeah, he took us through the problems with, obviously, the EDW in terms of scaling, and master data management and its scaling problems. Was that really the problem that you were trying to solve? >> Yeah, it really was. And when we started, I mean, it was like seven years ago, eight years ago now that we started the company, and maybe almost 10 when we started working on the academic project. At that time, people weren't really thinking or worried about that. They were still kind of digesting big data, as it was called. But I think what Mike and I felt was going on was that people were going to get over the big data, the volume of data, and were going to start worrying about the variety of the data and how to make the data cleaner and more organized. And I think we called that one pretty much right.
Maybe we were a little bit early, but I think now variety is the big problem. >> The other thing about big data: it was oftentimes associated with Hadoop, which was batch, and then you sort of saw the shift to real time, and Spark was going to fix all that. So what are you seeing in terms of the trends in how data is being used to drive almost near-real-time business decisions? >> You know, Mike and I came out really specifically back in 2007 and declared that we thought Hadoop and HDFS were going to be far less impactful than other people thought. >> '07? >> Yeah. And Mike actually was really aggressive in saying it was going to be a disaster. And I think we've finally seen that play out now that the bloom is off the rose, so to speak. And so there are these fundamental things that big companies struggle with in terms of their data: cleaning it up, organizing it, and making it what they want. Anybody that's worked at one of these big companies can tell you that the data they get from most of their internal systems sucks, plain and simple. And so cleaning up that data, turning it into something that's an asset rather than a liability, is really what Tamr is all about. It's kind of our mission. We're out there to do this, and it sort of pales in comparison when you think about the amount of money that some of these companies have spent on systems like SAP. And you're like, yeah, but all the data inside of the systems is so bad and so ugly and unuseful. We're going to fix that problem. >> So your special sauce is machine learning. Where are you applying machine learning most effectively? >> We apply machine learning to probably the least sexy problem on the planet. There are a lot of companies out there that use machine learning and AI to do predictive algorithms and all kinds of cool stuff.
All we do with machine learning is use it to clean up data and organize data, to get it ready for people to use AI. I started in the AI industry back in the late 1980s, and, you know, I really learned from this guy, Marvin Minsky, and Marvin taught me two things. The first was garbage in, garbage out: there's no algorithm that's worth anything unless you've got great data. And the second one is that it's always about the human and the machine working together. I've really been working on those same two principles most of my career, and Tamr really brings both of those together. Our goal is to prepare data so that it can be used analytically inside of these companies, so that it's actually high quality and useful. And the way we do that involves bringing together the machine, mostly these advanced machine learning algorithms, with humans, subject matter experts inside of these companies who actually know all the ins and outs and all the intricacies of the data inside their company. >> So you say garbage in, garbage out. If you don't have good training data, of course you're not going to get a good ML model. How much upfront work is required? GE, I know, was one of your customers. How much time is required to put together an ML model that can deal with 20,000,000 records like that? >> Well, you know, the amazing thing that's happened for us in the last five years especially is that now we've built enough models from scratch inside of these large Global 2000 companies that very rarely do we go into a place where we don't already have a model that's pre-built, that they can use as a starting point. And I think that's the same thing that's happening in modeling in general. If you look at great companies like DataRobot, and even in the Python community, MLlib, the accessibility of these modeling tools and the models themselves is such that they're actually commoditized.
And so for most of our models and most of the projects we work on, we've already got a model that's a starting point. We don't really have to start from scratch. >> You mentioned getting into AI in the eighties. Is the notion of AI the same as it was in the eighties, and now we've just got the tooling, the horsepower, the data to take advantage of it? Or has the concept changed? >> The math is all the same, absolutely, full stop. There's really no new math. The two things I think that have changed: first, there's a lot more data that's available now. Neural nets are a great example, right, one of Marvin's things. When you look at Google Translate and how aggressively they used neural nets, it was the quantity of data that was available that actually made neural nets work. The second thing that's changed is the cheap availability of compute. Now the largest supercomputer in the world is available to rent by the minute. So we've got all this data, and you've got all this really cheap compute. And then the third thing is what you alluded to earlier: the accessibility of all the math. It's becoming so simple and easy to apply these math techniques that it's almost to the point where the average data scientist, not the advanced one, the average data scientist, can practice AI techniques that 20 years ago required five PhDs. >> It's not surprising that Google, with its neural net technology and all the search data that it has, has been so successful. Does it surprise you that Amazon, with Alexa, was able to compete so effectively? >> Oh, I would never underestimate Amazon and their ability to, you know, build great tech. They've done some amazing work.
One of my favorites, actually one of Mike's and my favorite examples: in the last three years, they took their Redshift system, you know, that competed with Vertica, and they reimplemented it as a compiled system, and it really runs incredibly fast. I mean, that feat of engineering was truly exceptional. >> It's interesting to hear you say that, because wasn't Redshift originally ParAccel? >> Yeah, that's right. >> Larry Ellison craps all over Redshift because it's just open source software that they took and repackaged. But you're saying they did some major engineering to it. >> Oh my gosh, yeah. Mike and I both, you know, we always compared ParAccel to Vertica, and we always knew we were better in a whole bunch of ways. But this latest rewrite that they've done, this compiled version, it's really good. >> So as a guy who has been doing AI for 30 years now and is really seeing it come into its own: a lot of AI projects right now seem to be sort of low-hanging fruit, small-scale stuff. Where do you see AI in five years? What kind of projects are companies going to be undertaking, and what kind of new applications are going to come out of this? >> I think we're at the very beginning of this cycle, and actually there's a lot more potential than has been realized. So I think we are in the pick-the-low-hanging-fruit kind of phase. But some of the potential applications of AI are so much more impactful, especially as we modernize core infrastructure in the enterprise. The enterprise is sort of living with this huge legacy burden. And at Tamr we're always encouraging our customers to think of all their existing legacy systems as just data-generating machines, and the faster they can get that data into a state where they can start doing state-of-the-art AI work on top of it, the better.
And so really, you've got to put the legacy burden aside and kind of draw this line in the sand, so that as companies build their muscles on the AI side, they can take advantage of that with all the data that they're generating every single day. >> Think about these data repositories: the enterprise data warehouse; you guys built better data warehouses with MPP technology; the master data management stuff; the top-down enterprise data models; Hadoop and big data. None of them really lived up to their promise, you know? It's somewhat unfair to the MPP guys, because you said, hey, we're just going to run faster, and you did. You didn't say you were going to change the world and all that stuff, whereas EDW did. Do you feel like this next wave is actually going to live up to the promise? >> I think the next phase is very logical. I know you're talking to Chris Lynch here in a minute, and what they're doing at AtScale. AtScale and Tamr, these companies are all in the same general area, which is related to how you take all this data, prepare it, and turn it into something that's consumable really quickly and easily for all of these new data consumers in the enterprise. So that's the next logical phase in this process. Now, will this phase be the one that finally meets the high expectations that were set 20 or 30 years ago with enterprise data warehousing? I don't know, but we're certainly getting closer. >> I kind of hope not, because then we'll have less to do. Any other cool stuff that you see out there in technology? >> I'm fanatical right now about health care. I think that the opportunity for health care to be transformed with technology, you know, almost makes everything else look like chump change. >> What aspect of health care?
>> Well, I think the most obvious thing is that now, with the consumer sort of in the driver's seat in health care, technology companies that come in and provide consumer-driven solutions that meet the needs of patients, regardless of how dysfunctional the health care system is, that's killer stuff. We had a great company here in Boston called PillPack that was a great example of that, where they just built something better for consumers, and it was so popular and so broadly adopted that eventually Amazon bought it for $1 billion. But those kinds of things in health care, PillPack is just the beginning. There are lots and lots of those kinds of opportunities. >> Well, you're right. Health care is ripe for disruption, and it hasn't been hit with the digital disruption, and neither has financial services, really. Certainly defense has not yet either. They're high-risk industries, so it absolutely takes longer. Well, Andy, thanks so much for making the time. I know you've got to run. >> Yeah. Thank you. >> All right, keep it right there, everybody. We'll be back with our next guest right after this short break. You're watching theCUBE from MIT CDOIQ. We'll be right back.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Mike | PERSON | 0.99+ |
Andy | PERSON | 0.99+ |
Andy Palmer | PERSON | 0.99+ |
Mark Marvin | PERSON | 0.99+ |
2007 | DATE | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Paul Dillon | PERSON | 0.99+ |
Boston | LOCATION | 0.99+ |
$1,000,000,000 | QUANTITY | 0.99+ |
Chris Lynch | PERSON | 0.99+ |
Marvin Minsky | PERSON | 0.99+ |
Larry Ellison | PERSON | 0.99+ |
First | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
30 years | QUANTITY | 0.99+ |
ORGANIZATION | 0.99+ | |
Cambridge, Massachusetts | LOCATION | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
second thing | QUANTITY | 0.99+ |
third thing | QUANTITY | 0.99+ |
20,000,000 records | QUANTITY | 0.99+ |
two same principles | QUANTITY | 0.99+ |
seven years ago | DATE | 0.99+ |
eight years ago | DATE | 0.99+ |
Mike Mike | PERSON | 0.98+ |
three years | QUANTITY | 0.98+ |
late 19 eighties | DATE | 0.98+ |
first | QUANTITY | 0.98+ |
five years | QUANTITY | 0.98+ |
2030 years ago | DATE | 0.98+ |
2nd 1 | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
One | QUANTITY | 0.98+ |
two things | QUANTITY | 0.97+ |
five PhDs | QUANTITY | 0.97+ |
Day two | QUANTITY | 0.97+ |
Veronica | PERSON | 0.97+ |
M I. T. | PERSON | 0.96+ |
Marvin | PERSON | 0.96+ |
20 years ago | DATE | 0.96+ |
Python | TITLE | 0.96+ |
eighties | DATE | 0.94+ |
2019 | DATE | 0.94+ |
2000 companies | QUANTITY | 0.94+ |
Red Shift | TITLE | 0.94+ |
Duke | ORGANIZATION | 0.93+ |
Alexa | TITLE | 0.91+ |
last five years | DATE | 0.9+ |
M I t | EVENT | 0.88+ |
almost 10 | QUANTITY | 0.87+ |
TAMR | PERSON | 0.86+ |
Andi | PERSON | 0.8+ |
M. I. T. | ORGANIZATION | 0.79+ |
Tamer | ORGANIZATION | 0.78+ |
Information Quality Symposium | EVENT | 0.78+ |
Quality Conference Day Volonte | EVENT | 0.77+ |
Tamer | PERSON | 0.77+ |
Google translate | TITLE | 0.75+ |
single day | QUANTITY | 0.71+ |
H | PERSON | 0.71+ |
Chief | PERSON | 0.66+ |
Hadoop | PERSON | 0.64+ |
MIT | ORGANIZATION | 0.63+ |
Cube | ORGANIZATION | 0.61+ |
more | QUANTITY | 0.6+ |
M. I. T. | PERSON | 0.57+ |
Pill pack | COMMERCIAL_ITEM | 0.56+ |
Pill Pack | ORGANIZATION | 0.53+ |
D f s | ORGANIZATION | 0.48+ |
Park | TITLE | 0.44+ |
CDOIQ | EVENT | 0.32+ |
Cube | PERSON | 0.27+ |
Michael Stonebraker, TAMR | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts, everybody. You're watching theCUBE, the leader in live tech coverage, and we're covering the MIT CDOIQ conference. My name is Dave Vellante, and I'm here with my co-host, Paul Gillin. Mike Stonebraker is here. The legend is founder and CTO of Tamr, as well as of many other companies, and an inventor. Michael, thanks for coming back on theCUBE. Good to see you again. >> Nice to be here. >> So this is kind of a repeat pattern for all of us. We gather here in August at the CDO conference, and you're always the highlight of the show. You gave a talk this week on the top 10 big data mistakes. You're one of the few people who still use the term big data. I happen to like it. Sad that it's out of vogue already; people associate it with Hadoop, which is kind of waning. But regardless, welcome. How'd the talk go? What were you talking about? >> So I talk to a lot of people who are doing analytics or doing operational data at scale, and most of them make a collection of bad mistakes. And so the talk was a litany of the blunders that I've seen people make, and the audience could relate to the blunders. Most of the enterprises represented make a bunch of the blunders. So I think the number one blunder is not planning on moving most everything to the cloud. >> So that's interesting, because a lot of people would love to debate that. I would imagine you probably could have given this talk 10 years ago and a lot of the blunders would be the same, but that's one that wouldn't have been there. But I tend to agree. I was one of the two hands that went up in this morning's talk when he asked, is the cloud cheaper? For us, it is anyway. But so what?
Why should everybody move everything to the cloud? Aren't there laws of physics, laws of economics, laws of the land that suggest maybe you shouldn't? >> Well, I guess two things and then a comment. The first thing is James Hamilton, who's a techie's techie, works for Amazon. >> We know James. >> So he claims that he can stand up a server for 25% of your cost. I have no reason to disbelieve him. That number has been pretty constant for a few years, so his cost is one quarter of your cost. Sooner or later, prices are going to reflect costs, as there's a race to the bottom on cloud servers. >> So can I just stop you there for a second? Because there's some other data on that. All you have to do is look at AWS's operating margin and you'll see how profitable they are. They have software-like economics, and they're deploying servers. Sorry to interrupt, but carry on. >> So anyway, sooner or later, they're going to be wildly cheaper than you are. The second vignette is from Dave DeWitt, who's a database wizard. Here's the current technology that Microsoft Azure is using as of 18 months ago: it's shipping containers in parking lots; chilled water in, power in, Internet in; otherwise sealed; roof and walls optional. So if you're doing raised flooring in Cambridge versus me doing shipping containers in the Columbia River Valley, who's going to be a lot cheaper? And so, you know, the economies of scale: the big cloud guys are building data centers as fast as they can, using the cheapest technology around. You put up a data center every 10 years, and you do it on raised flooring in Cambridge. So sooner or later, the cloud guys are going to be a lot cheaper. And the only thing that will change that equation is, for example, my lab is up the street in the Frank Gehry building, and we have an IT department who runs servers in Cambridge, and they claim they're cheaper than the cloud.
And they don't pay rent for square footage and they don't pay for electricity. So yeah, think externalities. If there are no externalities, the cloud is assuredly going to be cheaper. And then the other thing is that most everybody that I talk to, including me, has very skewed resource demands. So in the cloud, I'm fine with three servers, except for the last day of the month. On the last day of the month, I need 20 servers. I just do it. If I'm doing it on prem, I've got to provision for peak load, and so again, I'm just way more expensive. So I think sooner or later this combination of effects is going to send everybody to the cloud for most everything. >> And my point about the operating margins is the difference between price and cost. I think James Hamilton's right on. If you look at the actual cost of deploying, it's even lower than the price the market allows them to charge. They're growing at 40-plus percent a year, and they're a $35-40 billion run-rate company. >> Sooner or later, it's going to be a race to the bottom. >> And the only guys who are going to win are the guys who have the best cost structure. A couple of other highlights from your talk? >> Sure. I think the second thing I'd like to stress is that machine learning is going to be a game changer for essentially everybody. Not only is it going to be autonomous vehicles; it's going to be automatic checkout; it's going to be drone delivery of most everything. And it's going to affect essentially everybody. I'm going to sort of say categorically: any job that is easy to understand is going to get automated. And I think that's going to be majorly impactful to most everybody. So if you're in an enterprise, you have two choices: you can be a disruptor or you can be a disruptee. You can either be a taxi company or you can be Uber, and it's going to be AI and machine learning that determines which side of that equation you're on.
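The skewed-demand point in this answer (three servers for most of the month, 20 on the last day) can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a 30-day month and the same cost per server-day in both settings; the function names are illustrative and do not come from the interview.

```python
def on_prem_server_days(peak_servers: int, days_in_month: int) -> int:
    """Server-days paid for when capacity is provisioned for peak load all month."""
    return peak_servers * days_in_month


def elastic_server_days(baseline_servers: int, peak_servers: int,
                        peak_days: int, days_in_month: int) -> int:
    """Server-days paid for when capacity follows demand day by day."""
    normal_days = days_in_month - peak_days
    return baseline_servers * normal_days + peak_servers * peak_days


# Figures from the interview: 3 servers normally, 20 on the last day of the month.
# Assumption (not from the interview): a 30-day month.
on_prem = on_prem_server_days(peak_servers=20, days_in_month=30)
elastic = elastic_server_days(baseline_servers=3, peak_servers=20,
                              peak_days=1, days_in_month=30)

print(f"on-prem: {on_prem} server-days")        # 600
print(f"elastic: {elastic} server-days")        # 107
print(f"ratio:   {on_prem / elastic:.1f}x")     # 5.6x
```

Under these assumptions the peak-provisioned setup pays for roughly 5.6 times as many server-days, before any difference in per-server cost is taken into account.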
So that's a big blunder that I see: people not taking ML incredibly seriously. >> Do you see that? In fact, everyone I talk to seems to be bought in that, you know, we've got to get on the bandwagon. >> Yeah, I'm just pointing out the obvious. But one that's not quite so obvious is that a lot of people I talk to say, I'm on top of data science. I've hired a group of 10 data scientists, and they're doing great. And one vignette that's kind of fun is I talked to a data scientist from iRobot, which is the guys that have the vacuum cleaner that runs around your living room. She said, I spend 90% of my time locating the data I want to analyze, getting my hands on it and cleaning it, leaving 10% to do the data science job for which I was hired. Of the 10%, I spend 90% fixing the data cleaning errors in my data so that my models work. So she spends 99% of her time on what you'd call data preparation and 1% of her time doing the job for which she was hired. So data science is not about data science. It's about data integration, data cleaning, data discovery. >> And that's your latest venture. >> So Tamr does that sort of stuff. But that's the real data science problem, and a lot of people don't realize that yet. And, you know, they will. >> I want to ask you, because you've been involved, by my count, in starting up at least a dozen companies. >> Nine. >> Nine? Okay, it's a lot. I overstated it. I estimated high. How do you decide what challenge to move on to? Because you're really not solving the same problems. You're moving on to new problems. How do you decide what's the next thing that interests you enough to actually start a company? >> Okay, that's really easy. You know, I'm on the faculty at MIT. My job is to think of new stuff and investigate it, and I come up...
I'm paid to come up with new ideas, some of which have commercial value and some of which don't, and the ones that have commercial value I commercialize. So it's whatever I'm doing at the time, and that's why all the things I've commercialized are different. >> So going back to Tamr: it's a data integration platform, and a lot of companies out there claim to do data integration right now. What did you see? What was the deficit in the market that you could address? >> OK, great question. So the traditional data integration is extract, transform, and load (ETL) systems and so-called master data management (MDM) systems, brought to you by IBM, Informatica, Talend, that class of folks. The dirty little secret is that that technology does not scale, in the following sense. ETL doesn't scale for a different reason than MDM. ETL doesn't scale because ETL is based on the premise that somebody really smart comes up with a global data model for all the data sources you want to put together. You then send a human out to interview each business unit to figure out exactly what data they've got, and then how to transform it into the global data model and how to load it into your data warehouse. That's very human intensive, and it doesn't scale because it's so human intensive. The average data warehouse operator I talk to says they integrate fewer than 10 data sources. Some people say 20. If you twist my arm hard, I'll give you 50. So here's a real-world problem, which is Toyota Motor Europe. Right now they have a distributor in Spain and another distributor in France. They have country-by-country distribution, sometimes canton-by-canton distribution. So if you buy a Toyota in Spain and move to France, Toyota develops amnesia. The French guys know nothing about you.
So they've got 250 separate customer databases with 40 million total records in 50 languages, and they're in the process of integrating those into a single customer database so that they can do the customer service we expect when you cross an EU boundary. I've never seen an ETL system capable of dealing with that kind of scale. ETL doesn't scale to this level of problem. >> So how do you solve that problem? >> I'll tell you; they're a Tamr customer, and I'll tell you all about it. Let me first tell you why MDM doesn't scale. >> OK, great. >> So ETL says, I now have all your data in one place in the same format, but now you've got the following problems. You've got to deduplicate it, because if I bought a Toyota in Spain and I bought another Toyota in France, I'm in both databases. So if you want to avoid double counting customers, you've got to dedupe, you know, 30 million records. And MDM says, OK, you write some rules. It's a rule-based technology. So you write a rule. My favorite example of a rule: I don't know if you guys like downhill skiing, but I love downhill skiing. Ski areas are in all kinds of public databases. Assemble those all together. Now you've got to figure out which ones are the same ski area, and they're called different names, at different addresses, and so forth. However, if the vertical drop from bottom to top is the same, chances are they're the same ski area. So that's a rule that says how to put data together in clusters. And so I now have a cluster for Mount Sunapee, and I have a problem, which is that one address says one thing and another address says something else. Which one is right, or are both right? So now you have what's called the golden record problem: to decide which data elements, among a variety that may all be associated with the same entity, are in fact correct.
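The rule-based approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any MDM product works: the records, the field names, and the "most common value wins" policy for the golden record are all made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical ski-area records gathered from different public databases.
# Names, addresses, and vertical drops are invented for illustration.
RECORDS = [
    {"name": "Pine Peak", "address": "1 Summit Rd", "vertical_drop_ft": 1500},
    {"name": "Pine Peak Resort", "address": "1 Summit Road", "vertical_drop_ft": 1500},
    {"name": "Pine Peak", "address": "1 Summit Rd", "vertical_drop_ft": 1500},
    {"name": "Birch Hill", "address": "9 Valley Ln", "vertical_drop_ft": 800},
]


def cluster_by_vertical_drop(records):
    """Apply one hand-written rule: same vertical drop means same ski area."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record["vertical_drop_ft"]].append(record)
    return list(clusters.values())


def golden_record(cluster):
    """Resolve conflicts field by field; the most common value wins
    (an assumed policy, with ties going to the value seen first)."""
    fields = cluster[0].keys()
    return {
        field: Counter(r[field] for r in cluster).most_common(1)[0][0]
        for field in fields
    }


clusters = cluster_by_vertical_drop(RECORDS)
golden = [golden_record(c) for c in clusters]
```

A single rule like this is easy to read, which is the appeal. The scaling concern raised in the interview is that real data needs a very large number of such rules, each written and maintained by hand.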
So again, MDM is a rule-based technology, and rule systems don't scale. The best example I can give you for why rule systems don't scale is that Tamr has another customer, General Electric; you've probably heard of them. GE wanted to do spend analytics, and they had 20 million spend transactions the year before last. A spend transaction is: I paid $12 to take a cab from here to the airport, and I charged it to cost center XYZ. Twenty million of those. GE has a pre-built classification system for spend, so they have parts, and underneath parts are computers, and underneath computers is memory, and so forth. So there are pre-existing classifications for spend, and they want to simply classify 20 million spend transactions into this pre-existing hierarchy. The traditional technology is, well, let's write some rules. So GE wrote 500 rules, which is about the most any single human can get their arms around, and that classified 2 million of the 20 million transactions. You've now got 18 million to go, and another 500 rules is not going to give you 2 million more. It's going to give you less: diminishing returns, right? So you'd have to write a huge number of rules that no one can possibly understand. So the technology simply doesn't scale. In the case of GE, they had Tamr help them solve this classification problem. Tamr used their 2 million rule-tagged records as training data; an ML model then works off the training data and classifies the remaining 18 million. So the answer is machine learning. If you don't use machine learning, you're absolutely toast. So the answer to "MDM doesn't scale" is that you've got to use ML. The answer to "ETL doesn't scale", when you're putting together disparate records, is ML. So you've got to replace humans with machine learning.
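The idea of treating rule-tagged records as training data can be sketched as follows. This is a deliberately tiny bag-of-words vote counter, not Tamr's model or GE's data; the transactions, category names, and the `train` and `classify` helpers are hypothetical and exist only to show the shape of the workflow.

```python
from collections import Counter, defaultdict

# Hypothetical transactions that hand-written rules already classified.
RULE_TAGGED = [
    ("taxi to airport", "travel"),
    ("cab fare downtown", "travel"),
    ("laptop memory upgrade", "parts/computers/memory"),
    ("server memory module", "parts/computers/memory"),
]


def train(labeled):
    """Count how often each token appears under each category."""
    token_counts = defaultdict(Counter)
    for text, label in labeled:
        for token in text.lower().split():
            token_counts[token][label] += 1
    return token_counts


def classify(model, text, default="unclassified"):
    """Let every known token vote for the categories it was seen with."""
    votes = Counter()
    for token in text.lower().split():
        votes.update(model.get(token, {}))
    return votes.most_common(1)[0][0] if votes else default


model = train(RULE_TAGGED)
predictions = {
    text: classify(model, text)
    for text in ["cab to airport", "memory for laptop", "office plants"]
}
```

The point of the sketch is the division of labor: the rules are written once to produce labels, and the learned model, rather than another 500 rules, handles the transactions the rules never covered.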
And that seems, at least at this conference, to be resonating, which is that people are understanding that at scale, traditional data integration technologies just don't work. >> Well, and you got a great shout-out yesterday from the former GSK guy, Mark Ramsay, on how they solved their problem. He basically laid it out: EDW didn't work and MDM didn't work. Top-down data modelling didn't work; you kick the can. Kicking the can to governance is not going to solve the problem. But Tamr did, along with some other tooling, obviously. >> Well, the other thing is that there's no one technology. There's no silver bullet here. It's going to be a bunch of technologies working together. Mark Ramsay is a great example. He used StreamSets and a bunch of other startup technology operating together, and the traditional guys... >> OK, we're good. I have a question; I want to be sure we have time. >> So the traditional vendors, by and large, are 10 years behind the times, and if you want cutting-edge stuff, you've got to go to startups. >> I want to jump to a different topic. I know that in the past you were a critic of the NoSQL movement, and NoSQL isn't going away. It seems to actually be gaining steam right now. What are the flaws in NoSQL? Has your opinion changed at all? >> No. So NoSQL originally meant "no SQL": don't use it. Then the marketing message changed to "not only SQL": SQL is fine, but NoSQL does other things. >> Now it's all SQL, right? >> And my point of view now is that NoSQL means "not yet SQL", because high-level data languages are good. Mongo is inventing one, Cassandra is inventing one, and those, if you squint, look like SQL. And so I think the answer is that the NoSQL guys are drifting towards SQL. Meanwhile, JSON, that's a great idea.
If you've got irregular data, the SQL guys are saying, sure, let's have JSON as a data type. I think the only place where there's a fair amount of argument is schema later versus schema first, and I pretty much think schema later is a bad idea, because schema later really means you're creating a data swamp. >> Exactly. You have to fix it eventually. >> Say you've got a field called salary, so you're storing employees and salaries. Paul's salary is recorded as dollars per month. Dave's salary is in euros per week, with a lunch allowance. So if you don't deal with irregularities up front, on data that you care about, you're going to create a mess. >> No schema on write was convenient for storing a lot of data cheaply at scale. But then what? It was hard to get value out of what got created. >> So I'm not opposed to schema later, as long as you realize that you're kicking the can down the road and you're just going to give your successor a big mess. >> Yeah, right. Michael, we've got to jump, but thank you so much. We sure appreciate it. All right, keep it right there, everybody. We'll be back with our next guest right after this short break. You're watching the Cube from MIT CDOIQ. We'll be right back.
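The salary example from the schema-first discussion above can be made concrete with a minimal sketch. It assumes a hypothetical `Salary` type that forces every record to declare its currency and pay period when it is written; the amounts are invented, and no currency conversion is attempted.

```python
from dataclasses import dataclass

# Pay periods the schema accepts, with their count per year (assumed values).
PERIODS_PER_YEAR = {"week": 52, "month": 12, "year": 1}


@dataclass(frozen=True)
class Salary:
    """Schema-first record: units are part of the data, checked on write."""
    amount: float
    currency: str  # for example "USD" or "EUR"
    period: str    # must be a key of PERIODS_PER_YEAR

    def __post_init__(self):
        if self.period not in PERIODS_PER_YEAR:
            raise ValueError(f"unknown pay period: {self.period!r}")
        if self.amount < 0:
            raise ValueError("salary amount must be non-negative")

    def annual(self):
        """Annualized amount, still in the record's own currency."""
        return self.amount * PERIODS_PER_YEAR[self.period]


paul = Salary(amount=5000, currency="USD", period="month")
dave = Salary(amount=1000, currency="EUR", period="week")
```

With the units declared up front, a dollars-per-month figure and a euros-per-week figure can at least be annualized consistently, and a record with an unrecognized period is rejected when it is created instead of surfacing later as a cleanup problem.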
ENTITIES
Entity | Category | Confidence |
---|---|---|
Michael | PERSON | 0.99+ |
James | PERSON | 0.99+ |
Mark Ramsay | PERSON | 0.99+ |
James Hamilton | PERSON | 0.99+ |
Paul Galen | PERSON | 0.99+ |
Dave DeWitt | PERSON | 0.99+ |
Toyota | ORGANIZATION | 0.99+ |
David Monty | PERSON | 0.99+ |
General Electric | ORGANIZATION | 0.99+ |
2,000,000 | QUANTITY | 0.99+ |
France | LOCATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
20,000,000 | QUANTITY | 0.99+ |
10% | QUANTITY | 0.99+ |
Michael Stonebraker | PERSON | 0.99+ |
Cambridge | LOCATION | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
50 | QUANTITY | 0.99+ |
$12 | QUANTITY | 0.99+ |
Spain | LOCATION | 0.99+ |
18,000,000 | QUANTITY | 0.99+ |
25% | QUANTITY | 0.99+ |
20 servers | QUANTITY | 0.99+ |
90% | QUANTITY | 0.99+ |
Columbia River Valley | LOCATION | 0.99+ |
99% | QUANTITY | 0.99+ |
18 | QUANTITY | 0.99+ |
Aaron | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
August | DATE | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
three servers | QUANTITY | 0.99+ |
35 $40,000,000,000 | QUANTITY | 0.99+ |
50 languages | QUANTITY | 0.99+ |
500 rules | QUANTITY | 0.99+ |
22 things | QUANTITY | 0.99+ |
10 data scientists | QUANTITY | 0.99+ |
Mike Stone | PERSON | 0.99+ |
Cambridge, Massachusetts | LOCATION | 0.99+ |
MGM | ORGANIZATION | 0.99+ |
less than 10 data sources | QUANTITY | 0.99+ |
Ian | PERSON | 0.99+ |
Paul | PERSON | 0.99+ |
1% | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
Toyota Motor Europe | ORGANIZATION | 0.99+ |
Of Tamer | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
single | QUANTITY | 0.99+ |
Attica | ORGANIZATION | 0.99+ |
10 years ago | DATE | 0.99+ |
yesterday | DATE | 0.99+ |
iRobot | ORGANIZATION | 0.99+ |
Mark Grams | PERSON | 0.99+ |
TAMR | PERSON | 0.99+ |
10 years | QUANTITY | 0.99+ |
20 | QUANTITY | 0.98+ |
1/4 | QUANTITY | 0.98+ |
250 separate customer databases | QUANTITY | 0.98+ |
Cassandra | PERSON | 0.98+ |
First thing | QUANTITY | 0.98+ |
30,000,000 records | QUANTITY | 0.98+ |
both databases | QUANTITY | 0.98+ |
18 months ago | DATE | 0.98+ |
first | QUANTITY | 0.98+ |
M I t CDO | EVENT | 0.98+ |
One blunder | QUANTITY | 0.98+ |
Tamer | PERSON | 0.98+ |
one place | QUANTITY | 0.98+ |
second | QUANTITY | 0.97+ |
two choices | QUANTITY | 0.97+ |
tonight | DATE | 0.97+ |
each business unit | QUANTITY | 0.97+ |
Thio Thio | PERSON | 0.97+ |
two hands | QUANTITY | 0.96+ |
this week | DATE | 0.96+ |
Frank | PERSON | 0.95+ |
Duke | ORGANIZATION | 0.95+ |
Lars Toomre, Brass Rat Capital | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's the Cube, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to MIT, everybody. This is the Cube, the leader in live coverage. My name is Dave Vellante. I'm here with my co-host, Paul Gillin, in this day two coverage of the MIT CDOIQ conference. A lot of acronyms: MIT, of course, the great institution, and the Chief Data Officer Information Quality event; this is the 13th annual event. Lars Toomre is here. He's the managing partner of Brass Rat Capital. Cool name. Lars, welcome to the Cube. >> Thank you very much. >> Let's start with the name. Brass Rat Capital, what's that? >> Brass rat is a reference to the MIT school ring. OK, it's a beaver, but the students call it a brass rat, and I'm third-generation MIT, so it just seemed absolutely appropriate. That's the brass rat, and capital is not a reference to money but is actually a reference to intellectual capital. If you have five or six brass rats in the same company, you know, we're engineers, and we can do some things. >> And boy, if you put some data capital in there, you really get explosions. >> We cause a few problems. >> So we're going to talk about some new regulations that are coming down, new legislation that you exposed me to yesterday, which is going to have downstream implications. If you get ahead of this stuff and understand it, you can, first of all, prepare and make sure you're in compliance, but then potentially take advantage for your business. So explain to us this notion of the open government act. >> In the last five or six years or so, there's been an effort going on to increase transparency across all levels of government: state, local, and federal. The first of the federal government laws was called the DATA Act of 2014, and that was an act that was passed unanimously by Congress and signed by Obama.
It was taking the various departments and agencies of the United States government and trying to roll up all the expenses into one kind of expense report: this is where we spent our money, and this is who got the money. That's what they were trying to do. >> A big-picture type of thing. >> Yeah, a big-picture type of thing. But unfortunately it didn't work, because they forgot to include this odd word called semantics. So different departments did not mean the same thing by the same term. >> A data problem. >> They have a really big data problem. They still have it. So there are two GAO reports out criticizing how it was done, and the government's going to try and correct it. Then, earlier this year, there was another open government data act, and it was signed by Trump. This time you had maybe 25 negative votes, but otherwise it essentially passed Congress completely. It was called the OPEN, as in all capitals, O-P-E-N, Government Data Act. That has not been implemented yet, but there's lively talk around this conference today, and various chief data officers are talking about this requirement. It covers everything that is not intelligence, defense, you know, vital protection of the people type stuff; so all the, like, Interior, Treasury, Transportation, those types of systems. If you produce a report these days which is human readable, you must, in two or three years (I forget the exact implementation date), have it also be machine readable. Now, some people think machine readable means things like PDF formats, but no. In fact, what the government said is that it must be machine readable, so you must be able to get into the reports, and you have to be able to extract the information and attach it to the tree of knowledge. So all of a sudden we're having context. There are currently machine-readable, quote-unquote, SEC reports, and you can get into those SEC reports.
You pull out the net income information and it says it's net income, but you don't know what it attaches to on the tree of knowledge. So we are helping the government, in some sense, enable machine-readable reporting that we can do machine to machine, without people being involved. >> When you say the tree of knowledge, what are you talking about? >> The semantic tree of knowledge. You know, we all come from one concept: a human is an example of a living beast, and a living beast is an example of a living thing. So it all goes back, and as you get farther and farther out the tree, there's more semantic distance, but you can attach it back to a concept, so you can attach context to the various data. >> Is this essentially metadata? >> That's what people call it. But if I went over to CSAIL here at MIT, they would turn around and call it the tree of knowledge, or semantic data. It's referred to as semantic data, so you are passing not only the data itself but the context that goes along with the data. >> OK, how does this relate to financial transparency? >> Well, the Financial Transparency Act was introduced by Representative Issa, who's a Republican out of California. He ran the Government Affairs Committee in the House. He retired from Congress this past November, but in 2017 he introduced what's referred to as H.R. 1530, and 1530 is going to dramatically change the way financial regulators work in the United States. It was about to be introduced two weeks ago when the Libra digital currency stuff came up, so it's been delayed a little bit, because they're trying to add some of the digital currency legislation to that law. >> To front-run that? >> Well, I don't know exactly. Remember, it's all coming out of Maxine Waters' committee, so the staff is working on a bunch of different things at once.
But we at OMG were asked to consult with them on looking at the 1530 act and saying how we would improve it, quote-unquote, given our technical background. You know, we're not doing policy; we just look at the technical aspects of the act. How would we want to see it improved? So one of the things we have advised is that, for the first time in the history of the United States Code, they're going to include an interesting term called ontology. You know what an ontology is? Everyone gets scared by the word, and when I run into people, they say, are you a doctor? I say, no, no, no, I'm just a data guy. An ontology is like a taxonomy, but the ordering is important, and what an ontology allows you to do is, you know, give some context by linking something to something else. And so you're able to give more information with an ontology than you're able to with a taxonomy. >> OK, so it's a taxonomy on steroids? >> Yes, exactly. >> More flexible? >> Yes, but it's critically important for artificial intelligence and machine learning, because if I can give them an ontology of how things go up and down the semantics, I can turn around and do AI and machine learning problems on the order of 100, 1,000, even 10,000 times faster. And it has context. Just having a little bit of context speeds up these problems dramatically. >> And is that what enables the machine-to-machine notion? >> No, the machine to machine is coming in with something called SBRM, the Standard Business Report Model. It's an OMG specification of a way of allowing computers, or machines as we call them these days, to get into a standard business report. So let's say you're a drug company. You have to certify that a drug you manufactured in India gets to the United States safely. OK, you have various
reporting requirements along the way. You've got to give X to the SEC, the FDA, et cetera, and those won't always be in one standard format: the SEC has a different format, FERC has a different format. So what SBRM does is allow you to describe, in an ontology, what's in the report, and then it also allows you to attach an ontology to the cells in the report. So if you look at an SEC 10-Q or 10-K report, you can attach a US GAAP taxonomy or ontology to it and say, OK, net income, annual: that's part of the income statement. You should never see that in a balance-sheet-type item, as an example. And for the first time, by having that context, you can solve the problem that today you can file these machine-readable reports that are wrong. Believe it or not, there were about 50 cases in the last 10 years where SEC reports were filed where the assets don't equal total liabilities plus shareholder equity. You know, they just didn't add up. >> So double-entry accounting doesn't work. >> OK, so you could have the machines go and check at scale: hey, we've got a problem here. >> We've got a problem here, and you don't have to get humans involved. Holland and Australia are two leaders, ahead of the United States. In this area they've seen dramatic pickups. I mean, Holland is reporting something on the order of a 90% pickup. Australia is reporting a 60% pickup. >> When you say pickup, you're talking about pickup of errors? >> No, efficiency. Productivity. >> Productivity, OK. >> You're taking people out of the whole cycle. It's dramatic. >> OK, now what's the OMG's role in all of this? Explain the OMG.
>> The Object Management Group. I'm not speaking on behalf of them. It's a membership-run organization. >> And you're a member? >> I am a member, but I don't represent OMG. The membership has to collectively vote that this is what we think, so I can't speak for them. I have a pretty significant role with them: I run, on behalf of OMG, something called the Federated Enterprise Risk Management Group. That's the group which is focusing on risk management for large entities like the federal government's Veterans Affairs or the Department of Defense. Upstairs, I think talking right now, is the chief data officer for Transportation. That's a large organization, and they're instructed by OMB, at the chief financial officer level, that the number one thing to do for the government is to get an effective enterprise risk management model going in the government agencies. And so they come to OMG, just like NIST or DARPA does from the defense or intelligence side, saying we need to have standards in this area, so that not only can we talk to you effectively, but we can talk with our industry partners effectively on space programs, on retail, on medical programs, on finance programs. And at OMG there are two significant financial programs, or standards, that exist. One is called FIGI, the Financial Instrument Global Identifier, which is a way of identifying a swap or a security; it does not have to be a CUSIP, and it works worldwide. You can identify that, you know, IBM stock traded in Tokyo, so it has a different identifier than the one trading in New York. Those are called FIGI identifiers. Then there are attributes associated with that security, or whatever is being identified, which generally come out of FIBO, the Financial Industry Business Ontology. So, you know, it says that for a corporate bond it has a coupon, a maturity, semiannual payments, bullet, as an example. So that gives you all the information that you would need to go through the calculation, assuming you have a calculation routine to do it. Then you need to turn around and set up what we'll call your environment: you know, where forward yield curves are, for mortgage-backed securities or any callable bond.
You sort of probabilistically run the numbers many times and come up with an effective duration. And then you do your risk analytics, you know, aggregating the portfolio and looking at shortfalls versus your funding, or however you're doing risk management, and then finally you do reporting, which is where the Standard Business Report Model comes in. So those are kind of the five parts of doing a full enterprise risk model and analytics. >> So what does this mean? First, who does this impact, and what does it mean for organizations? >> Well, it's going to change the world for basically everyone, because it's like doing a kind of software upgrade, converting version one to version 2.0, and you know how everyone hates software upgrades. And it hurts, because everyone is now going to have to start using the same standard ontology. And, of course, no one completely agrees with that standard ontology, but the regulators have agreed to it. The ultimate controlling authority in this thing is going to be FSOC, which is the Dodd-Frank mandated response to not ever having another TARP. The Secretary of the Treasury heads it. It's, I forget, the federal systemic oversight committee or something like that. All eight regulators report into it, and OFR stands as the adviser to FSOC for all the analytics. What these laws are doing is giving OFR more and more power to turn around and look at how we're going to define data across all of them, so we can come up with consistent analytics, and we can therefore hopefully one day take something like Goldman Sachs's prepayment model on mortgages and apply it to Citibank's portfolio, so we can look at consistency of analytics as well. >> Is it only going to apply to regulated businesses? >> It's going to apply to regulated financial businesses. So it's going to capture all your mutual funds; it's going to capture all your investment advisers.
It's going to capture most of your insurance companies, through the Medicare side. It's going to capture all your commercial banks. It's going to capture most of your community banks; not all of them, because some of them are so small that they're not regulated on a federal basis. The one regulator which is being skipped at this point is the National Association of Insurance Commissioners, but they're apparently coming along as well, independent of federal legislation. Remember, they're regulated on the state level, not on the federal level. But they've kind of realized where the ball is going. >> Will this make life better or simply more complex? >> It's going to make life horrible at first, but we're going to get incredible efficiency gains, probably after the first time you get it done. The problem is going to be getting it done, with everyone agreeing that we use the same definitions of the same data. >> Who gets the efficiency gains? The regulators, the companies, or both? >> Everyone. Can you imagine? You know, a Goldman Sachs earnings report comes out. You're an analyst, looking at it: how do I know whether Goldman is good or bad? You have your own equity model. You just give the model to the semantic worksheet, and it will turn around and say, oh, those numbers are all good, this is what was expected. You could do that. There are examples of companies here in the United States that used to do competitive analysis taking somewhere on the order of 600 to 700 man-hours. By having the data available electronically, they cut those 600 hours down to five to do a competitive analysis. That's an example of the type of productivity you're going to see, both on the investment side when you're doing analysis and on the regulatory side. Can you imagine? You get a regulatory report and say, oh, they're way out of whack.
I can tell there's fraud going on here, because their numbers are too much in X, Y, Z. You know, you can fudge numbers today. >> And so the securities analyst can spend more of his or her time looking forward, doing forecasts and analysis, rather than having to look back and reconcile all this. >> Right. And, you know, you hear it throughout this conference, for instance: something like 80 to 85% of analysts' time is spent getting the data ready. >> You hear the same thing with data scientists. >> Right. And so to the extent that we can help define the data, we're going to speed things up dramatically. But then what's really interesting to me, being an MIT engineer, is that we have great possibilities in AI. I mean, really great possibilities. Right now, most of the AI models are pattern matching, like, you know, facial recognition technology; that's just really doing patterns. You can do wonderful predictive analytics with AI, but we need to give a lot of the AI models the context so they can run more quickly. So we're going to see a world, and it's going to sound funny, where we talk about semantic analytics. Semantic analytics means I'm getting all the inputs for the analysis with context attached to each one of the variables, and what comes out of it will be a variable result that also has semantics with it. So in the not-too-distant future, and already in some of the national labs, you're doing pipelines: one model goes to the next model, goes to the next model, and on it goes. So you're going to have software pipelines; believe it or not, you'll get them running out of an Excel spreadsheet, you know, a modern, enhanced Excel spreadsheet, and that's where the future is going to be. So if you're going to be really good in this business, you're going to have to be able to use your brain.
You have to understand what the data means. You have to figure out what your modeling really means. You know, normally for a lot of the stuff we do, we use bell curves. Well, that doesn't have to be the only distribution; you could do fat tails. So if you use fat-tailed distributions instead of a bell curve, you get much different results. Now, which one's better? I don't know; I'm just using it as an example. >> It's another cut on the data. So talk more about the tech behind this. You've mentioned AI. What about machine learning, deep learning? Add some color to that. >> Well, the tech behind it is, believe it or not, some relatively old tech. There is a technology called RDF, which has been around for a long time. It's a kind of, not machine learning, I'm sorry, machine-code type of thing: fairly simplistic definitions, lots of angle brackets and all this stuff. There is a higher level that was, I think, put into a standard in, like, 2004 or 2005, called OWL 2.0, and it does, at a higher level, a lot of the same stuff that RDF does. You could also create, believe it or not, your own special way of communicating an ontology just using XML. So XBRL is an enhanced version of XML. And so some of these older technologies, quote-unquote old, 20 years old, are essentially going to be driving a lot of this stuff. You know CORBA, right? CORBA is what made OMG, you know, on the communication side. Do you realize that basically every single device in the world has a CORBA standard in it? Yeah, an OMG standard is in all your smartphones and all your computers, and that's how they communicate. It turns out that a lot of this old stuff, quote-unquote, is so rigidly well defined and well done that you can build modern stuff that takes us to Mars based on these old standards. >> All right, we've got to go.
But I gotta give you the award for the most acronyms. >> H.R. 1530, FIGI, OMG, SBR >> FSOC, TARP, OFR. RDF we knew. OWL, XML, XBRL, CORBA, which of course >> I do. But that's well done. Lars, thanks so much for coming. It was great to have you. All right, keep it right there, everybody. We'll be back with our next guest from MIT CDOIQ right after this short message. Thank you.
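The bell-curve versus fat-tail remark in this interview can be made concrete with a small sketch. The speaker did not name a specific fat-tailed distribution, so the choice of a Student-t distribution with 3 degrees of freedom is an assumption made purely for illustration, as are the function names and thresholds. Both distributions are used at unit scale, not matched variance.

```python
import math


def normal_two_sided_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal (bell curve) variable Z."""
    return math.erfc(k / math.sqrt(2.0))


def student_t3_two_sided_tail(k: float) -> float:
    """P(|T| > k) for a Student-t variable with 3 degrees of freedom.

    Illustrative stand-in for a "fat tail" (an assumption, not from the
    interview). For 3 degrees of freedom the CDF has a closed form:
    F(t) = 1/2 + (1/pi) * (u / (1 + u^2) + atan(u)), where u = t / sqrt(3).
    """
    u = k / math.sqrt(3.0)
    cdf = 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi
    return 2.0 * (1.0 - cdf)


if __name__ == "__main__":
    # Compare how likely an "extreme" observation is under each assumption.
    for k in (2.0, 3.0, 4.0):
        bell = normal_two_sided_tail(k)
        fat = student_t3_two_sided_tail(k)
        print(f"|x| > {k}: bell curve {bell:.6f}, fat tail {fat:.6f}, "
              f"ratio {fat / bell:.1f}")
```

The further out the threshold, the larger the gap between the two tail probabilities, which is one way to see why the two modeling choices can give "much different results" for rare events.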
ENTITIES
Entity | Category | Confidence |
---|---|---|
Paul Gill | PERSON | 0.99+ |
Obama | PERSON | 0.99+ |
Trump | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Lars | PERSON | 0.99+ |
India | LOCATION | 0.99+ |
2017 | DATE | 0.99+ |
David | PERSON | 0.99+ |
five | QUANTITY | 0.99+ |
Goldman | ORGANIZATION | 0.99+ |
Issa | PERSON | 0.99+ |
Federated Enterprise Risk Management Group | ORGANIZATION | 0.99+ |
80 | QUANTITY | 0.99+ |
600 hours | QUANTITY | 0.99+ |
Financial Transparency Act | TITLE | 0.99+ |
Congress | ORGANIZATION | 0.99+ |
60% | QUANTITY | 0.99+ |
Maxine Waters Committee | ORGANIZATION | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
Tokyo | LOCATION | 0.99+ |
90% | QUANTITY | 0.99+ |
20 years | QUANTITY | 0.99+ |
United States | LOCATION | 0.99+ |
Maria | PERSON | 0.99+ |
600 | QUANTITY | 0.99+ |
National Association Insurance Commissioners | ORGANIZATION | 0.99+ |
Brass Rat Capital | ORGANIZATION | 0.99+ |
California | LOCATION | 0.99+ |
Citibank | ORGANIZATION | 0.99+ |
Goldman Sachs | ORGANIZATION | 0.99+ |
Excel | TITLE | 0.99+ |
FERC | ORGANIZATION | 0.99+ |
Lars Toomre | PERSON | 0.99+ |
15 30 | TITLE | 0.99+ |
2005 | DATE | 0.99+ |
two leaders | QUANTITY | 0.99+ |
Cambridge, Massachusetts | LOCATION | 0.99+ |
SEC | ORGANIZATION | 0.99+ |
Australia | LOCATION | 0.99+ |
three years | QUANTITY | 0.99+ |
three | QUANTITY | 0.99+ |
7 | QUANTITY | 0.99+ |
NIST | ORGANIZATION | 0.99+ |
Open Data Act of 2014 | TITLE | 0.99+ |
25 negative votes | QUANTITY | 0.99+ |
85% | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
50 | QUANTITY | 0.99+ |
two years | QUANTITY | 0.99+ |
Sarah | PERSON | 0.99+ |
yesterday | DATE | 0.99+ |
Veterans Affairs | ORGANIZATION | 0.99+ |
five parts | QUANTITY | 0.99+ |
both | QUANTITY | 0.98+ |
first time | QUANTITY | 0.98+ |
Republican | ORGANIZATION | 0.98+ |
one | QUANTITY | 0.98+ |
two weeks ago | DATE | 0.98+ |
one concept | QUANTITY | 0.98+ |
DARPA | ORGANIZATION | 0.98+ |
10,000 times | QUANTITY | 0.98+ |
first | QUANTITY | 0.98+ |
New York | LOCATION | 0.98+ |
Alex | PERSON | 0.98+ |
United States government | ORGANIZATION | 0.98+ |
Vader | PERSON | 0.98+ |
one day | QUANTITY | 0.98+ |
about 50 cases | QUANTITY | 0.98+ |
Treasury | ORGANIZATION | 0.97+ |
government Affairs Committee | ORGANIZATION | 0.97+ |
Mars | LOCATION | 0.97+ |
Object Management Group | ORGANIZATION | 0.97+ |
Government Data act | TITLE | 0.96+ |
earlier this year | DATE | 0.96+ |
OMG | ORGANIZATION | 0.96+ |
Teff | PERSON | 0.96+ |
100 | QUANTITY | 0.96+ |
six years | QUANTITY | 0.96+ |
Beaver | PERSON | 0.95+ |
two significant financial programs | QUANTITY | 0.94+ |
two point | QUANTITY | 0.94+ |
third generation | QUANTITY | 0.94+ |
Matt Kobe, Chicago Bulls | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to MIT in Cambridge, Massachusetts, everybody. You're watching theCUBE, the leader in live tech coverage. My name is Dave Vellante, and it's my pleasure to introduce Matt Kobe, who's the vice president of business strategy and analytics for the Chicago Bulls. We love talking sports. We love talking data. Matt, thanks for coming on. >> No problem. Good to be here. >> So talk about your role as the head of analytics for the Bulls. >> Sure. So I work exclusively on the business side of the operation. We have a separate team that does the basketball side, which is kind of your player stuff. But on the business side, what we're focused on is really two things. One is being essentially internal consultants for the rest of the customer-facing functions. So we work a lot with ticketing, a lot with sponsorship, marketing, digital, all of those folks that engage with our customer base. And then on the back end of it, we're building out the technical infrastructure for the organization, right? So everything from data warehouse to CRM to email marketing, all of that sits with my team. And so we wear a lot of hats, which is exciting. But at the end of the day, we're trying to use data to enhance the customer and fan experience. That's our aim, and that's what we're driving towards. >> Success in sports, in a larger respect, has come down to, don't be offended by this, who's got the best geeks. Now, your side of the house is not about, like you say, player performance; it's about the business performance. But that's a big part of getting the best players, I mean, if it's successful, and with all the nuances of the NBA salary cap and everything else. But I think there is one, and so that makes it even more important. But you're helping fund, you know, that in various ways. So are the two teams completely separate? Is there a Chinese wall between them? Or are you part of the sort of same group? >> We're pretty separate. So the basketball folks do their thing. The business folks do their thing. From an analytics standpoint, we meet and we collaborate on tools and other methods of actually doing the analysis. But in terms of the analysis itself, there is a little bit of separation there, and mainly that is from a priority standpoint. Obviously, the basketball stuff is the most important stuff. And so if we were working on both sides, we'd always be doing the basketball stuff, and the business stuff needs to get done. >> It would drag you in, exactly. Okay. But which came first, the chicken or the egg? Was it sort of the post-Moneyball activity applied to the NBA, and I want to ask you a question about that, and then somebody said, hey, we should do this for the business side? Or was the business side sort of always there? >> I think on the business side, in probably the last five to seven years, you've really seen it grow. I've been with the Bulls for five years. If you look at the NBA seven or eight years ago, there was a handful of business analytics teams, and those teams had one or two people on them. Now every single team in the NBA has some sort of business analytics team, and the average staff is seven. My staff is six full-time folks plus myself, so we're right at the average. And I think what you've seen is everything has become more complex in sports, right? If you look at ticketing, you've got all the secondary markets. You have all this data flowing in, and they need someone to make sense of all that data. If you look at sponsorship, sponsorship has transitioned from selling a sign that sits on the side of the court to these truly integrated partnerships, where our partners are coming to us and saying, what do we get out of this? What's our return? And so you're seeing a lot more collaboration between analytics and sponsorship to go back to those partners and say, hey, here's what we delivered. And so I think it started on the basketball side, certainly, because that is the most important piece. But it quickly followed on the business side, because they saw the value that that type of thinking can bring to the business. >> So I know this is not, you know, your swim lane, but, you know, the lore of Billy Beane and Moneyball and all that as sort of the starting point for sports analytics, is that a fair characterization? I mean, was that really the mainspring? >> I think it probably started even before that. If you've gotten to see Billy Beane at the MIT Sports Analytics conference, he always references Bill James as first, and so I think it started there. Baseball was, I wouldn't say the easiest place to start, but it's one versus one, right? It's pitcher versus batter. In a lot of cases, basketball is a little bit more fluid. It's a team sport, so it's a little harder. But I think as technology has advanced, there have been more and more opportunities to do the analytics on the basketball side and on the business side. I think what you're seeing, and what we've heard in the first day and a half here, is this huge influx of data. Not nearly to the levels of the Mastercards and others of the world, but as more and more things move to the mobile phone, I think you're going to see this huge influx of data on the business side, and you're going to need the same systems and the same sort of approach to tackle it. >> So Bill James is the ultimate sports geek, and he's responsible for all these stats that, you know, none of us understand. He's why we don't pay attention to batting average anymore. Of course, I still do. So let's talk about the business side of things.
If you think about the business of baseball, you know, it's all about maximizing the gate. Yeah, there's some revenue, a lot of revenue of course, from TV. But it's not like football, which is dominated by TV. Basketball, I think, is probably a mix, right? You've got an 80, whatever, 82-game season, so filling up the stadium is important. Obviously, the NBA has done a great job of really getting it right. Free agency is, like, fascinating now. >> It's 12 months a year. >> Right. We talk about the NBA all the time. And of course, you know, celebrities like LeBron have certainly helped, and now a whole batch of others. But what does the money side of the NBA look like? Where's the money coming from? >> Yeah, I mean, I think you certainly have broadcast, right? But in many ways, national broadcast sort of takes care of itself, from the standpoint that my team doesn't have a lot of control over national broadcast money. That's a league-level thing. And so the two big buckets that we have control over are ticketing and sponsorship. Those are the two big buckets of revenue that my team spends a lot of time on. Ticketing is one that is important from the standpoint, as you say, of how do we fill the building, right? We've got 41 home games, plus three preseason games. We've got 44 events a year. Our goal is to fill the building for all 44 of those events. We do a pretty good job of doing it, but that has cascading effects into other revenue streams, right? As you think about concessions and merchandise and sponsorship, it's a lot easier to sell sponsorship when your building is full than if your building isn't full. And so our focus is on how do we fill the building in the most efficient way possible. And as you have things like the secondary market, and people have access to tickets in different ways than they did 10 to 15 years ago, I think that becomes increasingly complex.
But that's the fun area. That's where we spend a lot of time. There's the pricing, there's inventory management. If you look at a traditional CPG, there are some of those same principles being applied. Or if you look at an airline, right? They're selling a plane. It's an asset you have to fill. We have a building. That's an asset we have to fill, and how do we fill it in the most optimal way? >> So the idea of surge pricing, demand and supply. Several years ago, the Red Sox went to tiered pricing. Do you guys do the same? If the Sox are playing the Kansas City Royals, tickets are way cheaper than if they're playing the Yankees. Do you guys do something similar? >> We do it for single-game tickets. For our season ticket holders, it's the same price for every game. But we vary the price for primary tickets for single games, right? So if we're playing, you know, this year it will be the Clippers and the Lakers, that price is going to be much more expensive. So we dynamically price on a game-to-game basis. But our season ticket holders pay the same. >> Why don't you do it for the season ticket holders? You just haven't gone there yet? >> Yeah, I mean, some teams have, right? So there are a few different approaches you can take to variably price those tickets. I think for us, in years past, in the last few years in particular, there have been a couple of flagship games, and then every other game feels similar. I think this will be the first year where you have eight to 10 teams that really have a shot at winning the title, and so I think you'll see a more balanced schedule. And so we've talked about it a lot. We just haven't made that move yet. >> Well, as a season ticket holder who shares his tickets with seven other guys, with the Red Sox you could buy a BMW, so you share the tickets. But I would love it if they didn't do the tiered pricing. As a season ticket holder, I hope you hold off a while. But I don't know, it could maximize revenues. If the Red Sox did it, it was probably not a stupid thing; they're smart people. What about the sponsorships? It's fascinating about the partners looking for ROI. How are you measuring that? You're forging a tighter relationship, obviously, with the sponsors and these partners. What does that ROI look like? >> It's measured a variety of ways, largely based on the assets that they deliver. But I think every single partner we talk to these days, and I also lead the sponsorship team, so, it's rare in sports, but I oversee both the business strategy and analytics team and the sponsorship team. It's not in my title, but in practice, that's what I do. And I think everyone we talk to wants digital, right? We've got over 25 million social media followers with the Bulls, right? We've got 19 million on Facebook alone. And so sponsors see those numbers, and they know that we can deliver impressions. They know we can deliver engagement, and they want access to those channels. And so, I always call it return on objectives, right? Return on investment is a little bit tricky, but return on objectives is, if we're trying to build brand awareness, we're going to go back to them and say, here's how many people came to our arena and saw your logo and saw the feature that you had on the scoreboard. If you're on our social media channels or our website, here's the number of impressions you got. Here is the number of engagements you got. I think where we're at now is more is still better, right? Everyone wants the big numbers. I think where you're starting to see it move, though, is that more isn't always better.
We want the right folks engaging with our brands, and that's really what we're starting to think about. If you get 10 million impressions, but they're 10 million impressions to the wrong group of potential customers, that's not terribly helpful for a brand. We're trying to work with our brands to reach the right demographics that they want to reach in order to actually build that brand awareness they want to build. >> What are your primary social channels? Twitter, obviously. >> So every platform has a different purpose. We have Facebook, Twitter, Instagram, Snapchat. We're on Weibo in China. And, you know, every platform has a different function. Twitter's obviously more real-time news. You know, the timeline stuff, it falls off really quick. Instagram is really the artistic piece of it, and then Facebook is a blend of both, and so that's kind of how we deploy our channels. We have a whole social team that generates content and pushes that content out. But those are the channels we use, and those are incredibly valuable. Now what you're starting to see is those channels are changing very rapidly, based on their own sets of algorithms for how they deliver content to fans. And so we're having to continue to adapt to those changing environments in social. >> So, impressions. The term "impressions" varies by platform. I'm more familiar with Twitter impressions. They have a definition: it's not just somebody who might have seen it, it's somebody that they believe actually spent a few seconds looking at it. They have some algorithm to figure that out. Is that a metric that you're finding your brands are buying into, for example? >> Yeah. I mean, I think certainly they view it as kind of the old, you know, when you bought TV ads, how many households saw my commercial, right? It's a similar type of metric of how many eyeballs saw a piece of content that we put out. I think the metric more people are starting to care about is engagements, which is how many people actually engaged with that piece of content, whether it's a like, a comment, a share, because then that's actual. Yeah, you might have seen it for three seconds, but we know how things work. You're scrolling pretty fast. But if you actually stopped to engage with something, that's where I think brands are starting to see value. And as we think about our content, we have a framework that our digital team uses, and one of the pillars of that is thumb-stopping. We want to create content that is thumb-stopping, that people actually engage with. And that's been a big focus of ours the last couple of years. >> I presume using video, huge? >> Video. We've got a whole graphics team that does custom graphics, whether it's for stats or for historical anniversaries. We have a whole in-house production team that does higher-end work, and then our digital team does more kind of straight-from-the-phone raw footage. So we're using a variety of different mediums to reach our fans. >> Matt, what's your background? How'd you get into all of this? >> I spent seven years in consulting. I worked for Deloitte in their strategy group out of Chicago, and I worked for CPG companies at the intersection of retailer and CPG. So a lot of in-store promotional work, helping brands think through general revenue management, pricing strategy, promotional strategy. And I stumbled upon greatness with the Bulls job. A friend gave me the heads-up that they were looking to fill this type of role, and I was able to get my resume in the mix, and I was lucky enough to get the job. When I started, it was a team of one. Five years later, we're a team of six, and we'll probably keep growing. So it's been an exciting ride. >> And your background is math? >> It's business. I went to Indiana for undergrad business, and then went to Kellogg at Northwestern and got an MBA in strategy, so that's my background. But, you know, I've dabbled in sports. I worked for the Chicago 2016 Olympic bid back in the day when I was at Deloitte. And so it's always been a dream of mine. I just never knew how I'd get there. I always wanted to work in sports; I just didn't know the path. And I was lucky enough to find the path a lot earlier than I thought. >> How about this conference? I know you've been to the other MIT event. How about this one? What are some of the key takeaways, do you think? >> I think it's been great, because a lot of the conferences we go to are really sports focused. So you've got the MIT Sports Analytics conference. You have SEAT. You have NBA-type programming that they put on. But it's nice to get out of sports and sort of see how other, bigger industries are thinking about some of the problems, specifically around data management and the influx of data and how they're thinking about it. It's always nice to kind of elevate and just have some room to breathe and think and meet people that are not in sports, and start to build those, you know, relationships with thought leaders and things like that. So it's been great. It's my first time here. We'll probably be back. >> Good. Well, hopefully you get to see a game, even though the Sox aren't playing that well. Thanks so much for coming on theCUBE. >> No problem. Thanks for having me. >> It was great to have you. All right, keep it right there, everybody. We'll be back with our next guest. Paul Gillin and Dave Vellante here in the house. You're watching theCUBE from MIT CDOIQ. We'll be right back.
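The distinction drawn in this interview between raw impressions, engagements, and reaching the right audience can be sketched as simple arithmetic. The class name, field names, and all numbers below are hypothetical and chosen only for illustration; they are not the Bulls' actual metrics or reporting method.

```python
from dataclasses import dataclass


@dataclass
class CampaignStats:
    """Hypothetical per-campaign numbers a team might report to a sponsor."""
    impressions: int      # how many times the content was shown
    engagements: int      # likes + comments + shares
    target_share: float   # assumed fraction of impressions in the sponsor's target demographic


def engagement_rate(stats: CampaignStats) -> float:
    """Engagements per impression; 0.0 when nothing was shown."""
    if stats.impressions == 0:
        return 0.0
    return stats.engagements / stats.impressions


def on_target_impressions(stats: CampaignStats) -> float:
    """Impressions that reached the audience the sponsor actually wants."""
    return stats.impressions * stats.target_share


if __name__ == "__main__":
    # Illustrative numbers only: a broad campaign versus a smaller, focused one.
    broad = CampaignStats(impressions=10_000_000, engagements=50_000, target_share=0.05)
    focused = CampaignStats(impressions=2_000_000, engagements=60_000, target_share=0.60)
    for name, c in (("broad", broad), ("focused", focused)):
        print(f"{name}: rate={engagement_rate(c):.3%}, "
              f"on-target={on_target_impressions(c):,.0f}")
```

Under these made-up numbers the focused campaign has fewer impressions but more on-target reach and a higher engagement rate, which is the "more isn't always better" point in miniature.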
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Volante | PERSON | 0.99+ |
Matt Kobe | PERSON | 0.99+ |
19,000,000 | QUANTITY | 0.99+ |
Cuba | LOCATION | 0.99+ |
8 | QUANTITY | 0.99+ |
Deloitte | ORGANIZATION | 0.99+ |
Red Sox | ORGANIZATION | 0.99+ |
Clippers | ORGANIZATION | 0.99+ |
China | LOCATION | 0.99+ |
Billy | PERSON | 0.99+ |
five years | QUANTITY | 0.99+ |
Bill James | PERSON | 0.99+ |
seven | QUANTITY | 0.99+ |
Chicago | LOCATION | 0.99+ |
Matt | PERSON | 0.99+ |
Yankees | ORGANIZATION | 0.99+ |
Paul Gill | PERSON | 0.99+ |
Lakers | ORGANIZATION | 0.99+ |
seven years | QUANTITY | 0.99+ |
BMW | ORGANIZATION | 0.99+ |
three seconds | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
Chicago Bulls | ORGANIZATION | 0.99+ |
80 | QUANTITY | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
Cambridge, Massachusetts | LOCATION | 0.99+ |
single | QUANTITY | 0.99+ |
MasterCard | ORGANIZATION | 0.99+ |
two teams | QUANTITY | 0.99+ |
two big buckets | QUANTITY | 0.99+ |
82 game | QUANTITY | 0.99+ |
Sox | ORGANIZATION | 0.99+ |
seven other guys | QUANTITY | 0.99+ |
M. I T Sports Analytics | EVENT | 0.99+ |
10,000,000 impressions | QUANTITY | 0.99+ |
Bulls | ORGANIZATION | 0.99+ |
three preseason games | QUANTITY | 0.99+ |
M I t Sports Analytics | EVENT | 0.99+ |
two things | QUANTITY | 0.99+ |
two people | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
single games | QUANTITY | 0.99+ |
Five years later | DATE | 0.98+ |
ORGANIZATION | 0.98+ | |
several years ago | DATE | 0.98+ |
10 teams | QUANTITY | 0.98+ |
41 home game | QUANTITY | 0.98+ |
Northwestern | ORGANIZATION | 0.98+ |
both sides | QUANTITY | 0.98+ |
first time | QUANTITY | 0.98+ |
ORGANIZATION | 0.98+ | |
LeBron | PERSON | 0.98+ |
both | QUANTITY | 0.98+ |
10 | DATE | 0.98+ |
Alex | PERSON | 0.98+ |
this year | DATE | 0.97+ |
Kansas City Royals | ORGANIZATION | 0.97+ |
One | QUANTITY | 0.97+ |
12 months a year | QUANTITY | 0.97+ |
first year | QUANTITY | 0.97+ |
78 years ago | DATE | 0.95+ |
single game tickets | QUANTITY | 0.95+ |
M I T. Event | EVENT | 0.94+ |
1/2 | QUANTITY | 0.94+ |
Indian | OTHER | 0.94+ |
ORGANIZATION | 0.94+ | |
ORGANIZATION | 0.93+ | |
7 years | QUANTITY | 0.92+ |
first day | QUANTITY | 0.92+ |
15 years ago | DATE | 0.92+ |
44 of those events | QUANTITY | 0.91+ |
six full | QUANTITY | 0.91+ |
Maura's Bad Morris | ORGANIZATION | 0.9+ |
a week | QUANTITY | 0.9+ |
Snapchat | ORGANIZATION | 0.9+ |
M. I. T. | PERSON | 0.9+ |
over 25,000,000 social media followers | QUANTITY | 0.88+ |
seconds | QUANTITY | 0.88+ |
Last couple years | DATE | 0.88+ |
N. B. | LOCATION | 0.87+ |
Stewart Bond, IDC | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to MIT CDOIQ, everybody. You're watching theCUBE. We go out to the events and extract the signal from the noise. This is day one of this conference, the Chief Data Officer event. I'm Dave, along with my co-host, Paul Gillin. Stewart Bond is here. He's a research director at International Data Corporation, IDC. Stewart, welcome to theCUBE. Thanks for coming on. >> Thank you for having me. >> You're very welcome. So your space is data intelligence. Tell us about your swim lane. >> Sure. So my role at IDC is, I follow the data integration and data intelligence software market. So I follow all the different vendors in the market. I look at what kinds of solutions they're bringing to market and what kinds of problems they're solving, both business and technical, for their clients. And so I can then report on the trends and market sizes, forecasts and such. And within that, part of what I cover is everything from data integration, which is more traditionally ETL, change data capture, data movement, and data virtualization types of technologies, as well as what we call data integrity and what I'm calling data intelligence, which is all of the metadata about the data. It's the data catalogs, metadata management, data lineage. It's the data quality, data profiling, master data intelligence. It's all of the data about the data, and really answering what I call the five W's and H of data: the who, what, where, when, why and how of data. So that's the market that I'm covering and following, and that's why I'm here. >> Were you here this morning for Mark Ramsey's talk? >> Yes. >> So he kind of went through, you heard it, he started with the EDW, kind of threw ETL under the bus, as well as MDM, then the enterprise data model. He said all that failed.
But that stuff's not going away, and I'm sure they're, like, still using, you know, all that tooling today. So what was your reaction to that? Were you nodding your head and saying, yeah, it's true? Or saying, well, maybe that's a little... We've been saying the mainframe is going to go away for years, and it's >> still around. So I think obviously there are still those technologies out there, and they're still being used. You can look at any of the major ETL vendors, and there are new ones coming to the market, so that's still alive and well. There's no doubt that it's out there, and it's the biggest segment of the market that I follow. >> So there's open source tooling, right? >> Yes, there's no doubt that it's still there. But Mark's vision of where things are going, where things are heading, with data intelligence really being at the core, he talked about those spiders, he talked about that central repository of information, of knowledge about the data. That's where things are heading, whether you call it a data hub or whether you call it a data platform. Not really one big, huge data pot or one big, huge data repository, but a place where you can go to get the information about the data. You can find out where the data is. You can find out what it means, both the business context as well as the technical information. You can find out who's using that data. You can find out when it's being used, why it's being used, why we even have it, and how it should be used, so it's being used appropriately. >> So you would say that his vision, actually what he implemented, was visionary. They skated to the puck, so to speak, and that's what we're going to see more of? >> We're going to see more of that. We are seeing more of that. That's why we've seen such a jump in the number of vendors that are providing data catalog solutions. IDC has this work product called a market glance. I did that at the beginning of 2018. I just did it again.
In the middle of this year, the number of vendors that offer data catalog solutions has significantly increased: a 240% increase in the number of vendors that offer that now, albeit off a small base. These are not exhaustive studies. It may be that I didn't know about all those data catalog vendors a year and a half ago, but it may also be that people are now saying that they've got a data catalog. But you've really got to peel back the layers a little bit and understand what these different data catalogs are and what they're doing, because not all of them are created equal. >> Well, if it's not on your radar, if you don't know about it, 99% of the world doesn't. Mark talked this morning about some interesting new technologies. They were using spidering to find the data, bots to classify the data, tools to wrangle the data. I mean, there's a lot of new technology being applied to this area. Which of those technologies do you think has the greatest promise right now? And how automated can this process become? >> It's the spidering, and it's the cataloging of the data. It's understanding what you've got out there. That is growing like crazy. I just started to track it, and it's growing a lot. That has the most promise. And as I said, I think that's going to be the data platform in the future: the intelligence, knowing where your data is so you can go and get it. You know, it's not a matter of all the data being in one place anymore. Data's everywhere. Data is in hybrid cloud. It's on premises. It's in private cloud. It's hosted. It's everywhere. I just did a survey. I got the results back in June 2019, just a month ago, and the data is all over the place. So really having that knowledge, having that intelligence about where your data is, that has the most promise. As far as the automation is concerned, the next step there is not just about collecting the information about where your data is, but actually applying the analytics, the machine learning and the artificial intelligence to that metadata collection that you've got, so that you can then start to create those bots, to create those pipelines, to start to automate those tasks. We're starting to see some vendors move in that direction. There's a lot of promise there.
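As an editorial aside, the "who, what, where, when, why and how of data" that a data catalog is meant to answer can be sketched as a tiny in-memory structure. Everything here (the `CatalogEntry` and `DataCatalog` names, the fields, the sample values) is hypothetical and for illustration only; it does not describe any vendor's product or the system discussed in the keynote.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class CatalogEntry:
    """Metadata about one data set: the five W's and H."""
    what: str        # business meaning of the data
    where: str       # physical location (cloud bucket, database, file share)
    who: List[str]   # teams or people known to use it
    when: date       # last time usage was observed
    why: str         # the reason the organization holds it
    how: str         # guidance on appropriate use


class DataCatalog:
    """A minimal, hypothetical catalog keyed by data set name."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, name: str, entry: CatalogEntry) -> None:
        """Record (or overwrite) the metadata for a data set."""
        self._entries[name] = entry

    def locate(self, name: str) -> str:
        """Answer 'where is this data?' for a known data set."""
        return self._entries[name].where

    def search(self, keyword: str) -> List[str]:
        """Names of data sets whose business meaning mentions the keyword."""
        needle = keyword.lower()
        return sorted(n for n, e in self._entries.items()
                      if needle in e.what.lower())


if __name__ == "__main__":
    catalog = DataCatalog()
    catalog.register("orders_2019", CatalogEntry(
        what="Customer orders placed in 2019",
        where="private-cloud://warehouse/orders_2019",
        who=["finance", "marketing"],
        when=date(2019, 7, 1),
        why="Revenue reporting",
        how="Aggregate only; no customer-level export",
    ))
    print(catalog.locate("orders_2019"))
    print(catalog.search("customer"))
```

A spidering process of the kind described in the interview would, under this sketch, be whatever populates `register` automatically instead of by hand.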
It's not just about collecting the information about where your data is. It's actually applying the analytics, the machine learning and the artificial intelligence to that metadata collection that you've got, so that you can then start to create those bots, to create those pipelines, to start to automate those tasks. We're starting to see some vendors move in that direction. There's a lot of promise there. >> You guys, at least as I remember, IDC's software taxonomy is pretty robust. I'm sure it's evolved over the years. So how do you sort of define your space? I'm interested in how big that space is, you know, in terms of market size. Is it growing, and where do you see it going? >> Right. So my coverage of data integration and data intelligence is fairly small. It's a small little market at IDC. I'm part of a larger team that looks at data management, analytics and information management. So we've got people on our team like Dan Vesset, who covers analytics and advanced analytics, Chandana Gopal, and Carl Olofson, who's been on the Cube and covers innovative technologies. So I apologize, I don't have that number off the top of my head. >> Okay. No, but your space... That software market is so fragmented. And what IDC has always done well is you put people on those fragments and you go deep in there. So I don't know how you've been able to not make your eyes bleed when you do that. It's challenging to get the data and put it all together. >> It's important. The integration market is about six, six and a half billion >> dollars. Substantial size. Yeah, but again, a lot of vendors. A growing number of vendors, and the market's growing. >> The market continues to grow as the data is becoming more distributed, more dispersed. There's a need to continue to integrate that data. There's also that growing >> need for that data intelligence.
It's not >> just, you know... We've had a lot of inquiries lately about data being fed into machine learning and artificial intelligence, and people realizing, our data isn't clean. We have to clean up our data, because garbage in, garbage out is probably more important now than ever before, because you don't have someone saying, I don't think that data is right. You've got machines that are looking at data instead. The technology is out there, and the problem with data quality is not a new problem. It's the same problem we've had for years. All of the technology is there to clean that data up, and that's a part of what I look at: the data quality vendors, Experian, Syncsort, and all of the other data quality capabilities that you get from Informatica, from Talend, or from Qlik and Podium. All of that is there, and so that part is growing. And there's a lot more interest in that data quality and that data intelligence side, again, so the right data can be used. Good data can be used, the trust in that data can increase, and it can be used for the right reasons as well. That's adding that context, understanding the semantics, having all that metadata that goes around that data so that it can be used. >> It's one of those markets that may be relatively small. It's not 100 billion, but it enables a lot of larger markets. So okay, so it's six, six and a half billion, and it's growing. Is it growing single digits, double digits? It's growing. It's hovering around double digits. Okay, it's 10%. And then who are the, you know, big players? Who's driving the shares there? Is there a dominant player, or a bunch of... >> So Informatica is number one in the market, followed by IBM, and SAP is right up there. SAS is there. Talend is making a good... they're making a nice... Yeah, but there's a number of different players. There are a lot of different players in that market.
>> And the leading market share player has what, 10%? 15%? 50%? Is it like a dominant spot? That's tough to say. It's over a billion, right? So they've got maybe one-sixth of the market. Okay, so it's not like Cisco, which has two-thirds of the networking market, or anything like that. And what about the cloud guys? Are they participating in this space? >> The cloud guys? Yeah, so there are some pure cloud solutions. There's Reltio, for example, pure cloud MDM, master data management. I'd say there's less pure cloud than there used to be. But, you know, someone like Informatica is really pushing that cloud presence, >> running these tools, this tooling, in the cloud. But the cloud guys directly are not competing at this >> point? So Amazon, Google? Yes, those cloud guys. Yes. Okay. Google announced Dataflow back in... sorry, Data Fusion. >> Yeah, that's right. >> And so they've got an ETL tool on the cloud now. Amazon has Glue, which is both a catalog and an ETL tool. Microsoft, of course, has Data Factory in Azure. >> So those guys are coming on. I'm guessing if you talked to Informatica they'd say, well, they're not as robust as we are, and we've got a big install base, and we go multi-cloud. Is that kind of posturing of the incumbents, or... Yeah, that's posturing. And I don't mean that as a pejorative. If I were those guys, I'd be doing the same thing. You know, we were talking earlier about how the cloud guys essentially killed Hadoop. Do you see the same thing happening here, or will the tool vendors be able to stay ahead, in your view? >> It depends on how they execute.
If they're there and they're available in the cloud along with those cloud providers, if they're able to provide solutions in the same way, with the same elasticity, the same type of consumption-based pricing models that the cloud vendors are offering, they can compete with that. They still have a better solution. >> And multi-cloud and hybrid is a big part of their value proposition that the cloud guys aren't really going hard after. I mean, they're sort of dangling a toe in the water, some of them. Some of the >> cloud guys, they have the hybrid capabilities, because some of what they built comes from the on-premises world as well. So they've got that ability, Microsoft in particular. >> And Google? >> Google, the Data Fusion came out of... >> You're saying... but it's part of the Anthos initiative. >> Um, I apologize, folks who are watching, >> for the soup of acronyms we're starting to get into a little bit. What tools or technology have you seen making governance of unstructured data look promising? So I don't really cover >> the unstructured data space that much. What I can say is, just as in the structured data world, it's about the metadata. It's about having the proper tags on that unstructured data. It's about getting the information about that unstructured data so that it can then be governed appropriately, making structure out of that. I can't really say, because I don't cover that market explicitly. But I think again it comes back to the same type of data intelligence, having that intelligence about that data by understanding what's in there. >> What advice are you giving to, you know, the buyers in your community and the sellers in your community? >> So for the buyers within the market, I talk a lot about the need for that data intelligence. Data governance, to me, is not a technology. You can't go buy data governance. Data governance is an organizational discipline. Technology is a part of that.
To me, the data intelligence technology is a part of that. So, really, if organizations want to get a good handle on what data they have, how to use it, and how to be enabled by that data, they need to have that data intelligence and to go look for solutions that can help them pull that data intelligence out. But the other part of that is measurement. It's critical to measure, because you can't improve what you're not measuring. So, you know, that type of approach to it is critical, and you've got to be able to have people in the organization, you've got to be able to have cooperation and collaboration across the business, IT, the chief data officer's office. You've got to have that collaboration, and you've got to have accountability, in order for that to really be successful. For the vendors in the space, hybrid is the new reality. My survey data shows clearly that hybrid is where things are. It's not just cloud, it's not just on premises. Hybrid, that's where the future is. They've got to be able to have solutions that work in that environment, working in that hybrid cloud. They've got to be able to have solutions that can be purchased and used, again, in the same sort of elastic type of method that consumers are able to get services from other vendors in that same... >> All right, so we've got to run. Thank you so much for sharing your insights and your data. And I know I was firing a lot of questions. Did pretty well, not having the report in front of me. I know what that's like. So thank you for sharing, and good luck with your challenges in the future. You've got a lot of data to collect and a lot of fast-moving markets. So come back any time. Share with you right now, okay? And thank you for watching. Paul and I will be back with our next guest right after this short break from MIT CDOIQ. We'll be right back.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Paul Gillen | PERSON | 0.99+ |
June 2019 | DATE | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
50% | QUANTITY | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
10% | QUANTITY | 0.99+ |
15% | QUANTITY | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
Mark | PERSON | 0.99+ |
Mark Ramsey | PERSON | 0.99+ |
Samantha | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Stuart Bond | PERSON | 0.99+ |
Attica | ORGANIZATION | 0.99+ |
66 | QUANTITY | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
International Data Corporation | ORGANIZATION | 0.99+ |
240% | QUANTITY | 0.99+ |
Dave | PERSON | 0.99+ |
Cambridge, Massachusetts | LOCATION | 0.99+ |
99% | QUANTITY | 0.99+ |
a month ago | DATE | 0.99+ |
1/6 | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
100,000,000,000 | QUANTITY | 0.98+ |
MGM | ORGANIZATION | 0.98+ |
one | QUANTITY | 0.98+ |
Duke | ORGANIZATION | 0.98+ |
a year | DATE | 0.97+ |
Paul | PERSON | 0.95+ |
today | DATE | 0.94+ |
Eve | PERSON | 0.94+ |
M I. T. CDO | PERSON | 0.94+ |
Nautical Palo Carlson | ORGANIZATION | 0.93+ |
this morning | DATE | 0.92+ |
ClA | ORGANIZATION | 0.92+ |
Easton | PERSON | 0.92+ |
1/2 ago | DATE | 0.9+ |
about 66 | QUANTITY | 0.89+ |
2/3 | QUANTITY | 0.88+ |
1/2 1,000,000,000 | QUANTITY | 0.88+ |
Cory | ORGANIZATION | 0.86+ |
one place | QUANTITY | 0.86+ |
single | QUANTITY | 0.85+ |
Stewart Bond | PERSON | 0.84+ |
day one | QUANTITY | 0.83+ |
I DC | ORGANIZATION | 0.83+ |
I. D. C | PERSON | 0.82+ |
this year | DATE | 0.81+ |
over 1,000,000,000,000,000,000 | QUANTITY | 0.8+ |
years | QUANTITY | 0.79+ |
Stewart | PERSON | 0.78+ |
middle | DATE | 0.75+ |
Spider Ring | COMMERCIAL_ITEM | 0.74+ |
beginning | DATE | 0.72+ |
2018 | DATE | 0.72+ |
Radar | ORGANIZATION | 0.72+ |
double | QUANTITY | 0.68+ |
2019 | DATE | 0.68+ |
MIT | ORGANIZATION | 0.68+ |
Tiebreak | ORGANIZATION | 0.64+ |
three | QUANTITY | 0.64+ |
Tahoe | LOCATION | 0.63+ |
M. I T. | PERSON | 0.59+ |
IDC | ORGANIZATION | 0.54+ |
Cube | COMMERCIAL_ITEM | 0.54+ |
those | QUANTITY | 0.51+ |
Justus | PERSON | 0.5+ |
Antos | ORGANIZATION | 0.48+ |
CDOIQ | EVENT | 0.34+ |
Joe Caserta & Doug Laney, Caserta | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's the Cube, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Hi, everybody. We're back in Cambridge, Massachusetts at the MIT Chief Data Officer Information Quality event, hashtag MITCDOIQ. I'm Dave Vellante. He's Paul Gillin. Day one of our two-day coverage of this event. This is the Cube, the leader in live tech coverage. Joe Caserta is here, the president of Caserta, and Doug Laney, who is principal data strategist at Caserta, both Cube alums. Guys, great to see you again. Joe, where did you pick up this guy? How did that all... He came on here a couple of years ago. We had a great conversation. I read the book, loved it. So congratulations. A nice pickup. >> We're very fortunate to have him. >> Thanks. So I'm fortunate to be here. >> So okay, well, what attracted you to Caserta? >> Oh, Joe's got a tremendous reputation. His team of consultants has a great reputation. We both felt there was an opportunity to build some data strategy competency on top of that and leverage some of those Infonomics ideas that I've been working on over the years. >> Great. Well, congratulations. And so, Joe, you and I have talked many times. And the reason I like talking to you is because you know what's going on in the marketplace. You can decipher what's real and what's hype. So what do you see as the big trends in this data space, and then we'll get into it. Yeah, sure. Um, trends: >> the chief data officer has been evolving over the last couple of years. You know, when we started doing this several years ago, there was just a handful of people, maybe 30 or 40 people. Now there are 450 people here today, and it's been evolving. People are still trying to find their feet: exactly what the chief data officer should be doing, and where they are in the hierarchy. Should they report to the CEO, the CIO, or the other CDO, which is a digital officer?
So I think, you know, hierarchically they're still figuring it out, politically they're figuring it out, but technically also they're still trying to figure it out. You know, what's been happening over the past three years is the evolution of data. Going from traditional data warehousing and business intelligence to get insight out of data just isn't working anymore. So evolving that, moving it forward to more modern data engineering, we've been doing that for the past couple of years with, quote unquote, big data, and that's not working anymore either, right? Because it's been evolving so fast. So now we're on, like, maybe data 3.0. And now we're talking about just pure automate everything. We have to automate everything. And we have to change our mindset from having an output of a data solution to an outcome of a data solution. And that's why I hired Doug, because we have to figure out not only how to get this data and look at it and analyze it, but really how to monetize it, right? It's becoming a revenue stream for your business if you're doing it right, and Doug is the leader in the industry on how to figure that out. >> The key premise of your book was you've got to start valuing data, and it's fundamental. You put forth a number of approaches and techniques and examples of companies doing that. Since you published Infonomics, Microsoft, Apple, Amazon, Google and Facebook are the top five market-value companies. They've surpassed all the financial services guys, all the ExxonMobils, and any manufacturer or automobile maker. And what are they? Data companies, right? Absolutely. But intrinsically we know there's value there. Are we any closer to the prescription that you put forth? >> Yeah, it's really no surprise, and actually we found that data companies have a market-to-book value that's nearly 33 times the market average, so Apple and others are much higher than that.
But on average, if you look at the data product companies, they're valued much higher than other companies, probably because data can be reused in multiple ways. One of the core tenets of Infonomics is that data is a non-depletable, regenerative, reusable asset, and that companies that get that and architect their businesses based on those economics of information can really perform well, and not just data companies, but >> any company. That was a key takeaway of the book. Data doesn't conform to the laws of scarcity. Everybody says data is the new oil. It's like, no, it's not. It's more valuable. So what are some examples, in writing your book and in the customers that you work with? Where do you see companies outside of these big data-driven firms breaking new ground in uses of data? I >> think the biggest opportunity is really not with the big giant companies. Most of our most valuable clients are small companies with large volumes of data. And the reason why they can remain small companies with large volumes of data... The thing that holds back the big giant enterprises is they have so much technical debt. It's very hard. It's like trying to, you know, raise the Titanic, right? You can't really. It's not agile enough. You need something that's small and agile in order to pivot, because it is changing so fast. Every time there's a solution created, it's obsolete, and we have to create the new solution. And when you have big old processes, big old technologies, big old mindsets and big old cultures, it's very hard to be agile. >> So is there no hope? I mean, the reason I asked the question was, what hope can you give some of these smokestack companies that they can become data centric? Yeah. What you >> see is that there was a move to build big, monolithic data warehouses years ago, and even data lakes.
And what we find is that, through the wealth of examples of companies that have benefited in significant ways from data and analytics, most of those solutions are very vocational. They're very functionally specific. They're not enterprise-class, yada yada, kind of projects. They're focused on a particular business problem, or on monetizing or leveraging data in a very specific way, and they're generating millions of dollars of value. But again, they tend to be very, very functionally specific. >> The other trend that we're seeing is that the technology and the end result of what you're doing with your data is one thing. But really, in order to make that shift, if you're a big enterprise, it's culture: to really change all of the people within the organization, to migrate from being a conventional-wisdom-run company to being a data-and-analytics-driven company. And that takes a lot of change management, a lot of what we call data therapy. We actually launched a new practice within the organization that Doug and I are collaborating on to really mature, because that is the next wave. We figured out the data part. We figured out the technology part. But now it's the people part. The people part is really why we're not further ahead. Even though we're way ahead of where we were a couple of years ago, we should be even further. Culturally, it's very, very challenging, and we need to address that head on. >> And is that a skills issue, that they're sort of locked into their existing skill sets and processes? Or is it fear of the unknown, what are we doing, you know? What about FOMO? Oh, yeah. Well, I mean, there are people >> jumping in to do this, right? So there is that part, and it's an exciting part of it. But there's also just fear, you know, fear of the unknown, and, you know, that's part of what we're trying to do.
And why we're trying to push Doug's book is not for sales, but really just to share the knowledge and remove the mystery and let people see what they can actually do with this data. >> Yeah, it's more >> than just data literacy. So there's a lot of talk in the industry about data literacy programs, and educating business people on the data and educating data people on the business. And that's obviously important. But what Joe is talking about is something bigger than that. It's really cultural, and it's something that is a change to the company's DNA. >> So where do you attack that problem? Does it have to go from the top down? Do you go in at the middle? It has to >> be from the top down. It has to be. It has to be because "my boss said to do it," all right? >> Well, otherwise, well, they might do it, but the organization... Because if you do it >> as a grassroots movement, only the folks who are excited, right, the FOMO people, they're the ones who are going to be excited. But they're going to evolve and adopt anyway, right? It's the rest of the organization, and that needs to be a top-down approach. >> It was interesting hearing this morning's keynote speaker. He sort of threw top-down under the bus, but I had the same reaction: you can't do it without that executive buy-in. And of course, we defined, I guess, in the session what that was. Amazon has an interesting concept for any initiative: every initiative that's funded has to have what they call a threaded leader. Another was some kind of... And if they don't have a threaded leader, there's like an incentive system to dime on the initiative, kill it. It kind of forces top-down. Yeah, you know. So >> when we interview our clients, we have a litmus test. It's kind of a readiness test. Do you have the executive leadership to actually make this project successful?
And in a lot of cases, they don't. And, you know, we'll have to say, call us when you're ready. Another part of the litmus test is, is this IT driven? If it's IT driven, it's going to be very tough to get it embraced by the rest of the business. So we really need to have that executive leadership from the business to say, this is something that we need >> to do to survive. Yeah, and, you know, without the top-down support you could play small ball. But if you're playing the Yankees, you're not going to win. >> One of the reasons why it's very challenging when it's IT driven is because the people part, right, is a different budget from the IT budget. And when we start talking about data therapy, right, and human resources and training and education, just culture and data literacy, which is not necessarily technical, that becomes a challenge internally: figuring out how to pay for it and how to get it done with the corporate politics. >> So for the CDO crowd, there are definitely parts of your book that they should be adopting, because to me, their main job is: okay, how does data support the monetization of my organization? Raising revenue, cutting costs, improving productivity, saving lives. You call it value. And so that seems to be the starting point. At the same time, this conference grew out of the ashes of back-room information quality, then the big data hype exploded, and we've kind of gone full circle. So I wonder, I mean, is the CDO crowd still focused on that monetization? Certainly I think we all agree they should be, but they're getting sucked back into a governance role. Can they do both, I guess, is >> my question. Well, governance has been a big issue the past few years, with all of the new compliance regulations and the focus on ensuring compliance with them. But there's often just a pendulum swing back, and I think there's a swing back to adding business value.
And so we're seeing a lot of opportunities to help companies monetize their data broadly, in a variety of ways, as you mentioned, not just in one way. And again, those need to be driven from the top. We have a process that we go through to generate ideas, and that's wonderful. Generating ideas is fairly straightforward. But then we run them through kind of a feasibility gauntlet, starting with: do you have the executive support for that? Is it technologically feasible, managerially feasible, ethically feasible, and so forth? So we kind of run them through that gauntlet next. >> One of my concerns is the chief data officer, the level of involvement that he or she has in these digital initiatives. Again, is digital initiative a Field of Dreams? Maybe it is. But everywhere you go, the CEO is trying to get digital right, and it seems like the chief data officer is not necessarily front and center in those, certainly in AI projects, which are skunk works. It's the chief digital officer that's driving it. So how do you see those roles playing off? >> In the last panel that I just spoke on, a very similar question was asked. And again, we're trying to figure out the hierarchy of where the CDO should live in an organization. I find that the biggest place it fails, typically, is if it rolls up to a CIO, right? If you think data is a technical issue, you're wrong, right? Data is a business issue. I also think that for any company to survive today, they have to have a digital presence. And digital presence is so tightly coupled to data that I find the best success is when the chief data officer reports directly to the chief digital officer. The chief digital officer has a vision for the user experience, for the customer, to figure out how we get that customer engaged, and that is directly dependent on insight, right, on analytics.
You know, if the four of us were to open up any application on our phones, even for the same product, we would have four different experiences based on who we are, who our peers are, and what we bought in the past. That's all based on analytics. So the business application of the digital presence is tightly coupled to analytics, which is driven by the chief data officer. >> That's the first time I've heard that. I think that's the right organizational structure. The chief digital officer is going to be sort of the driver, right? The strategy. That's where the budget's going to go, and the chief data officer is going to have that supporting role that's vital, the enabler. Yeah, I think the chief data officer is a long-term play. Will we have a lot of chief data officers still, 10 years from now? I think that >> data is not a fad. I think data has just become more and more important. And will they ultimately leapfrog the chief digital officer and report to the CEO? Maybe someday, but for now, I think that's where they belong. >> You know, once companies started managing their labor and workforce as an actual asset, even though it's not a balance sheet asset for obvious reasons, in the 1960s, that gave rise to the chief human resource officer, which we still see today. And as companies start to recognize information as an asset, you need an executive leader to oversee and be responsible for that asset. >> Conceptually, it's always been that data is an asset and a liability. And, you know, we've always thought about it in balance sheet terms. Your book sort of put forth a formula for actually formalizing that. Do you think it's going to happen in our lifetime? To be clear, I mean what you put forth in your book in terms of organizations actually valuing data specifically on the balance sheet. So that's >> an accounting question, and one that you leave to the accounting professionals.
But there have been discussion papers published by the accounting standards bodies to discuss that issue. We're probably at least 10 years away, but I think irrespective of whether data is a balance sheet asset or not, it's imperative for organizations to behave as if it is one. >> That was your point: it's probably not going to happen, but you've got to figure it out in terms that let you understand the value, because it comes >> back to, you can't manage what you don't measure. And without measuring the value, the potential value, or the quality of your information, or what data you have, you're in a poor position to manage it like one. And if you don't manage it like an asset, then you're really not able to leverage it like one. >> Give us a little commercial for... I do want to say that I do >> think in our lifetime we will see it become an asset. There are lots of intangible assets that are on the books: intellectual property, contracts. I think data that supports both of those things is equally as important. And they will see the light. >> Why are those five companies huge market cap winners, where they've surpassed all... The valuation >> of a business, the data that they have is considered, right? So it should be part of >> the assets on the books. All right, we've got to wrap, but give us the Caserta commercial. Well, Caserta is >> a consultancy that does essentially three things. We do data advisory work, which Doug is heading up. We do data architecture and strategy, and we also do implementation of solutions, everything from data engineering to data architecture and data science. >> Well, you made a good bet on data. Thanks for coming on, you guys. Great to see you again. Thank you. That's a wrap on day one. Paul and I will be back tomorrow for day two at MIT CDOIQ. Thanks for watching. We'll see you all.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Joe | PERSON | 0.99+ |
Paul Gillen | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
David Dante | PERSON | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Apple | ORGANIZATION | 0.99+ |
ExxonMobil | ORGANIZATION | 0.99+ |
Joe Caserta | PERSON | 0.99+ |
Paul | PERSON | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
five companies | QUANTITY | 0.99+ |
Doug | PERSON | 0.99+ |
450 people | QUANTITY | 0.99+ |
Cambridge, Massachusetts | LOCATION | 0.99+ |
four | QUANTITY | 0.99+ |
Yankees | ORGANIZATION | 0.99+ |
JJ | PERSON | 0.99+ |
tomorrow | DATE | 0.99+ |
both | QUANTITY | 0.99+ |
two day | QUANTITY | 0.99+ |
Lee | PERSON | 0.99+ |
Doug Laney | PERSON | 0.99+ |
today | DATE | 0.98+ |
One | QUANTITY | 0.98+ |
Cassard | PERSON | 0.98+ |
Andi | PERSON | 0.97+ |
Cube | ORGANIZATION | 0.97+ |
The Caserta Commercial | ORGANIZATION | 0.97+ |
one | QUANTITY | 0.97+ |
day one | QUANTITY | 0.97+ |
first time | QUANTITY | 0.97+ |
day two | QUANTITY | 0.96+ |
several years ago | DATE | 0.96+ |
one thing | QUANTITY | 0.93+ |
Day one | QUANTITY | 0.93+ |
three things | QUANTITY | 0.92+ |
Phanom | LOCATION | 0.92+ |
Caserta | ORGANIZATION | 0.91+ |
this morning | DATE | 0.91+ |
nearly 33 times | QUANTITY | 0.9+ |
couple of years ago | DATE | 0.9+ |
millions of dollars | QUANTITY | 0.9+ |
last couple of years | DATE | 0.9+ |
Doug Laney, | PERSON | 0.9+ |
wave | EVENT | 0.89+ |
19 sixties | DATE | 0.87+ |
2019 | DATE | 0.86+ |
Thio push | PERSON | 0.85+ |
past couple of years | DATE | 0.84+ |
years ago | DATE | 0.84+ |
Data three dato | ORGANIZATION | 0.84+ |
one way | QUANTITY | 0.84+ |
next | EVENT | 0.83+ |
past three years | DATE | 0.81+ |
Titanic | COMMERCIAL_ITEM | 0.8+ |
30 40 people | QUANTITY | 0.8+ |
least 10 years | QUANTITY | 0.75+ |
top | QUANTITY | 0.75+ |
M I T. | EVENT | 0.75+ |
MIT CDOIQ | EVENT | 0.7+ |
Field of Dreams | ORGANIZATION | 0.7+ |
past few years | DATE | 0.7+ |
three | QUANTITY | 0.7+ |
five market | QUANTITY | 0.69+ |
CDO | ORGANIZATION | 0.68+ |
of people | QUANTITY | 0.66+ |
M I t. | EVENT | 0.65+ |
years | QUANTITY | 0.64+ |
Caserta | PERSON | 0.63+ |
Cos | ORGANIZATION | 0.56+ |
Ella | PERSON | 0.56+ |
k | ORGANIZATION | 0.53+ |
Tom Davenport, Babson College | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back >> to MIT, everybody. You're watching theCUBE, the leader in live tech coverage. My name is Dave Vellante, here with Paul Gillin, my co-host. Tom Davenport is here; he's the President's Distinguished Professor at Babson College. CUBE alum, good to see you again, Tom. Thanks for coming on. Glad to be here. So, yeah, this is, let's see, the 13th annual MIT CDOIQ. >> Yeah, sure, as of this year. Our seventh, I >> think. Really? Maybe we're offset. So you gave a talk earlier: should we be afraid of the machines, or should we embrace them? I think we should embrace them, because so far they are not capable of replacing us. I mean, you know, when we hit the singularity, which I'm not sure will ever happen, but it's certainly not going to happen anytime soon, we'll have a different answer. But right now they're good at small, narrow tasks and not so good at doing a lot of the things that we do. So I think we're fine. Although, as I said in my talk, I have some survey data suggesting that at large U.S. corporations, a substantial number of senior executives, more than half, would like to automate as many jobs as possible, they say. So that's a little scary. But fortunately for us humans, it's gonna be >> a while before they succeed. We had a case last year where McDonald's employees were agitating for an increase in the minimum wage, and the management used the threat of roboticizing the hamburger-making process, which can be done, to get them to back down. Do you think we're going to see more of that, where maybe AI is used as a threat? >> Well, I haven't heard too many other examples.
I think for those highly structured, relatively low-level tasks, it's quite possible, particularly if we do end up raising the minimum wage beyond a point where it's economical to pay humans to do the work. But I would like to think that, you know, if we gave humans the opportunity, they could do more than they're doing now in many cases. And one of the things I was saying is that companies, generally, there are some exceptions, but most companies are not starting to retrain their workers. Amazon recently announced they're going to spend $700 million to retrain their workers to do things that AI and robots can't. But that's pretty rare. Certainly that level of commitment is very rare. So I think it's time for the companies to start stepping up and saying, how can we develop a better combination of humans and machines? >> The work by, you know, Brynjolfsson and McAfee, which is a little dated now, definitely suggests that there are some things to be concerned about. Of course, ultimately their prescription was one of an optimist: education and so on and so forth. But, you know, the key point there is that machines have always replaced humans, but now it's in terms of cognitive functions. You see it everywhere. You drive to the airport, now it's electronic billboards. It's not some person putting them up. The kiosks, et cetera. But, you know, you've used >> the term, you know, paving the cow path. We don't want to protect the past from the future. All right, so, to >> your point, retraining, education, I mean, that's the opportunity here, isn't it? And the potential is enormous. Well, and, you know, let's face it, we haven't had much in the way of productivity improvements in the U.S. or any other advanced economy lately. So we need some, I guess, replacement of humans by machines. But my argument has always been that you can handle innovation better.
You can avoid the sort of race to the bottom that automation sometimes leads to if you think creatively about humans and machines working as colleagues in many cases. You remember, in the PC boom, I forget which Fed chairman it was, it might have been Greenspan, said you can see progress everywhere except in the productivity statistics. That was an MIT professor, Robert Solow. >> OK, right, and then >> he won the Nobel Prize. But then, shortly thereafter, there was a huge productivity boom. So, I mean, is there maybe a pent-up one? Well, God knows. I mean, everybody's wondering. We've been spending literally trillions on IT, and you would think that it would have led to productivity. But, you know, certain things like social media, I think, reduce productivity in the workplace. And, you know, we're all chatting and talking and Slacking and so on all over the place. Maybe that's not conducive to getting work done. It depends what you >> do with that social media. Here in our business, it's actually phenomenal to see political coverage these days, which almost entirely consists of reprinting politicians' tweets. >> Exactly. I guess it's made life easier for them, all those reporters sitting in the White House waiting for a press conference. They're not >> doing well. There aren't many reporters left. Where do you see, in your consulting work, your academic work, where do you see AI being used most effectively in organizations right now? And where do you think that's gonna be three years from now? >> Well, I mean, the general category of activity, of use case, is the sort that someone's calling boring AI. It's data integration, one thing that's being discussed a lot at this conference. It's connecting your invoices to your contracts to see, did we actually get the stuff that we contracted for? It's doing a little bit better job of identifying fraud, and doing it faster. So all of those things are quite feasible. They're just not that exciting.
What we're not seeing are curing cancer, creating fully autonomous vehicles, you know, the really aggressive moonshots that we've been trying for a while and that just haven't succeeded. What if we kind of expand AI to the umbrella of all the new cool stuff that's coming out? So, considering all these new techs, AI, blockchain, new security approaches: when do you think that machines will be able to make better diagnoses than doctors? Well, I think, you know, in a very narrow sense, in some cases they could do it now. But the thing is, first of all, take a radiologist, which is one of the doctors I think most at risk from this, because they don't typically meet with patients and they spend a lot of time looking at images. It turns out that the lab experiments that say, you know, these AIs are better than human radiologists tend to be very narrow, and what one lab does is different from another lab. So it's gonna take a very long time to make it into, you know, production deployment in the physician's office. We'll probably have to have some regulatory approval of it. You know, the lab research is great. It's just that getting it into day-to-day reality is the problem. Okay, so staying in this context of digital as a sort of umbrella topic, do you think large retail stores will largely disappear? >> Uh, >> some sectors more than others, for things that you don't need to touch and feel before you buy them. Certainly, obviously, it's happening more and more in e-commerce. What people are saying will disappear next is the human at the point of sale. And we've been talking about that for a while in grocery. It's not been achieved so much yet in the U.S. Amazon Go is a really interesting experiment where every time I go in there, I try to shoplift. It took a while, and now they have 12 stores.
It's not huge yet, but I think if you're in one of those jobs where a substantial chunk of it is automatable, then you really want to start looking around, thinking, what else can I do to add value to these machines? Do you think traditional banks will lose control of the payment system? No, I don't, because the fintechs that you see thus far keep getting bought by traditional banks. So my guess is that people will want that certainty. And, you know, the funny thing about blockchain: we say in principle it's more secure because it's spread across a lot of different ledgers, but people keep hacking into Bitcoin, so it makes you wonder. I think blockchain is gonna take longer than we thought as well. So, you know, in my latest book, which is called The AI Advantage, I start out by talking about Amara's Law. This guy Roy Amara was a futurist; it's not nearly as well known as Moore's Law. But it says, you know, for every new technology, we tend to overestimate its impact in the short run and underestimate it in the long run. And so I think AI will end up doing great things. We may have sort of tuned it out by the time it actually happens and we finally have autonomous vehicles. We've been talking about it for 50 years. Last one. So one of the Democratic candidates, of the 75 Democratic candidates last night, mentioned a chief manufacturing officer. Well, do you see that automation will actually swing the pendulum and bring manufacturing back to the U.S.? I think it could, if we were really aggressive about using digital technologies in manufacturing, doing 3D manufacturing, doing digital twins of every device, and so on. But we are not being as aggressive as we ought to be. And manufacturing companies have been kind of slow and, I think, somewhat delinquent in embracing these things. So they're gonna, I think, lose the ability to compete. We have to really go at it in a big way to >> bring it all back. We've just got an election coming up.
There is a lot of concern following the last election about the potential of AI, chatbots, Twitter bots, deepfakes, technologies that obscure or alter reality. Are you worried about what's coming in the next year? And that that >> could never happen, Paul. We could never see anything. Deepfakes I'm quite worried about. I know there are some organizations working on how we would certify, you know, an image as being real, but we're not there yet. My guess is, certainly by the time the election happens, we're going to have all sorts of political candidates saying things that they never really said, through deepfakes and image manipulation. Scary. What do you think about the call to break up big tech? What's your position on that? I think that's a self-inflicted wound. You know, we just saw, for example, that the automobile manufacturers decided to get together. Even though the federal government isn't asking for better mileage, they said, we'll do it, we'll work with you, to a union of states that are more advanced. If big tech had said, we're gonna work together to develop standards of ethical behavior and privacy and data and so on, they could have prevented some of this. Unless they change their attitude really quickly, and I've seen some of it, Salesforce people are talking about the need for data protection standards, but I must say, unless they change quickly, I think they're going to get legislation imposed and maybe get broken up. It's gonna take a while. It depends on the next administration, but they're not being smart >> about it. Look, I'm sure you've seen a lot of demos of advanced AI-type technology over the last year. What has really impressed you? >> You know, I think the biggest advances have clearly been in image recognition. The big problem with that is you need a lot of labeled data.
One of the reasons why Google was able to identify cat photos on the Internet is that we had a lot of labeled cat images in the ImageNet open source database. But the ability to start generating images, to do synthetic labeled data, I think, could really make a big difference in how rapidly image recognition works. >> What do you mean, synthetic? I'm sorry. >> Where we would actually create them. We wouldn't have to have somebody go around taking pictures of cats. We create a bunch of different cat photos, label them as cat photos, have variations in them. You know, unless we have a lot of variation in images, it doesn't work. That's one of the reasons why we can't use autonomous vehicles yet, because images differ in the rain and the snow. And so we're gonna have to have synthetic snow, synthetic rain, to identify those images, so, you know, the GPU chip still realizes that's a pedestrian walking across there, even though it's kind of fuzzed up right now. Just a little bit of variation in the image can throw off the recognition altogether. Tom, hey, thanks so much for coming on theCUBE. Great to see you. We gotta go play catch. You're welcome. Keep it right there, everybody. We'll be back from MIT CDOIQ in Cambridge, Massachusetts. Dave Vellante, Paul Gillin. You're watching theCUBE.
SUMMARY :
Brought to you by My co host, Tom Davenport, is here is the president's distinguished professor at Babson College. I I mean, you know, when we hit the singularity, Are you think we're going to Seymour of four that were maybe a eyes used as you know, if we gave humans the opportunity, they could do Maur than they're doing now But you know, the key point there is the machines the term, you know, paid the cow path. Well, and, you know, in the workplace and you know, we're all chatting and talking It's actually it's phenomenal to see reporters sitting in the White House waiting for a press conference. And where do you think that's gonna be three years from now? I think you know, in a very narrow sense in some cases, No, I don't because the Finn techs that you see thus far keep There are a lot of concern following the last election about the potential of a I chatbots you know, an image as being really But we're not there yet. I'm sure you see a lot of demos of advanced A But the ability to start generating images to do synthetic as cat photos have variations in them, you know, unless we have
ENTITIES
Entity | Category | Confidence |
---|---|---|
McDonald | ORGANIZATION | 0.99+ |
Dave Volonte | PERSON | 0.99+ |
Paul Gillis | PERSON | 0.99+ |
Roy Amara | PERSON | 0.99+ |
Paul Guillen | PERSON | 0.99+ |
Tom Davenport | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Tom | PERSON | 0.99+ |
Seymour | PERSON | 0.99+ |
700,000,000 | QUANTITY | 0.99+ |
12 stores | QUANTITY | 0.99+ |
Robert Solow | PERSON | 0.99+ |
Paul | PERSON | 0.99+ |
last year | DATE | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
Cambridge, Massachusetts | LOCATION | 0.99+ |
one | QUANTITY | 0.99+ |
50 years | QUANTITY | 0.99+ |
U. S. | LOCATION | 0.99+ |
Babson College | ORGANIZATION | 0.99+ |
Huebel | PERSON | 0.99+ |
next year | DATE | 0.99+ |
Fed | ORGANIZATION | 0.98+ |
four | QUANTITY | 0.98+ |
Democratic | ORGANIZATION | 0.98+ |
more than half | QUANTITY | 0.98+ |
M I. T. | PERSON | 0.98+ |
seventh | QUANTITY | 0.98+ |
2019 | DATE | 0.98+ |
Nobel Prize | TITLE | 0.97+ |
McAfee | ORGANIZATION | 0.97+ |
Greenspan | PERSON | 0.97+ |
One | QUANTITY | 0.96+ |
U. S. | LOCATION | 0.96+ |
one lab | QUANTITY | 0.96+ |
Ryan | PERSON | 0.95+ |
Catch | TITLE | 0.95+ |
this year | DATE | 0.95+ |
last night | DATE | 0.94+ |
Big Tak | ORGANIZATION | 0.87+ |
Professor | PERSON | 0.84+ |
Aye Aye Advantage | TITLE | 0.84+ |
75 | QUANTITY | 0.84+ |
Amazon Go | ORGANIZATION | 0.81+ |
U. | ORGANIZATION | 0.78+ |
Maur | PERSON | 0.77+ |
trillions | QUANTITY | 0.76+ |
Nelson | ORGANIZATION | 0.73+ |
Tamara | PERSON | 0.71+ |
one of the reasons | QUANTITY | 0.71+ |
White House | ORGANIZATION | 0.69+ |
Big check | ORGANIZATION | 0.69+ |
Law | TITLE | 0.67+ |
three years | QUANTITY | 0.66+ |
M I t. Cdo | EVENT | 0.66+ |
M | PERSON | 0.65+ |
Moore | PERSON | 0.59+ |
13th annual | QUANTITY | 0.58+ |
first | QUANTITY | 0.57+ |
Last | QUANTITY | 0.54+ |
Aye | PERSON | 0.52+ |
MIT CDOIQ | ORGANIZATION | 0.51+ |
M. | PERSON | 0.48+ |
Finn | ORGANIZATION | 0.45+ |
Cube | TITLE | 0.41+ |
Dr. Stuart Madnick, MIT | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to MIT in Cambridge, Massachusetts, everybody. You're watching theCUBE, the leader in live tech coverage. This is MIT CDOIQ, the chief data officer and information quality conference. I'm Dave Vellante with my co-host, Paul Gillin. Professor Dr. Stuart Madnick is here. Longtime CUBE alum, a longtime professor at MIT, soon to be retired, but we're really grateful that you're taking your time to come on theCUBE. It's great to see you again. >> It's great to see you again. It's been a long time that we've worked together, and I really appreciate the opportunity to share our experience here at MIT with your audience. Well, it's really been fun >> to watch this conference evolve. We're full, and it's really amazing. We have to move to a new venue >> next year, I >> understand. And data, we talk about the data explosion all the time. But one of the areas that you're focused on, and that you're gonna talk about today, is ethics and privacy, and data causes so many concerns in those two areas. So give us the highlights of what you're gonna discuss with the audience today. We'll get into >> one of the things that makes it so challenging: data has so many implications to it. And that's why the issue of ethics is so hard to get people to reach agreement on. We were talking to people regarding medicine, and the idea of big data and AI: to be able to really identify causes, you need mass amounts of data. That means more data has to be made available, as long as it's someone else's data, not mine. Well, not in my backyard, really. So you have this issue where, on the one hand, people are concerned about sharing the data. On the other hand, there are so many valuable things we'd gain by sharing data, and getting people to reach agreement is a challenge.
Well, one of the things >> I wanted to explore with you is how things have changed. Back in the day, you were very familiar, Paul, you as well, with the Microsoft, Department of Justice, FTC issues regarding Microsoft. And it wasn't so much around data; it was really around browsers and bundling things. But today you see Facebook and Google and Amazon coming under fire, and it's largely data related. Liz Warren last night, again: break up big tech. Your thoughts on the similarities and differences between the monopolies of yesterday and the data monopolies of today? Should they be broken up? What are your thoughts? So >> let me broaden the issue a little bit more, to millennials, and I don't know the demographics of the audience, but I often refer to the characteristics of the millennials in general. I ask my students this question: how many of you have a Facebook account? In almost every class, nearly all of them do. Do you realize you've given away a lot of information about yourself? It doesn't really occur to them that that may be an issue. I was told by someone that in some countries where Facebook is very popular, that's how they coordinate the kidnappings of teenagers from rich families. They track them. They know they're going to go to this basketball game or that soccer match. They know exactly where they're going after it. That's the perfect spot to kidnap them. So I don't know whether students think about the fact that when they're putting things on Facebook, they're putting so much of their life at risk. On the other hand, it makes their life richer, more enjoyable. And so that's why these things are so challenging. Now, getting back to the issue of the breakup of the big tech companies: one of the big challenges there is that in order to do the great things that big data has been doing, and the things that AI promises to do, you need lots of data.
Having organizations that can gather it all together in a relatively systematic and consistent manner is so valuable. Breaking up the tech companies, and there are some reasons why people want to do that, also interferes with that benefit. And that's why I think it's gonna be looked at really carefully, to see not only what gains there may be from breaking them up but also what losses or disadvantages we're creating >> for ourselves. So an example might be, perhaps it makes the United States less competitive vis-a-vis China in the area of machine intelligence, as one example. The flip side of that is, you know, Facebook has every incentive to appropriate our data to sell ads. So it's not an easy, you know, equation. >> Well, even ads are a funny situation. For some people, having a product called to your attention that's something you actually really want, but you never knew about before, could be viewed as a feature, right? So, you know, in some cases the ads could be viewed as a feature by some people and, of course, a bit of an intrusion by other people. Well, sometimes when we use search, Google, right, we're looking >> for the ad on the side. No longer. It's all ads. You know >> it. I wonder if you see public sentiment changing in this respect. There are a lot of concerns, certainly at the legislative level now, about misuse of data. But Facebook usership is not going down. Instagram membership is not going down. The indication is that ordinary citizens don't really care. >> I know that. I don't have all the data, maybe you may have seen it, but just anecdotally, in talking to people in the work we're doing, I agree with you. Maybe it's a bit dramatic, but at a conference once, someone made a comment that there has not been the digital Pearl Harbor yet. There's not been some event that was just so onerous, so awful, that people remember the day it happened, kind of thing.
And so these things happen, and there's maybe a little bit of press coverage, and you're back on your Facebook or your Instagram account the next day. Nothing is really dramatic. Individuals may change now and then, but I don't see massive changes. But >> you had the Equifax hack two years ago, 145 million records. Capital One just this week, 100 million records. I mean, that seems pretty Pearl Harbor-ish to me. >> Well, it's funny, we were talking about that earlier today regarding different parts of the world. I think in Europe, in general, they really seem to care about privacy. In the United States, they kind of care about privacy. In China, they know they have no privacy. But even in the U.S., where they care about privacy, exactly how much they care about it is really an issue. And in general, it's not enough to move the needle. If it does, it moves it a little bit. About the time when they showed that smart TVs could be broken into, smart TV sales did not budge an inch. Not many people even remember that big scandal a year ago. >> Well, now, to your point about Equifax, I mean, just this week, I think, Equifax came out with a website where you could check whether or not your credentials were... >> It's a new product. Whether we were compromised. And sure enough, mine had been, >> as had mine, I said. My wife says hers too. So you had a choice, you know, free monitoring or $125. So then we went, okay, now what? You know, life goes >> on. It doesn't seem like anything really changes. And we were talking earlier about your 1972 book about cybersecurity, and how many of the principles you outlined in that book are still valid today. Why are we not making more progress against cybercriminals? >> Well, two things. One thing is you've got to realize, as I said before, the caveman had no privacy problems and no break-in problems. But I'm not sure any of us want to go back to the caveman era, because you've got to realize that for all these bad things...
There are so many good things that are happening, things you can now do with a smartphone that you couldn't even visualize doing a decade or two ago. So there's so much excitement, so much momentum, autonomous cars and so on and so on, that these minor bumps in the road are easy to ignore in the enthusiasm and excitement. >> Well, and now, as we head into the 2020 election: it was fake news in 2016. Now we've got deepfakes, the ability to really use video in new ways. Do you see a way out of that problem? A lot of people are looking at blockchain. You wrote an article recently on blockchain: you think it's unhackable? Well, think again. >> What are you seeing? I think one of the things we always talk about when we talk about improving privacy and security in organizations, the first thing is awareness. Most people are, really for a small moment of time, aware that there's an issue, and it quickly passes from the mind. The analogy I use is regarding industrial safety. You go into almost any factory, and you'll see a sign over the door every day that says 520 days since the last industrial accident, and then a subline: please do not be the one to reset it this year. And I often say, when's the last time you went to a data center and saw a sign that says 50 milliseconds since the last cyber data breach? And so it needs to be something that is really front of mind for people. And we talk about how to run awareness activities in companies and households. And that's one of our major movements here, trying to be more aware, because if you're not aware that you're putting things at risk, you're not gonna do anything about it. >> Last year at SiliconANGLE we contacted 22 leading security experts and asked them one simple question: are we winning or losing the war against cybercriminals? Unanimously, they said, we're losing. What is your opinion on that question? >> I have a great quote I like to use. The good news is the good guys are getting better: better firewalls, better cryptographic codes.
But the bad guys are getting badder faster, and there are a lot of reasons for that. I won't go into all of them, but we came out with an article talking about the dark web, and the reason why it's fascinating is, if you go to most companies, if they've suffered a data breach or a cyber attack, they'll be very reluctant to say much about it unless they're really compelled to do so. On the dark web, they love to brag for reputation: I'm the one who broke into Capital One. And so there's much more information sharing; they're much more organized, much more disciplined. I mean, the criminal ecosystem is so much superior to the chaotic mess we have here on the good guys' side of the table. >> Do you see any hope for that? There are services, IBM has one, and there are others, that sort of anonymize security data to enable organizations to share sensitive information without risk to their company. Do you see any hope on the collaboration front? >> As I said before, the good guys are getting better. The trouble is, at first I thought the issue was that there wasn't enough sharing going on. It turns out we identified over 120 sharing organizations. That's the good news. And the bad news is: 120. So IBM is one, and there are another 119 more to go. So it's not very well coordinated sharing that's going on. That's just one example of the challenges. Do I see any hope in the future? Well, in the more distant future, because the challenge we have is that there'll be a cyber attack next week of some form or shape that we've never seen before, and that we're therefore probably not well prepared for. At some point, I'll no longer be able to say that, but I think the cyber attackers and criminals and so on are so creative, they've got another decade or more to go before they run out of >> steam. We've gone from hacktivists to organized crime, now nation states, and you start thinking about the future of war. I was talking to Robert Gates about this, the former defense secretary, and my question was, why don't we have the best cyber?
Can't we go on the offense? He goes, yeah, but we also have the most to lose. Our critical infrastructure, and the value of that to our society, is much greater than that of some of our adversaries. So we have to be very careful. It's kind of mind boggling to think about. Autonomous vehicles are another one. I know that you have some visibility on that. And you were saying that the technical challenges of actually achieving quality autonomous vehicles are so daunting that security is getting pushed to the back burner. >> And the irony is, I had a conversation when I was a visiting professor at the University of Nice about 12, 14 years ago. And that's before autonomous vehicles; what they were doing was big automotive telematics. And I realized at that time that security wasn't really their top priority. I happened to visit an organization doing really autonomous vehicles now, 14 years later, and the conversation is almost identical. Now the problems they're trying to solve are harder problems than 14 years ago, much more challenging problems. And as a result, those problems dominate their mindset, and security issues are kind of, you know, we'll get around to them. If we can't get the car to ride correctly, why worry about security? >> Well, what about the ethics of autonomous vehicles? We were talking about how you're programming, you know, if you're gonna hit a baby or a woman or kill your passengers and yourself, what do you tell the machine to do? It seems like an unsolvable problem. >> Well, I'm an engineer by training, and possibly many people in the audience are too. I'm the kind of person who likes nice, clear, clean answers. Two plus two is four, not 3.9 or 4.1. That's the school up the street; they deal with that. The trouble with ethics issues is they don't tend to have a nice, clean answer. Almost every study we've done that has these kinds of issues in it, where we have people vote, almost always has votes spread across the board, because, you know, any one of these is a bad decision.
So which bad decision is the least bad? >> Like, what's an example that you use? >> The example I use in my class, and we've been using it for well over a year now in the class I teach on ethics, is this: you are the designer of an autonomous vehicle, so you must program it to do everything, and the particular case you have is that you're in the vehicle. It's driving around a mountain in the Swiss Alps. You go around a corner and the vehicle, using all of its sensors, realizes that straight ahead in the right-hand lane is a woman pushing a baby carriage, and on the left, just entering the roadway, are three gentlemen. Both sides of the road have concrete barriers. So you can stay on your path and hit the woman and the baby carriage; veer to the left and hit the three men; or take a sharp right or sharp left, hit the concrete wall, and kill yourself. And the trouble is, every one of those is unappealing. Imagine the headline: kills woman and baby. That's not a very good thing. There actually is a theory of ethics called utility theory that says it's better to save three people than one. So definitely you don't want to kill three men; that's the worst. And then the idea of hitting the concrete wall may feel magnanimous: I'm just killing myself. But as the designer of the car, shouldn't your number one duty be to protect the owner of the car? And so what people basically do is close their eyes and flip a coin, because they don't want any one of those on their hands. >> Not an algorithmic response, is it? >> I want to come back, before we close here, to the subject of this conference. You've been involved with this conference since the very beginning. How have you seen the conversation change since that time? >> I think it's changing in two ways. First, as you know, this is a record-breaking group of people we're expecting here.
Close to 500, I think, have registered, so it has clearly grown over the years. But there's also the extent to which, whether it was called big data or is called AI now, whatever, it is something that was kind of not quite on the radar when we started. I think it's about 15 years ago that we first started the conference series. So clearly it has become something that is not just something we talk about in the academic world but is becoming mainstay business for corporations more and more. And I think it's just gonna keep increasing. I think so much of our society, so much of business, is so dependent on the data, in any way, shape or form that we use it and have >> it. Well, it's come full circle. As Paul and I were talking about at our open, this conference kind of emerged from the ashes of back-office information quality, and, as you say, there was big data and now AI. Guess what? It's all coming back to information. >> Lots of data that's no good, or that you don't understand what to do with, is not very healthy. >> Well, Dr. Madnick, thank you so much. It's a >> relief for all these years. I really want to thank you. Thank you, guys, for joining us and helping to spread the word. >> Thank you. Pleasure. All right, keep it right there, everybody. Paul and >> I will be back at MIT CDO right after this short break. You're watching the Cube.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Ian Lane | PERSON | 0.99+ |
Stuart Madnick | PERSON | 0.99+ |
Liz Warren | PERSON | 0.99+ |
Paul Galen | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Europe | LOCATION | 0.99+ |
China | LOCATION | 0.99+ |
$125 | QUANTITY | 0.99+ |
Paul | PERSON | 0.99+ |
Equifax | ORGANIZATION | 0.99+ |
2016 | DATE | 0.99+ |
Steve | PERSON | 0.99+ |
Robert Gates | PERSON | 0.99+ |
Silicon Angle | ORGANIZATION | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
Elsa | PERSON | 0.99+ |
four | QUANTITY | 0.99+ |
520 days | QUANTITY | 0.99+ |
Stewart | PERSON | 0.99+ |
Last year | DATE | 0.99+ |
next year | DATE | 0.99+ |
Cambridge, Massachusetts | LOCATION | 0.99+ |
Two | QUANTITY | 0.99+ |
Kim | PERSON | 0.99+ |
2020 | DATE | 0.99+ |
50 milliseconds | QUANTITY | 0.99+ |
Swiss Alps | LOCATION | 0.99+ |
this week | DATE | 0.99+ |
yesterday | DATE | 0.99+ |
three men | QUANTITY | 0.99+ |
14 years later | DATE | 0.99+ |
two years ago | DATE | 0.99+ |
a year ago | DATE | 0.99+ |
three people | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
two | QUANTITY | 0.99+ |
two things | QUANTITY | 0.99+ |
one simple question | QUANTITY | 0.99+ |
last night | DATE | 0.99+ |
one example | QUANTITY | 0.99+ |
two areas | QUANTITY | 0.98+ |
Dio | PERSON | 0.98+ |
United States | LOCATION | 0.98+ |
120 | QUANTITY | 0.98+ |
next week | DATE | 0.98+ |
first | QUANTITY | 0.98+ |
this year | DATE | 0.98+ |
22 leading security experts | QUANTITY | 0.98+ |
three gentlemen | QUANTITY | 0.98+ |
One | QUANTITY | 0.98+ |
1972 | DATE | 0.98+ |
FTC | ORGANIZATION | 0.98+ |
one | QUANTITY | 0.97+ |
100,000,000 records | QUANTITY | 0.97+ |
Magic | PERSON | 0.97+ |
145,000,000 records | QUANTITY | 0.97+ |
Pearl Harbor | EVENT | 0.97+ |
40 years ago | DATE | 0.97+ |
Maryland | LOCATION | 0.97+ |
University of Niece | ORGANIZATION | 0.97+ |
Department of Justice | ORGANIZATION | 0.96+ |
One thing | QUANTITY | 0.95+ |
over 120 sharing organizations | QUANTITY | 0.95+ |
next day | DATE | 0.95+ |
12 14 years ago | DATE | 0.94+ |
15 years ago | DATE | 0.93+ |
an inch | QUANTITY | 0.93+ |
first thing | QUANTITY | 0.93+ |
one example | QUANTITY | 0.92+ |