Kirk Haslbeck, Collibra | Data Citizens '22

(bright upbeat music) >> Welcome to theCUBE's coverage of Data Citizens 2022, Collibra's customer event. My name is Dave Vellante. With us is Kirk Haslbeck, who's the Vice President of Data Quality at Collibra. Kirk, good to see you. Welcome. >> Thanks for having me, Dave. Excited to be here. >> You bet. Okay, we're going to discuss data quality, observability. It's a hot trend right now. You founded a data quality company, OwlDQ, and it was acquired by Collibra last year. Congratulations! And now you lead data quality at Collibra. So we're hearing a lot about data quality right now. Why is it such a priority? Take us through your thoughts on that. >> Yeah, absolutely. It's definitely exciting times for data quality, which, you're right, has been around for a long time. So why now, and why is it so much more exciting than it used to be? I think it's a bit stale, but we all know that companies use more data than ever before, and the variety has changed and the volume has grown. And while I think that remains true, there are a couple of other hidden factors at play that explain why this is becoming so important now and why everyone's so interested. And I guess you could kind of break this down simply and think about if, Dave, you and I were going to build, you know, a new healthcare application and monitor the heartbeat of individuals. Imagine if we get that wrong, what the ramifications could be? What those incidents would look like? Or maybe better yet, we try to build a new trading algorithm with a crossover strategy where the 50 day crosses the 10 day average. And imagine if the data underlying the inputs to that is incorrect. We'll probably have major financial ramifications in that sense. So, it kind of starts there, where everybody's realizing that we're all data companies and if we are using bad data, we're likely making incorrect business decisions. But I think there's kind of two other things at play. 
I bought a car not too long ago and my dad called and said, "How many cylinders does it have?" And I realized in that moment, I might have failed him 'cause I didn't know. And I used to ask those types of questions about anti-lock brakes and cylinders and if it's manual or automatic, and I realized I now just buy a car that I hope works. And it's so complicated with all the computer chips. I really don't know that much about it. And that's what's happening with data. We're just loading so much of it. And it's so complex that the way companies consume it in the IT function is that they bring in a lot of data and then they syndicate it out to the business. And it turns out that the individuals loading and consuming all of this data for the company actually may not know that much about the data itself, and that's not even their job anymore. So, we'll talk more about that in a minute, but that's really what's setting the foreground for this observability play and why everybody's so interested. It's because we're becoming less close to the intricacies of the data and we just expect it to always be there and be correct. >> You know, the other thing too about data quality, and for years we did the MIT CDOIQ event, we didn't do it last year, COVID messed everything up. But the observation I would make there, love your thoughts, is that data quality, it used to be information quality, used to be this back office function, and then it became sort of front office with financial services and government and healthcare, these highly regulated industries. And then the whole chief data officer thing happened and people were realizing, well, they sort of flipped the bit from sort of data as a risk to data as an asset. And now, as we say, we're going to talk about observability. And so it's really become front and center, just the whole quality issue, because data's fundamental, hasn't it? >> Yeah, absolutely. 
I mean, let's imagine we pull up our phones right now and I go to my favorite stock ticker app and I check out the NASDAQ market cap. I really have no idea if that's the correct number. I know it's a number, it looks large, it's in a numeric field. And that's kind of what's going on. There's so many numbers and they're coming from all of these different sources and data providers and they're getting consumed and passed along. But there isn't really a way to tactically put controls on every number and metric across every field we plan to monitor. But with the scale that we've achieved in early days, even before Collibra, what's been so exciting is we have these types of observation techniques, these data monitors that can actually track past performance of every field at scale. And why that's so interesting, and why I think the CDO is listening so intently nowadays to this topic, is that maybe we could surface all of these problems with the right solution of data observability and with the right scale, and then just be alerted on breaking trends. So we're sort of shifting away from this world of "must write a condition," and then when that condition breaks, that was always known as a break record. But what about breaking trends and root cause analysis? And is it possible to do that with less human intervention? And so I think most people are seeing now that it's going to have to be a software tool and a computer system. It's not ever going to be based on one or two domain experts anymore. >> So, how does data observability relate to data quality? Are they sort of two sides of the same coin? Are they cousins? What's your perspective on that? >> Yeah, it's super interesting. It's an emerging market. So the language is changing, a lot of the topic and area is changing. The way that I like to say it or break it down, because the lingo is constantly a moving target in this space, is really breaking records versus breaking trends. 
And I could write a condition: when this thing happens it's wrong, and when it doesn't, it's correct. Or I could look for a trend, and I'll give you a good example. Everybody's talking about fresh data and stale data, and why would that matter? Well, if your data never arrived, or only part of it arrived, or it didn't arrive on time, it's likely stale, and there will not be a condition that you could write that would show you all the good and the bads. That was kind of your traditional approach of data quality break records. But your modern day approach is: you lost a significant portion of your data, or it did not arrive on time to make that decision accurately on time. And that's a hidden concern. Some people call this freshness, we call it stale data, but it all points to the same idea that the thing you're observing may not be a data quality condition anymore. It may be a breakdown in the data pipeline. And with thousands of data pipelines in play for every company out there, there's more than a couple of these happening every day. >> So what's the Collibra angle on all this stuff? You made the acquisition, you've got data quality and observability coming together, you guys have a lot of expertise in this area, but you hear provenance of data, you just talked about stale data, the whole trend toward real time. How is Collibra approaching the problem and what's unique about your approach? >> Well, I think where we're fortunate is with our background. Myself and team, we sort of lived this problem for a long time in the Wall Street days about a decade ago. And we saw it from many different angles. And what we came up with, before it was called data observability or reliability, was basically the underpinnings of that. So we're a little bit ahead of the curve there. When most people evaluate our solution, it's more advanced than some of the observation techniques that currently exist. 
But we've also always covered data quality, and we believe that people want to know more, they need more insights, and they want to see break records and breaking trends together so they can correlate the root cause. And we hear that all the time: "I have so many things going wrong, just show me the big picture. Help me find the thing that, if I were to fix it today, would make the most impact." So we're really focused on root cause analysis, business impact, connecting it with lineage and catalog metadata. And as that grows, you can actually achieve total data governance. At this point, with the acquisition of what was a lineage company years ago and then my company OwlDQ, now Collibra Data Quality, Collibra may be the best positioned for total data governance and intelligence in the space. >> Well, you mentioned financial services a couple of times and some examples. Remember the flash crash in 2010? Nobody had any idea what that was, they just said, "Oh, it's a glitch." So they didn't understand the root cause of it. So this is a really interesting topic to me. So we know at Data Citizens '22 that you're announcing, you've got to announce new products, right? It's your yearly event. What's new? Give us a sense as to what products are coming out, but specifically around data quality and observability. >> Absolutely. There's always a next thing on the forefront. And the one right now is these hyperscalers in the cloud. So you have databases like Snowflake and BigQuery and Databricks, Delta Lake, and SQL pushdown. And ultimately what that means is a lot of people are storing and loading data even faster in a SaaS-like model. And we've started to hook into these databases. And while we've always worked with these same databases in the past, and they're supported today, we're now doing something called native database pushdown, where the entire compute and data activity happens in the database. And why that is so interesting and powerful now is everyone's concerned with something called egress. 
Did my data that I've spent all this time and money with my security team securing ever leave my hands? Did it ever leave my secure VPC, as they call it? And with these native integrations that we're building and about to unveil here, as kind of a sneak peek for next week at Data Citizens, we're now doing all compute and data operations in databases like Snowflake. And what that means is, with no install and no configuration, you could log into the Collibra Data Quality app and have all of your data quality running inside the database that you've probably already picked as your team's go-forward, secured database of choice. So we're really excited about that. And I think if you look at the whole landscape of network cost, egress cost, data storage and compute, what people are realizing is it's extremely efficient to do it in the way that we're about to release here next week. >> So this is interesting because what you just described, you mentioned Snowflake, you mentioned Google, oh, actually you mentioned, yeah, Databricks. Snowflake has the data cloud. If you put everything in the data cloud, okay, you're cool, but then Google's got the open data cloud, if you heard Google Next. And now Databricks doesn't call it the data cloud, but they have like the open source data cloud. So you have all these different approaches, and there's really no way, up until now, I'm hearing, to really understand the relationships between all those and have confidence across them. It's like (indistinct) you should just be a node on the mesh. And I don't care if it's a data warehouse or a data lake or where it comes from, but it's a point on that mesh and I need tooling to be able to have confidence that my data is governed and has the proper lineage, provenance. And that's what you're bringing to the table. Is that right? Did I get that right? >> Yeah, that's right. 
And for us, it's not that we haven't been working with those great cloud databases, but it's the fact that we can send them the instructions now, we can send them the operating ability to crunch all of the calculations, the governance, the quality, and get the answers. And what that's doing, it's basically zero network cost, zero egress cost, zero latency of time. And so when you were to log into BigQuery tomorrow using our tool, or, let's say, Snowflake, for example, you have instant data quality metrics, instant profiling, instant lineage, and access and privacy controls, things of that nature that just become less onerous. What we're seeing is there's so much technology out there, just like all of the major brands that you mentioned, but how do we make it easier? The future is about less clicks, faster time to value, faster scale, and eventually lower cost. And we think that this positions us to be the leader there. >> I love this example because everybody talks about, wow, the cloud guys are going to own the world, and of course now we're seeing that the ecosystem is finding so much white space to add value, connect across clouds. Sometimes we call it supercloud, or inter-clouding. Alright, Kirk, give us your final thoughts on the trends that we've talked about and Data Citizens '22. >> Absolutely. Well, I think one big trend is discovery and classification. We're seeing that across the board. People used to know it was a zip code, and nowadays, with the amount of data that's out there, they want to know where everything is, where their sensitive data is, if it's redundant. Tell me everything inside of three to five seconds. And with that comes, they want to know, in all of these hyperscale databases, how fast they can get controls and insights out of their tools. So I think we're going to see more one-click solutions, more SaaS-based solutions, and solutions that hopefully prove faster time to value on all of these modern cloud platforms. >> Excellent, all right. 
Kirk Haslbeck, thanks so much for coming on theCUBE and previewing Data Citizens '22. Appreciate it. >> Thanks for having me, Dave. >> You're welcome. All right, and thank you for watching. Keep it right there for more coverage from theCUBE.
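
Editor's note: the distinction drawn in the interview between "break records" (rows that violate a hand-written condition) and "breaking trends" (a metric that departs from its own history, such as a partial or late load) can be sketched as follows. This is a minimal illustration, not Collibra's implementation: the function names, the sample rows, the daily row counts, and the z-score threshold of 3 are all assumptions made for the example.

```python
from statistics import mean, stdev


def break_records(rows, condition):
    """Rule-based check: return every row that violates an explicit condition."""
    return [row for row in rows if not condition(row)]


def breaking_trend(history, today, threshold=3.0):
    """Trend check: flag today's value when it sits more than `threshold`
    sample standard deviations away from the historical mean."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is a break.
        return today != mu
    return abs(today - mu) / sigma > threshold


# Hypothetical rows: the condition "price must be non-negative" catches one record.
rows = [{"price": 10.5}, {"price": -1.0}, {"price": 12.0}]
bad_rows = break_records(rows, lambda row: row["price"] >= 0)

# Hypothetical daily row counts for one pipeline. No per-row condition fails
# when only 400 rows arrive, but the row-count trend clearly breaks.
daily_row_counts = [1000, 1020, 990, 1010, 1005, 995, 1015]
partial_load = breaking_trend(daily_row_counts, today=400)
normal_load = breaking_trend(daily_row_counts, today=1008)
```

The point of the second check is that it needs no domain expert to write a rule per field: the history of the metric itself defines what "normal" looks like, which is the shift away from hand-written conditions described in the conversation.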

Published Date : Oct 24 2022

Aaron Kalb, Alation | CUBEConversation, September 2020

>> Announcer: From theCUBE studios in Palo Alto, in Boston, connecting with thought leaders all around the world. This is theCUBE conversation. >> Hey, welcome back, everybody. Jeff Frick here with theCUBE. We're in our Palo Alto studios today for theCUBE conversation. We're talking about data. We're always talking about data and it's really interesting. You know we like to go out and get you the first person insight from the people that start the companies, run the companies, the practitioners, and get the insight directly from them. We also like to go out and get original research and hear from original research. And this is a great opportunity to hear from both. So we're excited to have, and welcome back into the studio, Aaron Kalb. He's the co-founder of Alation, many-time CUBE alumni. Aaron, great to see you. >> Yeah, thanks for having me. It's good to be here. >> Yeah, it's very cool. But today it's a special thing. We've never done this before with you. You guys are releasing a brand new report called the Alation State of Data Culture Report. So really interesting report. A lot of great information that we're going to dig in here for the next few minutes. But before we do, tell us kind of the history of this report. This is kind of the inaugural release. What was kind of behind it, why did you guys do this? And give us a little background before we get into the details. >> Absolutely. So, yes, that's exactly right. It's debuting today, and we plan to kind of update this research quarterly, so we're going to see the trends over time. And this emerged because, you know, as part of my job, I talk to chief data officers and chief analytics officers across our customer base and prospects. And I keep hearing anecdotally, over and over, that establishing a data culture is often the number one priority for these data leaders and for these organizations. And so we wanted to really say, can we quantify that? 
Can we agree upon a definition of data culture? And can we create sort of a simple yardstick to more objectively measure where organizations are on this sort of data maturity curve to get it into culture. >> Right. I love it. So you created this data, data index right? The data culture index. And, and I think it's important to look at methodology. I think people, a lot of times go right to the results on reports before talking about the methodologies. And let's talk about the methodologies cause we're supposed to be talking about data, right? So you talked to 300, some odd executives, correct. And I think it's really interesting and you broke it down into three kind of buckets of data literacy, if you will. Data search and discovery, number one, data, two kind of literacy in terms of their ability to work with the data. And then the third bucket is really data governance. And then in, in the form ABCD, you gave him a four point score and basically, are they doing it well? Are they doing it in the majority of the time? Are they doing it about half, they got one or they got a zero and you get this four point scale and you end up with a 12 point scale which we're all familiar with from, from school, from an A to an, A minus and B, et cetera. Just dig it a little bit on those three categories and how you chose those. So the first one again is kind of the data search and discovery, you know can they find it and then their competency, if you will and then a governance and compliance. Kind of dig into each of those three buckets a little bit. >> For sure. So, so the, the end goal in data culture, is to have an organization in which data is valued and decisions are made based on data and evidence, right? Versus a culture in which we go with the highest paid person's opinion or what we did last quarter or any of these other ways things get done. And so the idea is to make that possible, as you said you've to be able to find the data when you need it. 
That's the data search and discovery. You've to be able to interpret that data correctly and draw valid conclusions from it. And that's a data literacy, excuse me. And both of those are contingent upon having data governance in place. So that data is well-defined and has high data quality, as well as other aspects, so that it is possible to find it and understand it properly. >> Right. And what are the things too that I think is really important that we call that, and again, we're going to dive into the details, is your perceived execution versus the reported execution by the people that are actually providing data. And I think you've found and you've highlighted on specific slides that you know, there's not necessarily a match there. And sometimes that you know, what you perceive is happening, isn't necessarily what's happening when you go down and query the people in the field. So really important to come up with a number. And I think a, I think you said this is going to be an ongoing thing over a period of time. So you kind of start to see longitudinal changes in these organizations. >> Absolutely. And we're very excited to see those, those trends over time. But even at the outset is this you know, very striking effect emerges which is, as you said, if we ask one of these you know, 300 data leaders, you know, all around the world actually, you know, if we ask, how is the data culture at your company overall, and this is very broad general top down way and have them graded on the sort of SaaS scale. You know, we get results where there's a large gap between kind of that level of maturity and what emerges in a bottom up methodology excuse me, in which you ask about, you know governance and literacy and, and such kind of by department and in a more bottom up way. And so we do see that that, you know, it can be helpful, even for data people to have a, a more granular metric and framework for quantifying their progress. >> Right? Let's jump into some of the results. 
It's, it's a fascinating, they're kind of all over the map, but there's some definite trends. One of the trends you talked about is that there's a lot of questions on the quality of the data. But that's a real inhibitor to people. Whether that suspicion is because it's not good data. And I don't know, this question for you, is, is, do they think it's not relevant to the decision that's being made? Is it an incomplete data set or the wrong data set? It seems to be that keeps coming up over and over about, decision-makers not necessarily having confidence in the data. What, can you share a little bit more color around that? >> Yeah, it's quite interesting actually. So what we find is that 90%. So 90 people, 10 executives (indistinct) to question the data sometimes often or always. But the part that's maybe disappointing or concerning is the two thirds of executives are believed to ignore the data and make a decision kind of pushing the data aside which is really quite striking when you think about it, why have all this data, if more often than not you're sort of disregarding it to make your final answer. And so you're absolutely correct when we dug into why, what are the reasons behind pushing it aside. Data quality was number one. And I think it is a question of, Oh, is the data inaccurate? Is it out of date, these sort of concerns sort of we, we hear from customers and prospects. But as we dig in deeper in the survey results, excuse me, we, we see some other reasons behind that. One is a lack of collaboration between the data analytics folks and the business folks. And so there's a question of, I don't know exactly where this data came from or to your point kind of how it was produced. What was the methodology? How was it sourced? And maybe because of that disconnect is a lack of trust. So trust really is the ultimate I think, failure to having data culture really take root. >> Right? 
And it's trust in this trust, as you said, not only in the data per se, the source of the data, the quality of the data, the relevance of the data but also the people who are providing you with the data. And obviously you get, you get some data sets. Sometimes you didn't get other data sets. So, that's really I'm a little bit disconcerting. The other thing I thought was kind of interesting is, it seems to be consistent that the, the primary reason that people are using big data projects is around operations and operations efficiency, a little bit about compliance, but, you know, it's interesting we had you on at the MIT CDOIQ, Chief Data Information Officer quality symposium, and you talked about the goodness of people moving from kind of a defensive posture to an offensive posture, you know using data in terms of product development and innovation. And, and what comes across in this survey is that's kind of down the list behind you know, kind of operational efficiency. We're seeing a little bit of governance and regulation but the, the quest for data as a tool for innovation, didn't really shine through in this report. >> Well, you know, it's very interesting. It depends whether you look at the aggregate level or you break things down a little bit more. So one thing we did after we got that zero to 12 scale on the data culture index or DCI, is it actually, we were able to break it down into thirds. And among the sort of bottom third, it has the least well-established data culture by this yardstick. We've found that governance and regulatory compliance, was the number one application of data. But among the top third of respondents, we actually found the opposite where things like providing a great customer experience, doing product innovation, those sort of things actually came to the fore and governance fell behind. So I think there is this curve where, It's table stakes to get the sort of defense side of data figured out. 
And then you can move on to offense in using data to make your organization meet its other goals. >> Right. Right. And then I wanted to get your take on kind of the democratization of data, right? This is a trend that's been going on, and really, I think you said before, you know, your guys' whole mission is to empower a curious and rational world, to give people the ability to ask the right questions, have the right data, and get the right answer. So, you know, we've seen democratization in terms of the access to the data, the access to the tools, the ability to do something with the data and the tool, and then the actual authority to execute a business decision based on that. The results on that seem a little bit split here, because a lot of the problems seem to be focused on leadership not necessarily making a data-based decision, but on the good hand, a lot of people are trying to break down data silos and make data more accessible for a larger group of people, so that more people in the organization are making data-based decisions. This seems kind of like a little bit of a bifurcation between the C suite and everybody else trying to get their job done. >> Absolutely. There's always this question of, you know, sort of that organization-wide initiative and then what's happening on the ground. One thing we saw that was very heartening and aligns with our customers index success, is a real emphasis being placed on having data governance and data context and data literacy factors sort of be embedded at the point of use. Not expecting people to just, like, take a course and look things up and kind of interrupt their workflow to be able to use data quickly and accurately and interpret it in varied ways. So that was really exciting to see as an initiative. It sort of bridges that gap, along with initiatives to have more collaboration and integration between the data people and the business people. 
because really you know, they exist to serve one another. But in terms of the disconnect between the C suite and other parts of the org, there was a really interesting inverse correlation. Well, or maybe it's not interesting how you look at it, but basically, you know, when we talk to C level executives and ask, you know, does the C suite ignore data? Do they question data et cetera, those numbers came in lower than when we talked to, you know, senior director about the C suite right? It's sort of the farther you get, and there's a difference there, you know, from my perspective, I almost wonder whether that distance is actually is more objective viewpoint. And when you're in that role, it's hard to even see your cognitive biases and your tendency to ignore a data when it doesn't suit you. >> Right. Right. So there's, there's some other interesting things here. So one of them is, you know, kind of predictors, right? One of the whole reasons to do studies and collect data so that we can have some predictive ability. And, and it comes out here that the reporting structure is a strong predictor of a company's data tier structure. So, you know, there's the whole rise of the chief data officers and the chief analytics officer and the chief data and analytics officer and lots of conversations about those roles and what exactly are those roles and who do they report to. Your study finds a pretty compelling leading indicator that if that role is reporting to either the CEO or the executive board, which is often a one in the same person, that that's actually a terrific indicator of success in moving to a more data centric culture. >> That's absolutely correct. So we found that that top third of organizations on the data culture index were much more likely to have a chief data executive, a CDO, CAO or CDAO. 
In fact, they're more likely to have folks with the analytics in their title because in some organizations, data is thought to mean sort of raw data, infrastructural defense and analytics is sort of where it gets you know, infused into business processes and value. But certainly that top third is much more likely to have the chief data executive reporting into the executive board or CEO when the highest ranking data executive is under the CIO or some other part of the organization, those orgs tend to score a far lower on the DCI. >> Right. Right. So it's interesting, you know you're a really interesting guy even doing this for a while. You were at Siri before you were at Alation. So you have a really good feel for kind of what data can do and can't do and natural human or natural language processing and, and, and human voice interaction with these devices, a really interesting case study, and they can do a really good job within a small defined data set and instruction set, but they don't do necessarily so well once you kind of get outside how, how they're trained. And you've talked a lot about how metaphor shaped the way that we think and I know you and Dave talked about data oil and data lakes I don't want to necessarily go down that whole path but I do think it's important. And what came out of the study and the way people think about data. You know, there's a lot of conversation. How do you value data? Is data, you know it used to just be an expense that we had to buy servers to store the stuff we weren't sure what we ever did with it. So I wonder if there's any, you know, kind of top level metaphors level, kind of a thought or process or framing in the companies that you study that came out. 
maybe not necessarily in the top line data, but maybe in some of the notes that help define why some people, you know are being successful at making this transition and putting, you know kind of data out front of their decision processing versus data, either behind as a supporting thing or maybe data, I just don't have time with it or I don't trust it, or God knows where you got that, and this is not the data that I wanted. You know, was there any, you know, kind of tangental or anecdotal stuff that came out of this study that's more reflective of, of the softer parts of a data culture versus the harder parts in terms of titles and roles and, and, and job responsibilities. >> Yeah. It's a really interesting place to explore. I do think there's a, I don't want to make this overly simplistic group binary, but at the end of the day you know, like anything else within an organization, you can view data as a liability to say, okay, we have for example, you know, customer's names and phone numbers and passwords, and we just need to prevent an adverse event in which there's a leak or some sort of InfoSec problem that could cause, you know, bad press and fines and other negative consequences. And I think the issue there is if data's a liability, the most you know, the best case is that it's worth zero as opposed to some huge negative on your company's balance sheet. And, and I think, you know, intuitively, if you really want to prevent data misuse and data problems, one fail safe, but I think ultimately in its own way risky way to do that was just not collect any data, right. And not store it. So I think that the transition is to say, look data must be protected and taken care of that's step zero. 
But you know, it's really just the beginning, and data is this asset that can be used to inform the huge company level strategic decisions that are made in annual planning at the board level, down to the millions of little decisions every day in the work of people in customer support and in sales and in product management and in, you know, various roles just across industries. And I think once you have that shift, you know, the upside is potentially, you know, unbounded. >> Right. And it just changes the way you think. And suddenly, instead of saying, Oh, data needs to be kind of hidden away, it's more like, Oh, people need to be trained on data use and empowered with data. And it's all about not if it's used or if it's misused, but really how it's used and why it's used, what it's being used for to make a real impact. >> Right. Right. And it's funny, I just remember being back in business school, one of the great things that they help teach is to think in terms of data, right? And you always have the infamous center consulting interview question, How many manhole covers are in Manhattan? Right. So, you know, to start to think about that problem from a data-centric point of view really gives you a leg up, and even, you know, where to start and how to attack those types of problems. And I thought it was interesting, you know, talking about challenges for people to have a more data-centric point of view. It's interesting. The report says basically everybody said there's all kinds of challenges around data quality and compliance, and they had democratization. But the bottom companies said that the biggest challenge was lack of buy-in from company leadership. 
So I guess the good news, bad news is that there's a real opportunity to make a significant change and get your company from the bottom third to a middle third or a top third, simply by taking a change in attitude about putting data in a much more central role in your decision making process. 'Cause all the other stuff's kind of operational, execution challenges that we all have, not enough people, blah, blah, blah. But in terms of attitude of leadership and prioritization, that's something that's very easy to change if you so choose. And it really seems to be the key to unlock this real journey, as opposed to the minutiae of a lot of the little details that are a challenge for everybody. >> Absolutely. And changing attitudes might be the easiest thing or the hardest thing depending on (indistinct). But I think you're absolutely right. The first step, which could, maybe it should, be easy, is admitting that you have a problem or, maybe to put it more positively, realizing you have an opportunity. >> I love that. And then just again, looking at the top tier companies, the other thing that I thought was pretty interesting in this study, I'm looking at it here, is getting champions in each of the operational segments. So rather than, I mean, a chief data officer is important and, you know, somebody kind of at the high level to shepherd it in the executive suite, as we just discussed, but within each of the individual tasks and functions and roles, whether that's operations or customer service or product development or operational efficiency, you need some type of champion, some type of person, you know, banging the gavel, collecting the data, smoothing out the complexities, helping people get their thing together. And again, another way to really elevate your position on the score. >> Absolutely. 
And I think this idea of, again, bridging between, you know, if data is centralized, you have a chance to try to really get excellent practices within the data org. But it becomes even more essential to have those ambassadors, people who are in the business and understand all the business context, who can sort of make the data relevant, identify the key areas where data can really help, maybe demystify data and pick the right metaphors and the right examples to make it real for the people in their function. >> Right. Right. So Aaron has a lot of great stuff. People can go to the website at alation.com. I'm sure you'll have a link to this very prominently displayed, and they should check it out and really think about it and think about how it applies to their own situation, their own department, company, et cetera. I just wanted to give you the last word before we sign off. You know, kind of what was the most, you know, kind of positive affirmation, or not the most, but one or two of the most affirming outcomes of this exercise? And what were one or two of the things that were a little concerning or, you know, kind of surprises on the downside that came out of this research? >> Yeah. So I think one thing that was maybe surprising or concerning, the biggest one, is sort of where we started, with that disconnect between, you know, what people would say as an off the cuff overall assessment and what emerges when we go department by department and (indistinct) to be pillars of data culture, from data search and discovery to data literacy to data governance. I think that disconnect, you know, should give one pause. I think certainly it should make one think, Hmm, maybe I shouldn't look from 10,000 feet, but actually be a little more systematic. And considering the framework I use to assess data culture, that is the most important thing to my organization. 
I think though, there's this quote that you move what you measure. Just having this hopefully simple but not simplistic yardstick to measure data culture, the data culture index, should help people be a little bit more realistic in their quantification as they track their progress, you know, quarter over quarter. So I think that's very promising. I think another thing is that, you know, sometimes we ask, how long have you had this initiative? How much progress have you made? And it can sometimes seem like pushing a boulder uphill. Obviously the COVID pandemic and the economic impacts of that have been really tragic and really hard. You know, a tiny silver lining in that is the survey results showed that organizations have really observed a shift in how much they're using data, because sometimes things are changing but it's like a frog in boiling water. You don't realize it. And so you just assume that the future is going to look like the recent past, and you don't look at the data or you ignore the data or you miss parts of the data. And a lot of organizations said, you know, COVID was this really troubling wake up call, but that it could, even after this crisis is over, produce enduring change in which people are consulting data more and making decisions in a more data driven way. >> Yeah, certainly an accelerant, that is for sure, whether you wanted it, didn't want it, thought you had it at the time, didn't have time. You know, COVID is definitely a digital transformation accelerant, and data is certainly the thing that powers that. Well again, it's the Alation State of Data Culture Report, available, go check it at alation.com. Aaron, always great to catch up, and again, thank you for doing the work and supporting this research. And I think it's really important stuff. And it's going to be interesting to see how it changes over time. 'Cause that's really when these types of reports really start to add value. 
>> Thanks for having me, Jeff, and I really look forward to discussing some of those trends as the research is completed. >> All right. Thanks a lot, Aaron, take care. All right. He's Aaron and I'm Jeff. You're watching theCUBE, Palo Alto. Thanks for watching. We'll see you next time. (upbeat music)

Published Date : Oct 1 2020

SUMMARY :

leaders all around the world. and get the insight directly from them. It's good to be here. This is a, the kind of you know, I, part of my job, and then their competency, if you will And so the idea is to make that possible, And sometimes that you know, But even at the outset is this you know, One of the trends you talked of pushing the data aside and you talked about the And among the sort of bottom third, in terms of the access to the It's sort of the farther you get, and the chief data and analytics officer where it gets you know, and putting, you know but at the end of the day you know, the way, the way you think. a lot of the little details that you have a problem or and you know, somebody and the right examples to make it real before we sign off, you know, And a lot of organizations said, you know and data is certainly the and I really look forward to We'll see you next time.

ENTITIES

Entity	Category	Confidence
Aaron	PERSON	0.99+
Dave	PERSON	0.99+
Jeff	PERSON	0.99+
Jeff Frick	PERSON	0.99+
Aaron Kalb	PERSON	0.99+
Palo Alto	LOCATION	0.99+
one	QUANTITY	0.99+
10 executives	QUANTITY	0.99+
12 point	QUANTITY	0.99+
September 2020	DATE	0.99+
Siri	TITLE	0.99+
90%	QUANTITY	0.99+
90 people	QUANTITY	0.99+
Manhattan	LOCATION	0.99+
two	QUANTITY	0.99+
CUBE	ORGANIZATION	0.99+
10,000 feet	QUANTITY	0.99+
One	QUANTITY	0.99+
both	QUANTITY	0.99+
Boston	LOCATION	0.99+
each	QUANTITY	0.99+
today	DATE	0.99+
zero	QUANTITY	0.99+
first step	QUANTITY	0.99+
theCUBE	ORGANIZATION	0.99+
four point	QUANTITY	0.98+
alation.com	OTHER	0.98+
Alation State of Data Culture Report	TITLE	0.98+
one thing	QUANTITY	0.98+
COVID pandemic	EVENT	0.97+
millions	QUANTITY	0.96+
third bucket	QUANTITY	0.96+
Alation	ORGANIZATION	0.95+
first one	QUANTITY	0.94+
two thirds	QUANTITY	0.94+
last quarter	DATE	0.92+
300 data leaders	QUANTITY	0.91+
about half	QUANTITY	0.91+
three categories	QUANTITY	0.9+
three buckets	QUANTITY	0.89+
MIT CDOIQ	ORGANIZATION	0.89+
third	QUANTITY	0.89+
InfoSec	ORGANIZATION	0.88+
step zero	QUANTITY	0.86+
first person	QUANTITY	0.85+
three kind	QUANTITY	0.84+
thirds	QUANTITY	0.83+
Alation	PERSON	0.82+
12 scale	QUANTITY	0.74+
C suite	TITLE	0.73+
C	TITLE	0.71+
300	OTHER	0.71+
One thing	QUANTITY	0.7+
bottom	QUANTITY	0.67+
Alation State of Data Culture Report	TITLE	0.65+
minutes	DATE	0.58+
Officer	EVENT	0.56+
top third	QUANTITY	0.56+
middle	QUANTITY	0.51+

Aaron Kalb, Alation | CUBEConversation, September 2020

Published Date : Sep 30 2020


Eileen Vidrine, US Air Force | MIT CDOIQ 2020

>> Announcer: From around the globe, it's theCUBE with digital coverage of MIT Chief Data Officer and Information Quality Symposium, brought to you by SiliconANGLE Media. >> Hi, I'm Stu Miniman and this is the seventh year of theCUBE's coverage of the MIT Chief Data Officer and Information Quality Symposium. We love getting to talk to these chief data officers and the people in this ecosystem, the importance of data, driving data-driven cultures, and really happy to welcome to the program first-time guest Eileen Vidrine. Eileen is the Chief Data Officer for the United States Air Force. Eileen, thank you so much for joining us. >> Thank you, Stu, really excited about being here today. >> All right, so the United States Air Force, I believe, had its first CDO office in 2017. You were put in the CDO role in June of 2018. If you could, bring us back, give us how that was formed inside the Air Force and how you came to be in that role. >> Well, Stu, I like to say that we are a startup organization and a really mature organization, so it's really about culture change, and it began by bringing a group of amazing citizen airman reservists back to the Air Force to bring their skills from industry and bring them into the Air Force. So, I like to say that we're a total force because we have active and reservists working with civilians on a daily basis, and one of the first things we did in June was we stood up a data lab that's based in the Jones building on Andrews Air Force Base. And there, we actually take small use cases that have enterprise focus, and we really try to dig deep to try to drive data insights, to inform senior leaders across the department on really important, what I would call enterprise focused challenges. It's pretty exciting. 
>> Yeah, it's been fascinating when we've dug into this ecosystem. Of course, while the data itself is very sensitive, and I'm sure for the Air Force there are some of the very highest levels of security, in the practices that are done as to how to leverage data, the line between public and private blurs, because you have people that have come from industry that go into government and people that are from government that have leveraged their experiences there. So, if you could give us a little bit of your background and what it is that your charter has been and what you're looking to build out, as you mentioned, that culture of change. >> Well, I like to say I began my data leadership journey as an active duty soldier in the Army, and I was originally a transportation officer. Today we would use the title condition based maintenance, but back then, it was really about running the numbers so that I could optimize my truck fleet on the road each and every day, so that my soldiers were driving safely. Data has always been part of my leadership journey, and so I like to say that one of our challenges is really to make sure that data is part of every airman's core DNA, so that they're using the right data at the right level to drive insights, whether it's tactical, operational or strategic. And so it's really about empowering each and every airman, which I think is pretty exciting. >> There's so many pieces of that data. You talk about data quality, there's obviously the data life cycle. I know the presentation that you're giving here at the CDOIQ talks about the data platform that your team has built. Could you explain that? What are the key tenets and what maybe differentiates it from what other organizations might have done? >> So, when we first took the challenge to build our data lab, we really wanted to really come up. Our goal was to have a cross domain solution where we could solve data problems at the appropriate classification level. 
And so we built the VAULT data platform. VAULT stands for visible, accessible, understandable, linked, and trustworthy. And if you look at the DOD data strategy, they will also add the tenets of interoperability and secure. So, the first steps that we have really focused on is making data visible and accessible to airmen, to empower them to drive insights from available data to solve their problems. So, it's really about that data empowerment. We like to use the hashtag built by airmen because it's really about each and every airman being part of the solution. And I think it's really an exciting time to be in the Air Force because any airman can solve a really hard challenge and it can very quickly escalate up with great velocity to senior leadership, to be an enterprise solution. >> Is there some basic training that goes on from a data standpoint? For any of those that have lived in data, oftentimes you can get lost in numbers. You have to have context, you need to understand how do I separate good from bad data, or when is data still valid? So, how does someone in the Air Force get some of that data competency? >> Well, we have taken a multitenant approach because each and every airman has different needs. So, we have quite a few pathfinders across the Air Force today, to help what I call upskill our total force. And so I developed a partnership with the Air Force Institute of Technology, and they now have an online graduate level data science certificate program. So, individuals studying at AFIT or remotely have the opportunity to really focus on building up their data touchpoints. Just recently, we have been working on a pathfinder to allow our data officers to get their ICCP Federal Data Sector Governance Certificate Program. So, we've been running what I would call short boot camps to prep data officers to be ready for that. 
And I think the one that I'm most excited about is that this year, this fall, new cadets at the U.S. Air Force Academy will be able to have an undergraduate degree in data science. And so it's not about a one prong approach, it's about having short courses as well as academic solutions to upskill our total force moving forward. >> Well, information absolutely is such an important differentiator (laughs) in general business, and absolutely the military aspects are there. You mentioned the DOD talks about interoperability in their platform. Can you speak a little bit to how you make sure that data is secure? Yet, I'm sure there's opportunities for other organizations, for there to be collaboration between them. >> Well, I like to say that we don't fight alone. So, I work on a daily basis with my peers, Tom Cecila at the Department of Navy and Greg Garcia at the Department of Army, as well as Mr. David Berg at the DOD level. It's really important that we have an integrated approach moving forward, and in the DOD we partner with our security experts, so it's not about us doing security individually. In the Air Force we use a term called digital air force, and it's about optimizing and building a trusted partnership with our CIO colleagues, as well as our chief management colleagues, because it's really about that trusted partnership to make sure that we're working collaboratively across the enterprise. And whatever we do in the department, we also have to reach across our services so that we're all working together. >> Eileen, I'm curious if there's been much impact from the global pandemic. When I talk to enterprise companies, they had to rapidly make sure that, while they needed to protect data when it was in their four walls and maybe for VPN, now everyone is accessing data, much more work from home and the like. 
I have to imagine some of those security measures you've already taken, but has there been anything along those lines, or anything else, where this shift in where people are, a little bit more dispersed, has impacted your work? >> Well, the story that I like to say is that this has given us velocity. So, prior to COVID, we built our VAULT data platform as a multitenancy platform that is also a cross-domain solution, so it allows people to develop and do their problem solving at an appropriate classification level. And it allows us to connect or push up, if we need to, into higher classification levels. The other thing is that it has helped us really work smart, because we do as much as we can in that unclassified environment, and then using our cloud based solution in our gateways, it allows us to bring people in at a very scheduled component so that we maximize, or we optimize, their time on site. And so I really think that it's really given us great velocity, because it has really allowed people to work on the right problem set, at the right classification level, at a specific time. And plus the other piece, as we look at what we're doing, is that the problem set that we've had has really allowed people to become more data focused. I think that it's personal for folks moving forward, so it has increased understanding in terms of the need for data insights, as we move forward to drive decision making. It's not that data makes the decision, but it's using the insight to make the decision. >> And one of the interesting conversations we've been having about how to get to those data insights is the use of things like machine learning, artificial intelligence. Anything you can share about how you're looking at that journey, where you are along that discovery? >> Well, I love to say that in order to do AI and machine learning, you have to have great volumes of high quality data. 
And so really step one was visible, accessible data, but we in the Department of the Air Force stood up an accelerator at MIT. And so we have a group of amazing airmen that are actually working with MIT on a daily basis to solve some of those, what I would call opportunities for us to move forward. My office collaborates with them on a consistent basis, because they're doing additional use cases in that academic environment, which I'm pretty excited about because I think it gives us access to some of the smartest minds. >> All right, Eileen, also I understand it's your first year doing the event. Unfortunately, we don't get to all come together in Cambridge. Walking those hallways and being able to listen to some of those conversations and follow up is something we've very much enjoyed over the years. What excites you about being able to interact with your peers and participating in the event this year? >> Well, I really think it's about helping each other leverage the amazing lessons learned. I think that if we look collaboratively, both across industry and in the federal sector, there have been amazing lessons learned, and it gives us a great forum for us to really share and leverage those lessons learned as we move forward, so that we're not hitting the reboot button, but we actually are starting faster. So, it comes back to the velocity component. It all helps us go faster and at a higher quality level, and I think that's really exciting. >> So, final question I have for you. We've talked for years about digital transformation, we've really said that having that data strategy and that culture of leveraging data is one of the most critical pieces of having gone through that transformation. For people that are maybe early on their journey, any advice that you'd give them, having worked through a couple of years of this and the experience you've had with your peers? 
>> I think that the first thing is that you have to really start with a blank slate and really look at the art of the possible. Don't think about what you've always done, think about where you want to go, because there are many different paths to get there. And if you look at what the target goal is, it's really about making sure that you do that backward tracking to get to that goal. And the other piece that I tell my colleagues is celebrate the wins. My team of airmen, they are amazing, it's an honor to serve them, and the reality is that they are doing great things and sometimes you want more. And it's really important to celebrate the victories, because it's a very long journey and we keep moving the goalposts because we're always striving for excellence. >> Absolutely, it is always a journey that we're on, it's not about the destination. Eileen, thank you so much for sharing all that you've learned, and glad you could participate. >> Thank you, Stu, I appreciate being included today. Have a great day. >> Thanks, and thank you for watching theCUBE. I'm Stu Miniman, stay tuned for more from the MIT CDOIQ event. (lively upbeat music)

Published Date : Sep 3 2020

SUMMARY :

brought to you by Silicon Angle Media. and the people in this ecosystem, Thank you Stu really All right, so the of the first things we did sure for the Air Force, at the right level to drive at the CDO, IQ talks to build our data lab, we have the opportunity to and absolutely the It's really important that we that they had to rapidly make Well, the story that I like to say is, And one of the interesting that in order to do AI and participating in the event this year? in the federal sector, is one of the most critical and really look at the art it's not about the destination. Have a great day. from the MIT, CDO IQ event.

ENTITIES

Entity	Category	Confidence
Michael	PERSON	0.99+
Eileen	PERSON	0.99+
Claire	PERSON	0.99+
Tom Cecila	PERSON	0.99+
Lisa Martin	PERSON	0.99+
David Berg	PERSON	0.99+
2017	DATE	0.99+
Greg Garcia	PERSON	0.99+
June of 2018	DATE	0.99+
Jonathan Rosenberg	PERSON	0.99+
Michael Rose	PERSON	0.99+
June	DATE	0.99+
Eileen Vitrine	PERSON	0.99+
Blair	PERSON	0.99+
U.S Air Force Academy	ORGANIZATION	0.99+
MIT	ORGANIZATION	0.99+
Wednesday	DATE	0.99+
five minutes	QUANTITY	0.99+
Omni Channel	ORGANIZATION	0.99+
five billion	QUANTITY	0.99+
Air Force Institute of Technology	ORGANIZATION	0.99+
two years	QUANTITY	0.99+
Cambridge	LOCATION	0.99+
Thursday	DATE	0.99+
Orlando, Florida	LOCATION	0.99+
Silicon Angle Media	ORGANIZATION	0.99+
United States Air Force	ORGANIZATION	0.99+
Eileen Vidrine	PERSON	0.99+
Ryan	PERSON	0.99+
Google	ORGANIZATION	0.99+
Blake	PERSON	0.99+
Stu Miniman	PERSON	0.99+
Blair Pleasant	PERSON	0.99+
BC Strategies	ORGANIZATION	0.99+
Department of Navy	ORGANIZATION	0.99+
next year	DATE	0.99+
Stu	PERSON	0.99+
first	QUANTITY	0.99+
Confusion	ORGANIZATION	0.99+
today	DATE	0.99+
five o'Clock	DATE	0.99+
YouTube	ORGANIZATION	0.99+
seventh year	QUANTITY	0.99+
twenty years ago	DATE	0.99+
first year	QUANTITY	0.99+
twenty	QUANTITY	0.99+
decades ago	DATE	0.99+
last year	DATE	0.99+
Jones	LOCATION	0.98+
Andrews Air Force Base	LOCATION	0.98+
this year	DATE	0.98+
United States Air Force	ORGANIZATION	0.98+
last year	DATE	0.98+
five	QUANTITY	0.98+
five nine mugs	QUANTITY	0.98+
first thing	QUANTITY	0.98+
first thing	QUANTITY	0.98+
first steps	QUANTITY	0.98+
this fall	DATE	0.98+
DOD	ORGANIZATION	0.98+
Enterprise Connect	ORGANIZATION	0.97+
Department of the Air Force	ORGANIZATION	0.97+
AFIT	ORGANIZATION	0.97+
both	QUANTITY	0.97+
one	QUANTITY	0.97+
Department of Army	ORGANIZATION	0.97+
first time	QUANTITY	0.97+
each	QUANTITY	0.96+
CDO IQ	EVENT	0.96+
One	QUANTITY	0.95+
Twitter	ORGANIZATION	0.95+

Doug Laney, Caserta | MIT CDOIQ 2020

>> Announcer: From around the globe, it's theCUBE with digital coverage of MIT Chief Data Officer and Information Quality Symposium, brought to you by SiliconANGLE Media. >> Hi everybody. This is Dave Vellante and welcome back to theCUBE's coverage of the MIT CDOIQ 2020 event. Of course, it's gone virtual. We wish we were all together in Cambridge. They were going to move into a new building this year. For years they've done this event at the Tang Center, moving into a new facility, but unfortunately going to have to wait at least a year, we'll see. But we've got a great guest nonetheless. Doug Laney is here. He's a Business Value Strategist, a bestselling author, an analyst, consultant, and a longtime CUBE friend. Doug, great to see you again. Thanks so much for coming on. >> Dave, great to be with you again as well. >> So can I ask you? You have been an advocate for obviously measuring the value of data, the CDO role. Don't take this the wrong way, but I feel like the last 150 days have done more to accelerate people's attention on the importance of data and the value of data than all the great work that you've done. What do you think? (laughing) >> It's always great when organizations actually take advantage of some of these concepts of data value. You may be speaking specifically about the situation with United Airlines and American Airlines, where they have basically collateralized their customer loyalty data, their customer loyalty programs, to the tune of several billion dollars each. And one of the things that's very interesting about that is that the third party valuations of their customer loyalty data resulted in numbers that were larger than the companies themselves. So basically the value of their data, which is, as we've discussed previously, off balance sheet, is more valuable than the market cap of those companies themselves, which is just incredibly fascinating. >> Well, and of course, all you have to do is look to the Trillionaire's Club. 
And now of course, Apple pushing two trillion, to really see the value that the market places on data. But the other thing is of course COVID. Everybody talks about the COVID acceleration. How have you seen it impact the awareness of the importance of data, whether it applies to business resiliency or even new monetization models? If you're not digital, you can't do business. And digital is all about data. >> I think the major challenge that most organizations are seeing from a data and analytics perspective due to COVID is that their traditional trend based forecast models are broken. If you're a company that's only forecasting based on your own historical data and not taking into consideration, or even identifying, what are the leading indicators of your business, then COVID and the economic shutdown have entirely broken those models. So it's raised the awareness of companies to say, "Hey, how can we predict our business now? We can't do it based on our own historical data. We need to look externally at what are those external, maybe global indicators or other kinds of markets that precede our own forecasts or our own activity." And so the conversion from trend based forecast models to what we call driver based forecast models isn't easy for a lot of organizations to do. And one of the more difficult parts is identifying what are those external data factors from suppliers, from customers, from partners, from competitors, from complementary products and services that are leading indicators of your business. And then recasting those models and executing on them. >> And that's a great point. If you think about COVID and how it's changed things, everything's changed, right? The ideal customer profile has changed, your value proposition to those customers has completely changed. You got to rethink that. 
And of course, it's very hard to predict, even when this thing eventually comes back, some kind of hybrid mode. You used to be selling to people in an office environment. That's obviously changed. There's a lot that's permanent there. And data is potentially at least the forward indicator, the canary in the coal mine. >> Right. It also is the product and service. So not only can it help you and improve your forecasting models, but it can become a product or service that you're offering. Look at us right now, we would generally be face to face and person to person, but we're using video technology to transfer this content. And then one of the things that I... It took me a while to realize, but a couple of months after the COVID shutdown, it occurred to me that even as a consulting organization, Caserta focuses on North America. But the reality is that every consultancy is now a global consultancy because we're all doing business remotely. There are no particular or real strong localization issues for doing consulting today. >> So we talked a lot over the years about the role of the CDO, how it's evolved, how it's changed. Of course, in the early... the pre-title days, it was coming out of a data quality world. And it's still vital. Of course, as we heard today from the Keynote, it's much more public, much more exposed, different public data sources. But the role has certainly evolved, initially into regulated industries like financial, healthcare and government, but now many, many more organizations have a CDO. My understanding is that you're giving a talk on the business case for the CDO. Help us understand that. >> Yeah. So one of the things that we've been doing here for the last couple of years is running an ongoing study of how organizations are impacted by the role of the CDO. And really it's more of a correlation, and looking at what are some of the qualities of organizations that have a CDO or don't have a CDO. 
So some of the things we found is that organizations with a CDO nearly twice as often mention the importance of data and analytics in their annual report. Organizations with a C-level CDO, meaning a true executive, are four times more likely to be using data to transform the business. And when we're talking about using data and advanced analytics, we found that organizations with a CIO, not a CDO, responsible for their data assets are only half as likely to be doing advanced analytics in any way. So there are a number of interesting things that we found about companies that have a CDO and how they operate a bit differently. >> I want to ask you about that. You mentioned the CIO and we're increasingly seeing lines of reporting and peer reporting shift. The sands are shifting a little bit. In the early days the CDO, and still predominantly I think, is an independent organization. We've seen a few cases, and an increasing number, where they're reporting into the CIO. We've seen the same thing by the way with the Chief Information Security Officer, which used to be considered the fox watching the hen house. So we're seeing those shifts. We've also seen the CDO become more aligned with a technical role and sometimes even emerging out of that technical role. >> Yeah. I think the... I don't know, what I've seen more is that the CDOs are emerging from the business, companies are realizing that data is a business asset. It's not an IT asset. There was a time when data was tightly coupled with applications of technologies, but today data is very easily decoupled from those applications and usable in a wider variety of contexts. And for that reason, as data gets recognized as a business, not an IT asset, you want somebody from the business responsible for overseeing that asset. 
Yes, a lot of CDOs still report to the CIO, but increasingly more CDOs you're seeing, and I think you'll see some other surveys from other organizations this week, where the CDOs are more frequently reporting up to the CEO level, meaning they're true executives. I've long advocated for the bifurcation of the IT organization into separate I and T organizations. Again, there's no reason other than for historical purposes to keep the data and technology sides of the organizations so intertwined. >> Well, it makes sense that the Chief Data Officer would have an affinity with the lines of business. And you're seeing a lot of organizations really trying to streamline their data pipeline, their data life cycles, bringing that together, infuse intelligence into that, but also take a systems view and really have the business be intimately involved, if not even own the data. You see a lot of emphasis on self-serve. What are you seeing in terms of that data pipeline or the data life cycle, if you will, that used to be wonky, hardcore techies, but now is really involving a lot more constituents? >> Yeah. Well, the data life cycle used to be somewhat short. The data life cycles, they're longer and they're more a data network than a life cycle or a supply chain. And the reason is that companies are finding alternative uses for their data, not just using it for a single operational purpose or perhaps reporting purpose, but finding that there are new value streams that can be generated from data. There are value streams that can be generated internally. There are a variety of value streams that can be generated externally. So we work with companies to identify what are those variety of value streams? And then test their feasibility, are they ethically feasible? Are they legally feasible? Are they economically feasible? Can they scale? Do you have the technology capabilities? And so we'll run through a process of assessing the ideas that are generated. 
But the bottom line is that companies are realizing that data is an asset. It needs to be not just measured as one and managed as one, but also monetized as an asset. And as we've talked about previously, data has these unique qualities that it can be used over and over again, and it generates more data when you use it. And it can be used simultaneously for multiple purposes. So companies like, you mentioned, Apple and others have built business models based on these unique qualities of data. But I think it's really incumbent upon any organization today to do so as well. >> But when you observe those companies that we talk about all the time, data is at the center of their organization. They maybe put people around that data. That's got to be one of the challenges for many of the incumbents. As we talked about, the data silos, the different standards, different data quality, that's got to be a fairly major blocker for people becoming a "data-driven organization." >> It is, because some organizations were developed as people-driven, product-driven, brand-driven, or other things. To try to convert to becoming data-driven takes a high degree of data literacy or fluency. And I think there'll be a lot of talk about that this week. I'll certainly mention it as well. And so getting the organization to become data fluent and appreciate data as an asset and understand its possibilities and the art of the possible with data, it's a long road. So the culture change that goes along with it is really difficult. And so we're working with a 150-year-old consumer brand right now that wants to become more data-driven and they're very product-driven. And we hear the CIO say, "We want people to understand that we're a data company that just happens to produce this product. We're not a product company that generates data. And once we realize that and start behaving in that fashion, then we'll be able to really win and thrive in our marketplace." 
>> So one of the key roles of a Chief Data Officer is to understand how data affects the monetization of an organization. Obviously there are for-profit companies, or if you're a healthcare organization, saving lives, obviously being profitable as well, or at least staying within the budget, depending upon the structure of the organization. But a lot of people I think oftentimes misunderstand that. It's like, "Okay, do I have to become a data broker? Am I selling data directly?" But I think you've pointed out many times, and you just did, that unlike oil, that's why we don't like that data as the new oil analogy, because it's so much more valuable and can be reused, it doesn't fall because of its scarcity. But what are you finding just in terms of people's application of that notion of monetization? Cutting costs, increasing revenue, what are you seeing in the field? What's that spectrum look like? >> So one of the things I've done over the years is compile a library of hundreds and hundreds of examples of how organizations are using data and analytics in innovative ways. And I have a book in process that hopefully will be out this fall. I'm sharing a number of those inspirational examples. So that's the thing that organizations need to understand is that there are a variety of great examples out there, and they shouldn't just necessarily look to their own industry. There are inspirational examples from other industries as well. Many clients come to me and they ask, "What are others in my industry doing?" And my flippant response to that is, "Why do you want to be in second place or third place? Why not take an idea from another industry, perhaps a digital product company, and apply that to your own business?" But like you mentioned, there are a variety of ways to monetize data. It doesn't involve necessarily selling it. You can deliver analytics, you can report on it, you can use it internally to generate improved business process performance. 
And as long as you're measuring how data's being applied and what its impact is, then you're in a position to claim that you're monetizing it. But if you're not measuring the impact of data on business processes or on customer relationships or partner supplier relationships or anything else, then it's difficult to claim that you're monetizing it. But one of the more interesting ways that we've been working with organizations to monetize their data, certainly in light of GDPR and the California Consumer Privacy Act where I can't sell you my data anymore, is that we've identified ways to monetize your customer data in a couple of ways. One is to synthesize the data, create synthetic data sets that retain the original statistical anomalies in the data or features of the data, but don't actually share any PII. But another interesting way that we've been working with organizations to monetize their data is what I call inverted data monetization, where again, I can't share my customer data with you, but I can share information about your products and services with my customers. And take a referral fee or a commission based on that. So let's say I'm a hospital and I can't sell you my patient data, of course, due to a variety of regulations, but I know who my diabetes patients are, and I can introduce them to your healthy meal plans, to your gym memberships, to your at-home glucose monitoring kits. And again, take a referral fee or a cut of that action. So we're working with customers in the financial services industry and in the healthcare industry on just those kinds of examples. So we've identified hundreds of millions of dollars of incremental value for organizations from data that they were just sitting on. >> Interesting. Doug, because you're a business value strategist at the top, where in the S curve do you see you're able to have the biggest impact? I doubt that you enter organizations where you say, "Oh, they've got it all figured out. 
They can't use my advice." But as well, sometimes in the early stages, you may not be able to have as big of an impact because there's not top-down support or whatever, there's too much technical debt, et cetera. Where are you finding you can have the biggest impact, Doug? >> Generally we don't come in and run those kinds of data monetization or information innovation exercises unless there's some degree of executive support. I've never done that at a lower level, but certainly there are lower level, more immediate and vocational opportunities for data to deliver value through simple analytics. One of the simple examples I give is, I sold a home recently and when you put your house on the market, everybody comes out of the woodwork, the fly-by-night mortgage companies, the moving companies, the box companies, the painters, the landscapers, all know you're moving because your data is in the U.S. and the MLS directory. And it was interesting. The only company that didn't reach out to me was my own bank, and so they lost the opportunity to introduce me to a mortgage, to retain me as a client, introduce me to my new branch, print me new checks, move the stuff in my safe deposit box, all of that. They missed a simple opportunity. And I'm thinking, this doesn't require rocket science to figure out which of your customers are moving. The MLS database, or what you can harvest from Zillow or other sites, is basically public domain data. And I was just thinking, how stupid simple would it have been for them to hire a high school programmer, give him a can of Red Bull and say, "Listen, match our customer database to the MLS database to let us know who's moving on a daily or weekly basis." Some of these solutions are pretty simple. >> So is that part of what you do, come in with just hardcore tactical ideas like that? Are you also doing strategy? Tell me more about how you're spending your time. 
>> I'm trying to think more of a broader approach where we look at the data itself and again, people have said, "If you tortured it enough, what would it tell us?" We just take that angle. We look at examples of how other organizations have monetized data and think about how to apply those and adapt those ideas to the company's own business. We look at key business drivers, internally and externally. We look at edge cases for their customers' businesses. We run through hypothesis generating activities. There are a variety of different kinds of activities that we do to generate ideas. And most of the time when we run these workshops, which last a week or two, we'll end up generating anywhere from 35 to 50 pretty solid ideas for generating new value streams from data. So when we talk about monetizing data, that's what we mean, generating new value streams. But like I said, then the next step is to go through that feasibility assessment and determining which of these ideas you actually want to pursue. >> So you're of course the longtime industry watcher as well, as a former Gartner Analyst, you have to be. My question is, if I think back... I've been around a while. If I think back at the peak of Microsoft's prominence in the PC era, it was like Windows 95 and you felt like, "Wow, Microsoft is just so strong." And then of course Linux comes along and a lot of open source changes and lo and behold, a whole new set of leaders emerges. And you see the same thing today with the Trillionaire's Club and you feel like, "Wow, even COVID has been a tailwind for them." But you think about, "Okay, where could the disruption come to these large players that own huge clouds, they have all the data?" Is data potentially a disruptor for what appear to be insurmountable odds against the newbies? >> There's always people coming up with new ways to leverage data or new sources of data to capture. 
So yeah, they're certainly not going to be around forever, but it's been really fascinating to see the transformation of some companies. I think nobody really exemplifies it more than IBM, where they emerged from originally selling meat slicers. The Dayton Meat Slicer was their original product. And then they evolved into Manual Business Machines and then Electronic Business Machines. And then they dominated that. Then they dominated the mainframe software industry. Then they dominated the PC industry. Then they dominated the services industry to some degree. And so they're starting to get into data. And I think following that trajectory is something that really any organization should be looking at. When do you actually become a data company? Not just a product company or a service company or top. >> We have Inderpal Bhandari as one of our huge guests here. He's a Chief-- >> Sure. >> Data Officer of IBM, you know him well. And he talks about the journey that he's undertaken to transform the company into a data company. I think a lot of people don't really realize what's actually going on behind the scenes, whether it's financially oriented or revenue opportunities. But one of the things he stressed to me in our interview was that on average, they're reducing the end-to-end cycle time from raw data to insights by 70%, that's on average. And that's just an enormous, for a company that size, it's just enormous cost savings or revenue generating opportunity. >> There's no doubt that the technology behind data pipelines is improving and the process from moving data from those pipelines directly into predictive or diagnostic or prescriptive output is a lot more accelerated than the early days of data warehousing. >> Is the skills barrier acute? It seems like it's lessened somewhat. The early Hadoop days you needed... Even data scientist... Is it still just a massive skill shortage, or are we starting to attack that? 
>> Well, I think companies are figuring out a way around the skill shortage by doing things like self service analytics and focusing on more easy to use mainstream type AI or advanced analytics technologies. But there's still very much a need for data scientists and organizations and the difficulty in finding people that are true data scientists. There's no real certification. And so really anybody can call themselves a data scientist but I think companies are getting good at interviewing and determining whether somebody's got the goods or not. But there are other types of skills that we don't really focus on, like the data engineering skills, there's still a huge need for data engineering. Data doesn't self-organize. There are some augmented analytics technologies that will automatically generate analytic output, but there really aren't technologies that automatically self-organize data. And so there's a huge need for data engineers. And then as we talked about, there's a large interest in external data and harvesting that and then ingesting it and even identifying what external data is out there. So one of the emerging roles that we're seeing, if not the sexiest role of the 21st century is the role of the Data Curator, somebody who acts as a librarian, identifying external data assets that are potentially valuable, testing them, evaluating them, negotiating and then figuring out how to ingest that data. So I think that's a really important role for an organization to have. Most companies have an entire department that procures office supplies, but they don't have anybody who's procuring data supplies. And when you think about which is more valuable to an organization? How do you not have somebody who's dedicated to identifying the world of external data assets that are out there? There are 10 million data sets published by government, organizations and NGOs. There are thousands and thousands of data brokers aggregating and sharing data. 
There's web content that can be harvested, there's data from your partners and suppliers, there's data from social media. So to not have somebody who's on top of all that, it demonstrates gross negligence by the organization. >> That is such an enlightening point, Doug. My last question is, I wonder how... If you can share with us how the pandemic has affected your business personally. As a consultant, you're on the road a lot, obviously not on the road so much, you're doing a lot of chalk talks, et cetera. How have you managed through this and how have you been able to maintain your efficacy with your clients? >> Most of our clients, given that they're in the digital world a bit already, made the switch pretty quick. Some of them took a month or two, some things went on hold but we're still seeing the same level of enthusiasm for data and doing things with data. In fact some companies have taken our (mumbles) that data to be their best defense in a crisis like this. It's affected our business and it's enabled us to do much more international work more easily than we used to. And I probably spend a lot less time on planes. So it gives me more time for writing and speaking and actually doing consulting. So that's been nice as well. >> Yeah, there's that bonus. Obviously theCUBE, yes, we're not doing physical events anymore, but hey, we've got two studios operating. And Doug Laney, really appreciate you coming on. (Doug mumbles) Always a great guest and sharing your insights and have a great MIT CDOIQ. >> Thanks, you too, Dave, take care. (mumbles) >> Thanks Doug. All right. And thank you everybody for watching. This is Dave Vellante for theCUBE, our continuous coverage of the MIT Chief Data Officer conference, MIT CDOIQ. We'll be right back, right after this short break. (bright music)

Published Date : Sep 3 2020


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Doug Laney	PERSON	0.99+
United Airlines	ORGANIZATION	0.99+
American Airlines	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Doug	PERSON	0.99+
thousands	QUANTITY	0.99+
hundreds	QUANTITY	0.99+
Cambridge	LOCATION	0.99+
21st century	DATE	0.99+
10 million	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
70%	QUANTITY	0.99+
Inderpal Bhandari	PERSON	0.99+
two trillion	QUANTITY	0.99+
windows 95	TITLE	0.99+
North America	LOCATION	0.99+
one	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
U.S.	LOCATION	0.99+
a month	QUANTITY	0.99+
35	QUANTITY	0.99+
two	QUANTITY	0.99+
third place	QUANTITY	0.99+
One	QUANTITY	0.99+
MLS	ORGANIZATION	0.98+
two studios	QUANTITY	0.98+
MIT CDOIQ 2020	EVENT	0.98+
Trillionaire's Club	ORGANIZATION	0.98+
today	DATE	0.98+
this week	DATE	0.98+
Tang Center	LOCATION	0.98+
California consumer privacy act	TITLE	0.97+
second place	QUANTITY	0.97+
Linux	TITLE	0.97+
COVID	EVENT	0.97+
Gartner	ORGANIZATION	0.97+
Zillow	ORGANIZATION	0.97+
50	QUANTITY	0.97+
GDPR	TITLE	0.97+
CUBE	ORGANIZATION	0.97+
this year	DATE	0.97+
MIT Chief Data Officer	EVENT	0.96+
theCUBE	ORGANIZATION	0.95+
a week	QUANTITY	0.94+
single	QUANTITY	0.94+
Caserta	ORGANIZATION	0.93+
four times	QUANTITY	0.92+
COVID	OTHER	0.92+
pandemic	EVENT	0.92+
2020	DATE	0.91+
hundreds of millions of dollars	QUANTITY	0.86+
150 year old	QUANTITY	0.86+
this fall	DATE	0.85+
MIT CDOIQ	EVENT	0.85+
last couple of years	DATE	0.84+
four profit companies	QUANTITY	0.84+
COVID	ORGANIZATION	0.82+
Dough	PERSON	0.78+
Keynote	EVENT	0.77+

Krishna Cheriath, Bristol Myers Squibb | MITCDOIQ 2020

>> From the Cube Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a Cube Conversation. >> Hi everyone, this is Dave Vellante and welcome back to the Cube's coverage of the MIT CDOIQ. God, we've been covering this show since probably 2013, really trying to understand the intersection of data and organizations and data quality and how that's evolved over time. And with me to discuss these issues is Krishna Cheriath, who's the Vice President and Chief Data Officer, Bristol-Myers Squibb. Krishna, great to see you, thanks so much for coming on. >> Thank you so much Dave for the invite, I'm looking forward to it. >> Yeah, first of all, how are things in your part of the world? You're in New Jersey, I'm also on the East coast, how are you guys making out? >> Yeah, I think these are unprecedented times all around the globe and whether it is from a company perspective or a personal standpoint, how do you manage your life, how do you manage your work in these unprecedented COVID-19 times has been a very interesting challenge. And to me, what is most amazing has been, I've seen humanity rise up, and so, too, our company has sort of snapped to be able to manage our work so that the important medicines that have to be delivered to our patients are delivered on time. So really proud about how we have done as a company and of course, personally, it has been an interesting journey with my kids from college, remote learning, wife working from home. So I'm very lucky and blessed to be safe and healthy at this time. So hopefully the people listening to this conversation are finding that they are able to manage through their lives as well. >> Obviously Bristol-Myers Squibb, very, very strong business. You guys just recently announced your quarter. There's a biologics facility near me in Devens, Massachusetts, I drive by it all the time, it's a beautiful facility actually. 
But extremely broad portfolio, obviously some COVID impact, but you're managing through that very, very well. If I understand it correctly, you're taking a collaborative approach to a COVID vaccine, you're now bringing people physically back to work, you've been very planful about that. My question is from your standpoint, what role did you play in that whole COVID response and what role did data play? >> Yeah, I think it's a two-part answer. As you rightly pointed out, at Bristol-Myers Squibb, we have been an active partner on the overall scientific ecosystem supporting many different targets, that is, from many different companies I think. Across biopharmaceuticals, there's been a healthy convergence of scientific innovation to see how can we solve this together. And Bristol-Myers Squibb has been an active participant as our CEO, as well as our Chief Medical Officer and Head of Research have articulated publicly. Within the company itself, from a data and technology standpoint, data and digital is core to the response from a company standpoint to COVID-19, how do we ensure that our work continues when the entire global workforce pivots to a kind of a remote setting. So that really calls on the digital infrastructure to rise to the challenge, to enable a complete global workforce. And I mean workforce, it is not just employees of the company but all of the third-party partners and others that we work with, the whole ecosystem needs to work. And I think our digital infrastructure has proven to be extremely resilient in that. From a data perspective, I think it is twofold. One is how does the core book of business of data continue to drive forward to make sure that our company's key priorities are being advanced. Secondarily, we've been partnering with the research and development organization as well as the medical organization to look at what kind of real-world data insights can really help in answering the many questions around COVID-19. 
So I think it is twofold. Main summary: one is, how do we ensure that the data and digital infrastructure of the company continues to operate in a way that allows us to progress the company's mission even during a time when globally, we have been switched to a remote working force, except for some essential staff from a lab and manufacturing standpoint. And secondarily is how do we look at the real-world evidence as well as the scientific data to be a good partner with other companies to look at progressing the societal innovations needed for this. >> I think it's a really prudent approach because let's face it, sometimes one shot all vaccine can be like playing roulette. So you guys are both managing your risk and just as I say, financially, a very, very successful company in a sound approach. I want to ask you about your organization. We've interviewed many, many Chief Data Officers over the years, and there seems to be some fuzziness as to the organizational structure. It's very clear with you, you report in to the CIO, you came out of a technical background, you have a technical degree but you also of course have a business degree. So you're dangerous from that standpoint. You got both sides, which is critical, I would think, in your role, but let's start with the organizational reporting structure. How did that come about and what are the benefits of reporting into the CIO? >> I think the genesis for that at Bristol-Myers Squibb, and when I say Bristol-Myers Squibb, the new Bristol-Myers Squibb is a combination of Heritage Bristol-Myers Squibb and Heritage Celgene after the Celgene acquisition last November. So in the Heritage Bristol-Myers Squibb acquisition, we came to a conclusion that in order for BMS to be able to fully capitalize on our scientific innovation potential as well as to drive data-driven decisions across the company, having a robust data agenda is key. Now the question is, how do you progress that? 
Historically, we had approached a very decentralized mechanism that made a different data constituencies. We didn't have a formal role of a Chief Data Officer up until 2018 or so. So coming from that realization that we need to have an effective data agenda to drive forward the necessary data-driven innovations from an analytic standpoint. And equally importantly, from optimizing our execution, we came to the conclusion that we need an enterprise-level data organization, we need to have a first among equals if you will, to be mandated by the CEO, his leadership team, to be the kind of an orchestrator of a data agenda for the company, because a data agenda cannot be done individually by a singular CDO. It has to be done in partnership with many stakeholders, business, technology, analytics, et cetera. So from that came this notion that we need an enterprise-wide data organization. So we started there. So for a while, I would joke around that I had all of the accountabilities of the CDO without the lofty title. So this journey started around 2016, where we created an enterprise-wide data organization. And we made a very conscious choice of separating the data organization from analytics. And the reason we did that is when we look at the whole of Bristol-Myers Squibb, analytics for example, is core and part of our scientific discovery process, research, our clinical development, all of them have deep data science and analytics embedded in it. But we also have other analytics whether it is part of our sales and marketing, whether it is part of our finance and our enabling functions, they cut all across global procurement et cetera. So the world of analytics is very broad. BMS did a separation between the world of analytics and the world of data. Analytics at BMS is in two modes. There is a central analytics organization called Business Insights and Analytics that drives most of the enterprise-level analytics. 
But then we have embedded analytics in our business areas, which is research and development, manufacturing and supply chain, et cetera, to drive what needs to be closer to the business idea. And the reason for separating that out and having a separate data organization is that none of these analytic aspirations or the business aspirations from data will be met if the world of data is, you don't have the right level of data available, the velocity of data is not appropriate for the use cases, the quality of data is not great or the control of the data. So that we are using the data for the right intent, meeting the compliance and regulatory expectations around the data is met. So that's why we separated out that data world from the analytics world, which is a little bit of a unique construct for us compared to what we see generally in the world of CDOs. And from that standpoint, then the decision was taken to make that report for global CIO. At Bristol-Myers Squibb, they have a very strong CIO organization and IT organization. When I say strong, it is from this lens standpoint. A, it is centralized, we have centralized the budget as well as we have centralized the execution across the enterprise. And the CDO reporting to the CIO with that data-specific agenda, has a lot of value in being able to connect the world of data with the world of technology. So at BMS, their Chief Data Officer organization is a combination of traditional CDO-type accountabilities like data risk management, data governance, data stewardship, but also all of the related technologies around master data management, data lake, data and analytic engineering and a nascent AI data and technology lab. 
So that construct allows us to be a true enterprise horizontal, supporting analytics, whether it is done in a central analytics organization or embedded analytics teams in the business area, but also equally importantly, focus on the world of data from an operational execution standpoint, how do we optimize data to drive operational effectiveness? So that's the construct that we have where the CDO reports to the CIO, data organization separated from analytics to really focus around the availability but also the quality and control of data. And the last nuance is that at BMS, the Chief Data Officer organization is also accountable to be the Data Protection Office. So we orchestrate and facilitate all privacy-related actions across because that allows us to make sure that all personal data that is collected, managed and consumed, meets all of the various privacy standards across the world, as well as our own commitments as a company from a compliance principles standpoint. >> So that makes a lot of sense to me and thank you for that description. You're not getting in the way of R&D and the scientists, they know data science, they don't need really your help. I mean, they need to innovate at their own pace, but the balance of the business really does need your innovation, and that's really where it seems like you're focused. You mentioned master data management, data lakes, data engineering, et cetera. So your responsibility is for that enterprise data lifecycle to support the business side of things, and I wonder if you could talk a little bit about that and how that's evolved. 
I mean, a lot has changed from the old days of data warehouse and cumbersome ETL, and you mentioned data lakes, many of those have been challenging, expensive, slow, but now we're entering this era of cloud, real-time, a lot of machine intelligence, and I wonder if you could talk about the changes there and how you're looking at and thinking about the data lifecycle and accelerating the time to insights. >> Yeah, I think the way we think about it, we as an organization in our strategy and tactics, think of this as a data supply chain. The supply chain of data to drive business value, whether it is through insights and analytics or through operational execution. When you think about it from that standpoint, then we need to get many elements of that into an effective stage. This could be the technologies that are part of that data supply chain, you referenced some of them, the master data management platforms, data lake platforms, the analytics and reporting capabilities and business intelligence capabilities that plug into a data backbone, which is, I would say, the technology swim lane that we need to get right. Along with that, what we also need to get right for that effective data supply chain is that data layer. That is, how do you make sure that there is the right data navigation capability, how do you make sure that we have the right ontology mapping and the understanding around the data? How do we have data navigation? It is something that we have invested very heavily in. So imagine a new employee joining BMS, any organization our size has a pretty wide technology ecosystem and data ecosystem. How do you navigate that, how do we find the data? Data discovery has been a key focus for us. So for an effective data supply chain, we knew that, and we have instituted our roadmap to make sure that we have a robust technology orchestration of it, but equally important is an effective data operations orchestration. 
Both need to go hand in hand for us to be able to make sure that that supply chain is effective from a business use case and analytic use standpoint. So that has led us on a journey from a cloud perspective, since you referred to that in your question: we have invested very heavily to move from a very disparate set of data ecosystems to a more converged cloud-based data backbone. That has been a big focus at BMS since 2016, whether it is from a research and development standpoint or from commercialization, which is our word for sales and marketing, or manufacturing and supply chain and HR, et cetera. How do we create a converged data backbone that allows us to use that data as a resource to drive many different consumption patterns? Because when you imagine an enterprise of our size, we have many different consumers of the data. So those consumers have different consumption needs. You have a deep data science population who just need access to the data, and they have data science platforms but they are advanced programmers as well, to the other end of the spectrum where executives need pre-packaged KPIs. So the effective orchestration of the data ecosystem at BMS through a data supply chain and the data backbone does a couple of things for us. One, it drives productivity of our data consumers, the scientific researchers, analytic community or other operational staff. And second, in a world where we need to make sure that the data consumption upholds ethical standards as well as privacy and other regulatory expectations, we are able to build into our systems and processes the necessary controls to make sure that the consumption and the use of data meets our highest trust standards. >> That makes a lot of sense. I mean, converging your data like that, people always talk about stove pipes. I know it's kind of a bromide but it's true, and it allows you to sort of inject consistent policies. What about automation? 
How has that affected your data pipeline recently and on your journey with things like data classification and the like? >> I think in pursuing a broad data automation journey, one of the things that we did was to operate at two different speed points. Historically, data organizations have been bundled with long-running data infrastructure programs. By the time you complete them, the business context has moved on and the organization leaders are also exhausted from having to wait for these massive programs to reach their full potential. So what we did very intentionally in our data automation journey is to organize ourselves in two speed dimensions. First, a concept called Rapid Data Lab. The idea is that, recognizing the reality that the data is not well automated and orchestrated today, we need a SWAT team of data engineers and data SMEs to partner with consumers of data to make sure that we can make effective data supply chain decisions here and now, and enable the business to answer questions of today. Simultaneously, in a longer time horizon, we need to do the necessary work of moving the data automation to a better footprint. So enterprise data lake investments, where we built services based on AWS, which we had chosen as the cloud backbone for data. So how do we use the AWS services? How do we wrap around it with the necessary capabilities so that we have a consistent reference and technical architecture to drive the many different function journeys? So we organized ourselves into two speed dimensions: the Rapid Data Lab teams focus on partnering with the consumers of data to help them with data automation needs here and now, and then a secondary team is focused on the convergence of data into a better cloud-based data backbone. So that allowed us to, one, make an impact here and now and deliver value from data to the business here and now. 
Secondly, we also learned a lot from actually partnering with consumers of data on what needs to get adjusted over a period of time in our automation journey. >> It makes sense, I mean again, that whole notion of converged data, putting data at the core of your business. You brought up AWS, I wonder if I could ask you a question. You don't have to comment on specific vendors, but there's a conversation we have in our community. You have AWS, a huge platform, tons of partners, a lot of innovation going on, and you see innovation in areas like the cloud data warehouse or data science tooling, et cetera, all components of that data pipeline. As well, you have AWS with its own tooling around there. So a question we often have in the community is, will technologists and technology buyers go for kind of best of breed and cobble together different services, or would they prefer to have sort of the convenience of a bundled service from an AWS or a Microsoft or Google, or maybe they even go best of breed for all cloud. Can you comment on that, what's your thinking? >> I think, especially for organizations of our size and breadth, having a converged, convenient all-of-the-above from a single provider does not seem practical or feasible, for a couple of reasons. One, the heterogeneity of the data, the heterogeneity of consumption of the data, and we are yet to find a single stack provider who can meet all of the different needs. So I am more in the best of breed camp with a few caveats, a hybrid best of breed, if you will. It is important to have a converged data backbone for the enterprise. And so whether you invest in a singular cloud or private cloud or a combination, you need to have a clear, intentional strategy around where you are going to host the data and how the data is going to be organized. But you could have a lot more flexibility in the consumption of data. So once you have the data converged, in our case, we converged on an AWS-based backbone. 
We allow many different consumptions of the data, because in the analytic and insights layer, the data science community within R&D is different from a data science community in the supply chain context, we have business intelligence needs, we have catered needs, and then there are other data needs that need to be funneled into software-as-a-service platforms like the Salesforces of the world, to be able to drive operational execution as well. So when you look at it from that context, having a hybrid model of best of breed, where you have a lot more convergence from a data backbone standpoint but then allow for best of breed from an analytic and consumption of data standpoint, is more where my heart and my brain is. >> I know a lot of companies would be excited to hear that answer, but I love it because it fosters competition and innovation. I wish I could talk to you forever, but you made me think of another question, which is around self-serve. On your journey, are you at the point where you can deliver self-serve to the lines of business? Is that something that you're trying to get to? >> Yeah, I think it is. Self-serve is an absolutely important point, because I think the traditional boundary between what you consider classical IT and classical business is gray. I think there is an important gray area in the middle where you have deep citizen data scientists in the business community who really need to be able to have access to the data and have advanced data science and programming skills. So self-serve is important, but in that, companies need to be very intentional and very conscious of making sure that they are allowing that self-serve in a safe containment zone. Because at the end of the day, whether it is a cyber risk or data risk or technology risk, it's all real. 
So we need to have a balanced approach between promoting, whether you call it data democratization or whether you call it self-serve, and making sure that you're meeting the right risk mitigation strategy. So our focus is to say, how do we promote self-serve for the communities that need self-serve, where they have deeper levels of access? How do we set up the right safe zones for those, which may be the appropriate mitigation from a cyber risk or data risk or technology risk standpoint? >> Security pieces, again, you keep bringing up topics that I could talk to you forever on, but I heard on TV the other night, I heard somebody talking about how COVID has affected, because of remote access, affected security. And it's like hey, give everybody access. That was sort of the initial knee-jerk response, but the example they gave was, well, if your parents go out of town and the kid has a party, you may have some people show up that you don't want to show up. And so, same issue with remote working, work from home. Clearly you guys have had to pivot to support that, but where does the security organization fit? Does that report separately alongside the CIO? Does it report into the CIO? Are they sort of peers of yours, how does that all work? >> Yeah, I think at Bristol-Myers Squibb, we have a Chief Information Security Officer who is a peer of mine, who also reports to the global CIO. The CDO and the CISO are effective partners and are two sides of the coin in trying to advance a total risk mitigation strategy, whether it is from a cyber risk standpoint, which is the focus of the Chief Information Security Officer, or whether it is the general data consumption risk. And that is the focus for a Chief Data Officer in the capacity that I have. And together, those are two sides of a coin that the CIO needs to be accountable for. 
So I think that's how we have orchestrated it, because I think it is important in these worlds where you want to be able to drive data-driven innovation but you want to be able to do that in a way that doesn't open the company to unwanted risk exposures as well. And that is always a delicate balancing act, because if you index too much on risk and high levels of security and control, then you could lose productivity. But if you index too much on productivity, collaboration and open access to data, it opens up the company for risks. So it is a delicate balance between the two. >> Increasingly, we're seeing that reporting structure evolve and coalesce, I think it makes a lot of sense. I felt like at some point you had too many seats at the executive leadership table, too many kind of competing agendas. And now in your structure, the CIO is obviously a very important position. I'm sure has a seat at the leadership table, but also has the responsibility for managing that sort of data as an asset versus a liability, which, in my view, has always been sort of the role of the Head of Information. I want to ask you, I want to hit the Escape key a little bit and ask you about data as a resource. You hear a lot of people talk about data is the new oil. We often say data is more valuable than oil because you can use it, it doesn't follow the laws of scarcity. You could use data in an infinite number of places. You can only put oil in your car or your house. How do you think about data as a resource today and going forward? >> Yeah, I think the data as the new oil paradigm, in my opinion, was an unhealthy one, and it prompts different types of conversations around that. I think for certain companies, data is indeed an asset. If you're a company that is focused on information products and data products and that is core to your business, then of course there's monetization of data and then data as an asset, just like any other assets on the company's balance sheet. 
But for many enterprises to further their mission, I think considering data as a resource is a better focus. So as a vital resource for the company, you need to make sure that there is an appropriate caring and feeding for it, there is an appropriate management of the resource and an appropriate evolution of the resource. So that's how I would like to consider it, it is a personal, N-of-one perspective, that data as a resource that can power the mission of the company, the new products and services, I think that's a good, healthy way to look at it. At the center of it though, a lot of strategies, whether people talk about a digital strategy, whether people talk about data strategy, what is important is for a company to have a true north star around what is the core mission of the company and what is the core strategy of the company. For Bristol-Myers Squibb, we are about transforming patients' lives through science. And we think about digital and data as key value levers and drivers of that strategy. So digital for the sake of digital or data strategy for the sake of data strategy is meaningless in my opinion. We are focused on making sure that data and digital are an accelerant and a value lever for the company's mission and company strategy. So that's why thinking about data as a resource, as a key resource for our scientific researchers or a key resource for our manufacturing team or a key resource for our sales and marketing, allows us to think about the actions and the strategies and tactics we need to deploy to make that effective. >> Yeah, that makes a lot of sense, you're constantly using that North star as your guideline and how data contributes to that mission. Krishna Cheriath, thanks so much for coming on the Cube and supporting the MIT Chief Data Officer community, it was a real pleasure having you. >> Thank you so much, Dave, hopefully you and the audience are safe and healthy during these times. 
>> Thank you for that and thank you for watching everybody. This is Vellante for the Cube's coverage of the MIT CDOIQ Conference 2020 gone virtual. Keep it right there, we'll be right back right after this short break. (lively upbeat music)

Published Date : Sep 3 2020

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Bristol-Myers Squibb	ORGANIZATION	0.99+
New Jersey	LOCATION	0.99+
AWS	ORGANIZATION	0.99+
Devon	LOCATION	0.99+
Palo Alto	LOCATION	0.99+
Rapid Data Lab	ORGANIZATION	0.99+
2013	DATE	0.99+
Krishna Cheriath	PERSON	0.99+
two sides	QUANTITY	0.99+
two	QUANTITY	0.99+
COVID-19	OTHER	0.99+
Celgene	ORGANIZATION	0.99+
First	QUANTITY	0.99+
Cube	ORGANIZATION	0.99+
Krishna	PERSON	0.99+
Heritage Bristol-Myers Squibb	ORGANIZATION	0.99+
2018	DATE	0.99+
both sides	QUANTITY	0.99+
Both	QUANTITY	0.98+
Boston	LOCATION	0.98+
2016	DATE	0.98+
CDO	TITLE	0.98+
two modes	QUANTITY	0.98+
COVID	OTHER	0.98+
first	QUANTITY	0.98+
Bristol-Myers Squibb	ORGANIZATION	0.98+
last November	DATE	0.98+
Data Protection Office	ORGANIZATION	0.98+
One	QUANTITY	0.98+
two part	QUANTITY	0.98+
Secondly	QUANTITY	0.98+
second	QUANTITY	0.98+
MIT	ORGANIZATION	0.98+
both	QUANTITY	0.98+
MIT CDOIQ Conference 2020	EVENT	0.97+
Heritage Celgene	ORGANIZATION	0.97+
one	QUANTITY	0.97+
COVID-19 times	OTHER	0.96+
today	DATE	0.96+
BMS	ORGANIZATION	0.96+
single provider	QUANTITY	0.95+
single stack	QUANTITY	0.93+
Bristol Myers Squibb	PERSON	0.93+
one shot	QUANTITY	0.92+
Cube Studios	ORGANIZATION	0.9+
one perspective	QUANTITY	0.9+
Bristol-Myers	ORGANIZATION	0.9+
Business Insights	ORGANIZATION	0.89+
two speed	QUANTITY	0.89+
twofold	QUANTITY	0.84+
secondary	QUANTITY	0.8+
Secondarily	QUANTITY	0.77+
MIT CDOIQ	ORGANIZATION	0.76+
Massachusetts	LOCATION	0.75+
MITCDOIQ 2020	EVENT	0.74+
Vellante	PERSON	0.72+
Data	PERSON	0.71+
Chief Data Officer	PERSON	0.61+

Susan Wilson, Informatica & Blake Andrews, New York Life | MIT CDOIQ 2019

(techno music) >> From Cambridge, Massachusetts, it's theCUBE. Covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts everybody, we're here with theCUBE at the MIT Chief Data Officer Information Quality Conference. I'm Dave Vellante with my co-host Paul Gillin. Susan Wilson is here, she's the vice president of data governance and she's the leader at Informatica. Blake Andrews is the corporate vice president of data governance at New York Life. Folks, welcome to theCUBE, thanks for coming on. >> Thank you. >> Thank you. >> So, Susan, interesting title; VP, data governance leader, Informatica. So, what are you leading at Informatica? >> We're helping our customers realize their business outcomes and objectives. Prior to joining Informatica about 7 years ago, I was actually a customer myself, and so oftentimes I'm working with our customers to understand where they are, where they're going, and how to best help them; because we recognize data governance is more than just a tool, it's a capability that represents people, the processes, the culture, as well as the technology. >> Yeah so you've walked the walk, and you can empathize with what your customers are going through. And Blake, your role, as the corporate VP, but more specifically the data governance lead. >> Right, so I lead the data governance capabilities and execution group at New York Life. We're focused on providing skills and tools that enable governance activities across the enterprise at the company. >> How long has that function been in place? >> We've been in place for about two and a half years now. >> So, I don't know if you guys heard Mark Ramsey this morning, the keynote, but basically he said, okay, we started with enterprise data warehouse, we went to master data management, then we kind of did this top-down enterprise data model; that all failed. So we said, all right, let's punt to governance. 
Here you go guys, you fix our corporate data problem. Now, right tool for the right job but, and so, we were sort of joking, did data governance fail? No, you always have to have data governance. It's like brushing your teeth. But so, like I said, I don't know if you heard that, but what are your thoughts on that sort of evolution that he described? As sort of, failures of things like EDW to live up to expectations and then, okay guys, over to you. Is that a common theme? >> It is a common theme, and what we're finding with many of our customers is that they had tried many of the, if you will, the methodologies around data governance, right? Around policies and structures. And we describe this as the journey from Data 1.0, which was more application-centric reporting, to Data 2.0, data warehousing, and a lot of the failed attempts, if you will, at centralizing all of your data, to now Data 3.0, where we look at the explosion of data, the volumes of data, the number of data consumers, the expectations of the chief data officer to solve business outcomes; crushing under the scale of, I can't fit all of this into a centralized data repository, I need something that will help me scale and become more agile. And so, that message does resonate with us, but we're not saying data warehouses don't exist. They absolutely do for trusted data sources, but the ability to be agile and to address many of your organization's needs and to be able to service multiple consumers is top-of-mind for many of our customers. >> And the mindset from 1.0 to 2.0 to 3.0 has changed. From, you know, data as a liability, to now data as this massive asset. It's sort of-- >> Value, yeah. >> Yeah, and the pendulum has swung. It's almost like a see-saw. Where, and I'm not sure it's ever going to flip back, but it is to a certain extent; people are starting to realize, wow, we have to be careful about what we do with our data. But still, it's go, go, go. 
But, what's the experience at New York Life? I mean, you know. A company that's been around for a long time, conservative, wants to make sure, risk averse, obviously. >> Right. >> But at the same time, you want to keep moving as the market moves. >> Right, and we look at data governance as really an enabler and a value-add activity. We're not a governance practice for the sake of governance. We're not there to create a lot of policies and restrictions. We're there to add value and to enable innovation in our business and really drive that execution, that efficiency. >> So how do you do that? Square that circle for me, because a lot of people think, when people think security and governance and compliance they think, oh, that stifles innovation. How do you make governance an engine of innovation? >> You provide transparency around your data. So, it's transparency around, what does the data mean? What data assets do we have? Where can I find that? Where are my most trusted sources of data? What does the quality of that data look like? So all those things together really enable your data consumers to take that information and create new value for the company. So it's really about enabling your value creators throughout the organization. >> So data is an ingredient. I can tell you where it is, I can give you some kind of rating as to the quality of that data and its usefulness. And then you can take it and do what you need to do with it in your specific line of business. >> That's right. >> Now you said you've been at this two and a half years, so what stages have you gone through since you first began the data governance initiative? >> Sure, so our first year, year and a half was really focused on building the foundations, establishing the playbook for data governance and building our processes and understanding how data governance needed to be implemented to fit New York Life and the culture of the company. 
The last twelve months or so has really been focused on operationalizing governance. So we've got the foundations in place, now it's about implementing tools to further augment those capabilities and help assist our data stewards and give them a better skill set and a better tool set to do their jobs. >> Are you, sort of, crowdsourcing the process? I mean, you have a defined set of people who are responsible for governance, or is everyone taking a role? >> So, it is a two-pronged approach, we do have dedicated data stewards. There are approximately 15 across various lines of business throughout the company. But, we are building towards a data democratization aspect. So, we want people to be self-sufficient in finding the data that they need and understanding the data. And then, when they have questions, relying on our stewards as a network of subject matter experts who also have some authorizations to make changes and adapt the data as needed. >> Susan, one of the challenges that we see is that the chief data officers oftentimes are not involved in some of these skunkworks AI projects. They're sort of either hidden, maybe not even hidden, but they're in the line of business, they're moving. You know, there's a mentality of move fast and break things. The challenge with AI is, if you start operationalizing AI and you're breaking things without data quality, without data governance, you can really affect lives. We've seen it. It's one of these unintended consequences. I mean, Facebook is the obvious example and there are many, many others. But, are you seeing that? How are you seeing organizations dealing with that problem? >> As Blake was mentioning, oftentimes what it is about, you've got to start with transparency, and you've got to start with collaborating across your lines of business, including the data scientists, and including in terms of what they are doing. And actually provide that level of transparency, provide a level of collaboration. 
And a lot of that is through the use of our technology enablers to basically go out and find where the data is and what people are using and to be able to provide a mechanism for them to collaborate in terms of, hey, how do I get access to that? I didn't realize you were the SME for that particular component. And then also, did you realize that there is a policy associated to the data that you're managing and it can't be shared externally or with certain consumer data sets. So, the objective really is around how to create a platform to ensure that anyone in your organization, whether I'm in the line of business and don't have a technical background, or someone who does have a technical background, can come and access and understand that information and connect with their peers. >> So you're helping them to discover the data. What do you do at that stage? >> What we do at that stage is, creating insights for anyone in the organization to understand it from an impact analysis perspective. So, for example, if I'm going to make changes, as well as discovery. Where exactly is my information? And so we have-- >> Right. How do you help your customers discover that data? >> Through machine learning and artificial intelligence capabilities of, specifically, our data catalog, that allows us to do that. So we use things like similarity-based matching, which help us to identify. The column doesn't have to be named for what it holds; it could be named miscellaneous text one. But, in our ability to scan and discover, we can identify in that column what is potentially a Social Security number. It might have resided there for years, but you may not realize that it's still stored there. Our ability to identify that and report that out to the data stewards as well as the data analysts, as well as to the privacy individuals, is critical. 
So, with that being said, then they can actually identify the appropriate policies that need to be adhered to, along with it in terms of quality, in terms of, is there something that we need to archive. So that's where we're helping our customers in that aspect. >> So you can infer from the data, the metadata, and then, with a fair degree of accuracy, categorize it and automate that. >> Exactly. We've got a customer that actually ran this and they said that, you know, we took three people, three months to actually physically tag where all this information existed across something like 7,000 critical data elements. And, basically, after the setup and the scanning procedures, within seconds we were able to get within 90% precision. Because, again, we've dealt a lot with metadata. It's core to our artificial intelligence and machine learning. And it's core to how we built out our platforms to share that metadata, to do something with that metadata. It's not just about sharing the glossary and the definition information. We also want to automate and reduce the manual burden. Because we recognize with that scale, manual documentation, manual cataloging and tagging just, >> It doesn't work. >> It doesn't work. It doesn't scale. >> Humans are bad at it. >> They're horrible at it. >> So I presume you have a chief data officer at New York Life, is that correct? >> We have a chief data and analytics officer, yes. >> Okay, and you work within that group? >> Yes, that is correct. >> Do you report into that? >> Yes, so-- >> And that individual, yeah, describe the organization. >> So that sits in our lines of business. Originally, our data governance office sat in technology. And then, in early 2018 we actually re-orged into the business under the chief data and analytics officer when that role was formed. 
So we sit under that group along with a data solutions and governance team that includes several of our data stewards and also some others, some data engineer-type roles. And then, our center for data science and analytics as well that contains a lot of our data science teams in that type of work. >> So in thinking about some of these, I was describing to Susan, as these skunkworks projects, is the data team, the chief data officer's team involved in those projects or is it sort of a, go run water through the pipes, get an MVP and then you guys come in. How does that all work? >> We're working to try to centralize that function as much as we can, because we do believe there's value in the left hand knowing what the right hand is doing in those types of things. So we're trying to build those communications channels and build that network of data consumers across the organization. >> It's hard, right? >> It is. >> Because the line of business wants to move fast, and you're saying, hey, we can help. And they think you're going to slow them down, but in fact, you got to make the case and show the success because you're actually not going to slow them down in terms of the ultimate outcome. I think that's the case that you're trying to make, right? >> And that's one of the things that we try to really focus on and I think that's one of the advantages to us being embedded in the business under the CDAO role, is that we can then say our objectives are your objectives. We are here to add value and to align with what you're working on. We're not trying to slow you down or hinder you, we're really trying to bring more to the table and augment what you're already trying to achieve. >> Sometimes getting that organization right means everything, as we've seen. >> Absolutely. >> That's right. >> How are you applying governance discipline to unstructured data? 
>> That's actually something that's a little bit further down our road map, but one of the things that we have started doing is looking at our taxonomies for structured data and aligning those with the taxonomies that we're using to classify unstructured data. So, that's something we're in the early stages with, so that when we get to that process of looking at more of our unstructured content, we already have a good feel that there's alignment between the way that we think about and organize those concepts. >> Have you identified automation tools that can help to bring structure to that unstructured data? >> Yes, we have. And there are several tools out there that we're continuing to investigate and look at. But, that's one of the key things that we're trying to achieve through this process is bringing structure to unstructured content. >> So, the conference. First year at the conference. >> Yes. >> Kind of key takeaways, things that were interesting to you, learnings? >> Oh, yes, well the number of CDOs that are here and what's top of mind for them. I mean, it ranges from, how do I stand up my operating model? We just had a session just about 30 minutes ago. A lot of questions around, how do I set up my organization structure? How do I stand up my operating model so that I could be flexible? To, right, the data scientists, to the folks that are more traditional in structured and trusted data. So, still these things are top-of-mind and because they're recognizing the market is also changing too. And the growing amount of expectations, not only solving business outcomes, but also regulatory compliance, privacy is also top-of-mind for a lot of customers. In terms of, how would I get started? And what's the appropriate structure and mechanism for doing so? So we're getting a lot of those types of questions as well. 
So, the good thing is many of us have had years of experience in this phase and the convergence of us being able to support our customers, not only in our principles around how we implement the framework, but also the technology is really coming together very nicely. >> Anything you'd add, Blake? >> I think it's really impressive to see the level of engagement with thought leaders and decision makers in the data space. You know, as Susan mentioned, we just got out of our session and really, by the end of it, it turned into more of an open discussion. There was just this kind of back and forth between the participants. And so it's really engaging to see that level of passion from such a distinguished group of individuals who are all kind of here to share thoughts and ideas. >> Well anytime you come to a conference, it's sort of any open forum like this, you learn a lot. When you're at MIT, it's like super-charged. With the big brains. >> Exactly, you feel it when you come on the campus. >> You feel smarter when you walk out of here. >> Exactly, I know. >> Well, guys, thanks so much for coming to theCUBE. It was great to have you. >> Thank you for having us. We appreciate it, thank you. >> You're welcome. All right, keep it right there everybody. Paul and I will be back with our next guest. You're watching theCUBE from MIT in Cambridge. We'll be right back. (techno music)

Published Date : Aug 2 2019

SUMMARY :

Brought to you by SiliconANGLE Media. Susan Wilson is here, she's the vice president So, what are you leading at Informatica? and how to best help them; but more specifically the data governance lead. Right, so I lead the data governance capabilities and then, okay guys over to you. And a lot of the failed attempts, if you will, And the mind set from 1.0 to 2.0 to 3.0 has changed. Where, and I'm not sure it's ever going to flip back, But at the same time, Right, and we look at data governance So how do you do that? What does the quality of that data look like? and do what you need to do with it so what stages have you gone through in the culture of the company. in finding the data that they need is that the chief data officers often times and to be able to provide a mechanism What do you do at that stage? So, for example, if I'm going to make changes, How do you help your customers discover that data? and report that out to the data stewards and then, with a fair degree of accuracy, categorize it And it's core to how we built out our platforms It doesn't work. And that individual, And then, our early 2018 we actually re-orged is the data team, the chief data officer's team and build that network of data consumers but in fact, you got to make the case and show the success and to align with what you're working on. Sometimes getting that organization right but one of the things that we have started doing is bringing structure to unstructured content. So, the conference. And the growing amount of expectations, and decision makers in the data space. it's sort of any open forum like this, you learn a lot. when you come on the campus. Well, guys, thanks so much for coming to theCUBE. Thank you for having us. Paul and I will be back with our next guest.

ENTITIES

Entity	Category	Confidence
Paul Gillin	PERSON	0.99+
Susan	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Paul	PERSON	0.99+
Susan Wilson	PERSON	0.99+
Blake	PERSON	0.99+
Informatica	ORGANIZATION	0.99+
Cambridge	LOCATION	0.99+
Mark Ramsey	PERSON	0.99+
Blake Anders	PERSON	0.99+
three months	QUANTITY	0.99+
three people	QUANTITY	0.99+
Facebook	ORGANIZATION	0.99+
New York Life	ORGANIZATION	0.99+
early 2018	DATE	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
First year	QUANTITY	0.99+
one	QUANTITY	0.99+
90%	QUANTITY	0.99+
two and half years	QUANTITY	0.98+
first	QUANTITY	0.98+
approximately 15	QUANTITY	0.98+
7,000 critical data elements	QUANTITY	0.97+
about two and half years	QUANTITY	0.97+
first year	QUANTITY	0.96+
two	QUANTITY	0.96+
about 30 minutes ago	DATE	0.96+
theCUBE	ORGANIZATION	0.95+
Blake Andrews	PERSON	0.95+
MIT Chief Data Officer and	EVENT	0.93+
MIT Chief Data Officer Information Quality Conference	EVENT	0.91+
EDW	ORGANIZATION	0.86+
last twelve months	DATE	0.86+
skunkworks	ORGANIZATION	0.85+
CDAO	ORGANIZATION	0.85+
this morning	DATE	0.83+
MIT	ORGANIZATION	0.83+
7 years ago	DATE	0.78+
year	QUANTITY	0.75+
Information Quality Symposium 2019	EVENT	0.74+
3.0	OTHER	0.66+
York Life	ORGANIZATION	0.66+
2.0	OTHER	0.59+
MIT CDOIQ 2019	EVENT	0.58+
half	QUANTITY	0.52+
Data 2.0	OTHER	0.52+
Data 3.0	TITLE	0.45+
1.0	OTHER	0.43+
Data	OTHER	0.21+

Robert Abate, Global IDS | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE. Covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. (futuristic music) >> Welcome back to Cambridge, Massachusetts everybody. You're watching theCUBE, the leader in live tech coverage. We go out to the events and we extract the signal from the noise. This is day two, we're sort of wrapping up the Chief Data Officer event. It's MIT CDOIQ, it started as an information quality event and with the ascendancy of big data the CDO emerged and really took center stage here. And it's interesting to know that it's kind of come full circle back to information quality. People are realizing all this data we have, you know the old saying, garbage in, garbage out. So the information quality worlds and this chief data officer world have really come colliding together. Robert Abate is here, he's the Vice President and CDO of Global IDS and also the co-chair of next year's, the 14th annual MIT CDOIQ. Robert, thanks for coming on. >> Oh, well thank you. >> Now you're a CDO by background, give us a little history of your career. >> Sure, sure. Well I started out with an Electrical Engineering degree and went into applications development. By 2000, I was leading the Ralph Lauren's IT, and I realized when Ralph Lauren hired me, he was getting ready to go public. And his problem was he had hired eight different accounting firms to do eight different divisions. And each of those eight divisions were reporting a number, but the big number didn't add up, so he couldn't go public. So he searched the industry to find somebody who could figure out the problem. Now I was, at the time, working in applications and had built this system called Service Oriented Architectures, a way of integrating applications. And I said, "Well I don't know if I could solve the problem, "but I'll give it a shot." 
And what I did was, just by taking each silo as its own problem, which was what the eight accounting firms had done, I was able to figure out that one of Ralph Lauren's policies was if you buy a garment, you can return it anytime, anywhere, forever, however long you own it. And he didn't think about that, but what that meant is somebody could go to a Bloomingdale's, buy a garment and then go to his outlet store and return it. Well, the cross channels were different systems. So the outlet stores were his own business, retail was a different business, there was a completely different, each one had their own AS/400, their own data. So what I quickly learned was, the problem wasn't the systems, the problem was the data. And it took me about two months to figure it out and he offered me a job, he said well, I was a consultant at the time, he says, "I'm offering you a job, you're going to run my IT." >> Great user experience but hard to count. >> (laughs) Hard to count. So that's when I, probably 1999 was when that happened. I went into data and started researching-- >> Sorry, so how long did it take you to figure that out? You said a couple of months? >> A couple of months, I think it was about two months. >> 'Cause jeez, it took Oracle what, 10 years to build Fusion with SOA? That's pretty good. (laughs) >> This was a little bit of luck. When we started integrating the applications we learned that the messages that we were sending back and forth didn't match, and we said, "Well that's impossible, it can't not match." But what didn't match was it was coming from one channel and being returned in another channel, and the returns showed here didn't balance with the returns on this side. So it was a data problem. >> So a forensics showdown. So what did you do after? >> After that I went into ICICI Bank which was a large bank in India who was trying to integrate their systems, and again, this was a data problem. 
But they heard me giving a talk at a conference on how SOA had solved the data challenge, and they said, "We're a bank with a wholesale, a retail, "and other divisions, "and we can't integrate the systems, can you?" I said, "Well yeah, I'd build a website "and make them web services and now what'll happen is "each of those'll kind of communicate." And I was at ICICI Bank for about six months in Mumbai, and finished that which was a success, came back and started consulting because now a lot of companies were really interested in this concept of Service Oriented Architectures. Back then when we first published on it, myself, Peter Aiken, and a gentleman named Joseph Burke published on it in 1996. The publisher didn't accept the book, it was a really interesting thing. We wrote the book called, "Services Based Architectures: A Way to Integrate Systems." And the way Wiley & Sons, or most publishers work is, they'll have three industry experts read your book and if they don't think what you're saying has any value, they, forget about it. So one guy said this is brilliant, one guy says, "These guys don't know what they're talking about," and the third guy says, "I don't even think what they're talking about is feasible." So they decided not to publish. Four years later it came back and said, "We want to publish the book," and Peter said, "You know what, they lost their chance." We were ahead of them by four years, they didn't understand the technology. So that was kind of cool. So from there I went into consulting, eventually took a position as the Head of Enterprise and Director of Enterprise Information Architecture with Walmart. And Walmart, as you know, is a huge entity, almost the size of the federal government. So to build an architecture that integrates Walmart would've been a challenge, a behemoth challenge, and I took it on with a phenomenal team. >> And when was this, like what timeframe? 
>> This was 2010, and by the end of 2010 we had presented an architecture to the CIO and the rest of the organization, and they came back to me about a week later and said, "Look, everybody agrees what you did was brilliant, "but nobody knows how to implement it. "So we're taking you away, "you're no longer Director of Information Architecture, "you're now Director of Enterprise Information Management. "Build it. "Prove that what you say you could do, you could do." So we built something called the Data CAFE, and CAFE was an acronym, it stood for: Collaborative Analytics Facility for the Enterprise. What we did was we took data from one of the divisions, because you didn't want to take on the whole beast, boil the ocean. We picked Sam's Club and we worked with their CFO, and because we had information about customers we were able to build a room with seven 80 inch monitors that surrounded anyone in the room. And in the center was the Cisco telecommunications so you could be a part of a meeting. >> The TelePresence. >> TelePresence. And we built one room in one facility, and one room in another facility, and we labeled the monitors, one red, one blue, one green, and we said, "There's got to be a way where we can build "data science so it's interactive, so somebody, "an executive could walk into the room, "touch the screen, and drill into features. "And in another room "the features would be changing simultaneously." And that's what we built. The room was brought up on Black Friday of 2013, and we were able to see the trends of sales on the East Coast that we quickly, the executives in the room, and these are the CEO of Walmart and the heads of Sam's Club and the like, they were able to change the distribution in the Mountain Time Zone and west time zones because of the sales on the East Coast gave them the idea, well these things are going to sell, and these things aren't. And they saw a tremendous increase in productivity. 
We received the 2014, my team received the 2014 Walmart Innovation Project of the Year. >> And that's no slouch. Walmart has always been heavily data-oriented. I don't know if it's urban legend or not, but the famous story in the '80s of the beer and the diapers, right? Walmart would position beer next to diapers, why would they do that? Well the father goes in to buy the diapers for the baby, picks up a six pack while he's on the way, so they just move those proximate to each other. (laughs) >> In terms of data, Walmart really learned that there's an advantage to understanding how to place items in places that, a path that you might take in a store, and knowing that path, they actually have a term for it, I believe it's called, I'm sorry, I forgot the name but it's-- >> Selling more stuff. (laughs) >> Yeah, it's selling more stuff. It's the way you position items on a shelf. And Walmart had the brilliance, or at least I thought it was brilliant, that they would make their vendors the data champion. So the vendor, let's say Procter & Gamble's a vendor, and they sell this one product the most. They would then be the champion for that aisle. Oh, it's called planogramming. So the planogramming, the way the shelves were organized, would be set up by Procter & Gamble for that entire area, working with all their other vendors. And so Walmart would give the data to them and say, "You do it." And what I was purporting was, well, we shouldn't just be giving the data away, we should be using that data. And that was the advent of that. From there I moved to Kimberly-Clark, I became Global Director of Enterprise Data Management and Analytics. Their challenge was they had different teams, there were four different instances of SAP around the globe. One for Latin America, one for North America called the Enterprise Edition, one for EMEA, Europe, Middle East, and Africa, and one for Asia-Pacific. 
Well when you have four different instances of SAP, that means your master data doesn't exist because the same thing that happens in this facility is different here. And every company faces this challenge. If they implement more than one of a system the specialty fields get used by different companies in different ways. >> The gold standard, the gold version. >> The golden version. So I built a team by bringing together all the different international teams, and created one team that was able to integrate best practices and standards around data governance, data quality. Built BI teams for each of the regions, and then a data science and advanced analytics team. >> Wow, so okay, so that makes you uniquely qualified to coach here at the conference. >> Oh, I don't know about that. (laughs) There are some real, there are some geniuses here. >> No but, I say that because these are your peeps. >> Yes, they are, they are. >> And so, you're a practitioner, this conference is all about practitioners talking to practitioners, it's content-heavy, There's not a lot of fluff. Lunches aren't sponsored, there's no lanyard sponsor and it's not like, you know, there's very subtle sponsor desks, you have to have sponsors 'cause otherwise the conference's not enabled, and you've got costs associated with it. But it's a very intimate event and I think you guys want to keep it that way. >> And I really believe you're dead-on. When you go to most industry conferences, the industry conferences, the sponsors, you know, change the format or are heavily into the format. Here you have industry thought leaders from all over the globe. CDOs of major Fortune 500 companies who are working with their peers and exchanging ideas. I've had conversations with a number of CDOs and the thought leadership at this conference, I've never seen this type of thought leadership in any conference. 
>> Yeah, I mean the percentage of presentations by practitioners, even when there's a vendor name, they have a practitioner, you know, internal practitioner presenting so it's 99.9% which is why people attend. We're moving venues next year, I understand. Just did a little tour of the new venue, so, going to be able to accommodate more attendees, so that's great. >> Yeah it is. >> So what are your objectives in thinking ahead a year from now? >> Well, you know, I'm taking over from my current peer, Dr. Arka Mukherjee, who just did a phenomenal job of finding speakers. People who are in the industry, who are presenting challenges, and allowing others to interact. So I hope could do a similar thing which is, find with my peers people who have real world challenges, bring them to the forum so they can be debated. On top of that, there are some amazing, you know, technology change is just so fast. One of the areas like big data I remember only five years ago the chart of big data vendors maybe had 50 people on it, now you would need the table to put all the vendors. >> Who's not a data vendor, you know? >> Who's not a data vendor? (laughs) So I would think the best thing we could do is, is find, just get all the CDOs and CDO-types into a room, and let us debate and talk about these points and issues. I've seen just some tremendous interactions, great questions, people giving advice to others. I've learned a lot here. >> And how about long term, where do you see this going? How many CDOs are there in the world, do you know? Is that a number that's known? >> That's a really interesting point because, you know, only five years ago there weren't that many CDOs to be called. And then Gartner four years ago or so put out an article saying, "Every company really should have a CDO." 
Not just for the purpose of advancing your data, and to Doug Laney's point that data is being monetized, there's a need to have someone responsible for information 'cause we're in the Information Age. And a CIO really is focused on infrastructure, making sure I've got my PCs, making sure I've got a LAN, I've got websites. The focus on data has really, because of the Information Age, has turned data into an asset. So organizations realize, if you utilize that asset, let me reverse this, if you don't use data as an asset, you will be out of business. I heard a quote, I don't know if it's true, "Only 10 years ago, 250 of the Fortune 500 no longer exist." >> Yeah, something like that, the turnover's amazing. >> Many of those companies were companies that decided not to make the change to be data-enabled, to make data decision processing. Companies still use data warehouses, they're always going to use them, and a warehouse is a rear-view mirror, it tells you what happened last week, last month, last year. But today's businesses work forward-looking. And just like driving a car, it'd be really hard to drive your car through a rear-view mirror. So what companies are doing today are saying, "Okay, let's start looking at this as forward-looking, "a prescriptive and predictive analytics, "rather than just what happened in the past." I'll give you an example. In a major company that is a supplier of consumer products, they were leading in the industry and their sales started to drop, and they didn't know why. Well, with a data science team, we were able to determine by pulling in data from the CDC, now these are sources that only 20 years ago nobody ever used to bring in data in the enterprise, now 60% of your data is external. So we brought in data from the CDC, we brought in data on maternal births from the national government, we brought in data from the Census Bureau, we brought in data from sources of advertising and targeted marketing towards mothers. 
Pulled all that data together and said, "Why are diaper sales down?" Well they were targeting the large regions of the country and putting ads in TV stations in New York and California, big population centers. Birth rates in population centers have declined. Birth rates in certain other regions, like the south, and the Bible Belt, if I can call it that, have increased. So by changing the marketing, their product sales went up. >> Advertising to Texas. >> Well, you know, and that brings to one of the points, I heard a lecture today about ethics. We made it a point at Walmart that if you ran a query that reduced a result to less than five people, we wouldn't allow you to see the result. Because, think about it, I could say, "What is my neighbor buying? "What are you buying?" So there's an ethical component to this as well. But that, you know, data is not political. Data is not chauvinistic. It doesn't discriminate, it just gives you facts. It's the interpretation of that that is hard for CDOs, because we have to say to someone, "Look, this is the fact, and your 25 years "of experience in the business, "granted, is tremendous and it's needed, "but the facts are saying this, "and that would mean that the business "would have to change its direction." And it's hard for people to do, so it requires that. >> So whether it's called the chief data officer, whatever the data czar rubric is, the head of analytics, there's obviously the data quality component there whatever that is, this is the conference for, as I called them, your peeps, for that role in the organization. People often ask, "Will that role be around?" I think it's clear, it's solidifying. Yes, you see the chief digital officer emerging and there's a lot of tailwinds there, but the information quality component, the data architecture component, it's here to stay. And this is the premier conference, the premier event, that I know of anyway. 
There are a couple of others, perhaps, but it's great to see all the success. When I first came here in 2013 there were probably about 130 folks here. Today, I think there were 500 people registered almost. Next year, I think 600 is kind of the target, and I think it's very reasonable with the new space. So congratulations on all the success, and thank you for stepping up to the co-chair role, I really appreciate it. >> Well, let me tell you I thank you guys. You provide a voice at these IT conferences that we really need, and that is the ability to get the message out. That people do think and care, the industry is not thoughtless and heartless. With all the data breaches and everything going on there's a lot of fear, fear, loathing, and anticipation. But having your voice, kind of like ESPN and a sports show, gives the technology community, which is getting larger and larger by the day, a voice and we need that so, thank you. >> Well thank you, Robert. We appreciate that, it was great to have you on. Appreciate the time. >> Great to be here, thank you. >> All right, and thank you for watching. We'll be right back with our next guest as we wrap up day two of MIT CDOIQ. You're watching theCUBE. (futuristic music)
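
Editor's note: the small-group rule Robert describes above (a query that narrows a result to fewer than five people returns nothing) can be sketched in a few lines. This is only an illustration under stated assumptions, not Walmart's actual implementation: the function name `aggregate_spend`, the record fields (`customer_id`, `zip`, `amount`), and the sample data are all hypothetical. Only the threshold of five comes from the interview.

```python
from collections import defaultdict

# Threshold quoted in the interview; everything else below is hypothetical.
MIN_GROUP_SIZE = 5


def aggregate_spend(rows, group_key, min_group_size=MIN_GROUP_SIZE):
    """Total spend per group, withholding any group that contains
    fewer than min_group_size distinct customers."""
    customers = defaultdict(set)
    totals = defaultdict(float)
    for row in rows:
        key = row[group_key]
        customers[key].add(row["customer_id"])
        totals[key] += row["amount"]

    result = {}
    for key, members in customers.items():
        if len(members) < min_group_size:
            # Too few people: showing the total could expose an individual.
            result[key] = None
        else:
            result[key] = round(totals[key], 2)
    return result


# Hypothetical sample: six customers in one zip code, a single customer in another.
purchases = [
    {"customer_id": i, "zip": "72712", "amount": 10.0} for i in range(6)
] + [
    {"customer_id": 100, "zip": "02139", "amount": 42.5},
]

print(aggregate_spend(purchases, "zip"))
```

In this sketch the six-customer group reports its total, while the single-customer group comes back as `None`, so nobody can ask "What is my neighbor buying?" by narrowing a query down to one household.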

Published Date : Aug 1 2019

SUMMARY :

Brought to you by SiliconANGLE Media. and also the co-chair of next year's, give us a little history of your career. So he searched the industry to find somebody (laughs) Hard to count. 10 years to build Fusion with SOA? and the returns showed here So what did you do after? and the third guy says, And in the center was the Cisco telecommunications and the heads of Sam's Club and the like, Well the father goes in to buy the diapers for the baby, (laughs) So the planogramming, the way the shelves were organized, and created one team that was able to integrate so that makes you uniquely qualified to coach here There are some real, there are some geniuses here. and it's not like, you know, the industry conferences, the sponsors, you know, Yeah, I mean the percentage of presentations by One of the areas like big data I remember just get all the CDOs and CDO-types into a room, because of the Information Age, and the Bible Belt, if I can call it that, have increased. It's the interpretation of that that is hard CDOs, the data architecture component, it's here to stay. and that is the ability to get the message out. We appreciate that, it was great to have you on. All right, and thank you for watching.

ENTITIES

Entity	Category	Confidence
Peter	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
Peter Aiken	PERSON	0.99+
Robert Abate	PERSON	0.99+
Robert	PERSON	0.99+
Procter & Gamble	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
India	LOCATION	0.99+
Mumbai	LOCATION	0.99+
Census Bureau	ORGANIZATION	0.99+
2010	DATE	0.99+
1996	DATE	0.99+
New York	LOCATION	0.99+
last week	DATE	0.99+
last year	DATE	0.99+
last month	DATE	0.99+
60%	QUANTITY	0.99+
Bloomingdale	ORGANIZATION	0.99+
Next year	DATE	0.99+
1999	DATE	0.99+
Texas	LOCATION	0.99+
25 years	QUANTITY	0.99+
10 years	QUANTITY	0.99+
one room	QUANTITY	0.99+
2014	DATE	0.99+
2013	DATE	0.99+
Doug Laney	PERSON	0.99+
Sam's Club	ORGANIZATION	0.99+
ICICI Bank	ORGANIZATION	0.99+
99.9%	QUANTITY	0.99+
Wiley & Sons	ORGANIZATION	0.99+
50 people	QUANTITY	0.99+
Arka Mukherjee	PERSON	0.99+
next year	DATE	0.99+
Jos	PERSON	0.99+
Today	DATE	0.99+
third guy	QUANTITY	0.99+
2000	DATE	0.99+
today	DATE	0.99+
one	QUANTITY	0.99+
500 people	QUANTITY	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
one channel	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
each	QUANTITY	0.99+
One	QUANTITY	0.99+
CDC	ORGANIZATION	0.99+
less than five people	QUANTITY	0.99+
Ralph Lauren	ORGANIZATION	0.99+
one guy	QUANTITY	0.99+
six pack	QUANTITY	0.99+
ESPN	ORGANIZATION	0.99+
four years ago	DATE	0.98+
Africa	LOCATION	0.98+
SOA	TITLE	0.98+
five years ago	DATE	0.98+
California	LOCATION	0.98+
Gartner	ORGANIZATION	0.98+
three industry experts	QUANTITY	0.98+
Global IDS	ORGANIZATION	0.98+
Four years later	DATE	0.98+
600	QUANTITY	0.98+
20 years ago	DATE	0.98+
East Coast	LOCATION	0.98+
250	QUANTITY	0.98+
Middle East	LOCATION	0.98+
four years	QUANTITY	0.98+
one team	QUANTITY	0.97+
months	QUANTITY	0.97+
first	QUANTITY	0.97+
about two months	QUANTITY	0.97+
Latin America	LOCATION	0.97+

Aaron Kalb, Alation | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. (dramatic music) >> Welcome back to Cambridge, Massachusetts, everybody. This is theCUBE, the leader in live tech coverage. We go out to the events, and we extract the signal from the noise. And, we're here at the MIT CDOIQ, the Chief Data Officer conference. I'm Dave Vellante with my cohost Paul Gillin. Day two of our wall to wall coverage. Aaron Kalb is here. He's the cofounder and chief data officer of Alation. Aaron, thanks for making the time to come on. >> Thanks so much Dave and Paul for having me. >> You're welcome. So, words matter, you know, and we've been talking about data, and big data, and the three Vs, and data is the new oil, and all this stuff. You gave a talk this week about, you know, "We're maybe not talking the right language "when it comes to data." What did you mean by all that? >> Absolutely, so I get a little bit frustrated by some of these clichés we hear at conference after conference, and the one I, sort of, took aim at in this talk is, data is the new oil. I think what people want to invoke with that is to say, in the same way that oil powered the industrial age, data's powering the information age. Just saying, data's really cool and trendy and important. That's true, but there are a lot of other associations and contexts that people have with data, and some of them don't really apply as, I'm sorry, with oil. And, some of them apply, as well, to data. >> So, is data more valuable than oil? >> Well, I think they're each valuable in different ways, but I think there's a couple issues with the metaphor. One is that oil is scarce and dwindling, and part of its value comes from the fact that it's so rare. Whereas, the experience with data is that it's so plentiful and abundant, we're almost drowning in it. 
And so, what I contend is, instead of talking about data as compared to oil, we should talk about data compared to water. And, the idea is, you know, water is very plentiful on the planet, but sometimes, you know, if you have saltwater or contaminated water, you can't drink it. Water is good for different purposes, depending on its form, and so it's all about getting the right data for the right purpose, like water. >> Well, we've certainly, at least in my opinion, fought wars, Paul, over oil. >> And, over water. >> And, certainly, conflicts over water. Do you think we'll be fighting wars over data? Or, are we already? >> No, we might be. One of my favorite talks from the sessions here was a keynote by the CDO for the Department of Defense, who was talking about, you know, the civic duty about transparency but was observing that, actually, more IP addresses from China and Russia are looking at our public datasets than from within the country. So, you know, it's definitely a resource that can be very powerful. >> So, what was the reaction to your premise from the audience? What kind of questions did you get? >> You know, people actually responded very favorably, including some folks from the oil and gas industry, which I was pleased to find. We have a lot of customers in energy, so that was cool. But, what it was nice being here at MIT and just really geeking out about language and linguistics and data with a bunch of CDOs and other people who are, kind of, data intellectuals. >> Right, so if data is not the new oil. >> And, water isn't really a good analogy either, because the supply of water is finite. >> That's true. >> So, what is data? >> Yeah. >> Space? >> Yeah, it's a good point. >> Matter? >> Maybe it is like the universe in that it's always expanding, right, somehow. Right, because anything physical which is on the planet probably won't be growing at that exponential speed. >> So, give us the punchline. 
>> Well, so I contend that water, while imperfect, is, actually, a really good metaphor that helps for a lot of things. It has properties like the fact that if it's a data quality issue, it flows downstream like pollution in a river. It's the fact that it can come in different forms, useful for different purposes. You might have gray water, right, which is good enough for, you know, irrigation or industrial purposes, but not safe to drink. And so, you rely on metadata to get the data that's in the right form. And, you know, the talk is more fun because you've a lot of visual examples that make this clear. >> Yeah, of course, yeah. >> I actually had one person in the audience say that he used a similar analogy in his own company, so it's fun to trade notes. >> So, chief data officer is a relatively new title for you, is it not? In terms of your role at Alation. >> Yeah, that's right, and the most fun thing about my job is being able to interact with all of the other CDOs and CDAOs at a conference like this. And, it was cool to see. I believe this conference doubled since the last year. Is that right? >> No. >> No, it's up about a hundred, though. >> Right. >> Well. >> And, it's about double from three years ago. >> And, when we first started, in 2013, yeah. >> 130 people, yeah. >> Yeah, it was a very small and intimate event. >> Yeah, here we're outgrowing this building, it seems. >> Yeah, they're kicking us out. >> I think what's interesting is, you know, if we do a little bit of analysis, this is a small data, within our own company, you know, our biggest and most visionary customers typically bought Alation. The buyer champion either was a CDO or they weren't a CDO when they bought the software and have since been promoted to be a CDO. And so, seeing this trend of more and more CDOs cropping up is really exciting for us. And also, just hearing all of the people at the conference saying, two trends we're hearing. 
A move from, sort of, infrastructure and technology to driving business value, and a move from defense and governance to, sort of, playing offense and doing revenue generation with data. Both of those trends are really exciting for us. >> So, don't hate me for asking this question, because what a lot of companies will do is, they'll give somebody a CDO title, and it's, kind of, a little bit of gimmick, right, to go to market. And, they'll drag you into sales, because I'm sure they do, as a cofounder. But, as well, I know CDOs at tech companies that are actually trying to apply new techniques, figure out how data contributes to their business, how they can cut costs, raise revenue. Do you have an internal role, as well? >> Absolutely, yeah. >> Explain that. >> So, Alation, you know, we're about 250 people, so we're not at the same scale as many of the attendees here. But, we want to learn, you know, from the best, and always apply everything that we learn internally as well. So, obviously, analytics, data science is a huge role in our internal operations. >> And so, what kinds of initiatives are you driving internally? Is it, sort of, cost initiatives, efficiency, innovation? >> Yeah, I think it's all of the above, right. Every single division and both in the, sort of, operational efficiency and cost cutting side as well as figuring out the next big bet to make, can be informed by data. And, our goal was to empower a curious and rational world, and our every decision be based not on the highest paid person's opinion, but on the best evidence possible. And so, you know, the goal of my function is largely to enable that both centrally and within each business unit. >> I want to talk to you about data catalogs a bit because it's a topic close to my heart. I've talked to a lot of data catalog companies over the last couple years, and it seems like, for one thing, the market's very crowded right now. It seems to me. Would you agree there are a lot of options out there? 
>> Yeah, you know, it's been interesting because when we started it, we were basically the first company to make this technology and to, kind of, use this term, data catalog, in this way. And, it's been validating to see, you know, a lot of big players and other startups even, kind of, coming to that terminology. But, yeah, it has gotten more crowded, and I think our customers who, or our prospects, used to ask us, you know, "What is it that you do? "Explain this catalog metaphor to me," are now saying, "Yeah, catalogs, heard about that." >> It doesn't need to be defined anymore. >> "Which one should I pick? "Why you?" Yeah. >> What distinguished one product from another, you know? What are the major differentiation points? >> Yeah, I think one thing that's interesting is, you know, my talk was about how the metaphors we use shape the way we think. And, I think there's a sense in which, kind of, the history of each company shapes their philosophy and their approach, so we've always been a data catalog company. That's our one product. Some of the other catalog vendors come from ETL background, so they're a lot more focused on technical metadata and infrastructure. Some of the catalog products grew out of governance, and so it's, sort of, governance first, no sorry, defense first and then offense secondary. So, I think that's one of the things, I think, we encourage our prospects to look at, is, kind of, the soul of the company and how that affects their decisions. The other thing is, of course, technology. And, what we at Alation are really excited about, and it's been validating to hear Gartner and others and a lot of the people here, like the GSK keynote speaker yesterday, talking about the importance of comprehensiveness and on taking a behavioral approach, right. We have our Behavioral IO technology that really says, "Let's not look at all the bits and the bytes, "but how are people using the data to drive results?" As our core differentiator. 
>> Do your customers generally standardize on one data catalog, or might they have multiple catalogs for multiple purposes? >> Yeah, you know, we heard a term more last season, of catalog of catalogs, you know. And, people here can get arbitrarily, you know, meta, meta, meta data, where we like to go there. I think the customers we see most successful tend to have one catalog that serves this function of the single source of reference. Many of our customers will say, you know, that their catalog serves as, sort of, their internal Google for data. Or, the one stop shop where you could find everything. Even though they may have many different sources, typically you don't want to have siloed catalogs. It makes it harder to find what you're looking for. >> Let's play a little word association with some metaphors. Data lake. (laughter) >> Data lake's another one that I sort of hate. If you think about it, people had data warehouses and didn't love them, but at least, when you put something into a warehouse, you can get it out, right. If you throw something into a lake, you know, there's really no hope you're ever going to find it. It's probably not going to be in great shape, and we're not surprised to find that many folks who invested heavily in data lakes are now having to invest in a layer over it, to make it comprehensible and searchable. >> So, yeah, the lake is where we hide the stolen cars. Data swamp. >> Yeah, I mean, I think if your point is it's worse than lake, it works. But, I think we can do better than a lake, right. >> How about data ocean? (laughter) >> You know, out of respect for John Furrier, I'll say it's fantastic. But, to us we think, you know, it isn't really about the size. The more data you have, people think the more data the better. It's actually the more data the worse unless you have a mechanism for finding the little bit of data that is relevant and useful for your task and put it to use. >> And to, want to set up, enter the catalog.
So, technically, how does the catalog solve that problem? >> Totally, so if we think about, maybe let's go to the warehouse, for example. But, it works just as well on a data lake in practice. >> Yeah, cool. >> Through the catalog is. It starts with the inventory, you know, what's on every single shelf. But, if you think about what Amazon has done, they have the inventory warehouse in the back, but what you see as a consumer is a simple search interface, where you type in the word of the product you're looking for. And then, you see ranked suggestions for different items, you know, toasters, lamps, whatever, books I want to buy. Same thing for data. I can type in, you know, if I'm at the DOD, you know, information about aircraft, or information about, you know, drug discovery if I'm at GSK. And, I should be able to therefore see all of the different data sets that I have. And, that's true in almost any catalog, that you can do some search over the curated data sets there. With Alation in particular, what I can see is, who's using it, how are they using it, what are they joining it with, what results do they find in that process. And, that can really accelerate the pace of discovery. >> Go ahead. >> I'm sorry, Dave. To what degree can you automate some of that detail, like who's using it and what it's being used for. I mean, doesn't that rely on people curating the catalog? Or, to what degree can you automate that? >> Yeah, so it's a great question. I think, sometimes, there's a sense with AI or ML that it's like the computer is making the decisions or making things up. Which is, obviously, very scary. Usually, the training data comes from humans. So, our goal is to learn from humans in two ways. There's learning from humans where humans explicitly teach you. Somebody goes and says, "This is gold standard data versus this is, you know, low quality data." And, they do that manually. But, there's also learning implicitly from people.
So, in the same way on amazon.com, if I buy one item and then buy another, I'm doing that for my own purposes, but Amazon can do collaborative filtering over all of these trends and say, "You might want to buy this item." We can do a similar thing where we parse the query logs, parse the usage logs and BI tools, and can basically watch what people are doing for their own purposes. Not to, you know, extra work on top of their job to help us. We can learn from that and make everybody more effective. >> Aaron, is data classification a part of all this? Again, when we started in the industry, data classification was a manual exercise. It's always been a challenge. Certainly, people have applied math to it. You've seen support vector machines and probabilistic latent semantic indexing being used to classify data. Have we solved that problem, as an industry? Can you automate the classification of data on creation or use at this point in time? >> Well, one thing that came up in a few talks about AI and ML here is, regardless of the algorithm you're using, whether it's, you know, IFH or SVM, or something really modern and exciting that keeps learning. >> Stuff that's been around forever or, it's like you say, some new stuff, right. >> Yeah, you know, actually, I think it was said best by Michael Collins at the DOD, that data is more important than the algorithm because even the best algorithm is useless without really good training data. Plus, the algorithm's, kind of, everyone's got them. So, really often, training data is the limiting reactant in getting really good classification. One thing we try to do at Alation is create an upward spiral where maybe some data is curated manually, and then we can use that as a seed to make some suggestions about how to label other data. And then, it's easier to just do a confirm or deny of a guess than to actually manually label everything.
So, then you get more training, get it faster, and it kind of accelerates that way instead of being a big burden. >> So, that's really the advancement in the last five to what, five, six years. Where you're able to use machine intelligence to, sort of, solve that problem as opposed to brute forcing it with some algorithm. Is that fair? >> Yeah, I think that's right, and I think what gets me very excited is when you can have these interactive loops where the human helps the computer, which helps the human. You get, again, this upward spiral. Instead of saying, "We have to have all of this, "you know, manual step done "before we even do the first step," or trying to have an algorithm brute force it without any human intervention. >> It's kind of like notes key mode on write, except it actually works. I'm just kidding to all my ADP friends. All right, Aaron, hey. Thanks very much for coming on theCUBE, but give your last word on the event. I think, is this your first one or no? >> This is our first time here. >> Yeah, okay. So, what are your thoughts? >> I think we'll be back. It's just so exciting to get people who are thinking really big about data but are also practitioners who are solving real business problems. And, just the exchange of ideas and best practices has been really inspiring for me. >> Yeah, that's great. >> Yeah. >> Well, thank you for the support of the event, and thanks for coming on theCUBE. It was great to see you again. >> Thanks Dave, thanks Paul. >> All right, you're welcome. >> Thank you, sir. >> All right, keep it right there, everybody. We'll be back with our next guest right after this short break. You're watching theCUBE from MIT CDOIQ. Be right back. (upbeat music)

Published Date : Aug 1 2019

Jeanne Ross, MIT CISR | MIT CDOIQ 2019

(techno music) >> From Cambridge, Massachusetts, it's theCUBE. Covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. >> Welcome back to MIT CDOIQ. The CDO Information Quality Conference. You're watching theCUBE, the leader in live tech coverage. My name is Dave Vellante. I'm here with my co-host, Paul Gillin. This is our day two of our two day coverage. Jeanne Ross is here. She's the principal research scientist at MIT CISR. Jeanne, good to see you again. >> Nice to be here! >> Welcome back. Okay, what do all these acronyms stand for, I forget. MIT CISR. >> CISR, which we pronounce scissor, is the Center for Information Systems Research. It's a research center that's been at MIT since 1974, studying how big companies use technology effectively. >> So and, what's your role as a research scientist? >> As a research scientist, I work with both researchers and with company leaders to understand what's going on out there, and try to present some simple succinct ideas about how companies can generate greater value from information technology. >> Well, I guess not much has changed in information technology since 1974. (laughing) So let's fast forward to the big, hot trend, digital transformation, digital business. What's the difference between a business and a digital business? >> Right now, you're hoping there's no difference for you and your business. >> (chuckling) Yeah, for sure. >> The main thing about a digital business is it's being inspired by technology. So in the past, we would establish a strategy, and then we would check out technology and say, okay, how can technology make us more effective with that strategy? Today, and this has been driven a lot by start-ups, we have to stop and say, well wait a minute, what is technology making possible? Because if we're not thinking about it, there sure are a lot of students at MIT who are, and we're going to miss the boat.
We're going to get Ubered if you will, somebody's going to think of a value proposition that we should be offering and aren't, and we'll be left in the dust. So, our digital businesses are those that are recognizing the opportunities that digital technologies make possible. >> Now, and what about data? In terms of the role of digital business, it seems like that's an underpinning of a digital business. Is it not? >> Yeah, the single biggest capability that digital technologies provide, is ubiquitous data that's readily accessible anytime. So when we think about being inspired by technology, we could reframe that as inspired by the availability of ubiquitous data that's readily accessible. >> Your premise about the difference between digitization and digital business is interesting. It's more than just a semantic debate. Do companies now, when companies talk about digital transformation these days, in fact, are most of them thinking of digitization rather than really transformative business change? >> Yeah, this is so interesting to me. In 2006, we wrote a book that said, you need to become more agile, and you need to rely on information technology to get you there. And these are basic things like SAP and salesforce.com and things like that. Just making sure that your core processes are disciplined and reliable and predictable. We said this in 2006. What we didn't know is that we were explaining digitization, which is very effective use of technology in your underlying process. Today, when somebody says to me, we're going digital, I'm thinking about the new value propositions, the implications of the data, right? And they're often actually saying they're finally doing what we thought they should do in 2006. The problem is, in 2006, we said get going on this, it's a long journey. This could take you six, 10 years to accomplish. And then we gave examples of companies that took six to 10 years. LEGO, and USAA and really great companies.
And now, companies are going, "Ah, you know, we really ought to do that". They don't have six to 10 years. They get this done now, or they're in trouble, and it's still a really big deal. >> So how realistic is it? I mean, you've got big established companies that have got all these information silos, as we've been hearing for the last two days, just pulling their information together, knowing what they've got is a huge challenge for them. Meanwhile, you're competing with born on the web, digitally native start-ups that don't have any of that legacy, is it really feasible for these companies to reinvent themselves in the way you're talking about? Or should they just be buying the companies that have already done it? >> Well good luck with buying, because what happens is that when a company starts up, they can do anything, but they can't do it to scale. So most of these start-ups are going to have to sell themselves because they don't know anything about scale. And the problem is, the companies that want to buy them up know about the scale of big global companies but they don't know how to do this seamlessly because they didn't do the basic digitization. They relied on basically, a lot of heroes in their company to pull off the scale. So now they have to rely more on technology than they did in the past, but they still have a leg up if you will, on the start-up that doesn't want to worry about the discipline of scaling up a good idea. They'd rather just go off and have another good idea, right? They're perpetual entrepreneurs if you will. So if we look at the start-ups, they're not really your concern. Your concern is the very well run company, that's been around, knows how to be inspired by technology and now says, "Oh I see what you're capable of doing, or should be capable of doing. I think I'll move into your space". So this, the Amazons, and the USAAs and the LEGOs who say "We're good at what we do, and we could be doing more".
We're watching Schneider Electric, Philips, Ferrovial. These are big ole companies who get digital, and they are going to start moving into a lot of people's territory. >> So let's take the example of those incumbents that you've used as examples of companies that are leaning into digital, and presumably doing a good job of it, they've got a lot of legacy debt, as you know people call it technical debt. The question I have is how they're using machine intelligence. So if you think about Facebook, Amazon, Microsoft, Google, they own horizontal technologies around machine intelligence. The incumbents that you mentioned, do not. Now do they close the gap? They're not going to build their own AI. They're going to buy it, and then apply it. It's how they apply it that's going to be the difference. So do you agree with that premise, and where are they getting it, do they have the skill sets to do it, how are they closing that gap? >> They're definitely partnering. When you say they're not going to build any of it, that's actually not quite true. They're going to build a lot around the edges. They'll rely on partners like Microsoft and Google to provide some of the core, >> Yes, right. >> But they are bringing in their own experts to take it to the, basically to the customer level. How do I take, let me just take Schneider Electric for an example. They have gone from being an electrical equipment manufacturer, to a purveyor of energy management solutions. It's quite a different value proposition. To do that, they need a lot of intelligence. Some of it is data analytics of old, and some of it is just better representation on dashboards and things like that. But there is a layer of intelligence that is new, and it is absolutely essential to them. By relying on partners and their own expertise in what they do for customers, and then co-creating a fair amount with customers, they can do things that other companies cannot.
>> And they're developing a software presumably, a SaaS revenue stream as part of that, right? >> Yeah, absolutely. >> How about the innovator's dilemma though, the problem that these companies often have grown up, they're very big, they're very profitable, they see disruption coming, but they are unable to make the change, their shareholders won't let them make the change, they know what they have to do, but they're simply not able to do it, and then they become paralyzed. Is there a -- I mean, looking at some of the companies you just mentioned, how did they get over that mindset? >> This is real leadership from CEOs, who basically explain to their boards and to their investors, this is our future, we are... we're either going this direction or we're going down. And they sell it. It's brilliant salesmanship, and it's why when we go out to study great companies, we don't have that many to choose from. I mean, they are hard to find, right? So you are at such a competitive advantage right now. If you understand, if your own internal processes are cleaned up and you know how to rely on the ERPs and the CRMs, to get that done, and on the other hand, you're using the intelligence to provide value propositions, that new technologies and data make possible, that is an incredibly powerful combination, but you have to invest. You have to convince your boards and your investors that it's a good idea, you have to change your talent internally, and the biggest surprise is, you have to convince your customers that they want something from you that they never wanted before. So you got a lot of work to do to pull this off. >> Right now, in today's economy, the economy is sort of lifting all boats. But as we saw when the .com implosion happened in 2001, often these breakdowns give birth to great, new companies. Do you see that the next recession, which is inevitably coming, will be sort of the turning point for some of these companies that can't change?
>> It's a really good question. I do expect that there are going to be companies that don't make it. And I think that they will fail at different rates based on their, not just the economy, but their industry, and what competitors do, and things like that. But I do think we're going to see some companies fail. We're going to see many other companies understand that they are too complex. They are simply too complex. They cannot do things end to end and seamlessly and present a great customer experience, because they're doing everything. So we're going to see some pretty dramatic changes, we're going to see failure, it's a fair assumption that when we see the economy crash, it's also going to contribute, but that's, it's not the whole story. >> But when the .com blew up, you had the internet guys that actually had a business model to make money, and the guys that didn't, the guys that didn't went away, and then you also had the incumbents that embrace the internet, so when we came out of that .com downturn, you had the survivors, who was Google and eBay, and obviously Amazon, and then you had incumbent companies who had online retailing, and e-tailing and e-commerce etc, who thrived. I would suspect you're going to see something similar, but I wonder what you guys think. The street today is rewarding growth. And we got another near record high today after the rate cut yesterday. And so, but companies that aren't making money are getting rewarded, 'cause they're growing. Well when the recession comes, those guys are going to get crushed. >> Right. >> Yeah. >> And you're going to have these other companies emerge, and you'll see the winners, are going to be those ones who have truly digitized, not just talking the talk, or transformed really, to use your definition. That's what I would expect. I don't know, what do you think about that? >> I totally agree. And, I mean, we look at industries like retail, and they have been fundamentally transformed. 
There's still lots of opportunities for innovation, and we're going to see some winners that have kind of struggled early but not given up, and they're kind of finding their footing. But we're losing some. We're losing a lot, right? I think the surprise is that we thought digital was going to replace what we did. We'd stop going to stores, we'd stop reading books, we wouldn't have newspapers anymore. And it hasn't done that. It's only added, it hasn't taken anything away. >> It could-- >> I don't think the newspaper industry has been unscathed by digital. >> No, nor has retail. >> Nor has retail, right. >> No, no no, not unscathed, but here's the big challenge. Is if I could substitute, if I could move from newspaper to online, I'm fine. You don't get to do that. You add online to what you've got, right? And I think this right now is the big challenge. Is that nothing's gone away, at least yet. So we have to sustain the business we are, so that it can feed the business we want to be. And we have to make that transition into new capabilities. I would argue that established companies need to become very binary, that there are people that do nothing but sustain and make better and better and better, who they are. While others, are creating the new reality. You see this in auto companies by the way. They're creating not just the autonomous automobiles, but the mobility services, the whole new value propositions, that will become a bigger and bigger part of their revenue stream, but right now are tiny. >> So, here's the scary thing to me. And again, I'd love to hear your thoughts on this. And I've been an outspoken critic of Liz Warren's attack on big tech. >> Absolutely. >> I just think if they're breaking the law, and they're really acting like monopolies, the DOJ and FTC should do something, but to me, you don't just break up big tech because they're good capitalists.
Having said that, one of the things that scares me is, when you see Apple getting into payment systems, Amazon getting into grocery and logistics. Digital allows you to do something that's never happened before which is, you can traverse industries. >> Yep. >> Yeah, absolutely >> You used to have this stack of industries, and if you were in that industry, you're stuck in healthcare, you're stuck in financial services or whatever it was. And today, digital allows you to traverse those. >> It absolutely does. And so in theory, Amazon and Apple and Facebook and Google, they can attack virtually any industry and they kind of are. >> Yeah, they kind of are. I would certainly not break up anything. I would really look hard though at acquisitions, because I think that's where some of this is coming from. They can stop the overwhelming growth, but I do think you're right. That you get these opportunities from digital that are just so much easier because they're basically sharing information and technology, not building buildings and equipment and all that kind of thing. But I think there are limits to all this. I do not fear these companies. I think there, we need some law, we need some regulations, they're fine. They are adding a lot of value and the great companies, I mean, you look at the Schneiders and the Philips, yeah they fear what some of them can do, but they're looking forward to what they provide underneath. >> Doesn't Cloud change the equation here? I mean, when you think of something like Amazon getting into the payments business, or Google in the payments business, you know it used to be that the creating of a global payments processing network, just going global was a huge barrier to entry. Now, you don't have nearly that same level of impediment right? I mean the cloud eliminates much of the traditional barrier. >> Yeah, but I'll tell you what limits it, is complexity.
Every company we've studied gets a little over anxious and becomes too complex, and they cannot run themselves effectively anymore. It happens to everyone. I mean, remember when we were terrified about what Microsoft was going to become? But then it got competition because it's trying to do so many things, and somebody else is offering, Salesforce and others, something simpler. And this will happen to every company that gets overly ambitious. Something simpler will come along, and everybody will go "Oh thank goodness". Something simpler. >> Well with Microsoft, I would argue two things. One is the DOJ put some handcuffs on them, and two, with Steve Ballmer, I wouldn't get his nose out of Windows, and then finally stuck on a (mumbles) (laughter) >> Well it's they had a platform shift. >> Well this is exactly it. They will make those kind of calls. >> Sure, and I think that talks to their legacy, that they won't end up like Digital Equipment Corp or Wang and DG, who just ignored the future and held onto the past. But I think, a colleague of ours, David Moschella wrote a book, it's called "Seeing Digital". And his premise was we're moving from a world of remote cloud services, to one where you have to, to use your word, ubiquitous digital services that you can access upon which you can build your business and new business models. I mean, the simplest example is Waze, you mentioned Uber. They're using Cloud, they're using OAuth sign-in with Google, Facebook or LinkedIn and they've got a security layer, there's an AI layer, there's all your blockchain, mobile, cognitive, it's all these sets of services that are now ubiquitous on which you're building, so you're leveraging, he calls it the matrix, to the extent that these companies that you're studying, these incumbents can leverage that matrix, they should be fine. >> Yes. >> The part of the problem is, they say "No, we're going to invent everything ourselves, we're going to build it all ourselves".
To use Andy Jassy's term, it's undifferentiated heavy lifting, slows them down, but there's no reason why they can't tap that matrix, >> Absolutely. >> And take advantage of it. Where I do get scared is, the Facebooks, Apples, Googles, Amazons, they're matrix companies, their data is at their core, and they get this. It's not like they're putting data around the core, data is the core. So your thoughts on that? I mean, it looks like your slide about disruption, it's coming. >> Yeah, yeah, yeah, yeah. >> No industry is safe. >> Yeah, well I'll go back to the complexity argument. We studied complexity at length, and complexity is a killer. And as we get too ambitious, and we're constantly looking for growth, we start doing things that create more and more tensions in our various lines of business, which causes us to create silos that then we have to coordinate. I just think every single company that, no cloud is going to save us from this. Complexity will kill us. And we have to keep reminding ourselves to limit that complexity, and we've just not seen the example of the company that got that right. Sooner or later, they just kind of chop them, you know, create problems for themselves. >> Well isn't that inherent though in growth? >> Absolutely! >> It's just like, big companies slow down. >> That's right. >> They can't make decisions as quickly. >> That's right. >> I haven't seen a big company yet that moves nimbly. >> Exactly, and that's the complexity thing-- >> Well wait a minute, what about AWS? They're a 40 billion dollar company. >> Oh yeah, yeah, yeah. >> They're like the agile gorilla. >> Yeah, yeah, yeah. >> I mean, I think they're breaking the rule, and my argument would be, because they have data at their core, and they've got that, it's a bromide, but that common data model, that they can apply now to virtually any business. You know, we've been expecting, a lot of people have been expecting that growth to attenuate. I mean it hasn't yet, we'll see.
But they're like a 40 billion dollar firm-- >> No, that's a good example, yeah. >> So we'll see. And Microsoft is the other one. Microsoft is demonstrating double-digit growth. For such a large company, it's astounding. I wonder if the law of large numbers is being challenged, so. >> Yeah, well it's interesting. I do think that what now constitutes "so big" that you're really going to struggle with the complexity, I think that has definitely been elevated a lot. But I still think there will be a point at which human beings can't handle-- >> They're getting away. >> Whatever level of complexity we reach, yeah. >> Well sure, right, because even though this is great and new, it's your point. Cloud technology, you know, there's going to be something better that comes along. Even, I think Jassy might have said, if we had to do it all over again, we would have built the whole thing on Lambda functions. >> Yeah. >> Oh, yeah. >> Not on, you know, so there you go. >> So maybe someone else does that-- >> Yeah, there you go. >> So now they've got their hybrid. >> Yeah, yeah. >> Yeah, absolutely. >> You know maybe it'll take another ten years, but well Jeanne, thanks so much for coming to theCUBE, it was great to have you. >> My pleasure! >> Appreciate you coming back. >> Really fun to talk. >> All right, keep right there everybody, Paul Gillin and Dave Vellante, we'll be right back from MIT CDOIQ, you're watching theCUBE. (chuckles) (techno music)

Published Date : Aug 1 2019


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
David Moschella	PERSON	0.99+
Facebook	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Jean Ross	PERSON	0.99+
2006	DATE	0.99+
six	QUANTITY	0.99+
Steve Ballmer	PERSON	0.99+
Jeanne Ross	PERSON	0.99+
Liz Warren	PERSON	0.99+
LEGO	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
Schneider Electric	ORGANIZATION	0.99+
Dave Villante	PERSON	0.99+
Amazons	ORGANIZATION	0.99+
Googles	ORGANIZATION	0.99+
Jean	PERSON	0.99+
Facebooks	ORGANIZATION	0.99+
Phillips	ORGANIZATION	0.99+
USAA	ORGANIZATION	0.99+
Center for Information Systems Research	ORGANIZATION	0.99+
Apples	ORGANIZATION	0.99+
Andy Jassy	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Ferovial	ORGANIZATION	0.99+
Digital Equipment Corp	ORGANIZATION	0.99+
2001	DATE	0.99+
1974	DATE	0.99+
two day	QUANTITY	0.99+
two	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
D.O.J	ORGANIZATION	0.99+
yesterday	DATE	0.99+
eBay	ORGANIZATION	0.99+
40 billion dollar	QUANTITY	0.99+
MIT	ORGANIZATION	0.99+
Jassy	PERSON	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
today	DATE	0.99+
10 years	QUANTITY	0.99+
ten years	QUANTITY	0.99+
Today	DATE	0.99+
One	QUANTITY	0.99+
CISR	ORGANIZATION	0.98+
MIT CISR	ORGANIZATION	0.98+
Seeing Digital	TITLE	0.98+
two things	QUANTITY	0.98+
single	QUANTITY	0.97+
Ubered	ORGANIZATION	0.97+
LinkedIn	ORGANIZATION	0.97+
Windows	TITLE	0.96+
OAuth.in	TITLE	0.96+
one	QUANTITY	0.94+
Wang and D.G	ORGANIZATION	0.94+
CDO Information Quality Conference	EVENT	0.94+
D.O.J	PERSON	0.87+

Gokula Mishra | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE covering MIT Chief Data Officer and Information Quality Symposium 2019 brought to you by SiliconANGLE Media. (upbeat techno music) >> Hi everybody, welcome back to Cambridge, Massachusetts. You're watching theCUBE, the leader in tech coverage. We go out to the events. We extract the signal from the noise, and we're here at the MIT CDOIQ Conference, Chief Data Officer Information Quality Conference. It is the 13th year here at the Tang building. We've outgrown this building and have to move next year. It's fire marshal full. Gokula Mishra is here. He is the Senior Director of Global Data and Analytics and Supply Chain-- >> Formerly. Former, former Senior Director. >> Former! I'm sorry. It's former Senior Director of Global Data Analytics and Supply Chain at McDonald's. Oh, I didn't know that. I apologize my friend. Well, welcome back to theCUBE. We met when you were at Oracle doing data. So you've left that, you're on to your next big thing. >> Yes, thinking through it. >> Fantastic, now let's start with your career. You've had, so you just recently left McDonald's. I met you when you were at Oracle, so you cut over to the dark side for a while, and then before that, I mean, you've been a practitioner all your life, so take us through sort of your background. >> Yeah, I mean my beginning was really with a company called Tata Burroughs. Those days we did not have a lot of work getting done in India. We used to send people to U.S. so I was one of the pioneers of the whole industry, coming here and working on very interesting projects. But I was lucky to be working on mostly data analytics related work, joined a great company called CS Associates. I did my Master's at Northwestern. In fact, my thesis was intelligent databases. So, building AI into the databases and from there on I have been with Booz Allen, Oracle, HP, TransUnion, I also run my own company, and Sierra Atlantic, which is part of Hitachi, and McDonald's. 
>> Awesome, so let's talk about use of data. It's evolved dramatically as we know. One of the themes in this conference over the years has been sort of, I said yesterday, the Chief Data Officer role emerged from the ashes of sort of governance, kind of back office information quality compliance, and then ascended with the tailwind of the Big Data meme, and it's kind of come full circle. People are realizing actually to get value out of data, you have to have information quality. So those two worlds have collided together, and you've also seen the ascendancy of the Chief Digital Officer who has really taken a front and center role in some of the more strategic and revenue generating initiatives, and in some ways the Chief Data Officer has been a supporting role to that, providing the quality, providing the compliance, the governance, and the data modeling and analytics, and a component of it. First of all, is that a fair assessment? How do you see the way in which the use of data has evolved over the last 10 years? >> So to me, primarily, the use of data was, in my mind, mostly around financial reporting. So, anything that companies needed to run their company, any metrics they needed, any data they needed. So, if you look at all the reporting that used to happen it's primarily around metrics that are financials, whether it's around finances around operations, finances around marketing effort, finances around reporting if it's a public company reporting to the market. That's where the focus was, and so therefore a lot of the data that was not needed for financial reporting was what we call nowadays dark data. This is data we collect but don't do anything with it. 
Then, as the capability of the computing, and the storage, and new technologies, and new techniques evolved, and were able to handle more variety and more volume of data, people quickly realized how much potential they have in the other data outside of the financial reporting data that they can utilize too. So, some of the pioneers leveraged that and actually improved a lot in their efficiency of operations, came out with innovation. You know, GE comes to mind as one of the companies that actually leveraged data early on, and a number of other companies. Obviously, you look at today, data is defining some of the multi-billion dollar companies, and all they have is data. >> Well, Facebook, Google, Amazon, Microsoft. >> Exactly. >> Apple, I mean Apple obviously makes stuff, but those other companies, they're data companies. I mean largely, and those five companies have the highest market value on the U.S. stock exchange. They've surpassed all the other big leaders, even Berkshire Hathaway. >> So now, what is happening is because the market changes, the forces that are changing the behavior of our consumers and customers, which I talked about, which is everyone now is digitally engaging with each other. What that does is all the experiences now are being captured digitally, all the services are being captured digitally, all the products are creating a lot of digital exhaust of data, and so now companies have to pay attention to engage with their customers and partners digitally. Therefore, they have to make sure that they're leveraging data and analytics in doing so. The other thing that has changed is the time to decision, the time to act on the data insight that you get, is shrinking, and shrinking, and shrinking, so a lot more decision-making is now going real time.
Therefore, you have a situation now, you have the capability, you have the technology, you have the data now, you have to make sure that you convert that in what I call programmatic kind of data decision-making. Obviously, there are people involved in more strategic decision-making. So, that's more manual, but at the operational level, it's going more programmatic decision-making. >> Okay, I want to talk, By the way, I've seen a stat, I don't know if you can confirm this, that 80% of the data that's out there today is dark data or it's data that's behind a firewall or not searchable, not open to Google's crawlers. So, there's a lot of value there-- >> So, I would say that percent is declining over time as companies have realized the value of data. So, more and more companies are removing the silos, bringing those dark data out. I think the key to that is companies being able to value their data, and as soon as they are able to value their data, they are able to leverage a lot of the data. I still believe there's a large percent still not used or accessed in companies. >> Well, and of course you talked a lot about data monetization. Doug Laney, who's an expert in that topic, we had Doug on a couple years ago when he, just after, he wrote Infonomics. He was on yesterday. He's got a very detailed prescription as to, he makes strong cases as to why data should be valued like an asset. I don't think anybody really disagrees with that, but then he gave kind of a how-to-do-it, which will, somewhat, make your eyes bleed, but it was really well thought out, as you know. But you talked a lot about data monetization, you talked about a number of ways in which data can contribute to monetization. Revenue, cost reduction, efficiency, risk, and innovation. Revenue and cost is obvious. I mean, that's where the starting point is. Efficiency is interesting. 
I look at efficiency as kind of doing more with less, but it's sort of a cost reduction, so explain why it's not in the cost bucket, it's different. >> So, it first starts with doing what we do today cheaper, better, faster, and doing more comes after that, because if you don't understand, and data is the way to understand how your current processes work, you will not take the first step. So, to take the first step is to understand how can I do this process faster, and then you focus on cheaper, and then you focus on better. Of course, faster is because of some of the market forces and customer behavior that's driving you to do that process faster. >> Okay, and then the other one was risk reduction. I think that makes a lot of sense here. Actually, let me go back. So, one of the key pieces of efficiency is time to value. So, if you can compress the time, or accelerate the time, and you get the value, that means more cash in house faster, whether it's cost reduction or-- >> And the other aspect you look at is, can you automate more of the processes, and in that way it can be faster. >> And that hits the income statement as well because you're reducing headcount cost, or maybe not reducing headcount cost, but you're getting more out of your headcount, you're reallocating them to more strategic initiatives. Everybody says that, but the reality is you hire fewer people because you just automated. And then, risk reduction, so the degree to which you can lower your expected loss. That's just thinking in insurance terms, that's tangible value, certainly to large corporations, but even midsize and small corporations. Innovation, I thought was a good one, but maybe you could give us an example of how in your career you've seen data contribute to innovation. >> So, I'll give an example of the oil and gas industry. If you look at speed of innovation in the oil and gas industry, they were all paper-based.
I don't know how much you know about drilling. A lot of the assets that go into figuring out where to drill, how to drill, and actually drilling and then taking the oil or gas out, and of course selling it to make money. All of those processes were paper based. So, if you can imagine trying to optimize a paper-based innovation, it's very hard. Not only that, it's very, very by itself because it's on paper, it's in someone's drawer or file. So, it's siloed by design, and so one thing that the industry has gone through, they recognized that they have to optimize the processes to be better, to innovate, to find, for example, shale gas was a result of digitizing the processes, because otherwise you can't drill faster, cheaper, better to leverage the shale gas drilling that they did. So, the industry went through actually digitizing a lot of the paper assets. So, they went from not having data to knowingly creating the data that they can use to optimize the process, and then in the process they're innovating new ways to drill the oil well cheaper, better, faster. >> In the early days of oil exploration in the U.S., go back to the Osage Indian tribe in northern Oklahoma, and they brilliantly, when they got shuttled around, they pushed them out of Kansas, and they negotiated with the U.S. government that they maintain the mineral rights, and so they became very, very wealthy. In fact, at one point they were the wealthiest per capita individuals in the entire world, and they used to hold auctions for various drilling rights. So, it was all gut feel, all the oil barons would train in, and they would have an auction, and it was, again, it was gut feel as to which areas were the best, and then of course they evolved, you remember it used to be you drill a little hole, no oil, drill a hole, no oil, drill a hole. >> You know how much that cost? >> Yeah, the expense is enormous, right? >> It can vary from 10 to 20 million dollars. >> Just a giant expense.
So, now today fast-forward to this century, and you're seeing much more sophisticated-- >> Yeah, I can give you another example in pharmaceutical. They develop new drugs, it's a long process. So, one of the initial processes is to figure out what molecules they should be exploring in the next step, and you could have a thousand different combinations of molecules that could treat a particular condition, and now, with digitization and data analytics, they're able to do this in a virtual world, kind of creating a virtual lab where they can test out thousands of molecules. And then, once they can bring it down to a few, then the physical aspect of that starts. Think about innovation really shrinking their processes. >> All right, well I want to say this about clouds. You made the statement in your keynote, how many people out there think cloud is cheaper, or maybe you even said cheap, but cheaper, I inferred cheaper than on-prem, and so it was a loaded question so nobody put their hand up, they're afraid, but I put my hand up because we don't have any IT. We used to have IT. It was a nightmare. So, for us it's better, but in your experience, I think I'm inferring correctly that you had meant cheaper than on-prem, and certainly we talked to many practitioners who have large systems that when they lift and shift to the cloud, they don't change their operating model, they don't really change anything, they get a bill at the end of the month, and they go "What did this really do for us?" And I think that's what you mean-- >> So what I mean, let me make it clear, is that there are certain use cases where cloud is cheaper and, as you saw, people did raise their hand saying "Yeah, I have use cases where cloud is cheaper." I think you need to look at the whole thing. Cost is one aspect. The flexibility and agility of being able to do things is another aspect.
For example, if you have a situation where your stakeholders want to do something for three weeks, and they need five times the computing power, and the data that they are buying from outside to do that experiment. Now, imagine doing that in a physical world. It's going to take a long time just to procure and get the physical boxes, and then you'll be able to do it. In cloud, you can enable that, you can get GPUs depending on what problem we are trying to solve. That's another benefit. You can get the fit for purpose computing environment to that, and so there is a lot of flexibility, agility, all of that. It's a new way of managing it, so people need to pay attention to the cost because it will add to the cost. The other thing I will point out is that if you go to the public cloud, they make it cheaper because they have hundreds and thousands of these canned CPUs. This much computing power, this much memory, this much disk, this much connectivity, and they build thousands of them, and that's why it's cheaper. Well, if your need is something that's very unique and they don't have it, that's when it becomes a problem. Either you need more of those and the cost will be higher. So, now we are getting to the IoT world. The volume of data is growing so much, and the type of processing that you need to do is becoming more real-time, and you can't just move all this bulk of data, and then bring it back, and move the data back and forth. You need a special type of computing, which is what Amazon calls edge computing. And the industry is kind of trying to design it. So, that is an example of hybrid computing evolving out of a cloud, or out of the necessity that you need a special purpose computing environment to deal with new situations, and all of it can't be in the cloud. >> I mean, I would argue, well I guess Microsoft with Azure Stack was kind of the first, although not really.
Now, they're there but I would say Oracle, your former company, was the first one to say "Okay, we're going to put the exact same infrastructure on prem as we have in the public cloud." Oracle, I would say, was the first to truly do that-- >> They were doing hybrid computing. >> You now see Amazon with outposts has done the same, Google kind of has similar approach as Azure, and so it's clear that hybrid is here to stay, at least for some period of time. I think the cloud guys probably believe that ultimately it's all going to go to the cloud. We'll see it's going to be a long, long time before that happens. Okay! I'll give you last thoughts on this conference. You've been here before? Or is this your first one? >> This is my first one. >> Okay, so your takeaways, your thoughts, things you might-- >> I am very impressed. I'm a practitioner and finding so many practitioners coming from so many different backgrounds and industries. It's very, very enlightening to listen to their journey, their story, their learnings in terms of what works and what doesn't work. It is really invaluable. >> Yeah, I tell you this, it's always a highlight of our season and Gokula, thank you very much for coming on theCUBE. It was great to see you. >> Thank you. >> You're welcome. All right, keep it right there everybody. We'll be back with our next guest, Dave Vellante. Paul Gillin is in the house. You're watching theCUBE from MIT. Be right back! (upbeat techno music)

Published Date : Aug 1 2019


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Hitachi	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Doug Laney	PERSON	0.99+
five times	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
Kansas	LOCATION	0.99+
TransUnion	ORGANIZATION	0.99+
Paul Gillin	PERSON	0.99+
HP	ORGANIZATION	0.99+
three weeks	QUANTITY	0.99+
India	LOCATION	0.99+
10	QUANTITY	0.99+
Sierra Atlantic	ORGANIZATION	0.99+
Gokula Mishra	PERSON	0.99+
Doug	PERSON	0.99+
hundreds	QUANTITY	0.99+
Berkshire Hathaway	ORGANIZATION	0.99+
five companies	QUANTITY	0.99+
80%	QUANTITY	0.99+
U.S.	LOCATION	0.99+
Booz Allen	ORGANIZATION	0.99+
Tata Burroughs	ORGANIZATION	0.99+
first step	QUANTITY	0.99+
Gokula	PERSON	0.99+
next year	DATE	0.99+
thousands	QUANTITY	0.99+
McDonald's	ORGANIZATION	0.99+
one aspect	QUANTITY	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
first	QUANTITY	0.99+
yesterday	DATE	0.99+
thousands of molecules	QUANTITY	0.99+
first one	QUANTITY	0.99+
One	QUANTITY	0.98+
GE	ORGANIZATION	0.98+
northern Oklahoma	LOCATION	0.98+
today	DATE	0.97+
CS Associates	ORGANIZATION	0.97+
20 million dollars	QUANTITY	0.97+
one	QUANTITY	0.96+
First	QUANTITY	0.96+
Global Data and Analytics and Supply Chain	ORGANIZATION	0.95+
MIT CDOIQ Conference	EVENT	0.95+
13th year	QUANTITY	0.94+
U.S. government	ORGANIZATION	0.93+
two worlds	QUANTITY	0.92+
Azure Stack	TITLE	0.91+
one thing	QUANTITY	0.9+
one point	QUANTITY	0.9+
Northwestern	ORGANIZATION	0.9+
couple years ago	DATE	0.89+
MIT Chief Data Officer and Information Quality Symposium 2019	EVENT	0.87+
this century	DATE	0.85+
Tang building	LOCATION	0.85+
Global Data Analytics and	ORGANIZATION	0.83+
Chief Data Officer Information Quality Conference	EVENT	0.81+
MIT	ORGANIZATION	0.78+
theCUBE	ORGANIZATION	0.77+
thousand different combination of molecules	QUANTITY	0.74+
last	DATE	0.67+
years	DATE	0.66+
U.S.	ORGANIZATION	0.66+
billion dollar	QUANTITY	0.65+
themes	QUANTITY	0.65+
Osage Indian	OTHER	0.64+

Joe Caserta & Doug Laney, Caserta | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. >> Hi everybody. We're back in Cambridge, Massachusetts at the MIT Chief Data Officer Information Quality event, hashtag MITCDOIQ. I'm Dave Vellante. He's Paul Gillin. Day one of our two-day coverage of this event. This is theCUBE, the leader in live tech coverage. Joe Caserta is here. He's the president of Caserta, and Doug Laney, who is principal data strategist at Caserta. Both CUBE alums, guys, great to see you again. Joe, what, did you pick up this guy? How did that all come about? He came on here a couple of years ago. We had a great conversation. I read the book, loved it. So congratulations, a nice pickup. >> We're very fortunate to have him. >> Thanks. I'm fortunate to be here. >> Okay, well, what attracted you to Caserta? >> Joe's got a tremendous reputation. His team of consultants has a great reputation. We both felt there was an opportunity to build some data strategy competency on top of that and leverage some of those Infonomics ideas that I've been working on over the years. >> Great. Well, congratulations. And so, Joe, you and I have talked many times, and the reason I like talking to you is because you know what's going on in the marketplace. You can sift what's real from what's hype. So what do you see as the big trends in this data space, and then we'll get into it. >> Yeah, sure. The trends are, the chief data officer has been evolving over the last couple of years. You know, when we started doing this several years ago, there was just a handful of people, maybe 30 or 40 people. Now there are 450 people here today, and it's been evolving. People are still trying to find their feet: exactly what the chief data officer should be doing, where they are in the hierarchy. Should they report to the CEO, the CIO, or the other CDO, which is the digital officer?
So I think, you know, hierarchically they're still figuring it out, politically they're figuring it out, but technically also, they're still trying to figure it out. You know, what's been happening over the past three years is the evolution of data. Going from traditional data warehousing and business intelligence to get insight out of data just isn't working anymore. So evolving that, moving it forward to more modern data engineering, which we've been doing for the past couple of years with quote unquote big data, that's not working anymore either, right? Because it's been evolving so fast. So now we're on, like, maybe data 3.0. And now we're talking about just pure automate everything. We have to automate everything. And we have to change our mindset from having an output of a data solution to an outcome of a data solution. And that's why I hired Doug, because we have to figure out not only how to get this data and look at it and analyze it, but really how to monetize it, right? It's becoming a revenue stream for your business if you're doing it right, and Doug is the leader in the industry in how to figure that out. >> The key premise of your book was you've got to start valuing data, and it's fundamental. You put forth a number of approaches and techniques and examples of companies doing that. Since you published Infonomics, Microsoft, Apple, Amazon, Google and Facebook are the top five market value companies. They've surpassed all the financial services guys, all the ExxonMobils and any manufacturer, automobile makers. And they're all data companies, right? Absolutely. So intrinsically we know there's value there. Are we any closer to the prescription that you put forth? >> Yeah, it's really no surprise, and actually, we found that data companies have a market-to-book value that's nearly 33 times the market average, so Apple and others are much higher than that.
But on average, if you look at the data product companies, they're valued much higher than other companies, probably because data can be reused in multiple ways. One of the core tenets of Infonomics is that data is a non-depletable, regenerative, reusable asset, and that companies that get that and architect their businesses based on those economics of information can really perform well, and not just data companies, but-- >> Any company. That was a key takeaway of the book. Data doesn't conform to the laws of scarcity. Everybody says data is the new oil. It's like, no, it's not. It's more valuable. So what are some examples, in writing your book and with customers that you work with, where do you see companies outside of these big data-driven firms breaking new ground in uses of data? >> I think the biggest opportunity is really not with the big giant companies. Most of our most valuable clients are small companies with large volumes of data. And the reason why they can remain small companies with large volumes of data is the thing that holds back the big giant enterprises: they have so much technical debt. It's very hard. It's like trying to, you know, raise the Titanic, right? You can't really. It's not agile enough. You need something that's small and agile in order to pivot, because it is changing so fast. Every time there's a solution created, it's obsolete. We have to create the new solution, and when you have big old processes, big old technologies, big old mindsets and big old cultures, it's very hard to be agile. >> So is there no hope? I mean, the reason I ask the question is, what hope can you give some of these smokestack companies that they can become data-centric? >> Yeah, what you see is that there was a move to build big, monolithic data warehouses years ago, and even data lakes.
And what we find, through the wealth of examples of companies that have benefited in significant ways from data and analytics, is that most of those solutions are very vocational. They're very functionally specific. They're not enterprise-class, yada-yada kind of projects. They're focused on a particular business problem, or on monetizing or leveraging data in a very specific way, and they're generating millions of dollars of value. But again, they tend to be very, very functionally specific. >> The other trend that we're seeing is that the technology and the end result of what you're doing with your data is one thing. But really, in order to make that shift, if you're a big enterprise, the culture has to really change: all of the people within the organization have to migrate from being a conventional-wisdom-run company to being a data- and analytics-driven company. And that takes a lot of change management, a lot of what we call data therapy. We actually launched a new practice within the organization that Doug and I are collaborating on to really mature, because that is the next wave. We figured out the data part. We figured out the technology part. But now it's the people part. The people part is really why, even though we're way ahead of where we were a couple of years ago, we should be even further. Culturally, it's very, very challenging, and we need to address that head on. >> And is that a skills issue, that they're sort of locked into their existing skill sets and processes? Or is it fear of the unknown, you know? What about FOMO? >> Oh, yeah. Well, I mean, there are people jumping in to do this, right? So there is that part, the exciting part of it. But there's also just fear, you know, fear of the unknown, and, you know, part of what we're trying to do.
And why we're trying to push Doug's book is not for sales, but really just to share the knowledge, remove the mystery and let people see what they can actually do with this data. >> Yeah, it's more than just data literacy. There's a lot of talk in the industry about data literacy programs, educating business people on the data and educating data people on the business. And that's obviously important. But what Joe is talking about is something bigger than that. It's really cultural, and it's something that is a change to the company's DNA. >> So where do you attack that problem? Does it have to go from the top down? Do you go in at the middle? >> It has to be from the top down. It has to be. It has to be, because "my boss said to do it," right? >> Well, otherwise they might do it, but the organization... >> Because if you do it as a grassroots movement only, the folks who are excited, right, the FOMO people, they're the ones who are going to be excited. But they're going to evolve and adopt anyway, right? It's the rest of the organization, and that needs to be a top-down approach. >> It was interesting hearing this morning's keynote speakers. They sort of threw top-down under the bus, but I had the same reaction: you can't do it without that executive buy-in. And of course we defined, I guess, in the session what that was. Amazon has an interesting concept: every initiative that's funded has to have what they call a threaded leader. And if they don't have a threaded leader, there's like an incentive system to dime on the initiative, kill it. It kind of forces top-down. >> Yeah. So when we interview our clients, we have a litmus test. It's kind of a readiness test: do you have the executive leadership to actually make this project successful?
And in a lot of cases, they don't. And, you know, we'll have to say, "Well, call us when you're ready." Because one of the challenges, another part of the litmus test, is: is this IT-driven? If it's IT-driven, it's going to be very tough to get it embraced by the rest of the business. So we need to really have that executive leadership from the business to say this is something that we need to do to survive. >> Yeah, and, you know, without the top-down support you could play small ball. But if you're playing the Yankees, you're not going to win. >> One of the reasons why it's very challenging when it's IT-driven is that the people part, right, is a different budget from the IT budget. And when we start talking about data therapy, right, and human resources and training and education on just culture and data literacy, which is not necessarily technical, that becomes a challenge internally: figuring out how to pay for it and how to get it done with the corporate politics. >> So the CDO crowd, there are definitely parts of your book that they should be adopting, because to me their main job is: okay, how does data support the monetization of my organization? Raising revenue, cutting costs, improving productivity, saving lives. You call it value. And so that seems to be the starting point. At the same time, this conference grew out of the ashes of back-room information quality, and with the big data hype it exploded, and things have kind of gone full circle. So I wonder, is the CDO crowd still focused on that monetization? Certainly I think we all agree they should be, but they're getting sucked back into a governance role. Can they do both, I guess, is my question. >> Well, governance has been a big issue the past few years, with all of the new compliance regulations and the focus on ensuring compliance with them. But there's often just a pendulum swing back, and I think there's a swing back to adding business value.
And so we're seeing a lot of opportunities to help companies monetize their data broadly, in a variety of ways, as you mentioned, not just in one way. And again, those need to be driven from the top. We have a process that we go through to generate ideas, and that's wonderful. Generating ideas is fairly straightforward. But then we run them through kind of a feasibility gauntlet, starting with: do you have the executive support for that? Is it technologically feasible, managerially feasible, ethically feasible, and so forth? So we kind of run them through that gauntlet next. >> One of my concerns is the chief data officer, the level of involvement that he or she has in these digital initiatives. Again, is digital initiative a Field of Dreams? Maybe it is. But everywhere you go, the CEO is trying to get digital right, and it seems like the chief data officer is not necessarily front and center in those. Certainly in AI projects, which are skunkworks. But it's the chief digital officer that's driving it. So how do you see those roles playing out? >> In the last panel that I just spoke on, a very similar question was asked. And again, we're trying to figure out the hierarchy of where the CDO should live in an organization. I find that the biggest place it fails, typically, is if it rolls up to a CIO, right? If you think that data is a technical issue, you're wrong, right? Data is a business issue. And I also think for any company to survive today, they have to have a digital presence. And digital presence is so tightly coupled to data that I find the best success is when the chief data officer reports directly to the chief digital officer. The chief digital officer has a vision for the user experience for the customer, and has to figure out how we get that customer engaged. And that is directly dependent on insight, right, on analytics.
You know, if the four of us were to open up any application on our phones, even for the same product, we would have four different experiences based on who we are, who our peers are and what we bought in the past. That's all based on analytics. So the business application of the digital presence is tightly coupled to analytics, which is driven by the chief data officer. >> That's the first time I've heard that. I think that's the right organizational structure. The chief digital officer is going to be sort of the driver, right? The strategy. That's where the budget's going to go, and the chief data officer is going to have that supporting role that's vital. The enabler. You think the chief data officer is a long-term play? Will we have a lot of chief data officers still, 10 years from now? >> I think that data is not a fad. I think data is just going to become more and more important. And will they ultimately leapfrog the chief digital officer and report to the CEO? Maybe someday, but for now, I think that's where they belong. >> You know, once companies started managing their labor and workforces as an actual asset, even though it's not a balance sheet asset, for obvious reasons, in the 1960s, that gave rise to the chief human resource officer, which we still see today. And as companies start to recognize information as an asset, you need an executive leader to oversee and be responsible for that asset. >> Conceptually, it's always been that data is an asset and a liability. And, you know, we've always thought about it in balance sheet terms. Your book sort of put forth a formula for actually formalizing that, right? Do you think it's going to happen in our lifetime? Let's be exactly clear on what you put forth in your book: organizations actually valuing data, specifically on the balance sheet. >> So that's an accounting question, and one that, you know, I leave to the accounting professionals.
But there have been discussion papers published by the accounting standards bodies to discuss that issue. We're probably at least 10 years away, but I think, irrespective of whether data is a balance sheet asset or not, it's imperative for organizations to behave as if it is one. >> That was your point: it's probably not going to happen, but you've got to figure it out in terms that let you understand the value. >> Because it comes back to: you can't manage what you don't measure. If you're not measuring the value, the potential value or the quality of your information, or what data you have, you're in a poor position to manage it like one. And if you're not managing it like an asset, then you're probably not really able to leverage it like one. >> Give us a little commercial for... >> I do want to say that I do think in our lifetime we will see it become an asset. There are lots of intangible assets that are on the books: intellectual property, contracts. I think data that supports both of those things is equally as important. And they will see the light. >> Why are those five companies huge market cap winners, where they've surpassed all the... >> In the valuation of a business, the data that they have is considered, right? So it should be part of the assets on the books. >> All right, we've got to wrap, but give us the Caserta commercial. >> Well, Caserta is a consultancy that does essentially three things. We do data advisory work, which Doug is heading up. We do data architecture and strategy. And we also do implementation of solutions, everything from data engineering and data architecture to data science. >> Well, you made a good bet on data. Thanks for coming on, you guys. Great to see you again. >> Thank you. >> That's a wrap on day one. Paul and I will be back tomorrow for day two of the MIT CDOIQ. Thanks for watching. We'll see you tomorrow.

Published Date : Jul 31 2019



Colin Mahony, Vertica | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts everybody, you're watching theCUBE, the leader in tech coverage. My name is Dave Vellante here with my cohost Paul Gillin. This is day one of our two day coverage of the MIT CDOIQ conference. CDO, Chief Data Officer, IQ, information quality. Colin Mahony is here, he's a good friend and long time CUBE alum. I haven't seen you in a while, >> I know >> But thank you so much for taking some time, you're like a special guest here >> Thank you, yeah it's great to be here, thank you. >> Yeah, so, this is not, you know, something that you would normally attend. I caught up with you, invited you in. This conference started as, like, back office governance, information quality, kind of wonky stuff, hidden. And then when the big data meme took off, kind of around the time we met, the Chief Data Officer role emerged, the whole Hadoop thing exploded, and then this conference kind of got bigger and bigger and bigger. Still intimate, but very high level, very senior. It's kind of come full circle as we've been saying, you know, information quality still matters. You have been in this data business forever, so I wanted to invite you in just to get your perspectives, we'll talk about what's new with what's going on in your company, but let's go back a little bit. When we first met and even before, you saw it coming, you kind of invested your whole career into data. So, take us back 10 years, I mean it was so different, remember it was Batch, it was Hadoop, but it was cool. There was a lot of cool >> It's still cool. (laughs) >> projects going on, and it's still cool. But, take a look back. >> Yeah, so it's changed a lot, look, I got into it a while ago, I've always loved data, I had no idea, the explosion and the three V's of data that we've seen over the last decade.
But, data's really important, and it's just going to get more and more important. But as I look back I think what's really changed, and even if you just go back a decade I mean, there's an insatiable appetite for data. And that is not slowing down, it hasn't slowed down at all, and I think everybody wants that perfect solution that they can ask any question of and get an immediate answer. We went through the Hadoop boom, I'd argue that we're going through the Hadoop bust, but what people actually want is still the same. You know, they want real answers, accurate answers, they want them quickly, and they want it against all their information and all their data. And I think that Hadoop evolved a lot as well, you know, it started as one thing 10 years ago, with MapReduce and I think in the end what it's really been about is disrupting the storage market. But if you really look at what's disrupting storage right now, public clouds, S3, right? That's the new data lake. So there's always a lot of hype cycles, everybody talks about you know, now it's Cloud, everything, for maybe the last 10 years it was a lot of Hadoop, but at the end of the day I think what people want to do with data is still very much the same. And a lot of companies are still struggling with it, hence the role for Chief Data Officers to really figure out how do I monetize data on the one hand and how do I protect that asset on the other hand. >> Well so, and the cool thing is, so this conference is not a tech conference, really. And we love tech, we love talking about this, this is why I love having you on. We kind of have a little Vertica thread that I've created here, so Colin essentially, is the current CEO of Vertica, I know that's not your title, you're GM and Senior Vice President, but you're running Vertica. So, Michael Stonebraker's coming on tomorrow, >> Yeah, excellent. >> Chris Lynch is coming on tomorrow, >> Oh, great, yeah. >> we've got Andy Palmer >> Awesome, yeah.
>> coming up as well. >> Pretty cool. (laughs) >> So we have this connection, why is that important? It's because, you know, Vertica is a very cool company and is all about data, and it was all about disrupting, sort of the traditional relational database. It's kind of doing more with data, and if you go back to the roots of Vertica, it was like how do you do things faster? How do you really take advantage of data to really drive new business? And that's kind of what it's all about. And the tech behind it is really cool, we did your conference for many, many years. >> It's coming back by the way. >> Is it? >> Yeah, this March, so March 30th. >> Oh, wow, mark that down. >> At Boston, at the new Encore Hotel. >> Well we better have theCUBE there, bro. (laughs) >> Yeah, that's great. And yeah, you've done that conference >> Yep. >> haven't you before? So very cool customers, kind of leading edge, so I want to get to some of that, but let's talk the disruption for a minute. So you guys started with the whole architecture, MPP and so forth. And you talked about Cloud, Cloud really disrupted Hadoop. What are some of the other technology disruptions that you're seeing in the market space? >> I think, I mean, you know, it's hard not to talk about AI machine learning, and what one means versus the other, who knows right? But I think one thing that is definitely happening is people are leveraging the volumes of data and they're trying to use all the processing power and storage power that we have to do things that humans either are too expensive to do or simply can't do at the same speed and scale. And so, I think we're going through a renaissance where a lot more is being automated, certainly on the Vertica roadmap, and our path has always been initially to get the data in and then we want the platform to do a lot more for our customers, lots more analytics, lots more machine-learning in the platform. 
So that's definitely been a lot of the buzz around, but what's really funny is when you talk to a lot of customers they're still struggling with just some basic stuff. Forget about the predictive thing, first you've got to get to what happened in the past. Let's give accurate reporting on what's actually happening. The other big thing I think as a disruption is, I think IOT, for all the hype that it's getting it's very real. And every device is kicking off lots of information, the feedback loop of AB testing or quality testing for predictive maintenance, it's happening almost instantly. And so you're getting massive amounts of new data coming in, it's all this machine sensor type data, you got to figure out what it means really quick, and then you actually have to do something and act on it within seconds. And that's a whole new area for so many people. It's not their traditional enterprise data network warehouse and you know, back to your comment on Stonebraker, he got a lot of this right from the beginning, you know, and I think he looked at the architectures, he took a lot of the best in class designs, we didn't necessarily invent everything, but we put a lot of that together. And then I think the other thing you've got to do is constantly re-invent your platform. We came out with our Eon Mode to run cloud native, we just got rated the best cloud data warehouse from a net promoter score rating perspective, so, but we got to keep going you know, we got to keep re-inventing ourselves, but leverage everything that we've done in the past as well. >> So one of the things that you said, which is kind of relevant for here, Paul, is you're still seeing a real data quality issue that customers are wrestling with, and that's a big theme here, isn't it? >> Absolutely, and the, what goes around comes around, as Dave said earlier, we're still talking about information quality 13 years after this conference began. Have the tools to improve quality improved all that much?
>> I think the tools have improved, I think that's another area where machine learning, if you look at Tamr, and I know you're going to have Andy here tomorrow, they're leveraging a lot of the augmented things you can do with the processing to make it better. But I think one thing that makes the problem worse now, is it's gotten really easy to pour data in. It's gotten really easy to store data without having to have the right structure, the right quality, you know, 10 years ago, 20 years ago, everything was perfect before it got into the platform. Right, everything was, there was quality, everything was there. What's been happening over the last decade is you're pumping data into these systems, nobody knows if it's redundant data, nobody knows if the quality's any good, and the amount of data is massive. >> And it's cheap to store >> Very cheap to store. >> So people keep pumping it in. >> But I think that creates a lot of issues when it comes to data quality. So, I do think the technology's gotten better, I think there's a lot of companies that are doing a great job with it, but I think the challenge has definitely upped. >> So, go ahead. >> I'm sorry. You mentioned earlier that we're seeing the death of Hadoop, but I'd like you to elaborate on that because (Dave laughs) Hadoop actually came up this morning in the keynote, it's part of what GlaxoSmithKline did. Came up in a conversation I had with the CEO of Experian last week, I mean, it's still out there, why do you think it's in decline? >> I think, I mean first of all if you look at the Hadoop vendors that are out there, they've all been struggling. I mean some of them are shutting down, two of them have merged and they've got killed lately. I think there are some very successful implementations of Hadoop. I think Hadoop as a storage environment is wonderful, I think you can process a lot of data on Hadoop, but the problem with Hadoop is it became the panacea that was going to solve all things data.
It was going to be the database, it was going to be the data warehouse, it was going to do everything. >> That's usually the kiss of death, isn't it? >> It's the kiss of death. And it, you know, the killer app on Hadoop, ironically, became SQL. I mean, SQL's the killer app on Hadoop. If you want a SQL engine, you don't need Hadoop. But what we did was, in the beginning Mike sort of made fun of it, Stonebraker, and joked a lot about he's heard of MapReduce, it's called Group By, (Dave laughs) and that created a lot of tension between the early Vertica and Hadoop. I think, in the end, we embraced it. We sit next to Hadoop, we sit on top of Hadoop, we sit behind it, we sit in front of it, it's there. But I think what the reality check of the industry has been, certainly by the business folks in these companies is it has not fulfilled all the promises, it has not fulfilled a fraction of the promises that they bet on, and so they need to figure those things out. So I don't think it's going to go away completely, but I think its best success has been disrupting the storage market, and I think there's some much larger disruptions of technologies that frankly are better than HDFS to do that. >> And the Cloud was a gamechanger >> And a lot of them are in the cloud. >> Which is ironic, 'cause you know, Cloudera, (Colin laughs) they didn't really have a cloud strategy, neither did Hortonworks, neither did MapR and, it just so happened Amazon had one, Google had one, and Microsoft has one, so, it's just convenient to-- >> Well, how is that affecting your business? We've seen this massive migration to the cloud (mumbles) >> It's actually been great for us, so one of the things about Vertica is we run everywhere, and we made a decision a while ago, we had our own data warehouse as a service offering. It might have been ahead of its time, never really took off, what we did instead is we pivoted and we said "you know what?
"We're going to invest in that experience "so it's a SaaS-like experience, "but we're going to let our customers "have full control over the cloud. "And if they want to go to Amazon they can, "if they want to go to Google they can, "if they want to go to Azure they can." And we really invested in that and that experience. We're up on the Amazon marketplace, we have lots of customers running up on Amazon Cloud as well as Google and Azure now, and then about two years ago we went down and did this endeavor to completely re-architect our product so that we could separate compute and storage so that our customers could actually take advantage of the cloud economics as well. That's been huge for us, >> So you scale independent-- >> Scale independently, cloud native, add compute, take away compute, and for our existing customers, they're loving the hybrid aspect, they love that they can still run on Premise, they love that they can run up on a public cloud, they love that they can run in both places. So we will continue to invest a lot in that. And it is really, really important, and frankly, I think cloud has helped Vertica a lot, because being able to provision hardware quickly, being able to tie in to these public clouds, into our customers' accounts, give them control, has been great and we're going to continue on that path. >> Because Vertica's an ISV, I mean you're a software company. >> We're a software company. >> I know you were a part of HP for a while, and HP wanted to mash that in and run it on it's hardware, but software runs great in the cloud. And then to you it's another hardware platform. >> It's another hardware platform, exactly. >> So give us the update on Micro Focus, Micro Focus acquired Vertica as part of the HPE software business, how many years ago now? Two years ago? >> Less than two years ago. >> Okay, so how's that going, >> It's going great. >> Give us the update there. 
>> Yeah, so first of all it is great, HPE and HP were wonderful to Vertica, but it's great being part of a software company. Micro Focus is a software company. And more than just a software company it's a company that has a lot of experience bridging the old and the new. Leveraging all of the investments that you've made but also thinking about cloud and all these other things that are coming down the pike. I think for Vertica it's been really great because, as you've seen Vertica has gotten its identity back again. And that's something that Micro Focus is very good at. You can look at what Micro Focus did with SUSE, the Linux company, which actually you know, now just recently spun out of Micro Focus but, letting organizations like Vertica that have this culture, have this product, have this passion, really focus on our market and our customers and doing the right thing by them has been just really great for us and operating as a software company. The other nice thing is that we do integrate with a lot of other products, some of which came from the HPE side, some of which came from Micro Focus, security products is an example. The other really nice thing is we've been doing this insource thing at Micro Focus where we open up our source code to some of the other teams in Micro Focus and they've been contributing now in amazing ways to the product. In ways that we would just never be able to scale, but with 4,000 engineers strong in Micro Focus, we've got a much larger development organization that can actually contribute to the things that Vertica needs to do. And as we go into the cloud and as we do a lot more operational aspects, the experience that these teams have has been incredible, and security's another great example there. So overall it's been great, we've had four different owners of Vertica, our job is to continue what we do on the innovation side in the culture, but so far Micro Focus has been terrific. 
>> Well, I'd like to say, you're kind of getting that mojo back, because you guys as an independent company were doing your own thing, and then you did for a while inside of HP, >> We did. >> And that obviously changed, 'cause they wanted more integration, but, and Micro Focus, they know what they're doing, they know how to do acquisitions, they've been very successful. >> It's a very well run company, operationally. >> The SUSE piece was really interesting, spinning that out, because now RHEL is part of IBM, so now you've got SUSE as the lone independent. >> Yeah. >> Yeah. >> But I want to ask you, go back to a technology question, is NoSQL the next Hadoop? Are these databases, it seems to be that the hot fad now is NoSQL, it can do anything. Is the promise overblown? >> I think, I mean NoSQL has been out almost as long as Hadoop, and I, we always say not only SQL, right? Mike's said this from day one, best tool for the job. Nothing is going to do every job well, so I think that there are, whether it's key value stores or other types of NoSQL engines, document DB's, now you have some of these DB's that are running on different chips, >> Graph, yeah. >> there's always, yeah, graph DBs, there's always going to be specialty things. I think one of the things about our analytic platform is we can do, time series is a great example. Vertica's a great time series database. We can compete with specialized time series databases. But we also offer a lot of, the other things that you can do with Vertica that you wouldn't be able to do on a database like that. So, I always think there's going to be specialty products, I also think some of these can do a lot more workloads than you might think, but I don't see as much around the NoSQL movement as say I did a few years ago. >> But so, and you mentioned the cloud before as kind of, your position on it I think is a tailwind, not to put words in your mouth, >> Yeah, yeah, it's a great tailwind. 
>> You're in the Amazon marketplace, I mean they have products that are competitive, right? >> They do, they do. >> But, so how are you differentiating there? >> I think the way we differentiate, whether it's Redshift from Amazon, or BigQuery from Google, or even what Azure DB does is, first of all, Vertica, I think from, feature functionality and performance standpoint is ahead. Number one, I think the second thing, and we hear this from a lot of customers, especially at the C-level is they don't want to be locked into these full stacks of the clouds. Having the ability to take a product and run it across multiple clouds is a big thing, because the stack lock-in now, the full stack lock-in of these clouds is scary. It's really easy to develop in their ecosystems but you get very locked into them, and I think a lot of people are concerned about that. So that works really well for Vertica, but I think at the end of the day it's just, it's the robustness of the product, we continue to innovate, when you look at separating compute and storage, believe it or not, a lot of these cloud-native databases don't do that. And so we can actually leverage a lot of the cloud hardware better than the native cloud databases do themselves. So, like I said, we have to keep going, those guys aren't going to stop, and we actually have great relationships with those companies, we work really well with the clouds, they seem to care just as much about their cloud ecosystem as their own database products, and so I think that's going to continue as well. >> Well, Colin, congratulations on all the success >> Yeah, thank you, yeah. >> It's awesome to see you again and really appreciate you coming to >> Oh thank you, it's great, I appreciate the invite, >> MIT. >> it's great to be here. >> All right, keep it right there everybody, Paul and I will be back with our next guest from MIT, you're watching theCUBE. (electronic jingle)

Published Date : Jul 31 2019

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Andy Palmer	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Colin Mahoney	PERSON	0.99+
Paul	PERSON	0.99+
Colin	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Vertica	ORGANIZATION	0.99+
Chris Lynch	PERSON	0.99+
HPE	ORGANIZATION	0.99+
Michael Stonebreaker	PERSON	0.99+
HP	ORGANIZATION	0.99+
Micro Focus	ORGANIZATION	0.99+
Hadoop	TITLE	0.99+
Colin Mahony	PERSON	0.99+
last week	DATE	0.99+
Andy	PERSON	0.99+
March 30th	DATE	0.99+
NoSQL	TITLE	0.99+
Mike	PERSON	0.99+
Experian	ORGANIZATION	0.99+
tomorrow	DATE	0.99+
SQL	TITLE	0.99+
two day	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
Boston	LOCATION	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
4,000 engineers	QUANTITY	0.99+
Two years ago	DATE	0.99+
SUSE	TITLE	0.99+
Azure DB	TITLE	0.98+
second thing	QUANTITY	0.98+
20 years ago	DATE	0.98+
10 years ago	DATE	0.98+
one	QUANTITY	0.98+
Vertica	TITLE	0.98+
Hortonworks	ORGANIZATION	0.97+
MapReduce	ORGANIZATION	0.97+
one thing	QUANTITY	0.97+

Mark Krzysko, US Department of Defense | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's The Cube, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, everybody. We're here at the Tang building at MIT for the MIT CDOIQ Conference. This is the 13th annual MIT CDOIQ. It started as an information quality conference and grew through the big data era; the Chief Data Officer emerged, and now it's sort of a combination of those roles: that governance role and the Chief Data Officer role, critical for organizations for quality and data initiatives, leading digital transformations and the like. I'm Dave Vellante with my cohost Paul Gillin, and you're watching The Cube, the leader in tech coverage. Mark Krzysko is here, the deputy, sorry, Principal Deputy Director for Enterprise Information at the Department of Defense. Good to see you again, thanks for coming on. >> Oh, thank you for having me. >> So, Principal Deputy Director, Enterprise Information, what do you do? >> I do data. I do acquisition data. I'm the person in charge of aligning the acquisition data for the programs for the Under Secretary and the components, so it's a strong partnership with the Army, Navy, and Air Force to enable the department and the services to execute their programs better and more efficiently, and to be efficient in the data management. >> What is acquisition data? >> So acquisition data generally can best be considered in the shorthand of cost, schedule, and performance data. When a program is born, you have to manage it, you have to be sure it's resourced, you're reporting up to Congress, you need to be sure you have insight into the programs. And finally, sometimes you have to make decisions on those programs. So cost, schedule, performance is a good shorthand for it. >> So, kind of the key metrics and performance metrics around those initiatives. And how much of that is how you present that data? The visualization of it.
Is that part of your role or is that, sort of, another part of the organization you partner with? >> Well, if you think about it, the visualization can take many forms beyond that. So a good part of the role is finding the authoritative, trusted source of that data and making sure it's accurate, so we don't spend time disagreeing over different data sets on cost, schedule, and performance. The major programs are tremendously complex and large and involve an awful lot of data in the buildup to a point where you can look at that. It's not just about visualizing, it's about having governed, authoritative data that is, frankly, trustworthy, that you can go operate in. >> What are some of the challenges of getting good quality data? >> Well, I think part of the challenge was having a common lexicon across the department and the services. And as I said, the partnership with the services has been key in helping define and create a semantic data model for the department that we can use, so we can have agreement on what it would mean when we were using it and collecting it. The services have thrown all in and, from their perspective, have extended that data model down through their components to their programs so they can better manage the programs, because the programs are executed at a service level, not at an OSD level. >> Can you make that real? I mean, is there an example you can give us of what you mean by a common semantic model? >> So for cost and schedule, let's take a very simple one: program identification. Having a key number for that, having a long name, a short name, and having just the general description of that, those were in various states amongst the systems. We've had decades where whoever configured the system configured it the way they wanted to. It was largely not governed, and then trying to bring those data sets together was just impossible to do. So even with just program identification.
Since the majority of the programs and numbers are executed at a service level, we worked really hard to get the common words and meanings across all the programs. >> So it's a governance exercise, then? >> Yeah. It is certainly a governance exercise. I think about it not so much as what the IT world or the data world will call governance; it's leadership. Let's settle on some common semantics here that we can all live with and go forward and do that. Because clearly there are needs for other pieces of data that we may or may not have, but establishing a core set of common meanings across the department has proven very valuable. >> What are some of the key data challenges that the DOD faces? And how is your role helping address them? >> Well, in our case, and I'm certain there's a myriad of data choices across the department, in our case it was clarity in, and the governance of, this. Many of the pieces of data were required by statute, law, policy, or regulation. We came out of eras where data was a piece of a report and not really considered data. And we had to lead our way beyond the report to saying, "No, we're really talking about key data management." So we've been at this for a few years, and working with the services, that has been a challenge. I think we're at the point where we've established the common semantics for the department to go forward with that. And one of the challenges, I think, is access and dissemination: knowing what you can share and when you can share it. Because, as Michael Conlin said earlier, with data in mosaic, sometimes you really need to worry about it from our perspective. Is too much publicly available, or should we protect it on behalf of the government? >> That's a challenge. Is there a challenge in terms of, I'm sure there is, but I wonder if you can describe it or maybe talk about how you might have solved it, maybe it's not a big deal, but you've got to serve the mission of the organization. >> Absolutely.
>> That's, like, number one. But at the same time, you've got stakeholders, and they're powerful politicians, and they have needs, and there are transparency requirements, there are laws. They're not always aligned, those two directives, are they? >> No, thank goodness I don't have to deal with misalignments of those. We try to speak the truth: here's the data and the decisions across the organization. Our reports still go to Congress; they go to Congress on an annual basis through the Selected Acquisition Report. And, you know, we are better understanding what we need to protect and how to advise Congress on what should be protected and why. I would not say that's an easy proposition. The demands for those data come from the GAO, come from Congress, come from the Inspector General, and having to navigate that requires good access and dissemination controls and knowing why. We've sponsored some research through the RAND organization to help us look at and understand why you have got to protect it and what the policies, rules, and regulations are. And all those reports have been public so we could be sure that people would understand what it is. We're coming out of an era where data was not considered as it is today, where reports were easily stamped with a little rubber stamp, but data now moves at the velocity of milliseconds, not at the velocity of reports. So we really took a comprehensive look at that. How do you manage data in a world where it is data and it is on infrastructures like data models? >> So, the future of war. Everybody talks about cyber as the future of war. There's a lot of data associated with that. How does that change what you guys do? Or does it? >> Well, I think from an acquisition perspective, you would think, you know, in that discussion that you just presented us, we're micro in that. We're equipping and acquiring through acquisitions. What we've done is we make sure that our data is shareable, you know? Open API structures.
Having our data models. Letting the war fighters have our data so they could better understand where information is here. Letting other communities better help with that. By us doing our jobs where we sit, we can contribute to their missions, and we've always been very sharing in that. >> Is technology evolving to the point where, let's assume you could dial back 10 or 15 years and you had the nirvana of data quality. We know how fast technology is changing, but is it changing as an enabler to really leverage that quality of data in ways that you might not have even envisioned 10 or 15 years ago? >> I think technology is. I think a lot of this is not in tools, it's now in technique and management practices. I think many of us find ourselves rethinking how to do this now that you have data, now that you have tools that you can get it with. How can you adopt better and faster? That requires a cultural change in the organization. In some cases it requires more advanced skills, in other cases it requires you to think differently about the problems. I always like to consider that we, at some point, thought about it as a process-driven organization. Step one to step two to step three. Now process is ubiquitous because data becomes ubiquitous, and you can refactor your processes and decisions much more efficiently and effectively. >> What are some of the information quality problems you have to wrestle with? >> Well, in our case, by setting a definite semantic meaning, we kicked the quality problems to those who provide the authoritative data. And if they had a quality problem, we said, "Here's your data. We're going to now use it." So it changes the model to one where those who own the data ensure its quality. And by working with the services, they've worked down through their data issues and have used us a bit as a foil for cleaning up the data errors that they have from different inputs.
And I like to think about it as flipping the model and saying, "It's not my job to drive quality, it's my job to drive clarity; it's their job to drive the quality into the system." >> Let's talk about this event. So, you guys are long-time contributors to the event. Mark, have you been here since the beginning? Or close to it? >> Um... About halfway through, I think. >> When the focus was primarily on information quality? >> Yes. >> Was it CDOIQ at the time or was it IQ? >> It was the very beginnings of CDOIQ. It was right before it became CDOIQ. >> Early part of this decade? >> Yes. >> Okay. >> It was the Information Quality Symposium originally. Is that what attracted you to it? >> Well, yes, I was interested in it because I think there were two things that drew my interest. One, a colleague had told me about it, and we were just starting the data journey at that point. And it was talking about information quality, and it was out of a business school, on the MIT Sloan side of the house. And coming from a business perspective, it was not just the province of IT. I wanted to learn from others because I sit on the business side of the equation, not as a pure IT or technology person. And I came here to learn. I've never stopped learning through my entire journey here. >> What have you learned this week? >> Well, there's an awful lot I learned. This space is evolving so rapidly with the law, policy, and regulation. Establishing the CDOs, establishing the roles, getting to hear from the CDOs, getting to hear their visions, hearing from Michael Conlin and from others in the federal agencies. Having them up here and being able to collaborate and talk to them. Also hearing from the technology people, the people that are bringing solutions to the table. And then, I always say this is a bit like group therapy here, because many of us have similar problems, we have different start and end points, and learning from each other has proven to be very valuable.
From the hallway conversations to hearing somebody and seeing how they thought about the products, seeing how commercial industry has implemented data management. And you have a lot of similarity of focus among people who are trying to use data to bring value to their organizations and to understand their transformations. It's proven invaluable. >> Well, the appointment of the DOD's first CDO last year, what statement did that make to the organization? >> That data's important. Data are important. And having a CDO in that role, when Michael came on board, we shared some lessons learned and we were thinking about how to do that, you know? As I said, I function in what is arguably a silo of the institution, the acquisition data. But we were copying CDO homework, so it helped, in my mind, that we could go across to somebody else who would understand what we're trying to do and help us. And the CDO community has always been very sharing and collaborative, and I hold that true with Michael today. >> It's kind of the ethos of this event. I mean, obviously you guys have been heavily involved. We've always been thrilled to cover this. I think we started in 2013 and we've seen it grow; it's kind of fire marshal full now. We've got to get to a new facility, I understand. >> Fire marshal full. >> Next year. So congratulations on all the success. >> Yeah, I think it's important, and we've now seen, you know, you hear it, you can read it in every newspaper, every channel out there, that data are important. And what's more important than the factor of governance and the factor of bringing safety and security to the nation? >> I do feel like a lot, certainly in the commercial world, I don't know if it applies in the government, but a lot of these AI projects are moving really fast. Especially in Silicon Valley, there's this move fast and break things mentality.
And I think that's part of why you're seeing some of these big tech companies struggle right now because they're moving fast and they're breaking things without the governance injected and many CDOs are not heavily involved in some of these skunk works projects and it's almost like they're bolting on governance which has never been a great formula for success in areas like governance and compliance and security. You know, the philosophy of designing it in has tangible benefits. I wonder if you could comment on that? >> Yeah, I can talk about it as we think about it in our space and it may be limited. AI is a bit high on the hype curve as you might imagine right now, and the question would be is can it solve a problem that you have? Well, you just can't buy a piece of software or a methodology and have it solve a problem if you don't know what problem you're trying to solve and you wouldn't understand the answer when it gave it to you. And I think we have to raise our data intellectualism across the organization to better work with these products because they certainly represent utility but it's not like you give it with no fences on either side or you open up your aperture to find basic solution on this. How you move forward with it is your workforce has got to be in tune with that, you have to understand some of the data, at least the basics, and particularly with products when you get the machine learning AI deep learning, the models are going to be moving so fast that you have to intellectually understand them because you'll never be able to go all the way back and stubby pencil back to an answer. And if you don't have the skills and the math and the understanding of how these things are put together, it may not bring the value that they can bring to us. >> Mark, thanks very much for coming on The Cube. >> Thank you very much. >> Great to see you again and appreciate all the work you guys both do for the community. All right. And thank you for watching. 
We'll be right back with our next guest right after this short break. You're watching The Cube from MIT CDOIQ.

Published Date : Jul 31 2019

ENTITIES

Entity	Category	Confidence
Peter Burris	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Michael Dell	PERSON	0.99+
Rebecca Knight	PERSON	0.99+
Michael	PERSON	0.99+
Comcast	ORGANIZATION	0.99+
Elizabeth	PERSON	0.99+
Paul Gillan	PERSON	0.99+
Jeff Clark	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Nokia	ORGANIZATION	0.99+
Savannah	PERSON	0.99+
Dave	PERSON	0.99+
Richard	PERSON	0.99+
Micheal	PERSON	0.99+
Carolyn Rodz	PERSON	0.99+
Dave Vallante	PERSON	0.99+
Verizon	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Eric Seidman	PERSON	0.99+
Paul	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Google	ORGANIZATION	0.99+
Keith	PERSON	0.99+
Chris McNabb	PERSON	0.99+
Joe	PERSON	0.99+
Carolyn	PERSON	0.99+
Qualcomm	ORGANIZATION	0.99+
Alice	PERSON	0.99+
2006	DATE	0.99+
John	PERSON	0.99+
Netflix	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
congress	ORGANIZATION	0.99+
Ericsson	ORGANIZATION	0.99+
AT&T	ORGANIZATION	0.99+
Elizabeth Gore	PERSON	0.99+
Paul Gillen	PERSON	0.99+
Madhu Kutty	PERSON	0.99+
1999	DATE	0.99+
Michael Conlan	PERSON	0.99+
2013	DATE	0.99+
Michael Candolim	PERSON	0.99+
Pat	PERSON	0.99+
Yvonne Wassenaar	PERSON	0.99+
Mark Krzysko	PERSON	0.99+
Boston	LOCATION	0.99+
Pat Gelsinger	PERSON	0.99+
Dell	ORGANIZATION	0.99+
Willie Lu	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Yvonne	PERSON	0.99+
Hertz	ORGANIZATION	0.99+
Andy	PERSON	0.99+
2012	DATE	0.99+
Microsoft	ORGANIZATION	0.99+

Tom Davenport, Babson College | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's the Cube, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to MIT, everybody. You're watching the Cube, the leader in live tech coverage. My name is Dave Vellante, here with Paul Gillin, my co-host. Tom Davenport is here; he's the President's Distinguished Professor at Babson College. Cube alum, good to see you again, Tom. Thanks for coming on. >> Glad to be here. >> So, yeah, this is, let's see, the 13th annual MIT CDOIQ. >> Yeah, sure is this year. >> Our seventh, I think. Really? Maybe we'll offset. So you gave a talk earlier: should we be afraid of the machines, or should we embrace them? >> I think we should embrace them, because so far they are not capable of replacing us. I mean, you know, when we hit the singularity, which I'm not sure will ever happen, but it's certainly not going to happen anytime soon, we'll have a different answer. But for now they're good at small, narrow tasks, not so good at doing a lot of the things that we do. So I think we're fine. Although, as I said in my talk, I have some survey data suggesting that at large U.S. corporations, a substantial number of senior executives, more than half, would like to automate as many jobs as possible, they say. So that's a little scary. But fortunately for us humans, it's going to be a while before they succeed. >> We had a case last year where McDonald's employees were agitating for increasing the minimum wage, and the management used the threat of roboticizing the hamburger-making process, which can be done, to get them to back down. Do you think we're going to see more of that, where maybe AI is used as a threat? >> Well, I haven't heard too many other examples.
I think for those highly structured, relatively low-level tasks, it's quite possible, particularly if we do end up raising the minimum wage beyond a point where it's economical to pay humans to do the work. But I would like to think that, you know, if we gave humans the opportunity, they could do more than they're doing now in many cases. And one of the things I was saying is that companies generally, there are some exceptions, but most companies are not starting to retrain their workers. Amazon recently announced they're going to spend $700 million to retrain their workers to do things that AI and robots can't. But that's pretty rare. Certainly that level of commitment is very rare. So I think it's time for companies to start stepping up and saying, how can we develop a better combination of humans and machines? >> The work by, you know, Brynjolfsson and McAfee, which is a little dated now, definitely suggests that there are some things to be concerned about. Of course, ultimately their prescription was one of an optimist: education, and so on and so forth. But you know, the key point there is that machines have always replaced humans, but now it's in terms of cognitive functions. You see it everywhere. You drive to the airport, now it's electronic billboards, it's not some person putting them up, the kiosks, et cetera. But, you know, you've used the term "paving the cow path." We don't want to protect the past from the future. So, to your point, retraining, education, I mean, that's the opportunity here, isn't it? And the potential is enormous. >> Well, and, you know, let's face it, we haven't had much in the way of productivity improvements in the U.S. or any other advanced economy lately. So we need some, I guess, you know, replacement of humans by machines. But my argument has always been that you can handle innovation better.
You can avoid the sort of race to the bottom that automation sometimes leads to if you think creatively about humans and machines working as colleagues, in many cases. >> You remember in the PC boom, I forget which Fed chairman it was, it might have been Greenspan, said you can see progress everywhere except in the productivity statistics. >> That was an MIT professor, Robert Solow. >> OK, right. >> And then he won the Nobel Prize. >> But then, shortly thereafter, there was a huge productivity boom. So, I mean, is there maybe a pent-up... >> Well, God knows. I mean, everybody's wondering. We've been spending literally trillions on IT, and you would think that it would have led to productivity. But, you know, certain things like social media, I think, reduce productivity in the workplace. You know, we're all chatting and talking and Slacking and so on all over the place. Maybe that's not conducive to getting work done. >> It depends what you do with that social media. Here in our business, it's actually phenomenal to see political coverage these days, which almost entirely consists of reprinting politicians' tweets. >> Exactly. I guess it's made life easier for all the reporters sitting in the White House waiting for a press conference. >> They're not doing well. There aren't many reporters left. Where do you see, in your consulting work and your academic work, AI being used most effectively in organizations right now? And where do you think that's going to be three years from now? >> Well, I mean, the general category of use case is the sort of thing someone's calling "boring AI." It's data integration, one thing that's being discussed a lot at this conference. It's connecting your invoices to your contracts to see, did we actually get the stuff that we contracted for? It's doing a little bit better job of identifying fraud, and doing it faster. So all of those things are quite feasible. They're just not that exciting.
What we're not seeing is curing cancer, creating fully autonomous vehicles. You know, the really aggressive moonshots that we've been trying for a while just haven't succeeded. >> What if we kind of expand AI to (indistinct), the new cool stuff that's coming out? So considering all these new technologies, AI, blockchain, new security approaches: when do you think machines will be able to make better diagnoses than doctors? >> Well, I think, you know, in a very narrow sense, in some cases, they could do it now. But the thing is, first of all, take a radiologist, which is one of the doctors I think most at risk from this, because they don't typically meet with patients and they spend a lot of time looking at images. It turns out that the lab experiments that say, you know, these are better than human radiologists tend to be very narrow, and what one lab does is different from another lab. So it's going to take a very long time to make it into, you know, production deployment in the physician's office. We'll probably have to have some regulatory approval of it. You know, the lab research is great. It's just getting it into day-to-day reality that is the problem. >> Okay. So staying in this context of digital as sort of an umbrella topic, do you think large retail stores will largely disappear? >> Some sectors more than others, for things that you don't need to touch and feel and so on before you buy them. Certainly. Even then, obviously, it's happening more and more in e-commerce. What people are saying will disappear next is the human at the point of sale, and we've been talking about that for a while. In grocery, that's not been achieved so much yet in the U.S. Amazon Go is a really interesting experiment, where every time I go in there, I try to shoplift. It took a while, and now they have 12 stores.
It's not huge yet, but I think if you're in one of those jobs where a substantial chunk of it is automatable, then you really want to start looking around, thinking, what else can I do to add value to these machines? >> Do you think traditional banks will lose control of the payment system? >> No, I don't, because the fintechs that you see thus far keep getting bought by traditional banks. So my guess is that people will want that certainty. And you know, the funny thing about blockchain: we say in principle it's more secure because it's spread across a lot of different ledgers, but people keep hacking into Bitcoin, so it makes you wonder. I think blockchain is going to take longer than we thought as well. So, you know, in my latest book, which is called The AI Advantage, I start out by talking about Amara's Law. This guy Roy Amara was a futurist, and it's not nearly as well known as Moore's Law. But it said, you know, for every new technology, we tend to overestimate its impact in the short run and underestimate it in the long run. And so I think AI will end up doing great things. We may have sort of tuned it out by the time it actually happens, when we finally have autonomous vehicles. We've been talking about it for 50 years. >> Last one. So one of the Democratic candidates, of the 75 Democratic candidates last night, mentioned a chief manufacturing officer. Do you see that automation will actually swing the pendulum and bring back manufacturing to the U.S.? >> I think it could, if we were really aggressive about using digital technologies in manufacturing: doing 3D manufacturing, doing digital twins of every device, and so on. But we are not being as aggressive as we ought to be. And manufacturing companies have been kind of slow and, I think, somewhat delinquent in embracing these things. So they're going to, I think, lose the ability to compete. We have to really go at it in a big way to bring it all back. >> We've got an election coming up.
There are a lot of concern following the last election about the potential of AI chatbots, Twitter chatbots, deepfakes, technologies that obscure or alter reality. Are you worried about what's coming in the next year? And that that >> could never happen, Paul. We could never see anything. Deepfakes I'm quite worried about. We don't seem. I know there's some organizations working on how we would certify, you know, an image as being real. But we're not there yet. My guess is, certainly by the time the election happens, we're going to have all sorts of political candidates saying things that they never really said through deepfakes and image manipulation. >> Scary. What do you think about the call to break up Big Tech? What's your position on that? >> I think that's a self-inflicted wound. You know, we just saw, for example, that the automobile manufacturers decided to get together. Even though the federal government isn't asking for better mileage, they said, we'll do it. We'll work with you in union of states that are more advanced. If Big Tech had said, we're gonna work together to develop standards of ethical behavior and privacy and data and so on, they could've prevented some of this. Unless they change their attitude really quickly, and I've seen some of it. Salesforce people are talking about the need for data standard, data protection standards. I must say, change quickly. I think they're going to get legislation imposed and maybe get broken up. It's gonna take a while. Depends on the next administration, but they're not being smart >> about it. You look it. I'm sure you see a lot of demos of advanced AI type technology over the last year, what is really impressed you. >> You know, I think the biggest advances have clearly been in image recognition looking the other day. It's a big problem with that is you need a lot of labeled data.
It's one of the reasons why Google was able to identify cat photos on the Internet is we had a lot of labeled cat images in the ImageNet open source database. But the ability to start generating images to do synthetic labeled data, I think, could really make a big difference in how rapidly image recognition works. >> What even synthetic? I'm sorry >> where we would actually create. We wouldn't have to have somebody go around taking pictures of cats. We create a bunch of different cat photos, label them as cat photos, have variations in them, you know, unless we have a lot of variation in images. That's one of the reasons why we can't use autonomous vehicles yet, because images differ in the rain and the snow. And so we're gonna have to have synthetic snow, synthetic rain to identify those images. So, you know, the GPU chip still realizes that's a pedestrian walking across there, even though it's kind of buzzed up right now. Just a little bit of variation in the image can throw off the recognition altogether. >> Tom, hey, thanks so much for coming in theCUBE. It's great to see you. We gotta go play catch. >> You're welcome. >> Keep it right there, everybody. We'll be back from MIT CDOIQ in Cambridge, Massachusetts. Dave Vellante, Paul Gillin. You're watching theCUBE.

Published Date : Jul 31 2019

SUMMARY :

Brought to you by My co host, Tom Davenport, is here is the president's distinguished professor at Babson College. I I mean, you know, when we hit the singularity, Are you think we're going to Seymour of four that were maybe a eyes used as you know, if we gave humans the opportunity, they could do Maur than they're doing now But you know, the key point there is the machines the term, you know, paid the cow path. Well, and, you know, in the workplace and you know, we're all chatting and talking It's actually it's phenomenal to see reporters sitting in the White House waiting for a press conference. And where do you think that's gonna be three years from now? I think you know, in a very narrow sense in some cases, No, I don't because the Finn techs that you see thus far keep There are a lot of concern following the last election about the potential of a I chatbots you know, an image as being really But we're not there yet. I'm sure you see a lot of demos of advanced A But the ability to start generating images to do synthetic as cat photos have variations in them, you know, unless we have

ENTITIES

Entity	Category	Confidence
McDonald	ORGANIZATION	0.99+
Dave Volonte	PERSON	0.99+
Paul Gillis	PERSON	0.99+
Roy Amara	PERSON	0.99+
Paul Guillen	PERSON	0.99+
Tom Davenport	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Tom	PERSON	0.99+
Seymour	PERSON	0.99+
700,000,000	QUANTITY	0.99+
12 stores	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Robert Solow	PERSON	0.99+
Paul	PERSON	0.99+
last year	DATE	0.99+
Silicon Angle Media	ORGANIZATION	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
one	QUANTITY	0.99+
50 years	QUANTITY	0.99+
U. S.	LOCATION	0.99+
Babson College	ORGANIZATION	0.99+
Huebel	PERSON	0.99+
next year	DATE	0.99+
Fed	ORGANIZATION	0.98+
four	QUANTITY	0.98+
Democratic	ORGANIZATION	0.98+
more than half	QUANTITY	0.98+
M I. T.	PERSON	0.98+
seventh	QUANTITY	0.98+
2019	DATE	0.98+
Nobel Prize	TITLE	0.97+
McAfee	ORGANIZATION	0.97+
Greenspan	PERSON	0.97+
Twitter	ORGANIZATION	0.96+
One	QUANTITY	0.96+
U. S.	LOCATION	0.96+
one lab	QUANTITY	0.96+
Ryan	PERSON	0.95+
Catch	TITLE	0.95+
this year	DATE	0.95+
last night	DATE	0.94+
Big Tak	ORGANIZATION	0.87+
Professor	PERSON	0.84+
Aye Aye Advantage	TITLE	0.84+
75	QUANTITY	0.84+
Amazon Go	ORGANIZATION	0.81+
U.	ORGANIZATION	0.78+
Maur	PERSON	0.77+
trillions	QUANTITY	0.76+
Nelson	ORGANIZATION	0.73+
Tamara	PERSON	0.71+
one of the reasons	QUANTITY	0.71+
White House	ORGANIZATION	0.69+
Big check	ORGANIZATION	0.69+
Law	TITLE	0.67+
three years	QUANTITY	0.66+
M I t. Cdo	EVENT	0.66+
M	PERSON	0.65+
Moore	PERSON	0.59+
13th annual	QUANTITY	0.58+
first	QUANTITY	0.57+
Last	QUANTITY	0.54+
Aye	PERSON	0.52+
MIT CDOIQ	ORGANIZATION	0.51+
M.	PERSON	0.48+
Finn	ORGANIZATION	0.45+
Cube	TITLE	0.41+

Lisa Ehrlinger, Johannes Kepler University | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Hi, everybody, welcome back to Cambridge, Massachusetts. This is theCUBE, the leader in tech coverage. I'm Dave Vellante with my cohost, Paul Gillin, and we're here covering the MIT Chief Data Officer Information Quality Conference, #MITCDOIQ. Lisa Ehrlinger is here, she's the Senior Researcher at the Johannes Kepler University in Linz, Austria, and the Software Competence Center in Hagenberg. Lisa, thanks for coming in theCUBE, great to see you. >> Thanks for having me, it's great to be here. >> You're welcome. So Friday you're going to lay out the results of the study, and it's a study of Data Quality Tools. Kind of the long tail of tools, some of those ones that may not have made the Gartner Magic Quadrant and maybe other studies, but talk about the study and why it was initiated. >> Okay, so the main motivation for this study was actually a very practical one, because we have many company projects with companies from different domains, like steel industry, financial sector, and also focus on automotive industry at our department at Johannes Kepler University in Linz. We have experience with these companies for more than 20 years, actually, in this department, and what reoccurred was the fact that we spent the majority of time in such big data projects on data quality measurement and improvement tasks. So at some point we thought, okay, what possibilities are there to automate these tasks and what tools are out there on the market to automate these data quality tasks. So this was actually the motivation why we thought, okay, we'll look at those tools. Also, companies ask us, "Do you have any suggestions? "Which tool performs best in this-and-this domain?" And I think this study answers some questions that have not been answered so far in this particular detail, in these details. 
For example, Gartner Magic Quadrant of Data Quality Tools, it's pretty interesting but it's very high-level and focusing on some global vendors, but it does not look on the specific measurement functionalities. >> Yeah, you have to have some certain number of whatever, customers or revenue to get into the Magic Quadrant. So there's a long tail that they don't cover. But talk a little bit more about the methodology, was it sort of you got hands-on or was it more just kind of investigating what the capabilities of the tools were, talking to customers? How did you come to the conclusions? >> We actually approached this from a very scientific side. We conducted a systematic search, which tools are out there on the market, not only industrial tools, but also open-source tools were included. And I think this gives a really nice digest of the market from different perspectives, because we also include some tools that have not been investigated by Gartner, for example, like MobyDQ, or Apache Griffin, which has really nice monitoring capabilities, but lacks some other features from these comprehensive tools, of course. >> So was the goal of the methodology largely to capture a feature function analysis of being able to compare that in terms of binary, did it have it or not, how robust is it? And try to develop a common taxonomy across all these tools, is that what you did? >> So we came up with a very detailed requirements catalog, which is divided into three fields, like the focuses on data profiling to get a first insight into data quality. The second is data quality management in terms of dimensions, metrics, and rules. And the third part is dedicated to data quality monitoring over time, and for all those three categories, we came up with different case studies on a database, on a test database. And so we conducted, we looked, okay, does this tool, yes, support this feature, no, or partially? And when partially, to which extent?
So I think, especially on the partial assessment, we got a lot into detail in our survey, which is available on arXiv online already. So the preliminary results are already online. >> How do you find it? Where is it available? >> On arXiv. >> Archive? >> Yes. >> What's the URL, sorry. Archive.com, or .org, or-- >> arXiv.org, yeah. >> arXiv.org. >> But actually there is an ID I have not with me currently, but I can send you afterwards, yeah. >> Yeah, maybe you can post that with the show notes. >> We can post it afterwards. >> I was amazed, you tested 667 tools. Now, I would've expected that there would be 30 or 40. Where are all of these, what do all of these long tail tools do? Are they specialized by industry or by function? >> Oh, sorry, I think we got some confusion here, because we identified 667 tools out there on the market, but we narrowed this down. Because, as you said, it's quite impossible to observe all those tools. >> But the question still stands, what is the difference, what are these very small, niche tools? What do they do? >> So most of them are domain-specific, and I think this really highlights also these very basic early definition about data quality, of like data quality is defined as fitness for use, and we can pretty much see it here that we excluded the majority of these tools just because they assess some specific kind of data, and we just really wanted to find tools that are generally applicable for different kinds of data, for structured data, unstructured data, and so on. And most of these tools, okay, someone came up with, we want to assess the quality of our, I don't know, like geological data or something like that, yeah. >> To what extent did you consider other sort of non-technical factors? Did you do that at all? I mean, was there pricing or complexity of downloading or, you know, is there a free version available? Did you ignore those and just focus on the feature function, or did those play a role?
>> So basically the focus was on the feature function, but of course we had to contact the customer support. Especially with the commercial tools, we had to ask them to provide us with some trial licenses, and there we perceived different feedback from those companies, and I think the best comprehensive study here is definitely Gartner Magic Quadrant for Data Quality Tools, because they give a broad assessment here, but what we also highlight in our study are companies that have a very open support and they are very willing to support you. For example, Informatica Data Quality, we perceived a really close interaction with them in terms of support, trial licenses, and also like specific functionality. Also Experian, our contact from Experian from France was really helpful here. And other companies, like IBM, they focus on big vendors, and here, it was not able to assess these tools, for example, yeah. >> Okay, but the other differences of the Magic Quadrant is you guys actually used the tools, played with them, experienced firsthand the customer experience. >> Exactly, yeah. >> Did you talk to customers as well, or, because you were the customer, you had that experience. >> Yes, I were the customer, but I was also happy to attend some data quality event in Vienna, and there I met some other customers who had experience with single tools. Not of course this wide range we observed, but it was interesting to get feedback on single tools and verify our results, and it matched pretty good. >> How large was the team that ran the study? >> Five people. >> Five people, and how long did it take you from start to finish? >> Actually, we performed it for one year, roughly. The assessment. And I think it's a pretty long time, especially when you see how quick the market responds, especially in the open source field. 
But nevertheless, you need to make some cut, and I think it's a very recent study now, and there is also the idea to publish it now, the preliminary results, and we are happy with that. >> Were there any surprises in the results? >> I think the main results, or one of the surprises was that we think that there is definitely more potential for automation, but not only for automation. I really enjoyed the keynote this morning that we need more automation, but at the same time, we think that there is also the demand for more declaration. We observed some tools that say, yeah, we apply machine learning, and then you look into their documentation and find no information, which algorithm, which parameters, which thresholds. So I think this is definitely, especially if you want to assess the data quality, you really need to know what algorithm and how it's attuned and give the user, which in most case will be a technical person with technical background, like some chief data officer. And he or she really needs to have the possibility to tune these algorithms to get reliable results and to know what's going on and why, which records are selected, for example. >> So now what? You're presenting the results, right? You're obviously here at this conference and other conferences, and so it's been what, a year, right? >> Yes. >> And so what's the next wave? What's next for you? >> The next wave, we're currently working on a project which is called some Knowledge Graph for Data Quality Assessment, which should tackle two problems in ones. The first is to come up with a semantic representation of your data landscape in your company, but not only the data landscape itself in terms of gathering meta data, but also to automatically improve or annotate this data schema with data profiles. 
And I think what we've seen in the tools, we have a lot of capabilities for data profiling, but this is usually left to the user ad hoc, and here, we store it centrally and allow the user to continuously verify newly incoming data if this adheres to this standard data profile. And I think this is definitely one step into the way into more automation, and also I think it's the most... The best thing here with this approach would be to overcome this very arduous way of coming up with all the single rules within a team, but present the data profile to a group of data, within your data quality project to those peoples involved in the projects, and then they can verify the project and only update it and refine it, but they have some automated basis that is presented to them. >> Oh, great, same team or new team? >> Same team, yeah. >> Oh, great. >> We're continuing with it. >> Well, Lisa, thanks so much for coming to theCUBE and sharing the results of your study. Good luck with your talk on Friday. >> Thank you very much, thank you. >> All right, and thank you for watching. Keep it right there, everybody. We'll be back with our next guest right after this short break. From MIT CDOIQ, you're watching theCUBE. (upbeat music)

Published Date : Jul 31 2019

SUMMARY :

Brought to you by SiliconANGLE Media. and the Software Competence Center in Hagenberg. it's great to be here. Kind of the long tail of tools, Okay, so the main motivation for this study of the tools were, talking to customers? And I think this gives a really nice digest of the market And the third part is dedicated to data quality monitoring What's the URL, sorry. but I can send you afterwards, yeah. Yeah, maybe you can post that I was amazed, you tested 667 tools. Oh, sorry, I think we got some confusion here, and I think this really highlights also these very basic So basically the focus was on the feature function, Okay, but the other differences of the Magic Quadrant Did you talk to customers as well, or, and there I met some other customers and we are happy with that. or one of the surprises was that we think but present the data profile to a group of data, and sharing the results of your study. All right, and thank you for watching.

ENTITIES

Entity	Category	Confidence
Lisa Ehrlinger	PERSON	0.99+
Paul Gillin	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Hagenberg	LOCATION	0.99+
Lisa	PERSON	0.99+
Vienna	LOCATION	0.99+
Linz	LOCATION	0.99+
Five people	QUANTITY	0.99+
30	QUANTITY	0.99+
Johannes Kepler University	ORGANIZATION	0.99+
40	QUANTITY	0.99+
Friday	DATE	0.99+
one year	QUANTITY	0.99+
667 tools	QUANTITY	0.99+
France	LOCATION	0.99+
three categories	QUANTITY	0.99+
third part	QUANTITY	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
Experian	ORGANIZATION	0.99+
second	QUANTITY	0.99+
two problems	QUANTITY	0.99+
more than 20 years	QUANTITY	0.99+
Gartner	ORGANIZATION	0.99+
single tools	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.98+
first	QUANTITY	0.98+
MIT CDOIQ	ORGANIZATION	0.98+
a year	QUANTITY	0.97+
three fields	QUANTITY	0.97+
Apache Griffin	ORGANIZATION	0.97+
Archive.org	OTHER	0.96+
.org	OTHER	0.96+
one step	QUANTITY	0.96+
Linz, Austria	LOCATION	0.95+
one	QUANTITY	0.94+
single	QUANTITY	0.94+
first insight	QUANTITY	0.93+
theCUBE	ORGANIZATION	0.92+
2019	DATE	0.92+
this morning	DATE	0.91+
BTQ	ORGANIZATION	0.91+
MIT Chief Data Officer and	EVENT	0.9+
Archive.com	OTHER	0.88+
Informatica	ORGANIZATION	0.85+
Software Competence Center	ORGANIZATION	0.84+
Information Quality Symposium 2019	EVENT	0.81+
MIT Chief Data Officer Information Quality Conference	EVENT	0.72+
Data Quality	ORGANIZATION	0.67+
#MITCDOIQ	EVENT	0.65+
Magic Quadrant	COMMERCIAL_ITEM	0.63+
Magic	COMMERCIAL_ITEM	0.45+
next	EVENT	0.44+
wave	EVENT	0.43+
Magic Quadrant	ORGANIZATION	0.43+
wave	DATE	0.41+
Magic	TITLE	0.39+

Michael Conlin, US Department of Defense | MIT CDOIQ 2019

(upbeat music) >> From Cambridge, Massachusetts, it's the CUBE. Covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. (upbeat music) >> Welcome back to MIT in Cambridge, Massachusetts, everybody. You're watching the CUBE, the leader in live tech coverage. We go out to the events and extract the signal from the noise. We're here at the MIT CDOIQ. It's the MIT Chief Data Officer event, the 13th annual event. The CUBE started covering this show in 2013. I'm Dave Vellante with Paul Gillin, my co-host, and Michael Conlin is here as the chief data officer of the Department of Defense. Michael, welcome, thank you for coming on. >> Thank you, it's a pleasure to be here. >> So the DoD is, I think it's the largest organization in the world. What does the chief data officer of the DoD do on a day-to-day basis? >> A range of things, because we have a range of challenges at the Department of Defense. We are the single largest organization on the planet. We have the greatest scope and scale and complexity. We have the most dangerous competitors of anybody on the planet. It's not a trivial issue for us. So, I've a range of challenges. Challenges around, how do I lift the overall performance of the department using data effectively? How do I help executives make better decisions faster, using more recent, more common data? More common enterprise data is the expression we use. How do I help them become more sophisticated consumers of data and especially data analytics? And, how do we get to the point where I can compare performance over here with performance over there, on a common basis? And compared to commercial benchmark? Which is now an expectation for us, and ask are we doing this as well as we should, right across the patch? Knowing that all that data comes from multiple different places to start with. So we have to overcome all those differences and provide that department-wide view. That's the essence of the role.
And now with the recent passage of the Foundations for Evidence-Based Policymaking Act, there are a number of additional expectations that go on top of that, but this is ultimately about improving affordability and performance of the department. >> So overall performance of the organization... >> Overall performance. >> ...as well, and maybe that comes from supporting various initiatives, and making sure you're driving performance on that basis as well. >> It does, but our litmus test is, are we enabling the National Defense Strategy to succeed? Only reason to touch data is to enable the National Defense Strategy to be more successful than without it. And so we're always measuring ourselves against that. But it is, can we objectively say we're performing better? Can we objectively say that we are more affordable? In terms of the way we support the National Defense Strategy. >> I'm curious about your motivations for taking on this assignment because your background, as I see, is primarily in the private sector. A year ago you joined the US Department of Defense. A huge set of issues that you're tackling now, why'd you do it? >> So I am a capitalist, like most Americans, and I'm a serial entrepreneur. This was my first opportunity to serve government. And when I looked at it, knowing that I could directly support national defense, knowing that I could make a direct meaningful contribution, let me exercise that spirit of patriotism that many of us have, but we've just not found ourselves an opportunity. When this opportunity came along I just couldn't say no to it. There's so much to be done and so much appetite for improvement that I just couldn't walk away from this. Now I have to tell you, when you start you take an oath of office to protect and defend the Constitution. I don't know, it's maybe a paragraph or maybe it's two paragraphs. It felt like it took an hour to choke it out, because I was suddenly struck with all of this emotion.
>> The gravity of what you were doing. >> Yeah, the gravity of what I'm doing. And that was just a reinforcement of the choice I'd already made, obviously, right. But the chance to be the first chief data officer of the entire Department of Defense, just an enormous privilege. The chance to bring commercial sector best practices in and really lift the game of the department, again enormous privilege. There's so many people who could do this, probably better than me. The fact that I got the opportunity, I just couldn't say no. Just too important, too many places I could see that we could make things better. I think anybody with a patriotic bone in their body would have jumped at the opportunity. >> That's awesome, I love that. Congratulations on getting that role and seemingly thriving in it. A big part of preserving that capitalist belief, defending the Constitution and the American way, it sounds corny, but... >> It's real. >> I'm a patriot as well, is security. And security and data are intertwined. And just the whole future of warfare is dramatically changing. Can you talk about, in a format like this, security, your thinking on that, the department's thinking on that from a CDO's perspective? >> So as you know we have a number of swimlanes within the department and security is a very clear swimlane. It's aligned under our chief information officer, but security is everybody's responsibility, of course. Now the longstanding criticism of security people is that they think the best way to secure anything is to permit nobody to touch it. The clear expectation for me as chief data officer is to make sure that information is shared to the right people as rapidly as possible. And that's a different philosophy. Now I'm really lucky. Lieutenant General Dennis Crall, our principal cyber advisor, Dana Deasy, our CIO, these people understand how important it is to get information in the right place at the right time, make it rapidly available and secure it every step along the way.
We embrace the zero trust mantra. And because we embrace the zero trust mantra we're directly concerned with defending the data itself. And as long as we defend the data and the same mechanisms are the mechanisms we use to let people share it, suddenly the tension goes away. Suddenly we all have the same goal. Because the goal is not to prevent use of data, it's to enable use of data in a secure way. So the traditional tension that might be in that place doesn't exist in the department. Very productive, very professional level of collaboration with those folks in this space. Very sophisticated people. >> When we were talking before we went live you mentioned that the DoD has 10,000 plus operational systems... >> That's correct. >> A portfolio of that magnitude, just overwhelming. I mean, how did you know what to do first when you moved into this job, or did you have a clear mandate when you were hired? >> So I did have a clear mandate when I was hired and luckily that was spelled out. We knew what to do first because we sat down with actual leaders of the department and asked them what their goals were for improving the performance of the department. And everything starts from that conversation. You find those executives who want to improve performance, you understand what those goals are, and what data they need to manage that improvement. And you capture all the critical business questions they need answers to. From that point on they're bought in to everything that happens, right. Because they want those answers to those critical business questions. They have performance targets of their own that this is now aligned with. And so you have the support you need to go down the rest of the path of finding the data, standardizing it, et cetera, in order to deliver the answers to those questions. But it all starts with either the business mission leaders or the warfighting mission leaders who define the steps they're taking to implement the National Defense Strategy.
Everything gets lined up against that, you get instant support and you know you're going after the right thing. This is not an "if you build it, they will come." This is not a driftnet across the organization to try to gather up all the data. This is spear fishing for specific answers to materially important questions, and everything we do is done on that basis. >> We heard Mark Ramsey this morning talk about the... He showed a picture of stove pipes and then he complicated that picture by showing multiple copies within each of those stove pipes, and says this is organizations that we've all lived in. >> That's my organization too. >> So talk about some of those data challenges at the DoD and how you're addressing those, specifically how you're enabling soldiers in the field to get the right data to the field when they need it. >> So we'll be delicate when we talk about what we do for soldiers in the field. >> Understood, yeah. >> That tends to be sensitive. >> Understand why, sure. >> But all of those dynamics that Mark described in that presentation are present in every large corporation I've ever served. And that includes the Department of Defense. That heterogeneity and sprawl of IT, that's what I would refer to, he showed us a hair ball of IT. Every large organization has a hair ball of IT. And data scattered all over the place. We took many of the same steps that he described in terms of organizing and presenting meaningful answers to questions, in almost exactly the same sequence. The challenge, as you heard me use the statistic, is that our CIO's published Digital Modernization Strategy calls out that we have roughly 10,000 operational systems. Well, every one of them is different. Every one's put in place by a different group of people at a different time, with a different set of requirements, and a different budget, and a different focus. You know, organizational scope. We're just like he showed. We're trying to blend all that in to a common view.
So we have to find what's the real authoritative piece of data, 'cause it's not all of those systems. It's only a subset of those systems. And you have to do all of the mapping and translations to make the result add up. Otherwise you double count or you miss something. This is work in progress. This will always be a work in progress for any large organization. So I don't want to give you the impression it's all sorted. Definitely not all sorted. But the reality is we're trying to get to the point where people can see the data that's available, and that's a requirement, by the way, under the Foundations Act, that we have a data catalog, an authoritative data catalog, so people can see it and they have the ability to then request access to that through automation. This is what's critical: you need to be able to request access and have it arbitrated on the basis of whether you should directly have access based on your role, your workflow, et cetera, but it should happen in real time. You don't want to wait weeks, or months, or however long for some paperwork to move around. So this all has to become highly automated. So, what's the data, who can access it under what policy, for what purpose? Our roles and responsibilities? Identity management? All this is a combined set of solutions that we have to put in place. I'm mostly worried about a subset of that. My colleagues in these other swimlanes are working to do the rest. Most people in the department have access to data they need in their space. That hasn't been a problem. The problem is you go from space to space, you have to learn a new set of systems and a new set of techniques for a new set of data formats, which means you have to be retrained. That really limits our freedom of maneuver of human beings. In the ideal world you'd be able to move from any job in any part of the department to the same job in another part of the department with no retraining whatsoever. You'd be instantly able to make a contribution.
That's what we're trying to get to. So that's a different kind of a challenge, right. How do we get that level of consistency in the user experience, a modern user experience. So that if I'm a real estate manager, or I'm a medical business manager, or I'm a clinical professional, or I'm whatever, I can go from this location in this part of the department to that location in that part and my experience is the same. It's completely modern, and it's completely consistent. No retraining. >> How much of that challenge pie is people, process and technology? How would you split that opportunity? >> Well, everything starts from a process perspective. Because if you automate a bad process, you just make more mistakes in less time at greater costs. Obviously that's not the ideal. But the biggest single challenge is people. It's talent, it's culture. Both on the demand side and on the supply side. In fact a lot of what I talked about in my remarks was the additional changes we need to put in place to bring people into a more modern approach to data, more modern consumption. And look, we have pockets of excellence. And they can hold their own against any team, any place on the planet. But they are pockets of excellence. And what we're trying to do is raise the entire organization's performance. So it's people, people, and people and then the other stuff. But the products, I don't care about (laughs). >> We often hear about... >> They're going to change in 12 to 18 months. I'm a technologist, I'm hands on. The products are going to change rapidly, I make no emotional commitment to products. But the people, that's a different story. >> Well, we know that in the commercial world we often hear that cultural resistance is what sabotages modernization efforts. The DoD is sort of the ultimate top-down organization. Is it any easier to get buy-in because the culture is sort of command and control oriented? >> It's hard in the DoD, it's not easier in the DoD.
Ultimately people respond to their performance incentives. That's the dirty secret: performance incentives, they work every time. So unless you restructure performance measures and incentives for people, their behavior's never going to change. They need to see their personal future in the future you're prescribing. And if they don't see it, you're going to get resistance every time. They're going to do what they believe they're incented to do. Making those changes, cascading those performance measures down, has been difficult because much of the decision-making processes in the department have been based on slow-moving systems and slow-moving data. I mean, think about it, our budget planning process was created by Robert McNamara, as the Secretary of Defense. It requires you to plan everything for five years. And it takes more than a year to plan a single year's worth of activities, it's slow-moving. And we have regulation, we have legislation, we're a law-abiding organization, we do what we have to do. All of those things slow things down. And there's a culture of expecting macro-level consensus building. Which means everybody feels they can say no. If everybody can say no, then change becomes peanut butter spread across an organization. When you peanut butter spread across something our size and scale, the layer's pretty thin. So we have the same problem that other organizations have. There is clearly a perception of top-down change, and if the Secretary or the Deputy Secretary issue an instruction people will obey it. It just takes some time to work its way down into all the detailed combinations and permutations. 'Cause you have to make sophisticated decisions now. How am I going to change my performance measures for that group to that group? And that takes time and energy and thought. There's a natural sort of pipeline effect in this. So there's real tension I think in between this perception of top-down and people will obey the orders they're given. 
But when you're trying to integrate those changes into a broad set of policy and process and people, that takes time and energy. >> And as a result the leaders have to be circumspect about the orders they give because they want to see success. They want to make sure that what they say is actually implemented or it reflects poorly on the organization. >> I think that our leaders are absolutely concerned about accomplishing the outcomes that they set out. And I think that they are rightfully determined to get the change as rapidly as possible. I would not expect them to be circumspect. I would anticipate that they would be firm and clear in the direction that they set and they would set aggressive targets because you need aggressive targets to get aggressively changed outcomes. Now. >> But they would have to choose wisely, they can't just fire off orders and expect everything to be done. I would think that they've got to really think about what they want to get done, and put all the wood behind the arrow as you... >> I think that they constantly balance all those considerations. I must say, I did not appreciate before I joined the department the extraordinary caliber of leadership we enjoy. We have people with real insight and experience, and high intellectual horsepower making the decisions in the department. We've been blessed with the continuing stream of them at all of the senior ranks. These people could go anywhere, or do anything that they wanted in the economy and they've chosen to be in the department. And they bring enormous intellectual firepower to bear on challenges. >> Well, you mentioned the motivation at the top of the segment, that's largely pretty powerful. >> Yeah, oh absolutely. >> I want to ask you, we have to break, but the organizational structure, you talked about the CIO, actually the responsibility for security within the CIO. >> Sure. >> To whom do you report? What's the organization look like? 
>> So I report to the Chief Management Officer of the Department of Defense. So if you think about the order of precedence, there's the Secretary of Defense, the Deputy Secretary of Defense and third in order is the Chief Management Officer. I report to the Chief Management Officer. >> As does the CIO, is that right? >> As does the CIO, as does the CIO. And actually this is quite typical in large organizations, that you don't have the CDO and the CIO in the same space because the concerns are very different. They have to collaborate but very different concerns. We used to see CDOs reporting to CIOs; that's fallen dramatically in terms of the frequency you see that. 'Cause we now recognize that's just a failure mode. So you don't want to go down that path. The number one most common reporting relationship is actually to a CEO, the chief executive officer, of an organization. It's all about, what executive is driving performance for the organization? That's the person the CDO should report to. And I'm blessed in that I do find myself reporting to the executive driving organizational improvement. For me, that's a critical thing. That would make the difference between whether I could succeed or whether I'm doomed to fail. >> COO would be common too in a commercial organization. >> Yeah, in certain commercial organizations, it's a COO. It just depends on the nature of the business and their maturity with data. But if you're in the... If data's the business, CDO will report to the CEO. There are other organizations where it'll be the COO or CFO, it just depends on the nature of that business. And in our case I'm quite fortunate. >> Well Michael, thank you for not only coming to the CUBE but the service you're providing to the country, we really appreciate your insights and... >> It's a pleasure meeting you. >> It's a pleasure meeting you. All right, keep it right there everybody, we'll be right back with our next guest. 
You're watching the CUBE live from MIT CDOIQ, be right back. (upbeat music)

Published Date : Jul 31 2019


Keynote Analysis | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's The Cube! Covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome to Cambridge, Massachusetts everybody. You're watching The Cube, the leader in live tech coverage. My name is Dave Vellante and I'm here with my cohost Paul Gillin. And we're covering the 13th annual MIT CDOIQ conference. The Cube first started here in 2013 when the whole industry, Paul, this segment of the industry was kind of moving out of the ashes of the compliance world and the data quality world and kind of that back office role, and it had this tailwind of the so-called big data movement behind it. And the Chief Data Officer was emerging very strongly within, as we've talked about many times in theCube, within highly regulated industries like financial services and government and healthcare, and now we're seeing data professionals from all industries join this symposium at MIT, as I say 13th year, and we're now seeing a lot of discussion about not only the role of the Chief Data Officer, but some of what we heard this morning from Mark Ramsey, some of the failures along the way of all these north star data initiatives, and kind of what to do about it. So this conference brings together several hundred practitioners and we're going to be here for two days just unpacking all the discussions, the major trends that touch on data. The data revolution, whether it's digital transformation, privacy, security, blockchain and the like. Now Paul, you've been involved in this conference for a number of years, and you've seen it evolve. You've seen that chief data officer role both emerge from the back office into a C-level executive role, and now spanning a very wide scope of responsibilities. Your thoughts? >> It's been like being part of a soap opera for the last eight years that I've been part of this conference because as you said Dave, we've gone through all of these transitions. 
In the early days this conference actually started as an information quality symposium. It has evolved to become about the chief data officer and really about the data as an asset to the organization. And I thought that the presentation we saw this morning, Mark Ramsey's talk, we're going to have him on later, very interesting about what they did at GlaxoSmithKline to get their arms around all of the data within that organization. Now a project like that would've been unthinkable five years ago, but we've seen all of these new technologies come on board, essentially they've created a massive search engine for all of their data. We're seeing organizations beginning to get their arms around this massive problem. And along the way I say it's a soap opera because along the way we've seen failure after failure, we heard from Mark this morning that data governance is a failure too. That was news to me! All of these promising initiatives that have started and fallen flat or failed to live up to their potential, the chief data officer role has emerged out of that to finally try to get beyond these failures and really get their arms around that organizational data, and it's a huge project, and it's something that we're beginning to see some organizations succeed at. >> So let's talk a little bit about the role. So the chief data officer in many ways has taken a lot of the heat off the chief information officer, right? It used to be CIO stood for career is over. Well, when you throw all the data problems at an individual C-level executive, that really is a huge challenge. And so, with the cloud it's created opportunities for CIOs to actually unburden themselves of some of the crapplications and actually focus on some of the mission critical stuff that they've always been really strong at and focus their budgets there. But the chief data officer has had somewhat of an unclear scope. Different organizations have different roles and responsibilities. 
And there's overlap with the chief digital officer. There's a lot of emphasis on monetization, whether that's increasing revenue or cutting costs. And as we heard today from the keynote speaker Mark Ramsey, a lot of the data initiatives have failed. So what's your take on that role and its viability and its long-term staying power? >> I think it's coming together. I think last year we saw the first evidence of that. I talked to a number of CDOs last year as well as some of the analysts who were at this conference, and there was pretty good clarity beginning to emerge about what the chief data officer role stood for. I think a lot of what has driven this is this digital transformation, the hot buzzword of 2019. The foundation of digital transformation is a data-oriented culture. It's structuring the entire organization around data, and when you get to that point when an organization is ready to do that, then the role of the CDO I think becomes crystal clear. It's not so much just an extract transform load discipline. It's not just technology, it's not just governance. It really is getting that data, pulling that data together and putting it at the center of the organization. That's the value that the CDO can provide, I think organizations are coming around to that. >> Yeah, and so we've seen over the last 10 years the decrease, the rapid decrease in cost, the cost of storage. Microprocessor performance we've talked about endlessly. And now you've got the machine intelligence piece layering in. In the early days Hadoop was the hot tech, and interestingly, now nobody even talks about Hadoop. Rarely. >> Yet it was discussed this morning. >> It was mentioned today. It is a fundamental component of infrastructures. >> Yeah. >> But what it did is it dramatically lowered the cost of storing data, and allowing people to leave data in place. The old adage of ship five megabytes of code to a petabyte of data versus the reverse. 
Although we did hear today from Mark Ramsey that they copied all the data into a centralized location, so I got some questions on that. But the point I want to make is that was really early days. We've now entered an era and it's underscored by, if you look at the top five companies in terms of market cap in the US stock market, obviously Microsoft is now over a trillion. Microsoft, Apple, Amazon, Google and Facebook. Top five. They're data companies, their assets are all data driven. They've surpassed the banks, the energy companies, of course any manufacturing automobile companies, et cetera, et cetera. So they're data companies, and they're wrestling with big issues around security. You can't help but open the paper and see issues on security. Yesterday was the big Capital One. The Equifax issue was resolved in terms of the settlement this week, et cetera, et cetera. Facebook is struggling mightily with how to deal with fake news, how to deal with deep fakes. Recently it shut down likes for many Instagram accounts in some countries because they're trying to protect young people who are addicted to this. Well, they need to shut down likes for business accounts. So what kids are doing is they're moving over to the business Instagram accounts. Well, when that happens, it exposes their emails automatically, so they've got all kinds of privacy landmines and people don't know how to deal with them. So this data explosion, while there's a lot of energy and excitement around it, brings together a lot of really sticky issues. And that falls right in the lap of the chief data officer, doesn't it? >> We're in uncharted territory and all of the examples you used are problems that we couldn't have foreseen, those companies couldn't have foreseen. A problem may be created but then the person who suffers from that problem changes their behavior and it creates new problems, as you point out with kids shifting where they're going to communicate with each other. 
So these are all uncharted waters and I think it's got to be scary if you're a company that does have large amounts of consumer data in particular, consumer packaged goods companies for example, you're looking at what's happening to these big companies and these data breaches and you know that you're sitting on a lot of customer data yourself, and that's scary. So we may see some backlash to this from companies that were all bought in to the idea of the 360 degree customer view and having these robust data sources about each one of your customers. Turns out now that that's kind of a dangerous place to be. But to your point, these are data companies, the companies that business people look up to now, that they emulate, are companies that have data at their core. And that's not going to change, and that's certainly got to be good for the role of the CDO. >> I've often said that the enterprise data warehouse failed to live up to its expectations and its promises. And Sarbanes-Oxley basically saved EDW because reporting became a critical component post-Enron. Mark Ramsey talked today about EDW failing, master data management failing as kind of a mapping and masking exercise. The enterprise data model, which was a top down push for a sort of abstraction layer, that failed. You had all these failures and so we turned to governance. That failed. And so you've had this series of issues. >> Let me just point out, what do all those have in common? They're all top down. >> Right. >> All top down initiatives. And what Glaxo did is turn that model on its head and left the data where it was. Went and discovered it and figured it out without actually messing with the data. That may be the difference that changes the game. >> Yeah, and its prescription was basically taking a tactical approach to that problem, start small, get quick hits. And then I think they selected a workload that was appropriate for solving this problem, which was clinical trials. 
And I have some questions for him. And one of the big things that struck me is the edge. So as you see new emerging data coming out of the edge, how are organizations going to deal with that? Because I think a lot of what he was talking about was a lot of legacy on-prem systems and data. Think about JEDI, a story we've been following on SiliconANGLE, the Joint Enterprise Defense Infrastructure. This is all about the DOD basically becoming cloud enabled. So getting data out into the field during wartime fast. We're talking about satellite data, you're talking about telemetry, analytics, AI data. A lot of distributed data at the edge bringing new challenges to how organizations are going to deal with data problems. It's a whole new realm of complexity. >> And you talk about security issues. When you have a lot of data at the edge and you're sending data to the edge, you're bringing it back in from the edge, every device in the middle, from the smart thermostat at the edge all the way up to the cloud, is a potential failure point, a potential vulnerability point. These are uncharted waters, right? We haven't had to do this on a large scale. Organizations like the DOD are going to be the ones that are going to be the leaders in figuring this out because they are so aggressive. They have such an aggressive infrastructure in place. >> The other question I had, striking question listening to Mark Ramsey this morning. Again, Mark Ramsey was the former data god at GSK, GlaxoSmithKline, now a consultant. We're going to hear from a number of folks like him and chief data officers. But he basically kind of pooh-poohed, he used the example of build it and they will come. You know the Kevin Costner movie Field of Dreams. Don't go after the field of dreams. So my question is, and I wonder if you can weigh in on this, is, everywhere we go we hear about digital transformation. They have these big digital transformation projects, they generally are top down. 
Every CEO wants to get digital right. Is that the wrong approach? I want to ask Mark Ramsey that. Are they doing field of dreams type stuff? Is it going to be yet another failure of traditional legacy systems to try to compete with cloud native and born in data era companies? >> Well, he mentioned this morning that the research is already showing that most digital transformation initiatives are failing. Largely because of cultural reasons, not technical reasons, and I think Ramsey underscored that point this morning. It's interesting that he led off by mentioning business process reengineering, which you remember was a big fad in the 1990s, companies threw billions of dollars at trying to reinvent themselves and most of them failed. Is digital transformation headed down the same path? I think so. And not because the technology isn't there, it's because creating a culture where you can break down these silos and you can get everyone oriented around a single view of the organization's data. The bigger the organization the less likely that is to happen. So what does that mean for the CDO? Well, chief information officer at one point we said the CIO stood for career is over. I wonder if there'll be a corresponding analogy for the CDOs at some of these big organizations when it becomes obvious that pulling all that data together is just not feasible. It sounds like they've done something remarkable at GSK, maybe we'll learn from that example. But not all organizations have the executive support, which was critical to what they did, or just the organizational will to organize themselves around that central data store. >> And I also said before I think the CDO is taking a lot of heat off the CIO and again my inference was the GSK use case and workload was actually quite narrow in clinical trials and was well suited to success. 
So my takeaway in this, if I were CDO what I would be doing is trying to figure out, okay, how does data contribute to the monetization of my organization? Maybe not directly selling the data, but what data do I have that's valuable and how can I monetize that in terms of either saving money, supply chain, logistics, et cetera, et cetera, or making money? Some kind of new revenue opportunity. And I would super glue myself to the line of business executive and go after a small hit. You're talking about digital transformations being top down and largely failing. Shadow digital transformation is maybe the answer to that. Aligning with a line of business, focusing on a very narrow use case, and building successes up that way using data as the ingredient to drive value. >> And big ideas. I recently wrote about Experian which launched a service last called Boost that enables the consumers to actually impact their own credit scores by giving Experian access to their bank accounts to see that they are a better credit risk than maybe portrayed in the credit score. And something like 600,000 people signed up in the first six months of this service. That's an example I think of using inspiration, creating new ideas about how data can be applied. And in the process, by the way, Experian gains data that they can use in other contexts to better understand their consumer customers. >> So digital meets data. Data is not the new oil, data is more valuable than oil because you can use it multiple times. The same data can be put in your car or in your house. >> Wish we could do that with the oil. >> You can't do that with oil. So what does that mean? That means it creates more data, more complexity, more security risks, more privacy risks, more compliance complexity, but yet at the same time more opportunities. So we'll be breaking that down all day, Paul and myself. Two days of coverage here at MIT, hashtag MITCDOIQ. 
You're watching The Cube, we'll be right back right after this short break. (upbeat music)

Published Date : Jul 31 2019


Dr Prakriteswar Santikary, ERT | MIT CDOIQ 2018

>> Live from the MIT campus in Cambridge, Massachusetts, it's the Cube, covering the 12th Annual MIT Chief Data Officer and Information Quality Symposium. Brought to you by SiliconANGLE Media. >> Welcome back to the Cube's coverage of MITCDOIQ here in Cambridge, Massachusetts. I'm your host, Rebecca Knight, along with my co-host, Peter Burris. We're joined by Dr. Santikary, he is the vice-president and chief data officer at ERT. Thanks so much for coming on the show. >> Thanks for inviting me. >> We're going to call you Santi, that's what you go by. So, start by telling our viewers a little bit about ERT. What you do, and what kind of products you deliver to clients. >> I'll be happy to do that. ERT is a small clinical trial company, and we are a global data and technology company that minimizes risks and uncertainties within clinical trials for our customers. Our customers are top pharma companies, biotechnology companies, medical device companies, and they trust us to run their clinical trials so that they can bring their life-saving drugs to the market on time and every time. So we have a huge responsibility in that regard, because they put their trust in us, so we serve as their custodians of data and the processes, and the therapeutic experience that you bring to the table as well as compliance-related expertise that we have. So not only do we provide data and technology expertise, we also provide science expertise, regulatory expertise, so that's one of the reasons they trust us. And we also have been around since 1977, so it's almost over 50 years, so we have this collective wisdom that we have gathered over the years. And we have really earned trust in this past and because we deal with safety and efficacy of drugs and these are the two big components that help FDA, or any regulatory authority for that matter, to approve the drugs. So we have a huge responsibility in this regard, as well. 
In terms of product, as I said, we are in the safety and efficacy side of the clinical trial process, and as part of that, we have multiple product lines. We have respiratory product lines, we have cardiac safety product lines, we have imaging. As you know, imaging is becoming more and more important for every clinical trial and particularly in the oncology space for sure. To measure the growth of the tumor and that kind of thing. So we have a business that focuses exclusively on the imaging side. And then we have data and analytics side of the house, because we provide real-time information about the trial itself, so that our customers can really measure risks and uncertainties before they become a problem. >> At this symposium, you're going to be giving a talk about clinical trials and the problems of, the missteps that can happen when the data is not accurate. Lay out the problem for our viewers, and then we're going to talk about the best practices that have emerged. >> I think that clinical trial space is very complex by its own nature, and the process itself is very lengthy. If you know one of the statistics, for example, it takes about 10 to 15 years to really develop and commercialize a drug. And it usually costs about $2.5 to 3 billion. Per drug. So think about the enormity of this. So the challenges are too many. One is data collection itself. Your clinical trials are becoming more and more complex. Becoming more and more global. Getting patients to the sites is another problem. Patient selection and retention, another one. Regulatory guidelines is another big issue because not every regulatory authority follows the same sets of rules and regulations. And cost. Cost is a big imperative to the whole thing, because the development life-cycle of a drug is so lengthy. And as I said, it takes about $3 billion to commercialize a drug and that cost comes down to the consumers. That means patients. So the cost of the health care is growing, is sky-rocketing. 
And in terms of data collection, there are lots of devices in the field, as you know. Wearables, mobile health, so the data volume is a tremendous problem. And the vendors. Each pharmaceutical company uses so many vendors to run their trials. CROs. The Clinical Research Organizations. They have EDC systems, they can have labs. You name it. So they outsource all these to different vendors. Now, how do you coordinate and how do you make them collaborate? And that's where the data plays a big role because now the data is everywhere across different systems, and those systems don't talk to each other. So how do you really make real-time decisioning when you don't know where your data is? And data is the primary ingredient that you use to make decisions? So that's where data and analytics, and bringing that data in real-time, is a very, very critical service that we provide to our customers. >> When you look at medicine, obviously, the whole notion of evidence-based medicine has been around for 15 years now, and it's becoming a seminal feature of how we think about the process of delivering medical services and ultimately paying it forward to everything else, and partly that's because doctors are scientists and they have an affinity for data. But if we think about going forward, it seems to me as though learning more about the genome and genomics is catalyzing additional need and additional understanding of the role that drugs play in the human body and it almost becomes an information problem, where the drug, I don't want to say that a drug is software, but a drug is delivering something that, ultimately, is going to get known at a genomic level. So does that catalyze additional need for data? Is that changing the way we think about clinical trials? Especially when we think about, as you said, it's getting more complex because we have to make sure that a drug has the desired effect with men and women, with people from here, people from there. 
Are we going to push the data envelope even harder over the next few years? >> Oh, you bet. And that's where the real world evidence is playing a big role. So, instead of patients coming to the clinical trials, the clinical trial is going to the patient. It is becoming more and more patient-centric. >> Interesting. >> And the early part of protocol design, for example, the study design, that is step one. So more and more the real world evidence data is being used to design the protocol. The very first stage of the clinical trial. Another thing that is pushing the envelope is artificial intelligence and other data mining techniques that people can now use to really mine that data, the EMR data, prescription data, claims data. Those are real evidence data coming from the real patients. So now you can use these artificial intelligence and machine learning techniques to mine that data and then really design the protocol and the study design, instead of flipping through the EMR data manually. So patient collection, for example: no patients, no trials, right? So gathering patients, and the right set of patients, is one of the big problems. It takes a lot of time to bring those patients in, and even more troublesome is to retain those patients over time. These two are big, big things that take a long time, and site selection, as well. Which site is going to really be able to bring the right patients for the right trials? >> So, two quick comments on that. One of the things, when you say the patients, when someone has a chronic problem, a chronic disease, when they start to feel better as a consequence of taking the drug, they tend to not take the drug anymore. And that creates this ongoing cycle. But going back to what you're saying, does it also mean that clinical trial processes, because we can gather data more successfully over time, it used to be really segmented. We did the clinical trial and it stopped.
Then the drug went into production and maybe we caught some data. But now because we can do a better job with data, the clinical trial concept can be sustained a little bit more. That data becomes even more valuable over time and we can add additional volumes of data back in, to improve the process. >> Is that shortening clinical trials? Tell us a little bit about that. >> Yes, as I said, it takes 10 to 15 years if we follow the current process, like Phase One, Phase Two, Phase Three. And then post-marketing, that is Phase Four. I'm not taking the pre-clinical side of these trials into the picture. That's about 10 to 15 years, about $3 billion kind of thing. So when you use these kind of AI techniques and the real world evidence data and all this, the projection is that it will reduce the cycle by 60 to 70%. >> Wow. >> The whole study, beginning to end time. >> So from 15 down to four or five? >> Exactly. So think about, there are two advantages. One is obviously, you are creating efficiency within the system, and this drug industry and drug discovery industry is ripe for disruption. Because it has been using that same process over and over for a long time. It's like, it is working, so why fix it? But unfortunately, it's not working. Because the health care cost has sky-rocketed. So these inefficiencies are going to get solved when we bring real world evidence into the mix. Real-time decision making. Risk analysis before they become risks. Instead of spending one year to recruit patients, you use AI techniques to get to the right patients in minutes, so think about the efficiency again. And also, the home monitoring, or mHealth type of program, where the patients don't need to come to the sites, the clinical sites, for check-up anymore. You can wear wearables that are FDA regulated and approved and then, they're going to do all the work from within the comfort of their home. So think about that.
And the other thing is, very sick, terminally ill patients, for example. They don't have time, nor do they have the energy, to come to the clinical site for check-up. Because every day is important to them. So, this is the paradigm shift that is going on. Instead of patients coming to the clinical trials, clinical trials are coming to the patients. And that shift, that's a paradigm shift and that is happening because of these AI techniques. Blockchain. Precision Medicine is another one. You don't run a big clinical trial anymore. You just go micro-trial, you just group a small number of patients. You don't run a trial on breast cancer anymore, you just say, breast cancer for these patients, so it's micro-trials. And that needs -- >> Well that can still be aggregated. >> Exactly. It still needs to be aggregated, but you can get the RTD's quickly, so that you can decide whether you need to keep investing in that trial, or not. Instead of waiting 10 years, only to find out that your trial is going to fail. So you are wasting not only your time, but also preventing patients from getting the right medicine on time. So you have that responsibility as a pharmaceutical company, as well. So yes, it is a paradigm shift and this whole industry is ripe for disruption and ERT is right at the center. We have not only data and technology experience, but as I said, we have deep domain experience within the clinical domain as well as regulatory and compliance experience. You need all these to navigate through this turbulent water of clinical research. >> Revolutionary changes taking place. >> It is and the satisfaction is, you are really helping the patients. You know? >> And helping the doctor. >> Helping the doctors. >> At the end of the day, the drug company does not supply the drug. >> Exactly. >> The doctor is prescribing, based on knowledge that she has about that patient and that drug and how they're going to work together.
>> And one of the good statistics, in 2017, just last year, 60% of the FDA-approved drugs were supported through our platform. 60 percent. So there were, I think, 60 drugs got approved? I think 30 or 35 of them used our platform to run their clinical trial, so think about the satisfaction that we have. >> A job well done. >> Exactly. >> Well, thank you for coming on the show Santi, it's been really great having you on. >> Thank you very much. >> Yes. >> Thank you. >> I'm Rebecca Knight. For Peter Burris, we will have more from MIT CDOIQ, and the Cube's coverage of it, just after this. (techno music)

Published Date : Aug 15 2018



Dr Prakriteswar Santikary, ERT | MIT CDOIQ 2018

>> Live from the MIT campus in Cambridge, Massachusetts, it's the Cube covering the 12th annual MIT Chief Data Officer and Information Quality Symposium. Brought to you by SiliconANGLE Media. >> Welcome back to the Cube's coverage of MIT CDOIQ here in Cambridge, Massachusetts. I'm your host, Rebecca Knight along with my co-host, Peter Burris. We're welcoming back Dr. Santikary who is the Vice President and Chief Data Officer of ERT, thanks for coming back on the program. >> Thank you very much. >> So, in our first interview, we talked about the why and the what and now we're really going to focus on the how. How, what are the kinds of imperatives that ERT needs to build into its platform to accomplish the goals that we talked about earlier? >> Yeah, it's a great question. So, that's where our data and technology pieces come in. As we were talking about, you know, the frustration that the complexity of clinical trials. So, in our platform like we are just drowning in data, because the data is coming from everywhere. They are like real-time data, there is unstructured data, there is binary data such as image data, and they normally don't fit in one data store. They are like different types of data. So, what we have come up with is a unique way to really gather the data real-time in a data lake and we implemented that platform on Amazon Web Services Cloud and that has the ability to ingest as well as integrate data of any volume of any type coming to us at any velocity. So, it's a unique platform and it is already live. Press release came out early part of June and we are very excited about that and it is commercial right now, so yeah. >> But, you're more than just a platform. The product and services on top of that platform, one might say that the services in many respects are what you're really providing to the customers. The services that the platform provides, have I got that right? >> Yes, yes. 
So, on a platform like that, you build different kinds of services, we call them data products, on top of that platform. So, one of the data products is business intelligence, where you do real-time decisioning, and another product is RBM, Risk Based Monitoring, where you come up with all the risks that a clinical trial may be facing and really expose those risks preemptively. >> So, give us an example. >> Examples will be like patient visit, for example. A patient may be noncompliant with the protocol, so if that happens, then the FDA is not going to like it. So, before they get there, our platform almost warns the sponsors that hey, there is something going on, can you take preemptive actions? Instead of just waiting for the 11th hour, only to find out that you have really missed out on some major things. It's just one example, another could be data quality issues, right? So, let's say there's a gap in data, and/or inconsistent data, or the data is not statistically significant, so you raise some of these with the sponsors so that they can start gathering data that makes sense. Because at the end of the day, data quality is vital for the approval of the drug. If the quality of the data that you are collecting is not good, then what good is the drug? >> So, that also suggests data governance has got to be a major feature of some of the services associated with the platform. >> Yes, data governance is key, because that's where you get to know who owns which data, how do you really maintain the quality of data over time? So, we use both tools, technologies, and processes to really govern the data. And as I was telling you in our session one, we are the custodian of this data, so we have fiduciary responsibility in some sense to really make sure that the data is ingested properly, gathered properly, integrated properly.
And then, we make it available real-time for our real-time decision making, so that our customers can really make the right decisions based on the right information. So, data governance is key. >> One of the things that I believe about the medical profession is that it's always been at the vanguard of ethics, social ethics, and increasingly, well, there's always been a correspondence within social ethics and business ethics. I mean ideally, they're very closely aligned. Are you finding that the medical ethics, social medical ethics of privacy and how you handle data, are starting to inform a broader understanding of the issues of privacy, ethical use of data, and how are you guys pushing that envelope if you think that has an important future? >> Yes, that is a great question. We have data security in place in our platform, right? And the data security in our case plays at multiple levels. We don't co-mingle one sponsor's data with others, so they're always like particularized. We partition the data in a technical sense and then we have permissions and roles, so they will see only what they're supposed to be seeing, depending on their roles. So yeah, data security is very critical to what we do. We also anonymize the data, we don't really store the PII, like personally identifiable information, such as e-mail address, or first name or last name, you know? Or social security number for that matter. When you do analysis, we de-identify the data. >> Are you working with say, European pharmaceuticals as well, Bayer and others? >> Yeah, we have like as I said -- >> So, you have GDPR issues that you have satisfied? >> We have GDPR issues, we have like HIPAA issues, so you name it, so data privacy, data security, data protection, they're all a part of what we do and that's why technology is one piece that we do very well.
The other pieces are compliance and science, because you need all of those three in order to be really, you know, trustworthy to your ultimate customers and in our case they are pharmaceutical companies, medical device companies, and biotechnology companies. >> Where there are lives at stake. >> Exactly. >> So, I know you have worked, Santi, in a number of different industries, I'd love to get your thoughts on what differentiates ERT from your competitors and then, more broadly, what will separate the winners from the losers in this area? >> Yeah, obviously before joining ERT I was the Head of Engineering at eBay. >> Who? (panel members laughing) >> So, that's the bidding platform, so obviously we were dealing with consumer data, right? So, we were applying artificial intelligence, machine learning, and predictive analytics, all kinds of things to drive the business. In this case, while we are still doing predictive analytics, the idea of predictive analytics is very different, because in our case here at ERT, we can't recommend anything. We can't say hey, don't take Aspirin, take Tylenol, we can't do that, it needs to be driven by doctors. Whereas at eBay, we were just talking to the end consumers and we would just predict. >> Again, different ethical considerations. >> Exactly, but in our domain, ERT is the best of breed in terms of what we do, driving clinical trials and helping our customers, and the things that we do best are those three ideas like data collection, obviously the data custodianship that includes privacy, security, you name it. Another thing we do very well is real-time decisioning that allows our customers, in this case pharmaceutical companies, to have this integrated dataset in one place, almost like a cockpit, where they can see which data is where, what the risks are, how to mitigate those risks, because remember that these trials are happening globally.
So, your sites, some sites are here, some sites are in India, who knows where? >> So, the mission control is so critical. >> Critical, time critical. And as well as, you know, cost effective as well, because if you can mitigate those risks before they become problems, you save not only cost, but you shorten the timeline of the study itself. So, your time to market, you know? You reduce that time to market, so that you can go to market faster. >> And you mentioned that it can be as long, the process can be a $3 billion process, so reducing time to market could be a billion dollars of cost and a few billion dollars of revenue, because you get your product out before anybody else. >> Exactly, plus you're helping your end goal, which is to help the ultimate patients, right? Because if you can bring the drug five years earlier than you had intended, then you would save lots of lives there. >> So, the one question I had is we've talked a lot about these various elements, we haven't once mentioned master data management. >> Yes. >> So, give us a little sense of the role that master data management plays within ERT and how you see it changing, because it used to be a very metadata, technical-oriented thing and it's becoming much more something that is almost a reflection of the degree to which an institution has taken up the role that data plays within decision-making and operations. >> Exactly, a great question. Master data management has people, process, and technology, all three, and they work with each other to drive master data management. It's not just about technology. So, in our case, our master data is for example, site, or customers, or vendors, or study, they're master data because they live in each system. Now, the definition of those entities and the semantics of those entities are different in each system. Now, in our platform, when you bring data together from these disparate systems, somehow we need to harmonize these master entities.
That's why master data management comes into play. >> While complying with regulatory and ethical requirements. >> Exactly. So, a customer, for example, (inaudible), or pick any other name, can be spelled 20 different ways in 20 different systems, but when you are bringing the data together into a cloud platform, we want that name to be spelled only one way. So that's how you maintain the data quality of those master entities. And then obviously we have the technology side of things, we have master data management tools, we have data governance that is allowing data quality to be established over time. And then that is also allowing us to really help our ultimate customers, who are also seeing the high-quality data set. That's the end goal, whether they can trust the number. And that's the main purpose of our integrated platform that we have just launched on AWS. >> Trust, it's been such a recurring theme in our conversation. The immense trust that the pharmaceutical companies are putting in you, the trust that the patients are putting in the pharmaceutical companies to build and manufacture these drugs. How do you build trust, particularly in this environment? On the main stage they were talking this morning about, how just this very notion of data as an asset. It really requires buy-in, but also trust in that fact. >> Yeah, trust is a two-way street, because it has always been. So, our customers trust us, we trust them. And the way you build the trust is through showing, not through talking, right? So, as I said, in 2017 alone, 60% of the FDA approvals went through our platform, so that says something. So customers are seeing the results, they're seeing their drugs are getting approved, we are helping them with compliance, we're assisting with science, obviously with tools and technologies.
So that's how you build trust, over time, and we have been around since 1977, that helps as well because it says that tried and true methods, we know the procedures, we know the waters, as they say, and obviously folks like us, we know the modern tools and technologies to expedite the clinical trials. To really gain efficiency within the process itself. >> I'll just add one thing to that, and I'll test you on this: trust is a social asset. At the end of the day it's a social asset. What a lot of people in the technology industry continuously forget is that they think trust is about your hardware, or it's about something in your infrastructure, or even your applications. You can say you have a trusted asset, but if your customer says you don't, or a partner says you don't, or some group of your employees say you don't, you don't have a trusted asset. Trust is where the technological, the process, and the people really come together, that's the test of whether or not you've really got something the people want. >> Yes, and your results will show that, right. Because at the end of the day, your ultimate test is the results. Everything hinges on that. And the experience helps, your experience with tools and technologies, science, regulatory, because it's a multidimensional Venn diagram almost, and we are very good at that, and we have been for the past 50 years. >> Well Santi, thank you so much for coming on the program again, it's really fun talking to you. >> Thank you very much, thank you. >> I'm Rebecca Knight for Peter Burris, we will have more from MIT CDOIQ in just a little bit.
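
As an illustrative aside on the master data discussion above (one name spelled many ways across systems, but only one way in the integrated platform), here is a minimal Python sketch of that harmonization step. It is not ERT's implementation, which the interview does not describe. The function names, the suffix list, the sample records, and the rule of keeping the first spelling seen as the master spelling are all assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation", "ltd", "llc", "co"}


def canonical_key(name: str) -> str:
    """Reduce a free-text entity name to a key used only for matching."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase, then replace anything that is not a letter, digit, or space.
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    # split() also collapses repeated whitespace.
    tokens = [tok for tok in text.split() if tok not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def harmonize(records):
    """Group (system, name) pairs under one master entry per canonical key.

    Assumption: the first spelling encountered becomes the master spelling.
    """
    masters = {}
    for system, name in records:
        key = canonical_key(name)
        entry = masters.setdefault(key, {"master_name": name, "sources": []})
        entry["sources"].append((system, name))
    return masters


# Made-up records from three hypothetical source systems.
records = [
    ("edc", "Acme Pharma, Inc."),
    ("lab", "ACME PHARMA"),
    ("cro", "acme  pharma inc"),
]
masters = harmonize(records)
```

With these sample records, all three spellings fall under a single master entry. A real master data management process would also involve the people and process side the speaker stresses, such as data stewards reviewing ambiguous matches, which a normalization function alone does not cover.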

Published Date : Aug 15 2018



Ilana Golbin, PwC | MIT CDOIQ 2018

>> Live from the MIT campus in Cambridge, Massachusetts, it's The Cube, covering the 12th annual MIT Chief Data Officer and Information Quality Symposium. Brought to you by SiliconANGLE Media. >> Welcome back to The Cube's coverage of MIT CDOIQ, here in Cambridge, Massachusetts. I'm your host, Rebecca Knight, along with my cohost Peter Burris. We're joined by Ilana Golbin. She is the manager of the Artificial Intelligence Accelerator at PwC... >> Hi. >> Based out of Los Angeles. Thanks so much for coming on the show! >> Thank you for having me. >> So I know you were on the main stage, giving a presentation, really talking about fears, unfounded or not, about how artificial intelligence will change the way companies do business. Lay out the problem for us. Tell our viewers a little bit about how you see the landscape right now. >> Yeah, so I think... We've really all experienced this, that we're generating more data than we ever have in the past. So there's all this data coming in. A few years ago that was the hot topic: big data. That big data's coming and how are we going to harness big data. And big data coupled with this increase in computing power has really enabled us to build stronger models that can provide more predictive power for a variety of use cases. So this is a good thing. The problem is that we're seeing these really cool models come out that are black box. Very difficult to understand how they're making decisions. And it's not just for us as end users, but also developers. We don't really know 100% why some models are making the decisions that they are. And that can be a problem for auditing. It can be a problem for regulation if that comes into play. And as end users for us to trust the model. Comes down to the use case, so why we're building these models. But ultimately we want to ensure that we're building models responsibly so the models are in line with our mission as business, and they also don't do any unintended harm.
And so because of that, we need some additional layers to protect ourself. We need to build explainability into models and really understand what they're doing. >> You said two really interesting things. Let's take one and then the other. >> Of course. >> We need to better understand how we build models and we need to do a better job of articulating what those models are. Let's start with the building of models. What does it mean to do a better job of building models? Where are we in the adoption of better? >> So I think right now we're at the point where we just have a lot of data and we're very excited about it and we just want to throw it into whatever models we can and see what we can get that has the best performance. But we need to take a step back and look at the data that we're using. Is the data biased? Does the data match what we see in the real world? Do we have a variety of opinions in both the data collection process and also the model design process? Diversity is not just important for opinions in a room but it's also important for models. So we need to take a step back and make sure that we have that covered. Once we're sure that we have data that's sufficient for our use case and the bias isn't there or the bias is there to the extent that we want it to be, then we can go forward and build these better models. So I think we're at the point where we're really excited, and we're seeing what we can do, but businesses are starting to take a step back and see how they can do that better. >> Now the one B and the tooling, where is the tooling? >> The tooling... If you follow any of the literature, you'll see new publications come out sometimes every minute of the different applications for these really advanced models. Some of the hottest models on the market today are deep learning models and reinforcement learning models. 
They may not have an application for some businesses yet, but they definitely are building those types of applications, so the techniques themselves are continuing to advance, and I expect them to continue to do so. Mostly because the data is there and the processing power is there and there's so much investment coming in from various government institutions and governments in these types of models. >> And the way typically that these things work is the techniques and the knowledge of techniques advance and then we turn them into tools. So the tools are lagging a little bit still behind the techniques, but it's catching up. Would you agree? >> I would agree with that. Just because commercial tools can't keep up with the pace of academic environment, we wouldn't really expect them to, but once you've invested in a tool you want to try and improve that tool rather than reformat that tool with the best technique that came out yesterday. So there is some kind of iteration that will continue to happen to make sure that our commercially available tools match what we see in the academic space. >> So a second question is, now we've got the model, how do we declare the model? What is the state of the art in articulating metadata, what the model does, what its issues are? How are we doing a better job and what can we do better to characterize these models so they can be more applicable while at the same time maintaining fidelity that was originally intended and embedded? >> I think the first step is identifying your use case. The extent to which we want to explain a model really is dependent on this use case. For instance, if you have a model that is going to be navigating a self-driving car, you probably want to have a lot more rigor around how that model is developed than with a model that targets mailers. There's a lot of middle ground there, and most of the business applications fall into that middle ground, but there're still business risks that need to be considered. 
So to the extent to which we can clearly articulate and define the use case for an AI application, that will help inform what level of explainability or interpretability we need out of our tool. >> So are you thinking in terms of what it means, how do we successfully define use cases? Do you have templates that you're using at PwC? Or other approaches to ensure that you get the rigor in the definition or the characterization of the model that then can be applied both to a lesser, you know, who are you mailing, versus a life and death situation like, is the car behaving the way it's expected to? >> And yet the mailing, we have the example, the very famous Target example that outed a young teenage girl who was pregnant before. So these can have real life implications. >> And they can, but that's a very rare instance, right? And you could also argue that that's not the same as missing a stop sign and potentially injuring someone in a car. So there are always going to be extremes, but usually when we think about use cases we think about criticality, which is the extent to which someone could be harmed. And vulnerability, which is the willingness for an end user to accept a model and the decision that it makes. A high vulnerability use case could be... Like a few years ago or a year ago I was talking to a professor at UCSD, University of California San Diego, and he was talking to a medical devices company that manufactures devices for monitoring your blood sugar levels. So this could be a high vulnerability case. If you have an incorrect reading, someone's life could be in danger. This medical device was intended to read the blood sugar levels by noninvasive means, just by scanning your skin. But the metric that was used to calculate this blood sugar was correct, it just wasn't the same one that an end user was expecting. Because that didn't match, these end users did not accept this device, even though it did operate very well. >> They abandoned it? >> They abandoned it.
It didn't sell. And what this comes down to is this is a high vulnerability case. People want to make sure that their lives, the lives of their kids, whoever's using these devices is in good hands, and if they feel like they can't trust it, they're not going to use it. So the use case I do believe is very important, and when we think about use cases, we think of them on those two metrics: vulnerability and criticality. >> Vulnerability and criticality. >> And we're always evolving our thinking on this, but this is our current thinking, yeah. >> Where are we, in terms of the way in which... From your perspective, the way in which corporations are viewing this, do you believe that they have the right amount of trepidation? Or are they too trepidatious when it comes to this? What is the mindset? Speaking in general terms. >> I think everybody's still trying to figure it out. What I've been seeing, personally, is businesses taking a step back and saying, "You know we've been building all these proof of concepts, or deploying these pilots, but we haven't done anything enterprise-wide yet." Generally speaking. So what we're seeing are businesses coming back and saying, "Before we go any further, we need a comprehensive AI strategy. We need something central within our organization that tells us, that defines how we're going to move forward and build these future tools, so that we're not then moving backwards and making sure everything aligns." So I think this is really the stage that businesses are in. Once they have a central AI strategy, I think it becomes much easier to evaluate regulatory risks or anything like that. Just because it all reports to a central entity. >> But I want to build on that notion. 'Cause generally we agree. But I want to build on that notion, though. We're doing a good job in the technology world of talking about how we're distributing processing power. We're doing a good job of describing how we're distributing data. 
And we're even doing a good job of just describing how we're distributing known process. We're not doing a particularly good job of what we call systems of agency. How we're distributing agency. In other words, the degree to which a model is made responsible for acting on behalf of the brand. Now in some domains, medical devices, there is a very clear relationship between what the device says it's going to do, and who ultimately is decided to be, who's culpable. But in the software world, we use copyright law. And copyright law is a speech act. How do we ensure that this notion of agency, we're distributing agency appropriately so that when something is being done on behalf of the brand, that there is a lineage of culpability, a lineage of obligations associated with that? Where are we? >> I think right now we're still... And I can't speak for most organizations, just my personal experience. I think that the companies or the instances I've seen, we're still really early on in that. Because AI is different from traditional software, but it still needs to be audited. So we're at the stage where we're taking a step back and we're saying, "We know we need a mechanism to monitor and audit our AI." We need controls around this. We need to accurately provide auditing and assurance around our AI applications. But we recognize it's different from traditional software. For a variety of reasons. AI is adaptive. It's not static like traditional software. >> It's probabilistic and not categorical. >> Exactly. So there are a lot of other externalities that need to be considered. And so this is something that a lot of businesses are thinking about. One of the reasons why having a central AI strategy is really important, is that you can also define a central controls framework, some type of centralized assurance and auditing process that's mandated from a high level of the organization that everybody will follow. And that's really the best way to get AI widely adopted. 
Because otherwise, I think we'll be seeing a lot of challenges. >> So I've got one more question. And one question I have is, if you look out in the next three years, as someone who is working with customers, working with academics, trying to match the need to the expertise, what is the next conversation that's going to pop to the top of the stack in this world, in, say, within the next two years? >> Yeah, what will we be talking about next year or five years from now, too, at the next CDOIQ? >> I think this topic of explainability will persist. Because I don't think we will necessarily tick all the boxes in the next year. I think we'll uncover new challenges and we'll have to think about new ways to explain how models are operating. Other than that, I think customers will want to see more transparency in the process itself. So not just the model and how it's making its decisions, but what data is feeding into that. How are you using my data to impact how a model is making decisions on my behalf? What is feeding into my credit score? And what can I do to improve it? Those are the types of conversations I think we'll be having in the next two years, for sure. >> Great, well Ilana, thanks so much for coming on The Cube. It was great having you. >> Thank you for having me. >> I'm Rebecca Knight for Peter Burris. We will have more from MIT Chief Data Officer Symposium 2018 just after this. (upbeat electronic music)

Published Date : Jul 19 2018

SUMMARY :

Brought to you by Silicon Angle Media. She is the manager of artificial intelligence accelerator Thanks so much for coming on the show! Lay out the problem for us. are making the decisions that they are. really interesting things. We need to better understand how we build models and look at the data that we're using. and the processing power is there and there's so much So the tools are lagging a little bit still of academic environment, we wouldn't really expect them to, and most of the business applications the very famous Target example and the decision that it makes. So the use case I do believe is very important, And we're always evolving our thinking on this, What is the mindset? I think it becomes much easier to evaluate But in the software world, we use copyright law. So we're at the stage where we're taking a step back And that's really the best way the need to the expertise, So not just the model and how it's making its decisions, It was great having you. We will have more from MIT Chief Data Officer Symposium 2018

ENTITIES

Entity	Category	Confidence
Ilana	PERSON	0.99+
Rebecca Knight	PERSON	0.99+
Ilana Golbin	PERSON	0.99+
Peter Burris	PERSON	0.99+
PWC	ORGANIZATION	0.99+
Silicon Angle Media	ORGANIZATION	0.99+
100%	QUANTITY	0.99+
two	QUANTITY	0.99+
UCSD	ORGANIZATION	0.99+
Los Angeles	LOCATION	0.99+
one question	QUANTITY	0.99+
first step	QUANTITY	0.99+
next year	DATE	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
second question	QUANTITY	0.99+
yesterday	DATE	0.99+
one more question	QUANTITY	0.99+
one	QUANTITY	0.99+
two metrics	QUANTITY	0.99+
a year ago	DATE	0.99+
One	QUANTITY	0.98+
both	QUANTITY	0.98+
The Cube	ORGANIZATION	0.97+
MIT	ORGANIZATION	0.93+
MIT Chief Data Officer and Information Quality Symposium	EVENT	0.93+
few years ago	DATE	0.93+
next two years	DATE	0.92+
today	DATE	0.92+
Target	ORGANIZATION	0.91+
MIT CDOIQ	ORGANIZATION	0.91+
interesting things	QUANTITY	0.88+
PwC	ORGANIZATION	0.87+
University of California San Diego	ORGANIZATION	0.85+
next three years	DATE	0.81+
MIT Chief Data Officer Symposium 2018	EVENT	0.79+
12th annual	QUANTITY	0.75+
MIT CDOIQ 2018	EVENT	0.74+
five	DATE	0.69+
years	QUANTITY	0.63+
Cube	ORGANIZATION	0.59+
CDOIQ	ORGANIZATION	0.45+

Wrap | MIT CDOIQ

>> Live from the MIT campus in Cambridge, Massachusetts, it's theCUBE covering the 12th annual MIT Chief Data Officer and Information Quality Symposium, brought to you by SiliconANGLE media. >> We are wrapping up a day of coverage here at theCUBE for MIT CDOIQ here in Cambridge, Massachusetts. I'm Rebecca Knight, along with Peter Burris. We've been here all day, folks. We've learned a lot, we've had a lot of great conversations here, a lot of lively debate and interest. So, Peter, this morning, you were talking about this fundamental idea that data needs to be viewed as an asset within an organization. Obviously we're here with a bunch of people who are drinking that Kool-Aid, but-- >> Living that Kool-Aid. >> Living that Kool-Aid, embodying that Kool-Aid. So based on what we heard today, do you think that business has caught up? >> Well, I would say two things. First of all, this has been, as you said, it's been absolutely marvelous series of conversations in many respects. This is what theCUBE is built for, right? Smart people in conversation on camera. And we've had some smart people here today. What I got out of it on that particular issue is that there is general agreement among CDOs that they have to start introducing this notion of asset and what that means in their business. There's not general agreement, or there's a general, I guess not agreement, but there's general concern that we still aren't there yet. I think that everybody that we talk to I think, would come back and say, yes we grew those practices, but the conventions are not as established and mature as they need to be for everybody in our business to agree so that we can acculturate. Now we did hear some examples of folks that have done it. So that great BBDA case we talked about was an example. 
There was a company that is actually becoming, is really truly institutionalizing, acculturating that notion of data as an asset that performs work, but I think we've got general agreement that that's the right way of thinking about it, but also a recognition that more work needs to be done, and that's why conferences like this are so important. >> Well, one of the things that really struck me about what BBDA did was this education campaign of its 130,000 employees, and as you said, really starting from the ground and saying, this is how we're going to do things. This is who we are as an organization. >> Yeah, and it was a great conversation because one of the points I made was, specifically, that BBDA is a bank. It is an information-based business that has very deep practices and principles associated with information, and when they decided that they need to move beyond that, they were able to get the entire bank to adopt a set of practices that are leading to new types of engagement models, product orientations, service capabilities. That's a pretty phenomenal feat. So, it's happening and it can get done, and there are examples of it happening. Another thing we talked about was the fact that over the course of the next few years, one of the big, one of the most exciting things about digital business is not just digital business and digital, what people call digital maintenance, but that transformation practices. That way forward. And we talked about the idea of how you wrapper existing goods and services and offerings with data to turn them into something else, and the incumbents are going to find ways of doing that so they can re-establish themselves as leaders in a lot of different markets. >> And that's what will separate the people who really get this from the people who, or from the organizations that are going to lag. >> Yeah, we're starting to hear that a lot more from clients, is that the idea increasingly is, okay, I've already got customers. 
I've already got offers. How do I wrapper them? Using a term we heard from a professor at MIT. How do I wrapper them to improve them utilizing data? And that's a big challenge, but it's happening. >> One of the other fun interviews we had was all about clinical trials, and the use of data in these clinical trials. There are so many challenges about, with clinical trials because of the time it takes to conduct one of these, the cost that it takes, and then at the end you are dealing with patients who just say, "oh, I think I'm not going to take that drug today." Or other factors that take place here. I mean, what do you see, I know your dad is a physician, what do you see as the most exciting thing about the use of data in clinical trials, but also just in the healthcare industry in general? >> Well, so what we heard, and it was a great combination of interviews, but what we heard is that to bring a new drug to market can cost $4 billion and take 15 years. And the question is, can data, first off, reduce the cost of bringing a new drug to market? And we heard numbers like, yeah, by $1 billion or even more. So imagine having the cost of bringing a new drug to market, but also reducing the time by as much as two thirds. That's very, very powerful stuff when we come down to it. And as you said, the way you do that is you have to protect your data to make sure you're complying with various regulations, but as you said, for example, sustaining someone in the trial even though they're starting to feel better because the drug's working. Well, people opt out. They abandon the trial. Well can you use data to keep them tied in, to provide new types of benefits and new types of capabilities so they want to sustain their participation in the trial. >> Or at least the pharma company, hey, this person's dropping out, you need to explain that to the FDA, and that's going to become a point, yeah. >> Or you need to provide an incentive to keep them in. >> Right. 
>> Or another example that was used was, if we can compress the amount of time, but then recognize that we can sustain an engagement with a patient and collect data longer, that even though we can satisfy the specific regulatory mandates of a trial, shorter, we can still be collecting data because we have a digital engagement model as part of this whole process subject to keeping privacy in place and ownership notions in place, and everything else, complying with regulatory notions. So that is I think a very powerful example. And again, Santi, Dr. Santi was talking specifically about how ERT is helping to accelerate this whole process because over the course of the next dozen years, we're going to learn more about people, the genome is going to become better understood. Genomics is going to continue to evolve. Data is going to become increasingly central to how we think about defining disease and disease processes, and one of the key responses is to learn from that and apply data so that we can more rapidly build the new procedures, devices, and drugs that are capable of responding. >> When we're thinking about what keeps the chief data officers up at night, we know that data security, data fidelity, privacy, the other thing we really heard about from Ilana Golbin from PwC Accelerator is the idea about bias, and that is a real concern. From the way she was talking about it, it sounded as though companies are more aware of this. It really is an organizational challenge that they recognize that not just matters for social reasons but really for business reasons too, frankly. It affects your bottom line. Where do you come out on that? Do you think we're moving in the right direction? >> First of all, it was a great interview, and a lot of what Ilana said was illuminating to me, and I agree with virtually everything she said. We're doing a piece of research on that right now. 
I would say that, in fact, most companies are not fully factoring the role that bias plays in a lot of different ways. That's one of the things that absolutely must happen as part of the acculturation process, as what's known as evidence-based management starts to take grip more within businesses, is to understand not only what bias is introduced into data now, but as you create derivatives on that data, how that bias changes, delays that data. And that is a relatively poorly understood problem. >> But it's a big problem. >> Oh, it's going to be even bigger because we're going to utilize AI and it's actually going to limit the range of options that people consider as they make a decision, or make the decisions directly for the individual, act on behalf of the brand, what we call agency, a system of agency. And not understanding that range, not having it be auditable, not understanding what the inherent bias is can very quickly send a business off the rails in unexpected ways. So we're devoting a lot of time and energy into understanding that right now. But here's the challenge, that we've got business decision makers who are very familiar with certain kinds of information. Nobody gets to be the CEO or the COO or a senior person in business if they don't have a pretty decent understanding of finance. So financial information is absolutely adopted within the board room and the senior ranks of management in virtually all businesses of any consequential size today. What we're asking them to do is to learn about wholly new classes of data. New data conventions, what it means, how to apply it, how you should factor it, how to converge agreement around things, that allows them to be as mature in their use of customer data or production data or partner data or any other number of metrics as they are with financial data. That's a real tall order. It's one of the significant challenges that a lot of businesses face today. 
So it's not that they don't get data or they don't understand data. It's that the sources of data and therefore the range of options that are going to be shaped by data are becoming that much more significant in business. >> And it's how they need to think about data too. I mean I was really struck by Tom Sasala at the very beginning saying, one of the reasons the intelligence community didn't predict 9/11 is that we didn't have people who were thinking like Hollywood people, thinking audaciously enough about what could happen and that similarly we need to have business leaders and executives, who may be very good at crunching numbers, really think much more broadly about the kinds of-- >> And Tom is absolutely right. We also, cuz I was very close to the DoD at the time, there was serious confirmation bias that was going on at that time too. >> Exactly. But clearly he's right, that the objective is for executives to, as a group, acknowledge the powerful role that data can play, have a data-first mentality as opposed to a bias or experience-first mentality. Because my experience is very private relative to your experience. And it takes a lot of time for us to negotiate that before we can make a very, very consequential move. That's not going to go away. We're human beings. But we increasingly need to look at data, which can provide a common foundation for us to build our biases upon so that we can be more specific and more transparent about articulating my interpretations. You can't start doing that until you are better, more willing to utilize data as a potentially unifying tool and mechanism for thinking about, thinking about how we move forward with something. >> That's great. And it's a great way to end our day of coverage here at M.I.T CDOIQ. Thank you so much. It's been a pleasure, >> As always, Rebecca. >> hosting with you. And thanks to the crew and everyone here. It's been really a lot of fun. I'm Rebecca Knight for Peter Burris. 
We will see you next time on theCUBE. (techno music)

Published Date : Jul 18 2018

SUMMARY :

brought to you by SiliconANGLE media. data needs to be viewed Living that Kool-Aid, that they have to start Well, one of the things that are leading to new that are going to lag. from clients, is that One of the other fun interviews we had but also reducing the time and that's going to become a point, yeah. incentive to keep them in. the genome is going to the other thing we really heard about is to understand not only what bias It's that the sources of data and that similarly we need that was going on at that time too. But clearly he's right, that the objective And it's a great way to And thanks to the crew and everyone here.

ENTITIES

Entity	Category	Confidence
Santi	PERSON	0.99+
Ilana	PERSON	0.99+
Rebecca Knight	PERSON	0.99+
Peter Burris	PERSON	0.99+
Tom Sasala	PERSON	0.99+
Ilana Golbin	PERSON	0.99+
15 years	QUANTITY	0.99+
BBDA	ORGANIZATION	0.99+
Peter	PERSON	0.99+
Rebecca	PERSON	0.99+
Tom	PERSON	0.99+
$4 billion	QUANTITY	0.99+
130,000 employees	QUANTITY	0.99+
ERT	ORGANIZATION	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
MIT	ORGANIZATION	0.99+
one	QUANTITY	0.99+
two thirds	QUANTITY	0.98+
today	DATE	0.98+
SiliconANGLE	ORGANIZATION	0.98+
first	QUANTITY	0.98+
9/11	EVENT	0.98+
PwC Accelerator	ORGANIZATION	0.98+
Kool-Aid	ORGANIZATION	0.98+
this morning	DATE	0.98+
FDA	ORGANIZATION	0.97+
First	QUANTITY	0.97+
M.I.T CDOIQ	ORGANIZATION	0.97+
MIT Chief Data Officer and Information Quality Symposium	EVENT	0.97+
$1 billion	QUANTITY	0.97+
two things	QUANTITY	0.95+
One	QUANTITY	0.95+
MIT CDOIQ	ORGANIZATION	0.94+
DoD	TITLE	0.88+
theCUBE	ORGANIZATION	0.79+
next few years	DATE	0.78+
next dozen years	DATE	0.68+
Hollywood	ORGANIZATION	0.65+
12th annual	QUANTITY	0.61+
points	QUANTITY	0.53+
a day	QUANTITY	0.52+
CDOIQ	TITLE	0.28+

Cortnie Abercrombie & Carl Gerber | MIT CDOIQ 2018

>> Live from the MIT campus in Cambridge, Massachusetts, it's theCUBE, covering the 12th Annual MIT Chief Data Officer and Information Quality Symposium. Brought to you by SiliconANGLE Media. >> Welcome back to theCUBE's coverage of MIT CDOIQ here in Cambridge, Massachusetts. I'm your host Rebecca Knight along with my cohost Peter Burris. We have two guests on this segment. We have Cortnie Abercrombie, she is the founder of the nonprofit AI Truth, and Carl Gerber, who is the managing partner at Global Data Analytics Leaders. Thanks so much for coming on theCUBE Cortnie and Carl. >> Thank you. >> Thank you. >> So I want to start by just having you introduce yourselves to our viewers, what you do. So tell us a little bit about AI Truth, Cortnie. >> So this was born out of a passion. As I, the last gig I had at IBM, everybody knows me for chief data officer and what I did with that, but the more recent role that I had was developing custom offerings for Fortune 500 in the AI solutions area, so as I would go meet and see different clients, and talk with them and start to look at different processes for how you implement AI solutions, it became very clear that not everybody is attuned, just because they're the ones funding the project or even initiating the purpose of the project, the business leaders don't necessarily know how these things work or run or what can go wrong with them. And on the flip side of that, we have very ambitious up-and-comer-type data scientists who are just trying to fulfill the mission, you know, the talent at hand, and they get really swept up in it. To the point where you can even see that data's getting bartered back and forth without any real governance over it or policies in place to say, "Hey, is that right? Should we have gotten that kind of information?" Which leads us into things like the creepy factor. Like, you know, Target (laughs) and some of these cases that are well-known. 
And so, as I saw some of these mistakes happening that were costing brand reputation, or return on investment, or possibly even creating opportunities for risk for the companies and for the business leaders, I felt like someone's got to take one for the team here and go out and start educating people on how this stuff actually works, what the issues can be and how to prevent those issues, and then also what do you do when things do go wrong, how do you fix it? So that's the mission of AI Truth and I have a book. Yes, power to the people, but you know really my main concern was concerned individuals, because I think we've all been affected when we've sent an email and all of a sudden we get a weird ad, and we're like, "Hey, what, they should not, is somebody reading my email?" You know, and we feel this, just, offense-- >> And the answer is yes. >> Yes, and they are, they are. So I mean, we, but we need to know because the only way we can empower ourselves to do something is to actually know how it works. So, that's what my mission is to try and do. So, for the concerned individuals out there, I am writing a book to kind of encapsulate all the experiences that I had so people know where to look and what they can actually do, because you'll be less fearful if you know, "Hey, I can download DuckDuckGo for my browser, or my search engine I mean, and Epic for my browser, and some private, you know, private offerings instead of the typical free offerings." There's not an answer for Facebook yet though. >> So, (laughs) we'll get there. Carl, tell us a little bit about Global Data Analytics Leaders. >> So, I launched Analytics Leaders and CDO Coach after a long career in corporate America. I started building an executive information system when I was in the military for a four-star commander, and I've really done a lot in data analytics throughout my career. 
Most recently, starting a CDO function at two large multinational companies and leading global transformation programs. And, what I've experienced is even though the industries may vary a little bit, the challenges are the same and the patterns of behavior are the same, both the good and bad behavior, bad habits around the data. And, through the course of my career, I've developed these frameworks and playbooks and just ways to get a repeatable outcome and bring these new technologies like machine learning to bear to really overcome the challenges that I've seen. And what I've seen is a lot of the current thinking is we're solving these data management problems manually. You know, we all hear the complaints about the people who are analysts and data scientists spending 70, 80% of their time being a data gatherer and not really generating insight from the data itself and making it actionable. Well, that's why we have computer systems, right? But that large-scale technology and automation hasn't really served us well, because we think in silos, right? We fund these projects based on departments and divisions. We acquire companies through mergers and acquisitions. And the CDO role has emerged because we need to think about this, all the data that an enterprise uses, horizontally. And with that, I bring a high degree of automation, things like machine learning, to solve those problems. So, I'm now bottling that and advising my clients. And at the same time, the CDO role is where the CIO role was 20 years ago. We're really in its infancy, and so you see companies define it differently, have different expectations. People are filling the roles that may have not done this before, and so I provide the coaching services there. It's like a professional golfer who has a swing coach. So I come in and I help the data executives with upping their game. >> Well, it's interesting, I actually said the CIO role 40 years ago. But, here's why. 
If we look back in the 1970s, hardcore financial systems were made possible by the technology which allowed us to run businesses like a portfolio: Jack Welch, the GE model. That was not possible if you didn't have a common asset management system, if you didn't have a common cash management system, etc. And so, when we started creating those common systems, we needed someone that could describe how that shared asset was going to be used within the organization. And we went from the DP manager in HR, the DP manager within finance, to the CIO. And in many respects, we're doing the same thing, right? We're talking about data in a lot of different places and now the business is saying, "We can bring this data together in new and interesting ways into more a shared asset, and we need someone that can help administer that process, and you know, navigate between different groups and different needs and whatnot." Is that kind of what you guys are seeing? >> Oh yeah. >> Yeah. >> Well you know once I get to talking (laughs). For me, I can go right back to the newer technologies like AI and IOT that are coming from externally into your organization, and then also the fact that we're seeing bartering at an unprec... of data at an unprecedented level before. And yet, what the chief data officer role originally did was look at data internally, and structured data mostly. But now, we're asking them to step out of their comfort zone and start looking at all these unknown, niche data broker firms that may or may not be ethical in how they're... I mean, I... look I tell people, "If you hear the word scrape, you run." No scraping, we don't want scraped data, no, no, no (laughs). But I mean, but that's what we're talking about-- >> Well, what do you mean by scraped data, 'cause that's important? >> Well, this is a well-known data science practice. And it's not that... 
nobody's being malicious here, nobody's trying to have a malintent, but I think it's just data scientists are just scruffy, they roll up their sleeves and they get data however they can. And so, the practice emerged. Look, they're built off of open-source software and everything's free, right, for them, for the most part? So they just start reading in screens and things that are available that you could see, they can optical character read it in, or they can do it however without having to have a subscription to any of that data, without having to have permission to any of that data. It's, "I can see it, so it's mine." But you know, that doesn't work in candy stores. We can't just go, or jewelry stores in my case, I mean, you can't just say, "I like that diamond earring, or whatever, I'm just going to take it because I can see it." (laughs) So, I mean, yeah we got to... that's scraping though. >> And the implications of that are suddenly now you've got a great new business initiative and somebody finds out that you used their private data in that initiative, and now they've got a claim on that asset. >> Right. And this is where things start to get super hairy, and you just want to make sure that you're being on the up-and-up with your data practices and your data ethics, because, in my opinion, 90% of what's gone wrong in AI or the fear factor of AI is that your privacy's getting violated and then you're labeled with data that you may or may not know even exists half the time. I mean. >> So, what's the answer? I mean as you were talking about these data scientists are scrappy, scruffy, roll-up-your-sleeves kind of people, and they are coming up with new ideas, new innovations that sometimes are good-- >> Oh yes, they are. >> So what, so what is the answer? Is this a code of ethics? Is it a... sort of similar to a Hippocratic Oath? I mean how would you, what do you think? >> So, it's a multidimensional problem. 
Cortnie and I were talking earlier that you have to have more transparency into the models you're creating, and that means a significant validation process. And that's where the chief data officer partners with folks in risk and other areas and the data science team around getting more transparency and visibility into what's the data that's feeding into it? Is it really the authoritative data of the company? And as Cortnie points out, do we even have the rights to that data that's feeding our models? And so, by bringing that transparency and a little more validation before you actually start making key, bet-the-business decisions on the outcomes of these models, you need to look at how you're vetting them. >> And the vetting process is part technology, part culture, part process, it goes back to that people process technology triad. >> Yeah, absolutely, know where your data came from. Why are you doing this model? What are you going to do with the outcomes? Are you actually going to do something with it or are you going to ignore it? Under what conditions will you empower a decision-maker to use the information that is the output of the model? A lot of these things, you have to think through when you want to operationalize it. It's not just, "I'm going to go get a bunch of data wherever I can, I put a model together. Here, don't you like the results?" >> But this is the Silicon Valley way, right? An MVP for everything and you just let it run until... you can't. >> That's a great point Cortnie (laughs) I've always believed, and I want to test this with you, we talk about people process technology about information, we never talk about people process technology and information of information. In many respects, what we're talking about is making explicit the information about... 
information, the metadata, and how we manage that and how we treat that, and how we diffuse that, and how we turn that, the metadata itself, into models to try to govern and guide utilization of this. That's especially important in the AI world, isn't it? >> I start with this. For me, it's simple, I mean, but everything he said was true. But, I try to keep it to this: it's about free will. If I said you can do that with my data, to me it's always my data. I don't care if it's on Facebook, I don't care where it is and I don't care if it's free or not, it's still my data. Even if it's X23andMe, or 23andMe, sorry, and they've taken the swab, or whether it's Facebook or I did a google search, I don't care, it's still my data. So if you ask me if it's okay to do a certain type of thing, then maybe I will consent to that. But I should at least be given an option. And no, be given the transparency. So it's all about free will. So in my mind, as long as you're always providing some sort of free will (laughs), the ability for me to have a decision to say, "Yes, I want to participate in that," or, "Yes, you can label me as whatever label I'm getting, Trump or a pro-Hillary or Obam-whatever, name whatever issue of the day is," then I'm okay with that as long as I get a choice. >> Let's go back to it, I want to build on that if I can, because, and then I want to ask you a question about it Carl, the issue of free will presupposes that both sides know exactly what's going into the data. So for example, if I have a medical procedure, I can sit down on that form and I can say, "Whatever happens is my responsibility." But if bad things happen because of malfeasance, guess what? That piece of paper's worthless and I can sue. Because the doctor and the medical provider is supposed to know more about what's going on than I do. >> Right. >> Does the same thing exist? You talked earlier about governance and some of the culture imperatives and transparency, doesn't that same thing exist? 
And I'm going to ask you a question: is that part of your nonprofit is to try to raise the bar for everybody? But doesn't that same notion exist, that at the end of the day, you don't... You do have information asymmetries, both sides don't know how the data's being used because of the nature of data? >> Right. That's why you're seeing the emergence of all these data privacy laws. And so what I'm advising executives and the board and my clients is we need to step back and think bigger about this. We need to think about as not just GDPR, the European scope, it's global data privacy. And if we look at the motivation, why are we doing this? Are we doing it just because we have to be regulatory-compliant 'cause there's a law in the books, or should we reframe it and say, "This is really about the user experience, the customer experience." This is a touchpoint that my customers have with my company. How transparent should I be with what data I have about you, how I'm using it, how I'm sharing it, and is there a way that I can turn this into a positive instead of it's just, "I'm doing this because I have to for regulatory-compliance." And so, I believe if you really examine the motivation and look at it from more of the carrot and less of the stick, you're going to find that you're more motivated to do it, you're going to be more transparent with your customers, and you're going to share, and you're ultimately going to protect that data more closely because you want to build that trust with your customers. And then lastly, let's face it, this is the data we want to analyze, right? This is the authenticated data we want to give to the data scientists, so I just flip that whole thing on its head. We do for these reasons and we increase the transparency and trust. >> So Cortnie, let me bring it back to you. >> Okay. >> That presupposes, again, an up-leveling of knowledge about data privacy not just for the executive but also for the consumer. How are you going to do that? 
>> Personally, I'm going to come back to free will again, and I'm also going to add: harm impacts. We need to start thinking impact assessments instead of governance, quite frankly. We need to start looking at if I, you know, start using a FICO score as a proxy for another piece of information, like a crime record in a certain district of whatever, as a way to understand how responsible you are and whether or not your car is going to get broken into, and now you have to pay more. Well, you're... if you always use a FICO score, for example, as a proxy for responsibility which, let's face it, once a data scientist latches onto something, they share it with everybody 'cause that's how they are, right? They love that and I love that about them, quite frankly. But, what I don't like is it propagates, and then before you know it, the people who are of lesser financial means, it's getting propagated because now they're going to be... Every AI pricing model is going to use FICO score as a-- >> And they're priced out of the market. >> And they're priced out of the market and how is that fair? And there's a whole group, I think you know about the Fairness Accountability Transparency group that, you know, kind of watch dogs this stuff. But I think business leaders as a whole don't really think through to that level like, "If I do this, then this this and this could incur--" >> So what would be the one thing you could say if, corporate America's listening. >> Let's do impact. Let's do impact assessments. If you're going to cost someone their livelihood, or you're going to cost them thousands of dollars, then let's put more scrutiny, let's put more government validation. To your point, let's put some... 'cause not everything needs the nth level. Like, if I present you with a blue sweater instead of a red sweater on google or whatever, (laughs) You know, that's not going to harm you. 
But it will harm you if I give you a teacher assessment that's based on something that you have no control over, and now you're fired because you've been laid off 'cause your rating was bad. >> This is a great conversation. Let me... Let me add something different, 'cause... Or say it a different way, and tell me if you agree. In many respects, it's: Does this practice increase inclusion or does this practice decrease inclusion? This is not some goofy, social thing, this is: Are you making your market bigger or are you making your market smaller? Because the last thing you want is that the participation by people ends with: You can't play because of some algorithmic response we had. So maybe the question of inclusion becomes a key issue. Would you agree with that? >> I do agree with it, and I still think there's levels even to inclusion. >> Of course. >> Like, you know, being a part of the blue sweater club versus the (laughs) versus, "I don't want to be a convict," you know, suddenly because of some record you found, or association with someone else. And let's just face it, a lot of these algorithmic models do do these kinds of things where they... They use n+1, you know, a lot... you know what I'm saying. And so you're associated naturally with the next person closest to you, and that's not always the right thing to do, right? So, in some ways, and so I'm positing just little bit of a new idea here, you're creating some policies, whether you're being, and we were just talking about this, but whether you're being implicit about them or explicit, more likely you're being implicit because you're just you're summarily deciding. Well, okay, I have just decided in the credit score example, that if you don't have a good credit threshold... But where in your policies and your corporate policy did it ever say that people of lesser financial means should be excluded from being able to have good car insurance for... 'cause now, the same goes with like Facebook. 
Some people feel like they're going to have to opt out of life, I mean, if they don't-- >> (laughs) Opt out of life. >> I mean like, seriously, when you think about grandparents who are excluded, you know, out in whatever Timbuktu place they live, and all their families are somewhere else, and the only way that they get to see is, you know, on Facebook. >> Go back to the issue you raised earlier about "Somebody read my email," I can tell you, as a person with a couple of more elderly grandparents, they inadvertently shared some information with me on Facebook about a health condition that they had. You know how grotesque the response of Facebook was to that? And, it affected me too because they had my name in it. They didn't know any better. >> Sometimes there's a stigma. Sometimes things become a stigma as well. There's an emotional response. When I put the article out about why I left IBM to start this new AI Truth nonprofit, the responses I got back that were so immediate were emotional responses about how this stuff affects people. That they're scared of what this means. Can people come after my kids or my grandkids? And if you think about how genetic information can get used, you're not just hosing yourself. I mean, breast cancer genes, I believe, aren't they, like... They run through families, so, I-- >> And they're pretty well-understood. >> If someone swabs my, and uses it and swaps it with other data, you know, people, all of a sudden, not just me is affected, but my whole entire lineage, I mean... It's hard to think of that, but... it's true (laughs). >> These are real life and death... these are-- >> Not just today, but for the future. And in many respects, it's that notion of inclusion... 
Going back to it, now I'm making something up, but not entirely, but going back to some of the stuff that you were talking about, Carl, the decisions we make about data today, we want to ensure that we know that there's value in the options for how we use that data in the future. So, the issue of inclusion is not just about people, but it's also about other activities, or other things that we might be able to do with data because of the nature of data. I think we always have to have an options approach to thinking about... as we make data decisions. Would you agree with that? >> Yes, because you know, data's not absolute. So, you can measure something and you can look at the data quality, you can look at the inputs to a model, whatever, but you still have to have that human element of, "Are we doing the right thing?" You know, the data should guide us in our decisions, but I don't think it's ever an absolute. It's a range of options, and we chose this option for this reason. >> Right, so are we doing the right thing and do no harm too? Carl, Cortnie, we could talk all day, this has been a really fun conversation. >> Oh yeah, and we have. (laughter) >> But we're out of time. I'm Rebecca Knight for Peter Burris, we will have more from MIT CDOIQ in just a little bit. (upbeat music)

Published Date : Jul 18 2018


Dr Prakriteswar Santikary, ERT | MIT CDOIQ 2018

>> Live from the MIT campus in Cambridge Massachusetts, it's theCube, covering the 12th annual MIT Chief Data Officer and Information Quality Symposium, brought to you by SiliconANGLE media. >> Welcome back to theCUBE's coverage of MIT CDOIQ here in Cambridge, Massachusetts. I'm your host Rebecca Knight along with my co-host Peter Burris. We're welcoming back Dr. Santikary, who is the Vice President and Chief Data Officer of ERT. Thanks for coming back on the program. >> Thank you very much. >> So in our first interview we talked about the why and the what and now we're really going to focus on how, the how. How, what are the kinds of imperatives that ERT needs to build into its platform to accomplish the goals that we talked about earlier. >> Yeah, it's a great question. So, that's where our data and technology pieces come in. We are as we were talking about in our first session that the complexity of clinical trials. So in our platform like we are just drowning in data because the data is coming from everywhere. There are like real-time data, there is unstructured data, there is binary data such as image data and they normally don't fit in one data store. They are like different types of data. So what we have come up with is a unique way to really gather the data real time, in a data lake, and we implemented that platform on Amazon web services ... Cloud and ... that has the ability to ingest as well as integrate data of any volume, of any type coming to us at any velocity. So it's a unique platform and it is already live, press release came out early part of June and we are very excited about that. And it is commercial right now. So, yeah. >> But you're more than just a platform, you're product and services on top of that platform, one might say that the services in many respects are what you're really providing to the customers, the services that the platform provides. Have I got that right? >> Yes, yes. 
So platform like you build different kinds of services we call it data products on top of that platform. So one of the data products is business intelligence. Why do you do real time decisioning? Another product is RBM, Risk-Based Monitoring, where you ... come up with all the risks that a clinical trial may be facing and really expose those risks preemptively. >> So give us some examples. >> Examples will be like patient visit for example. Patient may be non-compliant with the protocol. So if that happens then FDA is not going to like it. So before they get there our platform almost warns the sponsor that hey there is something going on can you take preemptive actions? Instead of just waiting for the 11th hour and only to find out that you have really missed out on some major things. It's just one example. Another could be data quality issues, right. So let's say there is a gap in data and/or inconsistent data or the data is not statistically significant. So you've to raise some of these with the sponsors so that they can start gathering data that makes sense because at the end of the day, data quality is vital for the approval of the drug. If the quality of the data that you are collecting is not good, then what good is the trial? >> So that also suggested that data governance is got to be a major feature of some of the services associated with the platform. Have I got that right? >> Yes, data governance is key because that's where you get to know who owns which data. How do you really maintain the quality of data over time? So we use both tools, technologies, and processes to really govern the data and as I was telling you in our session one, that we have the custodian of these data. 
So we have fiduciary responsibility in some sense to really make sure that the data is ingested properly, gathered properly, integrated properly and then we make it available real time for real time decision making so that our customers can really make the right decisions based on the right information. So data governance is key. >> One of the things that I believe about medical profession is that it's always been at the vanguard of ethics, social ethics and increasingly, well there has always been a correspondence between social ethics and business ethics. I mean, ideally they're very closely aligned. Are you finding that the medical ethics, social medical ethics of privacy and how you handle data are starting to inform a broader understanding of the issues of privacy, ethical use of data, and how are you guys pushing that envelope if you think that that is an important feature? >> Yeah, that's a great question. We use all these, but we have like data security in place in our platform, right? And the data security in our case plays at multiple levels. We don't co-mingle one sponsor's data with other's. So they are always like partitioned. We partition the data in a technical sense and then we have permissions and roles. So they will see what they are supposed to be seeing. Not like, you know depending on the roles. So yeah, data security is very critical to what we do. We also anonymize the data. We don't really store the PII like Personally Identifiable Information as well like email address or first name or last name or social security number for that matter. When we do analysis, we de-identify the data. >> Are you working with European pharmaceuticals as well, Bayer and others? >> Yeah, we have like as I said. >> So you have GDPR issues (crosstalk). >> We have GDPR issues. We have like HIPAA issues. So you name it. Data privacy, data security, data protection. They are all a part of what we do and that's why technology is one piece that we do very well. 
Another pieces are the compliance, science. Because you need all of those three in order to be really trustworthy to your ultimate customers and in our case they are pharmaceutical companies, medical device companies, and biotechnology companies. >> Where there are lives at stake. >> Exactly. >> So I know you have worked Santi in a number of different industries. I'd like to get your thoughts on what differentiates ERT from your competitors and then more broadly, what will separate the winners from the losers in this area. >> Yeah, obviously before joining ERT, I was the head of data engineering at eBay. >> Who? (laughing) >> So that's the bidding platform so obviously we were dealing with consumer data right? So we were applying like artificial intelligence, machine learning and predictive analytics. All kinds of thing to drive the business. In this case, while we are still doing predictive analytics but the ideal predictive analytics is very different because in our case here at ERT we can't recommend anything because they are all like we can't say hey don't take Aspirin, take Tylenol. We can't do that. It's to be driven by doctors. Whereas at eBay, we were just talking to the end consumers here and we would just predict. >> Different ethical considerations. >> Exactly. But in our domain primarily like ERT, ERT is the best of breed in terms of what we do, driving clinical trials and helping our customers and the things that we do best are those three areas like data collection. Obviously the data custodiancy that includes privacy, security, you name it. Another thing we do very well is real time decisioning. So that allow our customers, in this case, pharmaceutical companies who will have this integrated dataset in one place. Almost like a cockpit where they can see which data is where, where the risks are, how to mitigate those risks. Because remember that these trials are happening globally. So some sites are here, some sites are in India. Who knows where? 
>> So the mission control is so critical. >> Critical, time critical. >> Hmm. >> And as well as you know cost-effective as well because if you can mitigate those risks before they become problems, you save not only cost but you shorten the timeline of the study itself. So your time to market, you know. You reduce that time to market so that you can go to market faster. >> And you mentioned that it can be, they could be, the process could be a 3 billion dollar process. So reducing time to market could be a billion dollars of cost and a few billion dollars of revenue because you get your product out before anybody else. >> Exactly. Plus you are helping your end goals which is to help the ultimate patients, right? >> And that too. >> Because if you can bring the drug five years earlier than what- >> Save lives. >> What you had intended for then you know, you'd save lots of lives there. Definitely. >> So the one question I have is we've talked a lot about these various elements. We haven't once mentioned master data management. >> Yes. >> So give us a little sense of the role that master data management plays within ERT and how you see it changing. Because it used to be a very metadata technical oriented thing and it's becoming much more something that is almost a reflection of the degree to which an institution has taken up the role that data plays within decision making and operation. >> Exactly, a great question. The master data management has like people, process, and technology. All three, they co-mingle each other to drive master data management. So it's not just about technology. So in our case, our master data is for example, site or customers, or vendors or study. They're master data because they live in each system. Now definition of those entities and semantics of those entities are different in each system. Now in our platform when you bring data together from disparate systems, somehow we need to harmonize these master entities. 
That's why master data management- >> While complying with regulatory and ethical requirements. >> Exactly. So customers for example Novartis let's say, or be it any other name, can be spelled 20 different ways in 20 different systems. But when we are bringing the data together into our core platform, we want Novartis to be spelled only one way. So that's how you maintain the data quality of those master entities. And then obviously we have the technology side of things. We have master data management tools. We have data governance that is allowing data qualities to be established over time and then that is also allowing us to really help our ultimate customers who are also seeing the high quality dataset. That's the end goal, whether they can trust the number. And that's the main purpose of our integrated platform that we have just launched on AWS. >> Trust is just, it's been such a recurring theme in our conversation. The immense trust that the pharmaceutical companies are putting in you, the trust that the patients are putting in the pharmaceutical companies to build and manufacture these drugs. How do you build trust, particularly in this environment? We've talked, on the main stage they were talking this morning about how just this very notion of data as an asset, it really requires buy-in, but also trust in that fact. >> Yeah, yeah. Trust is a two-way street, right? Because it has always been. So our customers trust us, we trust them. And the way you build the trust is through showing not through talking, right? So, as I said, in 2017 alone, 60% of the FDA approval went through our platform. So that says something. So customers are seeing the results. So they are seeing their drugs are getting approved. We are helping them with compliance, with audits, with science, obviously with tools and technologies. So that's how you build trust over time. And we have been around since 1977, that helps as well, because it's a ... tried and true method. 
We know the procedures. We know the water, as they say. And obviously, folks like us, we know the modern tools and technologies to expedite the clinical trials, to really gain efficiency within the process itself. >> I'll just add one thing to that and test you on this. Trust is a social asset. >> Yeah. >> At the end of the day it's a social asset and I think what a lot of people in the technology industry continuously forget, is that they think the trust is about your hardware, or it's about something in your infrastructure, or even in your applications. You can say you have a trusted asset but if your customer says you don't or a partner says you don't or some group of your employees say you don't, you don't have a trusted asset. >> Exactly. >> Trust is where the technological, the process, and the people really come together. >> And the people come together. >> That's the test of whether or not you've really got something that people want. >> Yes. And your results will show that, right? Because at the end of the day, your ultimate test is the results, right? And because that, everything hinges on that. And then the experience helps as you're experienced with tools and technologies, science, regularities. Because it's a multidimensional Venn diagram almost. And we are very good at that and we have been for the past 50 years. >> Great. Well Santi, thank you so much for coming on the program again. >> Okay, thank you very much. >> It was really fun talking to you. >> Thank you. >> I'm Rebecca Knight for Peter Burris. We will have more from MIT CDOIQ in just a little bit. (upbeat futuristic music)

Published Date : Jul 18 2018


Kickoff | MIT CDOIQ 2018

>> Live from the MIT Campus in Cambridge, Massachusetts, it's theCUBE. Covering the 12th Annual MIT chief data officer and Information Quality Symposium. Brought to you by: SiliconANGLE Media. >> Welcome to theCUBE's coverage of MIT CDOIQ here in Cambridge, Massachusetts on the MIT Campus. I'm your host, Rebecca Knight, along with my co-host, Peter Burris. Peter, it's a pleasure to be here with you. Thanks for joining me. >> Absolutely, good to see you again, Rebecca. >> These are my stomping grounds. >> Ha! >> So welcome to Massachusetts. >> It's an absolutely beautiful day in Cambridge. >> It is, it is, indeed. I'm so excited to be hosting this with you, what do you think, this is about chief data officer information quality. We're really going to get inside the heads of these chief data officers, find out what's on their minds, what's keeping them up at night, how are they thinking about data, how are they pricing it, how are they keeping it clean, how are they optimizing it, exploiting it, how are they hiring for it? What do you think is the top issue of the day, in your mind. There's a lot to talk about here, what's number one? >> Well, I think the first thing, Rebecca, is that if you're going to have a chief in front of your name then, at least in my mind, that means the Board has directed you to generate some return on the assets that you've been entrusted with. I think the first thing that the CDO, the chief data officer, has to do is start to do a better job of pricing out the value of data, demonstrating how they're turning it into assets that can be utilized and exploited in a number of different ways to generate returns that are actually superior to some of the other assets in the business because data is getting greater investment these days. So I think the first thing is, how are you turning your data into an asset, because if you're not, why are you a chief anything? >> (laughs) No, that's a very good point. 
The other thing we were talking about before the cameras were rolling is the role of the CDO, chief data officer, and the role of the CIO, chief information officer, and how those roles differ. I mean, is that something that we're going to get into today? What do you think? >> I think it's something certainly to ask a lot of the chief data officers that are coming on. There's some confusion in the industry about what the relationship should be and how the roles are different. The chief data officer as a concept has been around for probably 10-12 years, something like that. I mean, the first time I heard it was probably 2007-2008. The CIO role has always been about information, but it ended up being more about the technology, and then the question was, what does a Chief Technology Officer do? Well, a Chief Technology Officer could have had a different role, but they also seem to be increasingly responsible for the technology. So if you look at a lot of organizations that have a CDO, the CIO looks more often to be the individual in charge of the IT assets, the technology officer tends to be in charge of the IT infrastructure, and the CDO tends to be more associated with, again, the role that the data plays, increasingly associated with analytics. But I think, over the next few years, that set of relationships is going to change, and new regimes will be put in place as businesses start to re-institutionalize their work around their data, and what it really means to have data as an asset. >> And the other role we've not mentioned is the CDO, Chief Digital Officer, which is the convergence of those two roles as well. How do you see, you started out by saying this is really about optimizing the data and finding a way to make money from it. >> Or generate a return. >> Generate a return, exactly! Find value in it, exactly. 
>> One of the things about data, and one of the things about IT, historically, is that it often doesn't generate money directly, but rather indirectly, and that's one of the reasons why it has been difficult to sustain investments in. The costs are almost always direct, so if I invest in an IT project, for example, the costs show up immediately but the benefits come through whatever function I just invested in the application to support. And the same thing exists with data. So if we take a look at the Chief Digital Officer, often that's a job that has been developed largely close or proximate to the COO to better understand how operations are going to change as a consequence of an increasing use of data. So, the Chief Digital Officer is often an individual who's entrusted to think about, as we re-institutionalize work around data, what that is going to mean to our operations and our engagement models too. So, I think it's a combination of operations and engagement. So the Chief Digital Officer is often very proximate to the COO, thinking about how data is going to change the way the organization works, change the way the organization engages, from a strategic standpoint first, but we're starting to see that role move more directly into operations. I don't want to say compete with the COO, but work much more closely with them at an operational level. >> Right, and of course, that depends, organization to organization. >> It's always different, and to what degree are your assets historically data-oriented, like if you're a media company or if you're a financial services company, those are companies that have very strong lineages of data as an asset. 
If you're a manufacturing company, and you're building digital twins, like a GE or something along those lines, then you might be a little bit newer to the game, but still you have to catch up because data is going to mush a lot of industries together, and it's going to be hard to parse some of these industries in five to ten years. >> Well, precisely, one of the things you said was that the CDO, as a role, is really only 11-12 years old. In fact, this conference is in its 12th year, so really it started at the very beginning of the CDO journey itself, and we're now amidst the CDO movement. I mean, what do you think, how is the CDO thinking about his or her role within the larger AI revolution? >> Well, that's a great question, and it's one of the primary reasons why it's picking up pace. We've had a number of different technology introductions over the past 15-20 years that have brought us here. The notion of virtualizing machines changed or broke that relationship between applications and hardware. The idea of very high speed, very flexible, very easy to manage data center networking broke the way that we thought about how resources could be brought together. Very importantly, in the last six or seven years, the historical norm for storage was disk, which was more about how do I persist the data that results from a transaction, and now we're moving to flash, and flash-based systems, which is more about how can I deliver data to new types of applications. That combination of things makes it possible to utilize a lot of these AI algorithms and a lot of these approaches to AI, many of whose algorithms have been around for 40-50 years, so we're catalyzing a new era in which we can think about delivering data faster with higher fidelity, with lower administrative costs because we're not copying everything and putting it in a lot of different places. That is making it possible to do these AI things. 
That's precisely one of the factors that's really driving the need to look at data as an asset because we can do more with it than we ever have before. You know, it's interesting, I have a little bromide, when people ask me what's really going on in the industry, what I like to say is for the first 50 years of the industry, it was known process, unknown technology. We knew we were going to do accounting, we knew we were going to do HR, those were largely given to us by legal or regulatory or other types of considerations, but the unknown was, do we put it on a mainframe? Do we put it on a (mumbles) Do we use a database manager? How distributed is this going to be? We're now moving into an era where it's unknown process because we're focused on engagement or the role that data can play in changing operations, but the technology is going to be relatively common. It's going to be Cloud or Cloud-like. So, we don't have quite as, it's not to say that technology questions go away entirely, they don't, but it's not as focused on the technology questions, we can focus more on the outcomes, but we have a hard time challenging those outcomes or deciding what those outcomes are going to be, and that's one of the interesting things here. We're not only using data to deliver the outcomes, we're also using data to choose what outcomes to pursue. So it's an interesting recursive set of activities where the CDO is responsible for helping the business decide what are we going to do and also, how are we going to do it? >> Well, exactly. That's an excellent point, because there are so many, one of the things that we've heard about on the main stage this morning is the difficulty a lot of CDOs have with just getting buy-in, and really understanding, this is important, and this is not as important or this is what we're going to do, this is what we're saying the data is telling us, and these are the actions we're going to take. How do you change a culture? How do you get people to embrace it? 
>> Well, this is an adoption challenge, and adoption challenges are always met by showing returns quickly and sustainably. So one of the first things, which is why I said, one of the first things that a CDO has to do is show the organization how data can be thought of as an asset, because once you do that, now you can start to describe some concrete returns that you are able to help deliver as a consequence of your chief role. So that's probably the first thing. But, I think, one of the other things to do is to start doing things like demonstrating the role that information quality plays within an organization. Now, information quality is almost always measured in terms of the output or the outcomes that it supports, but there are questions of fidelity, there are questions of what data are we going to use, what data are we not going to use? How are we going to get rid of data? There's a lot of questions related to information quality that have process elements to them, and those processes are just now being introduced to the organization, and doing a good job of that, and acculturating people to understanding the role that quality plays, information quality plays, is another part of it. So I think you have to demonstrate that you have conceived and can execute on a regime of value, and at the same time you have to demonstrate that you have particular insight into some of those ongoing processes that are capable of sustaining that value. It's a combination of those two things that, I think, the chief data officer's going to have to do to demonstrate that they belong at the table, ongoing. >> Well, today we're going to be talking to an array of people, some from MIT who study this stuff >> I hear they're smart people. >> Yeah, maybe. A little bit. We'll see, we'll see. 
MIT, some people from the US Government, so CDOs from the US Army, the Air Force, we've got people from industry too, we've also got management consultants coming on to talk about some best practices, so it's going to be a great day. We're going to really dig in here. >> Looking forward to it. >> Yes. I'm Rebecca Knight, for Peter Burris, we will have more from MITCDOIQ in just a little bit. (techno music)

Published Date : Jul 18 2018

SUMMARY :

Brought to you by: SiliconANGLE Media. Massachusetts on the MIT Campus. Absolutely, good to beautiful day in Cambridge. issue of the day, in your mind. the chief data officer, has to do rolling, is the role of the CDO, and the CDO tends to be is the CDO, Chief Digital Officer, Generate a return, exactly! and one of the things depending organization to organization. and to what degree of the things you said the need to look at data as an asset one of the things that we've and at the same time so CDOs from the US Army, the Air Force, we will have more from

ENTITIES

Entity	Category	Confidence
Rebecca	PERSON	0.99+
Rebecca Knight	PERSON	0.99+
Peter Burris	PERSON	0.99+
Rebecca Knight	PERSON	0.99+
Peter	PERSON	0.99+
five	QUANTITY	0.99+
Massachusetts	LOCATION	0.99+
Cambridge	LOCATION	0.99+
MIT	ORGANIZATION	0.99+
12th year	QUANTITY	0.99+
US Army	ORGANIZATION	0.99+
GE	ORGANIZATION	0.99+
theCUBE	ORGANIZATION	0.99+
MITCDOIQ	ORGANIZATION	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
two roles	QUANTITY	0.99+
one	QUANTITY	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
ten years	QUANTITY	0.99+
2007-2008	DATE	0.99+
40-50 years	QUANTITY	0.99+
first 50 years	QUANTITY	0.98+
11-12 years	QUANTITY	0.98+
today	DATE	0.98+
first thing	QUANTITY	0.98+
US Government	ORGANIZATION	0.98+
two things	QUANTITY	0.97+
One	QUANTITY	0.97+
10-12 years	QUANTITY	0.96+
first time	QUANTITY	0.95+
this morning	DATE	0.93+
Information Quality Symposium	EVENT	0.92+
first things	QUANTITY	0.9+
MIT Campus	LOCATION	0.86+
12th Annual MIT chief data officer	EVENT	0.84+
first	QUANTITY	0.79+
CDO	ORGANIZATION	0.78+
seven years	QUANTITY	0.78+
Campus	LOCATION	0.77+
CDO	TITLE	0.65+
MIT CDOIQ 2018	EVENT	0.64+
last six	DATE	0.59+
years	DATE	0.59+
Air Force	ORGANIZATION	0.58+
Kickoff	EVENT	0.57+
things	QUANTITY	0.54+
15	QUANTITY	0.52+
CDO	EVENT	0.51+
20 years	QUANTITY	0.48+
past	DATE	0.44+

Search Results for MIT CDOIQ Conference2020: