Chris Lynch, AtScale | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts, everybody. You're watching theCUBE, the leader in live tech coverage. I'm Dave Vellante with my co-host, Paul Gillin. Chris Lynch, good friend, is here: newly minted CEO of AtScale, and legend. Good to see you. >> In my own mind. >> In mine too. >> It's great to be here. >> It's awesome, thank you for taking time. I know how busy you are, you're running around like crazy with your next big thing. I was excited to hear that you got back into it. I predicted it a while ago; you were a very successful venture capitalist but at heart, you're a startup guy, aren't ya? >> Yeah 100%, 100%. I couldn't be more thrilled, I feel invigorated. I think I've told you many times, when you've interviewed me and asked me about the transition from being an entrepreneur to being a VC, and since it's a PG show, I've got a different analog than the one I usually give you. I used to be a movie star and now I'm an executive producer of movies. Now I'm back to being a movie star, hopefully. >> Yeah well, so you told me when you first became a VC, you said, I look for startups that have a 10X impact, either 10X value or 10X cost reduction. What was it that attracted you to AtScale? What's the 10X? >> AtScale addresses a $150 billion market problem, which is basically bringing traditional BI to the cloud. >> That's the other thing you told me, big markets. >> Yeah, so that's the first thing, massive market opportunity. The second is the innovation component, and where the 10X comes from: we're uniquely qualified to virtualize data into the pipeline and out. So I like to say that we're the bridge between BI and AI and back. We make every BI user a citizen data scientist and that's a game changer. And that's sort of the new futuristic component of what we do.
So one part is steeped in that $150 billion BI marketplace and in traditional analytics platforms, and then the second piece is delivering the data into these BI, excuse me, these AI and machine learning platforms. >> Do you see that ultimately getting integrated into some kind of larger data pipeline framework? I mean, maybe it lives in the cloud or maybe on prem, how do you see that evolving over time? >> So I believe that, with AtScale as one single pane of glass, we basically are providing an API to the data and to the user, one single API. The reason that today we haven't seen the delivery of the promise of big data is because we don't have big data. Fortune 2000 companies don't have big data. They have lots of data, but to me big data means you can have one logical view of that data and get the best data pumped into these models and these tools, and today that's not the case. They're constricted by location, they're constricted by vendor, they're constricted by whether it's in the cloud or on prem. We eliminate those restrictions. >> The single API, I think, is important actually. Because when you look at some of these guys and what they're doing with their data pipeline, they might have 10 or 15 unique APIs that they're trying to manage. So there's a simplification aspect too, I suppose. >> One of the knocks on traditional BI has always been the need for extract databases and all the ETL that's involved in that. Do you guys avoid that stage? You go to the production data directly or what's the-- >> It's a great question. The way I put it is, we bring Moses to the mountain, the mountain being the data, Moses being the user. Traditionally, what people have been trying to do is bring the mountain to Moses; it doesn't scale. At AtScale, we provide an abstraction, a logical abstraction between the data and the BI user. >> You don't touch, you don't move the data. >> We don't move the data.
Which is what's unique, and that's what's delivering, I think, way more than a 10X delivery in value. >> Because you leave the data in place, you bring that value to wherever the data is. Which is the original concept of Hadoop, by the way. That was what was profound about Hadoop. Everybody craps on it now, but that was the game changer, and if you could take advantage of that, that's how you tap your 10X. >> So the difference is, to your point, we're not moving the data. Hadoop, in my humble opinion, plateaued because to get the value, you had to ask the user to bring and put data in yet another platform. And the reason that we're not delivering on big data as an industry, I believe, is because we have too many data sources, too many platforms, too many consumers of data and too many producers. We build all these islands of data, with no connectivity. The idea is, we'll create this big data lake and we're going to physically put everything in there. Guess what? Someday turned out to be never. Because people aren't going to deal with the business disruption. We move thousands of users from a platform like Teradata to a platform like Snowflake or Google BigQuery, we don't care. We're multi-cloud and we're hybrid cloud. But we do it without any disruption. You're using Excel, you just continue to use it. You just see the results are faster. You use Tableau, same difference. >> So we had all the Vertica rock stars in here. So we had Colin in yesterday, we had Stonebraker around earlier. Andy Palmer just came on, and Chris here was the CEO who ultimately sold the company to HP, which really didn't do anything with it and then spun it off, and now it's back. Aaron was, he had a spring in his step yesterday. So when you think about Vertica, the technology behind Vertica, go back 10 years to where we've come now, give us a little journey of your data journey.
>> So I think it plays into the original assertion, which is that Vertica is a best-in-class platform for analytics, but it was yet another platform. The analog I give now is, now we have Snowflake, and six months, 12 months from now we're going to have another one. And that creates a set of problems if you have to live in the physical world. Because you have all these islands of data, and I believe it's about the data, not about the models; it's about the data. You can't get optimal results if you don't have optimal access to the pertinent data. I believe that having that universal API is going to make the next platform that much more valuable. You're not going to be making the trade-off of, okay, we have this platform that has some neat capability, but the trade-off is, from an enterprise architecture perspective, we're never going to be able to connect all this stuff. That's how all of these things proliferated. My view is, in a world where you have that single pane of glass, that abstraction layer between the user and the data, then innovation can be spawned quicker and you can use these tools effectively, 'cause you're not compromising on being able to get a logical view of the data and get access to it as a user. >> What's your issue with Snowflake? You mentioned them, Muglia's company-- >> No issue, they're a great partner of ours. We eliminate the friction between the user going from an on-prem solution to the cloud. >> Slootman just took over there. So you know where that's going. >> Yep (laughing) >> Frank's got the magic touch. Okay good, you say they're a partner of yours, how are you guys partnering? >> They refer us into customers where, if you want to buy Snowflake, now the next issue is, how do I migrate? You don't. You put our virtualization layer in and then we allow you access to Snowflake in a non-disruptive way, versus having to move data into their system or into a particular cloud, which creates sales friction.
>> Moving data is just, you want to avoid it at all costs. >> I do want to ask you, because I met with your predecessor, Dave Mariani, last year and I know he was kind of a reluctant CEO. He didn't really want to be CEO but wanted to be CTO, which is what he is now. How did that come about, that they found you, that you connected with them and decided this was the right opportunity? >> That's a great question. I actually looked at the company at the seed stage when I was in venture, but I had this thing, as you know, that I wanted to move companies to Boston, and they're about my vintage age-wise and he's married with four kids, so that wasn't in the cards. I said look, it doesn't make sense for me to seed this company 'cause I can't give you the time; you're out in California, everything I'm instrumenting is around Boston. We parted friends. And I was skeptical whether he could build this, 'cause people have been talking about building a heterogeneous universal semantic layer for years and it's never come to fruition. And then he read in Fortune or Forbes that I was leaving Accomplice and that I was looking for one more company to operate. He reached out and he told me what they were doing: hey, we really built it, but we need help and I don't want to run this. It's not right for the company and the opportunity. So I said, "I'll come and I'll consult to you." I put together a plan and I had my Vertica and DataRobot, Nutonian guys do the technical diligence to make sure that the architecture wasn't wedded to Hadoop, like all the other ones were, and when I saw it wasn't, then I knew the market opportunity was to take that rifle and point it at that legacy $150 billion BI market, not at the billion dollar market of Hadoop. And since we did that, we've been growing at 162% quarter-over-quarter. We've built development centers in Bulgaria. We've moved all non-technical operations to Boston, here down at South Station.
We've been on fire and we are the partner of choice of every cloud vendor, because we eliminate the sales friction for customers being able to take advantage of movement to the cloud, and through our intelligent pipeline capability we're able to reduce the cost of queries significantly, because we understand those queries and we're able to intelligently cache them. >> Sales ops is here, all-- >> Sales, marketing, customer support, customer success, and we're building a machine learning team here, a dev team here. >> Where are you in that sort of Boston build-out? >> We have an office at 711 Atlantic that we opened in the fall. We're actually moving from 4,000 square feet to 10,000 this month, in less than six months, and we'll house, by the first year, 100 employees in Boston, 100 in Bulgaria and about that same hundred in San Mateo. >> Are you going after net new business mainly? Or there's a lot of legacy BI out there, are you more displacing those products? >> A couple of things. What we find is that customers want to evolve into the cloud; they don't want a revolution, they want an evolution. So we allow them, because we support hybrid cloud, to keep some data behind the firewall and then experiment with moving other data to the cloud platform of choice, but we're still providing that one logical view. I would say most of our customers are looking to re-platform, off of Teradata or something onto another platform like Snowflake. And then we have a set of customers that see that as part of the solution but not the whole solution. They're more true hybrids, but I would say that 80% of our customers are traditional BI customers that are trying to contemporize their environments and be able to take advantage of tabular support and multidimensional, the things that we do in addition to the cube world. >> They can keep whatever they're using. >> Correct, that's the key. >> Did you do the series D, you did, right? >> Yes, Morgan Stanley led.
>> So you're not actively raising, but you're good for now. It was like $50 million. >> Yeah, we raised $50 million. >> You're good for a bit. Who's in the Chris Lynch target? (laughs) Who's the enemy? At Vertica, I could say it was the traditional database guys. Who's the? >> We're in a unique position, we're almost Switzerland, so we could be friend or foe of anybody in that ecosystem, because we can non-disruptively re-platform customers between legacy platforms or from legacy platforms to the cloud. We're in an interesting position. >> So similar to the file sharing, file virtualization company. >> Acopia. >> Acopia, yeah. >> It puts us in an interesting position. They need to be friends with us, and at the same time I'm sure that they're concerned about the capabilities we have, but we have a number of retail customers, for instance, that have asked us to move them from Amazon to Google BigQuery, which we accommodate, and because we can do that non-disruptively, the cost of moving is eliminated. It gives customers true freedom of choice. >> How worried are you that AWS tries to replicate what you guys do? You're in their sights. >> I think there are technical, legal and structural barriers to them doing that. The technical is, this team has been at it for six and a half years. So to do what we do, they'll have to do what we've done. Structurally, from a business perspective, if they could, I'm not sure they want to. The way to think about Amazon is, they're no different than Teradata: they want the same vendor lock-in, except they want it to be the Amazon cloud, where Teradata wanted it to be their data warehouse. >> They don't promote multi-cloud versus-- >> Yeah, they don't want multi-cloud, they don't want >> On prem >> customers to have freedom of choice. Would they really enable a heterogeneous abstraction layer? I don't think they would, nor do I think any of the big guys would. They all claim to have this capability for their system.
It's like the old IBM adage: I'm in prison, but the food's good, I get three squares a day, I get cable TV, but I'm in prison. (laughing) >> Awesome, all right, parting thoughts. >> Parting thoughts, oh geez, you've got to give me a question, I'm not that creative. >> What's next for you guys? What should we be paying attention to? >> I think you're going to see some significant announcements in September regarding the company and relationships that I think will validate the impact we're having in the market. >> Give you some leverage >> Yeah, will give us better channel leverage. We have a major technical announcement that I think will be significant to the marketplace and that will be highly disruptive to some of the people you just mentioned, in terms of really raising the bar for customers to be able to have freedom of choice without any sort of vendor lock-in. And I think that that will create some counterstrike, which we'll be ready for. (laughing) >> If you've never heard of AtScale before, trust me, you're going to in the next 18 months. Chris Lynch, thanks so much for coming on theCUBE. >> It's my pleasure. >> Great to see you. All right, keep it right there everybody, we're back with our next guest right after this short break. You're watching theCUBE from MIT, right back. (upbeat music)

Published Date : Aug 2 2019


Susan Wilson, Informatica & Blake Andrews, New York Life | MIT CDOIQ 2019

(techno music) >> From Cambridge, Massachusetts, it's theCUBE. Covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts everybody, we're here with theCUBE at the MIT Chief Data Officer Information Quality Conference. I'm Dave Vellante with my co-host Paul Gillin. Susan Wilson is here, she's the vice president of data governance and she's the leader at Informatica. Blake Andrews is the corporate vice president of data governance at New York Life. Folks, welcome to theCUBE, thanks for coming on. >> Thank you. >> Thank you. >> So, Susan, interesting title; VP, data governance leader, Informatica. So, what are you leading at Informatica? >> We're helping our customers realize their business outcomes and objectives. Prior to joining Informatica about 7 years ago, I was actually a customer myself, and so oftentimes I'm working with our customers to understand where they are, where they're going, and how to best help them; because we recognize data governance is more than just a tool, it's a capability that represents people, the processes, the culture, as well as the technology. >> Yeah so you've walked the walk, and you can empathize with what your customers are going through. And Blake, your role, as the corporate VP, but more specifically the data governance lead. >> Right, so I lead the data governance capabilities and execution group at New York Life. We're focused on providing skills and tools that enable governance activities across the enterprise at the company. >> How long has that function been in place? >> We've been in place for about two and a half years now. >> So, I don't know if you guys heard Mark Ramsey this morning, the keynote, but basically he said, okay, we started with enterprise data warehouse, we went to master data management, then we kind of did this top-down enterprise data model; that all failed. So we said, all right, let's punt to governance.
Here you go guys, you fix our corporate data problem. Now, right tool for the right job, and so we were sort of joking, did data governance fail? No, you always have to have data governance. It's like brushing your teeth. But so, like I said, I don't know if you heard that, but what are your thoughts on that sort of evolution that he described? As sort of, failures of things like EDW to live up to expectations and then, okay guys, over to you. Is that a common theme? >> It is a common theme, and what we're finding with many of our customers is that they had tried many of the, if you will, the methodologies around data governance, right? Around policies and structures. And we describe this as the Data 1.0 journey, which was more application-centric reporting, to Data 2.0, to data warehousing, and a lot of the failed attempts, if you will, at centralizing, if you will, all of your data, to now Data 3.0, where we look at the explosion of data, the volumes of data, the number of data consumers, the expectations of the chief data officer to solve business outcomes; crushing under the scale of, I can't fit all of this into a centralized data repository, I need something that will help me scale and become more agile. And so, that message does resonate with us, but we're not saying data warehouses don't exist. They absolutely do for trusted data sources, but the ability to be agile and to address many of your organization's needs and to be able to service multiple consumers is top-of-mind for many of our customers. >> And the mindset from 1.0 to 2.0 to 3.0 has changed. From, you know, data as a liability, to now data as this massive asset. It's sort of-- >> Value, yeah. >> Yeah, and the pendulum has swung. It's almost like a see-saw. Where, and I'm not sure it's ever going to flip back, but it is to a certain extent; people are starting to realize, wow, we have to be careful about what we do with our data. But still, it's go, go, go.
But, what's the experience at New York Life? I mean, you know. A company that's been around for a long time, conservative, wants to make sure, risk averse, obviously. >> Right. >> But at the same time, you want to keep moving as the market moves. >> Right, and we look at data governance as really an enabler and a value-add activity. We're not a governance practice for the sake of governance. We're not there to create a lot of policies and restrictions. We're there to add value and to enable innovation in our business and really drive that execution, that efficiency. >> So how do you do that? Square that circle for me, because a lot of people think, when people think security and governance and compliance they think, oh, that stifles innovation. How do you make governance an engine of innovation? >> You provide transparency around your data. So, it's transparency around, what does the data mean? What data assets do we have? Where can I find that? Where are my most trusted sources of data? What does the quality of that data look like? So all those things together really enable your data consumers to take that information and create new value for the company. So it's really about enabling your value creators throughout the organization. >> So data is an ingredient. I can tell you where it is, I can give you some kind of rating as to the quality of that data and its usefulness. And then you can take it and do what you need to do with it in your specific line of business. >> That's right. >> Now you said you've been at this two and a half years, so what stages have you gone through since you first began the data governance initiative? >> Sure, so our first year, year and a half was really focused on building the foundations, establishing the playbook for data governance and building our processes and understanding how data governance needed to be implemented to fit New York Life and the culture of the company.
The last twelve months or so has really been focused on operationalizing governance. So we've got the foundations in place, now it's about implementing tools to further augment those capabilities and help assist our data stewards and give them a better skill set and a better tool set to do their jobs. >> Are you, sort of, crowdsourcing the process? I mean, do you have a defined set of people who are responsible for governance, or is everyone taking a role? >> So, it is a two-pronged approach, we do have dedicated data stewards. There are approximately 15 across various lines of business throughout the company. But, we are building towards a data democratization aspect. So, we want people to be self-sufficient in finding the data that they need and understanding the data. And then, when they have questions, relying on our stewards as a network of subject matter experts who also have some authorizations to make changes and adapt the data as needed. >> Susan, one of the challenges that we see is that the chief data officers oftentimes are not involved in some of these skunkworks AI projects. They're sort of either hidden, maybe not even hidden, but they're in the line of business, they're moving. You know, there's a mentality of move fast and break things. The challenge with AI is, if you start operationalizing AI and you're breaking things without data quality, without data governance, you can really affect lives. We've seen it. It's one of these unintended consequences. I mean, Facebook is the obvious example and there are many, many others. But, are you seeing that? How are you seeing organizations dealing with that problem? >> As Blake was mentioning, oftentimes what it is about is, you've got to start with transparency, and you've got to start with collaborating across your lines of business, including the data scientists, and including in terms of what they are doing. And actually provide that level of transparency, provide a level of collaboration.
And a lot of that is through the use of our technology enablers to basically go out and find where the data is and what people are using and to be able to provide a mechanism for them to collaborate in terms of, hey, how do I get access to that? I didn't realize you were the SME for that particular component. And then also, did you realize that there is a policy associated with the data that you're managing and it can't be shared externally or with certain consumer data sets? So, the objective really is around how to create a platform to ensure that anyone in your organization, whether I'm in the line of business and don't have a technical background, or someone who does have a technical background, can come and access and understand that information and connect with their peers. >> So you're helping them to discover the data. What do you do at that stage? >> What we do at that stage is create insights for anyone in the organization to understand it from an impact analysis perspective, so, for example, if I'm going to make changes, as well as discovery. Where exactly is my information? And so we have-- >> Right. How do you help your customers discover that data? >> Through the machine learning and artificial intelligence capabilities of, specifically, our data catalog, which allows us to do that. So we use such things as similarity-based matching, which helps us to identify. The column doesn't have to be named for what it holds; it could be named miscellaneous text one. But, with our ability to scan and discover, we can identify in that column what is potentially a social security number. It might have resided there over years of having this data, but you may not realize that it's still stored there. Our ability to identify that and report that out to the data stewards as well as the data analysts, as well as to the privacy individuals, is critical.
So, with that being said, then they can actually identify the appropriate policies that need to be adhered to, along with it in terms of quality, in terms of, is there something that we need to archive. So that's where we're helping our customers in that aspect. >> So you can infer from the data, the metadata, and then, with a fair degree of accuracy, categorize it and automate that. >> Exactly. We've got a customer that actually ran this and they said that, you know, we took three people, three months to actually physically tag where all this information existed across something like 7,000 critical data elements. And, basically, after the setup and the scanning procedures, within seconds we were able to get within 90% precision. Because, again, we've dealt a lot with metadata. It's core to our artificial intelligence and machine learning. And it's core to how we built out our platforms to share that metadata, to do something with that metadata. It's not just about sharing the glossary and the definition information. We also want to automate and reduce the manual burden. Because we recognize with that scale, manual documentation, manual cataloging and tagging just, >> It doesn't work. >> It doesn't work. It doesn't scale. >> Humans are bad at it. >> They're horrible at it. >> So I presume you have a chief data officer at New York Life, is that correct? >> We have a chief data and analytics officer, yes. >> Okay, and you work within that group? >> Yes, that is correct. >> Do you report into that? >> Yes, so-- >> And that individual, yeah, describe the organization. >> So that sits in our lines of business. Originally, our data governance office sat in technology. And then, in early 2018, we actually re-orged into the business under the chief data and analytics officer when that role was formed.
So we sit under that group along with a data solutions and governance team that includes several of our data stewards and also some others, some data engineer-type roles. And then, our center for data science and analytics as well, which contains a lot of our data science teams and that type of work. >> So in thinking about some of these, what I was describing to Susan as these skunkworks projects, is the data team, the chief data officer's team, involved in those projects, or is it sort of a, go run water through the pipes, get an MVP and then you guys come in? How does that all work? >> We're working to try to centralize that function as much as we can, because we do believe there's value in the left hand knowing what the right hand is doing in those types of things. So we're trying to build those communications channels and build that network of data consumers across the organization. >> It's hard, right? >> It is. >> Because the line of business wants to move fast, and you're saying, hey, we can help. And they think you're going to slow them down, but in fact, you've got to make the case and show the success, because you're actually not going to slow them down in terms of the ultimate outcome. I think that's the case that you're trying to make, right? >> And that's one of the things that we try to really focus on, and I think that's one of the advantages to us being embedded in the business under the CDAO role, is that we can then say our objectives are your objectives. We are here to add value and to align with what you're working on. We're not trying to slow you down or hinder you, we're really trying to bring more to the table and augment what you're already trying to achieve. >> Sometimes getting that organization right means everything, as we've seen. >> Absolutely. >> That's right. >> How are you applying governance discipline to unstructured data?
>> That's actually something that's a little bit further down our road map, but one of the things that we have started doing is looking at our taxonomies for structured data and aligning those with the taxonomies that we're using to classify unstructured data. So, that's something we're in the early stages with, so that when we get to that process of looking at more of our unstructured content, we already have a good feel that there's alignment between the ways that we think about and organize those concepts. >> Have you identified automation tools that can help to bring structure to that unstructured data? >> Yes, we have. And there are several tools out there that we're continuing to investigate and look at. But, that's one of the key things that we're trying to achieve through this process, bringing structure to unstructured content. >> So, the conference. First year at the conference. >> Yes. >> Kind of key takeaways, things that are interesting to you, learnings? >> Oh, yes, well the number of CDOs that are here and what's top of mind for them. I mean, it ranges from, how do I stand up my operating model? We just had a session just about 30 minutes ago. A lot of questions around, how do I set up my organization structure? How do I stand up my operating model so that I could be flexible? To, right, the data scientists, to the folks that are more traditional in structured and trusted data. So, still these things are top-of-mind, because they're recognizing the market is also changing too. And the growing amount of expectations, not only solving business outcomes, but also regulatory compliance; privacy is also top-of-mind for a lot of customers. In terms of, how would I get started? And what's the appropriate structure and mechanism for doing so? So we're getting a lot of those types of questions as well.
So, the good thing is many of us have had years of experience in this phase and the convergence of us being able to support our customers, not only in our principles around how we implement the framework, but also the technology is really coming together very nicely. >> Anything you'd add, Blake? >> I think it's really impressive to see the level of engagement with thought leaders and decision makers in the data space. You know, as Susan mentioned, we just got out of our session and really, by the end of it, it turned into more of an open discussion. There was just this kind of back and forth between the participants. And so it's really engaging to see that level of passion from such a distinguished group of individuals who are all kind of here to share thoughts and ideas. >> Well anytime you come to a conference, it's sort of any open forum like this, you learn a lot. When you're at MIT, it's like super-charged. With the big brains. >> Exactly, you feel it when you come on the campus. >> You feel smarter when you walk out of here. >> Exactly, I know. >> Well, guys, thanks so much for coming to theCUBE. It was great to have you. >> Thank you for having us. We appreciate it, thank you. >> You're welcome. All right, keep it right there everybody. Paul and I will be back with our next guest. You're watching theCUBE from MIT in Cambridge. We'll be right back. (techno music)

Published Date : Aug 2 2019

ENTITIES

Entity	Category	Confidence
Paul Gillin	PERSON	0.99+
Susan	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Paul	PERSON	0.99+
Susan Wilson	PERSON	0.99+
Blake	PERSON	0.99+
Informatica	ORGANIZATION	0.99+
Cambridge	LOCATION	0.99+
Mark Ramsey	PERSON	0.99+
Blake Anders	PERSON	0.99+
three months	QUANTITY	0.99+
three people	QUANTITY	0.99+
Facebook	ORGANIZATION	0.99+
New York Life	ORGANIZATION	0.99+
early 2018	DATE	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
First year	QUANTITY	0.99+
one	QUANTITY	0.99+
90%	QUANTITY	0.99+
two and half years	QUANTITY	0.98+
first	QUANTITY	0.98+
approximately 15	QUANTITY	0.98+
7,000 critical data elements	QUANTITY	0.97+
about two and half years	QUANTITY	0.97+
first year	QUANTITY	0.96+
two	QUANTITY	0.96+
about 30 minutes ago	DATE	0.96+
theCUBE	ORGANIZATION	0.95+
Blake Andrews	PERSON	0.95+
MIT Chief Data Officer and	EVENT	0.93+
MIT Chief Data Officer Information Quality Conference	EVENT	0.91+
EDW	ORGANIZATION	0.86+
last twelve months	DATE	0.86+
skunkworks	ORGANIZATION	0.85+
CDAO	ORGANIZATION	0.85+
this morning	DATE	0.83+
MIT	ORGANIZATION	0.83+
7 years ago	DATE	0.78+
year	QUANTITY	0.75+
Information Quality Symposium 2019	EVENT	0.74+
3.0	OTHER	0.66+
York Life	ORGANIZATION	0.66+
2.0	OTHER	0.59+
MIT CDOIQ 2019	EVENT	0.58+
half	QUANTITY	0.52+
Data 2.0	OTHER	0.52+
Data 3.0	TITLE	0.45+
1.0	OTHER	0.43+
Data	OTHER	0.21+

Robert Abate, Global IDS | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE. Covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. (futuristic music) >> Welcome back to Cambridge, Massachusetts everybody. You're watching theCUBE, the leader in live tech coverage. We go out to the events and we extract the signal from the noise. This is day two, we're sort of wrapping up the Chief Data Officer event. It's MIT CDOIQ, it started as an information quality event and with the ascendancy of big data the CDO emerged and really took center stage here. And it's interesting to know that it's kind of come full circle back to information quality. People are realizing all this data we have, you know the old saying, garbage in, garbage out. So the information quality worlds and this chief data officer world have really come colliding together. Robert Abate is here, he's the Vice President and CDO of Global IDS and also the co-chair of next year's, the 14th annual MIT CDOIQ. Robert, thanks for coming on. >> Oh, well thank you. >> Now you're a CDO by background, give us a little history of your career. >> Sure, sure. Well I started out with an Electrical Engineering degree and went into applications development. By 2000, I was leading the Ralph Lauren's IT, and I realized when Ralph Lauren hired me, he was getting ready to go public. And his problem was he had hired eight different accounting firms to do eight different divisions. And each of those eight divisions were reporting a number, but the big number didn't add up, so he couldn't go public. So he searched the industry to find somebody who could figure out the problem. Now I was, at the time, working in applications and had built this system called Service Oriented Architectures, a way of integrating applications. And I said, "Well I don't know if I could solve the problem, "but I'll give it a shot." 
And what I did was, instead of taking each silo as its own problem, which was what the eight accounting firms had done, I was able to figure out that one of Ralph Lauren's policies was if you buy a garment, you can return it anytime, anywhere, forever, however long you own it. And he didn't think about that, but what that meant is somebody could go to a Bloomingdale's, buy a garment and then go to his outlet store and return it. Well, the cross channels were different systems. So the outlet stores were his own business, retail was a different business, there was a completely different, each one had their own AS/400, their own data. So what I quickly learned was, the problem wasn't the systems, the problem was the data. And it took me about two months to figure it out and he offered me a job, he said well, I was a consultant at the time, he says, "I'm offering you a job, you're going to run my IT." >> Great user experience but hard to count. >> (laughs) Hard to count. So that's when I, probably 1999 was when that happened. I went into data and started researching-- >> Sorry, so how long did it take you to figure that out? You said a couple of months? >> A couple of months, I think it was about two months. >> 'Cause jeez, it took Oracle what, 10 years to build Fusion with SOA? That's pretty good. (laughs) >> This was a little bit of luck. When we started integrating the applications we learned that the messages that we were sending back and forth didn't match, and we said, "Well that's impossible, it can't not match." But what didn't match was it was coming from one channel and being returned in another channel, and the returns showed here didn't balance with the returns on this side. So it was a data problem. >> So a forensics showdown. So what did you do after? >> After that I went into ICICI Bank which was a large bank in India who was trying to integrate their systems, and again, this was a data problem. 
But they heard me giving a talk at a conference on how SOA had solved the data challenge, and they said, "We're a bank with a wholesale, a retail, "and other divisions, "and we can't integrate the systems, can you?" I said, "Well yeah, I'd build a website "and make them web services and now what'll happen is "each of those'll kind of communicate." And I was at ICICI Bank for about six months in Mumbai, and finished that which was a success, came back and started consulting because now a lot of companies were really interested in this concept of Service Oriented Architectures. Back then when we first published on it, myself, Peter Aiken, and a gentleman named Joseph Burke published on it in 1996. The publisher didn't accept the book, it was a really interesting thing. We wrote the book called, "Services Based Architectures: A Way to Integrate Systems." And the way Wiley & Sons, or most publishers work is, they'll have three industry experts read your book and if they don't think what you're saying has any value, they, forget about it. So one guy said this is brilliant, one guy says, "These guys don't know what they're talking about," and the third guy says, "I don't even think what they're talking about is feasible." So they decided not to publish. Four years later it came back and said, "We want to publish the book," and Peter said, "You know what, they lost their chance." We were ahead of them by four years, they didn't understand the technology. So that was kind of cool. So from there I went into consulting, eventually took a position as the Head of Enterprise and Director of Enterprise Information Architecture with Walmart. And Walmart, as you know, is a huge entity, almost the size of the federal government. So to build an architecture that integrates Walmart would've been a challenge, a behemoth challenge, and I took it on with a phenomenal team. >> And when was this, like what timeframe? 
>> This was 2010, and by the end of 2010 we had presented an architecture to the CIO and the rest of the organization, and they came back to me about a week later and said, "Look, everybody agrees what you did was brilliant, "but nobody knows how to implement it. "So we're taking you away, "you're no longer Director of Information Architecture, "you're now Director of Enterprise Information Management. "Build it. "Prove that what you say you could do, you could do." So we built something called the Data CAFE, and CAFE was an acronym, it stood for: Collaborative Analytics Facility for the Enterprise. What we did was we took data from one of the divisions, because you didn't want to take on the whole beast, boil the ocean. We picked Sam's Club and we worked with their CFO, and because we had information about customers we were able to build a room with seven 80 inch monitors that surrounded anyone in the room. And in the center was the Cisco telecommunications so you could be a part of a meeting. >> The TelePresence. >> TelePresence. And we built one room in one facility, and one room in another facility, and we labeled the monitors, one red, one blue, one green, and we said, "There's got to be a way where we can build "data science so it's interactive, so somebody, "an executive could walk into the room, "touch the screen, and drill into features. "And in another room "the features would be changing simultaneously." And that's what we built. The room was brought up on Black Friday of 2013, and we were able to see the trends of sales on the East Coast that we quickly, the executives in the room, and these are the CEO of Walmart and the heads of Sam's Club and the like, they were able to change the distribution in the Mountain Time Zone and west time zones because of the sales on the East Coast gave them the idea, well these things are going to sell, and these things aren't. And they saw a tremendous increase in productivity. 
We received the 2014, my team received the 2014 Walmart Innovation Project of the Year. >> And that's no slouch. Walmart has always been heavily data-oriented. I don't know if it's urban legend or not, but the famous story in the '80s of the beer and the diapers, right? Walmart would position beer next to diapers, why would they do that? Well the father goes in to buy the diapers for the baby, picks up a six pack while he's on the way, so they just move those proximate to each other. (laughs) >> In terms of data, Walmart really learned that there's an advantage to understanding how to place items in places that, a path that you might take in a store, and knowing that path, they actually have a term for it, I believe it's called, I'm sorry, I forgot the name but it's-- >> Selling more stuff. (laughs) >> Yeah, it's selling more stuff. It's the way you position items on a shelf. And Walmart had the brilliance, or at least I thought it was brilliant, that they would make their vendors the data champion. So the vendor, let's say Procter & Gamble's a vendor, and they sell this one product the most. They would then be the champion for that aisle. Oh, it's called planogramming. So the planogramming, the way the shelves were organized, would be set up by Procter & Gamble for that entire area, working with all their other vendors. And so Walmart would give the data to them and say, "You do it." And what I was purporting was, well, we shouldn't just be giving the data away, we should be using that data. And that was the advent of that. From there I moved to Kimberly-Clark, I became Global Director of Enterprise Data Management and Analytics. Their challenge was they had different teams, there were four different instances of SAP around the globe. One for Latin America, one for North America called the Enterprise Edition, one for EMEA, Europe, Middle East, and Africa, and one for Asia-Pacific. 
Well when you have four different instances of SAP, that means your master data doesn't exist because the same thing that happens in this facility is different here. And every company faces this challenge. If they implement more than one of a system the specialty fields get used by different companies in different ways. >> The gold standard, the gold version. >> The golden version. So I built a team by bringing together all the different international teams, and created one team that was able to integrate best practices and standards around data governance, data quality. Built BI teams for each of the regions, and then a data science and advanced analytics team. >> Wow, so okay, so that makes you uniquely qualified to coach here at the conference. >> Oh, I don't know about that. (laughs) There are some real, there are some geniuses here. >> No but, I say that because these are your peeps. >> Yes, they are, they are. >> And so, you're a practitioner, this conference is all about practitioners talking to practitioners, it's content-heavy, There's not a lot of fluff. Lunches aren't sponsored, there's no lanyard sponsor and it's not like, you know, there's very subtle sponsor desks, you have to have sponsors 'cause otherwise the conference's not enabled, and you've got costs associated with it. But it's a very intimate event and I think you guys want to keep it that way. >> And I really believe you're dead-on. When you go to most industry conferences, the industry conferences, the sponsors, you know, change the format or are heavily into the format. Here you have industry thought leaders from all over the globe. CDOs of major Fortune 500 companies who are working with their peers and exchanging ideas. I've had conversations with a number of CDOs and the thought leadership at this conference, I've never seen this type of thought leadership in any conference. 
>> Yeah, I mean the percentage of presentations by practitioners, even when there's a vendor name, they have a practitioner, you know, internal practitioner presenting so it's 99.9% which is why people attend. We're moving venues next year, I understand. Just did a little tour of the new venue, so, going to be able to accommodate more attendees, so that's great. >> Yeah it is. >> So what are your objectives in thinking ahead a year from now? >> Well, you know, I'm taking over from my current peer, Dr. Arka Mukherjee, who just did a phenomenal job of finding speakers. People who are in the industry, who are presenting challenges, and allowing others to interact. So I hope could do a similar thing which is, find with my peers people who have real world challenges, bring them to the forum so they can be debated. On top of that, there are some amazing, you know, technology change is just so fast. One of the areas like big data I remember only five years ago the chart of big data vendors maybe had 50 people on it, now you would need the table to put all the vendors. >> Who's not a data vendor, you know? >> Who's not a data vendor? (laughs) So I would think the best thing we could do is, is find, just get all the CDOs and CDO-types into a room, and let us debate and talk about these points and issues. I've seen just some tremendous interactions, great questions, people giving advice to others. I've learned a lot here. >> And how about long term, where do you see this going? How many CDOs are there in the world, do you know? Is that a number that's known? >> That's a really interesting point because, you know, only five years ago there weren't that many CDOs to be called. And then Gartner four years ago or so put out an article saying, "Every company really should have a CDO." 
Not just for the purpose of advancing your data, and to Doug Laney's point that data is being monetized, there's a need to have someone responsible for information 'cause we're in the Information Age. And a CIO really is focused on infrastructure, making sure I've got my PCs, making sure I've got a LAN, I've got websites. The focus on data, because of the Information Age, has really turned data into an asset. So organizations realize, if you utilize that asset, let me reverse this, if you don't use data as an asset, you will be out of business. I heard a quote, I don't know if it's true, "250 of the Fortune 500 from only 10 years ago no longer exist." >> Yeah, something like that, the turnover's amazing. >> Many of those companies were companies that decided not to make the change to be data-enabled, to make data decision processing. Companies still use data warehouses, they're always going to use them, and a warehouse is a rear-view mirror, it tells you what happened last week, last month, last year. But today's businesses work forward-looking. And just like driving a car, it'd be really hard to drive your car through a rear-view mirror. So what companies are doing today are saying, "Okay, let's start looking at this as forward-looking, "with prescriptive and predictive analytics, "rather than just what happened in the past." I'll give you an example. In a major company that is a supplier of consumer products, they were leading in the industry and their sales started to drop, and they didn't know why. Well, with a data science team, we were able to determine by pulling in data from the CDC, now these are sources that only 20 years ago nobody ever used to bring in data in the enterprise, now 60% of your data is external. So we brought in data from the CDC, we brought in data on maternal births from the national government, we brought in data from the Census Bureau, we brought in data from sources of advertising and targeted marketing towards mothers. 
Pulled all that data together and said, "Why are diaper sales down?" Well they were targeting the large regions of the country and putting ads in TV stations in New York and California, big population centers. Birth rates in population centers have declined. Birth rates in certain other regions, like the south, and the Bible Belt, if I can call it that, have increased. So by changing the marketing, their product sales went up. >> Advertising to Texas. >> Well, you know, and that brings me to one of the points, I heard a lecture today about ethics. We made it a point at Walmart that if you ran a query that reduced a result to less than five people, we wouldn't allow you to see the result. Because, think about it, I could say, "What is my neighbor buying? "What are you buying?" So there's an ethical component to this as well. But that, you know, data is not political. Data is not chauvinistic. It doesn't discriminate, it just gives you facts. It's the interpretation of that that is hard for CDOs, because we have to say to someone, "Look, this is the fact, and your 25 years "of experience in the business, "granted, is tremendous and it's needed, "but the facts are saying this, "and that would mean that the business "would have to change its direction." And it's hard for people to do, so it requires that. >> So whether it's called the chief data officer, whatever the data czar rubric is, the head of analytics, there's obviously the data quality component there whatever that is, this is the conference for, as I called them, your peeps, for that role in the organization. People often ask, "Will that role be around?" I think it's clear, it's solidifying. Yes, you see the chief digital officer emerging and there's a lot of tailwinds there, but the information quality component, the data architecture component, it's here to stay. And this is the premier conference, the premier event, that I know of anyway. 
There are a couple of others, perhaps, but it's great to see all the success. When I first came here in 2013 there were probably about 130 folks here. Today, I think there were 500 people registered almost. Next year, I think 600 is kind of the target, and I think it's very reasonable with the new space. So congratulations on all the success, and thank you for stepping up to the co-chair role, I really appreciate it. >> Well, let me tell you I thank you guys. You provide a voice at these IT conferences that we really need, and that is the ability to get the message out. That people do think and care, the industry is not thoughtless and heartless. With all the data breaches and everything going on there's a lot of fear, loathing, and anticipation. But having your voice, kind of like ESPN and a sports show, gives the technology community, which is getting larger and larger by the day, a voice and we need that so, thank you. >> Well thank you, Robert. We appreciate that, it was great to have you on. Appreciate the time. >> Great to be here, thank you. >> All right, and thank you for watching. We'll be right back with our next guest as we wrap up day two of MIT CDOIQ. You're watching theCUBE. (futuristic music)
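
The query rule Robert describes in this conversation (hide any result that narrows down to fewer than five people) can be sketched in a few lines of Python. This is only an illustration of the idea, not Walmart's implementation: the function name `guarded_counts`, the row layout, and the use of `None` to mark a hidden result are all assumptions made for the example.

```python
from collections import defaultdict

# Assumed threshold, taken from the "less than five people" rule in the interview.
MIN_GROUP_SIZE = 5


def guarded_counts(rows, group_key, person_key, min_group_size=MIN_GROUP_SIZE):
    """Count distinct people per group, hiding groups that are too small.

    rows: iterable of dicts (hypothetical layout, e.g. one dict per purchase).
    Returns a dict mapping each group to its distinct-person count, or to
    None when the count falls below min_group_size.
    """
    people_by_group = defaultdict(set)
    for row in rows:
        people_by_group[row[group_key]].add(row[person_key])

    results = {}
    for group, people in people_by_group.items():
        # A small group could identify individuals ("what is my neighbor
        # buying?"), so the count is suppressed instead of returned.
        results[group] = len(people) if len(people) >= min_group_size else None
    return results


if __name__ == "__main__":
    purchases = [{"item": "diapers", "customer": c} for c in range(6)]
    purchases += [{"item": "beer", "customer": c} for c in range(2)]
    print(guarded_counts(purchases, "item", "customer"))
```

In this sketch the check runs on the number of distinct people behind a result rather than on the number of rows, since one customer can appear in many rows.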

Published Date : Aug 1 2019

ENTITIES

Entity	Category	Confidence
Peter	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
Peter Aiken	PERSON	0.99+
Robert Abate	PERSON	0.99+
Robert	PERSON	0.99+
Procter & Gamble	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
India	LOCATION	0.99+
Mumbai	LOCATION	0.99+
Census Bureau	ORGANIZATION	0.99+
2010	DATE	0.99+
1996	DATE	0.99+
New York	LOCATION	0.99+
last week	DATE	0.99+
last year	DATE	0.99+
last month	DATE	0.99+
60%	QUANTITY	0.99+
Bloomingdale	ORGANIZATION	0.99+
Next year	DATE	0.99+
1999	DATE	0.99+
Texas	LOCATION	0.99+
25 years	QUANTITY	0.99+
10 years	QUANTITY	0.99+
one room	QUANTITY	0.99+
2014	DATE	0.99+
2013	DATE	0.99+
Doug Laney	PERSON	0.99+
Sam's Club	ORGANIZATION	0.99+
ICICI Bank	ORGANIZATION	0.99+
99.9%	QUANTITY	0.99+
Wiley & Sons	ORGANIZATION	0.99+
50 people	QUANTITY	0.99+
Arka Mukherjee	PERSON	0.99+
next year	DATE	0.99+
Jos	PERSON	0.99+
Today	DATE	0.99+
third guy	QUANTITY	0.99+
2000	DATE	0.99+
today	DATE	0.99+
one	QUANTITY	0.99+
500 people	QUANTITY	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
one channel	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
each	QUANTITY	0.99+
One	QUANTITY	0.99+
CDC	ORGANIZATION	0.99+
less than five people	QUANTITY	0.99+
Ralph Lauren	ORGANIZATION	0.99+
one guy	QUANTITY	0.99+
six pack	QUANTITY	0.99+
ESPN	ORGANIZATION	0.99+
four years ago	DATE	0.98+
Africa	LOCATION	0.98+
SOA	TITLE	0.98+
five years ago	DATE	0.98+
California	LOCATION	0.98+
Gartner	ORGANIZATION	0.98+
three industry experts	QUANTITY	0.98+
Global IDS	ORGANIZATION	0.98+
Four years later	DATE	0.98+
600	QUANTITY	0.98+
20 years ago	DATE	0.98+
East Coast	LOCATION	0.98+
250	QUANTITY	0.98+
Middle East	LOCATION	0.98+
four years	QUANTITY	0.98+
one team	QUANTITY	0.97+
months	QUANTITY	0.97+
first	QUANTITY	0.97+
about two months	QUANTITY	0.97+
Latin America	LOCATION	0.97+

Andy Palmer, TAMR | MIT CDOIQ 2019

>> from Cambridge, Massachusetts. It's the Cube covering M. I. T. Chief Data officer and Information Quality Symposium 2019 Brought to you by Silicon Angle Media >> Welcome back to M I. T. Everybody watching the Cube. The leader in live tech coverage we hear a Day two of the M I t chief data officer information Quality Conference Day Volonte with Paul Dillon. Andy Palmer's here. He's the co founder and CEO of Tamer. Good to see again. It's great to see it actually coming out. So I didn't ask this to Mike. I could kind of infirm from someone's dances. But why did you guys start >> Tamer? >> Well, it really started with an academic project that Mike was doing over at M. I. T. And I was over in of artists at the time. Is the chief get officer over there? And what we really found was that there were a lot of companies really suffering from data mastering as the primary bottleneck in their company did used great new tech like the vertical system that we've built and, you know, automated a lot of their warehousing and such. But the real bottleneck was getting lots of data integrated and mastered really, really >> quickly. Yeah, He took us through the sort of problems with obviously the d. W. In terms of scaling master data management and the scanning problems was Was that really the problem that you were trying to solve? >> Yeah, it really was. And when we started, I mean, it was like, seven years ago, eight years ago, now that we started the company and maybe almost 10 when we started working on the academic project, and at that time, people weren't really thinking are worried about that. They were still kind of digesting big data. A zit was called, but I think what Mike and I kind of felt was going on was that people were gonna get over the big data, Um, and the volume of data. And we're going to start worrying about the variety of the data and how to make the data cleaner and more organized. And, uh, I think I think way called that one pretty much right. 
Maybe >> we're a little >> bit early, but but I think now variety is the big problem >> with the other thing about your big day. Big data's oftentimes associated with Duke, which was a batch and then you sort of saw the shifter real time and spark was gonna fix all that. And so what are you seeing in terms of the trends in terms of how data is being used to drive almost near real time business decisions. >> You know, Mike and I came out really specifically back in 2007 and declared that we thought, uh, Hadoop and H D f s was going to be far less impactful than other people. >> 07 >> Yeah, Yeah. And Mike Mike actually was really aggressive and saying it was gonna be a disaster. And I think we've finally seen that actually play out of it now that the bloom is off the rose, so to speak. And so they're They're these fundamental things that big companies struggle with in terms of their data and, you know, cleaning it up and organizing it and making it, Iike want. Anybody that's worked at one of these big companies can tell you that the data that they get from most of their internal system sucks plain and simple, and so cleaning up that data, turning it into something it's an asset rather than liability is really what what tamers all about? And it's kind of our mission. We're out there to do this and it sort of pails and compare. Do you think about the amount of money that some of these companies have spent on systems like ASAP on you're like, Yeah, but all the data inside of the systems so bad and so, uh, ugly and unuseful like we're gonna fix that problem. >> So you're you're you're special sauce and machine learning. Where are you applying machine learning most most effectively when >> we apply machine learning to probably the least sexy problem on the planet. There are a lot of companies out there that use machine learning and a I t o do predictive algorithms and all kinds of cool stuff. 
All we do with machine learning is actually use it to clean up data and organize data. Get it ready for people to use a I I I started in the eye industry back in the late 19 eighties on, you know, really, I learned from the sky. Marvin Minsky and Mark Marvin taught me two things. First was garbage in garbage out. There's no algorithm that's worth anything unless you've got great data, and the 2nd 1 is it's always about the human in the machine working together. And I've really been working on those two same principles most of my career, and Tamer really brings both of those together. Our goal is to prepare data so that it can be used analytically inside of these companies, that it's actually high quality and useful. And the way we do that involves bringing together the machine, mostly these advanced machine learning algorithms with humans, subject matter experts inside of these companies that actually know all the ins and outs and all the intricacies of the data inside of their company. >> So say garbage in garbage out. If you don't have good training data course you're not going good ML model. How much how much upfront work is required. G. I know it was one of your customers and how much time is required to put together on ML model that can deal with 20,000,000 records like that? >> Well, you know, the amazing thing that this happened for us in the last five years, especially is that now we've got we've built enough models from scratch inside of these large global 2000 companies that very rarely do we go into a place where there we don't already have a model that's pre built. That they can use is a starting point. And I think that's the same thing that's happening in modeling in general. If you look a great companies like data robot Andi and even in in the Python community ml live that the accessibility of these modeling tools and the models themselves are actually so they're commoditized. 
And so for most of our models and most of the projects we work on, we've already got a model that's a starting point. We don't really have to start from scratch. >> You mentioned getting into AI in the eighties. Is the notion of AI the same as it was in the eighties, and now we've just got the tooling, the horsepower, the data to take advantage of it, or has the concept changed? >> The math is all the same, like, you know, absolutely, full stop. There's really no new math. The two things I think that have changed: first, there's a lot more data that's available now, and, you know, neural nets are a great example, right? One of Marvin's things. You know, when you look at Google Translate and how aggressively they used neural nets, it was the quantity of data that was available that actually made neural nets work. The second thing that's changed is the cheap availability of compute, that now the largest supercomputer in the world is available to rent by the minute. And so we've got all this data, you've got all this really cheap compute, and then the third thing is what you alluded to earlier: the accessibility of all the math, that now it's becoming so simple and easy to apply these math techniques. It's almost to the point where the average data scientist, not the advanced one, the average data scientist, can practice AI techniques that 20 years ago required five PhDs. >> It's not surprising that Google, with its new neural net technology and all the search data that it has, has been so successful. Does it surprise you that Amazon with Alexa was able to compete so effectively? >> Oh, I think that I would never underestimate Amazon and their ability to, you know, build great tech. They've done some amazing work.
One of my favorite, Mike and I actually, one of our favorite examples: in the last, uh, three years, they took their Redshift system, you know, that competed with Vertica, and they reimplemented it, you know, as a compiled system, and it really runs incredibly fast. I mean, that feat of engineering was truly exceptional. >> Interesting to hear you say that, because wasn't Redshift originally ParAccel? >> Yeah, that's right. >> Larry Ellison craps all over Redshift because it's just open source software that they just took and repackaged. But you're saying they did some major engineering to it. >> Oh my gosh, yeah. Mike and I both, you know, we always compared ParAccel to Vertica, and, you know, we always knew we were better in a whole bunch of ways. But this latest rewrite that they've done, this compiled version, like, it's really good. >> So, as a guy who has been doing AI for 30 years now and is really seeing it come into its own: a lot of AI projects right now seem to be sort of low-hanging fruit, small-scale stuff. Where do you see AI in five years? What kind of projects are companies gonna be undertaking, and what kind of new applications are gonna come out of this? >> I think we're at the very beginning of this cycle, and actually there's a lot more potential than has been realized. So I think we are in the pick-the-low-hanging-fruit kind of a thing. But some of the potential applications of AI are so much more impactful, especially as we modernize core infrastructure in the enterprise. So the enterprise is sort of living with this huge legacy burden. And at Tamr we're always encouraging our customers to think of all their existing legacy systems as just data-generating machines, and the faster they can get that data into a state where they can start doing state-of-the-art AI work on top of it, the better.
And so really, you know, you've gotta put the legacy burden aside and kind of draw this line in the sand, so that as they build their muscles on the AI side, they can take advantage of that with all the data that they're generating every single day. >> Think about these data repositories: the enterprise data warehouse, and you guys built better data warehouses with MPP technology; the master data management stuff; the top-down, you know, enterprise data models; Hadoop and big data. None of them really lived up to their promise, you know? Yeah, it's kind of somewhat unfair to the MPP guys, because you said, hey, we're just gonna run faster, and you did. But you didn't say you're gonna change the world and all that stuff, right? Whereas EDW did. Do you feel like this next wave is actually gonna live up to the promise? >> I think the next phase is very logical. Like, you know, I know you're talking to Chris Lynch here in a minute, and you know what they're doing at AtScale. AtScale and Tamr, these companies are all in the same general area, which is kind of related to: how do you take all this data and actually prepare it and turn it into something that's consumable really quickly and easily for all of these new data consumers in the enterprise? And so that's the next logical phase in this process. Now, will this phase be the one that finally sort of meets the high expectations that were set 20, 30 years ago with enterprise data warehousing? I don't know, but we're certainly getting closer. >> I kind of hope not, 'cause then we'll have less to do. Any other cool stuff that you see out there as a technologist? >> I'm huge, I'm fanatical right now about health care. I think that the opportunity for health care to be transformed with technology, you know, almost makes everything else look like chump change. >> What aspect of health care?
>> Well, I think that the most obvious thing is that now, with the consumer sort of in the driver's seat in health care, technology companies that come in and provide consumer-driven solutions that meet the needs of patients, regardless of how dysfunctional the health care system is, that's killer stuff. We had a great company here in Boston called PillPack that was a great example of that, where they just built something better for consumers, and it was so popular and so, you know, broadly adopted, again and again. Eventually, Amazon bought it for $1 billion. But those kinds of things in health care, PillPack is just the beginning. There's lots and lots of those kinds of opportunities. >> Well, that's right. Health care's ripe for disruption, and it hasn't been hit with the digital disruption. And neither has financial services, really. Certainly, defense has not yet either. They're high-risk industries, so it absolutely takes longer. Well, Andy, thanks so much for making the time. I know you've gotta run. >> Yeah, yeah. Thank you. >> All right, keep it right there, everybody. We'll be back with our next guest right after this short break. You're watching theCUBE from MIT CDOIQ. Be right back.

Published Date : Aug 1 2019

ENTITIES

Entity	Category	Confidence
Mike	PERSON	0.99+
Andy	PERSON	0.99+
Andy Palmer	PERSON	0.99+
Mark Marvin	PERSON	0.99+
2007	DATE	0.99+
Amazon	ORGANIZATION	0.99+
Paul Dillon	PERSON	0.99+
Boston	LOCATION	0.99+
$1,000,000,000	QUANTITY	0.99+
Chris Lynch	PERSON	0.99+
Marvin Minsky	PERSON	0.99+
Larry Ellison	PERSON	0.99+
First	QUANTITY	0.99+
both	QUANTITY	0.99+
30 years	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
Silicon Angle Media	ORGANIZATION	0.99+
second thing	QUANTITY	0.99+
third thing	QUANTITY	0.99+
20,000,000 records	QUANTITY	0.99+
two same principles	QUANTITY	0.99+
seven years ago	DATE	0.99+
eight years ago	DATE	0.99+
Mike Mike	PERSON	0.98+
three years	QUANTITY	0.98+
late 19 eighties	DATE	0.98+
first	QUANTITY	0.98+
five years	QUANTITY	0.98+
2030 years ago	DATE	0.98+
2nd 1	QUANTITY	0.98+
one	QUANTITY	0.98+
One	QUANTITY	0.98+
two things	QUANTITY	0.97+
five PhDs	QUANTITY	0.97+
Day two	QUANTITY	0.97+
Veronica	PERSON	0.97+
M I. T.	PERSON	0.96+
Marvin	PERSON	0.96+
20 years ago	DATE	0.96+
Python	TITLE	0.96+
eighties	DATE	0.94+
2019	DATE	0.94+
2000 companies	QUANTITY	0.94+
Red Shift	TITLE	0.94+
Duke	ORGANIZATION	0.93+
Alexa	TITLE	0.91+
last five years	DATE	0.9+
M I t	EVENT	0.88+
almost 10	QUANTITY	0.87+
TAMR	PERSON	0.86+
Andi	PERSON	0.8+
M. I. T.	ORGANIZATION	0.79+
Tamer	ORGANIZATION	0.78+
Information Quality Symposium	EVENT	0.78+
Quality Conference Day Volonte	EVENT	0.77+
Tamer	PERSON	0.77+
Google translate	TITLE	0.75+
single day	QUANTITY	0.71+
H	PERSON	0.71+
Chief	PERSON	0.66+
Hadoop	PERSON	0.64+
MIT	ORGANIZATION	0.63+
Cube	ORGANIZATION	0.61+
more	QUANTITY	0.6+
M. I. T.	PERSON	0.57+
Pill pack	COMMERCIAL_ITEM	0.56+
Pill Pack	ORGANIZATION	0.53+
D f s	ORGANIZATION	0.48+
Park	TITLE	0.44+
CDOIQ	EVENT	0.32+
Cube	PERSON	0.27+

Aaron Kalb, Alation | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. (dramatic music) >> Welcome back to Cambridge, Massachusetts, everybody. This is theCUBE, the leader in live tech coverage. We go out to the events, and we extract the signal from the noise. And, we're here at the MIT CDOIQ, the Chief Data Officer conference. I'm Dave Vellante with my cohost Paul Gillin. Day two of our wall to wall coverage. Aaron Kalb is here. He's the cofounder and chief data officer of Alation. Aaron, thanks for making the time to come on. >> Thanks so much Dave and Paul for having me. >> You're welcome. So, words matter, you know, and we've been talking about data, and big data, and the three Vs, and data is the new oil, and all this stuff. You gave a talk this week about, you know, "We're maybe not talking the right language "when it comes to data." What did you mean by all that? >> Absolutely, so I get a little bit frustrated by some of these clichés we hear at conference after conference, and the one I, sort of, took aim at in this talk is, data is the new oil. I think what people want to invoke with that is to say, in the same way that oil powered the industrial age, data's powering the information age. Just saying, data's really cool and trendy and important. That's true, but there are a lot of other associations and contexts that people have with data, and some of them don't really apply as, I'm sorry, with oil. And, some of them apply, as well, to data. >> So, is data more valuable than oil? >> Well, I think they're each valuable in different ways, but I think there's a couple issues with the metaphor. One is that oil is scarce and dwindling, and part of its value comes from the fact that it's so rare. Whereas, the experience with data is that it's so plentiful and abundant, we're almost drowning in it.
And so, what I contend is, instead of talking about data as compared to oil, we should talk about data compared to water. And, the idea is, you know, water is very plentiful on the planet, but sometimes, you know, if you have saltwater or contaminated water, you can't drink it. Water is good for different purposes, depending on its form, and so it's all about getting the right data for the right purpose, like water. >> Well, we've certainly, at least in my opinion, fought wars, Paul, over oil. >> And, over water. >> And, certainly, conflicts over water. Do you think we'll be fighting wars over data? Or, are we already? >> No, we might be. One of my favorite talks from the sessions here was a keynote by the CDO for the Department of Defense, who was talking about, you know, the civic duty about transparency but was observing that, actually, more IP addresses from China and Russia are looking at our public datasets than from within the country. So, you know, it's definitely a resource that can be very powerful. >> So, what was the reaction to your premise from the audience? What kind of questions did you get? >> You know, people actually responded very favorably, including some folks from the oil and gas industry, which I was pleased to find. We have a lot of customers in energy, so that was cool. But, it was nice being here at MIT and just really geeking out about language and linguistics and data with a bunch of CDOs and other people who are, kind of, data intellectuals. >> Right, so if data is not the new oil. >> And, water isn't really a good analogy either, because the supply of water is finite. >> That's true. >> So, what is data? >> Yeah. >> Space? >> Yeah, it's a good point. >> Matter? >> Maybe it is like the universe in that it's always expanding, right, somehow. Right, because anything physical on the planet probably won't be growing at that exponential speed. >> So, give us the punchline.
>> Well, so I contend that water, while imperfect, is, actually, a really good metaphor that helps for a lot of things. It has properties like the fact that if it's a data quality issue, it flows downstream like pollution in a river. It's the fact that it can come in different forms, useful for different purposes. You might have gray water, right, which is good enough for, you know, irrigation or industrial purposes, but not safe to drink. And so, you rely on metadata to get the data that's in the right form. And, you know, the talk is more fun because you've a lot of visual examples that make this clear. >> Yeah, of course, yeah. >> I actually had one person in the audience say that he used a similar analogy in his own company, so it's fun to trade notes. >> So, chief data officer is a relatively new title for you, is it not? In terms of your role at Alation. >> Yeah, that's right, and the most fun thing about my job is being able to interact with all of the other CDOs and CDAOs at a conference like this. And, it was cool to see. I believe this conference doubled since the last year. Is that right? >> No. >> No, it's up about a hundred, though. >> Right. >> Well. >> And, it's about double from three years ago. >> And, when we first started, in 2013, yeah. >> 130 people, yeah. >> Yeah, it was a very small and intimate event. >> Yeah, here we're outgrowing this building, it seems. >> Yeah, they're kicking us out. >> I think what's interesting is, you know, if we do a little bit of analysis, this is a small data, within our own company, you know, our biggest and most visionary customers typically bought Alation. The buyer champion either was a CDO or they weren't a CDO when they bought the software and have since been promoted to be a CDO. And so, seeing this trend of more and more CDOs cropping up is really exciting for us. And also, just hearing all of the people at the conference saying, two trends we're hearing. 
A move from, sort of, infrastructure and technology to driving business value, and a move from defense and governance to, sort of, playing offense and doing revenue generation with data. Both of those trends are really exciting for us. >> So, don't hate me for asking this question, because what a lot of companies will do is, they'll give somebody a CDO title, and it's, kind of, a little bit of gimmick, right, to go to market. And, they'll drag you into sales, because I'm sure they do, as a cofounder. But, as well, I know CDOs at tech companies that are actually trying to apply new techniques, figure out how data contributes to their business, how they can cut costs, raise revenue. Do you have an internal role, as well? >> Absolutely, yeah. >> Explain that. >> So, Alation, you know, we're about 250 people, so we're not at the same scale as many of the attendees here. But, we want to learn, you know, from the best, and always apply everything that we learn internally as well. So, obviously, analytics, data science is a huge role in our internal operations. >> And so, what kinds of initiatives are you driving internally? Is it, sort of, cost initiatives, efficiency, innovation? >> Yeah, I think it's all of the above, right. Every single division and both in the, sort of, operational efficiency and cost cutting side as well as figuring out the next big bet to make, can be informed by data. And, our goal was to empower a curious and rational world, and our every decision be based not on the highest paid person's opinion, but on the best evidence possible. And so, you know, the goal of my function is largely to enable that both centrally and within each business unit. >> I want to talk to you about data catalogs a bit because it's a topic close to my heart. I've talked to a lot of data catalog companies over the last couple years, and it seems like, for one thing, the market's very crowded right now. It seems to me. Would you agree there are a lot of options out there? 
>> Yeah, you know, it's been interesting because when we started it, we were basically the first company to make this technology and to, kind of, use this term, data catalog, in this way. And, it's been validating to see, you know, a lot of big players and other startups even, kind of, coming to that terminology. But, yeah, it has gotten more crowded, and I think our customers who, or our prospects, used to ask us, you know, "What is it that you do? "Explain this catalog metaphor to me," are now saying, "Yeah, catalogs, heard about that." >> It doesn't need to be defined anymore. >> "Which one should I pick? "Why you?" Yeah. >> What distinguished one product from another, you know? What are the major differentiation points? >> Yeah, I think one thing that's interesting is, you know, my talk was about how the metaphors we use shape the way we think. And, I think there's a sense in which, kind of, the history of each company shapes their philosophy and their approach, so we've always been a data catalog company. That's our one product. Some of the other catalog vendors come from ETL background, so they're a lot more focused on technical metadata and infrastructure. Some of the catalog products grew out of governance, and so it's, sort of, governance first, no sorry, defense first and then offense secondary. So, I think that's one of the things, I think, we encourage our prospects to look at, is, kind of, the soul of the company and how that affects their decisions. The other thing is, of course, technology. And, what we at Alation are really excited about, and it's been validating to hear Gartner and others and a lot of the people here, like the GSK keynote speaker yesterday, talking about the importance of comprehensiveness and on taking a behavioral approach, right. We have our Behavioral IO technology that really says, "Let's not look at all the bits and the bytes, "but how are people using the data to drive results?" As our core differentiator. 
>> Do your customers generally standardize on one data catalog, or might they have multiple catalogs for multiple purposes? >> Yeah, you know, we heard a term more last season, of catalog of catalogs, you know. And, people here can get arbitrarily, you know, meta, meta, meta data, where we like to go there. I think the customers we see most successful tend to have one catalog that serves this function of the single source of reference. Many of our customers will say, you know, that their catalog serves as, sort of, their internal Google for data. Or, the one stop shop where you could find everything. Even though they may have many different sources, typically you don't want to have siloed catalogs. It makes it harder to find what you're looking for. >> Let's play a little word association with some metaphors. Data lake. (laughter) >> Data lake's another one that I sort of hate. If you think about it, people had data warehouses and didn't love them, but at least, when you put something into a warehouse, you can get it out, right. If you throw something into a lake, you know, there's really no hope you're ever going to find it. It's probably not going to be in great shape, and we're not surprised to find that many folks who invested heavily in data lakes are now having to invest in a layer over it, to make it comprehensible and searchable. >> So, yeah, the lake is where we hide the stolen cars. Data swamp. >> Yeah, I mean, I think if your point is it's worse than lake, it works. But, I think we can do better than a lake, right. >> How about data ocean? (laughter) >> You know, out of respect for John Furrier, I'll say it's fantastic. But, to us we think, you know, it isn't really about the size. The more data you have, people think the more data the better. It's actually the more data the worse unless you have a mechanism for finding the little bit of data that is relevant and useful for your task and put it to use. >> And, to set that up, enter the catalog.
So, technically, how does the catalog solve that problem? >> Totally, so if we think about, maybe let's go to the warehouse, for example. But, it works just as well on a data lake in practice. >> Yeah, cool. >> Through the catalog is. It starts with the inventory, you know, what's on every single shelf. But, if you think about what Amazon has done, they have the inventory warehouse in the back, but what you see as a consumer is a simple search interface, where you type in the word of the product you're looking for. And then, you see ranked suggestions for different items, you know, toasters, lamps, whatever, books I want to buy. Same thing for data. I can type in, you know, if I'm at the DOD, you know, information about aircraft, or information about, you know, drug discovery if I'm at GSK. And, I should be able to therefore see all of the different data sets that I have. And, that's true in almost any catalog, that you can do some search over the curated data sets there. With Alation in particular, what I can see is, who's using it, how are they using it, what are they joining it with, what results do they find in that process. And, that can really accelerate the pace of discovery. >> Go ahead. >> I'm sorry, Dave. To what degree can you automate some of that detail, like who's using it and what it's being used for. I mean, doesn't that rely on people curating the catalog? Or, to what degree can you automate that? >> Yeah, so it's a great question. I think, sometimes, there's a sense with AI or ML that it's like the computer is making the decisions or making things up. Which is, obviously, very scary. Usually, the training data comes from humans. So, our goal is to learn from humans in two ways. There's learning from humans where humans explicitly teach you. Somebody goes and says, "This is goal standard data versus this is, "you know, low quality data." And, they do that manually. But, there's also learning implicitly from people. 
So, in the same way on amazon.com, if I buy one item and then buy another, I'm doing that for my own purposes, but Amazon can do collaborative filtering over all of these trends and say, "You might want to buy this item." We can do a similar thing where we parse the query logs, parse the usage logs and BI tools, and can basically watch what people are doing for their own purposes. Not to, you know, extra work on top of their job to help us. We can learn from that and make everybody more effective. >> Aaron, is data classification a part of all this? Again, when we started in the industry, data classification was a manual exercise. It's always been a challenge. Certainly, people have applied math to it. You've seen support vector machines and probabilistic latent semantic indexing being used to classify data. Have we solved that problem, as an industry? Can you automate the classification of data on creation or use at this point in time? >> Well, one thing that came up in a few talks about AI and ML here is, regardless of the algorithm you're using, whether it's, you know, IFH or SVM, or something really modern and exciting that keeps learning. >> Stuff that's been around forever or, it's like you say, some new stuff, right. >> Yeah, you know, actually, I think it was said best by Michael Collins at the DOD, that data is more important than the algorithm because even the best algorithm is useless without really good training data. Plus, the algorithm's, kind of, everyone's got them. So, really often, training data is the limiting reactant in getting really good classification. One thing we try to do at Alation is create an upward spiral where maybe some data is curated manually, and then we can use that as a seed to make some suggestions about how to label other data. And then, it's easier to just do a confirm or deny of a guess than to actually manually label everything.
So, then you get more training, get it faster, and it kind of accelerates that way instead of being a big burden. >> So, that's really the advancement in the last five to what, five, six years. Where you're able to use machine intelligence to, sort of, solve that problem as opposed to brute forcing it with some algorithm. Is that fair? >> Yeah, I think that's right, and I think what gets me very excited is when you can have these interactive loops where the human helps the computer, which helps the human. You get, again, this upward spiral. Instead of saying, "We have to have all of this, "you know, manual step done "before we even do the first step," or trying to have an algorithm brute force it without any human intervention. >> It's kind of like notes key mode on write, except it actually works. I'm just kidding to all my ADP friends. All right, Aaron, hey. Thanks very much for coming on theCUBE, but give your last word on the event. I think, is this your first one or no? >> This is our first time here. >> Yeah, okay. So, what are your thoughts? >> I think we'll be back. It's just so exciting to get people who are thinking really big about data but are also practitioners who are solving real business problems. And, just the exchange of ideas and best practices has been really inspiring for me. >> Yeah, that's great. >> Yeah. >> Well, thank you for the support of the event, and thanks for coming on theCUBE. It was great to see you again. >> Thanks Dave, thanks Paul. >> All right, you're welcome. >> Thank you, sir. >> All right, keep it right there, everybody. We'll be back with our next guest right after this short break. You're watching theCUBE from MIT CDOIQ. Be right back. (upbeat music)

Published Date : Aug 1 2019

ENTITIES

Entity	Category	Confidence
Michael Collins	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Paul	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Dave	PERSON	0.99+
2013	DATE	0.99+
Aaron Kalb	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Aaron	PERSON	0.99+
five	QUANTITY	0.99+
Department of Defense	ORGANIZATION	0.99+
six years	QUANTITY	0.99+
John Furrier	PERSON	0.99+
amazon.com	ORGANIZATION	0.99+
yesterday	DATE	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
Alation	PERSON	0.99+
Alation	ORGANIZATION	0.99+
Gartner	ORGANIZATION	0.99+
one item	QUANTITY	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
first step	QUANTITY	0.99+
last year	DATE	0.99+
GSK	ORGANIZATION	0.99+
both	QUANTITY	0.99+
DOD	ORGANIZATION	0.99+
one person	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
130 people	QUANTITY	0.98+
One	QUANTITY	0.98+
first time	QUANTITY	0.98+
MIT	ORGANIZATION	0.98+
one product	QUANTITY	0.97+
three years ago	DATE	0.97+
this week	DATE	0.97+
two	QUANTITY	0.97+
MIT CDOIQ	ORGANIZATION	0.96+
MIT Chief Data Officer and	EVENT	0.96+
one data catalog	QUANTITY	0.96+
each	QUANTITY	0.96+
each company	QUANTITY	0.95+
Both	QUANTITY	0.95+
one thing	QUANTITY	0.95+
first one	QUANTITY	0.94+
one catalog	QUANTITY	0.93+
two trends	QUANTITY	0.93+
theCUBE	ORGANIZATION	0.93+
first	QUANTITY	0.92+
first company	QUANTITY	0.92+
last couple years	DATE	0.92+
CDO	ORGANIZATION	0.91+
about a hundred	QUANTITY	0.91+
single shelf	QUANTITY	0.88+
about 250 people	QUANTITY	0.88+
single source	QUANTITY	0.87+
China	LOCATION	0.87+
2019	DATE	0.86+
Day two	QUANTITY	0.86+
one	QUANTITY	0.85+
each business unit	QUANTITY	0.82+
MIT CDOIQ	EVENT	0.79+
ADP	ORGANIZATION	0.79+
couple issues	QUANTITY	0.76+
Information Quality Symposium 2019	EVENT	0.76+
One thing	QUANTITY	0.7+
single division	QUANTITY	0.69+
one stop	QUANTITY	0.68+
Russia	LOCATION	0.64+
three	QUANTITY	0.61+
double	QUANTITY	0.59+
favorite	QUANTITY	0.5+
CDOIQ	EVENT	0.46+
Chief	PERSON	0.42+

Jeanne Ross, MIT CISR | MIT CDOIQ 2019

(techno music) >> From Cambridge, Massachusetts, it's theCUBE. Covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. >> Welcome back to MIT CDOIQ. The CDO Information Quality Conference. You're watching theCUBE, the leader in live tech coverage. My name is Dave Vellante. I'm here with my co-host, Paul Gillin. This is our day two of our two day coverage. Jeanne Ross is here. She's the principal research scientist at MIT CISR. Jeanne, good to see you again. >> Nice to be here! >> Welcome back. Okay, what do all these acronyms stand for, I forget. MIT CISR. >> CISR, which we pronounce scissor, is the Center for Information Systems Research. It's a research center that's been at MIT since 1974, studying how big companies use technology effectively. >> So and, what's your role as a research scientist? >> As a research scientist, I work with both researchers and with company leaders to understand what's going on out there, and try to present some simple succinct ideas about how companies can generate greater value from information technology. >> Well, I guess not much has changed in information technology since 1974. (laughing) So let's fast forward to the big, hot trend, digital transformation, digital business. What's the difference between a business and a digital business? >> Right now, you're hoping there's no difference for you and your business. >> (chuckling) Yeah, for sure. >> The main thing about a digital business is it's being inspired by technology. So in the past, we would establish a strategy, and then we would check out technology and say, okay, how can technology make us more effective with that strategy? Today, and this has been driven a lot by start-ups, we have to stop and say, well wait a minute, what is technology making possible? Because if we're not thinking about it, there sure are a lot of students at MIT who are, and we're going to miss the boat.
We're going to get Ubered if you will, somebody's going to think of a value proposition that we should be offering and aren't, and we'll be left in the dust. So, our digital businesses are those that are recognizing the opportunities that digital technologies make possible. >> Now, and what about data? In terms of the role of digital business, it seems like that's an underpinning of a digital business. Is it not? >> Yeah, the single biggest capability that digital technologies provide is ubiquitous data that's readily accessible anytime. So when we think about being inspired by technology, we could reframe that as inspired by the availability of ubiquitous data that's readily accessible. >> Your premise about the difference between digitization and digital business is interesting. It's more than just a semantic debate. When companies talk about digital transformation these days, are most of them in fact thinking of digitization rather than really transformative business change? >> Yeah, this is so interesting to me. In 2006, we wrote a book that said, you need to become more agile, and you need to rely on information technology to get you there. And these are basic things like SAP and salesforce.com and things like that. Just making sure that your core processes are disciplined and reliable and predictable. We said this in 2006. What we didn't know is that we were explaining digitization, which is very effective use of technology in your underlying process. Today, when somebody says to me, we're going digital, I'm thinking about the new value propositions, the implications of the data, right? And they're often actually saying they're finally doing what we thought they should do in 2006. The problem is, in 2006, we said get going on this, it's a long journey. This could take you six, 10 years to accomplish. And then we gave examples of companies that took six to 10 years. LEGO, and USAA and really great companies. 
And now, companies are going, "Ah, you know, we really ought to do that". They don't have six to 10 years. They get this done now, or they're in trouble, and it's still a really big deal. >> So how realistic is it? I mean, you've got big established companies that have got all these information silos, as we've been hearing for the last two days, just pulling their information together, knowing what they've got is a huge challenge for them. Meanwhile, you're competing with born on the web, digitally native start-ups that don't have any of that legacy, is it really feasible for these companies to reinvent themselves in the way you're talking about? Or should they just be buying the companies that have already done it? >> Well good luck with buying, because what happens is that when a company starts up, they can do anything, but they can't do it to scale. So most of these start-ups are going to have to sell themselves because they don't know anything about scale. And the problem is, the companies that want to buy them up know about the scale of big global companies but they don't know how to do this seamlessly because they didn't do the basic digitization. They relied on basically, a lot of heroes in their company to pull off the scale. So now they have to rely more on technology than they did in the past, but they still have a leg up if you will, on the start-up that doesn't want to worry about the discipline of scaling up a good idea. They'd rather just go off and have another good idea, right? They're perpetual entrepreneurs if you will. So if we look at the start-ups, they're not really your concern. Your concern is the very well run company, that's been around, knows how to be inspired by technology and now says, "Oh, I see what you're capable of doing, or should be capable of doing. I think I'll move into your space". So it's the Amazons, and the USAAs and the LEGOs who say "We're good at what we do, and we could be doing more". 
We're watching Schneider Electric, Philips, Ferrovial. These are big old companies who get digital, and they are going to start moving into a lot of people's territory. >> So let's take the example of those incumbents that you've used as examples of companies that are leaning into digital, and presumably doing a good job of it, they've got a lot of legacy debt, as you know people call it technical debt. The question I have is how they're using machine intelligence. So if you think about Facebook, Amazon, Microsoft, Google, they own horizontal technologies around machine intelligence. The incumbents that you mentioned do not. How do they close the gap? They're not going to build their own A.I. They're going to buy it, and then apply it. It's how they apply it that's going to be the difference. So do you agree with that premise, and where are they getting it, do they have the skill sets to do it, how are they closing that gap? >> They're definitely partnering. When you say they're not going to build any of it, that's actually not quite true. They're going to build a lot around the edges. They'll rely on partners like Microsoft and Google to provide some of the core, >> Yes, right. >> But they are bringing in their own experts to take it to the, basically to the customer level. How do I take, let me just take Schneider Electric for an example. They have gone from being an electrical equipment manufacturer, to a purveyor of energy management solutions. It's quite a different value proposition. To do that, they need a lot of intelligence. Some of it is data analytics of old, and some of it is just better representation on dashboards and things like that. But there is a layer of intelligence that is new, and it is absolutely essential to them. By relying on partners and their own expertise in what they do for customers, and then co-creating a fair amount with customers, they can do things that other companies cannot. 
>> And they're developing software presumably, a SaaS revenue stream as part of that, right? >> Yeah, absolutely. >> How about the innovator's dilemma though, the problem that these companies often have grown up, they're very big, they're very profitable, they see disruption coming, but they are unable to make the change, their shareholders won't let them make the change, they know what they have to do, but they're simply not able to do it, and then they become paralyzed. Is there a -- I mean, looking at some of the companies you just mentioned, how did they get over that mindset? >> This is real leadership from CEOs, who basically explain to their boards and to their investors, this is our future, we are... we're either going this direction or we're going down. And they sell it. It's brilliant salesmanship, and it's why when we go out to study great companies, we don't have that many to choose from. I mean, they are hard to find, right? So you are at such a competitive advantage right now. If you understand, if your own internal processes are cleaned up and you know how to rely on the ERPs and the CRMs to get that done, and on the other hand, you're using the intelligence to provide value propositions that new technologies and data make possible, that is an incredibly powerful combination, but you have to invest. You have to convince your boards and your investors that it's a good idea, you have to change your talent internally, and the biggest surprise is, you have to convince your customers that they want something from you that they never wanted before. So you got a lot of work to do to pull this off. >> Right now, in today's economy, the economy is sort of lifting all boats. But as we saw when the dot-com implosion happened in 2001, often these breakdowns give birth to great, new companies. Do you see that the next recession, which is inevitably coming, will be sort of the turning point for some of these companies that can't change? 
>> It's a really good question. I do expect that there are going to be companies that don't make it. And I think that they will fail at different rates based on their, not just the economy, but their industry, and what competitors do, and things like that. But I do think we're going to see some companies fail. We're going to see many other companies understand that they are too complex. They are simply too complex. They cannot do things end to end and seamlessly and present a great customer experience, because they're doing everything. So we're going to see some pretty dramatic changes, we're going to see failure, it's a fair assumption that when we see the economy crash, it's also going to contribute, but that's, it's not the whole story. >> But when the .com blew up, you had the internet guys that actually had a business model to make money, and the guys that didn't, the guys that didn't went away, and then you also had the incumbents that embrace the internet, so when we came out of that .com downturn, you had the survivors, who was Google and eBay, and obviously Amazon, and then you had incumbent companies who had online retailing, and e-tailing and e-commerce etc, who thrived. I would suspect you're going to see something similar, but I wonder what you guys think. The street today is rewarding growth. And we got another near record high today after the rate cut yesterday. And so, but companies that aren't making money are getting rewarded, 'cause they're growing. Well when the recession comes, those guys are going to get crushed. >> Right. >> Yeah. >> And you're going to have these other companies emerge, and you'll see the winners, are going to be those ones who have truly digitized, not just talking the talk, or transformed really, to use your definition. That's what I would expect. I don't know, what do you think about that? >> I totally agree. And, I mean, we look at industries like retail, and they have been fundamentally transformed. 
There are still lots of opportunities for innovation, and we're going to see some winners that have kind of struggled early but not given up, and they're kind of finding their footing. But we're losing some. We're losing a lot, right? I think the surprise is that we thought digital was going to replace what we did. We'd stop going to stores, we'd stop reading books, we wouldn't have newspapers anymore. And it hasn't done that. It's only added, it hasn't taken anything away. >> It could-- >> I don't think the newspaper industry has been unscathed by digital. >> No, nor has retail. >> Nor has retail, right. >> No, no no, not unscathed, but here's the big challenge. If I could substitute, if I could move from newspaper to online, I'm fine. You don't get to do that. You add online to what you've got, right? And I think this right now is the big challenge. Is that nothing's gone away, at least yet. So we have to sustain the business we are, so that it can feed the business we want to be. And we have to make that transition into new capabilities. I would argue that established companies need to become very binary, that there are people that do nothing but sustain and make better and better and better who they are, while others are creating the new reality. You see this in auto companies by the way. They're creating not just the autonomous automobiles, but the mobility services, the whole new value propositions, that will become a bigger and bigger part of their revenue stream, but right now are tiny. >> So, here's the scary thing to me. And again, I'd love to hear your thoughts on this. And I've been an outspoken critic of Liz Warren's attack on big tech. >> Absolutely. >> I just think if they're breaking the law, and they're really acting like monopolies, the DOJ and FTC should do something, but to me, you don't just break up big tech because they're good capitalists. 
Having said that, one of the things that scares me is, when you see Apple getting into payment systems, Amazon getting into grocery and logistics. Digital allows you to do something that's never happened before which is, you can traverse industries. >> Yep. >> Yeah, absolutely >> You used to have this stack of industries, and if you were in that industry, you're stuck in healthcare, you're stuck in financial services or whatever it was. And today, digital allows you to traverse those. >> It absolutely does. And so in theory, Amazon and Apple and Facebook and Google, they can attack virtually any industry and they kind of are. >> Yeah, they kind of are. I would certainly not break up anything. I would really look hard though at acquisitions, because I think that's where some of this is coming from. They can stop the overwhelming growth, but I do think you're right. That you get these opportunities from digital that are just so much easier because they're basically sharing information and technology, not building buildings and equipment and all that kind of thing. But I think there are limits to all this. I do not fear these companies. I think we need some laws, we need some regulations, they're fine. They are adding a lot of value and the great companies, I mean, you look at the Schneiders and the Philipses, yeah they fear what some of them can do, but they're looking forward to what they provide underneath. >> Doesn't Cloud change the equation here? I mean, when you think of something like Amazon getting into the payments business, or Google in the payments business, you know it used to be that creating a global payments processing network, just going global, was a huge barrier to entry. Now, you don't have nearly that same level of impediment right? I mean the cloud eliminates much of the traditional barrier. >> Yeah, but I'll tell you what limits it, is complexity. 
Every company we've studied gets a little over anxious and becomes too complex, and they cannot run themselves effectively anymore. It happens to everyone. I mean, remember when we were terrified about what Microsoft was going to become? But then it got competition because it's trying to do so many things, and somebody else is offering, Salesforce and others, something simpler. And this will happen to every company that gets overly ambitious. Something simpler will come along, and everybody will go "Oh thank goodness". Something simpler. >> Well with Microsoft, I would argue two things. One is the DOJ put some handcuffs on them, and two, Steve Ballmer wouldn't get his nose out of Windows, and then finally stuck on a (mumbles) (laughter) >> Well, they had a platform shift. >> Well this is exactly it. They will make those kind of calls. >> Sure, and I think that talks to their legacy, that they won't end up like Digital Equipment Corp or Wang and DG, who just ignored the future and held onto the past. But I think, a colleague of ours, David Moschella wrote a book, it's called "Seeing Digital". And his premise was we're moving from a world of remote cloud services, to one where you have, to use your word, ubiquitous digital services that you can access, upon which you can build your business and new business models. I mean, the simplest example is Waze, you mentioned Uber. They're using Cloud, they're using OAuth in with Google, Facebook or LinkedIn and they've got a security layer, there's an AI layer, there's all your blockchain, mobile, cognitive, it's all these sets of services that are now ubiquitous on which you're building, so you're leveraging, he calls it the matrix, to the extent that these companies that you're studying, these incumbents can leverage that matrix, they should be fine. >> Yes. >> The part of the problem is, they say "No, we're going to invent everything ourselves, we're going to build it all ourselves". 
To use Andy Jassy's term, it's non-differentiated heavy lifting, slows them down, but there's no reason why they can't tap that matrix, >> Absolutely >> And take advantage of it. Where I do get scared is, the Facebooks, Apples, Googles, Amazons, they're matrix companies, their data is at their core, and they get this. It's not like they're putting data around the core, data is the core. So your thoughts on that? I mean, it looks like your slide about disruption, it's coming. >> Yeah, yeah, yeah, yeah. >> No industry is safe. >> Yeah, well I'll go back to the complexity argument. We studied complexity at length, and complexity is a killer. And as we get too ambitious, and we're constantly looking for growth, we start doing things that create more and more tensions in our various lines of business, causing us to create silos that then we have to coordinate. I just think every single company that, no cloud is going to save us from this. It, complexity will kill us. And we have to keep reminding ourselves to limit that complexity, and we've just not seen the example of the company that got that right. Sooner or later, they just kind of chop them, you know, create problems for themselves. >> Well isn't that inherent though in growth? >> Absolutely! >> It's just like, big companies slow down. >> That's right. >> They can't make decisions as quickly. >> That's right. >> I haven't seen a big company yet that moves nimbly. >> Exactly, and that's the complexity thing-- >> Well wait a minute, what about AWS? They're a 40 billion dollar company. >> Oh yeah, yeah, yeah >> They're like the agile gorilla. >> Yeah, yeah, yeah. >> I mean, I think they're breaking the rule, and my argument would be, because they have data at their core, and they've got that, it's a bromide, but that common data model, that they can apply now to virtually any business. You know, we've been expecting, a lot of people have been expecting that growth to attenuate. I mean it hasn't yet, we'll see. 
But they're like a 40 billion dollar firm-- >> No that's a good example yeah. >> So we'll see. And Microsoft is the other one. Microsoft is demonstrating double digit growth. For such a large company, it's astounding. I wonder if the law of large numbers is being challenged, so. >> Yeah, well it's interesting. I do think that what now constitutes "so big" that you're really going to struggle with the complexity, I think that has definitely been elevated a lot. But I still think there will be a point at which human beings can't handle-- >> They're getting away. >> Whatever level of complexity we reach, yeah. >> Well sure, right because even though this great new, it's your point. Cloud technology, you know, there's going to be something better that comes along. Even, I think Jassy might have said, if we had to do it all over again, we would have built the whole thing on Lambda functions >> Yeah. >> Oh, yeah. >> Not on, you know so there you go. >> So maybe someone else does that-- >> Yeah, there you go. >> So now they've got their hybrid. >> Yeah, yeah. >> Yeah, absolutely. >> You know maybe it'll take another ten years, but well Jeanne, thanks so much for coming to theCUBE, it was great to have you. >> My pleasure! >> Appreciate you coming back. >> Really fun to talk. >> All right, keep right there everybody, Paul Gillin and Dave Vellante, we'll be right back from MIT CDOIQ, you're watching theCUBE. (chuckles) (techno music)

Published Date : Aug 1 2019


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
David Moschella	PERSON	0.99+
Facebook	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Jean Ross	PERSON	0.99+
2006	DATE	0.99+
six	QUANTITY	0.99+
Steve Ballmer	PERSON	0.99+
Jeanne Ross	PERSON	0.99+
Liz Warren	PERSON	0.99+
LEGO	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
Schneider Electric	ORGANIZATION	0.99+
Dave Villante	PERSON	0.99+
Amazons	ORGANIZATION	0.99+
Googles	ORGANIZATION	0.99+
Jean	PERSON	0.99+
Facebooks	ORGANIZATION	0.99+
Phillips	ORGANIZATION	0.99+
USAA	ORGANIZATION	0.99+
Center for Information Systems Research	ORGANIZATION	0.99+
Apples	ORGANIZATION	0.99+
Andy Jassy	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Ferovial	ORGANIZATION	0.99+
Digital Equipment Corp	ORGANIZATION	0.99+
2001	DATE	0.99+
1974	DATE	0.99+
two day	QUANTITY	0.99+
two	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
D.O.J	ORGANIZATION	0.99+
yesterday	DATE	0.99+
eBay	ORGANIZATION	0.99+
40 billion dollar	QUANTITY	0.99+
MIT	ORGANIZATION	0.99+
Jassy	PERSON	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
today	DATE	0.99+
10 years	QUANTITY	0.99+
ten years	QUANTITY	0.99+
Today	DATE	0.99+
One	QUANTITY	0.99+
CISR	ORGANIZATION	0.98+
MIT CISR	ORGANIZATION	0.98+
Seeing Digital	TITLE	0.98+
two things	QUANTITY	0.98+
single	QUANTITY	0.97+
Ubered	ORGANIZATION	0.97+
LinkedIn	ORGANIZATION	0.97+
Windows	TITLE	0.96+
OAuth.in	TITLE	0.96+
one	QUANTITY	0.94+
Wang and D.G	ORGANIZATION	0.94+
CDO Information Quality Conference	EVENT	0.94+
D.O.J	PERSON	0.87+

Gokula Mishra | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE covering MIT Chief Data Officer and Information Quality Symposium 2019 brought to you by SiliconANGLE Media. (upbeat techno music) >> Hi everybody, welcome back to Cambridge, Massachusetts. You're watching theCUBE, the leader in tech coverage. We go out to the events. We extract the signal from the noise, and we're here at the MIT CDOIQ Conference, Chief Data Officer Information Quality Conference. It is the 13th year here at the Tang building. We've outgrown this building and have to move next year. It's fire marshal full. Gokula Mishra is here. He is the Senior Director of Global Data and Analytics and Supply Chain-- >> Formerly. Former, former Senior Director. >> Former! I'm sorry. It's former Senior Director of Global Data Analytics and Supply Chain at McDonald's. Oh, I didn't know that. I apologize my friend. Well, welcome back to theCUBE. We met when you were at Oracle doing data. So you've left that, you're on to your next big thing. >> Yes, thinking through it. >> Fantastic, now let's start with your career. You've had, so you just recently left McDonald's. I met you when you were at Oracle, so you cut over to the dark side for a while, and then before that, I mean, you've been a practitioner all your life, so take us through sort of your background. >> Yeah, I mean my beginning was really with a company called Tata Burroughs. Those days we did not have a lot of work getting done in India. We used to send people to U.S. so I was one of the pioneers of the whole industry, coming here and working on very interesting projects. But I was lucky to be working on mostly data analytics related work, joined a great company called CS Associates. I did my Master's at Northwestern. In fact, my thesis was intelligent databases. So, building AI into the databases and from there on I have been with Booz Allen, Oracle, HP, TransUnion, I also run my own company, and Sierra Atlantic, which is part of Hitachi, and McDonald's. 
>> Awesome, so let's talk about use of data. It's evolved dramatically as we know. One of the themes in this conference over the years has been sort of, I said yesterday, the Chief Data Officer role emerged from the ashes of sort of governance, kind of back office information quality compliance, and then ascended with the tailwind of the Big Data meme, and it's kind of come full circle. People are realizing actually to get value out of data, you have to have information quality. So those two worlds have collided together, and you've also seen the ascendancy of the Chief Digital Officer who has really taken a front and center role in some of the more strategic and revenue generating initiatives, and in some ways the Chief Data Officer has been a supporting role to that, providing the quality, providing the compliance, the governance, and the data modeling and analytics, and a component of it. First of all, is that a fair assessment? How do you see the way in which the use of data has evolved over the last 10 years? >> So to me, primarily, the use of data was, in my mind, mostly around financial reporting. So, anything that companies needed to run their company, any metrics they needed, any data they needed. So, if you look at all the reporting that used to happen it's primarily around metrics that are financials, whether it's around finances around operations, finances around marketing effort, finances around reporting if it's a public company reporting to the market. That's where the focus was, and so therefore a lot of the data that was not needed for financial reporting was what we call nowadays dark data. This is data we collect but don't do anything with it. 
Then, as the capability of the computing, and the storage, and new technologies, and new techniques evolved, and were able to handle more variety and more volume of data, people quickly realized how much potential they have in the other data outside of the financial reporting data that they can utilize too. So, some of the pioneers leveraged that and actually improved a lot in their efficiency of operations, came out with innovation. You know, GE comes to mind as one of the companies that actually leveraged data early on, and a number of other companies. Obviously, you look at today, data is defining some of the multi-billion dollar companies, and all they have is data. >> Well, Facebook, Google, Amazon, Microsoft. >> Exactly. >> Apple, I mean Apple obviously makes stuff, but those other companies, they're data companies. I mean largely, and those five companies have the highest market value on the U.S. stock exchange. They've surpassed all the other big leaders, even Berkshire Hathaway. >> So now, what is happening is because the market changes, the forces that are changing the behavior of our consumers and customers, which I talked about, which is everyone now is digitally engaging with each other. What that does is all the experiences now are being captured digitally, all the services are being captured digitally, all the products are creating a lot of digital exhaust of data and so now companies have to pay attention to engage with their customers and partners digitally. Therefore, they have to make sure that they're leveraging data and analytics in doing so. The other thing that has changed is the time to decision, the time to act on the data insight that you get, is shrinking, and shrinking, and shrinking, so a lot more decision-making is now going real time. 
Therefore, you have a situation now, you have the capability, you have the technology, you have the data now, you have to make sure that you convert that in what I call programmatic kind of data decision-making. Obviously, there are people involved in more strategic decision-making. So, that's more manual, but at the operational level, it's going more programmatic decision-making. >> Okay, I want to talk, By the way, I've seen a stat, I don't know if you can confirm this, that 80% of the data that's out there today is dark data or it's data that's behind a firewall or not searchable, not open to Google's crawlers. So, there's a lot of value there-- >> So, I would say that percent is declining over time as companies have realized the value of data. So, more and more companies are removing the silos, bringing those dark data out. I think the key to that is companies being able to value their data, and as soon as they are able to value their data, they are able to leverage a lot of the data. I still believe there's a large percent still not used or accessed in companies. >> Well, and of course you talked a lot about data monetization. Doug Laney, who's an expert in that topic, we had Doug on a couple years ago when he, just after, he wrote Infonomics. He was on yesterday. He's got a very detailed prescription as to, he makes strong cases as to why data should be valued like an asset. I don't think anybody really disagrees with that, but then he gave kind of a how-to-do-it, which will, somewhat, make your eyes bleed, but it was really well thought out, as you know. But you talked a lot about data monetization, you talked about a number of ways in which data can contribute to monetization. Revenue, cost reduction, efficiency, risk, and innovation. Revenue and cost is obvious. I mean, that's where the starting point is. Efficiency is interesting. 
I look at efficiency as kind of a doing more with less but it's sort of a cost reduction, but explain why it's not in the cost bucket, it's different. >> So, it is first starts with doing what we do today cheaper, better, faster, and doing more comes after that because if you don't understand, and data is the way to understand how your current processes work, you will not take the first step. So, to take the first step is to understand how can I do this process faster, and then you focus on cheaper, and then you focus on better. Of course, faster is because of some of the market forces and customer behavior that's driving you to do that process faster. >> Okay, and then the other one was risk reduction. I think that makes a lot of sense here. Actually, let me go back. So, one of the key pieces of it, of efficiency is time to value. So, if you can compress the time, or accelerate the time and you get the value that means more cash in house faster, whether it's cost reduction or-- >> And the other aspect you look at is, can you automate more of the processes, and in that way it can be faster. >> And that hits the income statement as well because you're reducing headcount cost of your, maybe not reducing headcount cost, but you're getting more out of different, out ahead you're reallocating them to more strategic initiatives. Everybody says that but the reality is you hire less people because you just automated. And then, risk reduction, so the degree to which you can lower your expected loss. That's just instead thinking in insurance terms, that's tangible value so certainly to large corporations, but even midsize and small corporations. Innovation, I thought was a good one, but maybe you could use an example of, give us an example of how in your career you've seen data contribute to innovation. >> So, I'll give an example of oil and gas industry. If you look at speed of innovation in the oil and gas industry, they were all paper-based. 
I don't know how much you know about drilling. A lot of the assets that go into figuring out where to drill, how to drill, and actually drilling and then taking the oil or gas out, and of course selling it to make money. All of those processes were paper based. So, if you can imagine trying to optimize a paper-based innovation, it's very hard. Not only that, it's very, very by itself because it's on paper, it's in someone's drawer or file. So, it's siloed by design and so one thing that the industry has gone through, they recognize that they have to optimize the processes to be better, to innovate, to find, for example, shale gas was a result of digitizing the processes because otherwise you can't drill faster, cheaper, better to leverage the shale gas drilling that they did. So, the industry went through actually digitizing a lot of the paper assets. So, they went from not having data to knowingly creating the data that they can use to optimize the process and then in the process they're innovating new ways to drill the oil well cheaper, better, faster. >> In the early days of oil exploration in the U.S. go back to the Osage Indian tribe in northern Oklahoma, and they brilliantly, when they got shuttled around, they were pushed out of Kansas and they negotiated with the U.S. government that they maintain the mineral rights and so they became very, very wealthy. In fact, at one point they were the wealthiest per capita individuals in the entire world, and they used to hold auctions for various drilling rights. So, it was all gut feel, all the oil barons would train in, and they would have an auction, and it was, again, it was gut feel as to which areas were the best, and then of course they evolved, you remember it used to be you drill a little hole, no oil, drill a hole, no oil, drill a hole. >> You know how much that cost? >> Yeah, the expense is enormous right? >> It can vary from 10 to 20 million dollars. >> Just a giant expense. 
So, now today fast-forward to this century, and you're seeing much more sophisticated-- >> Yeah, I can give you another example in pharmaceuticals. They develop new drugs, and it's a long process. So, one of the initial processes is to figure out what molecules they would be exploring in the next step, and you could have a thousand different combinations of molecules that could treat a particular condition. And now, with digitization and data analytics, they're able to do this in a virtual world, kind of creating a virtual lab where they can test out thousands of molecules. And then, once they can bring it down to a few, the physical aspect of that starts. Think about innovation really shrinking their processes. >> All right, well I want to say this about clouds. You asked in your keynote how many people out there think cloud is cheaper, or maybe you even said cheap, but I inferred cheaper than on-prem. It was a loaded question, so nobody put their hand up, they were afraid, but I put my hand up because we don't have any IT. We used to have IT. It was a nightmare. So, for us it's better. But in your experience, I think I'm inferring correctly that you meant cheaper than on-prem, and certainly we talk to many practitioners who have large systems where, when they lift and shift to the cloud, they don't change their operating model, they don't really change anything, they get a bill at the end of the month, and they go "What did this really do for us?" And I think that's what you mean-- >> So what I mean, let me make it clear, is that there are certain use cases where cloud is cheaper and, as you saw, people did raise their hand saying "Yeah, I have use cases where cloud is cheaper." I think you need to look at the whole thing. Cost is one aspect. The flexibility and agility of being able to do things is another aspect.
For example, say you have a situation where your stakeholders want to do something for three weeks, and they need five times the computing power, and the data that they are buying from outside, to do that experiment. Now, imagine doing that in a physical world. It's going to take a long time just to procure and get the physical boxes, and only then will you be able to do it. In cloud, you can enable that. You can get GPUs, depending on what problem you are trying to solve. That's another benefit. You can get the fit-for-purpose computing environment for that, and so there is a lot of flexibility, agility, all of that. It's a new way of managing it, so people need to pay attention to the cost, because it will add to the cost. The other thing I will point out is that if you go to the public cloud, they make it cheaper because they have hundreds and thousands of these canned CPUs: this much computing power, this much memory, this much disk, this much connectivity. They build thousands of them, and that's why it's cheaper. Well, if your need is something that's very unique and they don't have it, that's when it becomes a problem. You either need more of those, and the cost will be higher. So, now we are getting to the IoT world. The volume of data is growing so much, and the type of processing that you need to do is becoming more real-time, and you can't just move all this bulk of data back and forth. You need a special type of computing, which is at the edge, what Amazon calls edge computing. And the industry is kind of trying to design it. So, that is an example of hybrid computing evolving out of the cloud, or out of the necessity that you need a special-purpose computing environment to deal with new situations, and all of it can't be in the cloud. >> I mean, I would argue, well I guess Microsoft with Azure Stack was kind of the first, although not really.
Now, they're there, but I would say Oracle, your former company, was the first one to say "Okay, we're going to put the exact same infrastructure on prem as we have in the public cloud." Oracle, I would say, was the first to truly do that-- >> They were doing hybrid computing. >> You now see Amazon with Outposts has done the same, Google kind of has a similar approach to Azure, and so it's clear that hybrid is here to stay, at least for some period of time. I think the cloud guys probably believe that ultimately it's all going to go to the cloud. We'll see, it's going to be a long, long time before that happens. Okay! I'll give you the last thoughts on this conference. You've been here before? Or is this your first one? >> This is my first one. >> Okay, so your takeaways, your thoughts, things you might-- >> I am very impressed. I'm a practitioner, and I'm finding so many practitioners coming from so many different backgrounds and industries. It's very, very enlightening to listen to their journey, their story, their learnings in terms of what works and what doesn't work. It is really invaluable. >> Yeah, I tell you this, it's always a highlight of our season, and Gokula, thank you very much for coming on theCUBE. It was great to see you. >> Thank you. >> You're welcome. All right, keep it right there everybody. We'll be back with our next guest, Dave Vellante. Paul Gillin is in the house. You're watching theCUBE from MIT. Be right back! (upbeat techno music)

Published Date : Aug 1 2019

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Hitachi	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Doug Laney	PERSON	0.99+
five times	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
Kansas	LOCATION	0.99+
TransUnion	ORGANIZATION	0.99+
Paul Gillin	PERSON	0.99+
HP	ORGANIZATION	0.99+
three weeks	QUANTITY	0.99+
India	LOCATION	0.99+
10	QUANTITY	0.99+
Sierra Atlantic	ORGANIZATION	0.99+
Gokula Mishra	PERSON	0.99+
Doug	PERSON	0.99+
hundreds	QUANTITY	0.99+
Berkshire Hathaway	ORGANIZATION	0.99+
five companies	QUANTITY	0.99+
80%	QUANTITY	0.99+
U.S.	LOCATION	0.99+
Booz Allen	ORGANIZATION	0.99+
Tata Burroughs	ORGANIZATION	0.99+
first step	QUANTITY	0.99+
Gokula	PERSON	0.99+
next year	DATE	0.99+
thousands	QUANTITY	0.99+
McDonald's	ORGANIZATION	0.99+
one aspect	QUANTITY	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
first	QUANTITY	0.99+
yesterday	DATE	0.99+
thousands of molecules	QUANTITY	0.99+
first one	QUANTITY	0.99+
One	QUANTITY	0.98+
GE	ORGANIZATION	0.98+
northern Oklahoma	LOCATION	0.98+
today	DATE	0.97+
CS Associates	ORGANIZATION	0.97+
20 million dollars	QUANTITY	0.97+
one	QUANTITY	0.96+
First	QUANTITY	0.96+
Global Data and Analytics and Supply Chain	ORGANIZATION	0.95+
MIT CDOIQ Conference	EVENT	0.95+
13th year	QUANTITY	0.94+
U.S. government	ORGANIZATION	0.93+
two worlds	QUANTITY	0.92+
Azure Stack	TITLE	0.91+
one thing	QUANTITY	0.9+
one point	QUANTITY	0.9+
Northwestern	ORGANIZATION	0.9+
couple years ago	DATE	0.89+
MIT Chief Data Officer and Information Quality Symposium 2019	EVENT	0.87+
this century	DATE	0.85+
Tang building	LOCATION	0.85+
Global Data Analytics and	ORGANIZATION	0.83+
Chief Data Officer Information Quality Conference	EVENT	0.81+
MIT	ORGANIZATION	0.78+
theCUBE	ORGANIZATION	0.77+
thousand different combination of molecules	QUANTITY	0.74+
last	DATE	0.67+
years	DATE	0.66+
U.S.	ORGANIZATION	0.66+
billion dollar	QUANTITY	0.65+
themes	QUANTITY	0.65+
Osage Indian	OTHER	0.64+

Matt Kobe, Chicago Bulls | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to MIT in Cambridge, Massachusetts, everybody. You're watching theCUBE, the leader in live tech coverage. My name is Dave Vellante, and it's my pleasure to introduce Matt Kobe, who's the Vice President of Business Strategy and Analytics for the Chicago Bulls. We love talking sports. We love talking data. Matt, thanks for coming on. >> No problem, good to be here. >> So talk about your role as the head of analytics for the Bulls. >> Sure. So I work exclusively on the business side of the operation. We have a separate team that does the basketball side, which is kind of your player stuff. But on the business side, what we're focused on is really two things. One is being essentially internal consultants for the rest of the customer-facing functions. So we work a lot with ticketing, a lot with sponsorship, marketing, digital, all of those folks that engage with our customer base. And then on the back end of it, we're building out the technical infrastructure for the organization, right? So everything from data warehouse to CRM to email marketing, all of that sits with my team. And so we wear a lot of hats, which is exciting. But at the end of the day, we're trying to use data to enhance the customer and fan experience. That's our aim, and that's what we're driving towards. >> Success in sports, in a larger respect, has come down to, don't be offended by this, who's got the best geeks. Now, your side of the house is not about, like you say, player performance, it's about the business performance. But that's a big part of getting the best players, I mean, if it's successful, with all the nuances of the NBA salary cap and everything else, and I think there is one, so that makes it even more important. But you're helping fund.
You know, that in various ways, but are the two teams completely separate? Is there a Chinese wall between them? Are you part of sort of the same group? >> We're pretty separate. So the basketball folks do their thing, the business folks do their thing. From an analytics standpoint, we meet and we collaborate on tools and other methods of actually doing the analysis. But in terms of the analysis itself, there is a little bit of separation there, and mainly that is from a priority standpoint. Obviously, the basketball stuff is the most important stuff. And so if we were working on both sides, we'd always be doing the basketball stuff, and the business stuff needs to get done. >> They'd drag you into it, exactly. Okay. But which came first, the chicken or the egg? Was it the sort of post-Moneyball activity applied to the NBA? And I want to ask you a question about that. And then somebody said, hey, we should do this for the business side? Or was the business side sort of always there? >> I think the business side, in probably the last 5 to 7 years, you've really seen it grow. I've been with the Bulls for five years. If you look at the NBA 7 or 8 years ago, there was a handful of business analytics teams, and those teams had one or two people on them. Now every single team in the NBA has some sort of business analytics team, and the average staff is seven. So my staff is six full-time folks plus myself, so we're right at the average. And I think what you've seen is everything has become more complex in sports, right? If you look at ticketing, you've got all the secondary markets. You have all this data flowing in, and they need someone to make sense of all that data. If you look at sponsorship, sponsorship has transitioned from selling a sign that sits on the side of the court to these truly integrated partnerships, where our partners are coming to us and saying, what do we get out of this?
What's our return? And so you're seeing a lot more collaboration between analytics and sponsorship to go back to those partners and say, hey, here's what we delivered. And so I think it started on the basketball side, certainly, because that is the most important piece. But it quickly followed on the business side because they saw the value that that type of thinking can bring to the business. >> So I know this is not your swim lane, but the lore of Billy Beane and Moneyball and all that as sort of the starting point for sports analytics, is that a fair characterization? I mean, was that really the mainspring? >> I think it probably started even before that. I think if you've gotten to see Billy Beane at the MIT Sports Analytics Conference, he always references Bill James as kind of the first, and so I think it started there. Baseball was, I wouldn't say the easiest place to start, but it's a one versus one, right? It's pitcher versus batter in a lot of cases. Basketball is a little bit more fluid. It's a team sport, so it's a little harder, but I think as technology has advanced, there's been more and more opportunities to do the analytics on the basketball side and on the business side. I think what you're seeing, and what we've heard in the first day and a half here, is this huge influx of data, not nearly to the levels of the Mastercards and others of the world. But as more and more things move to the mobile phone, I think you're going to see this huge influx of data on the business side, and you're going to need the same systems and the same sort of approach to tackle it. >> So Bill James is the ultimate sports geek, and he's responsible for all these stats that none of us understand. He's why we don't pay attention to batting average anymore. Of course, I still do. So let's talk about the business side of things.
If you think about the business of baseball, it's all about maximizing the gate. Yeah, there's some revenue, a lot of revenue of course, from TV. But it's not like football, which is dominated by the TV. Basketball, I think, is probably a mix, right? You've got an 80, whatever, 82-game season, so filling up the stadium is important. Obviously, the NBA has done a great job of really getting it right. Free agency is, like, fascinating now. >> It's 12 months a year. >> Right. We talk about the NBA all the time, and of course celebrities like LeBron have certainly helped, and now a whole batch of others. But what does the money side of the NBA look like? Where's the money coming from? >> Yeah, I mean, I think you certainly have broadcast, right? But in many ways, national broadcast sort of takes care of itself, from the standpoint that my team doesn't have a lot of control over national broadcast money. That's a league-level thing. And so the things that we have control over, the two big buckets, are ticketing and sponsorship. Those are the two big buckets of revenue that my team spends a lot of time on. Ticketing is one that is important from the standpoint, as you say, of how do we fill the building, right? We've got 41 home games plus three preseason games. We've got 44 events a year. Our goal is to fill the building for all 44 of those events. We do a pretty good job of doing it, but that has cascading effects into other revenue streams, right? As you think about concessions and merchandise and sponsorship, it's a lot easier to sell sponsorship when your building is full than if your building isn't full. And so our focus is on, how do we fill the building in the most efficient way possible? And as you have things like the secondary market, and people have access to tickets in different ways than they did 10 to 15 years ago, I think that becomes increasingly complex.
But that's the fun area. That's where we spend a lot of time. There's the pricing, there's inventory management. If you look at traditional CPG, there are some of those same principles being applied. Or look at an airline, right? They're selling a plane. It's an asset you have to fill. We have a building. That's an asset we have to fill, and how do we fill it in the most optimal way? >> So the idea of surge pricing, demand and supply. Several years ago, the Red Sox went to tiered pricing. Do you guys do the same? If the Sox are playing the Kansas City Royals, tickets are way cheaper than if they're playing the Yankees. Do you guys do something similar? >> So we do it for single-game tickets. For our season ticket holders, it's the same price for every game, but we dynamically price primary tickets for single games, right? So if we're playing, you know, this year it will be the Clippers and the Lakers, that price is going to be much more expensive. So we dynamically price on a game-to-game basis. But our season ticket holders pay the same. >> Why don't you do it for the season ticket holders? You just haven't gone there yet? >> Yeah, I mean, some teams have, right? So there's a few different approaches to how you can variably price those tickets. I think for us, in the last few years in particular, there's been a couple of flagship games, and then every other game feels similar. I think this will be the first year where you have 8 to 10 teams that really have a shot at winning the title, and so I think you'll see a more balanced schedule. And so we've talked about it a lot. We just haven't made that move yet. >> Well, I'm a season ticket holder who shares his tickets with seven other guys. With the Red Sox you could buy a BMW, so you share the tickets. But I would love it if they didn't do the tiered
pricing, as a season ticket holder, so I hope you hold off a while. But I don't know, it could maximize revenues. If the Red Sox did it, it was probably not a stupid thing; they're smart people. What about the sponsorships? It's fascinating, the partners looking for ROI. How are you measuring that? You're forging a tighter relationship, obviously, with the sponsors and these partners. What does that ROI look like? >> It's measured in a variety of ROIs, largely based on the assets that they deliver. But I think every single partner we talk to these days-- I also lead the sponsorship team, so I oversee it. It's rare in sports, but I sit over the business strategy and analytics and sponsorship teams. It's not my title, but in practice, that's what I do. And I think everyone we talk to wants digital, right? We've got over 25,000,000 social media followers with the Bulls, right? We've got 19,000,000 on Facebook alone. And so sponsors see those numbers, and they know that we can deliver impressions, they know we can deliver engagement, and they want access to those channels. And so I always call it return on objectives, right? Return on investment is a little bit tricky, but return on objectives is, if we're trying to build brand awareness, we're going to go back to them and say, here's how many people came to our arena and saw your logo and saw the feature that you had on the scoreboard. If you're on our social media channels or our website, here's the number of impressions you got. Here is the number of engagements you got. I think where we're at now is, more is better, more is still better, right? Everyone wants the big numbers. I think where you're starting to see it move, though, is that more isn't always better.
We want the right folks engaging with our brands, and that's really what we're starting to think about. If you get 10,000,000 impressions, but they're 10,000,000 impressions to the wrong group of potential customers, that's not terribly helpful for a brand. We're trying to work with our brands to reach the right demographics that they want to reach in order to actually build that brand awareness they want to build. >> What are your primary social channels? Twitter, obviously. >> So every platform has a different purpose. We have Facebook, Twitter, Instagram, Snapchat. We're on Weibo in China. And, you know, every platform has a different function. Twitter's obviously more real-time news. The timeline stuff falls off really quick. Instagram is really the artistic piece of it, and then Facebook is a blend of both, and so that's kind of how we deploy our channels. We have a whole social team that generates content and pushes that content out. But those are the channels we use, and those are incredibly valuable. Now, what you're starting to see is those channels are changing very rapidly, based on their own set of algorithms for how they deliver content to fans. And so we're having to continue to adapt to those changing environments in those social channels. >> So, impressions. The term impressions varies by platform. I'm more familiar with Twitter impressions. They have a definition: it's not just somebody who might have seen it, it's somebody that they believe actually spent a few seconds looking at it. They have some algorithm to figure that out. Is that a metric that you're finding your brands are buying into, for example? >> Yeah. I mean, I think certainly they view it as kind of the old, you know, when you bought TV ads, it's how many households saw my commercial, right? It's a similar type of metric of how many eyeballs saw a piece of content that we put out. I think the metric
more people are starting to care about is engagements, which is how many people actually engaged with that piece of content, whether it's a like, a comment, or a share, because that's actual engagement. Yeah, you might have seen it for three seconds, but we know how things work. You're scrolling pretty fast. But if you actually stopped to engage with something, that's where I think brands are starting to see value. And as we think about our content, we have a framework that our digital team uses, and one of the pillars of that is thumb-stopping. We want to create content that is thumb-stopping, that people actually engage with. And that's been a big focus of ours the last couple of years. >> I presume you're using video, hugely? >> Video, yes. We've got a whole graphics team that does custom graphics, whether it's for stats or for historical anniversaries. We have a whole in-house production team that does higher-end work, and then our digital team does more kind of straight-from-the-phone raw footage. So we're using a variety of different mediums to reach our fans. >> Matt, what's your background? How'd you get into all of this? >> I spent seven years in consulting. I worked for Deloitte in their strategy group out of Chicago, and I worked for CPG companies, at the intersection of retailer and CPG. So a lot of in-store promotional work, helping brands think through just general revenue management, pricing strategy, promotional strategy, and I stumbled upon greatness with the Bulls job. A friend gave me the heads-up that they were looking to fill this type of role, and I was able to get my resume in the mix, and I was lucky enough to get the job. When I started, it was a team of one. Five years later, we're a team of six, and we'll probably keep growing. So it's been an exciting ride. >> And your background is math? >> It's business. I went to Indiana for undergrad business and then went to Kellogg.
At Northwestern I got an MBA in strategy, so that's my background. But, you know, I've dabbled in sports. I worked for the Chicago 2016 Olympic bid back in the day when I was at Deloitte. And so it's always been a dream of mine. I just never knew how I'd get there. I always wanted to work in sports, I just didn't know the path. And I was lucky enough to find the path a lot earlier than I thought. >> How about this conference? I know you've been to the other MIT event. How about this one? Have you found some key takeaways? >> I think it's been great, because a lot of the conferences we go to are really sports-focused. So you've got the MIT Sports Analytics Conference. You have SEAT. You have NBA-type programming that they put on. But it's nice to get out of sports and sort of see how other, bigger industries are thinking about some of the problems, specifically around data management and the influx of data. It's always nice to kind of elevate, just have some room to breathe and think, and meet people that are not in sports and start to build those relationships with thought leaders and things like that. So it's been great. It's my first time here. I'll probably be back. >> Good. Well, hopefully you get to see a game, even though the Sox aren't playing that well. Thanks so much for coming on theCUBE. >> No problem. Thanks for having me. >> It was great to have you. All right, keep it right there, everybody. We'll be back with our next guest. Paul Gillin and Dave Vellante here in the house. You're watching theCUBE from MIT CDOIQ. We'll be right back.

Published Date : Aug 1 2019

ENTITIES

Entity	Category	Confidence
Dave Volante	PERSON	0.99+
Matt Kobe	PERSON	0.99+
19,000,000	QUANTITY	0.99+
Cuba	LOCATION	0.99+
8	QUANTITY	0.99+
Deloitte	ORGANIZATION	0.99+
Red Sox	ORGANIZATION	0.99+
Clippers	ORGANIZATION	0.99+
China	LOCATION	0.99+
Billy	PERSON	0.99+
five years	QUANTITY	0.99+
Bill James	PERSON	0.99+
seven	QUANTITY	0.99+
Chicago	LOCATION	0.99+
Matt	PERSON	0.99+
Yankees	ORGANIZATION	0.99+
Paul Gill	PERSON	0.99+
Lakers	ORGANIZATION	0.99+
seven years	QUANTITY	0.99+
BMW	ORGANIZATION	0.99+
three seconds	QUANTITY	0.99+
one	QUANTITY	0.99+
Chicago Bulls	ORGANIZATION	0.99+
80	QUANTITY	0.99+
Silicon Angle Media	ORGANIZATION	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
single	QUANTITY	0.99+
MasterCard	ORGANIZATION	0.99+
two teams	QUANTITY	0.99+
two big buckets	QUANTITY	0.99+
82 game	QUANTITY	0.99+
Sox	ORGANIZATION	0.99+
seven other guys	QUANTITY	0.99+
M. I T Sports Analytics	EVENT	0.99+
10,000,000 impressions	QUANTITY	0.99+
Bulls	ORGANIZATION	0.99+
three preseason games	QUANTITY	0.99+
M I t Sports Analytics	EVENT	0.99+
two things	QUANTITY	0.99+
two people	QUANTITY	0.99+
first	QUANTITY	0.99+
single games	QUANTITY	0.99+
Five years later	DATE	0.98+
Twitter	ORGANIZATION	0.98+
several years ago	DATE	0.98+
10 teams	QUANTITY	0.98+
41 home game	QUANTITY	0.98+
Northwestern	ORGANIZATION	0.98+
both sides	QUANTITY	0.98+
first time	QUANTITY	0.98+
Facebook	ORGANIZATION	0.98+
LeBron	PERSON	0.98+
both	QUANTITY	0.98+
10	DATE	0.98+
Alex	PERSON	0.98+
this year	DATE	0.97+
Kansas City Royals	ORGANIZATION	0.97+
One	QUANTITY	0.97+
12 months a year	QUANTITY	0.97+
first year	QUANTITY	0.97+
78 years ago	DATE	0.95+
single game tickets	QUANTITY	0.95+
M I T. Event	EVENT	0.94+
1/2	QUANTITY	0.94+
Indian	OTHER	0.94+
Instagram	ORGANIZATION	0.94+
instagram	ORGANIZATION	0.93+
7 years	QUANTITY	0.92+
first day	QUANTITY	0.92+
15 years ago	DATE	0.92+
44 of those events	QUANTITY	0.91+
six full	QUANTITY	0.91+
Maura's Bad Morris	ORGANIZATION	0.9+
a week	QUANTITY	0.9+
Snapchat	ORGANIZATION	0.9+
M. I. T.	PERSON	0.9+
over 25,000,000 social media followers	QUANTITY	0.88+
seconds	QUANTITY	0.88+
Last couple years	DATE	0.88+
N. B.	LOCATION	0.87+

Julie Johnson, Armored Things | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. (electronic music) >> Welcome back to MIT in Cambridge, Massachusetts, everybody. You're watching theCUBE, the leader in live tech coverage. My name is Dave Vellante, I'm here with Paul Gillin. Day two of the MIT Chief Data Officer Information Quality Conference. One of the things we like to do at these shows, we love to profile Boston-area start-ups that are focused on data, and in particular we love to focus on start-ups that are founded by women. Julie Johnson is here, she's the Co-founder and CEO of Armored Things. Julie, great to see you again. Thanks for coming on. >> Great to see you. >> So why did you start Armored Things? >> You know, Armored Things was created around a mission to keep people safe. Early in the time when we were looking at starting this company, incidents like Las Vegas happened, Parkland happened, and we realized that the world of security and operations was really stuck in the past, right? It's manual solutions, generally driven by human instinct, anecdotal evidence, and tools like walkie-talkies and video cameras. We knew there had to be a better way, right? In the world of data that we live in today, I would ask if either of you got in your car this morning without turning on Google Maps to see where you were going, and the best route with traffic. We want to help universities, ball parks, corporate campuses do that for people. How do we keep our people safe? By understanding how they live. >> Yeah, and stay away from Lambert Street in Cambridge, by the way. >> (laughing) >> Okay, so, you know, when people think about security they think about cyber, they think about virtual security, et cetera, et cetera, but there's also the physical security aspect. Can you talk about the balance of those two? >> Yeah, and I think both are very important.
We actually tend to mimic some of the revolutions that have happened on the cyber security side over the last 10 years with what we're trying to do in the world of physical security. So, folks watching this who are familiar with cyber security might understand concepts like anomaly detection, SIEM and SOAR for orchestrated response. We very much believe that similar concepts can be applied to the physical world, but the unique thing about the physical world is that it has defined boundaries, right? People behave in accordance with their environment. So, how do we take the lessons learned in cyber security over 10 to 15 years, and apply them to that physical world? I also believe that physical and cyber security are converging. So, are there things that we know in the physical world, because of how we approach the problem, that can be a leading indicator of a threat in either the physical world or the digital world? What many people don't understand is that for some of these cyber security hacks, the first weak link is physical access to your network, to your data, to your systems. How do we actually help you get an eye on that, so you already have some context when you notice it in the digital realm? >> So, go back to the two examples you cited earlier, the two shooting examples. Could those have been prevented or mitigated in some way using the type of technology you're building? >> Yeah, I hate to say that you could ever prevent an incident like that. Everyone wants us to do better. Our goal is to get a better sense, predictively, of the leading indicators that tell you you have a problem. So, because we're fundamentally looking at patterns of people and flow, I want to know when a normal random environment starts to disperse in a certain way, or if I have a bottleneck in my environment. Because if I then have that type of incident occur, I already know where my hotspots are, where my pockets of risk are.
So, I can address it that much more efficiently from a response perspective. >> So if people are moving quickly away from a venue, it might be an indication that there's something wrong- >> It could be, yeah. That demands attention. >> Yeah, when you go to a baseball game, or when you go to work I would imagine that you generally have a certain pattern of behavior. People know conceptually what those patterns are. But, we're the first effort to bring them data to prove what those patterns are so that they can actually use that data to consistently re-examine their operations, re-examine their security from a staffing perspective, from a management perspective, to make sure that they're using all the data that's at their disposal. >> Seems like there would be many other applications beyond security of this type of analysis. Are you committed to the security space, or do you have broader ambitions? >> Are we committed to the security space? A hundred percent. I would say the number one reason why people join our team, and the number one reason why people call us to be customers is for security. There's a better way to do things. We fundamentally believe that every ball park, every university, every corporate campus, needs a better way. I think what we've seen though is exactly what you're saying. As we built our software for security in these venues, and started with an understanding of people and flow, there's a lot that falls out of that, right? How do I open gates that are more effective based on patterns of entry and exit? How do I make sure that my staffing's appropriate for the number of people I have in my environment? There's lots of other contextual information that can ultimately drive a bottom line or top line revenue. So, you take a pro sports venue for example.
If we know that on a 10 degree colder day people tend to egress earlier in the game, how do we adjust our food and beverage strategy to save money on hourly workers, so that we're not overstaffing in a period of time that doesn't need those resources? >> She's talking about the physical and the logical security worlds coming together, and security of course has always been about data, but 10 years ago it was staring at logs. Increasingly the machines are helping us do that, and software is helping us do that. So can you add some color to at least the trends in the market generally, and then maybe specifically what you're doing bringing machine intelligence to the data to make us more secure? >> Sure, and I hate to break it to you, but logs are still a pretty big part of what people are watching on a daily basis, as are video cameras. We've seen a lot of great technology evolve in the video management system realm. Very advanced technology, great at object recognition and detecting certain behaviors with a video only solution, right? How do we help pinpoint certain behaviors on a specific frame or specific camera? The only problem with that is, if you have people watching those cameras, you're still relying on humans in the loop to catch a malicious behavior, to respond in the event that they're notified about something unusual. That still becomes a manual process. What we do is use data to watch not only cameras; we are watching your cameras, your Wi-Fi, access control, and contextual data from public transit or weather. How do we get this greater understanding of your environment that helps us watch everything so that we can surface the things that you want the humans in the loop to pay attention to, right? So, we're not trying to remove the human, we're trying to help them focus their time and make decisions that are backed by data in the most efficient way possible. >> How about the concerns about the surveillance society?
In some countries, it's just taken for granted now that you're on camera all the time. In the US that's a little bit more controversial. In what you're doing, do you have to be sensitive to that in designing the tools you're building? >> Yeah, and I think to Dave's question, there are solutions like facial recognition which are very much working on identifying the individual. We have a philosophy as a company that security doesn't necessarily start with the individual, it starts with the aggregate. How do we understand, at an aggregate macro level, the patterns in an environment? Which means I don't have to identify Paul, or I don't have to identify Dave. I want to look for what's usual and unusual, and use that as the basis of my response. There's certain instances where you want to know who people are. Do I want to know who my security personnel are so I can dispatch them more efficiently? Absolutely. Let's opt those people in and allow them to share the information they need to share to be better resources for our environment. But, that's the exception not the norm. If we make the norm privacy first, I think we'll be really successful in this emerging GDPR, data-centric world. >> But I could see somebody down the road saying hey can you help us find this bad guy? And my kid's at camp this week. This is his 7th year of camp, and this year was the first year my wife, she was able to sign up for a facial recognition thing. So, we used to have to scroll through hundreds and hundreds of pictures to see oh, there he is! And so Deb signs up for this thing, and then it pings you when your son has a picture taken. >> Yeah. And I was like, that's awesome. Oh. (laughing) >> That's great until you think about it. >> But there aren't really any clear privacy laws today. And so you guys are saying, look it, we're looking at the big picture. >> That's right. >> But that day is coming isn't it? >> There's certain environments that care more than others.
If you think about universities, which is where we first started building our technology, they cared greatly about the privacy of their students. Health care is a great example. We want to make sure that we're protecting people's personal data at a different level. Not only because that's the right thing to do, but also from a regulatory perspective. So, how do we give them the same security without compromising the privacy? >> Talk about the bottom line. You mentioned to us earlier that you just signed a contract with a sports franchise, you're actually going to help them, help save them money by deploying their resources more efficiently. How does your technology help the bottom line? >> Sure, your average sporting venue is getting great information at the point a ticket is scanned or a ticket is purchased. They have very little visibility beyond that into the customer journey during an event at their venue. So, if you think about again, patterns of people and flow from a security perspective, at our core we're helping them staff the right gates, or figure out where people need to be based on hot spots in their environment. But, what that also results in is an ability to drive other operational benefits. Do we have a zone that's very low utilization that we could use as maybe even a benefit to our avid fans? Send them to that area, get traffic in that area, and now give them a better concession experience because of it, right? Where they're going to end up spending more money because they're not waiting in line in a different zone. So, how do we give them a dashboard in real time, but also alerts or reports that they can use on an ongoing basis to change their decision making going forward? >> So, give us the company overview. Where are you guys at with funding, head count, all that good stuff? >> So, we raised a seed round with some great Boston and Silicon Valley investors a year ago.
So, that was Glasswing, a Boston AI-focused fund that has been a great partner for us, and Inovia, which is Canada's largest VC fund and recently opened a Silicon Valley office. We just started raising a series A about a week ago. I'm excited to say those conversations have been going really well so far. We have some potential strategic partners who we're excited about who know data better than anyone else that we think would help us accelerate our business. We also have a few folks who are very familiar with the large venue space. You know, the distributed campuses, the sporting and entertainment venues. So, we're out looking for the right partner to lead our series A round, and take our business to the next level, but where we are today with five really great branded customers, I think we'll have 20 by the end of next year, and we won't stop fighting 'til we're at every ball park, every football stadium, every convention center, school. >> The big question, at some point will you be able to eliminate security lines? (laughing) >> I don't think that's my core mission. (laughing) But, optimistically I'd love to help you. Right, I think there's some very talented people working on that challenge, so I'll defer that one to them. >> And rough head count today? >> We have 23 people. >> You're 23 people so- >> Yeah, headquartered in Boston, Post Office Square. >> Awesome, great location. So, and you say you've got five customers, so you're generating revenue? >> Yes. >> Okay, good. Well, thank you for coming on The Cube. >> Yeah, thank you. >> And best of luck with the series A- >> I appreciate it and going forward >> Yeah, great. >> All right, and thank you for watching. Paul Gillin and I will be back right after this short break. This is The Cube from MIT Chief Data Officer Information Quality Conference in Cambridge. We'll be right back. (electronic music)

Published Date : Aug 1 2019


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Julie Johnson	PERSON	0.99+
Julie	PERSON	0.99+
Cambridge	LOCATION	0.99+
7th year	QUANTITY	0.99+
Inovia	ORGANIZATION	0.99+
Paul	PERSON	0.99+
Lambert Street	LOCATION	0.99+
Boston	LOCATION	0.99+
five	QUANTITY	0.99+
two examples	QUANTITY	0.99+
10 degree	QUANTITY	0.99+
US	LOCATION	0.99+
five customers	QUANTITY	0.99+
23 people	QUANTITY	0.99+
two	QUANTITY	0.99+
today	DATE	0.99+
Deb	PERSON	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
Armored Things	ORGANIZATION	0.99+
both	QUANTITY	0.99+
Google Maps	TITLE	0.99+
Silicon Valley	LOCATION	0.99+
Glasswing	ORGANIZATION	0.99+
One	QUANTITY	0.99+
this week	DATE	0.98+
Cambridge Massachusetts	LOCATION	0.98+
a year ago	DATE	0.98+
first year	QUANTITY	0.98+
series A	OTHER	0.98+
hundred percent	QUANTITY	0.98+
20	QUANTITY	0.98+
Day two	QUANTITY	0.97+
The Cube	TITLE	0.97+
Las Vegas	LOCATION	0.97+
first	QUANTITY	0.97+
Canada	LOCATION	0.96+
GDPR	TITLE	0.96+
Chief Data Officer	EVENT	0.95+
over 10	QUANTITY	0.94+
10 years ago	DATE	0.94+
this year	DATE	0.94+
Surveillance Society	ORGANIZATION	0.93+
Boston Post Office Square	LOCATION	0.92+
15 years	QUANTITY	0.91+
first effort	QUANTITY	0.91+
end of next year	DATE	0.89+
MIT	ORGANIZATION	0.89+
this morning	DATE	0.88+
two shooting examples	QUANTITY	0.85+
about a week ago	DATE	0.83+
Thi	PERSON	0.83+
Armored	ORGANIZATION	0.83+
football stadium	QUANTITY	0.82+
one	QUANTITY	0.82+
2019	DATE	0.81+
Information Quality Symposium	EVENT	0.8+
hundreds of pictures	QUANTITY	0.79+
great branded customers	QUANTITY	0.77+
last 10 years	DATE	0.73+
hundreds and	QUANTITY	0.73+
MIT Chief Data Officer Information Quality Conference	EVENT	0.72+
Massachusets	LOCATION	0.7+
Parkland	ORGANIZATION	0.7+
every ball park	QUANTITY	0.7+
one reason	QUANTITY	0.69+
Walkie-	ORGANIZATION	0.66+
first weak link	QUANTITY	0.66+
convention center	QUANTITY	0.65+
The Cube	ORGANIZATION	0.64+
corporate campus	QUANTITY	0.64+
ball park	QUANTITY	0.61+
MIT Chief	ORGANIZATION	0.59+
Talkies	TITLE	0.57+
university	QUANTITY	0.57+
Data Officer Information Quality Conference	EVENT	0.54+

Colin Mahony, Vertica | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts everybody, you're watching The Cube, the leader in tech coverage. My name is Dave Vellante here with my cohost Paul Gillin. This is day one of our two day coverage of the MIT CDOIQ conferences. CDO, Chief Data Officer, IQ, information quality. Colin Mahony is here, he's a good friend and long time CUBE alum. I haven't seen you in a while, >> I know >> But thank you so much for taking some time, you're like a special guest here >> Thank you, yeah it's great to be here, thank you. >> Yeah, so, this is not, you know, something that you would normally attend. I caught up with you, invited you in. This conference has started as, like back office governance, information quality, kind of wonky stuff, hidden. And then when the big data meme took off, kind of around the time we met, the Chief Data Officer role emerged, the whole Hadoop thing exploded, and then this conference kind of got bigger and bigger and bigger. Still intimate, but very high level, very senior. It's kind of come full circle as we've been saying, you know, information quality still matters. You have been in this data business forever, so I wanted to invite you in just to get your perspectives, we'll talk about what's new with what's going on in your company, but let's go back a little bit. When we first met and even before, you saw it coming, you kind of invested your whole career into data. So, take us back 10 years, I mean it was so different, remember it was Batch, it was Hadoop, but it was cool. There was a lot of cool >> It's still cool. (laughs) projects going on, and it's still cool. But, take a look back. >> Yeah, so it's changed a lot, look, I got into it a while ago, I've always loved data, I had no idea, the explosion and the three V's of data that we've seen over the last decade.
But, data's really important, and it's just going to get more and more important. But as I look back I think what's really changed, and even if you just go back a decade I mean, there's an insatiable appetite for data. And that is not slowing down, it hasn't slowed down at all, and I think everybody wants that perfect solution that they can ask any question and get immediate answers to. We went through the Hadoop boom, I'd argue that we're going through the Hadoop bust, but what people actually want is still the same. You know, they want real answers, accurate answers, they want them quickly, and they want it against all their information and all their data. And I think that Hadoop evolved a lot as well, you know, it started as one thing 10 years ago, with MapReduce and I think in the end what it's really been about is disrupting the storage market. But if you really look at what's disrupting storage right now, public clouds, S3, right? That's the new data lake. So there's always a lot of hype cycles, everybody talks about you know, now it's Cloud, everything, for maybe the last 10 years it was a lot of Hadoop, but at the end of the day I think what people want to do with data is still very much the same. And a lot of companies are still struggling with it, hence the role for Chief Data Officers to really figure out how do I monetize data on the one hand and how do I protect that asset on the other hand. >> Well so, and the cool thing is, so this conference is not a tech conference, really. And we love tech, we love talking about this, this is why I love having you on. We kind of have a little Vertica thread that I've created here, so Colin essentially, is the current CEO of Vertica, I know that's not your title, you're GM and Senior Vice President, but you're running Vertica. So, Michael Stonebraker's coming on tomorrow, >> Yeah, excellent. >> Chris Lynch is coming on tomorrow, >> Oh, great, yeah. >> we've got Andy Palmer >> Awesome, yeah.
>> coming up as well. >> Pretty cool. (laughs) >> So we have this connection, why is that important? It's because, you know, Vertica is a very cool company and is all about data, and it was all about disrupting, sort of the traditional relational database. It's kind of doing more with data, and if you go back to the roots of Vertica, it was like how do you do things faster? How do you really take advantage of data to really drive new business? And that's kind of what it's all about. And the tech behind it is really cool, we did your conference for many, many years. >> It's coming back by the way. >> Is it? >> Yeah, this March, so March 30th. >> Oh, wow, mark that down. >> At Boston, at the new Encore Hotel. >> Well we better have theCUBE there, bro. (laughs) >> Yeah, that's great. And yeah, you've done that conference >> Yep. >> haven't you before? So very cool customers, kind of leading edge, so I want to get to some of that, but let's talk the disruption for a minute. So you guys started with the whole architecture, MPP and so forth. And you talked about Cloud, Cloud really disrupted Hadoop. What are some of the other technology disruptions that you're seeing in the market space? >> I think, I mean, you know, it's hard not to talk about AI, machine learning, and what one means versus the other, who knows right? But I think one thing that is definitely happening is people are leveraging the volumes of data and they're trying to use all the processing power and storage power that we have to do things that humans either are too expensive to do or simply can't do at the same speed and scale. And so, I think we're going through a renaissance where a lot more is being automated, certainly on the Vertica roadmap, and our path has always been initially to get the data in and then we want the platform to do a lot more for our customers, lots more analytics, lots more machine learning in the platform.
So that's definitely been a lot of the buzz around, but what's really funny is when you talk to a lot of customers they're still struggling with just some basic stuff. Forget about the predictive thing, first you've got to get to what happened in the past. Let's give accurate reporting on what's actually happening. The other big thing I think as a disruption is, I think IOT, for all the hype that it's getting it's very real. And every device is kicking off lots of information, the feedback loop of A/B testing or quality testing for predictive maintenance, it's happening almost instantly. And so you're getting massive amounts of new data coming in, it's all this machine sensor type data, you got to figure out what it means really quick, and then you actually have to do something and act on it within seconds. And that's a whole new area for so many people. It's not their traditional enterprise data network warehouse and you know, back to your comment on Stonebraker, he got a lot of this right from the beginning, you know, and I think he looked at the architectures, he took a lot of the best in class designs, we didn't necessarily invent everything, but we put a lot of that together. And then I think the other thing you've got to do is constantly re-invent your platform. We came out with our Eon Mode to run cloud native, we just got rated the best cloud data warehouse from a net promoter score rating perspective, so, but we got to keep going you know, we got to keep re-inventing ourselves, but leverage everything that we've done in the past as well. >> So one of the things that you said, which is kind of relevant for here, Paul, is you're still seeing a real data quality issue that customers are wrestling with, and that's a big theme here, isn't it? >> Absolutely, and the, what goes around comes around, as Dave said earlier, we're still talking about information quality 13 years after this conference began. Have the tools to improve quality improved all that much?
>> I think the tools have improved, I think that's another area where machine learning, if you look at Tamr, and I know you're going to have Andy here tomorrow, they're leveraging a lot of the augmented things you can do with the processing to make it better. But I think one thing that makes the problem worse now, is it's gotten really easy to pour data in. It's gotten really easy to store data without having to have the right structure, the right quality, you know, 10 years ago, 20 years ago, everything was perfect before it got into the platform. Right, everything was, there was quality, everything was there. What's been happening over the last decade is you're pumping data into these systems, nobody knows if it's redundant data, nobody knows if the quality's any good, and the amount of data is massive. >> And it's cheap to store >> Very cheap to store. >> So people keep pumping it in. >> But I think that creates a lot of issues when it comes to data quality. So, I do think the technology's gotten better, I think there's a lot of companies that are doing a great job with it, but I think the challenge has definitely upped. >> So, go ahead. >> I'm sorry. You mentioned earlier that we're seeing the death of Hadoop, but I'd like you to elaborate on that because (Dave laughs) Hadoop actually came up this morning in the keynote, it's part of what GlaxoSmithKline did. Came up in a conversation I had with the CEO of Experian last week, I mean, it's still out there, why do you think it's in decline? >> I think, I mean first of all if you look at the Hadoop vendors that are out there, they've all been struggling. I mean some of them are shutting down, two of them have merged and they've got killed lately. I think there are some very successful implementations of Hadoop. I think Hadoop as a storage environment is wonderful, I think you can process a lot of data on Hadoop, but the problem with Hadoop is it became the panacea that was going to solve all things data.
It was going to be the database, it was going to be the data warehouse, it was going to do everything. >> That's usually the kiss of death, isn't it? >> It's the kiss of death. And it, you know, the killer app on Hadoop, ironically, became SQL. I mean, SQL's the killer app on Hadoop. If you want a SQL engine, you don't need Hadoop. But what we did was, in the beginning Mike sort of made fun of it, Stonebraker, and joked a lot about he's heard of MapReduce, it's called Group By, (Dave laughs) and that created a lot of tension between the early Vertica and Hadoop. I think, in the end, we embraced it. We sit next to Hadoop, we sit on top of Hadoop, we sit behind it, we sit in front of it, it's there. But I think what the reality check of the industry has been, certainly by the business folks in these companies is it has not fulfilled all the promises, it has not fulfilled a fraction of the promises that they bet on, and so they need to figure those things out. So I don't think it's going to go away completely, but I think its best success has been disrupting the storage market, and I think there's some much larger disruptions of technologies that frankly are better than HDFS to do that. >> And the Cloud was a gamechanger >> And a lot of them are in the cloud. >> Which is ironic, 'cause you know, Cloudera, (Colin laughs) they didn't really have a cloud strategy, neither did Hortonworks, neither did MapR and, it just so happened Amazon had one, Google had one, and Microsoft has one, so, it's just convenient to-- >> Well, how is that affecting your business? We've seen this massive migration to the cloud (mumbles) >> It's actually been great for us, so one of the things about Vertica is we run everywhere, and we made a decision a while ago, we had our own data warehouse as a service offering. It might have been ahead of its time, never really took off, what we did instead is we pivoted and we said "you know what?
We're going to invest in that experience so it's a SaaS-like experience, but we're going to let our customers have full control over the cloud. And if they want to go to Amazon they can, if they want to go to Google they can, if they want to go to Azure they can." And we really invested in that and that experience. We're up on the Amazon marketplace, we have lots of customers running up on Amazon Cloud as well as Google and Azure now, and then about two years ago we went down and did this endeavor to completely re-architect our product so that we could separate compute and storage so that our customers could actually take advantage of the cloud economics as well. That's been huge for us, >> So you scale independent-- >> Scale independently, cloud native, add compute, take away compute, and for our existing customers, they're loving the hybrid aspect, they love that they can still run on premise, they love that they can run up on a public cloud, they love that they can run in both places. So we will continue to invest a lot in that. And it is really, really important, and frankly, I think cloud has helped Vertica a lot, because being able to provision hardware quickly, being able to tie in to these public clouds, into our customers' accounts, give them control, has been great and we're going to continue on that path. >> Because Vertica's an ISV, I mean you're a software company. >> We're a software company. >> I know you were a part of HP for a while, and HP wanted to mash that in and run it on its hardware, but software runs great in the cloud. And then to you it's another hardware platform. >> It's another hardware platform, exactly. >> So give us the update on Micro Focus, Micro Focus acquired Vertica as part of the HPE software business, how many years ago now? Two years ago? >> Less than two years ago. >> Okay, so how's that going, >> It's going great. >> Give us the update there.
>> Yeah, so first of all it is great, HPE and HP were wonderful to Vertica, but it's great being part of a software company. Micro Focus is a software company. And more than just a software company it's a company that has a lot of experience bridging the old and the new. Leveraging all of the investments that you've made but also thinking about cloud and all these other things that are coming down the pike. I think for Vertica it's been really great because, as you've seen Vertica has gotten its identity back again. And that's something that Micro Focus is very good at. You can look at what Micro Focus did with SUSE, the Linux company, which actually you know, now just recently spun out of Micro Focus but, letting organizations like Vertica that have this culture, have this product, have this passion, really focus on our market and our customers and doing the right thing by them has been just really great for us and operating as a software company. The other nice thing is that we do integrate with a lot of other products, some of which came from the HPE side, some of which came from Micro Focus, security products is an example. The other really nice thing is we've been doing this insource thing at Micro Focus where we open up our source code to some of the other teams in Micro Focus and they've been contributing now in amazing ways to the product. In ways that we would just never be able to scale, but with 4,000 engineers strong in Micro Focus, we've got a much larger development organization that can actually contribute to the things that Vertica needs to do. And as we go into the cloud and as we do a lot more operational aspects, the experience that these teams have has been incredible, and security's another great example there. So overall it's been great, we've had four different owners of Vertica, our job is to continue what we do on the innovation side in the culture, but so far Micro Focus has been terrific. 
>> Well, I'd like to say, you're kind of getting that mojo back, because you guys as an independent company were doing your own thing, and then you did for a while inside of HP, >> We did. >> And that obviously changed, 'cause they wanted more integration, but, and Micro Focus, they know what they're doing, they know how to do acquisitions, they've been very successful. >> It's a very well run company, operationally. >> The SUSE piece was really interesting, spinning that out, because now RHEL is part of IBM, so now you've got SUSE as the lone independent. >> Yeah. >> Yeah. >> But I want to ask you, go back to a technology question, is NoSQL the next Hadoop? Are these databases, it seems to be that the hot fad now is NoSQL, it can do anything. Is the promise overblown? >> I think, I mean NoSQL has been out almost as long as Hadoop, and I, we always say not only SQL, right? Mike's said this from day one, best tool for the job. Nothing is going to do every job well, so I think that there are, whether it's key value stores or other types of NoSQL engines, document DBs, now you have some of these DBs that are running on different chips, >> Graph, yeah. >> there's always, yeah, graph DBs, there's always going to be specialty things. I think one of the things about our analytic platform is we can do, time series is a great example. Vertica's a great time series database. We can compete with specialized time series databases. But we also offer a lot of the other things that you can do with Vertica that you wouldn't be able to do on a database like that. So, I always think there's going to be specialty products, I also think some of these can do a lot more workloads than you might think, but I don't see as much around the NoSQL movement as say I did a few years ago. >> But so, and you mentioned the cloud before as kind of, your position on it I think is a tailwind, not to put words in your mouth, >> Yeah, yeah, it's a great tailwind.
>> You're in the Amazon marketplace, I mean they have products that are competitive, right? >> They do, they do. >> But, so how are you differentiating there? >> I think the way we differentiate, whether it's Redshift from Amazon, or BigQuery from Google, or even what Azure DB does is, first of all, Vertica, I think, from a feature functionality and performance standpoint, is ahead. Number one, I think the second thing, and we hear this from a lot of customers, especially at the C-level is they don't want to be locked into these full stacks of the clouds. Having the ability to take a product and run it across multiple clouds is a big thing, because the stack lock-in now, the full stack lock-in of these clouds is scary. It's really easy to develop in their ecosystems but you get very locked into them, and I think a lot of people are concerned about that. So that works really well for Vertica, but I think at the end of the day it's just, it's the robustness of the product, we continue to innovate, when you look at separating compute and storage, believe it or not, a lot of these cloud-native databases don't do that. And so we can actually leverage a lot of the cloud hardware better than the native cloud databases do themselves. So, like I said, we have to keep going, those guys aren't going to stop, and we actually have great relationships with those companies, we work really well with the clouds, they seem to care just as much about their cloud ecosystem as their own database products, and so I think that's going to continue as well. >> Well, Colin, congratulations on all the success >> Yeah, thank you, yeah. >> It's awesome to see you again and really appreciate you coming to >> Oh thank you, it's great, I appreciate the invite, >> MIT. >> it's great to be here. >> All right, keep it right there everybody, Paul and I will be back with our next guest from MIT, you're watching theCUBE. (electronic jingle)

Published Date : Jul 31 2019


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Andy Palmer	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Colin Mahoney	PERSON	0.99+
Paul	PERSON	0.99+
Colin	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Vertica	ORGANIZATION	0.99+
Chris Lynch	PERSON	0.99+
HPE	ORGANIZATION	0.99+
Michael Stonebreaker	PERSON	0.99+
HP	ORGANIZATION	0.99+
Micro Focus	ORGANIZATION	0.99+
Hadoop	TITLE	0.99+
Colin Mahony	PERSON	0.99+
last week	DATE	0.99+
Andy	PERSON	0.99+
March 30th	DATE	0.99+
NoSQL	TITLE	0.99+
Mike	PERSON	0.99+
Experian	ORGANIZATION	0.99+
tomorrow	DATE	0.99+
SQL	TITLE	0.99+
two day	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
Boston	LOCATION	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
4,000 engineers	QUANTITY	0.99+
Two years ago	DATE	0.99+
SUSE	TITLE	0.99+
Azure DB	TITLE	0.98+
second thing	QUANTITY	0.98+
20 years ago	DATE	0.98+
10 years ago	DATE	0.98+
one	QUANTITY	0.98+
Vertica	TITLE	0.98+
Hortonworks	ORGANIZATION	0.97+
MapReduce	ORGANIZATION	0.97+
one thing	QUANTITY	0.97+

Mark Krzysko, US Department of Defense | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, everybody. We're here at the Tang building at MIT for the MIT CDOIQ Conference. This is the 13th annual MIT CDOIQ. It started as an information quality conference and grew through the big data era, the Chief Data Officer emerged and now it's sort of a combination of those roles. That governance role, the Chief Data Officer role. Critical for organizations for quality and data initiatives, leading digital transformations and the like. I'm Dave Vellante with my cohost Paul Gillin, you're watching theCUBE, the leader in tech coverage. Mark Krzysko is here, the deputy, sorry, Principal Deputy Director for Enterprise Information at the Department of Defense. Good to see you again, thanks for coming on. >> Oh, thank you for having me. >> So, Principal Deputy Director Enterprise Information, what do you do? >> I do data. I do acquisition data. I'm the person in charge of aligning the acquisition data for the programs for the Under Secretary and the components, so a strong partnership with the Army, Navy, and Air Force to enable the department and the services to execute their programs better, more efficiently, and be efficient in the data management. >> What is acquisition data? >> So acquisition data generally can be considered best in the shorthand of cost, schedule, and performance data. When a program is born, you have to manage, you have to be sure it's resourced, you're reporting up to Congress, you need to be sure you have insight into the programs. And finally, sometimes you have to make decisions on those programs. So, cost, schedule, performance is a good shorthand for it. >> So kind of the key metrics and performance metrics around those initiatives. And how much of that is how you present that data? The visualization of it. 
Is that part of your role or is that, sort of, another part of the organization you partner with, or? >> Well, if you think about it, the visualization can take many forms beyond that. So a good part of the role is finding the authoritative trusted source of that data, making sure it's accurate so we don't spend time disagreeing on different data sets on cost, schedule, performance. The major programs are tremendously complex and large and involve an awful lot of data in the buildup to a point where you can look at that. It's not just about visualizing, it's about having governed authoritative data that is, frankly, trustworthy that you can go operate in. >> What are some of the challenges of getting good quality data? >> Well, I think part of the challenge was having a common lexicon across the department and the services. And as I said, the partnership with the services had been key in helping define and create a semantic data model for the department that we can use. So we can have agreement on what it would mean when we were using it and collecting it. The services have thrown all in and, in their perspective, have extended that data model down through their components to their programs so they can better manage the programs because the programs are executed at a service level, not at an OSD level. >> Can you make that real? I mean, is there an example you can give us of what you mean by a common semantic model? >> So for cost, schedule, let's take a very simple one, program identification. Having a key number for that, having a long name, a short name, and having just the general description of that, were in various states amongst the systems. We've had decades where, however the system was configured, they configured it the way they wanted to. It was largely not governed and then trying to bring those data sets together was just impossible to do. So even with just program identification. 
Since the majority of the programs and numbers are executed at a service level, we worked really hard to get the common words and meanings across all the programs. >> So it's a governance exercise then? >> Yeah. It is certainly a governance exercise. I think about it as not so much as, in the IT world or the data world will call it governance, it's leadership. Let's settle on some common semantics here that we can all live with and go forward and do that. Because clearly there's needs for other pieces of data that we may or may not have but establishing a core set of common meanings across the department has proven very valuable. >> What are some of the key data challenges that the DOD faces? And how is your role helping address them? >> Well, in our case, and I'm certain there's a myriad of data choices across the department. In our place it was clarity in, and the governance of, this. Many of the pieces of data were required by statute, law, policy, or regulation. We came out of eras where data was the piece of a report and not really considered data. And we had to lead our ways to beyond the report to saying, "No, we're really talking about key data management." So we've been at this for a few years and working with the services, that has been a challenge. I think we're at the part where we've established the common semantics for the department to go forward with that. And one of the challenges that I think is the access and dissemination of knowing what you can share and when you can share it. Because Michael Conlin said earlier that with the data in mosaic, sometimes you really need to worry about it from our perspective. Is too much publicly available or should we protect on behalf of the government? >> That's a challenge. Is there a challenge in terms of, I'm sure there is but I wonder if you can describe it or maybe talk about how you might have solved it, maybe it's not a big deal, but you got to serve the mission of the organization. >> Absolutely. 
>> That's, like, number one. But at the same time, you've got stakeholders and they're powerful politicians and they have needs and there's transparency requirements, there are laws. They're not always aligned, those two directives, are they? >> No, thank goodness I don't have to deal with misalignments of those. We try to speak in the truth of here's the data and the decisions across the organization. Our reports still go to Congress, they go to Congress on an annual basis through the Selected Acquisition Report. And, you know, we are better understanding what we need to protect and how to advise Congress on what should be protected and why. I would not say that's an easy proposition. The demands for those data come from the GAO, come from Congress, come from the Inspector General and having to navigate that requires good access and dissemination controls and knowing why. We've sponsored some research through the RAND organization to help us look and understand why you have got to protect it and what policies, rules, and regulations are. And all those reports have been public so we could be sure that people would understand what it is. We're coming out of an era where data was not considered as it is today, where reports were easily stamped with a little rubber stamp, but data now moves at the velocities of milliseconds, not at the velocity of reports. So we really took a comprehensive look at that. How do you manage data in a world where it is data and it is on infrastructures like data models? >> So, the future of war. Everybody talks about cyber as the future of war. There's a lot of data associated with that. How does that change what you guys do? Or does it? >> Well, I think from an acquisition perspective, you would think, you know. In that discussion that you just presented us, we're micro in that. We're equipping and acquiring through acquisitions. What we've done is we make sure that our data is shareable, you know? Open API structures. 
Having our data models. Letting the war fighters have our data so they could better understand where information is here. Letting other communities to better help that. By us doing our jobs where we sit, we can contribute to their missions and we've always been very sharing in that. >> Is technology evolving to the point where, let's assume you could dial back 10 or 15 years and you had the nirvana of data quality. We know how fast technology is changing but is it changing as an enabler to really leverage that quality of data in ways that you might not have even envisioned 10 or 15 years ago? >> I think technology is. I think a lot of this is not in tools, it's now in technique and management practices. I think many of us find ourselves rethinking of how to do this now that you have data, now that you have tools that you can get them. How can you adopt better and faster? That requires a cultural change to the organization. In some cases it requires more advanced skills, in other cases it requires you to think differently about the problems. I always like to consider that we, at some point, thought about it as a process-driven organization. Step one to step two to step three. Now process is ubiquitous because data becomes ubiquitous and you could refactor your processes and decisions much more efficiently and effectively. >> What are some of the information quality problems you have to wrestle with? >> Well, in our case, by setting a definite semantic meaning, we kicked the quality problems to those who provide the authoritative data. And if they had a quality problem, we said, "Here's your data. We're going to now use it." So it spurs, it changes the model of them ensuring the quality of those who own the data. And by working with the services, they've worked down through their data issues and have used us a bit as the foil for cleaning up their data errors that they have from different inputs. 
And I like to think about it as flipping the model of saying, "It's not my job to drive quality, it's my job to drive clarity, it's their job to drive the quality into the system." >> Let's talk about this event. So, you guys are long-time contributors to the event. Mark, have you been here since the beginning? Or close to it? >> Um... About halfway through I think. >> When the focus was primarily on information quality? >> Yes. >> Was it CDOIQ at the time or was it IQ? >> It was the very beginnings of CDOIQ. It was right before it became CDOIQ. >> Early part of this decade? >> Yes. >> Okay. >> It was Information Quality Symposium originally, is that what attracted you to it? >> Well, yes, I was interested in it because I think there were two things that drew my interest. One, a colleague had told me about it and we were just starting the data journey at that point. And it was talking about information quality and it was out of a business school in the MIT Sloan side of the house. And coming from a business perspective, it was not just the province of IT, I wanted to learn from others because I sit on the business side of the equation. Not a pure IT-ist or technologist. And I came here to learn. I've never stopped learning through my entire journey here. >> What have you learned this week? >> Well, there's an awful lot I learned. I think it's been... This space is evolving so rapidly with the law, policy, and regulation. Establishing the CDOs, establishing the roles, getting to hear from the CDOs, getting to hear from visions, hear from Michael Conlin and hear from others in the federal agencies. Having them up here and being able to collaborate and talk to them. Also hearing from the technology people, the people that're bringing solutions to the table. And then, I always say this is a bit like group therapy here because many of us have similar problems, we have different start and end points and learning from each other has proven to be very valuable. 
From the hallway conversations to hearing somebody and seeing how they thought about the products, seeing how commercial industry has implemented data management. And you have a lot of similarity of focus of people dealing with trying to bring data to bring value to the organizations and understanding their transformations, it's proven invaluable. >> Well, what did the appointment of the DOD's first CDO last year, what statement did that make to the organization? >> That data's important. Data are important. And having a CDO in that and, when Michael came on board, we shared some lessons learned and we were thinking about how to do that, you know? As I said, I function in a, arguably a silo of the institution, which is the acquisition data. But we were copying CDO homework so it helped in my mind that we can go across to somebody else that would understand and could understand what we're trying to do and help us. And I think it becomes, the CDO community has always been very sharing and collaborative and I hold that true with Michael today. >> It's kind of the ethos of this event. I mean, obviously you guys have been heavily involved. We've always been thrilled to cover this. I think we started in 2013 and we've seen it grow, it's kind of fire marshal full now. We got to get to a new facility, I understand. >> Fire marshal full. >> Next year. So that's congratulations to all the success. >> Yeah, I think it's important and we've now seen, you know, you hear it, you can read it in every newspaper, every channel out there, that data are important. And what's more important than the factor of governance and the factor of bringing safety and security to the nation? >> I do feel like a lot in, certainly in commercial world, I don't know if it applies in the government, but a lot of these AI projects are moving really fast. Especially in Silicon Valley, there's this move fast and break things mentality. 
And I think that's part of why you're seeing some of these big tech companies struggle right now because they're moving fast and they're breaking things without the governance injected and many CDOs are not heavily involved in some of these skunk works projects and it's almost like they're bolting on governance which has never been a great formula for success in areas like governance and compliance and security. You know, the philosophy of designing it in has tangible benefits. I wonder if you could comment on that? >> Yeah, I can talk about it as we think about it in our space and it may be limited. AI is a bit high on the hype curve as you might imagine right now, and the question would be, can it solve a problem that you have? Well, you just can't buy a piece of software or a methodology and have it solve a problem if you don't know what problem you're trying to solve and you wouldn't understand the answer when it gave it to you. And I think we have to raise our data intellectualism across the organization to better work with these products because they certainly represent utility but it's not like you give it with no fences on either side or you open up your aperture to find basic solution on this. How you move forward with it is your workforce has got to be in tune with that, you have to understand some of the data, at least the basics, and particularly with products when you get to machine learning, AI, deep learning, the models are going to be moving so fast that you have to intellectually understand them because you'll never be able to go all the way back and stubby pencil back to an answer. And if you don't have the skills and the math and the understanding of how these things are put together, it may not bring the value that they can bring to us. >> Mark, thanks very much for coming on theCUBE. >> Thank you very much. >> Great to see you again and appreciate all the work you guys both do for the community. All right. And thank you for watching. 
We'll be right back with our next guest right after this short break. You're watching theCUBE from MIT CDOIQ.

Published Date : Jul 31 2019


ENTITIES

Entity	Category	Confidence
Peter Burris	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Michael Dell	PERSON	0.99+
Rebecca Knight	PERSON	0.99+
Michael	PERSON	0.99+
Comcast	ORGANIZATION	0.99+
Elizabeth	PERSON	0.99+
Paul Gillan	PERSON	0.99+
Jeff Clark	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Nokia	ORGANIZATION	0.99+
Savannah	PERSON	0.99+
Dave	PERSON	0.99+
Richard	PERSON	0.99+
Micheal	PERSON	0.99+
Carolyn Rodz	PERSON	0.99+
Dave Vallante	PERSON	0.99+
Verizon	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Eric Seidman	PERSON	0.99+
Paul	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Google	ORGANIZATION	0.99+
Keith	PERSON	0.99+
Chris McNabb	PERSON	0.99+
Joe	PERSON	0.99+
Carolyn	PERSON	0.99+
Qualcomm	ORGANIZATION	0.99+
Alice	PERSON	0.99+
2006	DATE	0.99+
John	PERSON	0.99+
Netflix	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
congress	ORGANIZATION	0.99+
Ericsson	ORGANIZATION	0.99+
AT&T	ORGANIZATION	0.99+
Elizabeth Gore	PERSON	0.99+
Paul Gillen	PERSON	0.99+
Madhu Kutty	PERSON	0.99+
1999	DATE	0.99+
Michael Conlan	PERSON	0.99+
2013	DATE	0.99+
Michael Candolim	PERSON	0.99+
Pat	PERSON	0.99+
Yvonne Wassenaar	PERSON	0.99+
Mark Krzysko	PERSON	0.99+
Boston	LOCATION	0.99+
Pat Gelsinger	PERSON	0.99+
Dell	ORGANIZATION	0.99+
Willie Lu	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Yvonne	PERSON	0.99+
Hertz	ORGANIZATION	0.99+
Andy	PERSON	0.99+
2012	DATE	0.99+
Microsoft	ORGANIZATION	0.99+

Lisa Ehrlinger, Johannes Kepler University | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Hi, everybody, welcome back to Cambridge, Massachusetts. This is theCUBE, the leader in tech coverage. I'm Dave Vellante with my cohost, Paul Gillin, and we're here covering the MIT Chief Data Officer Information Quality Conference, #MITCDOIQ. Lisa Ehrlinger is here, she's the Senior Researcher at the Johannes Kepler University in Linz, Austria, and the Software Competence Center in Hagenberg. Lisa, thanks for coming in theCUBE, great to see you. >> Thanks for having me, it's great to be here. >> You're welcome. So Friday you're going to lay out the results of the study, and it's a study of Data Quality Tools. Kind of the long tail of tools, some of those ones that may not have made the Gartner Magic Quadrant and maybe other studies, but talk about the study and why it was initiated. >> Okay, so the main motivation for this study was actually a very practical one, because we have many company projects with companies from different domains, like steel industry, financial sector, and also focus on automotive industry at our department at Johannes Kepler University in Linz. We have experience with these companies for more than 20 years, actually, in this department, and what reoccurred was the fact that we spent the majority of time in such big data projects on data quality measurement and improvement tasks. So at some point we thought, okay, what possibilities are there to automate these tasks and what tools are out there on the market to automate these data quality tasks. So this was actually the motivation why we thought, okay, we'll look at those tools. Also, companies ask us, "Do you have any suggestions? "Which tool performs best in this-and-this domain?" And I think this study answers some questions that have not been answered so far in this particular detail, in these details. 
For example, the Gartner Magic Quadrant for Data Quality Tools, it's pretty interesting but it's very high-level and focusing on some global vendors, but it does not look at the specific measurement functionalities. >> Yeah, you have to have some certain number of whatever, customers or revenue to get into the Magic Quadrant. So there's a long tail that they don't cover. But talk a little bit more about the methodology, was it sort of you got hands-on or was it more just kind of investigating what the capabilities of the tools were, talking to customers? How did you come to the conclusions? >> We actually approached this from a very scientific side. We conducted a systematic search, which tools are out there on the market, not only industrial tools, but also open-source tools were included. And I think this gives a really nice digest of the market from different perspectives, because we also include some tools that have not been investigated by Gartner, for example, like MobyDQ, Data Quality, or Apache Griffin, which has really nice monitoring capabilities, but lacks some other features from these comprehensive tools, of course. >> So was the goal of the methodology largely to capture a feature function analysis of being able to compare that in terms of binary, did it have it or not, how robust is it? And try to develop a common taxonomy across all these tools, is that what you did? >> So we came up with a very detailed requirements catalog, which is divided into three fields. The first focuses on data profiling to get a first insight into data quality. The second is data quality management in terms of dimensions, metrics, and rules. And the third part is dedicated to data quality monitoring over time, and for all those three categories, we came up with different case studies on a database, on a test database. And so we conducted, we looked, okay, does this tool, yes, support this feature, no, or partially? And when partially, to which extent? 
So I think, especially on the partial assessment, we got a lot into detail in our survey, which is available on arXiv online already. So the preliminary results are already online. >> How do you find it? Where is it available? >> On arXiv. >> arXiv? >> Yes. >> What's the URL, sorry. arXiv.com, or .org, or-- >> arXiv.org, yeah. >> arXiv.org. >> But actually there is an ID I have not with me currently, but I can send you afterwards, yeah. >> Yeah, maybe you can post that with the show notes. >> We can post it afterwards. >> I was amazed, you tested 667 tools. Now, I would've expected that there would be 30 or 40. Where are all of these, what do all of these long tail tools do? Are they specialized by industry or by function? >> Oh, sorry, I think we got some confusion here, because we identified 667 tools out there on the market, but we narrowed this down. Because, as you said, it's quite impossible to observe all those tools. >> But the question still stands, what is the difference, what are these very small, niche tools? What do they do? >> So most of them are domain-specific, and I think this really highlights also these very basic early definition about data quality, of like data quality is defined as fitness for use, and we can pretty much see it here that we excluded the majority of these tools just because they assess some specific kind of data, and we just really wanted to find tools that are generally applicable for different kinds of data, for structured data, unstructured data, and so on. And most of these tools, okay, someone came up with, we want to assess the quality of our, I don't know, like geological data or something like that, yeah. >> To what extent did you consider other sort of non-technical factors? Did you do that at all? I mean, was there pricing or complexity of downloading or, you know, is there a free version available? Did you ignore those and just focus on the feature function, or did those play a role? 
>> So basically the focus was on the feature function, but of course we had to contact the customer support. Especially with the commercial tools, we had to ask them to provide us with some trial licenses, and there we perceived different feedback from those companies, and I think the best comprehensive study here is definitely Gartner Magic Quadrant for Data Quality Tools, because they give a broad assessment here, but what we also highlight in our study are companies that have a very open support and they are very willing to support you. For example, Informatica Data Quality, we perceived a really close interaction with them in terms of support, trial licenses, and also like specific functionality. Also Experian, our contact from Experian from France was really helpful here. And other companies, like IBM, they focus on big vendors, and here, it was not able to assess these tools, for example, yeah. >> Okay, but the other differences of the Magic Quadrant is you guys actually used the tools, played with them, experienced firsthand the customer experience. >> Exactly, yeah. >> Did you talk to customers as well, or, because you were the customer, you had that experience. >> Yes, I were the customer, but I was also happy to attend some data quality event in Vienna, and there I met some other customers who had experience with single tools. Not of course this wide range we observed, but it was interesting to get feedback on single tools and verify our results, and it matched pretty good. >> How large was the team that ran the study? >> Five people. >> Five people, and how long did it take you from start to finish? >> Actually, we performed it for one year, roughly. The assessment. And I think it's a pretty long time, especially when you see how quick the market responds, especially in the open source field. 
But nevertheless, you need to make some cut, and I think it's a very recent study now, and there is also the idea to publish it now, the preliminary results, and we are happy with that. >> Were there any surprises in the results? >> I think the main results, or one of the surprises was that we think that there is definitely more potential for automation, but not only for automation. I really enjoyed the keynote this morning that we need more automation, but at the same time, we think that there is also the demand for more declaration. We observed some tools that say, yeah, we apply machine learning, and then you look into their documentation and find no information, which algorithm, which parameters, which thresholds. So I think this is definitely, especially if you want to assess the data quality, you really need to know what algorithm and how it's attuned and give the user, which in most case will be a technical person with technical background, like some chief data officer. And he or she really needs to have the possibility to tune these algorithms to get reliable results and to know what's going on and why, which records are selected, for example. >> So now what? You're presenting the results, right? You're obviously here at this conference and other conferences, and so it's been what, a year, right? >> Yes. >> And so what's the next wave? What's next for you? >> The next wave, we're currently working on a project which is called some Knowledge Graph for Data Quality Assessment, which should tackle two problems in ones. The first is to come up with a semantic representation of your data landscape in your company, but not only the data landscape itself in terms of gathering meta data, but also to automatically improve or annotate this data schema with data profiles. 
And I think what we've seen in the tools, we have a lot of capabilities for data profiling, but this is usually left to the user ad hoc, and here, we store it centrally and allow the user to continuously verify newly incoming data if this adheres to this standard data profile. And I think this is definitely one step into the way into more automation, and also I think it's the most... The best thing here with this approach would be to overcome this very arduous way of coming up with all the single rules within a team, but present the data profile to a group of data, within your data quality project to those peoples involved in the projects, and then they can verify the project and only update it and refine it, but they have some automated basis that is presented to them. >> Oh, great, same team or new team? >> Same team, yeah. >> Oh, great. >> We're continuing with it. >> Well, Lisa, thanks so much for coming to theCUBE and sharing the results of your study. Good luck with your talk on Friday. >> Thank you very much, thank you. >> All right, and thank you for watching. Keep it right there, everybody. We'll be back with our next guest right after this short break. From MIT CDOIQ, you're watching theCUBE. (upbeat music)
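(Editor's sketch, not part of the interview.) The idea Lisa describes above, storing a data profile centrally and checking newly incoming data against it, could look roughly like the following in Python. This is a minimal illustration under stated assumptions, not the approach used in her project: the `ColumnProfile` structure, the function names, and the 5% null tolerance are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class ColumnProfile:
    """Reference profile for one numeric column (hypothetical structure)."""
    null_fraction: float
    minimum: float
    maximum: float


def build_profile(values: Sequence[Optional[float]]) -> ColumnProfile:
    """Summarize a trusted sample of values into a reference profile."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot profile a column with no non-null values")
    null_fraction = 1 - len(present) / len(values)
    return ColumnProfile(null_fraction, min(present), max(present))


def check_batch(
    values: Sequence[Optional[float]],
    profile: ColumnProfile,
    null_tolerance: float = 0.05,  # assumed tolerance, chosen for illustration
) -> List[str]:
    """Return a list of deviations of a new batch from the stored profile."""
    if not values:
        return ["batch is empty"]
    issues: List[str] = []
    present = [v for v in values if v is not None]
    null_fraction = 1 - len(present) / len(values)
    # Flag the batch if it has noticeably more missing values than the reference.
    if null_fraction > profile.null_fraction + null_tolerance:
        issues.append(
            f"null fraction {null_fraction:.2f} exceeds reference "
            f"{profile.null_fraction:.2f}"
        )
    # Flag values that fall outside the range seen in the reference data.
    out_of_range = [
        v for v in present if not profile.minimum <= v <= profile.maximum
    ]
    if out_of_range:
        issues.append(
            f"{len(out_of_range)} value(s) outside "
            f"[{profile.minimum}, {profile.maximum}]"
        )
    return issues


# Build the profile once from trusted data, then reuse it for each new batch.
reference = build_profile([10.0, 12.0, None, 11.0, 13.0])
clean_report = check_batch([11.0, 12.5, 10.0], reference)
dirty_report = check_batch([11.0, None, None, 40.0], reference)
```

In this sketch the team reviews and refines the stored profile rather than writing every rule by hand, which is the kind of automated starting point the interview describes.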

Published Date : Jul 31 2019


ENTITIES

Entity	Category	Confidence
Lisa Ehrlinger	PERSON	0.99+
Paul Gillin	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Hagenberg	LOCATION	0.99+
Lisa	PERSON	0.99+
Vienna	LOCATION	0.99+
Linz	LOCATION	0.99+
Five people	QUANTITY	0.99+
30	QUANTITY	0.99+
Johannes Kepler University	ORGANIZATION	0.99+
40	QUANTITY	0.99+
Friday	DATE	0.99+
one year	QUANTITY	0.99+
667 tools	QUANTITY	0.99+
France	LOCATION	0.99+
three categories	QUANTITY	0.99+
third part	QUANTITY	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
Experian	ORGANIZATION	0.99+
second	QUANTITY	0.99+
two problems	QUANTITY	0.99+
more than 20 years	QUANTITY	0.99+
Gartner	ORGANIZATION	0.99+
single tools	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.98+
first	QUANTITY	0.98+
MIT CDOIQ	ORGANIZATION	0.98+
a year	QUANTITY	0.97+
three fields	QUANTITY	0.97+
Apache Griffin	ORGANIZATION	0.97+
Archive.org	OTHER	0.96+
.org	OTHER	0.96+
one step	QUANTITY	0.96+
Linz, Austria	LOCATION	0.95+
one	QUANTITY	0.94+
single	QUANTITY	0.94+
first insight	QUANTITY	0.93+
theCUBE	ORGANIZATION	0.92+
2019	DATE	0.92+
this morning	DATE	0.91+
BTQ	ORGANIZATION	0.91+
MIT Chief Data Officer and	EVENT	0.9+
Archive.com	OTHER	0.88+
Informatica	ORGANIZATION	0.85+
Software Competence Center	ORGANIZATION	0.84+
Information Quality Symposium 2019	EVENT	0.81+
MIT Chief Data Officer Information Quality Conference	EVENT	0.72+
Data Quality	ORGANIZATION	0.67+
#MITCDOIQ	EVENT	0.65+
Magic Quadrant	COMMERCIAL_ITEM	0.63+
Magic	COMMERCIAL_ITEM	0.45+
next	EVENT	0.44+
wave	EVENT	0.43+
Magic Quadrant	ORGANIZATION	0.43+
wave	DATE	0.41+
Magic	TITLE	0.39+

Veda Bawo, Raymond James & Althea Davis, ING Bank | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts, everybody. You're watching theCUBE, the leader in live tech coverage. This is theCUBE's two-day coverage of MIT CDOIQ, the Chief Data Officer and Information Quality event. Thirteenth year; we started here in 2013. I'm Dave Vellante with my co-host, Paul Gillin. Veda Bawo. Bowo. Bawo. Sorry, Veda Bawo is here. Did I get that right? >> That's close enough. >> The director of data governance at Raymond James, and Althea Davis, the former chief data officer of ING Bank Challengers and Growth Markets. Ladies, welcome to theCUBE, thanks so much for coming on. >> Thank you. >> Thank you. >> Hi Veda, talk about your role at Raymond James. Relatively new role for you? >> It is a relatively new role. So I recently left Fifth Third Bank as their managing director of data governance and I've moved on to Raymond James in sunny Florida. And I am now the director of data governance for Raymond James. So it's a global financial services company; they do asset and wealth management, investment banking, retail banking. So I'm excited, I'm very excited about it. >> So we've been talking all day, and actually for several years, about how the chief data officer role kind of emerged from the back office of data governance. >> Mmm. >> And information quality, and now it's come, you know, front and center. And actually we've seen a full circle because now it's all about data quality again. So Althea, as the former CDO, is that a fair assessment, that it sort of came out of the ashes of the back room? >> Yeah, I mean, it's definitely a fair assessment. That's where we got started. That's how we got our budgets, that's how we got our teams. However, now we have to serve many masters. We have to deal with all of the privacy, we have to deal with the multiple compliance requirements.
We have to deal with the data operations and we have to deal with all of the new, sexy emerging technologies. So to do AI and data science you need a lot of data. You need to be data rich. You need it to be knowledge management, you need it to be information management. And it needs to be intelligent. So we need to actually raise the bar on what we do and at the same time get the credibility from our C-suite peers. >> Well, I think we no longer have the... We don't have the luxury of being just a cost center anymore. >> No. >> Right, we have to generate revenue. So it's about data monetization. It's about partnering with our businesses to make sure that we're helping to drive strategy and deliver results for the broader organization. >> So you've got to hit the bottom line. >> Yeah. >> Either raise revenue or cut costs. >> Yeah, absolutely. >> You know, directly, that can be tangibly monetized. >> Exactly, keep them out of jail. Right. Save money. >> That too. >> Save money, make money, (inaudible laughter) keep them out of jail. >> Like both CDOs, you do not study for this career path because it didn't exist a few years ago. So talk about your backgrounds and how you came into this role. Veda? >> Yeah, absolutely. So, you know, you talked about data kind of starting in the bowels of the back office. So I am that person, right. So I am an accountant by training. So I am the person who has done legal entity controllership. I've booked journal entries, I've closed the books. I've done regulatory reporting, so I know what it feels like to have to deal with dirty data every single month end, every single quarter end, right. And I know the pain of having to cleanse it and having to deal with our business partners, and having experienced that gave me the passion to want to do better. Right, so I want to influence my partners upstream to do better, as well as to take away some of the pain points that my team was experiencing over and over again. It really was Groundhog Day.
So that really made me feel passionate about going into the data discipline. Right, and so, you know, the benefit is great. It's not an easy journey, but yeah, out of accounting, finance and that kind of back-office operational support was born, right, a data evangelist and someone passionate about it. >> Which made sense because you have to have quality. >> Absolutely. >> Consistency. You have to have the so-called single version of the truth. >> Absolutely, because, you know, regulators like the financial reports to be accurate. All the time. (laughter) >> Exactly. >> How about you? >> I came at it from a totally different angle. I was a marketeer, so I was a business manager, a marketeer. I was working with the big retail brands, you know, the Nikes and the Levi Strausses of the world. So I came to it from a value chain perspective, from marketing, you know, from rolling out retail chains across Europe. And I went on from there in a line management position, with all the pains of the different types of data we needed, and then did quite a bit of consulting with some of the big consultancies, Accenture. And then I rolled more into data migration, so dealing with those huge change projects and having teams from all over the world. And knowing the pains: what all of the guys didn't want to work on, I got it all on my plate. But it put me in position to be a really solid chief data officer. >> Somebody... it was called, like, Data Chicks or something like that (laughter), and I snuck in. I was like the lone... >> Data Chicks. >> I was like the lone data dude. >> You can be a data chick. It's okay, no judgment here. >> And so one of the things that one of the CDOs said there... She was a woman, obviously. And she said, you know, I think that... and the stat was there was a higher proportion of women as CDOs than there were across tech, which is like, I don't know, fifty seventeen percent.
And she's positive that the reason was because it's like a thankless job that nobody wants, and so I just wonder, as a woman CDO, your thoughts on that. Is that true? >> Well, first of all, we're the newest to the table, right, so you're the new kid on the block. It doesn't matter if you're a man or a woman, you're the new kid on the block. So, you know, the CFO's got the four-thousand-year history behind him or her. The CIO or CTO, they've got fifty, sixty years up on us. So we're new. So you have to carve out your space, and I do think that a lot of women by nature like to take on big things, to do things that other people don't want to do. So I can see how women kind of fell into that. But, at the same time, you know, data, it's an asset and it is the newest asset. And it's definitely misunderstood. So I do think that, you know, women, we kind of fell into it, but it was actually something good that happened for women because there's a big future in data. >> Well, let's just be realistic, right. Women have a unique skill set. I may be a little biased, but we have a unique skill set. We're able to solve problems creatively. Right, there's no one-size-fits-all solution for data. There's no accounting pronouncement that tells me how to handle and manage my data. Right, I have to kind of figure it out as I go along and pivot when something doesn't work. I think that's something that is very natural to women. >> Yeah. >> I think that contributes to us kind of taking on these roles. >> Can I just do a little survey here? (laughter) We hear that the chief data officer function is defined differently at different organizations. Now you both are in financial services. You both have a chief data function. Are you doing the same thing? (laughter) >> Absolutely not! (laughter) >> You know, this is data by design. I mean, I've been lucky, I've had teams that go the whole gamut, right.
From the compliance side through to the data operations through to all of the, like I said, the exotic, sexy, you know, emerging technologies stuff with the data scientists. So I've had the whole thing. Also, in my last position at ING Bank, I had to, you know, lead a team of chief data officers across three different continents: Australia, Asia and also Eastern and Western Europe. So it's totally different than, you know, maybe another company where they've only got a chief data officer working on data quality and data governance. >> So again, another challenge of being the new kid on the block, right. Defining roles and responsibilities. There's no one globally, universally accepted definition of what a chief data officer should do. >> Right. >> Right, is data science in or out? Are analytics in or out? Right. >> Security sometimes. >> Security, right, sometimes privacy, is it in or out? Do you have operational responsibilities or are you truly just a second-line governance function, right? There's a mixed bag out there in the industry. I don't know that we have one answer that we know for sure is true. But what I do know for sure is that data is not an IT function. >> Well, okay. That's really important. >> It's not an IT asset. >> Yeah. >> I want to say that it's not an IT asset. It is an information asset or a data asset, which is a different asset than an IT asset or a financial asset or a human asset. >> But, and that's the other big change, is that fifteen... ten to fifteen years ago data was assumed to be a liability, right. >> Mmm. >> Federal Rules of Civil Procedure: we've got to get rid of the data or, you know, we're going to get sued. That's number one, and number two is that data, because it's digital, you know, people say data is the new oil. I always say it's not. It's more important than oil. >> It's like blood. >> Oil you can only use in one use case. Data you can reuse over and over again. >> Reuse, reuse, perpetual. It goes on and on and on.
And every time you reuse it the value increases. So I would agree with you, it is not the new oil. It is much bigger than that, and it needs to... I mean, I know from some of my colleagues in the profession, we talk about borrowing from other, more mature disciplines to make data management, information management and knowledge management much more robust and much more professional. We also need to be more professional about it as the data leaders. >> So in your little panel today, one of the things that you guys addressed is what keeps the CDO up at night. >> Yes. >> I presume it's data. (laughter) >> No, no, no. >> It's our peers that don't get it. (laughter) >> That's what keeps us up at night. >> It's the sponsors that keep us up at night. (laughter) So what was that discussion like? >> So yeah, I mean, it was a lively discussion. Great attendance at the panel, so we appreciate everyone who came out and supported. >> Full house. >> Definitely a full house. Great reviews so far. >> Yep. >> Okay, so the thing that definitely keeps folks up at night, and I'm going to start with my standard one, which is quality. Right, you can have all of the fancy tools, you can have a million data scientists, but if the quality is not good or sufficient, then you're nowhere. So quality is fundamentally the thing that the CDO has to always pay attention to. And there's no magic, you know, pill or magic potion that's going to make the quality right. It's something that the entire organization has to rally around. And it's not a one-and-done thing, right; it has to be a sustainable approach to making sure the quality is good enough so that you can actually reap the benefits or derive the value from your data. >> Absolutely, and I would say, you know, following on from the quality, and I consider that the trustworthiness of the data, I would say as a chief data officer you're coming to the table.
You're coming to the executive table, you need to bring it all, so you need to be impactful. You need to be absolutely relevant to your peers. You also need to be able to put their teams in a position to act. So it needs to be actionable. And if you don't have all of that in combination with the trustworthiness, you're dead in the water. So it is a hard act, and that's why there is high attrition for chief data officers. You know, it's a hard job. But I think it's very much worthwhile because this particular asset, this new asset, we haven't been able to even scratch the surface of what it could mean for us as a society and for commercial organizations or government organizations. >> To your point, it's not a technology problem. When Mark Ramsay was surveying the audience this morning, he said, you know, why have we had so many failures, and the first hand that went up said, it's because of relational databases. >> And I wanted to say it's not a technology problem. >> It's hearts, minds and habits. >> Absolutely. Absolutely. You could make an impact on your data landscape without changing your technology. >> You said at the outset how important it is for you to show a bottom line impact. >> Right. >> What's one project you've worked on or that you've led in your tenure that did that? >> If we're talking about, for example, I can't say specifics, but if we're looking at one of the institutions I worked at, an insurance firm, we looked at the customer journey. So we worked with some of the different departments that traditionally did not get access to data for them to be able to be effective at their jobs. What they wanted to do in marketing was actually create new products to, you know, increase the wallet from the existing customers. Other things they wanted to do: for example, when there were problems with the customers, instead of the customer, you know, leaving the journey, they were able to bring them back in by getting access to the data.
So we either gave them insight, like, you know, looking back to make sure that things didn't go wrong the next time, or we helped them by giving them information so they could develop new products, so this is all about going to market. So that's absolutely bottom line. It's not just all cost efficiency and productivity gains. >> Yeah, pipeline. (laughter) >> And that's really valid, but you know. >> Absolutely, so I'll give you one example where the data organization partnered with our data scientists to try to figure out the best location for various branches for that particular institution. And it was taking, right, trillions of data points about the current footprint as well as other geographic information that was out there publicly available. Taking that and using the analytics to figure out, okay, where should we have our branches, our ATMs, et cetera, and then consolidating the footprint or expanding where appropriate. So that is bottom line impact for sure. >> I remember in the early part of the two thousands reading a Harvard Business Review article about gut feel trumps data every time. But that's an example where, no way. >> Nope. >> You could never do better with the gut than that example that you just gave. >> Absolutely. >> Veda, I want to ask you a question. I don't know if you heard Mark Ramsay's talk this morning, but he sort of declared that data governance was over. >> Mmm. >> And as the director of data governance... >> Never! >> I wondered if you would disagree with that. >> Never! >> Look. >> Were you surprised? >> It's just like saying that I should stop brushing my teeth. Right, I always will have to maintain a certain level of data hygiene. And I don't think that employees and executives and organizations have reached a level of maturity where I can trust them to maintain that level of hygiene independently. And therefore I need a governance function.
I need to check to make sure you brush your teeth in the morning and in the evening. Right, and I need you to go for your annual exam to make sure you don't have any cavities that weren't detected. Right, so I think that there's still a role for governance to play. It will evolve over time for sure, as, you know, the landscape changes, but I think there's still a role for governance. >> And that wasn't my takeaway, Paul. I think he said that basically the enterprise data warehouse failed, master data management failed. The single data model failed, so we punted to governance, and that's not going to solve the enterprise data problem. >> I think it's one leg in the stool. It's one leg in the stool. >> Yeah, I think I would really sum it up as: a monolithic data storage approach failed. Like that. And then our attention went to data governance, but that's not going to solve it either. Look, data management is about twelve different data capabilities. It's a discipline, so we give it the title data governance, but it means multiple things. And I think that if we're more educated and we have more confidence in what we're doing in those different areas, plus information and knowledge management, then we're way ahead of the game. I mean knowledge graphs and semantics. That puts companies, you know, at the top of that, you know, corporate inequality gap that we're looking at right now, where, you know, companies are five and thousand times more valuable than their competition, and the gap is just going to get bigger if some of those companies at the bottom of the gap, you know, just keep on doing the same thing. >> I agree, I was just trying to get you worked up. (laughter) >> Well, you did. >> It's going to be a different kind of show. >> But that point you're making: Microsoft, Apple, Amazon, Google, Facebook. Top five companies in terms of market cap. And they're all data companies.
They surpass all the financial services, all the energy companies, all the manufacturers. >> And Alibaba, same thing. >> Oh yeah. >> They're doing the same thing. >> They're coming right up there. With four or five hundred billion. >> They're all doing the knowledge approach. They're doing all of this stuff, and that's a much more comprehensive approach, looking at it as a full spectrum, and if we in the financial industry or any industry keep on just kind of looking at little bits and pieces, it's not going to work. It's a lot of talk but there's no action. >> We are losing, right. I know that fintechs are, right, infringing upon our territory. Right, if Amazon can provide a credit card or lend you money or extend you credit, they're now functioning as a traditional bank would. If we're not paying attention to them as real competitors, we've lost the battle. >> That's a really important point you're making because it's all digital now. >> Absolutely. >> It used to be you'd never see companies traverse industries, and now you see it: Apple Pay, and Amazon in healthcare. >> Yeah. >> And government organizations teaming up with corporations and individuals. Everything is free-flowing, so that means the knowledge and the data and the information also need to flow freely, but they need to be managed. >> Now you're into a whole realm of privacy and security. >> And regulations, right. Regulations for the non-, right, traditional banks who are doing banking transactions. >> Do you think traditional banks will lose control over the payment systems? >> If they don't move with the times they will. If they don't. I mean, it's not something that's going to happen tomorrow, but you know, there is a category of bank called challenger banks, so there's a reason. You know, even within their own niche there's a group of banks. >> I mean, not even just payments, right. Think about cash transactions: if I do a money transfer, am I going to my traditional bank to do it or am I going to Cash App?
>> I think it's interesting, particularly in the retail banking business, where, you know, one banking app looks pretty much like another and people don't go to branches anymore, and so that brand affinity that used to exist is harder and harder to maintain, and I wonder what role data plays in reestablishing that connection. >> Well, for me, right, I get really excited and sometimes annoyed when I can open up my app for my bank and I can see the pie chart of my spending. They're using my data to inform me about my behaviors; sometimes a good story, sometimes a bad story. But they're using it to inform me. That's making me more loyal to that particular institution, right. So I can also link all of my financial accounts in that one institution's app and I can see a full list of all of my credit cards, all of my loans, all of my investments in one-stop shopping. That's making me go to their app more often versus the other options that are out there. So I think we can use the data in order to endear the customer to us, but we have to be smart about it. >> That's the accountant in you. I just refuse to look. (laughter) >> You can afford to not look. I can't. >> Thank you. >> Thanks for riling us up. >> All right, thank you for watching, everybody. We'll be right back with our next guest right after this short break. You're watching theCUBE from MIT in Boston, Cambridge. Right back. (atmospheric music)

Published Date : Jul 31 2019

ENTITIES

Entity	Category	Confidence
Mark Ramsay	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
Veda Bawo	PERSON	0.99+
Apple	ORGANIZATION	0.99+
Dave Vallante	PERSON	0.99+
Google	ORGANIZATION	0.99+
Alibaba	ORGANIZATION	0.99+
2013	DATE	0.99+
Europe	LOCATION	0.99+
ING Bank	ORGANIZATION	0.99+
Vita	PERSON	0.99+
five	QUANTITY	0.99+
fifty	QUANTITY	0.99+
Nikes	ORGANIZATION	0.99+
four	QUANTITY	0.99+
ING	ORGANIZATION	0.99+
MIT	ORGANIZATION	0.99+
Raymond James	ORGANIZATION	0.99+
Althea Davis	PERSON	0.99+
fifty seventeen percent	QUANTITY	0.99+
both	QUANTITY	0.99+
two day	QUANTITY	0.99+
Levi's	ORGANIZATION	0.99+
one leg	QUANTITY	0.99+
Mark Ramsays	PERSON	0.99+
tomorrow	DATE	0.99+
four thousand year	QUANTITY	0.99+
Australia	LOCATION	0.99+
two thousands	QUANTITY	0.99+
five hundred billion	QUANTITY	0.98+
Cambridge Massachusetts	LOCATION	0.98+
Asia	LOCATION	0.98+
Bawo	PERSON	0.98+
today	DATE	0.98+
Veda	PERSON	0.98+
single	QUANTITY	0.97+
Western Europe	LOCATION	0.97+
one example	QUANTITY	0.97+
One	QUANTITY	0.97+
Eastern	LOCATION	0.97+
one	QUANTITY	0.97+
Ten	DATE	0.97+
Boston, Cambridge	LOCATION	0.97+
Thirteenth year	QUANTITY	0.97+
fifteen	DATE	0.96+
second line	QUANTITY	0.96+
a million data scientists	QUANTITY	0.95+
Raymond James	PERSON	0.95+
five companies	QUANTITY	0.95+
fifteen years ago	DATE	0.94+
fifth third bank	QUANTITY	0.94+
Bowo	PERSON	0.94+
sixty year	QUANTITY	0.93+
trillions of data points	QUANTITY	0.92+
ING bank	ORGANIZATION	0.92+
one use case	QUANTITY	0.91+
one project	QUANTITY	0.91+
this morning	DATE	0.91+
first	QUANTITY	0.9+
thousand times	QUANTITY	0.9+
2019	DATE	0.89+
one answer	QUANTITY	0.87+
Althea	PERSON	0.86+
Challenger	ORGANIZATION	0.85+
one banking app	QUANTITY	0.84+
MIT Chief Data Officer and	EVENT	0.83+
three different continents	QUANTITY	0.82+
few years ago	DATE	0.81+
this morning	DATE	0.8+
single version	QUANTITY	0.78+
number two	QUANTITY	0.76+
Information Quality Symposium 2019	EVENT	0.75+
Harvard	ORGANIZATION	0.73+
pay	TITLE	0.72+
sunny Florida	LOCATION	0.7+

Michael Conlin, US Department of Defense | MIT CDOIQ 2019

(upbeat music) >> From Cambridge, Massachusetts, it's theCUBE. Covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. (upbeat music) >> Welcome back to MIT in Cambridge, Massachusetts, everybody. You're watching theCUBE, the leader in live tech coverage. We go out to the events and extract the signal from the noise. We're here at the MIT CDOIQ. It's the MIT Chief Data Officer event, the 13th annual event. theCUBE started covering this show in 2013. I'm Dave Vellante with Paul Gillin, my co-host, and Michael Conlin is here as the chief data officer of the Department of Defense. Michael, welcome, thank you for coming on. >> Thank you, it's a pleasure to be here. >> So the DoD is, I think, the largest organization in the world. What does the chief data officer of the DoD do on a day-to-day basis? >> A range of things, because we have a range of challenges at the Department of Defense. We are the single largest organization on the planet. We have the greatest scope and scale and complexity. We have the most dangerous competitors of anybody on the planet; it's not a trivial issue for us. So, I have a range of challenges. Challenges around, how do I lift the overall performance of the department using data effectively? How do I help executives make better decisions faster, using more recent, more common data? More common enterprise data is the expression we use. How do I help them become more sophisticated consumers of data and especially data analytics? And, how do we get to the point where I can compare performance over here with performance over there, on a common basis? And compared to commercial benchmarks? Which is now an expectation for us, and ask, are we doing this as well as we should, right across the patch? Knowing that all that data comes from multiple different places to start with. So we have to overcome all those differences and provide that department-wide view. That's the essence of the role.
And now with the recent passage of the Foundations for Evidence-Based Policymaking Act, there are a number of additional expectations that go on top of that, but this is ultimately about improving affordability and performance of the department. >> So overall performance of the organization... >> Overall performance. >> ...as well, and maybe that comes from supporting various initiatives, and making sure you're driving performance on that basis as well. >> It does, but our litmus test is, are we enabling the National Defense Strategy to succeed? The only reason to touch data is to enable the National Defense Strategy to be more successful than without it. And so we're always measuring ourselves against that. But it is, can we objectively say we're performing better? Can we objectively say that we are more affordable, in terms of the way we support the National Defense Strategy? >> I'm curious about your motivations for taking on this assignment because your background, as I see, is primarily in the private sector. A year ago you joined the US Department of Defense. A huge set of issues that you're tackling now. Why'd you do it? >> So I am a capitalist, like most Americans, and I'm a serial entrepreneur. This was my first opportunity to serve government. And when I looked at it, knowing that I could directly support national defense, knowing that I could make a direct, meaningful contribution, it let me exercise that spirit of patriotism that many of us have, but we've just not found ourselves an opportunity. When this opportunity came along I just couldn't say no to it. There's so much to be done and so much appetite for improvement that I just couldn't walk away from this. Now I have to tell you, when you start you take an oath of office to protect and defend the constitution. I don't know, it's maybe a paragraph or maybe it's two paragraphs. It felt like it took an hour to choke it out, because I was suddenly struck with all of this emotion.
>> The gravity of what you were doing. >> Yeah, the gravity of what I'm doing. And that was just a reinforcement of the choice I'd already made, obviously, right. But the chance to be the first chief data officer of the entire Department of Defense, just an enormous privilege. The chance to bring commercial sector best practices in and really lift the game of the department, again, enormous privilege. There are so many people who could do this, probably better than me. The fact that I got the opportunity, I just couldn't say no. Just too important, too many places I could see that we could make things better. I think anybody with a patriotic bone in their body would have jumped at the opportunity. >> That's awesome, I love that. Congratulations on getting that role and seemingly thriving in it. A big part of preserving that capitalist belief, defending the constitution and the American way, it sounds corny, but... >> It's real. >> I'm a patriot as well, is security. And security and data are intertwined. And just the whole future of warfare is dramatically changing. Can you talk about, in a format like this, security, your thinking on that, the department's thinking on that, from a CDO's perspective? >> So as you know we have a number of swimlanes within the department, and security is a very clear swimlane. It's aligned under our chief information officer, but security is everybody's responsibility, of course. Now the longstanding criticism of security people is that they think the best way to secure anything is to permit nobody to touch it. The clear expectation for me as chief data officer is to make sure that information is shared with the right people as rapidly as possible. And that's a different philosophy. Now I'm really lucky. Lieutenant General Dennis Crall, our principal cyber advisor, Dana Deasy, our CIO, these people understand how important it is to get information in the right place at the right time, make it rapidly available and secure it every step along the way.
We embrace the zero trust mantra. And because we embrace the zero trust mantra we're directly concerned with defending the data itself. And as long as we defend the data and the same mechanisms are the mechanisms we use to let people share it, suddenly the tension goes away. Suddenly we all have the same goal. Because the goal is not to prevent use of data, it's to enable use of data in a secure way. So the traditional tension that might be in that place doesn't exist in the department. Very productive, very professional level of collaboration with those folks in this space. Very sophisticated people. >> When we were talking before we went live you mentioned that the DoD has 10,000 plus operational systems... >> That's correct. >> A portfolio of that magnitude is just overwhelming, I mean how did you know what to do first when you moved into this job, or did you have a clear mandate when you were hired? >> So I did have a clear mandate when I was hired and luckily that was spelled out. We knew what to do first because we sat down with actual leaders of the department and asked them what their goals were for improving the performance of the department. And everything starts from that conversation. You find those executives that want to improve performance, you understand what those goals are, and what data they need to manage that improvement. And you capture all the critical business questions they need answers to. From that point on they're bought in to everything that happens, right. Because they want those answers to those critical business questions. They have performance targets of their own, this is now aligned with. And so you have the support you need to go down the rest of the path of finding the data, standardizing it, et cetera. In order to deliver the answers to those questions. But it all starts with either the business mission leaders or the warfighting mission leaders who define the steps they're taking to implement the National Defense Strategy. 
Everything gets lined up against that, you get instant support and you know you're going after the right thing. This is not an "if you build it they will come." This is not a "driftnet the organization, try to gather up all the data." This is spear fishing for specific answers to materially important questions, and everything we do is done on that basis. >> We heard Mark Ramsey this morning talk about the... He showed a picture of stove pipes and then he complicated that picture by showing multiple copies within each of those stove pipes, and says this is organizations that we've all lived in. >> That's my organization too. >> So talk about some of those data challenges at the DoD and how you're addressing those, specifically how you're enabling soldiers in the field to get the right data to the field when they need it. >> So we'll be delicate when we talk about what we do for soldiers in the field. >> Understood, yeah. >> That tends to be sensitive. >> Understand why, sure. >> But all of those dynamics that Mark described in that presentation are present in every large corporation I've ever served. And that includes the Department of Defense. That heterogeneity and sprawl of IT that what I would refer to, he showed us a hair ball of IT. Every large organization has a hair ball of IT. And data scattered all over the place. We took many of the same steps that he described in terms of organizing and presenting meaningful answers to questions, in almost exactly the same sequence. The challenge, as you heard me use the statistic, is that our CIO's published Digital Modernization Strategy calls out that we have roughly 10,000 operational systems. Well, every one of them is different. Every one's put in place by a different group of people at a different time, with a different set of requirements, and a different budget, and a different focus. You know organizational scope. We're just like he showed. We're trying to blend all that in to a common view. 
So we have to find what's the real authoritative piece of data, 'cause it's not all of those systems. It's only a subset of those systems. And you have to do all of the mapping and translations, to make the result add up. Otherwise you double count or you miss something. This is work in progress. This will always be a work in progress to any large organization. So I don't want to give you the impression it's all sorted. Definitely not all sorted. But, the reality is we're trying to get to the point where people can see the data that's available and that's a requirement by the way under the Foundations Act that we have a data catalog, an authoritative data catalog so people can see it and they have the ability to then request access to that through automation. This is what's critical, you need to be able to request access and have it arbitrated on the basis of whether you should directly have access based on your role, your workflow, et cetera, but it should happen in real time. You don't want to wait weeks, or months, or however long for some paperwork to move around. So this all has to become highly automated. So, what's the data, who can access it under what policy, for what purpose? Our roles and responsibilities? Identity management? All this is a combined set of solutions that we have to put in place. I'm mostly worried about a subset of that. My colleagues in these other swimlanes are working to do the rest. Most people in the department have access to data they need in their space. That hasn't been a problem. The problem is you go from space to space, you have to learn a new set of systems and a new set of techniques for a new set of data formats which means you have to be retrained. That really limits our freedom of maneuver of human beings. In the ideal world you'd be able to move from any job in any part of the department to the same job in another part of the department with no retraining whatsoever. You'd be instantly able to make a contribution. 
That's what we're trying to get to. So that's a different kind of a challenge, right. How do we get that level of consistency in the user experience, a modern user experience. So that if I'm a real estate manager, or I'm a medical business manager, or I'm a clinical professional, or I'm whatever, I can go from this location in this part of the department to that location in that part and my experience is the same. It's completely modern, and it's completely consistent. No retraining. >> How much of that challenge pie is people, process and technology? How would you split that opportunity? >> Well everything starts from a process perspective. Because if you automate a bad process, you just make more mistakes in less time at greater costs. Obviously that's not the ideal. But the biggest single challenge is people. It's talent, it's culture. Both on the demand side and on the supply side. In fact a lot of what I talked about in my remarks was the additional changes we need to put in place to bring people into a more modern approach to data, more modern consumption. And look, we have pockets of excellence. And they can hold their own against any team, any place on the planet. But they are pockets of excellence. And what we're trying to do is raise the entire organization's performance. So it's people, people, and people and then the other stuff. But the products, don't care about (laughs). >> We often hear about... >> They're going to change in 12 to 18 months. I'm a technologist, I'm hands on. The products are going to change rapidly, I make no emotional commitment to products. But the people, that's a different story. >> Well we know that in the commercial world we often hear that cultural resistance is what sabotages modernization efforts. The DoD is sort of the ultimate top-down organization. Is it any easier to get buy-in because the culture is sort of command and control oriented? >> It's hard in the DoD, it's not easier in the DoD. 
Ultimately people respond to their performance incentives. That's the dirty secret: performance incentives, they work every time. So unless you restructure performance measures and incentives for people their behavior's never going to change. They need to see their personal future in the future you're prescribing. And if they don't see it, you're going to get resistance every time. They're going to do what they believe they're incented to do. Making those changes, cascading those performance measures down, has been difficult because much of the decision-making processes in the department have been based on slow-moving systems and slow-moving data. I mean think about it, our budget planning process was created by Robert McNamara, as the Secretary of Defense. It requires you to plan everything for five years. And it takes more than a year to plan a single year's worth of activities, it's slow-moving. And we have regulation, we have legislation, we're a law-abiding organization, we do what we have to do. All of those things slow things down. And there's a culture of expecting macro-level consensus building. Which means everybody feels they can say no. If everybody can say no, then change becomes peanut butter spread across an organization. When you peanut butter spread across something our size and scale, the layer's pretty thin. So we have the same problem that other organizations have. There is clearly a perception of top-down change and if the Secretary or the Deputy Secretary issue an instruction people will obey it. It just takes some time to work its way down into all the detailed combinations and permutations. 'Cause you have to make sophisticated decisions now. How am I going to change my performance measures for that group to that group? And that takes time and energy and thought. There's a natural sort of pipeline effect in this. So there's real tension I think in between this perception of top-down and people will obey the orders they're given. 
But when you're trying to integrate those changes into a broad set of policy and process and people, that takes time and energy. >> And as a result the leaders have to be circumspect about the orders they give because they want to see success. They want to make sure that what they say is actually implemented or it reflects poorly on the organization. >> I think that our leaders are absolutely concerned about accomplishing the outcomes that they set out. And I think that they are rightfully determined to get the change as rapidly as possible. I would not expect them to be circumspect. I would anticipate that they would be firm and clear in the direction that they set and they would set aggressive targets because you need aggressive targets to get aggressively changed outcomes. Now. >> But they would have to choose wisely, they can't just fire off orders and expect everything to be done. I would think that they've got to really think about what they want to get done, and put all the wood behind the arrow as you... >> I think that they constantly balance all those considerations. I must say, I did not appreciate before I joined the department the extraordinary caliber of leadership we enjoy. We have people with real insight and experience, and high intellectual horsepower making the decisions in the department. We've been blessed with the continuing stream of them at all of the senior ranks. These people could go anywhere, or do anything that they wanted in the economy and they've chosen to be in the department. And they bring enormous intellectual firepower to bear on challenges. >> Well you mentioned the motivation at the top of the segment, that's largely pretty powerful. >> Yeah, oh absolutely. >> I want to ask you, we have to break, but the organizational structure, you talked about the CIO, actually the responsibility for security within the CIO. >> Sure. >> To whom do you report? What's the organization look like? 
>> So I report to the Chief Management Officer of the Department of Defense. So if you think about the order of precedence, there's the Secretary of Defense, the Deputy Secretary of Defense and third in order is the Chief Management Officer. I report to the Chief Management Officer. >> As does the CIO, is that right? >> As does the CIO, as does the CIO. And actually this is quite typical in large organizations, that you don't have the CDO and the CIO in the same space because the concerns are very different. They have to collaborate but very different concerns. We used to see CDOs reporting to CIOs, that's fallen dramatically in terms of the frequency you see that. 'Cause we now recognize that's just a failure mode. So you don't want to go down that path. The number one most common reporting relationship is actually to a CEO, the chief executive officer, of an organization. It's all about, what executive is driving performance for the organization? That's the person the CDO should report to. And I'm blessed in that I do find myself reporting to the executive driving organizational improvement. For me, that's a critical thing. That would make the difference between whether I could succeed or whether I'm doomed to fail. >> COO would be common too in a commercial organization. >> Yeah, in certain commercial organizations, it's a COO. It just depends on the nature of the business and their maturity with data. But if you're in the... If data's the business, CDO will report to the CEO. There are other organizations where it'll be the COO or CFO, it just depends on the nature of that business. And in our case I'm quite fortunate. >> Well Michael, thank you for not only coming to theCUBE but the service you're providing to the country, we really appreciate your insights and... >> It's a pleasure meeting you. >> It's a pleasure meeting you. All right, keep it right there everybody, we'll be right back with our next guest. 
You're watching theCUBE live from MIT CDOIQ, be right back. (upbeat music)

Published Date : Jul 31 2019



Mark Ramsey, Ramsey International LLC | MIT CDOIQ 2019

>> From Cambridge, Massachusetts. It's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts, everybody. We're here at MIT, sweltering Cambridge, Massachusetts. You're watching theCUBE, the leader in live tech coverage, my name is Dave Vellante. I'm here with my co-host, Paul Gillin. Special coverage of the MIT CDOIQ. The Chief Data Officer event, this is the 13th year of the event, we started seven years ago covering it, Mark Ramsey is here. He's the Chief Data and Analytics Officer Advisor at Ramsey International, LLC and former Chief Data Officer of GlaxoSmithKline. Big pharma, Mark, thanks for coming onto theCUBE. >> Thanks for having me. >> You're very welcome, fresh off the keynote. Fascinating keynote this evening, or this morning. Lot of interest here, tons of questions. And we have some as well, but let's start with your history in data. I sat down after 10 years, but I could have stretched it to 20. I'll sit down with the young guns. But there were some folks in there with 30 plus year careers. How about you, what does your data journey look like? >> Well, my data journey, of course I was able to stand up for the whole time because I was in the front, but I actually started about 32, a little over 32 years ago and I was involved with building. What I always tell folks is that Data and Analytics has been a long journey, and the name has changed over the years, but we've been really trying to tackle the same problems of using data as a strategic asset. So when I started I was with an insurance and financial services company, building one of the first data warehouse environments in the insurance industry, and that was in the '87, '88 range, and then once I was able to deliver that, I ended up transitioning into being in consulting for IBM and basically spent 18 years with IBM in consulting and services. 
When I joined, the name had evolved from Data Warehousing to Business Intelligence and then over the years it was Master Data Management, Customer 360. Analytics and Optimization, Big Data. And then in 2013, I joined Samsung Mobile as their first Chief Data Officer. So, moving out of consulting, I really wanted to own the end-to-end delivery of advanced solutions in the Data Analytics space and so that made the transition to Samsung quite interesting, very much into consumer electronics, mobile phones, tablets and things of that nature, and then in 2015 I joined GSK as their first Chief Data Officer to deliver a Data Analytics solution. >> So you have long data history and Paul, Mark took us through. And you're right, Mark-o, it's a lot of the same narrative, same wine, new bottle but the technology's obviously changed. The opportunities are greater today. But you took us through Enterprise Data Warehouse which was ETL and then MAP and then Master Data Management which is kind of this mapping and abstraction layer, then an Enterprise Data Model, top-down. And then that all failed, so we turned to Governance which has been very very difficult and then you came up with another solution that we're going to dig into, but is it the same wine, new bottle from the industry? >> I think it has been over the last 20, 30 years, which is why I kind of did the experiment at the beginning of how long folks have been in the industry. I think that certainly, the technology has advanced, moving to reduction in the amount of schema that's required to move data so you can kind of move away from the map and move type of an approach of a data warehouse but it is tackling the same type of problems and like I said in the session it's a little bit like Einstein's phrase of doing the same thing over and over again and expecting a different answer is certainly the definition of insanity and what I really proposed at the session was let's come at this from a very different perspective. 
Let's actually use Data Analytics on the data to make it available for these purposes, and I do think it's a different wine now and so I think it's just now a matter of if folks can really take off and head that direction. >> What struck me about, you were ticking off some of the issues that have failed like Data Warehouses, I was surprised to hear you say Data Governance really hasn't worked because there's a lot of talk around that right now, but all of those are top-down initiatives, and what you did at GSK was really invert that model and go from the bottom up. What were some of the barriers that you had to face organizationally to get the cooperation of all these people in this different approach? >> Yeah, I think it's still key. It's not a complete bottoms up because then you do end up really just doing data for the sake of data, which is also something that's been tried and does not work. I think it has to be a balance and that's really striking that right balance of really tackling the data at full perspective but also making sure that you have very definitive use cases to deliver value for the organization and then striking the balance of how you do that and I think one of the things that becomes a struggle is you're talking about very large breadth and any time you're covering multiple functions within a business it's getting the support of those different business functions and I think part of that is really around executive support and what that means, I did mention it in the session, that executive support to me is really stepping up and saying that the data across the organization is the organization's data. It isn't owned by a particular person or a particular scientist, and I think in a lot of organizations, that gatekeeper mentality really does put barriers up to really tackling the full breadth of the data. >> So I had a question around digital initiatives. 
Everywhere you go, every C-level Executive is trying to get digital right, and a lot of this is top-down, a lot of it is big ideas and it's kind of the North Star. Do you think that that's the wrong approach? That maybe there should be a more tactical line of business alignment with that threaded leader as opposed to this big picture. We're going to change and transform our company, what are your thoughts? >> I think one of the struggles is just I'm not sure that organizations really have a good appreciation of what they mean when they talk about digital transformation. I think there's in most of the industries it is an initiative that's getting a lot of press within the organizations and folks want to go through digital transformation but in some cases that means having a more interactive experience with consumers and it's maybe through sensors or different ways to capture data but if they haven't solved the data problem it just becomes another source of data that we're going to mismanage and so I do think there's a risk that we're going to see the same outcome from digital that we have when folks have tried other approaches to integrate information, and if you don't solve the basic blocking and tackling having data that has higher velocity and more granularity, if you're not able to solve that because you haven't tackled the bigger problem, I'm not sure it's going to have the impact that folks really expect. >> You mentioned that at GSK you collected 15 petabytes of data of which only one petabyte was structured. So you had to make sense of all that unstructured data. What did you learn about that process? About how to unlock value from unstructured data as a result of that? >> Yeah, and I think this is something. 
I think it's extremely important in the unstructured data to apply advanced analytics against the data to go through a process of making sense of that information and a lot of folks talk about or have talked about historically around text mining of trying to extract an entity out of unstructured data and using that for the value. There's a few steps before you even get to that point, and first of all it's classifying the information to understand which documents do you care about and which documents do you not care about and I always use the story that in this vast amount of documents there's going to be, somebody has probably uploaded the cafeteria menu from 10 years ago. That has no scientific value, whereas a protocol document for a clinical trial has significant value, you don't want to look through manually a billion documents to separate those, so you have to apply the technology even in that first step of classification, and then there's a number of steps that ultimately lead you to understanding the relationship of the knowledge that's in the documents. >> Side question on that, so you had discussed okay, if it's a menu, get rid of it but there's certain restrictions where you got to keep data for decades. It struck me, what about work in process? Especially in the pharmaceutical industry. I mean, post Federal Rules of Civil Procedure was everybody looking for a smoking gun. So, how are organizations dealing with what to keep and what to get rid of? >> Yeah, and I think certainly the thinking has been to remove the excess and it's to your point, how do you draw the line as to what is excess, right, so you don't want to just keep every document because then if an organization is involved in any type of litigation and there's disclosure requirements, you don't want to have to have thousands of documents. At the same time, there are requirements and so it's like a lot of things. 
It's figuring out how do you abide by the requirements, but that is not an easy thing to do, and it really is another driver, certainly document retention has been a big thing over a number of years but I think people have not applied advanced analytics to the level that they can to really help support that. >> Another Einstein bromide, you know. Keep everything you must but no more. So, you put forth a proposal where you basically had this sort of three approaches, well, combined three approaches. The crawlers to go, the spiders to go out and do the discovery and I presume that's where the classification is done? >> That's really the identification of all of the source information >> Okay, so find out what you got, okay. >> So that's kind of the start. Find out what you have. >> Step two is the data repository. Putting that in, I thought it was when I heard you I said okay it must be a logical data repository, but you said you basically told the CIO we're copying all the data and putting it into essentially one place. >> A physical location, yes. >> Okay, and then so I got another question about that and then use bots in the pipeline to move the data and then you sort of drew the diagram of the back end to all the databases. Unstructured, structured, and then all the fun stuff up front, visualization. >> Which people love to focus on the fun stuff, right? Especially, you can't tell how many articles are on you got to apply deep learning and machine learning and that's where the answers are, we have to have the data and that's the piece that people are missing. >> So, my question there is you had this tactical mindset, it seems like you picked a good workload, the clinical trials and you had at least conceptually a good chance of success. Is that a fair statement? >> Well, the clinical trials was one aspect. Again, we tackled the entire data landscape. So it was all of the data across all of R&D. 
It wasn't limited to just, that's that top down and bottom up, so the bottom up is tackle everything in the landscape. The top down is what's important to the organization for decision making. >> So, that's actually the entire R&D application portfolio. >> Both internal and external. >> So my follow up question there is so that largely was kind of an inside the four walls of GSK, workload or not necessarily. My question was what about, you hear about these emerging Edge applications, and that's got to be a nightmare for what you described. In other words, putting all the data into one physical place, so it must be like a snake swallowing a basketball. Thoughts on that? >> I think some of it really does depend on you're always going to have these, IOT is another example where it's a large amount of streaming information, and so I'm not proposing that all data in every format in every location needs to be centralized and homogenized, I think you have to add some intelligence on top of that but certainly from an edge perspective or an IOT perspective or sensors. The data that you want to then make decisions around, so you're probably going to have a filter level that will impact those things coming in, then you filter it down to where you're going to really want to make decisions on that and then that comes together with the other-- >> So it's a prioritization exercise, and that presumably can be automated. >> Right, but I think we always have these cases where we can say well what about this case, and you know I guess what I'm saying is I've not seen organizations tackle their own data landscape challenges and really do it in an aggressive way to get value out of the data that's within their four walls. It's always like I mentioned in the keynote. It's always let's do a very small proof of concept, let's take a very narrow chunk. 
And what ultimately ends up happening is that becomes the only solution they build and then they go to another area and they build another solution and that's why we end up with 15 or 25-- (all talk over each other) >> The conventional wisdom is you start small. >> And fail. >> And you go on from there, you fail and that's now how you get big things done. >> Well that's not how you support analytic algorithms like machine learning and deep learning. You can't feed those just fragmented data of one aspect of your business and expect it to learn intelligent things to then make recommendations, you've got to have a much broader perspective. >> I want to ask you about one statistic you shared. You found 26 thousand relational database schemas for capturing experimental data and you standardized those into one. How? >> Yeah, I mean we took advantage of the Tamr technology that Michael Stonebraker created here at MIT a number of years ago which is really, again, it's applying advanced analytics to the data and using the content of the data and the characteristics of the data to go from dispersed schemas into a unified schema. So if you look across 26 thousand schemas using machine learning, you then can understand what's the consolidated view that gives you one perspective across all of those different schemas, 'cause ultimately when you give people flexibility they love to take advantage of it but it doesn't mean that they're actually doing things in an extremely different way, 'cause ultimately they're capturing the same kind of data. They're just calling things different names and they might be using different formats but in that particular case we use Tamr very heavily, and that again is back to my example of using advanced analytics on the data to make it available to do the fun stuff. The visualization and the advanced analytics. 
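[Editor's illustration, not part of the interview: the idea of using the content of the data, rather than column names, to go from dispersed schemas to a unified one can be sketched in a few lines. This is a toy under stated assumptions and is not how Tamr works internally; the function names, the Jaccard-overlap similarity, the 0.5 threshold, and the sample lab schemas are all hypothetical choices made for the example.]

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two value collections: |A & B| / |A | B| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def unify_columns(schemas, threshold=0.5):
    """Group columns from different schemas whose sample values overlap.

    schemas: {schema_name: {column_name: [sample values]}}  (hypothetical shape)
    Returns a sorted list of groups; each group is a sorted list of
    (schema_name, column_name) pairs that likely describe the same attribute.
    """
    cols = [(s, c, vals) for s, table in schemas.items() for c, vals in table.items()]
    parent = list(range(len(cols)))  # union-find forest over column indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge any two columns whose value overlap reaches the threshold.
    for i, j in combinations(range(len(cols)), 2):
        if jaccard(cols[i][2], cols[j][2]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i, (schema, column, _) in enumerate(cols):
        groups.setdefault(find(i), []).append((schema, column))
    return sorted(sorted(g) for g in groups.values())


# Two labs record the same kind of data under different column names.
example = {
    "lab_a": {"compound_id": ["C1", "C2", "C3"], "dose_mg": [5, 10, 20]},
    "lab_b": {"cmpd": ["C2", "C3", "C4"], "dosage": [10, 20, 40]},
}
unified = unify_columns(example)
```

[In this sketch the differently named columns end up grouped because their values overlap, which mirrors the point made in the answer above: people call things different names, but they are capturing the same kind of data. A production system would use far richer signals than raw value overlap.]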
>> So Mark, the last question is, you well know that the CDO role emerged in these highly regulated industries, and I guess in the case of pharma quasi-regulated industries, but now it seems to be permeating all industries. We have Goka-lan from McDonald's, and virtually every industry is at least thinking about this role or has some kind of de facto CDO. So if you were slotted into a CDO role, let's make it generic. I know it depends on the industry, but where do you start as a CDO for an organization, a large company, that doesn't have a CDO? Even a mid-sized organization, where do you start? >> Yeah, I mean, my approach is that a true CDO is maximizing the strategic value of data within the organization. It isn't a regulatory requirement. I know a lot of the banks started there 'cause they needed someone to be responsible for data quality and data privacy, but for me the most critical thing is understanding the strategic objectives of the organization and how data will be used differently in the future to drive decisions and actions and the effectiveness of the business. In some cases, there was a lot of discussion around monetizing the value of data. People immediately took that to mean, can we sell our data and make money as a different revenue stream? I'm not a proponent of that. It's internally monetizing your data. How do you triple the size of the business by using data as a strategic advantage, and how do you change the executives so that what is good enough today is not good enough tomorrow, because they are really focused on using data as their decision making tool? That to me is the difference that a CDO needs to make: really using data to drive those strategic decision points. >> And that nuance you mentioned I think is really important.
Inderpal Bhandari, who is the Chief Data Officer of IBM, often says, how can you monetize the data? And you're right, I don't think he means selling data. It's how does data contribute, if I could rephrase what you said, to the value of the organization. That can be cutting costs, that can be driving new revenue streams, that could be saving lives if you're a hospital, improving productivity. >> Yeah, and I think what I've typically shared with executives when I've been in the CDO role is that they need to change their behavior, right? If a CDO comes into an organization and a year later the executives are still making decisions on the same data PowerPoints with spinning logos and they said ooh, we've got to have 'em. If they're still making decisions that way, then the CDO has not been successful. The executives have to change what their level of expectation is in order to make a decision. >> Change agents, top down, bottom up, last question. >> Going back to GSK, now that they've completed this massive data consolidation project, how are things different for that business? >> Yeah, I mean, Hal Barron joined as the President of R&D about a year and a half ago, and his primary focus is using data and analytics and machine learning to drive the decision making in the discovery of a new medicine. The environment that has been created is a key component of that strategic initiative, and so they are actually completely changing the way they're selecting new targets for new medicines based on data and analytics. >> Mark, thanks so much for coming on theCUBE. >> Thanks for having me. >> Great keynote this morning, you're welcome. All right, keep it right there everybody. We'll be back with our next guest. This is theCUBE, Dave Vellante with Paul Gillin. Be right back from MIT. (upbeat music)

Published Date : Jul 31 2019
