Michele Goetz, Forrester Research | Collibra Data Citizens '21

>> From around the globe, it's theCUBE, covering Data Citizens '21. Brought to you by Collibra.

>> For the past decade, organizations have been effecting very deliberate data strategies and investing quite heavily in people, processes, and technology specifically designed to gain insights from data, better serve customers, and drive new revenue streams. We've heard this before. The results, quite frankly, have been mixed, as much of the effort is focused on analytics and technology designed to create a single version of the truth, which in many cases continues to be elusive. Moreover, the world of data is changing. Data is increasingly distributed, making collaboration and governance more challenging, especially where operational use cases are a priority. Hello, everyone. My name is Dave Vellante and you're watching theCUBE's coverage of Data Citizens '21. And we're pleased to welcome Michele Goetz, who's the vice president and principal analyst at Forrester Research. Hello, Michele. Welcome to theCUBE.

>> Hi, Dave. Thanks for having me today.

>> It's our pleasure. So I want to start: you serve a wide range of roles, including enterprise architects, CDOs, chief data officers that is, analysts, et cetera, and many data-related functions. And my first question is, what are they thinking about today? What's on their minds, these data experts?

>> So there's actually two things happening. One is, what is the demand that's placed on data for our new intelligent digital systems? So we're seeing a lot of investment and interest in things like edge computing. And then how does that intersect with artificial intelligence to really run your business intelligently and drive new value propositions, to be both adaptive to the market as well as resilient to changes that are unforeseen? The second thing is that you then create this massive complexity in managing the data, governing the data, and orchestrating the data, because it's not just a centralized data warehouse environment anymore.
You have a highly diverse and distributed landscape, both what you control internally and the third-party information you take advantage of. So really what the struggle then becomes is, how do you trust the data? How do you govern it, and secure, and protect that data? And then how do you ensure that it's hyper-contextualized to the types of value propositions that our intelligent systems are going to serve?

>> Well, I think you're hitting on the key issues here. I mean, you're right. The data, and I sort of refer to this as well, is sort of out there; it's distributed at the edge. But generally our data organizations are actually quite centralized. And as well, you talk about the need to trust the data; obviously that's crucial. But are you seeing the organization change? I know you're talking about this to clients, your discussion about collaboration. How are you seeing that change?

>> Yeah, so as you have to bring data into the context of the insights that you're trying to get, or the intelligence that's automating and scaling out the value streams and outcomes within your business, we're actually seeing a federated model emerge in organizations. So there's still a centralized data management and data services organization, typically led by enterprise architects for data, with a data engineering team that's managing warehouses and data lakes. They're creating this great platform to access and orchestrate information. But we're also seeing data, analytics, and governance teams come together under chief data officers or chief data and analytics officers. And this is really where the insights are being generated, from either BI and analytics or from data science itself, with dedicated data engineers and stewards that are helping to access and prepare data for analytic efforts. And then lastly, and this is the really interesting part, when you push data into the edge the goal is that you're actually driving an experience and an application.
And so in that case we are seeing data engineering teams starting to be incorporated into the solutions teams that are aligned to lines of business or divisions themselves. And so really what's happening is, if there is a solution consultant who is also overseeing value-based portfolio management, when you need to instrument the data for these new use cases and keep up with the pace of the business, it's this engineering team that is part of the DevOps workbench to execute on that. So really the balance is: we need the core, we need to get to the insights and build our models for AI, and then the next piece is, how do you activate all that? And there's a team over there to help. So it's really spreading the wealth and expertise where it needs to go.

>> Yeah, I love that. You touched on a couple of things that really resonated with me. You talked about context a couple of times and this notion of a federated model, because historically the sort of big data architecture, the team, they didn't have the context, the business context, and my inference is that's changing, and I think that's critical. Your talk at Data Citizens is called "How Obsessive Collaboration Fuels Scalable DataOps." You talk about the data, the DevOps team. What's the premise you put forth to the audience?

>> So the point about obsessive collaboration is sort of taking the hubris out of your expertise on the data. Certainly there's a recognition by data professionals that the business understands and owns their data. They know the semantics, they know the context of it, and just receiving the requirements on that was assumed to be okay. And then you could provide a data foundation, whether it's just a lake or whether you have a warehouse environment where you're pulling for your analytics. The reality is that as we move into more of an AI and machine learning type of model, one, more context is necessary.
And you're kind of balancing between what are the things that you can ascribe to the data globally, which is what data engineers can support, and then what is unique about the data and the context of the data that is related to the business value and outcome, as well as the feature engineering that is being done on the machine learning models. So there has to be a really tight link and collaboration between the data engineers, the data scientists and analysts, and the business stakeholders themselves. You see a lot of pods starting up that way to build the intelligence within the system. And then lastly, what do you do with that model? What do you do with that data? What do you do with that insight? You now have to shift your collaboration over to the workbench that is going to pull all these components together to create the experiences and the automation that you're looking for. And that requires a different collaboration model around software development, while still incorporating the business expertise from those stakeholders, so that you're satisfying not only the quality of the code to run the solution, but the quality of the outcome that meets the expectation and the time to value that your stakeholders have. So data teams aren't just sitting in the basement or in another part of the organization, digitally disconnected, anymore. You're finding that they're having to work much more closely and side by side with their colleagues and stakeholders.

>> I think it's clear that you understand this space really well. Hubris out, context in. I mean, that's kind of what's been lacking. And I'm glad you used the word "anymore," because I think it's a recognition that that's kind of what it was. They were down in the basement or out in some kind of silo. And I think, and I want to ask you this.
I come back to organization because I think a lot of organizations say, look, the most cost-effective way for us to serve the business is to have a single data team with hyper-specialized roles. That'll be the cheapest way, the most efficient way that we can serve them. And meanwhile, the business, which as you pointed out has the context, is frustrated. They can't get to data. So this notion of a federated governance model is actually quite interesting. Are you seeing actual common use cases where this is being operationalized?

>> Absolutely. I think the first place that you were seeing it was within the operational technology use cases. Those are the use cases around manufacturing and industrial devices. Any sort of IoT-based use case really recognized that, without applying data and intelligence to whatever process was going to be executed, it was really going to be challenging to know that you're creating the right foundation, meeting the SLA requirements, and then ultimately bringing the right quality and integrity to the data, let alone any sort of data protection and regulatory compliance that is necessary. So you already started seeing the solution teams coming together with the data engineers, the solution developers, the analysts and data scientists, and the business stakeholders to drive that. But that is starting to come back down into more of the IT mindset as well. And so DataOps starts to emerge from that paradigm into more of the corporate types of use cases and sort of parrot that, because there are customer experience use cases that have an IoT or edge component too. We live on our smartphones, we live on our smart watches, we've got our laptops. All of us have been put into virtual collaboration. And so we really need to take into account not just the insight of analytics but how you feed that forward.
And so this is really where you're seeing sort of the evolution of DataOps as a competency, not only to engineer the data and collaborate but to ensure that there's sort of an activation and alignment where the value is going to come out, while still being trusted and governed.

>> I got kind of a weird question, but I'm going. I was talking to somebody in Israel the other day and they told me masks are off, the economy's booming. And he noted that Israel said, hey, we're going to pay up for the price of a vaccine. The cost per dose, about 28 bucks or whatever it was. And he pointed out that the EU haggled big time and they don't want to pay $19. And as a result they're not as far along. Israel understood that the real value was opening up the economy. And so there's an analogy here, which is why I want to come back to organization, and it relates to DataOps. The real metric is: hey, I have an idea for a data product. How long does it take to go from idea to monetization? That seems to me to be a better KPI than how much storage I have, or how many petabytes I'm managing. So my question is, and it relates to DataOps: can that DataOps individual, should that DataOps individual, maybe live, and then maybe even the data engineer live, inside of the business? And is that even feasible technically with this notion of federated governance? Are you seeing that? And maybe talk a little bit more about this DataOps role. Is it.

>> Yeah.

>> Fungible.

>> Yeah, it's definitely fungible. And in fact, when I talked about sort of those three units, there's your core enterprise data services, there's your BI and data, and then there's your line of business. In all of those, the engineering and the ops is the DataOps, which is living in all of those environments and being as close as possible to where the value proposition is being defined and designed. So absolutely being able to federate that.
And I think the other piece on DataOps that is really important is recognizing how the practices around continuous integration and continuous deployment, using agile methodologies, are really reshaping a lot of the waterfall approaches that were done before, where data was lagging 12 to 18 months behind any sort of insights. A lot of the platforms today assume that you're moving into a standard, mature software development life cycle. And you can start seeing returns on investment within a quarter, really, so that you can iterate and then speed that up so that you're delivering new value every two weeks. But it does change the mindset. This DataOps team, aligned to solution development and to a broader portfolio management of business capabilities and outcomes, needs to understand how to appropriately scope the data products that they're delivering to incremental value-based milestones, so the business feels that they're getting improvements over time and not just waiting. So there's an MVP, you move forward on that and optimize, optimize, extend, scale. So again, that CI/CD mindset is helping to not bottleneck and wait for the complete field of dreams to come from your data and your insights.

>> Thank you for that, Michele. I want to come back to this idea of collaboration because over the last decade we've seen attempts, I've seen software come out to try to help the various roles collaborate, and some of it's been okay, but you have these hyper-specialized roles. You've got data scientists, data engineers, quality engineers, analysts, et cetera. And they tend to be in their own little worlds. But at the end of the day we rely on them all to get answers. So how can these data scientists, all these stewards, how can they collaborate better? What are you seeing there?

>> You need to get them onto the same process. That's really what it comes down to. If you're working from different points of view, that's one thing.
But if you're working from different processes, collaborating is really challenging. And I think the one thing that's really come out of this move to machine learning and AI is recognizing that you need processes that reinforce collaboration. So that's number one. So you see agile development and CI/CD not just for DataOps, not just for DevOps, but also encouraging and propelling these projects and iterations for the data science teams as well, or even machine learning engineers if they're incorporated. And then certainly the business stakeholders are inserted within there as appropriate to accept what it is that is going to be developed. So process is number one. And number two is, what is the platform that's going to reinforce those processes and collaboration? And it's really about what's being shared. How do you share? So certainly what we're seeing within the platforms themselves is everybody contributing into some sort of a library where their components and products are being ascribed to, and then that's able to help different teams grab those components and build out what those solutions are going to be. And in fact, what gets really cool about that is you don't always need hardcore data scientists anymore, as you have this social platform for data product and analytic product development. This is where a lot of the AutoML begins, because those who are less data science-oriented but can build an insight pipeline can grab all the different components, from the pipelines to the transformations, to capture mechanisms, to bolting into the model itself, and allow that to be delivered to the application. So it's really kind of balancing out between process and platforms that enable and encourage, and almost force, you to collaborate and manage through sharing.

>> Thank you for that. I want to ask you about the role of data governance. You've mentioned trust, and that's data quality, and you've got teams and specialists that are focused on data quality.
There's the data catalog. Here's my question. You mentioned edge a couple of times and I can see a lot of that. I mean, today, a lot of the value in AI, I would say most, is modeling. And in the future, you mentioned edge, it's going to be a lot of inferencing in real time. And people may not have the time or be involved in that decision. So what are you seeing in terms of data governance? We talked about federated governance, this notion of a data catalog, and maybe automating data quality without necessarily having it be so labor intensive. What are you seeing as the trends there?

>> Yeah, so I think our new environment, our new normal, is that you have to be composable, interoperable, and portable. Portability is really the key here. So from a cataloging perspective and governance, we would bring everything together into our catalogs and business glossaries. And it would be a reference point; it was like a massive wiki. Well, that's wonderful, but why just house it in a museum? You really want to activate that. And I think what's interesting about the technologies today for governance is that you can turn those rules, and business logic, and policies into services that are composable components and bring those into the solutions that you're defining. And in that way what happens is that creates portability. You can drive them wherever they need to go. But from the composability and the interoperability portion of that, you can put those services in the right place at the right time for what you need for an outcome, so that you start to become behaviorally driven in executing on governance rather than trying to write all of the governance down into transformations and controls where the data lives. You can have quality, and observability of that quality and performance, right at the edge and in the context of behavior and use of that solution.
You can run those services and governance on gateways that are managing and routing information at those edge solutions, when synchronization between the edge and the cloud comes up. And if it's appropriate, during synchronization of the data back into the data lake you can run those services there. So there's a lot more flexibility and elasticity in today's modern approaches to cataloging, and glossaries, and governance of data than we had before. And that goes back into what we talked about earlier: this is the new wave of DataOps. This is how you bring data products to fruition now. Everything is about activation.

>> So how do you see the future of DataOps? I mean, I've kind of been pushing you to a more decentralized model where the business has more control, 'cause the business has the context. I mean, I feel as though, hey, we've done a great job of contextualizing our operational systems. The sales team, they know when the data is crap within my CRM, but our data systems are context agnostic generally. And you obviously understand that problem well. But so how do you see the future of DataOps?

>> So I think what's kind of interesting about that is we're going to go to governance on read versus governance on write, more so. What do I mean by that? That means that from a business perspective there's two sides of it. There's ensuring that where governance is run is, as we talked about before, executing at the appropriate place at the appropriate time. It's semantically domain-centric driven, not logical and systems-centric. So that's number one. Number two is also recognizing that business owners or business operations actually play a role in this, because as you're working within your CRM systems, like a Salesforce, for example, you're using an iPaaS like MuleSoft to connect to other applications, connect to other data sources, connect to other analytics sources.
And what's happening there is that the data is being modeled and personalized to whatever view, insight, or task has to happen within those processes. So even CRM environments, which we think of as sort of traditional technologies that we're used to, are getting a lift, both in terms of intelligence from the data and in terms of your flexibility in how you execute governance and quality services within that environment. And that actually opens up the data foundations a lot more and saves you from having to do a lot of moving, copying, and centralizing of data and creating an over-weighted business application, both in terms of the data foundation and in terms of the types of business services, and status updates, and processes that happen in the application itself. You're drawing those tasks back down to where they should be and where performance can be managed, rather than trying to over-customize your application environment. And that gives you a lot more flexibility later too for any sort of upgrades or migrations that you want to make, because all of the logic is contained back down in a service layer instead.

>> Great perspectives, Michele. You obviously know your stuff and it's been a pleasure having you on. My last question is, when you look out there, is there anything that really excites you, or any specific research that you're working on that you want to share, that you're super pumped about?

>> I think there's two things. One is, it's truly incredible the amount of insight and growth that is coming through data profiling and observation. Really understanding and contextualizing data anomalies so that you understand whether data is helping or hurting the business value, and tying it very specifically to processes and metrics, which is fantastic, as well as to models themselves, really understanding how data inputs and outputs are making a difference in whether the model performs or not.
And then I think the second thing is really the emergence of more active data, active insights. And, as we talked about before, your ability to package up services for governance and quality in particular, which allow you to scale your data out towards the edge or where it's needed. And doing so not just so that you can run analytics but so that you're also driving overall processes and value. So the research around the operationalization and activation of data is really exciting. And looking at the networks and service mesh to bring those things together is kind of where I'm focusing right now, because what's the point of having data in a database if it's not providing any value?

>> Michele Goetz, Forrester Research, thanks so much for coming on theCUBE. Really awesome perspectives. You're in an exciting space, so appreciate your time.

>> Absolutely, thank you.

>> And thank you for watching Data Citizens '21 on theCUBE. My name is Dave Vellante. (upbeat music)

Published Date : Jun 17 2021
