Laura Sellers, Collibra | Data Citizens 22

>> Welcome to theCUBE's Virtual Coverage of Data Citizens 2022. My name is Dave Vellante and I'm here with Laura Sellers who is the Chief Product Officer at Collibra, the host of Data Citizens, Laura, welcome. Good to see you. >> Thank you. Nice to be here. >> Yeah, your keynote at Data Citizens this year focused on you know, your mission to drive ease of use and scale. Now, when I think about historically fast access to the right data at the right time in a form that's really easily consumable it's been kind of challenging especially for business users. Can you explain to our audience why this matters so much and what's actually different today in the data ecosystem to make this a reality? >> Yeah, definitely. So I think what we really need and what I hear from customers every single day is that we need a new approach to data management and our product teams. What inspired me to come to Collibra a little bit over a year ago, was really the fact that they're very focused on bringing trusted data to more users across more sources for more use cases. And so as we look at what we're announcing with these innovations of ease of use and scale it's really about making teams more productive in getting started with and the ability to manage data across the entire organization. So we've been very focused on richer experiences, a broader ecosystem of partners, as well as a platform that delivers performance, scale and security that our users and teams need and demand. So as we look at, oh, go ahead. >> I was going to say, you know, when I look back at like the last 10 years it was all about getting the technology to work and it was just so complicated, but, but please carry on. I'd love to hear more about this. >> Yeah, I really, you know, Collibra is a system of engagement for data and we really are working on bringing that entire system of engagement to life for everyone to leverage here and now. 
So what we're announcing from our ease of use side of the world is first our data marketplace. This is the ability for all users to discover and access data quickly and easily shop for it, if you will. The next thing that we're also introducing is the new homepage. It's really about the ability to drive adoption and have users find data more quickly. And then the two more areas of the ease of use side of the world is our world of usage analytics. And one of the big pushes and passions we have at Collibra is to help with this data-driven culture that all companies are trying to create. And also helping with data literacy. With something like usage analytics, it's really about driving adoption of the Collibra platform, understanding what's working, who's accessing it, what's not. And then finally we're also introducing what's called Workflow Designer. And we love our workflows at Collibra, it's a big differentiator to be able to automate business processes. The Designer is really about a way for more people to be able to create those workflows, collaborate on those workflows, as well as people to be able to easily interact with them. So a lot of exciting things when it comes to ease of use to make it easier for all users to find data. >> Yes, there's definitely a lot to unpack there. You know, you mentioned this idea of shopping for the data. That's interesting to me. Why this analogy, metaphor or analogy, I always get those confused. Let's go with analogy. Why is it so important to data consumers? >> I think when you look at the world of data, and I talked about this system of engagement, it's really about making it more accessible to the masses. And what users are used to is a shopping experience like your Amazon, if you will. And so having a consumer grade experience where users can quickly go in and find the data, trust that data, understand where the data's coming from and then be able to quickly access it, is the idea of being able to shop for it. 
Just making it as simple as possible and really speeding the time to value for any of the business analysts, data analysts out there. >> Yeah, I think you see a lot of discussion about rethinking data architectures, putting data in the hands of the users and business people, decentralized data and of course that's awesome. I love that. But of course then you have to have self-service infrastructure and you have to have governance. And those are really challenging. And I think so many organizations, they're facing adoption challenges. You know, when it comes to enabling teams generally, especially domain experts to adopt new data technologies you know, like the tech comes fast and furious. You got all these open source projects and it gets really confusing. Of course it risks security, governance and all that good stuff. You got all this jargon. So where do you see, you know, the friction in adopting new data technologies? What's your point of view, and how can organizations overcome these challenges? >> You're, you're dead on. There's so much technology and there's so much to stay on top of, which is part of the friction, right? Is just being able to stay ahead of and understand all the technologies that are coming. You also look at it as there's so many more sources of data and people are migrating data to the cloud and they're migrating to new sources. Where the friction comes is really that ability to understand where the data came from, where it's moving to and then also to be able to put the access controls on top of it. So people are only getting access to the data that they should be getting access to. So one of the other things we're announcing with, with all of the innovations that are coming is what we're doing around performance and scale. So with all of the data movement, with all of the data that's out there, the first thing we're launching in the world of performance and scale is our world of data quality. 
It's something that Collibra has been working on for the past year and a half, but we're launching the ability to have data quality in the cloud. So it's currently an on-premise offering, but we'll now be able to carry that over into the cloud for us to manage that way. We're also introducing the ability to push down data quality into Snowflake. So this is, again, one of those challenges is making sure that that data that you have is, is high quality as you move forward. And so really another, we're just reducing friction. You already have Snowflake stood up, it's not another machine for you to manage, it's just push-down capabilities into Snowflake to be able to track that quality. Another thing that we're launching with that is what we call Collibra Protect. And this is that ability for users to be able to ingest metadata, understand where the PII data is and then set policies up on top of it. So very quickly be able to set policies and have them enforced at the data level. So anybody in the organization is only getting access to the data they should have access to. >> This topic of data quality is interesting. It's something that I've followed for a number of years. It used to be a back office function, you know, and really confined only to highly regulated industries like financial services and healthcare and government. You know, you look back over a decade ago, you didn't have this worry about personal information, GDPR, and you know, California Consumer Privacy Act all become so much more important. The cloud has really changed things in terms of performance and scale. And of course partnering for, with Snowflake, it's all about sharing data and monetization, anything but a back office function. So it was kind of smart that you guys were early on and of course attracting them as an investor as well was very strong validation. 
What can you tell us about the nature of the relationship with Snowflake and specifically interested in sort of joint engineering and product innovation efforts, you know, beyond the standard go-to-market stuff? >> Definitely. So you mentioned they're a strategic investor in Collibra about a year ago. A little less than that I guess. We've been working with them though for over a year really tightly with their product and engineering teams to make sure that Collibra is adding real value. Pieces of our unified platform are touching all pieces of Snowflake. And when I say that, what I mean is we're first, you know, able to ingest data with Snowflake, which has always existed. We're able to profile and classify that data. We're announcing with Collibra Protect this week that you're now able to create those policies on top of Snowflake and have them enforced. So again, people can get more value out of their Snowflake more quickly, as far as time to value with our policies for all business users to be able to create. We're also announcing Snowflake Lineage 2.0. So this is the ability to take stored procedures in Snowflake and understand the lineage of where did the data come from, how was it transformed, within Snowflake as well as the data quality push-down, as I mentioned, data quality, you brought it up. It is a new, it is a big industry push and you know, one of the things I think Gartner mentioned is people are losing up to $15 million without having great data quality. So this push-down capability for Snowflake really is again a big ease of use push for us at Collibra of that ability to, to push it into Snowflake, take advantage of the data, the data source and the engine that already lives there, and get the right, and make sure you have the right quality. 
>> I mean the nice thing about Snowflake if you play in the Snowflake sandbox, you, you can get sort of a, you know, high degree of confidence that the data sharing can be done in a safe way. Bringing, you know, Collibra into the, into the story allows me to have that data quality and that governance that I, that I need. You know, we've said many times on theCUBE that one of the notable differences in cloud this decade versus last decade I mean there are obvious differences just in terms of scale and scope, but it's shaping up to be about the strength of the ecosystems. That's really a hallmark of these big cloud players. I mean they're, it's a key factor for innovating, accelerating product delivery, filling gaps in the hyperscale offerings. Because you got more stack, you know, mature stack capabilities and you know, that creates this flywheel momentum as we often say. But, so my question is, how do you work with the hyperscalers? Like whether it's AWS or Google or whomever, and what do you see as your role and what's the Collibra sweet spot? >> Yeah, definitely. So, you know, one of the things I mentioned early on is the broader ecosystem of partners is what it's all about. And so we have that strong partnership with Snowflake. We also are doing more with Google around, you know, GCP and Collibra Protect there, but also tighter Dataplex integration. So similar to what you've seen with our strategic moves around Snowflake, and really covering the broad ecosystem of what Collibra can do on top of that data source. We're extending that to the world of Google as well and the world of Dataplex. We also have great partners in SIs. Infosys is somebody we spoke with at the conference who's done a lot of great work with Levi's, as they're really important to help people with their whole data strategy and driving that data-driven culture and Collibra being the core of it. 
>> Hi Laura, we're going to, we're going to end it there but I wonder if you could kind of put a bow on, you know, this year, the event your, your perspectives. So just give us your closing thoughts. >> Yeah, definitely. So I, I want to say this is one of the biggest releases Collibra's ever had. Definitely the biggest one since I've been with the company a little over a year. We have all these great new product innovations coming to really drive the ease of use, to make data more valuable for users everywhere and, and companies everywhere. And so it's all about everybody being able to easily find, understand and trust and get access to that data going forward. >> Well congratulations on all the progress. It was great to have you on theCUBE. First time, I believe. And really appreciate you, you taking the time with us. >> Yes, thank you, for your time. >> You're very welcome. Okay, you're watching the coverage of Data Citizens 2022 on theCUBE your leader in enterprise and emerging tech coverage.

Published Date : Nov 2 2022

ENTITIES

Entity	Category	Confidence
Laura	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Laura Sellers	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Collibra	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
California Consumer Privacy Act	TITLE	0.99+
AWS	ORGANIZATION	0.99+
GDPR	TITLE	0.99+
Infosys	ORGANIZATION	0.99+
Snowflake	ORGANIZATION	0.99+
Dataplex	ORGANIZATION	0.99+
one	QUANTITY	0.99+
first	QUANTITY	0.98+
Data Citizens	ORGANIZATION	0.97+
this year	DATE	0.97+
this week	DATE	0.95+
Levi's	ORGANIZATION	0.94+
Snowflake	TITLE	0.94+
past year and a half	DATE	0.94+
First time	QUANTITY	0.94+
Gartner	ORGANIZATION	0.93+
last decade	DATE	0.93+
two more areas	QUANTITY	0.91+
today	DATE	0.91+
GCP	ORGANIZATION	0.86+
up to $15 million dollars	QUANTITY	0.86+
a year ago	DATE	0.85+
first thing	QUANTITY	0.83+
Data Citizens 22	ORGANIZATION	0.83+
about a year ago	DATE	0.83+
over a decade ago	DATE	0.82+
Collibra Protect	ORGANIZATION	0.82+
over a year	QUANTITY	0.81+
theCUBE	ORGANIZATION	0.81+
Snowflake	EVENT	0.8+
Snowf	TITLE	0.79+
Data Citizens 2022	EVENT	0.76+
over	DATE	0.72+
last 10 years	DATE	0.7+
Data	EVENT	0.67+
Snowflake Lineage 2.0	TITLE	0.64+
Protect	COMMERCIAL_ITEM	0.63+
decade	DATE	0.62+
single day	QUANTITY	0.62+
Data Citizens 2022	TITLE	0.53+
Citizens	ORGANIZATION	0.52+

Stijn Christiaens, Collibra, Data Citizens 22

(Inspiring rock music) >> Hey everyone, I'm Lisa Martin covering Data Citizens 22 brought to you by Collibra. This next conversation is going to focus on the importance of data culture. One of our Cube alumni is back; Stijn Christiaens is Collibra's co-founder and its Chief Data Citizen. Stijn, it's great to have you back on theCUBE. >> Hey Lisa, nice to be here. >> So we're going to be talking about the importance of data culture, data intelligence, maturity, all those great things. When we think about the data revolution that every business is going through, you know, it's so much more than technology innovation; it also really requires cultural transformation, community transformation. Those are challenging for customers to undertake. Talk to us about what you mean by data citizenship and the role that creating a data culture plays in that journey. >> Right. So as you know, our event is called Data Citizens because we believe that, in the end, a data citizen is anyone who uses data to do their job. And we believe that in today's organizations you have a lot of people, most of the employees in an organization, are somehow going to be a data citizen, right? So you need to make sure that these people are aware of it, you need to make sure that these people have the skills and competencies to do with data what is necessary, and that's on all levels, right? So what does it mean to have a good data culture? It means that if you're building a beautiful dashboard to try and convince your boss we need to make this decision, that your boss is also open to and able to interpret, you know, the data presented in the dashboard to actually make that decision and take that action. Right? And once you have that "Why" to the organization that's when you have a good data culture. That's a continuous effort for most organizations because they're always moving somehow, they're hiring new people. 
And it has to be a continuous effort because we've seen that, on the one hand, organizations continue to be challenged with controlling their data sources and where all the data is flowing, right? Which in itself creates a lot of risk, but also on the other hand of the equation, you have the benefits, you know, you might look at regulatory drivers like we have to do this, right? But it's, it's much better right now to consider the competitive drivers for example. And we did an IDC study earlier this year, quite interesting, I can recommend anyone to read it, and one of the conclusions they found as they surveyed over a thousand people across organizations worldwide, is that the ones who are higher in maturity, so the organizations that really look at data as an asset, look at data as a product and actively try to be better at it, have three times as good a business outcome as the ones who are lower on the maturity scale, right? So you can say, okay, I'm doing this, you know, data culture for everyone, awakening them up as data citizens. I'm doing this for competitive reasons. I'm doing this for regulatory reasons. You're trying to bring both of those together. And the ones that get data intelligence, right, are just going to be more successful and more competitive. That's our view and that's what we're seeing out there in the market. >> Absolutely. We know that just generally, Stijn, right, the organizations that are really creating a data culture and enabling everybody within the organization to become data citizens are, we know that, in theory, they're more competitive, they're more successful, but the IDC study that you just mentioned demonstrates they're three times more successful and competitive than their peers. Talk about how Collibra advises customers to create that community, that culture of data when it might be challenging for an organization to adapt culturally. 
>> Of course, of course it's difficult for an organization to adapt, but it's also necessary as you just said, imagine that, you know, you're a modern day organization, phones, laptops, what have you. You're not using those IT assets, right? Or you know, you're delivering them throughout the organization, but not enabling your colleagues to actually do something with that asset. Same thing is true with data today, right, if you're not properly using the data asset, and your competitors are, they're going to get more advantage. So as to how you get this done or how you establish this culture there's a few angles to look at, I would say. So one angle is obviously the leadership angle whereby whoever is the boss of data in the organization you typically have multiple bosses there, like a Chief Data Officer, sometimes there's multiple, but they may have a different title, right? So I'm just going to summarize it as a data leader for a second. So whoever that is, they need to make sure that there's a clear vision, a clear strategy for data. And that strategy needs to include the monetization aspect. How are you going to get value from data? >> Lisa: Yes. >> Now, that's one part because then you can clearly see the example of your leadership in the organization, and also the business value, and that's important because those people, their job, in essence, really is to make everyone in the organization think about data as an asset. And I think that's the second part of the equation of getting that right is it's not enough to just have that leadership out there but you also have to get the hearts and minds of the data champions across the organization. You really have to win them over. 
And if you have those two combined, and obviously good technology to, you know, connect those people and have them execute on their responsibilities such as a data intelligence platform like Collibra's, then you have the pieces in place to really start upgrading that culture inch by inch, if you will. >> Yes, I like that. The recipe for success. So you are the co-founder of Collibra. You've worn many different hats along this journey. Now you're building Collibra's own data office. I like how, before we went live, we were talking about Collibra is drinking its own champagne. I always loved to hear stories about that. You're speaking at Data Citizens 2022. Talk to us about how you are building a data culture within Collibra and what, maybe some of the specific projects are that Collibra's data office is working on. >> Yes. And it is indeed Data Citizens. There are a ton of speakers here, very excited. You know, we have Barb from MIT speaking about data monetization. We have DJ Patil at the last minute on the agenda so really exciting agenda, can't wait to get back out there. But essentially you're right. So over the years at Collibra, we've been doing this now since 2008, so a good 15 years, and I think we have another decade of work ahead in the market, just to be very clear. Data is here to stick around, as are we, and myself, you know, when you start a company we were four people in a garage, if you will, so everybody's wearing all sorts of hats at that time. But over the years I've run pre-sales at Collibra, I've run post sales, partnerships, product, et cetera, and as our company got a little bit biggish, we're now 1,200 something like that, people in the company I believe, systems and processes become a lot more important, right? So we said, you know, Collibra isn't the size of our customers yet, but we're getting there in terms of organization, structure, process systems et cetera. 
So we said it's really time for us to put our money where our mouth is, and to set up our own data office, which is what we were seeing that all of our customers are doing, and which is what we're seeing that organizations worldwide are doing and Gartner was predicting as well. They said, okay, organizations have an HR unit, they have a finance unit, and over time they'll all have a department, if you will, that is responsible somehow for the data. >> Lisa: Hm. >> So we said, okay, let's try to set an example with Collibra. Let's set up our own data office in such a way that other people can take away with it, right? Can take away from it? So we set up a data strategy, we started building data products, took care of the data infrastructure, that sort of good stuff. And in doing all of that, Lisa, exactly as you said, we said, okay, we need to also use our own products and our own practices, right? And from that use, learn how we can make the product better, learn how we can make the practice better and share that learning with all of the markets, of course. And on Monday mornings, we sometimes refer to that as eating our own dog food; Friday evenings, we refer to that as drinking our own champagne. >> Lisa: I like it. >> So we, we had a (both chuckle) We had the drive to do this, you know, there's a clear business reason, so we involved, we included that in the data strategy and that's a little bit of our origin. Now how, how do we organize this? We have three pillars, and by no means is this a template that everyone should follow. This is just the organization that works at our company, but it can serve as an inspiration. So we have pillars, which is data science, the data product builders, if you will, or the people who help the business build data products, we have the data engineers who help keep the lights on for that data platform to make sure that the products, the data products, can run, the data can flow and, you know, the quality can be checked. 
And then we have a data intelligence or data governance pillar where we have those data governance data intelligence stakeholders who help the business as a sort of data partners to the business stakeholders. So that's how we've organized it. And then we started following the Collibra approach, which is, well, what are the challenges that our business stakeholders have in HR, finance, sales, marketing all over? And how can data help overcome those challenges? And from those use cases, we then just started to build a roadmap, and started execution on use case after use case. And a few important ones there are very simple, we see them with all our customers as well, people love talking about the catalog, right? The catalog for the data scientists to know what's in their data lake, for example, and for the people in legal and privacy, so they have their process registry, and they can see how the data flows. So that's a popular starting place and that turns into a marketplace so that if new analysts and data citizens join Collibra, they immediately have a place to go to to look at what data is out there for me as an analyst or data scientist or whatever, to do my job, right? So they can immediately get access to the data. And another one that we did is around trusted business reporting. We're seeing that, since 2008, you know, self-service BI allowed everyone to make beautiful dashboards, you know, pie charts. I always, my pet peeve is the pie charts because I love pie, and you shouldn't always be using pie charts, but essentially there's been a proliferation of those reports. And now executives don't really know, okay, should I trust this report or that report? They're reporting on the same thing but the numbers seem different, right? So that's why we have trusted business reporting. 
So we know if the reports, the dashboard, a data product essentially, is built, we know that all the right steps are being followed, and that whoever is consuming that can be quite confident in the result. >> Lisa: Right, and that confidence is absolutely key. >> Exactly. Yes. >> Absolutely. Talk a little bit about some of the key performance indicators that you're using to measure the success of the data office. What are some of those KPIs? >> KPIs and measuring is a big topic in the chief data officer profession I would say, and again, it always varies, with respect to your organization, but there's a few that we use that might be of interest to you. So remember you have those three pillars, right? And we have metrics across those pillars. So, for example, a pillar on the data engineering side is going to be more related to that uptime, right? Is the data platform up and running? Are the data products up and running? Is the quality in them good enough? Is it going up? Is it going down? What's the usage? But also, and especially if you're in the cloud and if consumption's a big thing, you have metrics around cost, for example, right? So that's one set of examples. Another one is around the data science and the products. Are people using them? Are they getting value from it? Can we calculate that value from a monetary perspective, right? >> Lisa: Yes. >> So that we can, to the rest of the business, continue to say, "We're tracking all those numbers and those numbers indicate that value is generated" and how much value estimated in that region. And then you have some data intelligence, data governance metrics, which is, for example you have a number of domains in a data mesh [Indistinct] People talk about being the owner of a data domain for example, like product or customer. So how many of those domains do you have covered? How many of them are already part of the program? How many of them have owners assigned? 
How well are these owners organized, executing on their responsibilities? How many tickets are open? Closed? How many data products are built according to process? And so on and so forth, so these are a set of examples of KPIs. There's a lot more but hopefully those can already inspire the audience. >> Absolutely. So we've, we've talked about the rise of chief data offices, it's only accelerating. You mentioned this is like a 10-year journey. So if you were to look into a crystal ball, what do you see, in terms of the maturation of data offices over the next decade? >> So we, we've seen, indeed, the role sort of grow up. I think in 2010 there may have been like, 10 chief data officers or something, Gartner has exact numbers on them. But then they grew, you know, 400's they were like mostly in financial services, but they expanded then to all industries and the number is estimated to be about 20,000 right now. >> Wow. >> And they evolved in a sort of stack of competencies, defensive data strategy, because the first chief data officers were more regulatory driven, offensive data strategy, support for the digital program and now all about data products, right? So as a data leader, you now need all those competences and need to include them in your strategy. How is that going to evolve for the next couple of years? I wish I had one of those crystal balls, right? But essentially, I think for the next couple of years there's going to be a lot of people, you know, still moving along with those four levels of the stack. A lot of people I see are still in version one and version two of the chief data officers. So you'll see, over the years that's going to evolve more digital and more data products. So for the next three, five years, my prediction is it's all going to be about data products because it's an immediate link between the data and the dollar essentially. >> Right. 
>> So that's going to be important and quite likely a new, some new things will be added on, which nobody can predict yet. But we'll see those pop up in a few years. I think there's going to be a continued challenge for the chief data officer role to become a real executive role as opposed to, you know, somebody who claims that they're executive, but then they're not, right? So the real reporting level into the board, into the CEO for example, will continue to be a challenging point. But the ones who do get that done, will be the ones that are successful, and the ones who get that done will be the ones that do it on the basis of data monetization, right? Connecting value to the data and making that very clear to all the data citizens in the organization, right? >> Right, really creating that value chain. >> In that sense they'll need to have both, you know, technical audiences and non-technical audiences aligned of course, and they'll need to focus on adoption. Again, it's not enough to just have your data office be involved in this. It's really important that you are waking up data citizens across the organization and you make everyone in the organization think about data as an asset. >> Absolutely, because there's so much value that can be extracted if organizations really strategically build that data office and democratize access across all those data citizens. Stijn, this is an exciting arena. We're definitely going to keep our eyes on this. Sounds like a lot of evolution and maturation coming from the data office perspective. From the data citizen perspective. And as the data show, that you mentioned in that IDC study, you mentioned Gartner as well, organizations have so much more likelihood of being successful and being competitive. So we're going to watch this space. Stijn, thank you so much for joining me on theCUBE at Data Citizens 22. We appreciate it. >> Thanks for having me over. 
>> From Data Citizens 22, I'm Lisa Martin. You're watching theCUBE, the leader in live tech coverage. (inspiring rock music) >> Okay, this concludes our coverage of Data Citizens 2022 brought to you by Collibra. Remember, all these videos are available on demand at theCUBE.net. And don't forget to check out siliconangle.com for all the news and wikibon.com for our weekly breaking analysis series where we cover many data topics and share survey research from our partner ETR, Enterprise Technology Research. If you want more information on the products announced at Data Citizens, go to Collibra.com. There are tons of resources there. You'll find analyst reports, product demos. It's really worthwhile to check those out. Thanks for watching our program and digging into Data Citizens 2022 on theCUBE, your leader in enterprise and emerging tech coverage. We'll see you soon. (inspiring rock music continues)

Published Date : Nov 2 2022



Felix Van de Maele, Collibra, Data Citizens 22

(upbeat techno music) >> Collibra is a company that was founded in 2008, right before the so-called modern big data era kicked into high gear. The company was one of the first to focus its business on data governance. Now, historically, data governance and data quality initiatives were back office functions, and they were largely confined to regulated industries that had to comply with public policy mandates. But as the cloud went mainstream, the tech giants showed us how valuable data could become, and the value proposition for data quality and trust evolved from primarily a compliance-driven issue to becoming a linchpin of competitive advantage. But data in the decade of the 2010s was largely about getting the technology to work. You had these highly centralized technical teams that were formed, and they had hyper-specialized skills to develop data architectures and processes to serve the myriad data needs of organizations. And it resulted in a lot of frustration with data initiatives for most organizations that didn't have the resources of the cloud guys and the social media giants to really attack their data problems and turn data into gold. This is why today, for example, there's quite a bit of momentum toward rethinking monolithic data architectures. You hear about initiatives like Data Mesh and the idea of data as a product. They're gaining traction as a way to better serve the data needs of decentralized business users. You hear a lot about data democratization. So these decentralization efforts around data, they're great, but they create a new set of problems. Specifically, how do you deliver, like, a self-service infrastructure to business users and domain experts? Now the cloud is definitely helping with that, but also, how do you automate governance? This becomes especially tricky as protecting data privacy has become more and more important. 
In other words, while it's enticing to experiment, and run fast and loose with data initiatives, kind of like the Wild West, to find new veins of gold, it has to be done responsibly. As such, the idea of data governance has had to evolve to become more automated and intelligent. Governance and data lineage are still fundamental to ensuring trust as data moves like water through an organization. No one is going to use data that isn't trusted. Metadata has become increasingly important for data discovery and data classification. As data flows through an organization, the ability to continuously check for data flaws and automate that data quality becomes a functional requirement of any modern data management platform. And finally, data privacy has become a critical adjacency to cyber security. So you can see how data governance has evolved into a much richer set of capabilities than it was 10 or 15 years ago. Hello and welcome to theCUBE's coverage of Data Citizens, made possible by Collibra, a leader in so-called data intelligence and the host of Data Citizens 2022, which is taking place in San Diego. My name is Dave Vellante and I'm one of the hosts of our program, which is running in parallel to Data Citizens. Now at theCUBE we like to say we extract the signal from the noise, and over the next couple of days we're going to feature some of the themes from the keynote speakers at Data Citizens, and we'll hear from several of the executives. Felix Van de Maele, who is the co-founder and CEO of Collibra, will join us, along with one of the other founders of Collibra, Stan Christiaens, who's going to join my colleague Lisa Martin. I'm going to also sit down with Laura Sellers. She's the Chief Product Officer at Collibra. We'll talk about some of the announcements and innovations they're making at the event, and then we'll dig in further to data quality with Kirk Haslbeck. He's the Vice President of Data Quality at Collibra. 
He's an amazingly smart dude who founded OwlDQ, a company that he sold to Collibra last year. Now, many companies didn't make it through the Hadoop era. You know, they missed the industry waves and they became driftwood. Collibra, on the other hand, has evolved its business. They've leveraged the cloud, expanded their product portfolio, and leaned in heavily to some major partnerships with cloud providers, as well as receiving a strategic investment from Snowflake earlier this year. So, it's a really interesting story that we're thrilled to be sharing with you. Thanks for watching and I hope you enjoy the program. (upbeat rock music) Last year theCUBE covered Data Citizens, Collibra's customer event, and the premise that we put forth prior to that event was that despite all the innovation that's gone on over the last decade or more with data, you know, starting with the Hadoop movement, we had data lakes, we had Spark, the ascendancy of programming languages like Python, the introduction of frameworks like TensorFlow, the rise of AI, Low Code, No Code, et cetera, businesses still find it's too difficult to get more value from their data initiatives. And we said at the time, you know, maybe it's time to rethink data innovation. While a lot of the effort has been focused on, you know, more efficiently storing and processing data, perhaps more energy needs to go into thinking about the people and the process side of the equation. Meaning, making it easier for domain experts to gain insights from data, trust the data, and begin to use that data in new ways, fueling data products, monetization, and insights. Data Citizens 2022 is back and we're pleased to have Felix Van de Maele, who is the founder and CEO of Collibra. He's on theCUBE. We're excited to have you, Felix. Good to see you again. >> Likewise, Dave. Thanks for having me again. >> You bet. 
All right, we're going to get the update from Felix on the current data landscape, how he sees it, why data intelligence is more important now than ever, and get current on what Collibra has been up to over the past year and what's changed since Data Citizens 2021, and we may even touch on some of the product news. So Felix, we're living in a very different world today with businesses and consumers. They're struggling with things like supply chains and uncertain economic trends, and we're not just snapping back to the 2010s, that's clear, and that's really true as well in the world of data. So what's different in your mind, in the data landscape of the 2020s, from the previous decade, and what challenges does that bring for your customers? >> Yeah, absolutely, and I think you said it well, Dave, in the intro: that rising complexity and fragmentation in the broader data landscape hasn't gotten any better over the last couple of years. When we talk to our customers, that level of fragmentation, the complexity, how do we find data that we can trust, that we know we can use, has only gotten more difficult. So that trend is continuing; I think what is changing is that the trend has become much more acute. Well, the other thing we've seen over the last couple of years is that the level of scrutiny that organizations are under with respect to data, as data becomes more mission critical, as data becomes more impactful and important, the level of scrutiny with respect to privacy, security, regulatory compliance, is only increasing as well. Which again, is really difficult in this environment of continuous innovation, continuous change, continuously growing complexity, and fragmentation. So, it's become much more acute. And to your earlier point, we do live in a different world, and in the past couple of years we could probably just kind of brute force it, right? We could focus on the top line; there was enough kind of investment to be had. 
I think nowadays organizations are in a very different environment where there's much more focus on cost control, productivity, efficiency, how do we truly get the value from that data? So again, I think it's just another incentive for organizations to now truly look at data and to scale with data, not just from a technology and infrastructure perspective, but how do we actually scale data from an organizational perspective, right? You said it, the people and process, how do we do that at scale? And that's only becoming much more important, and we do believe that the economic environment that we find ourselves in today is going to be a catalyst for organizations to really take that more seriously, if you will, than they maybe have in the past. >> You know, I don't know when you guys founded Collibra, if you had a sense as to how complicated it was going to get, but you've been on a mission to really address these problems from the beginning. How would you describe your mission and what are you doing to address these challenges? >> Yeah, absolutely. We started Collibra in 2008. So, in some sense, in the last kind of financial crisis, and that was really the start of Collibra, where we found product market fit, working with large financial institutions to help them cope with the increasing compliance requirements that they were faced with because of the financial crisis. And kind of here we are again, in a very different environment of course, almost 15 years later, but data is only becoming more important. But our mission to deliver trusted data for every user, every use case and across every source, frankly, has only become more important. 
So, while it has been an incredible journey over the last 14, 15 years, I think we're still relatively early in our mission to, again, be able to provide everyone, and that's why we call it Data Citizens, we truly believe that everyone in the organization should be able to use trusted data in an easy manner. That mission is only becoming more important, more relevant. We definitely have a lot more work ahead of us because we're still relatively early in that journey. >> Well that's interesting, because you know, in my observation it takes 7 to 10 years to actually build a company, and then the fact that you're still in the early days is kind of interesting. I mean, Collibra's had a good 12 months or so since we last spoke at Data Citizens. Give us the latest update on your business. What do people need to know about your current momentum? >> Yeah, absolutely. Again, there's a lot of tailwind. Organizations are only maturing their data practices, and we've seen that kind of transform or influence a lot of our business growth, with broader adoption of the platform. We work with some of the largest organizations in the world, whether it's Adobe, Heineken, Bank of America and many more. We have now over 600 enterprise customers, all industry leaders, in every single vertical. So it's really exciting to see that and continue to partner with those organizations. On the partnership side, again, a lot of momentum in the market with some of the cloud partners like Google, Amazon, Snowflake, Databricks, and others, right? As those kind of new modern data infrastructures, modern data architectures, are definitely all moving to the cloud. A great opportunity for us, our partners, and of course our customers, to help them kind of transition to the cloud even faster. And so we see a lot of excitement and momentum there. 
We did an acquisition about 18 months ago around data quality and data observability, which we believe is an enormous opportunity. Of course data quality isn't new, but I think there's a lot of reasons why we're so excited about quality and observability now. One is around leveraging AI and machine learning, again to drive more automation. And a second is that those data pipelines that are now being created in the cloud, in these modern data architectures, they've become mission critical. They've become real time. And so monitoring and observing those data pipelines continuously has become absolutely critical, so we're really excited about that as well. And on the organizational side, I'm sure you've heard the term data mesh, something that's gaining a lot of momentum, rightfully so. It's really the type of governance that we always believed in. Federated, focused on domains, giving a lot of ownership to different teams. I think that's the way to scale data organizations, and so that aligns really well with our vision, and from a product perspective, we've seen a lot of momentum with our customers there as well. >> Yeah, you know, a couple things there. I mean, the acquisition of OwlDQ, you know, Kirk Haslbeck and their team. It's interesting, you know, the whole data quality thing used to be this back office function and really confined to highly regulated industries. It's come to the front office, it's top of mind for Chief Data Officers. Data mesh, you mentioned: you guys are a connective tissue for all these different nodes on the data mesh. That's key. And of course we see you at all the shows. You're a critical part of many ecosystems and you're developing your own ecosystem. So, let's chat a little bit about the products. 
We're going to go deeper into products later on, at Data Citizens 22, but we know you're debuting some new innovations, you know, whether it's under the covers in security, sort of making data more accessible for people, or just dealing with workflows and processes, as you talked about earlier. Tell us a little bit about what you're introducing. >> Yeah, absolutely. We're super excited, a ton of innovation. And if we think about the big theme, like I said, we're still relatively early in this journey towards that mission of data intelligence, that really bold and compelling mission. Customers are still just starting on that journey. We want to make it as easy as possible for organizations to actually get started, because we know it's important that they do. And for organizations and customers that have been with us for some time, there's still a tremendous amount of opportunity to kind of expand the platform further. And again, to make it easier to really accomplish that mission and vision around that Data Citizen, that everyone has access to trustworthy data in a very easy way. So that's really the theme of a lot of the innovation that we're driving, a lot of kind of ease of adoption, ease of use, but also then, how do we make sure that, as Collibra becomes this kind of mission critical enterprise platform, from a security, performance, architecture, scale, and supportability perspective, we're truly able to deliver that kind of enterprise mission critical platform. And so that's the big theme. From an innovation perspective, from a product perspective, there's a lot of new innovation that we're really excited about. A couple of highlights. One is around data marketplace. Again, a lot of our customers have plans in that direction. How do we make it easy? How do we make available a true kind of shopping experience? 
So that anybody in the organization can, in a very easy search-first way, find the right data product, find the right dataset, that they can then consume. Usage analytics: how do we help organizations drive adoption? Tell them where they're working really well and where they have opportunities. Homepages, again, to make things easy for anyone in your organization to kind of get started with Collibra. You mentioned Workflow Designer. Again, we have a very powerful enterprise platform, and one of our key differentiators is the ability to really drive a lot of automation through workflows. And now we've provided a new Low-Code, No-Code kind of workflow designer experience, so customers can really take it to the next level. There's a lot more new product around Collibra Protect, which is in partnership with Snowflake, which has been a strategic investor in Collibra, focused on how do we make access governance easier? How are we able to make sure that as you move to the cloud, things like access management and masking around sensitive data, PII data, are managed in a much more effective way? Really excited about that product. There's more around data quality. Again, how do we get that deployed as easily, and quickly, and widely as we can? Moving that to the cloud has been a big part of our strategy. So, we launch our data quality cloud product, as well as making use of those native compute capabilities in platforms like Snowflake, Databricks, Google, Amazon, and others. And so we are bettering a capability that we call push down, so we're actually pushing down the compute and data quality monitoring into the underlying platform, which again, from a scale, performance and ease of use perspective, is going to make a massive difference. And then more broadly, we talked a little bit about the ecosystem. Again, integrations, we talk about being able to connect to every source. 
Integrations are absolutely critical, and we're really excited to deliver new integrations with Snowflake, Azure and Google Cloud storage as well. So that's a lot coming out, the team has been at work really hard, and we are really, really excited about what we're bringing to market. >> Yeah, a lot going on there. I wonder if you could give us your closing thoughts. I mean, you talked about, you know, the marketplace. You know, you think about Data Mesh, you think of data as product, one of the key principles, you think about monetization. This is really different than what we've been used to in data, where just getting the technology to work has been so hard. So, how do you see sort of the future and, you know, give us your closing thoughts please? >> Yeah, absolutely. And I think we're really at a pivotal moment, and I think you said it well. We all know the constraints and the challenges with data, how to actually do data at scale. And while we've seen a ton of innovation on the infrastructure side, we fundamentally believe that just getting a faster database is important, but it's not going to fully solve the challenges and truly kind of deliver on the opportunity. And that's why now is really the time to deliver this data intelligence vision, this data intelligence platform. We are still early, and making it as easy as we can is kind of our mission. And so I'm really, really excited to see how the market is going to evolve over the next few quarters and years. I think the trend is clearly there. We talked about Data Mesh; this kind of federated approach, focused on data products, is just another signal that we believe a lot of organizations are now at the time where they understand the need to go beyond just the technology. 
And to really, really think about how to actually scale data as a business function, just like we've done with IT, with HR, with sales and marketing, with finance. That's how we need to think about data. I think now is the time, given the economic environment that we are in, with much more focus on control, much more focus on productivity and efficiency. Now is the time we need to look beyond just the technology and infrastructure to think of how to scale data, how to manage data at scale. >> Yeah, it's a new era. The next 10 years of data won't be like the last, as I always say. Felix, thanks so much. Good luck in San Diego. I know you're going to crush it out there. >> Thank you, Dave. >> Yeah, it's a great spot for an in-person event, and of course the content post-event is going to be available at collibra.com, and you can of course catch theCUBE coverage at theCUBE.net and all the news at siliconangle.com. This is Dave Vellante for theCUBE, your leader in enterprise and emerging tech coverage. (upbeat techno music)

Published Date : Nov 2 2022



Kirk Haslbeck, Collibra, Data Citizens 22

(atmospheric music) >> Welcome to theCUBE's coverage of Data Citizens 2022, Collibra's customer event. My name is Dave Vellante. With us is Kirk Haslbeck, who's the Vice President of Data Quality at Collibra. Kirk, good to see you, welcome. >> Thanks for having me, Dave. Excited to be here. >> You bet. Okay, we're going to discuss data quality and observability. It's a hot trend right now. You founded a data quality company, OwlDQ, and it was acquired by Collibra last year. Congratulations. And now you lead data quality at Collibra. So we're hearing a lot about data quality right now. Why is it such a priority? Take us through your thoughts on that. >> Yeah, absolutely. It's definitely exciting times for data quality, which, you're right, has been around for a long time. So why now? And why is it so much more exciting than it used to be? I think it's a bit stale, but we all know that companies use more data than ever before, and the variety has changed and the volume has grown. And while I think that remains true, there are a couple other hidden factors at play that explain why everyone's so interested and why this is becoming so important now. And I guess you could kind of break this down simply and think about if, Dave, you and I were going to build a new healthcare application and monitor the heartbeat of individuals. Imagine if we get that wrong, what the ramifications could be, what those incidents would look like. Or maybe better yet, we try to build a new trading algorithm with a crossover strategy where the 50 day crosses the 10 day average. And imagine if the data underlying the inputs to that is incorrect. We will probably have major financial ramifications in that sense. So, it kind of starts there, where everybody's realizing that we're all data companies, and if we are using bad data we're likely making incorrect business decisions. But I think there's kind of two other things at play. 
I bought a car not too long ago and my dad called and said, "How many cylinders does it have?" And I realized in that moment, I might have failed him 'cause I didn't know. And I used to ask those types of questions about anti-lock brakes and cylinders, and if it's manual or automatic. And I realized, I now just buy a car that I hope works. And it's so complicated with all the computer chips. I really don't know that much about it. And that's what's happening with data. We're just loading so much of it. And it's so complex that the way companies consume it in the IT function is that they bring in a lot of data and then they syndicate it out to the business. And it turns out that the individuals loading and consuming all of this data for the company actually may not know that much about the data itself, and that's not even their job anymore. So, we'll talk more about that in a minute, but that's really what's setting the foreground for this observability play and why everybody's so interested. It's because we're becoming less close to the intricacies of the data and we just expect it to always be there and be correct. >> You know, the other thing too about data quality, and for years we did the MIT CDOIQ event. We didn't do it last year; COVID messed everything up. But the observation I would make there, and I'd love your thoughts, is that data quality used to be information quality, used to be this back office function, and then it became sort of front office with financial services, and government and healthcare, these highly regulated industries. And then the whole chief data officer thing happened and people were realizing, well, they sort of flipped the bit from sort of data as a risk to data as an asset. And now as we say, we're going to talk about observability. And so it's really become front and center, just the whole quality issue, because data's so fundamental, hasn't it? >> Yeah, absolutely. 
I mean, let's imagine we pull up our phones right now and I go to my favorite stock ticker app, and I check out the Nasdaq market cap. I really have no idea if that's the correct number. I know it's a number, it looks large, it's in a numeric field. And that's kind of what's going on. There's so many numbers and they're coming from all of these different sources and data providers, and they're getting consumed and passed along. But there isn't really a way to tactically put controls on every number and metric across every field we plan to monitor. But with the scale that we've achieved in the early days, even before Collibra, what's been so exciting is that we have these types of observation techniques, these data monitors that can actually track past performance of every field at scale. And why that's so interesting, and why I think the CDO is listening intently nowadays to this topic, is that maybe we could surface all of these problems with the right solution of data observability and with the right scale, and then just be alerted on breaking trends. So we're sort of shifting away from this world where you must write a condition, and when that condition breaks, that was always known as a break record. But what about breaking trends and root cause analysis? And is it possible to do that with less human intervention? And so I think most people are seeing now that it's going to have to be a software tool and a computer system. It's not ever going to be based on one or two domain experts anymore. >> So how does data observability relate to data quality? Are they sort of two sides of the same coin? Are they cousins? What's your perspective on that? >> Yeah, it's super interesting. It's an emerging market. So the language is changing, and a lot of the topic and area is changing. The way that I like to say it or break it down, because the lingo is constantly a moving target in this space, is really break records versus breaking trends. 
And I could write a condition: when this thing happens it's wrong, and when it doesn't it's correct. Or I could look for a trend, and I'll give you a good example. Everybody's talking about fresh data and stale data, and why would that matter? Well, if your data never arrived, or only part of it arrived, or it didn't arrive on time, it's likely stale, and there will not be a condition that you could write that would show you all the good and the bad. That was kind of your traditional approach of data quality break records. But your modern day approach is: you lost a significant portion of your data, or it did not arrive on time to make that decision accurately. And that's a hidden concern. Some people call this freshness, we call it stale data. But it all points to the same idea, that the thing you're observing may not be a data quality condition anymore. It may be a breakdown in the data pipeline. And with thousands of data pipelines in play for every company out there, there's more than a couple of these happening every day. >> So what's the Collibra angle on all this stuff? You made the acquisition, you've got data quality and observability coming together. You guys have a lot of expertise in this area, but you hear provenance of data. You just talked about stale data, the whole trend toward real time. How is Collibra approaching the problem and what's unique about your approach? >> Well I think where we're fortunate is with our background. Myself and team, we sort of lived this problem for a long time in the Wall Street days about a decade ago. And we saw it from many different angles. And what we came up with, before it was called data observability or reliability, was basically the underpinnings of that. So we're a little bit ahead of the curve there when most people evaluate our solution. It's more advanced than some of the observation techniques that currently exist. 
But we've also always covered data quality and we believe that people want to know more, they need more insights. And they want to see break records and breaking trends together, so they can correlate the root cause. And we hear that all the time. "I have so many things going wrong, just show me the big picture. Help me find the thing that if I were to fix it today would make the most impact." So we're really focused on root cause analysis, business impact, connecting it with lineage and catalog metadata. And as that grows you can actually achieve total data governance. At this point, with the acquisition of what was a lineage company years ago, and then my company OwlDQ, now Collibra Data Quality, Collibra may be the best positioned for total data governance and intelligence in the space. >> Well, you mentioned financial services a couple of times and some examples. Remember the flash crash in 2010? Nobody had any idea what that was. They would just say, "Oh, it's a glitch." So they didn't understand the root cause of it. So this is a really interesting topic to me. So we know at Data Citizens 22 that you're announcing, you got to announce new products, right? It is your yearly event. What's new? Give us a sense as to what products are coming out, but specifically around data quality and observability. >> Absolutely. There's always a next thing on the forefront. And the one right now is these hyperscalers in the cloud. So you have databases like Snowflake and BigQuery, and Databricks, Delta Lake and SQL pushdown. And ultimately what that means is a lot of people are storing and loading data even faster in a SaaS-like model. And we've started to hook into these databases, and while we've always worked with the same databases in the past, and they're supported today, we're now doing something called native database pushdown, where the entire compute and data activity happens in the database. And why that is so interesting and powerful now?
Is everyone's concerned with something called egress. Did my data, that I've spent all this time and money with my security team securing, ever leave my hands? Did it ever leave my secure VPC, as they call it? And with these native integrations that we're building and about to unveil here as kind of a sneak peek for next week at Data Citizens, we're now doing all compute and data operations in databases like Snowflake. And what that means is with no install and no configuration you could log into the Collibra Data Quality app and have all of your data quality running inside the database that you've probably already picked as your go-forward, team-selected, secured database of choice. So we're really excited about that. And I think if you look at the whole landscape of network cost, egress cost, data storage and compute, what people are realizing is it's extremely efficient to do it in the way that we're about to release here next week. >> So this is interesting because what you just described, you mentioned Snowflake, you mentioned Google, oh actually you mentioned, yeah, Databricks. You know, Snowflake has the data cloud. If you put everything in the data cloud, okay, you're cool. But then Google's got the open data cloud, if you heard at Google Next. And now Databricks doesn't call it the data cloud, but they have like the open source data cloud. So you have all these different approaches and there's really no way, up until now I'm hearing, to really understand the relationships between all those and have confidence across them. It's like Zhamak Dehghani says, you should just be a node on the mesh. I don't care if it's a data warehouse or a data lake, or where it comes from, but it's a point on that mesh and I need tooling to be able to have confidence that my data is governed and has the proper lineage, provenance. And that's what you're bringing to the table. Is that right? Did I get that right? >> Yeah, that's right.
And for us, it's not that we haven't been working with those great cloud databases, but it's the fact that we can send them the instructions now, we can send them the operating ability to crunch all of the calculations, the governance, the quality, and get the answers. And what that's doing, it's basically zero network cost, zero egress cost, zero latency of time. And so when you were to log into BigQuery tomorrow using our tool, or say Snowflake for example, you have instant data quality metrics, instant profiling, instant lineage and access, privacy controls, things of that nature that just become less onerous. What we're seeing is there's so much technology out there, just like all of the major brands that you mentioned, but how do we make it easier? The future is about fewer clicks, faster time to value, faster scale, and eventually lower cost. And we think that this positions us to be the leader there. >> I love this example because, we've got talk about, well, the cloud guys are going to own the world. And of course now we're seeing that the ecosystem is finding so much white space to add value, connect across clouds. Sometimes we call it supercloud, or interclouding. Alright, Kirk, give us your final thoughts on the trends that we've talked about and Data Citizens 22. >> Absolutely. Well I think, one big trend is discovery and classification. Seeing that across the board, people used to know it was a zip code, and nowadays with the amount of data that's out there they want to know where everything is, where their sensitive data is, if it's redundant, tell me everything inside of three to five seconds. And with that comes, they want to know in all of these hyperscale databases how fast they can get controls and insights out of their tools. So I think we're going to see more one click solutions, more SaaS based solutions, and solutions that hopefully prove faster time to value on all of these modern cloud platforms. >> Excellent.
All right, Kirk Haslbeck, thanks so much for coming on theCUBE and previewing Data Citizens 22. Appreciate it. >> Thanks for having me, Dave. >> You're welcome. All right. And thank you for watching. Keep it right there for more coverage from theCUBE. (atmospheric music)
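
Kirk's distinction between break records and breaking trends can be made concrete with a small sketch. The Python below is only an illustration, not Collibra's implementation: the function names, the sample row counts, and the three-standard-deviation threshold are all assumptions chosen for the example. It shows a hand-written rule finding nothing wrong with the rows that did arrive, while a monitor built from past load volumes flags that most of the data never showed up.

```python
from statistics import mean, stdev

def break_records(rows, rule):
    """Rule-based check: return the rows that violate a hand-written condition."""
    return [row for row in rows if not rule(row)]

def breaks_trend(history, today, threshold=3.0):
    """Trend-based check: flag today's value when it sits more than
    `threshold` standard deviations away from the mean of past values."""
    if len(history) < 2:
        return False  # not enough history to establish a trend
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu  # perfectly flat history: any change is a break
    return abs(today - mu) / sigma > threshold

# Hypothetical row counts from the last seven daily loads of one table.
daily_row_counts = [10_050, 9_980, 10_120, 10_010, 9_940, 10_075, 10_030]

# Today only three rows arrived, and every one of them looks valid.
todays_rows = [{"price": 101.5}, {"price": 99.2}, {"price": 100.4}]

rule_failures = break_records(todays_rows, lambda row: row["price"] > 0)
trend_alert = breaks_trend(daily_row_counts, today=len(todays_rows))

print(rule_failures)  # no rule is violated by the rows that arrived
print(trend_alert)    # the volume monitor still raises an alert
```

The rule has nothing to say about rows that are missing, which is the stale data case described in the interview; only the comparison against history catches it.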

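The pushdown idea from the interview, where the compute happens inside the database and the data never leaves it, can be sketched in the same hedged way. Here Python's built-in `sqlite3` module stands in for a cloud database such as Snowflake or BigQuery, and `profile_column_pushdown`, the `trades` table, and the chosen metrics are hypothetical names for illustration, not Collibra's API. The point is only that a single aggregate query runs where the data lives and one summary row comes back.

```python
import sqlite3

def profile_column_pushdown(conn, table, column):
    """Profile a column by running the aggregation inside the database.

    Only one summary row crosses the connection, never the raw rows.
    Table and column names are interpolated directly for brevity, so
    this sketch assumes they come from trusted configuration.
    """
    sql = f"""
        SELECT COUNT(*) AS row_count,
               SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END) AS null_count,
               MIN({column}) AS min_value,
               MAX({column}) AS max_value
        FROM {table}
    """
    row_count, null_count, min_value, max_value = conn.execute(sql).fetchone()
    return {
        "row_count": row_count,
        "null_count": null_count or 0,  # SUM over an empty table yields NULL
        "min": min_value,
        "max": max_value,
    }

# An in-memory SQLite database stands in for the secured cloud database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (price REAL)")
conn.executemany(
    "INSERT INTO trades (price) VALUES (?)",
    [(101.5,), (None,), (99.2,), (100.4,)],
)

profile = profile_column_pushdown(conn, "trades", "price")
print(profile)
```

The alternative, pulling every row out to an external engine and computing the same metrics there, is what incurs the network and egress cost Kirk describes.
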
Published Date : Nov 2 2022

Collibra Data Citizens 22

>>Collibra is a company that was founded in 2008, right before the so-called modern big data era kicked into high gear. The company was one of the first to focus its business on data governance. Now, historically, data governance and data quality initiatives were back office functions, and they were largely confined to regulated industries that had to comply with public policy mandates. But as the cloud went mainstream, the tech giants showed us how valuable data could become, and the value proposition for data quality and trust evolved from primarily a compliance-driven issue to becoming a linchpin of competitive advantage. But data in the decade of the 2010s was largely about getting the technology to work. You had these highly centralized technical teams that were formed, and they had hyper-specialized skills to develop data architectures and processes to serve the myriad data needs of organizations. >>And it resulted in a lot of frustration with data initiatives for most organizations that didn't have the resources of the cloud guys and the social media giants to really attack their data problems and turn data into gold. This is why today, for example, there's quite a bit of momentum to rethinking monolithic data architectures. You hear about initiatives like data mesh and the idea of data as a product. They're gaining traction as a way to better serve the data needs of decentralized business unit users. You hear a lot about data democratization. So these decentralization efforts around data, they're great, but they create a new set of problems. Specifically, how do you deliver a self-service infrastructure to business users and domain experts? Now the cloud is definitely helping with that, but also how do you automate governance? This becomes especially tricky as protecting data privacy has become more and more important.
>>In other words, while it's enticing to experiment and run fast and loose with data initiatives, kinda like the Wild West, to find new veins of gold, it has to be done responsibly. As such, the idea of data governance has had to evolve to become more automated and intelligent. Governance and data lineage are still fundamental to ensuring trust as data moves like water through an organization. No one is gonna use data that isn't trusted. Metadata has become increasingly important for data discovery and data classification. As data flows through an organization, the ability to continuously check for data flaws and automate that data quality becomes a functional requirement of any modern data management platform. And finally, data privacy has become a critical adjacency to cyber security. So you can see how data governance has evolved into a much richer set of capabilities than it was 10 or 15 years ago. >>Hello and welcome to theCUBE's coverage of Data Citizens, made possible by Collibra, a leader in so-called data intelligence and the host of Data Citizens 2022, which is taking place in San Diego. My name is Dave Vellante and I'm one of the hosts of our program, which is running in parallel to Data Citizens. Now at theCUBE we like to say we extract the signal from the noise, and over the next couple of days, we're gonna feature some of the themes from the keynote speakers at Data Citizens and we'll hear from several of the executives. Felix Van de Maele, who is the co-founder and CEO of Collibra, will join us along with one of the other founders of Collibra, Stijn Christiaens, who's gonna join my colleague Lisa Martin. I'm gonna also sit down with Laura Sellers, she's the Chief Product Officer at Collibra. We'll talk about some of the announcements and innovations they're making at the event, and then we'll dig in further to data quality with Kirk Haslbeck. >>He's the Vice President of Data Quality at Collibra.
He's an amazingly smart dude who founded OwlDQ, a company that he sold to Collibra last year. Now many companies, they didn't make it through the Hadoop era, you know, they missed the industry waves and they became driftwood. Collibra, on the other hand, has evolved its business. They've leveraged the cloud, expanded their product portfolio, and leaned in heavily to some major partnerships with cloud providers, as well as receiving a strategic investment from Snowflake earlier this year. So it's a really interesting story that we're thrilled to be sharing with you. Thanks for watching and I hope you enjoy the program. >>Last year, theCUBE covered Data Citizens, Collibra's customer event. And the premise that we put forth prior to that event was that despite all the innovation that's gone on over the last decade or more with data, you know, starting with the Hadoop movement, we had data lakes, we had Spark, the ascendancy of programming languages like Python, the introduction of frameworks like TensorFlow, the rise of AI, low code, no code, et cetera, businesses still find it's too difficult to get more value from their data initiatives. And we said at the time, you know, maybe it's time to rethink data innovation. While a lot of the effort has been focused on, you know, more efficiently storing and processing data, perhaps more energy needs to go into thinking about the people and the process side of the equation, meaning making it easier for domain experts to both gain insights from data, trust the data, and begin to use that data in new ways, fueling data products, monetization and insights. Data Citizens 2022 is back and we're pleased to have Felix Van de Maele, who is the founder and CEO of Collibra. He's on theCUBE. We're excited to have you, Felix. Good to see you again. >>Likewise Dave. Thanks for having me again. >>You bet.
All right, we're gonna get the update from Felix on the current data landscape, how he sees it, why data intelligence is more important now than ever, and get current on what Collibra has been up to over the past year and what's changed since Data Citizens 2021. And we may even touch on some of the product news. So Felix, we're living in a very different world today with businesses and consumers. They're struggling with things like supply chains, uncertain economic trends, and we're not just snapping back to the 2010s. That's clear, and that's really true as well in the world of data. So what's different in your mind, in the data landscape of the 2020s from the previous decade, and what challenges does that bring for your customers? >>Yeah, absolutely. And I think you said it well, Dave, in the intro: that rising complexity and fragmentation in the broader data landscape, that hasn't gotten any better over the last couple of years. When we talk to our customers, that level of fragmentation, the complexity, how do we find data that we can trust, that we know we can use, has only gotten kinda more difficult. So that trend is continuing. I think what is changing is that trend has become much more acute. Well, the other thing we've seen over the last couple of years is the level of scrutiny that organizations are under with respect to data. As data becomes more mission critical, as data becomes more impactful and important, the level of scrutiny with respect to privacy, security, regulatory compliance is only increasing as well, which again, is really difficult in this environment of continuous innovation, continuous change, continuous growing complexity and fragmentation. >>So it's become much more acute. And to your earlier point, we do live in a different world, and the past couple of years we could probably just kind of brute force it, right? We could focus on the top line.
There was enough kind of investment to be had. I think nowadays organizations are in a very different environment where there's much more focus on cost control, productivity, efficiency. How do we truly get value from that data? So again, I think it's just another incentive for organizations to now truly look at data and to scale data, not just from a technology and infrastructure perspective, but how do you actually scale data from an organizational perspective, right? You said it, the people and process, how do we do that at scale? And that's only becoming much more important. And we do believe that the economic environment that we find ourselves in today is gonna be a catalyst for organizations to really dig in more seriously, if you will, than they maybe have in the past. >>You know, I don't know, when you guys founded Collibra, if you had a sense as to how complicated it was gonna get, but you've been on a mission to really address these problems from the beginning. How would you describe your mission and what are you doing to address these challenges? >>Yeah, absolutely. We started Collibra in 2008. So in some sense in the last kind of financial crisis, and that was really the start of Collibra, where we found product market fit working with large financial institutions to help them cope with the increasing compliance requirements that they were faced with because of the financial crisis. And kind of here we are again, in a very different environment, of course, almost 15 years later. But data is only becoming more important. And our mission to deliver trusted data for every user, every use case and across every source, frankly, has only become more important.
So it has been an incredible journey over the last 14, 15 years. I think we're still relatively early in our mission to, again, be able to provide everyone, and that's why we call it Data Citizens. We truly believe that everyone in the organization should be able to use trusted data in an easy manner. That mission is only becoming more important, more relevant. We definitely have a lot more work ahead of us because we are still relatively early in that journey. >>Well, that's interesting because, you know, in my observation it takes seven to 10 years to actually build a company, and then the fact that you're still in the early days is kind of interesting. I mean, Collibra's had a good 12 months or so since we last spoke at Data Citizens. Give us the latest update on your business. What do people need to know about your current momentum? >>Yeah, absolutely. Again, there's a long tail of organizations that are only now maturing their data practices, and we've seen that kind of influence a lot of our business growth, the broader adoption of the platform. We work with some of the largest organizations in the world, whether it's Adobe, Heineken, Bank of America, and many more. We have now over 600 enterprise customers, all industry leaders in every single vertical. So it's really exciting to see that and continue to partner with those organizations. On the partnership side, again, a lot of momentum in the market with some of the cloud partners like Google, Amazon, Snowflake, Databricks and others, right? As those new modern data infrastructures, modern data architectures are definitely all moving to the cloud, it's a great opportunity for us, our partners and of course our customers, to help them kind of transition to the cloud even faster.
>>And so we see a lot of excitement and momentum there. We made an acquisition about 18 months ago around data quality, data observability, which we believe is an enormous opportunity. Of course, data quality isn't new, but I think there's a lot of reasons why we're so excited about quality and observability now. One is around leveraging AI, machine learning, again to drive more automation. And the second is that those data pipelines that are now being created in the cloud, in these modern data architectures, they've become mission critical. They've become real time. And so monitoring, observing those data pipelines continuously has become absolutely critical, so we're really excited about that as well. And on the organizational side, I'm sure you've heard the term data mesh, something that's gaining a lot of momentum, rightfully so. It's really the type of governance that we've always believed in: federated, focused on domains, giving a lot of ownership to different teams. I think that's the way to scale data organizations. And so that aligns really well with our vision, and from a product perspective, we've seen a lot of momentum with our customers there as well. >>Yeah, you know, a couple things there. I mean, the acquisition of OwlDQ, you know, Kirk Haslbeck and their team, it's interesting, you know, the whole data quality thing used to be this back office function, really confined to highly regulated industries. It's come to the front office, it's top of mind for chief data officers. Data mesh: you mentioned you guys are a connective tissue for all these different nodes on the data mesh. That's key. And of course we see you at all the shows. You're a critical part of many ecosystems and you're developing your own ecosystem. So let's chat a little bit about the products.
We're gonna go deeper into products later on at Data Citizens 22, but we know you're debuting some new innovations, you know, whether it's, you know, under the covers in security, sort of making data more accessible for people, just dealing with workflows and processes as you talked about earlier. Tell us a little bit about what you're introducing. >>Yeah, absolutely. We're super excited, a ton of innovation. And if we think about the big theme, like I said, we're still relatively early in this journey towards that mission of data intelligence, that really bold and compelling mission. Our customers are just starting on that journey. We wanna make it as easy as possible for organizations to actually get started, because we know it's important that they do. And for organizations and customers that have been with us for some time, there's still a tremendous amount of opportunity to kind of expand the platform further. And again, to make it easier to really accomplish that mission and vision around the data citizen, that everyone has access to trustworthy data in a very easy way. So that's really the theme of a lot of the innovation that we're driving. >>A lot of kind of ease of adoption, ease of use, but also then how do we make sure that Collibra becomes this kind of mission critical enterprise platform, from a security, performance, architecture, scale, supportability perspective, that we're truly able to deliver that kind of an enterprise mission critical platform. And so that's the big theme from an innovation perspective. From a product perspective, a lot of new innovation that we're really excited about. A couple of highlights. One is around data marketplace. Again, a lot of our customers have plans in that direction, how to make it easy.
How do we make available a true kind of shopping experience, so that anybody in your organization can, in a very easy search-first way, find the right data product, find the right dataset, that they can then consume? Usage analytics: how do we help organizations drive adoption, tell them where they're working really well and where they have opportunities? Homepages, again, to make things easy for anyone in your organization to kind of get started with Collibra. You mentioned Workflow Designer. Again, we have a very powerful enterprise platform. >>One of our key differentiators is the ability to really drive a lot of automation through workflows. And now we provide a new low code, no code kind of Workflow Designer experience, so really customers can take it to the next level. There's a lot more new product around Collibra Protect, which is in partnership with Snowflake, which has been a strategic investor in Collibra, focused on how do we make access governance easier? How are we able to make sure that as you move to the cloud, things like access management, masking around sensitive data, PII data, is managed in a much more effective way? Really excited about that product. There's more around data quality. Again, how do we get that deployed as easily and quickly and widely as we can? Moving that to the cloud has been a big part of our strategy. >>So we're launching more data quality cloud product, as well as making use of those native compute capabilities in platforms like Snowflake, Databricks, Google, Amazon, and others. And so we are bettering a capability that we call pushdown. So actually pushing down the compute and data quality, the monitoring, into the underlying platform, which again, from a scale, performance and ease of use perspective is gonna make a massive difference. And then more broadly, we talked a little bit about the ecosystem.
Again, integrations: we talk about being able to connect to every source. Integrations are absolutely critical and we're really excited to deliver new integrations with Snowflake, Azure and Google Cloud Storage as well. So there's a lot coming out. The team has been working really hard, and we are really, really excited about what we're bringing to market. >>Yeah, a lot going on there. I wonder if you could give us your closing thoughts. I mean, you talked about, you know, the marketplace, you know, you think about data mesh, you think of data as product, one of the key principles, you think about monetization. This is really different than what we've been used to in data, where just getting the technology to work has been so hard. So how do you see sort of the future and, you know, give us your closing thoughts please? >>Yeah, absolutely. And I think we're really at this pivotal moment, and I think you said it well. We all know the constraints and the challenges with data, how to actually do data at scale. And while we've seen a ton of innovation on the infrastructure side, we fundamentally believe that just getting a faster database is important, but it's not gonna fully solve the challenges and truly deliver on the opportunity. And that's why now is really the time to deliver this data intelligence vision, this data intelligence platform. We are still early. Making it as easy as we can, it's our mission. And so I'm really, really excited to see how the market's gonna evolve over the next few quarters and years. I think the trend is clearly there. When we talk about data mesh, this kind of federated approach focused on data products is just another signal that we believe a lot of organizations are now at the time of understanding the need to go beyond just the technology.
To really, really think about how do we actually scale data as a business function, just like we've done with IT, with HR, with sales and marketing, with finance. That's how we need to think about data. I think now is the time, given the economic environment that we are in: much more focus on control, much more focus on productivity, efficiency. Now's the time we need to look beyond just the technology and infrastructure to think of how to scale data, how to manage data at scale. >>Yeah, it's a new era. The next 10 years of data won't be like the last, as I always say. Felix, thanks so much and good luck in San Diego. I know you're gonna crush it out there. >>Thank you Dave. >>Yeah, it's a great spot for an in-person event, and of course the content post event is gonna be available at collibra.com, and you can of course catch theCUBE coverage at thecube.net and all the news at siliconangle.com. This is Dave Vellante for theCUBE, your leader in enterprise and emerging tech coverage. >>Hi, I'm Jay from Collibra's Data Office. Today I want to talk to you about Collibra's Data Intelligence Cloud. We often say Collibra is a single system of engagement for all of your data. Now, when I say data, I mean data in the broadest sense of the word, including reference data and metadata. Think of metrics, reports, APIs, systems, policies, and even business processes that produce or consume data. Now, the beauty of this platform is that it ensures all of your users have an easy way to find, understand, trust, and access data. But how do you get started? Well, here are seven steps to help you get going. One, start with the data. What's data intelligence without data? Leverage the Collibra Data Catalog to automatically profile and classify your enterprise data wherever that data lives: databases, data lakes or data warehouses, whether on the cloud or on premise. >>Two, you'll then wanna organize the data and you'll do that with data communities.
This can be by department, line of business or functional team, however your organization organizes work and accountability. And for that you'll establish community owners. Communities make it easy for people to navigate through the platform and find the data, and will help create a sense of belonging for users. An important and related side note here: we find it's typical in many organizations that data is thought of as just an asset, and IT and data offices are viewed as the owners of it, who are really the central teams performing analytics as a service provider to the enterprise. We believe data is more than an asset, it's a true product that can be converted to value. And that also means establishing business ownership of data, where that strategy and ROI come together with subject matter expertise. >>Okay, three. Next, back to those communities. There, the data owners should explain and define their data, not just the tables and columns, but also the related business terms, metrics and KPIs. These objects, we call them assets, are typically organized into business glossaries and data dictionaries. I definitely recommend starting with the topics that are most important to the business. Four, those steps now enable you and your users to have some fun with it. Linking everything together builds your knowledge graph, also known as a metadata graph, by linking or relating these assets together. For example, linking a data set to a KPI to a report now enables your users to see what we call the lineage diagram, which visualizes where the data in your dashboards actually came from, what the data means and who's responsible for it. Speaking of which, here's five. Leverage the Collibra trusted business reporting solution on the marketplace, which comes with workflows for those owners to certify their reports, KPIs, and data sets. >>This helps them reinforce trust in their data.
Six, easy to navigate dashboards or landing pages right in your platform for your company's business processes are the most effective way for everyone to better understand and take action on data. Here's a pro tip: use the dashboard design kit on the marketplace to help you build compelling dashboards. Finally, seven, promote the value of this to your users, and be sure to schedule enablement office hours and new employee onboarding sessions to get folks excited about what you've built and implemented. Better yet, invite all of those community and data owners to these sessions so that they can show off the value that they've created. Those are my seven tips to get going with Collibra. I hope these have been useful. For more information, be sure to visit collibra.com. >>Welcome to theCUBE's coverage of Data Citizens 2022, Collibra's customer event. My name is Dave Vellante. With us is Kirk Haslbeck, who's the Vice President of Data Quality at Collibra. Kirk, good to see you. Welcome. >>Thanks for having me, Dave. Excited to be here. >>You bet. Okay, we're gonna discuss data quality, observability. It's a hot trend right now. You founded a data quality company, OwlDQ, and it was acquired by Collibra last year. Congratulations. And now you lead data quality at Collibra. So we're hearing a lot about data quality right now. Why is it such a priority? Take us through your thoughts on that. >>Yeah, absolutely. It's definitely exciting times for data quality, which you're right, has been around for a long time. So why now and why is it so much more exciting than it used to be? I think it's a bit stale, but we all know that companies use more data than ever before, and the variety has changed and the volume has grown. And while I think that remains true, there are a couple other hidden factors at play that everyone's so interested in as to why this is becoming so important now.
And I guess you could kind of break this down simply and think about it this way: if, Dave, you and I were gonna build, you know, a new healthcare application and monitor the heartbeat of individuals, imagine if we get that wrong, you know, what the ramifications could be, what those incidents would look like. Or maybe better yet, we try to build a new trading algorithm with a crossover strategy where the 50-day crosses the 10-day average. >>And imagine if the data underlying the inputs to that is incorrect. We will probably have major financial ramifications in that sense. So, you know, it kind of starts there, where everybody's realizing that we're all data companies, and if we are using bad data, we're likely making incorrect business decisions. But I think there's kind of two other things at play. You know, I bought a car not too long ago and my dad called and said, "How many cylinders does it have?" And I realized in that moment, you know, I might have failed him because I didn't know. And I used to ask those types of questions about anti-lock brakes and cylinders and, you know, if it's manual or automatic, and I realized I now just buy a car that I hope works. And it's so complicated with all the computer chips, I really don't know that much about it. >>And that's what's happening with data. We're just loading so much of it. And it's so complex that the way companies consume it in the IT function is that they bring in a lot of data and then they syndicate it out to the business. And it turns out that the individuals loading and consuming all of this data for the company actually may not know that much about the data itself, and that's not even their job anymore. So we'll talk more about that in a minute, but that's really what's setting the foreground for this observability play and why everybody's so interested.
It's because we're becoming less close to the intricacies of the data and we just expect it to always be there and be correct. >>You know, the other thing too about data quality, and for years we did the MIT CDOIQ event, we didn't do it last year, Covid messed everything up. But the observation I would make, and I'd love your thoughts, is that data quality, which used to be called information quality, used to be this back-office function, and then it became sort of front office with financial services and government and healthcare, these highly regulated industries. And then the whole chief data officer thing happened and people were realizing, well, they sort of flipped the bit from data as a risk to data as an asset. And now, as we say, we're gonna talk about observability. And so it's really become front and center, just the whole quality issue, because data's so fundamental, hasn't it? >>Yeah, absolutely. I mean, let's imagine we pull up our phones right now and I go to my favorite stock ticker app and I check out the NASDAQ market cap. I really have no idea if that's the correct number. I know it's a number, it looks large, it's in a numeric field. And that's kind of what's going on. There are so many numbers, and they're coming from all of these different sources and data providers, and they're getting consumed and passed along. But there isn't really a way to tactically put controls on every number and metric across every field we plan to monitor. But with the scale that we achieved in the early days, even before Collibra, what's been so exciting is that we have these types of observation techniques, these data monitors, that can actually track past performance of every field at scale.
And why that's so interesting, and why I think the CDO is listening so intently nowadays to this topic, is that maybe we could surface all of these problems with the right data observability solution and with the right scale, and then just be alerted on breaking trends. So we're sort of shifting away from this world of "must write a condition", where, when that condition breaks, that was always known as a break record. But what about breaking trends and root cause analysis? And is it possible to do that, you know, with less human intervention? And so I think most people are seeing now that it's going to have to be a software tool and a computer system. It's not ever going to be based on one or two domain experts anymore. >>So how does data observability relate to data quality? Are they sort of two sides of the same coin? Are they cousins? What's your perspective on that? >>Yeah, it's super interesting. It's an emerging market, so the language is changing, and a lot of the topic and area is changing. The way that I like to say it or break it down, because the lingo is constantly moving, you know, as a target in this space, is really break records versus breaking trends. I could write a condition: when this thing happens, it's wrong, and when it doesn't, it's correct. Or I could look for a trend, and I'll give you a good example. You know, everybody's talking about fresh data and stale data, and why would that matter? Well, if your data never arrived, or only part of it arrived, or it didn't arrive on time, it's likely stale, and there will not be a condition that you could write that would show you all the goods and the bads. That was kind of your traditional approach of data quality break records. But your modern-day approach is: you lost a significant portion of your data, or it did not arrive on time to make that decision accurately on time. And that's a hidden concern.
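To illustrate the distinction drawn here between break records and breaking trends, the following is a minimal sketch and not Collibra's implementation. The function names, the sample data and the z-score threshold of 3 are all assumptions made for the example: a break record comes from an explicit condition someone wrote, while a breaking trend is flagged by comparing today's value with recent history, which is how a partial or late load can be caught without a hand-written rule.

```python
from statistics import mean, stdev


def break_records(rows, rule):
    """Rule-based check: return the rows that violate an explicit condition."""
    return [row for row in rows if not rule(row)]


def breaking_trend(history, today, threshold=3.0):
    """Trend-based check: flag today's value if it is far from recent history.

    Uses a simple z-score. The threshold of 3 standard deviations is an
    arbitrary illustrative choice, not a recommended setting.
    """
    if len(history) < 2:
        return False  # not enough history to estimate a trend
    spread = stdev(history)
    if spread == 0:
        return today != history[0]
    return abs(today - mean(history)) / spread > threshold


# Break record: a hand-written condition ("price must be positive").
rows = [{"price": 101.5}, {"price": -4.0}, {"price": 99.8}]
bad_rows = break_records(rows, lambda row: row["price"] > 0)

# Breaking trend: no condition is written; the daily row count is
# compared with what previous loads looked like (hypothetical numbers).
daily_row_counts = [10_020, 9_980, 10_050, 10_010, 9_990]
partial_load = breaking_trend(daily_row_counts, today=4_200)
normal_load = breaking_trend(daily_row_counts, today=10_030)

print(bad_rows, partial_load, normal_load)
```

In this sketch every row in the partial load could pass the price rule individually, yet the drop in row count still stands out against the history, which is the stale-data case described above.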
Some people call this freshness, we call it stale data, but it all points to the same idea: the thing that you're observing may not be a data quality condition anymore. It may be a breakdown in the data pipeline. And with thousands of data pipelines in play for every company out there, there's more than a couple of these happening every day. >>So what's the Collibra angle on all this stuff? You made the acquisition, you've got data quality and observability coming together, you guys have a lot of expertise in this area. But you hear provenance of data, you just talked about, you know, stale data, you know, the whole trend toward real time. How is Collibra approaching the problem and what's unique about your approach? >>Well, I think where we're fortunate is with our background. Myself and team, we sort of lived this problem for a long time, you know, in the Wall Street days about a decade ago. And we saw it from many different angles. And what we came up with, before it was called data observability or reliability, was basically the underpinnings of that. So we're a little bit ahead of the curve there. When most people evaluate our solution, it's more advanced than some of the observation techniques that currently exist. But we've also always covered data quality, and we believe that people want to know more, they need more insights, and they want to see break records and breaking trends together so they can correlate the root cause. And we hear that all the time: "I have so many things going wrong, just show me the big picture, help me find the thing that, if I were to fix it today, would make the most impact." So we're really focused on root cause analysis, business impact, connecting it with lineage and catalog metadata.
And as that grows, you can actually achieve total data governance. At this point, with the acquisition of what was a lineage company years ago, and then my company, OwlDQ, now Collibra Data Quality, Collibra may be the best positioned for total data governance and intelligence in the space. >>Well, you mentioned financial services a couple of times and some examples. Remember the flash crash in 2010? Nobody had any idea what that was, you know, they just said, "Oh, it's a glitch," you know, so they didn't understand the root cause of it. So this is a really interesting topic to me. So we know at Data Citizens 22 that you're announcing, you've gotta announce new products, right? It's your yearly event. What's new? Give us a sense as to what products are coming out, but specifically around data quality and observability. >>Absolutely. You know, there's always a next thing on the forefront. And the one right now is these hyperscalers in the cloud. So you have databases like Snowflake and BigQuery and Databricks' Delta Lake and SQL pushdown. And ultimately what that means is a lot of people are storing and loading data even faster in a SaaS-like model. And we've started to hook into these databases. And while we've always worked with the same databases in the past, and they're supported today, we're now doing something called native database pushdown, where the entire compute and data activity happens in the database. And why that is so interesting and powerful now is everyone's concerned with something called egress. Did my data, which I've spent all this time and money with my security team securing, ever leave my hands? Did it ever leave my secure VPC, as they call it? >>And with these native integrations that we're building and about to unveil, here's kind of a sneak peek for next week at Data Citizens. We're now doing all compute and data operations in databases like Snowflake.
And what that means is, with no install and no configuration, you could log into the Collibra data quality app and have all of your data quality running inside the database that you've probably already picked as your go-forward, team-selected, secured database of choice. So we're really excited about that. And I think if you look at the whole landscape of network cost, egress cost, data storage and compute, what people are realizing is it's extremely efficient to do it in the way that we're about to release here next week. >>So this is interesting because what you just described, you know, you mentioned Snowflake, you mentioned Google, oh, actually you mentioned, yeah, Databricks. You know, Snowflake has the data cloud. If you put everything in the data cloud, okay, you're cool. But then Google's got the open data cloud, if you heard, you know, Google Next. And now Databricks doesn't call it the data cloud, but they have, like, the open source data cloud. So you have all these different approaches, and there's really no way, up until now, I'm hearing, to really understand the relationships between all those and have confidence across them. You know, it's like Zhamak Dehghani says, you should just be a node on the mesh. And I don't care if it's a data warehouse or a data lake or where it comes from, but it's a point on that mesh and I need tooling to be able to have confidence that my data is governed and has the proper lineage, provenance. And that's what you're bringing to the table. Is that right? Did I get that right? >>Yeah, that's right. And for us, it's not that we haven't been working with those great cloud databases, but it's the fact that we can send them the instructions now. We can send them the operating ability to crunch all of the calculations, the governance, the quality, and get the answers. And what that's doing, it's basically zero network costs, zero egress cost, zero latency of time.
And so when you were to log into BigQuery tomorrow using our tool, or say Snowflake, for example, you have instant data quality metrics, instant profiling, instant lineage and access privacy controls, things of that nature that just become less onerous. What we're seeing is there's so much technology out there, just like all of the major brands that you mentioned, but how do we make it easier? The future is about less clicks, faster time to value, faster scale, and eventually lower cost. And we think that this positions us to be the leader there. >>I love this example because, you know, Barry talks about, wow, the cloud guys are gonna own the world, and of course now we're seeing that the ecosystem is finding so much white space to add value, connect across cloud. Sometimes we call it super cloud, or inter-clouding. All right, Kirk, give us your final thoughts on the trends that we've talked about and Data Citizens 22. >>Absolutely. Well, I think, you know, one big trend is discovery and classification. We're seeing that across the board. People used to know it was a zip code, and nowadays, with the amount of data that's out there, they wanna know where everything is, where their sensitive data is, if it's redundant. "Tell me everything inside of three to five seconds." And with that comes, they want to know, in all of these hyperscale databases, how fast they can get controls and insights out of their tools. So I think we're gonna see more one-click solutions, more SaaS-based solutions, and solutions that hopefully prove faster time to value on all of these modern cloud platforms. >>Excellent. All right, Kirk Haslbeck, thanks so much for coming on theCUBE and previewing Data Citizens 22. Appreciate it. >>Thanks for having me, Dave. >>You're welcome. Right, and thank you for watching. Keep it right there for more coverage from theCUBE. Welcome to theCUBE's Virtual Coverage of Data Citizens 2022.
My name is Dave Vellante and I'm here with Laura Sellers, who's the Chief Product Officer at Collibra, the host of Data Citizens. Laura, welcome. Good to see you. >>Thank you. Nice to be here. >>Yeah, your keynote at Data Citizens this year focused on, you know, your mission to drive ease of use and scale. Now when I think about historically fast access to the right data at the right time in a form that's really easily consumable, it's been kind of challenging, especially for business users. Can you explain to our audience why this matters so much and what's actually different today in the data ecosystem to make this a reality? >>Yeah, definitely. So I think what we really need, and what I hear from customers every single day, is that we need a new approach to data management and our product teams. What inspired me to come to Collibra a little bit over a year ago was really the fact that they're very focused on bringing trusted data to more users across more sources for more use cases. And so as we look at what we're announcing with these innovations of ease of use and scale, it's really about making teams more productive in getting started with, and the ability to manage, data across the entire organization. So we've been very focused on richer experiences, a broader ecosystem of partners, as well as a platform that delivers performance, scale and security that our users and teams need and demand. So as we look at, oh, go ahead. >>I was gonna say, you know, when I look back at like the last 10 years, it was all about getting the technology to work and it was just so complicated. But please carry on. I'd love to hear more about this. >>Yeah, I really, you know, Collibra is a system of engagement for data and we really are working on bringing that entire system of engagement to life for everyone to leverage here and now. So what we're announcing from our ease of use side of the world is first our data marketplace.
This is the ability for all users to discover and access data quickly and easily, shop for it, if you will. The next thing that we're also introducing is the new homepage. It's really about the ability to drive adoption and have users find data more quickly. And then the two more areas of the ease of use side of the world: one is our world of usage analytics. And one of the big pushes and passions we have at Collibra is to help with this data-driven culture that all companies are trying to create, and also helping with data literacy. With something like usage analytics, it's really about driving adoption of the Collibra platform, understanding what's working, who's accessing it, what's not. And then finally we're also introducing what's called Workflow Designer. And we love our workflows at Collibra, it's a big differentiator to be able to automate business processes. The Designer is really about a way for more people to be able to create those workflows, collaborate on those workflows, as well as people to be able to easily interact with them. So a lot of exciting things when it comes to ease of use to make it easier for all users to find data. >>Yes, there's definitely a lot to unpack there. You know, you mentioned this idea of shopping for the data. That's interesting to me. Why this analogy, metaphor or analogy, I always get those confused. Let's go with analogy. Why is it so important to data consumers? >>I think when you look at the world of data, and I talked about this system of engagement, it's really about making it more accessible to the masses. And what users are used to is a shopping experience like your Amazon, if you will.
And so having a consumer-grade experience where users can quickly go in and find the data, trust that data, understand where the data's coming from, and then be able to quickly access it, is the idea of being able to shop for it, just making it as simple as possible and really speeding the time to value for any of the business analysts and data analysts out there. >>Yeah, I think you see a lot of discussion about rethinking data architectures, putting data in the hands of the users and business people, decentralized data, and of course that's awesome. I love that. But of course then you have to have self-service infrastructure and you have to have governance. And those are really challenging. And I think so many organizations are facing adoption challenges, you know, when it comes to enabling teams generally, especially domain experts, to adopt new data technologies. You know, the tech comes fast and furious. You've got all these open source projects and it gets really confusing. Of course there are risks: security, governance and all that good stuff. You've got all this jargon. So where do you see, you know, the friction in adopting new data technologies? What's your point of view and how can organizations overcome these challenges? >>You're dead on. There's so much technology and there's so much to stay on top of, which is part of the friction, right? It's just being able to stay ahead of and understand all the technologies that are coming. You also look at how there are so many more sources of data, and people are migrating data to the cloud and they're migrating to new sources. Where the friction comes is really that ability to understand where the data came from, where it's moving to, and then also to be able to put the access controls on top of it, so people are only getting access to the data that they should be getting access to.
So one of the other things we're announcing with all of the innovations that are coming is what we're doing around performance and scale. So with all of the data movement, with all of the data that's out there, the first thing we're launching in the world of performance and scale is our world of data quality. >>It's something that Collibra has been working on for the past year and a half, but we're launching the ability to have data quality in the cloud. So it's currently an on-premise offering, but we'll now be able to carry that over into the cloud for us to manage that way. We're also introducing the ability to push down data quality into Snowflake. So this is, again, one of those challenges: making sure that the data that you have is high quality as you move forward. And so really, again, we're just reducing friction. You already have Snowflake stood up. It's not another machine for you to manage, it's just pushdown capabilities into Snowflake to be able to track that quality. Another thing that we're launching with that is what we call Collibra Protect. And this is that ability for users to be able to ingest metadata, understand where the PII data is, and then set policies up on top of it. So very quickly be able to set policies and have them enforced at the data level, so anybody in the organization is only getting access to the data they should have access to. >>This topic of data quality is interesting. It's something that I've followed for a number of years. It used to be a back-office function, you know, and really confined only to highly regulated industries like financial services and healthcare and government. You know, you look back over a decade ago, you didn't have this worry about personal information. GDPR and, you know, the California Consumer Privacy Act, it's all become so much more important.
The cloud has really changed things in terms of performance and scale, and of course partnering with Snowflake, it's all about sharing data and monetization, anything but a back-office function. So it was kind of smart that you guys were early on, and of course attracting them as an investor as well was very strong validation. What can you tell us about the nature of the relationship with Snowflake? And I'm specifically interested in sort of joint engineering and product innovation efforts, you know, beyond the standard go-to-market stuff. >>Definitely. So you mentioned they were a strategic investor in Collibra about a year ago. A little less than that, I guess. We've been working with them, though, for over a year, really tightly with their product and engineering teams, to make sure that Collibra is adding real value. Pieces of our unified platform are touching all pieces of Snowflake. And when I say that, what I mean is we're first, you know, able to ingest data with Snowflake, which has always existed. We're able to profile and classify that data. We're announcing with Collibra Protect this week that you're now able to create those policies on top of Snowflake and have them enforced. So again, people can get more value out of their Snowflake more quickly, as far as time to value, with our policies for all business users to be able to create. >>We're also announcing Snowflake Lineage 2.0. So this is the ability to take stored procedures in Snowflake and understand the lineage of where the data came from and how it was transformed within Snowflake, as well as the data quality pushdown, as I mentioned. Data quality, you brought it up. It is a big industry push, and you know, one of the things I think Gartner mentioned is people are losing up to $15 million without having great data quality.
So this pushdown capability for Snowflake really is, again, a big ease of use push for us at Collibra: that ability to push it into Snowflake, take advantage of the data source and the engine that already lives there, and make sure you have the right quality. >>I mean, the nice thing about Snowflake, if you play in the Snowflake sandbox, you can get sort of a, you know, high degree of confidence that the data sharing can be done in a safe way. Bringing, you know, Collibra into the story allows me to have that data quality and that governance that I need. You know, we've said many times on theCUBE that one of the notable differences in cloud this decade versus last decade, I mean, there are obvious differences just in terms of scale and scope, but it's shaping up to be about the strength of the ecosystems. That's really a hallmark of these big cloud players. I mean, it's a key factor for innovating, accelerating product delivery, filling gaps in the hyperscale offerings, because you've got more stack, you know, mature stack capabilities, and you know, it creates this flywheel momentum, as we often say. But so my question is, how do you work with the hyperscalers? Like whether it's AWS or Google, whomever, and what do you see as your role and what's the Collibra sweet spot? >>Yeah, definitely. So, you know, one of the things I mentioned early on is the broader ecosystem of partners is what it's all about. And so we have that strong partnership with Snowflake. We also are doing more with Google around, you know, GCP and Collibra Protect there, but also tighter Dataplex integration. So similar to what you've seen with our strategic moves around Snowflake and really covering the broad ecosystem of what Collibra can do on top of that data source, we're extending that to the world of Google as well and the world of Dataplex.
We also have great partners in SIs. Infosys is somebody we spoke with at the conference who's done a lot of great work with Levi's, as they're really important to help people with their whole data strategy and driving that data-driven culture, and Collibra being the core of it. >>Laura, we're gonna end it there, but I wonder if you could kind of put a bow on, you know, this year, the event, your perspectives. So just give us your closing thoughts. >>Yeah, definitely. So I wanna say this is one of the biggest releases Collibra's ever had. Definitely the biggest one since I've been with the company, a little over a year. We have all these great new product innovations coming to really drive the ease of use, to make data more valuable for users everywhere and companies everywhere. And so it's all about everybody being able to easily find, understand, trust and get access to that data going forward. >>Well, congratulations on all the progress. It was great to have you on theCUBE, first time I believe, and really appreciate you taking the time with us. >>Yes, thank you for your time. >>You're very welcome. Okay, you're watching the coverage of Data Citizens 2022 on theCUBE, your leader in enterprise and emerging tech coverage. >>So data modernization oftentimes means moving some of your storage and compute to the cloud, where you get the benefit of scale and security and so on. But ultimately it doesn't take away the silos that you have. We have more locations, more tools and more processes with which we try to get value from this data. To do that at scale in an organization, the people involved in this process have to understand each other. So you need to unite those people across those tools, processes and systems with a shared language. When I say customer, do you understand the same thing when you hear customer?
Are we counting them in the same way? So that shared language unites us, and that gives the opportunity for the organization as a whole to get the maximum value out of their data assets, and then they can democratize data so everyone can properly use that shared language to find, understand and trust the data asset that's available. >>And that's where Collibra comes in. We provide a centralized system of engagement that works across all of those locations and combines all of those different user types across the whole business. At Collibra, we say United by Data, and that also means that we're united by data with our customers. So here is some data about some of our customers. There was the case of an online do-it-yourself platform who grew their revenue almost three times from a marketing campaign that put the right product in the hands of the right people. Another case that comes to mind is from a financial services organization who saved over 800K every year because they were able to reuse the same data in different kinds of reports. Before, that was spread out over different tools and processes and silos, and now the platform brought them together, so they realized, "Oh, we're actually using the same data, let's find a way to make this more efficient." And the last example that comes to mind is that of a large home mortgage loan provider, where they have a very complex landscape, a very complex architecture, legacy, in the cloud, et cetera. And they're using our software, they're using our platform, to unite all the people and those processes and tools to get a common view of data to manage their compliance at scale. >>Hey everyone, I'm Lisa Martin covering Data Citizens 22, brought to you by Collibra. This next conversation is gonna focus on the importance of data culture. One of our CUBE alumni is back: Stijn Christiaens is Collibra's co-founder and its Chief Data Citizen. Stijn, it's great to have you back on theCUBE.
>>Hey Lisa, nice to be here. >>So we're gonna be talking about the importance of data culture, data intelligence, maturity, all those great things. When we think about the data revolution that every business is going through, you know, it's so much more than technology innovation. It also really requires cultural transformation, community transformation. Those are challenging for customers to undertake. Talk to us about what you mean by data citizenship and the role that creating a data culture plays in that journey. >>Right. So as you know, our event is called Data Citizens because we believe that, in the end, a data citizen is anyone who uses data to do their job. And we believe that in today's organizations, you have a lot of people, most of the employees in an organization, who are somehow going to be a data citizen, right? So you need to make sure that these people are aware of it. You need people to have the skills and competencies to do with data what's necessary, and that's on, all right? So what does it mean to have a good data culture? It means that if you're building a beautiful dashboard to try and convince your boss we need to make this decision, that your boss is also open to and able to interpret, you know, the data presented in the dashboard to actually make that decision and take that action. Right? >>And once you have that across the organization, that's when you have a good data culture. Now that's a continuous effort for most organizations because they're always moving, somehow they're hiring new people. And it has to be a continuous effort because we've seen that, on the one hand, organizations continue to be challenged by their data sources and where all the data is flowing, right? Which in itself creates a lot of risk. But also, on the other hand of the equation, you have the benefit. You know, you might look at regulatory drivers like, "We have to do this," right?
But it's much better right now to consider the competitive drivers, for example. And we did an IDC study earlier this year, quite interesting. I can recommend it to anyone. And one of the conclusions they found, as they surveyed over a thousand people across organizations worldwide, is about the ones who are higher in maturity. >>So the organizations that really look at data as an asset, look at data as a product and actively try to be better at it have three times as good a business outcome as the ones who are lower on the maturity scale, right? So you can say, okay, I'm doing this, you know, data culture for everyone, waking them up as data citizens. I'm doing this for competitive reasons, I'm doing this for regulatory reasons. You're trying to bring both of those together, and the ones that get data intelligence right are successful and competitive. And that's what we're seeing out there in the market. >>Absolutely. We know that just generally, Stijn, right, the organizations that are really creating a data culture and enabling everybody within the organization to become data citizens, we know that in theory they're more competitive, they're more successful. But the IDC study that you just mentioned demonstrates they're three times more successful and competitive than their peers. Talk about how Collibra advises customers to create that community, that culture of data, when it might be challenging for an organization to adapt culturally. >>Of course, of course it's difficult for an organization to adapt, but it's also necessary. As you just said, imagine that, you know, you're a modern-day organization with laptops, what have you, and you're not using those, right? Or, you know, you're delivering them throughout the organization, but not enabling your colleagues to actually do something with that asset. The same thing is true with data today, right? If you're not properly using the data asset and competitors are, they're going to get more advantage.
So as to how you get this done, how you establish this, there are angles to look at, Lisa. So one angle is obviously the leadership, whereby whoever is the boss of data in the organization, you typically have multiple bosses there, like a chief data officer. Sometimes there are multiple, but they may have a different title, right? So I'm just gonna summarize it as a data leader for a second. >>So whoever that is, they need to make sure that there's a clear vision, a clear strategy for data. And that strategy needs to include the monetization aspect. How are you going to get value from data? Yes. Now that's one part, because then you have leadership in the organization and also the business value. And that's important, because those people, their job in essence really is to make everyone in the organization think about data as an asset. And I think that's the second part of the equation of getting that right: it's not enough to just have that leadership out there, but you also have to get the hearts and minds of the data champions across the organization. You really have to win them over. And if you have those two combined, and obviously a good technology to, you know, connect those people and have them execute on their responsibilities, such as a data intelligence platform like Collibra's, then you have the pieces in place to really start upgrading that culture inch by inch, if you will. >>Yes, I like that. The recipe for success. So you are the co-founder of Collibra. You've worn many different hats along this journey. Now you're building Collibra's own data office. I like how before we went live, we were talking about how Collibra is drinking its own champagne. I always love to hear stories about that. You're speaking at Data Citizens 2022. Talk to us about how you are building a data culture within Collibra and what maybe some of the specific projects are that Collibra's data office is working on. >>Yes, and it is indeed Data Citizens. There are a ton of speakers here; we're very excited.
You know, we have Barb from MIT speaking about data monetization. We have Dilla at the last minute. So a really exciting agenda. Can't wait to get back out there, essentially. So over the years at Collibra, we've been doing this since 2008, so a good number of years, and I think we have another decade of work ahead in the market, just to be very clear. Data is here to stick around, as are we. And myself, you know, when you start a company, we were four people, if you will, so everybody's wearing all sorts of hats at that time. But over the years I've run, you know, presales, sales, partnerships, product, et cetera. And as our company got a little bit biggish, we're now a thousand two, something like that, people in the company. >>I believe systems and processes become a lot more important. So we said, you know, Collibra isn't the size of our customers, but we're getting there in terms of organization structure, process, systems, et cetera. So we said it's really time for us to put our money where our mouth is and to set up our own data office, which is what we were seeing at customers' organizations worldwide. And organizations have HR units, they have a finance unit, and over time they'll all have a department, if you will, that is responsible somehow for the data. So we said, ok, let's try to set an example that other people can take away from, right? So we set up a data strategy, we started building data products, took care of the data infrastructure, that sort of good stuff. And in doing all of that, Lisa, exactly as you said, we said, okay, we need to also use our product and our own practices, and from that use, learn how we can make the product better, learn how we can make the practice better, and share that learning. And on the Monday mornings, we sometimes refer to that as eating our own dog food; on Friday evenings,
we refer to that as drinking our own champagne. I like it. So we had the driver to do this. You know, there's a clear business reason. So we included that in the data strategy, and that's a little bit of our origin. Now how do we organize this? We have three pillars, and by no means is this a template that everyone should follow; this is just the organization that works at our company, but it can serve as an inspiration. So we have a pillar which is data science: the data product builders, if you will, or the people who help the business build data products. We have the data engineers, who help keep the lights on for that data platform to make sure that the products, the data products, can run, the data can flow, and, you know, the quality can be checked. >>And then we have the data intelligence or data governance builders, where we have those data governance, data intelligence stakeholders who help the business as a sort of data partner to the business stakeholders. So that's how we've organized it. And then we started following the Collibra approach, which is, well, what are the challenges that our business stakeholders have in HR, finance, sales, marketing, all over? And how can data help overcome those challenges? And from those use cases, we then just started to build a map and started execution, use case after use case. And a few important ones are very simple. We see them with our customers as well: people talking about the catalog, right? The catalog for the data scientists to know what's in their data lake, for example, and for the people in privacy, so they have their process registry and they can see how the data flows. >>So that's a starting place, and that turns into a marketplace, so that if new analysts and data citizens join Collibra, they immediately have a place to go to, to look at, see, ok, what data is out there for me as an analyst or a data scientist or whatever to do my job, right? So they can immediately get access to data. And another one that we did is around trusted business reporting.
We're seeing that since, you know, self-service BI allowed everyone to make beautiful dashboards, you know, pie charts. My pet peeve is the pie chart, because I love pie, and you shouldn't always be using pie charts. But essentially there's been a proliferation of those reports. And now executives don't really know, okay, should I trust this report or that report? They're reporting on the same thing, but the numbers seem different, right? So that's why we have trusted business reporting. So we know if a dashboard, a data product essentially, is built, we know that all the right steps are being followed and that whoever is consuming that can be quite confident in the result, right? >>Exactly. Yes. >>Absolutely. Talk a little bit about some of the key performance indicators that you're using to measure the success of the data office. What are some of those KPIs? >>KPIs and measuring is a big topic in the chief data officer profession, I would say. And again, it always varies with respect to your organization, but there are a few that we use that might be of interest. We use those pillars, right? And we have metrics across those pillars. So for example, a pillar on the data engineering side is gonna be more related to that uptime, right? Is the data platform up and running? Are the data products up and running? Is the quality in them good enough? Is it going up? Is it going down? What's the usage? But also, and especially if you're in the cloud and if consumption's a big thing, you have metrics around cost, for example, right? So that's one set of examples. Another one is around the data science and products. Are people using them? Are they getting value from it? >>Can we calculate that value, right? Yeah.
So that we can continue to say to the rest of the business, we're tracking all those numbers, and those numbers indicate that value is generated, and how much value, estimated in that region. And then you have some data intelligence, data governance metrics, which is, for example, you have a number of domains in a data mesh. People talk about being the owner of a data domain, for example, like product or customer. So how many of those domains do you have covered? How many of them are already part of the program? How many of them have owners assigned? How well are these owners organized, executing on their responsibilities? How many tickets are open, closed? How many data products are built according to process? And so on and so forth. So these are a set of examples of KPIs. There's a lot more, but hopefully those can already inspire the audience. >>Absolutely. So we've talked about the rise of chief data offices; it's only accelerating. You mentioned this is like a 10 year journey. So if you were to look into a crystal ball, what do you see in terms of the maturation of data offices over the next decade? >>So we've seen indeed the role sort of grow up. I think in 2010 there may have been like 10 chief data officers or something. Gartner has exact numbers on them, but then they grew, you know, across industries, and the number is estimated to be about 20,000 right now. Wow. And they evolved in a sort of stack of competencies: defensive data strategy, because the first chief data officers were more regulatory driven; offensive data strategy; support for the digital program; and now all about data products, right? So as a data leader, you now need all of those competencies and need to include them in your strategy. >>How is that going to evolve for the next couple of years? I wish I had one of those balls, right?
But essentially I think for the next couple of years there's gonna be a lot of people, you know, still moving along with those four levels of the stack. A lot of people I see are still in version one and version two of the chief data officer. So you'll see over the years that's gonna evolve to more digital and more data products. So for the next years, my prediction is it's all products, because it's an immediate link between data and the... essentially, right? Right. So that's gonna be important, and quite likely some new things will be added on, which nobody can predict yet. But we'll see those pop up in a few years. I think there's gonna be a continued challenge for the chief data officer role to become a real executive role as opposed to, you know, somebody who claims that they're executive, but then they're not, right? >>So the real reporting level into the board, into the CEO for example, will continue to be a challenging point. But the ones who do get that done will be the ones that are successful, and the ones who get that done will be the ones that do it on the basis of data monetization, right? Connecting value to the data and making that value clear to all the data citizens in the organization, right? And in that sense, they'll need to have both, you know, technical audiences and non-technical audiences aligned, of course. And they'll need to focus on adoption. Again, it's not enough to just have your data office be involved in this. It's really important that you're waking up data citizens across the organization and you make everyone in the organization think about data as an asset. >>Absolutely. Because there's so much value that can be extracted when organizations really strategically build that data office and democratize access across all those data citizens. Stan, this is an exciting arena. We're definitely gonna keep our eyes on this. Sounds like a lot of evolution and maturation coming from the data office perspective, from the data citizen perspective.
And as the data shows that you mentioned in that IDC study, and you mentioned Gartner as well, organizations have so much more likelihood of being successful and being competitive. So we're gonna watch this space. Stan, thank you so much for joining me on the Cube at Data Citizens 22. We appreciate it. >>Thanks for having me over. >>From Data Citizens 22, I'm Lisa Martin, you're watching the Cube, the leader in live tech coverage. >>Okay, this concludes our coverage of Data Citizens 2022, brought to you by Collibra. Remember, all these videos are available on demand at thecube.net. And don't forget to check out siliconangle.com for all the news and wikibon.com for our weekly Breaking Analysis series, where we cover many data topics and share survey research from our partner ETR, Enterprise Technology Research. If you want more information on the products announced at Data Citizens, go to collibra.com. There are tons of resources there. You'll find analyst reports, product demos. It's really worthwhile to check those out. Thanks for watching our program and digging into Data Citizens 2022 on the Cube, your leader in enterprise and emerging tech coverage. We'll see you soon.

Published Date : Nov 2 2022



Felix Van de Maele, Collibra | Data Citizens '22

(upbeat music) >> Last year, the Cube covered Data Citizens, Collibra's customer event. And the premise that we put forth prior to that event was that despite all the innovation that's gone on over the last decade or more with data, you know, starting with the Hadoop movement. We had data lakes, we had Spark, the ascendancy of programming languages like Python, the introduction of frameworks like TensorFlow, the rise of AI, low code, no code, et cetera. Businesses still find it's too difficult to get more value from their data initiatives. And we said at the time, you know, maybe it's time to rethink data innovation. While a lot of the effort has been focused on more efficiently storing and processing data, perhaps more energy needs to go into thinking about the people and the process side of the equation, meaning making it easier for domain experts to both gain insights from data, trust the data, and begin to use that data in new ways, fueling data products, monetization, and insights. Data Citizens 2022 is back, and we're pleased to have Felix Van de Maele, who is the founder and CEO of Collibra. He's on the Cube. We're excited to have you, Felix. Good to see you again. >> Likewise Dave. Thanks for having me again. >> You bet. All right, we're going to get the update from Felix on the current data landscape, how he sees it, why data intelligence is more important now than ever, and get current on what Collibra has been up to over the past year, and what's changed since Data Citizens 2021. And we may even touch on some of the product news. So Felix, we're living in a very different world today with businesses and consumers. They're struggling with things like supply chains, uncertain economic trends, and we're not just snapping back to the 2010s. That's clear. And that's really true, as well, in the world of data. So what's different in your mind in the data landscape of the 2020s from the previous decade, and what challenges does that bring for your customers? 
>> Yeah, absolutely. And I think you said it well, Dave, in the intro that rising complexity and fragmentation in the broader data landscape that hasn't gotten any better over the last couple of years. When we talk to our customers, that level of fragmentation, the complexity, how do we find data that we can trust, that we know we can use, has only gotten kind of more difficult. So that trend is continuing. I think what is changing is that trend has become much more acute. Well, the other thing we've seen over the last couple of years is that the level of scrutiny that organizations are under with respect to data, as data becomes more mission critical, as data becomes more impactful and important, the level of scrutiny with respect to privacy, security, regulatory compliance, is only increasing as well. Which again, is really difficult in this environment of continuous innovation, continuous change, continuous growing complexity and fragmentation. So it's become much more acute. And to your earlier point, we do live in a different world, and the past couple of years, we could probably just kind of brute force it, right? We could focus on the top line. There was enough kind of investments to be had. I think nowadays organizations are focused, or are in a very different environment where there's much more focus on cost control, productivity, efficiency. How do we truly get value from that data? So again, I think it's just another incentive for organizations to now truly look at that data and to scale that data, not just from a technology and infrastructure perspective, but how do we actually scale data from an organizational perspective, right? Like you said, the people and process, how do we do that at scale? And that's only becoming much more important. And we do believe that the economic environment that we find ourselves in today is going to be a catalyst for organizations to really take that more seriously if you will than they maybe have in the past. 
>> You know, I don't know when you guys founded Collibra, if you had a sense as to how complicated it was going to get, but you've been on a mission to really address these problems from the beginning. How would you describe your mission, and what are you doing to address these challenges? >> Yeah, absolutely. We started Collibra in 2008. So in some sense in the last kind of financial crisis. And that was really the start of Collibra, where we found product market fit working with large financial institutions to help them cope with the increasing compliance requirements that they were faced with because of the financial crisis, and kind of here we are again in a very different environment of course, 15 years, almost 15 years later. But data only becoming more important. But our mission to deliver trusted data for every user, every use case, and across every source, frankly has only become more important. So while it's been an incredible journey over the last 14, 15 years, I think we're still relatively early in our mission to, again, be able to provide everyone, and that's why we call it Data Citizens. We truly believe that everyone in the organization should be able to use trusted data in an easy, easy manner. That mission is only becoming more important, more relevant. We definitely have a lot more work ahead of us because we're still relatively early in that journey. >> Well, that's interesting because, you know, in my observation, it takes seven to 10 years to actually build a company, and then the fact that you're still in the early days is kind of interesting. I mean, Collibra's had a good 12 months or so since we last spoke at Data Citizens. Give us the latest update on your business. What do people need to know about your current momentum? >> Yeah, absolutely.
Again, there's a lot of tailwinds, organizations are only maturing their data practices, and we've seen it kind of transform, or influence a lot of our business growth that we've seen, broader adoption of the platform. We work at some of the largest organizations in the world, whether it's Adobe, Heineken, Bank of America, and many more. We have now over 600 enterprise customers, all industry leaders and every single vertical. So it's really exciting to see that and continue to partner with those organizations. On the partnership side, again, a lot of momentum in the market with some of the cloud partners like Google, Amazon, Snowflake, Databricks, and others, right? As those kind of new modern data infrastructures, modern data architectures, are definitely all moving to the cloud. A great opportunity for us, our partners, and of course our customers, to help them kind of transition to the cloud even faster. And so we see a lot of excitement and momentum there. We did an acquisition about 18 months ago around data quality, data observability, which we believe is an enormous opportunity. Of course data quality isn't new, but I think there's a lot of reasons why we're so excited about quality and observability now. One is around leveraging AI, machine learning, again to drive more automation. And the second is that those data pipelines that are now being created in the cloud, in these modern data architectures, they've become mission critical. They've become real time. And so monitoring, observing those data pipelines continuously has become absolutely critical. So we're really excited about that as well. And on the organizational side, I'm sure you've heard a term around kind of data mesh, something that's gaining a lot of momentum, rightfully so. It's really the type of governance that we always believed in. Federated, focused on domains, giving a lot of ownership to different teams. 
I think that's the way to scale the data organizations, and so that aligns really well with our vision, and from a product perspective, we've seen a lot of momentum with our customers there as well. >> Yeah, you know, a couple things there. I mean, the acquisition of OwlDQ, you know, Kirk Haslbeck and their team, it's interesting, you know, the whole data quality used to be this back office function and really confined to highly regulated industries. It's come to the front office, it's top of mind for chief data officers, data mesh, you mentioned. You guys are a connective tissue for all these different nodes on the data mesh. That's key. And of course we see you at all the shows. You're a critical part of many ecosystems, and you're developing your own ecosystem. So let's chat a little bit about the products. We're going to go deeper into products later on at Data Citizens '22, but we know you're debuting some new innovations, you know, whether it's, you know, the under the covers in security, sort of making data more accessible for people, just dealing with workflows and processes as you talked about earlier. Tell us a little bit about what you're introducing. >> Yeah, absolutely. We're super excited, a ton of innovation. And if we think about the big theme, and like I said, we're still relatively early in this journey towards kind of that mission of data intelligence, that really bold and compelling mission. Either customers are just starting on that journey, and we want to make it as easy as possible for the organization to actually get started, because we know that's important that they do. And for our organization and customers that have been with us for some time, there's still a tremendous amount of opportunity to kind of expand the platform further. And again, to make it easier for, really to accomplish that mission and vision around that data citizen that everyone has access to trustworthy data in a very easy, easy way. 
So that's really the theme of a lot of the innovation that we're driving, a lot of kind of ease of adoption, ease of use, but also then, how do we make sure that as Collibra becomes this kind of mission critical enterprise platform, from a security, performance, architecture, scale, supportability perspective, that we're truly able to deliver that kind of an enterprise mission critical platform. And so that's the big theme. From an innovation perspective, from a product perspective, a lot of new innovation that we're really excited about. A couple of highlights. One is around data marketplace. Again, a lot of our customers have plans in that direction. How do we make it easy? How do we make available a true kind of shopping experience so that anybody in your organization can, in a very easy search first way, find the right data product, find the right data set that they can then consume. Usage analytics: how do we help organizations drive adoption, tell them where they're working really well, and where they have opportunities. Home pages, again, to make things easy for people, for anyone in your organization, to kind of get started with Collibra. You mentioned workflow designer, again, we have a very powerful enterprise platform. One of our key differentiators is the ability to really drive a lot of automation through workflows. And now we provided a new low code, no code, kind of workflow designer experience. So really customers can take it to the next level. There's a lot more new product around Collibra Protect, which, in partnership with Snowflake, which has been a strategic investor in Collibra, is focused on how do we make access governance easier? How are we able to make sure that as you move to the cloud, things like access management, masking around sensitive data, PII data, is managed in a much more effective way. Really excited about that product. There's more around data quality. Again, how do we get that deployed as easily and quickly and widely as we can?
Moving that to the cloud has been a big part of our strategy. So we launched our data quality cloud product as well as making use of those native compute capabilities in platforms like Snowflake, Databricks, Google, Amazon, and others. And so we are bettering a capability that we call push down. So we're actually pushing down the compute and data quality, the monitoring, into the underlying platform, which again, from a scale, performance and ease of use perspective is going to make a massive difference. And then more broadly, we talked a little bit about the ecosystem. Again, integrations that we talk about, being able to connect to every source. Integrations are absolutely critical, and we're really excited to deliver new integrations with Snowflake, Azure, and Google Cloud Storage as well. So there's a lot coming out. The team has been at work really hard, and we are really, really excited about what we're bringing to market. >> Yeah, a lot going on there. I wonder if you could give us your closing thoughts. I mean, you talked about the marketplace, you know, you think about data mesh, you think of data as product, one of the key principles. You think about monetization. This is really different than what we've been used to in data, which is just getting the technology to work has been so hard, so how do you see sort of the future? And, you know, give us your closing thoughts please. >> Yeah, absolutely. And I think we're really at this pivotal moment, and I think you said it well. We all know the constraint and the challenges with data, how to actually do data at scale. And while we've seen a ton of innovation on the infrastructure side, we fundamentally believe that just getting a faster database is important, but it's not going to fully solve the challenges and truly kind of deliver on the opportunity. And that's why now is really the time to deliver this data intelligence vision, the data intelligence platform.
We are still early, making it as easy as we can. It's kind of our, as our mission. And so I'm really, really excited to see what we are going to, how the markets are going to evolve over the next few quarters and years. I think the trend is clearly there, when we talk about data mesh, this kind of federated approach, focus on data products is just another signal that we believe that a lot of our organizations are now at the time, they understand the need to go beyond just the technology, how to really, really think about how to actually scale data as a business function, just like we've done with IT, with HR, with sales and marketing, with finance. That's how we need to think about data. I think now's the time given the economic environment that we are in, much more focus on control, much more focus on productivity, efficiency, and now's the time we need to look beyond just the technology and infrastructure to think of how to scale data, how to manage data at scale. >> Yeah, it's a new era. The next 10 years of data won't be like the last, as I always say. Felix, thanks so much, and good luck in San Diego. I know you're going to crush it out there. >> Thank you Dave. >> Yeah, it's a great spot for an in person event, and of course, the content post event is going to be available at collibra.com, and you can of course catch the Cube coverage at thecube.net, and all the news at siliconangle.com. This is Dave Vellante for the Cube, your leader in enterprise and emerging tech coverage. (light music)
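
Editor's sketch: the "push down" idea described in the interview, running the data quality computation inside the data platform instead of pulling rows out of it, can be illustrated with a small example. This is only an illustration under stated assumptions, not Collibra's implementation: it uses Python's built-in sqlite3 as a stand-in for a cloud database, and the table name, column name, and both function names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a tiny, made-up customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "a@example.com"), (2, None), (3, "c@example.com"), (4, None)],
    )
    return conn


def null_rate_pull(conn, table, column):
    """No pushdown: every row leaves the database and is counted client-side."""
    # Identifiers are interpolated for brevity; only safe with trusted names.
    rows = conn.execute(f"SELECT {column} FROM {table}").fetchall()
    nulls = sum(1 for (value,) in rows if value is None)
    return nulls / len(rows) if rows else 0.0


def null_rate_pushdown(conn, table, column):
    """Pushdown: the database computes the aggregate; one row comes back."""
    total, nulls = conn.execute(
        f"SELECT COUNT(*), "
        f"SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END) "
        f"FROM {table}"
    ).fetchone()
    # SUM returns NULL (None) on an empty table, so guard both values.
    return (nulls or 0) / total if total else 0.0


if __name__ == "__main__":
    conn = build_demo_db()
    print(null_rate_pull(conn, "customers", "email"))
    print(null_rate_pushdown(conn, "customers", "email"))
```

Both functions return the same metric. The difference is where the work happens: the pushdown version ships a query to the data and returns a single aggregate row, which is the property the interview ties to scale, performance, and ease of use.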

Published Date : Oct 24 2022

ENTITIES

Entity	Category	Confidence
Adobe	ORGANIZATION	0.99+
Heineken	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Collibra	ORGANIZATION	0.99+
San Diego	LOCATION	0.99+
Dave	PERSON	0.99+
Felix Van de Maele	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Snowflake	ORGANIZATION	0.99+
seven	QUANTITY	0.99+
2008	DATE	0.99+
Felix	PERSON	0.99+
Bank of America	ORGANIZATION	0.99+
2020s	DATE	0.99+
Databricks	ORGANIZATION	0.99+
Python	TITLE	0.99+
2010s	DATE	0.99+
Last year	DATE	0.99+
thecube.net	OTHER	0.99+
Data Citizens	ORGANIZATION	0.99+
12 months	QUANTITY	0.99+
second	QUANTITY	0.99+
siliconangle.com	OTHER	0.99+
One	QUANTITY	0.99+
10 years	QUANTITY	0.99+
OwlDQ	ORGANIZATION	0.98+
Spark	TITLE	0.98+
TensorFlow	TITLE	0.97+
Data Citizens	EVENT	0.97+
today	DATE	0.97+
Kirk Haslbeck	PERSON	0.96+
over 600 enterprise customers	QUANTITY	0.96+
both	QUANTITY	0.96+
Collibra Protect	ORGANIZATION	0.96+
first way	QUANTITY	0.94+
one	QUANTITY	0.93+
last decade	DATE	0.93+
past couple of years	DATE	0.93+
collibra.com	OTHER	0.92+
15 years	QUANTITY	0.88+
about 18 months ago	DATE	0.87+
last couple of years	DATE	0.87+
last couple of years	DATE	0.83+
almost 15 years later	DATE	0.82+
Data	ORGANIZATION	0.81+
previous decade	DATE	0.76+
Data Citizens 2021	ORGANIZATION	0.73+
next 10 years	DATE	0.69+
quarters	DATE	0.67+
last	DATE	0.66+
Data Citizens 2022	ORGANIZATION	0.63+
Google Cloud	ORGANIZATION	0.63+
past year	DATE	0.62+
Storage	TITLE	0.6+
Azure	ORGANIZATION	0.59+
next	DATE	0.58+
case	QUANTITY	0.58+
Cube	ORGANIZATION	0.53+
single vertical	QUANTITY	0.53+
14	QUANTITY	0.46+
Cube	COMMERCIAL_ITEM	0.45+

Kirk Haslbeck, Collibra | Data Citizens '22

(bright upbeat music) >> Welcome to theCUBE's coverage of Data Citizens 2022, Collibra's customer event. My name is Dave Vellante. With us is Kirk Haslbeck, who's the Vice President of Data Quality at Collibra. Kirk, good to see you. Welcome. >> Thanks for having me, Dave. Excited to be here. >> You bet. Okay, we're going to discuss data quality, observability. It's a hot trend right now. You founded a data quality company, OwlDQ, and it was acquired by Collibra last year. Congratulations! And now you lead data quality at Collibra. So we're hearing a lot about data quality right now. Why is it such a priority? Take us through your thoughts on that. >> Yeah, absolutely. It's definitely exciting times for data quality which, you're right, has been around for a long time. So why now, and why is it so much more exciting than it used to be? I think it's a bit stale, but we all know that companies use more data than ever before and the variety has changed and the volume has grown. And while I think that remains true, there are a couple of other hidden factors at play that explain why everyone's so interested and why this is becoming so important now. And I guess you could kind of break this down simply and think about if, Dave, you and I were going to build, you know, a new healthcare application and monitor the heartbeat of individuals. Imagine if we get that wrong, what the ramifications could be? What those incidents would look like? Or maybe better yet, we try to build a new trading algorithm with a crossover strategy where the 50-day crosses the 10-day average. And imagine if the data underlying the inputs to that is incorrect. We'll probably have major financial ramifications in that sense. So, it kind of starts there, where everybody's realizing that we're all data companies and if we are using bad data, we're likely making incorrect business decisions. But I think there's kind of two other things at play.
I bought a car not too long ago and my dad called and said, "How many cylinders does it have?" And I realized in that moment, I might have failed him because I didn't know. And I used to ask those types of questions about anti-lock brakes and cylinders and if it's manual or automatic, and I realized I now just buy a car that I hope works. And it's so complicated with all the computer chips. I really don't know that much about it. And that's what's happening with data. We're just loading so much of it. And it's so complex that the way companies consume it in the IT function is that they bring in a lot of data and then they syndicate it out to the business. And it turns out that the individuals loading and consuming all of this data for the company actually may not know that much about the data itself, and that's not even their job anymore. So, we'll talk more about that in a minute, but that's really what's setting the foreground for this observability play and why everybody's so interested. It's because we're becoming less close to the intricacies of the data and we just expect it to always be there and be correct. >> You know, the other thing too about data quality, and for years we did the MIT CDOIQ event. We didn't do it last year; COVID messed everything up. But the observation I would make there, and I'd love your thoughts, is that data quality, it used to be information quality, used to be this back office function, and then it became sort of front office with financial services and government and healthcare, these highly regulated industries. And then the whole chief data officer thing happened and people were realizing, well, they sort of flipped the bit from sort of data as a risk to data as an asset. And now, as we say, we're going to talk about observability. And so it's really become front and center, just the whole quality issue, because data's fundamental, hasn't it? >> Yeah, absolutely.
I mean, let's imagine we pull up our phones right now and I go to my favorite stock ticker app and I check out the NASDAQ market cap. I really have no idea if that's the correct number. I know it's a number, it looks large, it's in a numeric field. And that's kind of what's going on. There's so many numbers and they're coming from all of these different sources and data providers and they're getting consumed and passed along. But there isn't really a way to tactically put controls on every number and metric across every field we plan to monitor. But with the scale that we achieved in the early days, even before Collibra, what's been so exciting is we have these types of observation techniques, these data monitors that can actually track past performance of every field at scale. And why that's so interesting, and why I think the CDO is listening really intently nowadays to this topic, is that maybe we could surface all of these problems with the right solution of data observability and with the right scale and then just be alerted on breaking trends. So we're sort of shifting away from this world of "must write a condition," and then when that condition breaks, that was always known as a break record. But what about breaking trends and root cause analysis? And is it possible to do that with less human intervention? And so I think most people are seeing now that it's going to have to be a software tool and a computer system. It's not ever going to be based on one or two domain experts anymore. >> So, how does data observability relate to data quality? Are they sort of two sides of the same coin? Are they cousins? What's your perspective on that? >> Yeah, it's super interesting. It's an emerging market. So the language is changing a lot, the topic and areas are changing. The way that I like to say it or break it down, because the lingo is constantly a moving target in this space, is really break records versus breaking trends.
And I could write a condition: when this thing happens it's wrong and when it doesn't, it's correct. Or I could look for a trend, and I'll give you a good example. Everybody's talking about fresh data and stale data, and why would that matter? Well, if your data never arrived or only part of it arrived or didn't arrive on time, it's likely stale and there will not be a condition that you could write that would show you all the goods and the bads. That was kind of your traditional approach of data quality break records. But your modern day approach is: you lost a significant portion of your data, or it did not arrive on time to make that decision accurately on time. And that's a hidden concern. Some people call this freshness, we call it stale data, but it all points to the same idea: the thing that you're observing may not be a data quality condition anymore. It may be a breakdown in the data pipeline. And with thousands of data pipelines in play for every company out there, there's more than a couple of these happening every day. >> So what's the Collibra angle on all this stuff? You made the acquisition, you've got data quality and observability coming together, you guys have a lot of expertise in this area, but you hear about provenance of data, you just talked about stale data, the whole trend toward real time. How is Collibra approaching the problem and what's unique about your approach? >> Well, I think where we're fortunate is with our background. Myself and the team, we sort of lived this problem for a long time in the Wall Street days about a decade ago. And we saw it from many different angles. And what we came up with, before it was called data observability or reliability, was basically the underpinnings of that. So we're a little bit ahead of the curve there. When most people evaluate our solution, it's more advanced than some of the observation techniques that currently exist.
But we've also always covered data quality, and we believe that people want to know more, they need more insights, and they want to see break records and breaking trends together so they can correlate the root cause. And we hear that all the time: "I have so many things going wrong, just show me the big picture. Help me find the thing that, if I were to fix it today, would make the most impact." So we're really focused on root cause analysis, business impact, connecting it with lineage and catalog metadata. And as that grows, you can actually achieve total data governance. At this point, with the acquisition of what was a lineage company years ago and then my company OwlDQ, now Collibra Data Quality, Collibra may be the best positioned for total data governance and intelligence in the space. >> Well, you mentioned financial services a couple of times and some examples. Remember the flash crash in 2010? Nobody had any idea what that was, they just said, "Oh, it's a glitch." So they didn't understand the root cause of it. So this is a really interesting topic to me. So we know at Data Citizens '22 that you're announcing, you've got to announce new products, right? It's your yearly event. What's new? Give us a sense as to what products are coming out, but specifically around data quality and observability. >> Absolutely. There's always a next thing on the forefront. And the one right now is these hyperscalers in the cloud. So you have databases like Snowflake and BigQuery and Databricks, Delta Lake and SQL pushdown. And ultimately what that means is a lot of people are storing and loading data even faster in a SaaS-like model. And we've started to hook into these databases. And while we've always worked with the same databases in the past, and they're supported today, we're doing something called native database pushdown, where the entire compute and data activity happens in the database. And why that is so interesting and powerful now is everyone's concerned with something called egress.
Did my data, that I've spent all this time and money with my security team securing, ever leave my hands? Did it ever leave my secure VPC, as they call it? And with these native integrations that we're building and about to unveil here, as kind of a sneak peek for next week at Data Citizens, we're now doing all compute and data operations in databases like Snowflake. And what that means is with no install and no configuration you could log into the Collibra Data Quality app and have all of your data quality running inside the database that you've probably already picked as your go-forward, team-selected, secured database of choice. So we're really excited about that. And I think if you look at the whole landscape of network cost, egress cost, data storage and compute, what people are realizing is it's extremely efficient to do it in the way that we're about to release here next week. >> So this is interesting because what you just described, you mentioned Snowflake, you mentioned Google, oh, actually you mentioned, yeah, Databricks. Snowflake has the data cloud. If you put everything in the data cloud, okay, you're cool, but then Google's got the open data cloud, if you heard Google Next, and now Databricks doesn't call it the data cloud but they have like the open source data cloud. So you have all these different approaches and there's really no way, up until now I'm hearing, to really understand the relationships between all those and have confidence across them. It's like (indistinct) you should just be a node on the mesh. And I don't care if it's a data warehouse or a data lake or where it comes from, but it's a point on that mesh and I need tooling to be able to have confidence that my data is governed and has the proper lineage, provenance. And that's what you're bringing to the table. Is that right? Did I get that right? >> Yeah, that's right.
And for us, it's not that we haven't been working with those great cloud databases, but it's the fact that we can send them the instructions now, we can send them the operating ability to crunch all of the calculations, the governance, the quality, and get the answers. And what that's doing, it's basically zero network cost, zero egress cost, zero latency of time. And so when you log into BigQuery tomorrow using our tool or, let's say, Snowflake, for example, you have instant data quality metrics, instant profiling, instant lineage and access, privacy controls, things of that nature that just become less onerous. What we're seeing is there's so much technology out there, just like all of the major brands that you mentioned, but how do we make it easier? The future is about less clicks, faster time to value, faster scale, and eventually lower cost. And we think that this positions us to be the leader there. >> I love this example because everyone talks about, wow, the cloud guys are going to own the world, and of course now we're seeing that the ecosystem is finding so much white space to add value, connect across clouds. Sometimes we call it supercloud, or interclouding. All right, Kirk, give us your final thoughts on the trends that we've talked about and Data Citizens '22. >> Absolutely. Well, I think one big trend is discovery and classification. We're seeing that across the board. People used to know it was a zip code, and nowadays with the amount of data that's out there, they want to know where everything is, where their sensitive data is, if it's redundant; tell me everything inside of three to five seconds. And with that comes, they want to know, in all of these hyperscale databases, how fast they can get controls and insights out of their tools. So I think we're going to see more one-click solutions, more SaaS-based solutions, and solutions that hopefully prove faster time to value on all of these modern cloud platforms. >> Excellent, all right.
Kirk Haslbeck, thanks so much for coming on theCUBE and previewing Data Citizens '22. Appreciate it. >> Thanks for having me, Dave. >> You're welcome. All right, and thank you for watching. Keep it right there for more coverage from theCUBE.
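
Editor's sketch: the contrast drawn in this interview between "break records" (rows that violate a condition someone wrote) and "breaking trends" (a metric that departs from its own history, such as a load that arrives with far fewer rows than usual) can be made concrete in a few lines. This is a minimal illustration under stated assumptions, not the product's method: the function names, the zip code rule, and the three-standard-deviation threshold are all hypothetical choices.

```python
from statistics import mean, stdev


def break_records(rows, rule):
    """Rule-based check: return the rows that violate an explicit condition."""
    return [row for row in rows if not rule(row)]


def breaking_trend(history, today, threshold=3.0):
    """Trend-based check: flag today's value if it sits far from past values.

    'Far' here means more than `threshold` sample standard deviations from
    the historical mean, which is an assumed, deliberately simple criterion.
    """
    if len(history) < 2:
        return False  # not enough history to describe a trend
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu  # perfectly flat history: any change stands out
    return abs(today - mu) / sigma > threshold


if __name__ == "__main__":
    # A written condition catches a malformed value in a single row.
    rows = [{"zip": "02139"}, {"zip": "ABCDE"}]
    is_zip = lambda row: row["zip"].isdigit() and len(row["zip"]) == 5
    print(break_records(rows, is_zip))

    # No per-row condition fails when most of a load never arrives,
    # but the daily row count breaks sharply from its history.
    daily_row_counts = [1000, 1020, 990, 1010, 1005]
    print(breaking_trend(daily_row_counts, today=400))
    print(breaking_trend(daily_row_counts, today=1008))
```

The second check needs no hand-written condition per field, only the metric's own history, which is why this style of monitoring can be applied across many fields and pipelines with less human intervention.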

Published Date : Oct 24 2022

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Dave	PERSON	0.99+
Collibra	ORGANIZATION	0.99+
Kurt Hasselbeck	PERSON	0.99+
2010	DATE	0.99+
one	QUANTITY	0.99+
Kirk Hasselbeck	PERSON	0.99+
50 day	QUANTITY	0.99+
Kirk	PERSON	0.99+
10 day	QUANTITY	0.99+
OwlDQ	ORGANIZATION	0.99+
Kirk Haslbeck	PERSON	0.99+
next week	DATE	0.99+
Google	ORGANIZATION	0.99+
last year	DATE	0.99+
two sides	QUANTITY	0.99+
thousands	QUANTITY	0.99+
NASDAQ	ORGANIZATION	0.99+
Snowflake	TITLE	0.99+
Data Citizens	ORGANIZATION	0.99+
Data Bricks	ORGANIZATION	0.99+
two other things	QUANTITY	0.98+
one click	QUANTITY	0.98+
tomorrow	DATE	0.98+
today	DATE	0.98+
five seconds	QUANTITY	0.97+
two domain	QUANTITY	0.94+
Collibra Data Quality	TITLE	0.92+
MIT CDOIQ	EVENT	0.9+
Data Citizens '22	TITLE	0.9+
Egress	ORGANIZATION	0.89+
Delta Lake	TITLE	0.89+
three	QUANTITY	0.86+
zero	QUANTITY	0.85+
Big Query	TITLE	0.85+
about a decade ago	DATE	0.85+
SQL Pushdown	TITLE	0.83+
Data Citizens 2022 Collibra	EVENT	0.82+
Big BigQuery	TITLE	0.81+
more than a couple	QUANTITY	0.79+
couple	QUANTITY	0.78+
one big	QUANTITY	0.77+
Collibra Data Quality	ORGANIZATION	0.75+
Collibra	OTHER	0.75+
Google Nest	ORGANIZATION	0.75+
Data Citizens '22	ORGANIZATION	0.74+
zero latency	QUANTITY	0.72+
SAS	ORGANIZATION	0.71+
Snowflake	ORGANIZATION	0.69+
COVID	ORGANIZATION	0.69+
years ago	DATE	0.68+
Wall Street	LOCATION	0.66+
theCUBE	ORGANIZATION	0.66+
many numbers	QUANTITY	0.63+
Collibra	PERSON	0.63+
times	QUANTITY	0.61+
Data	ORGANIZATION	0.61+
too long	DATE	0.6+
Vice President	PERSON	0.57+
data	QUANTITY	0.56+
CDO	TITLE	0.52+
Bricks	TITLE	0.48+

Aileen Black, Collibra and Marco Temaner, U.S. Army | AWS PS Partner Awards 2021

>> Mhm. Yes, one. >> Hello and welcome to today's session of the 2021 AWS Global Public Sector Partner Awards. I am pleased to introduce our very next guests. Their names are Aileen Black, SVP Public Sector at Collibra, and Marco Temaner, Chief Enterprise Architect at the HQDA Office of Business Transformation at the U.S. Army. I'm your host, Natalie Erlich. We're going to be discussing the award for best partner transformation, best data-led migration. Thank you both for joining the program. >> Thank you for having us. >> Thank you. Glad to be here. >> Well, Aileen, why is it important to have a data-driven migration? >> You know, migrations to the cloud that are simply just a lift and shift do take advantage of the elasticity of the cloud, but they're not really about how to innovate and leverage what the AWS cloud truly has to offer. So a data-led migration allows agencies to truly innovate and really kind of almost reimagine how they meet their mission objectives and how they leverage the cloud. You know, the government has, let's face it, mountains of data, right? I mean, every single day there's more and more data, and you can't pick up a trade magazine that doesn't talk about how data is the new currency or data is the new oil. So, you know, for data to have value it has to be usable, right? To turn your data into knowledge, you really need to have a robust data intelligence platform which allows agencies to find, understand, and trust their data. A data intelligence platform like Collibra is the system of record for their data, no matter where it may reside. No strategy is complete without a strong data governance platform and security and privacy baked in from the very start. Data has to be accessible to the average data citizen. People need to be able to better collaborate to make data-driven decisions. Organizations need to be united by data. This is how a technology and platform like Collibra really allows agencies to leverage their data as a strategic asset.
>> Terrific. Well, why is it more important than ever to do this? >> Well, you know, for innovative technology like AI and ML to be truly leveraged, they need to be able to trust the data that they're using. If the model is trained with only a small set of data, it's not going to really produce the trusted results they want. ML models deliver faster results at scale, but the results can only be precise when the data feeding them is of high quality. And, let's say, Gartner just came out with a study that said data quality is the number one obstacle for adoption of AI. When good data and good models find a unified, scalable platform with superior collaboration capabilities, your AI/ML opportunities can truly be leveraged, and you can truly leverage data as a strategic asset. >> Terrific. Well, Marco, what does the future look like for the Army and data? >> So, and let me play off the things that Aileen said. So in terms of the future, obviously, as you mentioned, the data volumes are growing enormously. So part of the future has to do with dealing with those data volumes just from a straight technological perspective. But as the data volumes grow, and as we have to react to things that we need to react to in the military, we're not just trying to understand the quantity of data but what it is, and not just the quality but the nature of it. So understanding authoritativeness. Being able to identify what data we need to solve certain problems or answer certain questions. I mean, a major theme in terms of what we're doing with data governance and having a data governance platform and a data catalog is having immediate knowledge of what data is where, and what quality and confidence we have in the data. Sometimes it's more important to have data that's approximately correct than truly correct as quickly as possible, you know.
So not all data needs to be of perfect quality at all times; you need to understand what's authoritative, what the quality is, how current the information is. So as the data volumes grow and grow and grow, keeping up with that, not just from the standpoint of can we scale (we know how to scale pretty well in terms of containing data volume) but keeping up with what it is, the knowledge of the data itself, understanding authoritativeness, quality, provenance, et cetera, that's a whole enterprise to keep up with, and that's what we're doing right now with this project. >> Yeah. And I'd like to also follow up with that: how has leveraging Collibra's data intelligence platform enabled the Army to accelerate its overall mission? >> So there's sort of an interplay between, you know, just having a technology that does something doesn't mean you're going to use it to do that something, but often having a place to do the work of governance, the work of knowledge management, can be the precipitating function or the stimulus to do so. So it's not "if you build it, they will come." But if you don't have a place to play ball, you're not going to play ball, to kind of run with that metaphor. So having technology that can do these things is a precursor to being able to. But then of course we, as an organization, have to do it. So the interplay between making a selection of technology and doing the implementation from a technical perspective, that plays off of an urgency. We've made the decision to use a technology, so then that helped accelerate getting the roles and responsibilities of our CDO, of our mission area data officers, of data stewards, the folks that have to be doing the work. When you educate system owners in cataloging and giving a central environment the information that's needed, if you say here's a place to put it, then it's very tangible, especially in the military where work is done in a very concrete, task-based way.
If you have a place to do things, then it's easier to tell people to do things. So the technology is great and works for us. But the choice to move with the technology has then been a productive interplay with the doing of the things that need to be done to take advantage of the technology, if that makes sense. >> Yeah, that's really great to hear. I mean, speaking of taking advantage of the technology, Aileen, can Collibra help your other public sector customers take advantage of AI and machine learning? >> Well, people need to be able to collaborate and take advantage of their most strategic asset, data, to make those data-driven decisions. It gives them the agility to be able to act. 2020 was a great lesson around the importance of having your data house in order. Let's face it, in the pandemic we watched organizations that had a strong data governance framework, who had looked at and understood where their data was, and they were able to very quickly assess the situation and react, and others were not in such a good situation. So, you know, being able to have that data governance framework, being able to have that data quality, being able to have the right information and being able to trust it allows people to be effective and to react quickly to situations. >> Fascinating. Do you have any insight on that, Marco? Would you like to weigh in? >> Well, I definitely concur. I think our strategy, like I said, has been to use the technology to highlight the need to put governance into place and to focus on increasing the data quality of the data sources. And I would say this has also helped us with, I mean, things that we weren't doing before that have to do with just educating the populace, you know, all the way from the folks who are operators of systems to the most senior executives, in being conversant in the principles that we're talking about. This whole discipline is seen as a bit arcane and kind of back office and kind of IT. But it's actually not.
If you don't have the data, if you don't know where to get the data to make a decision, then you're going to make a decision based on incorrect data, and, you know, that's pretty important in the military to not get wrong. So I definitely concur, and we're taking that approach as well. >> I'd like to take it one step further. If you're speaking the same language, so if you have an understanding of what the data governance framework is, you can understand what the data is and where it is. Sometimes there's duplicate data, and there's duplicate data for a reason, but understanding where it came from and what the lineage associated with it is really gives you the power of being able to shop for data and get the right information at the right time and give it the right perspective. And I think that's the power of what has laid the foundation for the work that the Army and Marco have done to really set the stage for what they can do in the future. >> Terrific. And Marco, if you could comment a little bit about data stewardship and how it can positively drive future outcomes. >> Yeah. So data stewardship for us has a lot to do with the functional side, so the people that we're assigning as senior data stewards are the senior functionals in the respective organizations: logistics, financial management, training, readiness, et cetera. So the idea is the folks who know really everything about those functional domains looking at things from the perspective of the data that's needed to support those functions, logistics, human resources, et cetera, and being, you know, call it the most authoritative subject matter experts. So the governance that we're doing is coming much more from a functional perspective than a technical perspective, so that when a system is being built, if we're talking about data migration, if we're talking about somebody driving analytics, the knowledge that's associated with the data comes from the functionals.
So our data stewardship is less about the technical side and more about making sure that the understanding, from a functional perspective, of what the data is for, what the provenance is, not from a technical perspective but what it means in terms of sources of information, sources of personnel, sources of munitions, et cetera, is available to the folks using it. So they basically know what it is. So the emphasis is on that functional infusion of knowledge into the metadata, so that then people who are trying to use that data have a way of understanding what it really is and what the meaning is. And that's really what data stewardship means. We're actually very good at stewarding data from a technical perspective. We know how to run systems very well. We know how to scale. We're good at that. But making sure that people know what it is and why and when to use it, that's where maybe we have some catching up to do, which is what this effort's about. >> Terrific. Well, fantastic insights from you both. I really appreciate you taking the time to tell all our viewers about this. That was Aileen Black and Marco Temaner, and that, of course, was our section for the AWS Global Public Sector Partner Awards. Thanks for watching. I'm your host, Natalie Erlich. Thank you. >> Yeah. Mm.

Published Date : Jun 22 2021

SUMMARY :

I am pleased to introduce our very next guests. Glad to be here. the elasticity of the cloud but not really about how to innovate and leverage Well, why is it more important than ever to do this than ever before? Um you know, they need to be able to have Well marco what does the future look like for the army and data Part of the future has to do with dealing with those data volumes just from a straight needs to be of perfect quality at all times you need to understand what's authoritative, enabled the army to accelerate its overall mission. doing of the things that need to be done to take advantage of the technology, if that makes I mean, speaking of taking advantage of the technology, Well, people need to be able to collaborate and take advantage of their most strategic asset Um do you have any insight on that marco, would you like to weigh in? that have to do with just educating the populace, you know all the way from the folks operators of systems from and what the linage is associated with, it really gives you the power of being able to shop for data Terrific and marco, if you could comment a little bit about data storage ship and the perspective of the data that's needed to support those functions, logistics, human resources, I really appreciate you taking the time uh to

ENTITIES

Entity	Category	Confidence
Eileen Black	PERSON	0.99+
Marco Timoner	PERSON	0.99+
Natalie ehrlich	PERSON	0.99+
Marco Timon	PERSON	0.99+
Natalie Early	PERSON	0.99+
Marco Temaner	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Aileen Black	PERSON	0.99+
Collibra	PERSON	0.99+
Stewart	PERSON	0.99+
Allen	PERSON	0.98+
U. S. Army	ORGANIZATION	0.98+
both	QUANTITY	0.98+
Gardner	PERSON	0.98+
Today	DATE	0.96+
2021 AWS Global Public Sector Partner Awards	EVENT	0.96+
MArco	ORGANIZATION	0.94+
pandemic	EVENT	0.93+
one	QUANTITY	0.93+
AWS Global Public Partner Sector Awards	EVENT	0.92+
S. V. P.	ORGANIZATION	0.9+
AWS	EVENT	0.9+
2020	DATE	0.86+
U.S. Army	ORGANIZATION	0.8+
single day	QUANTITY	0.75+
PS Partner Awards 2021	EVENT	0.75+
D. A. Office	ORGANIZATION	0.69+
culebra	ORGANIZATION	0.65+
Ai	ORGANIZATION	0.62+
cal	TITLE	0.62+
Ibra	TITLE	0.38+

Jacklyn Osborne, Bank Of America | Collibra Data Citizens'21

>> From around the globe, it's theCUBE, covering Data Citizens '21. Brought to you by Collibra. >> Well, hello everybody, John Walls here as we continue our coverage here on theCUBE of Data Citizens '21. It is a pleasure of ours to welcome in an award winner here at Data Citizens '21. We're with Jacklyn Osborne, who is the Managing Director and Risk and Finance Technology Executive at Bank of America, and she is also the Data Citizen of the Year, one of the Collibra Excellence Award winners. And Jacklyn, congratulations on the honor. Well deserved, I'm sure. >> Thank you so much. It is a true honor, and I am so happy to be here, and I'm looking forward to our conversation today. >> Yeah. What is it all about, just the concept of being a data citizen? In your mind, what is all that about? What are those pillars in terms of being a good data citizen, and what gets to the point that you are the Data Citizen of the Year? >> I think that's such a good question, and actually it's something that I don't even know if I know everything about, because it's constantly evolving. Being a data citizen yesterday is not what it is today, and it's not what it means tomorrow, because this field is evolving. But with that said, I think to me being a data citizen is having that awareness that data matters, driving to that data as an asset, and really trying to lay the foundation to ensure it's the right data in the right place at the right time. >> Yeah, let's talk about that high-wire act, because it's becoming increasingly more complex. As you know, you've been in this realm, if you will, for what, 15 years now? I believe it has evolved dramatically, right, in terms of capabilities but also complexity. So let's talk about that, about finding the relevance of data and delivering it on time to the right people within your organization. How much more challenging is that now than it was maybe five or, you know, 10 years ago? >> I mean, it's kind of crazy.
There are some areas that make it so much easier and then, for your question, some areas that make it so much harder. But if I can, let's start with the easier, because I think this is something that really is important. I've been in data my entire professional career; I've been a chief data officer since 2013. And when I started, I used to joke that I was a used car salesman. I was selling something, this idea of data quality and data governance, that nobody wanted. But now, the shift, to your question, is the good: I am now a luxury car salesman, selling a product that everybody wants. But the shift to the bad: nobody wants to pay for it. So, the complexity of it: as data becomes bigger, as we talk about big data and unstructured data and social media and Facebook feeds, that is hard. It is complex. And the ability to truly manage and govern data to that degree of perfection is really hard. So the more data we get, the more complexity, the more challenge, and the more there is a need to really prioritize, align with business strategy, and ensure that you are embedding it into the culture and the DNA of the corporation and not doing it in silos. >> You know, delivering that data in a secure environment is obviously critically important for any enterprise, but even more so, to put a finer point on it, for financial services, in terms of your work in that regard. So let's add that layer into this too: not only the internal communication you have to do and the collaboration you have to have, but you have these external stakeholders too, right? You have me, you know, a BofA client, if you will, that you've got to be aware of and have to communicate with. So let's talk about that kind of merger, if you will, of not only having to work internally but also externally, and making sure that, with all the data you've got now, it works. >> Indeed.
And you're kind of moving towards one of the newer dimensions, which is privacy. I mean, GDPR was the first regulation in the UK, but now you have the CCPA in California, and it's coming. And that right to be forgotten or, more importantly, as you said, as a customer of a financial institution, that right to understand where your data is, is very important, because customers do want to know that their information is understood, trusted, protected, and going to be taken care of. So that ability to really transform back that you have a solid basis, and that you are taking the measures and the necessary steps to ensure that that data is, air quotes, governed, is so important. And really, again, that shift from that used car salesman to a luxury car salesman: your question is another example of how that shift is happening. It's no longer a should-do or could-do. Data governance is really becoming a must-do, and that's why you are seeing so many more chief data officers, chief analytics officers, and data management professionals. The profession is growing, I mean, incrementally every single day. >> What about the balancing act that you do? Let's just deal with the internal audiences that you have to contend with. I shouldn't say contend, contend has that pejorative tone to it; that you deal with, that you collaborate with. You know, governance is also critically important, because you want to make data available to the right people at the right time, but only the right people. Right? So what kind of practices or procedures are you putting in place at BofA to make sure that data is delivered to the right folks, but only to the right people, and trying, I guess, to educate people within your organization as to the need for these strict governance processes? >> Sure. I tend to refer to them as the foundational pillars, and if I was to take a step back and say what they are and how we use them: the first one is metadata management, and it is really around that.
What data do you have? It's understanding the information. So I used to refer to it, or I still refer to it, as when we were going to the library and you used to have to look at the card catalog. Metadata management is very similar to the card catalog for books. It tells you all the information: what the genre is, who the author is, what the section is, where it is in the library. And that is a core piece. If you don't understand your data, you can't govern it. So that's kind of pillar one, metadata management. Pillar two is what's often referred to as data lineage, but I do think the new buzzword is data provenance. It's really that access flow. It's understanding where data comes from, the movement along the journey, and where it's going. If you don't understand that horizontal, front to back, you can't govern the information as well, because it can be changing hands, it can be altering, and so it's that end-to-end look at things. Pillar three is data quality, and that's really that measurement of whether it is the right data. It is made up of a series of data quality dimensions: accuracy, completeness, validity, timeliness, conformity, reasonableness, et cetera. And it's really that fit for use: is the data that I have the right data, as I said earlier? And then last but not least is issue management. At the end of the day there will be problems. There is too much data. It is in too many hands. So we're not trying to remove all data issues, but having a process where you can actually log, prioritize, and ultimately remediate is that last and final pillar of the data management, I would call it, circle, because it all has to come back together, and it's rinse and repeat. >> Yeah. And so you raise a great point about things going wrong. You know, eventually something happens. We know nothing is foolproof, nothing is bulletproof.
And we're certainly seeing that in terms of security now, right, with breaches pretty well publicized, with invasions, ransomware, you name it, all kinds of flavors of that, unfortunately. So from your perspective, in terms of being this data guardian, if you will, how much have your concerns now been amplified in terms of security and privacy, and that kind of internal communication you have to have, or I guess buy-in, you know, to understand the need to make this data ultra secure and ultra private, especially in this environment where the bad actors are prolific? So kind of talk about that struggle, or maybe that challenge, that you have in this environment here in 2021. >> Yeah, I think the way I would put it is the struggle is, again, that need or desire to protect everything, and at the end of the day that's hard. And so the struggle that I face right now is the prioritization. How do we differentiate what we call the critical few? Some call them CDEs, critical data elements; some call them KDEs, key data elements. There's a term. But really, as that need and that demand grows, whether it's for security or privacy or even data democratization, which hopefully we do talk about at some point, all these things are reliant on the right data, because, like statistics, garbage in, garbage out. So whether it's because you need the right information for your analytics and your models or, as you talked about, it's for prevention and defensive security reasons, that defensive and offensive isn't going away. So the real struggle is not around the driver, but the prioritization. How do you focus to ensure you're spending your time on the right areas and, more importantly, in alignment with the business priorities?
Because one of the things that's critically important for me is ensuring that it's not metadata or data governance or data quality for the sake of it; it is in alignment with that business priority. >> And a big part of that is strategy for the future, right, strategy going forward: you know, where you're going to go in the next 18, 24 months. And so, without revealing state secrets here, how do you see this playing out in terms of this continual digital transformation, if you will, from the BofA side of the fence? You know, what do you see as being important, or what would you like to accomplish over the next year and a half, two years? >> For me, and I'm glad you asked that question, because I wanted to mention it, I think it's that data democratization. And if we debunk that or look into that, what do I mean by democratization? It's that real-time access, but it's not real-time access to the wrong information or by the wrong people. As we talked about, it really is ensuring, almost like an Amazon model, that I can simply search for the information I need, I can put it in my shopping cart, and I can check out, and that's that data-driven piece: I'm able to use that information knowing it's the right data in the right hands for the right reasons. And that's really, in my mind, the future I'm getting to: how do I enable that? How do I democratize it, so data truly does become that enterprise asset that everybody and anybody can access, but they can do so in a way that has all of those defensive controls in place, going back to that right data, right place, right time? Because with the shiny toys of AI, machine learning, all those things, if you're building models off of the wrong data, from the wrong place, or in the wrong hands, it's going to bite you in the butt, whether it's today, tomorrow, or in the future. >> Well, exactly. I love that analogy, and on that I'm going to thank you for the time.
So I'm going to call you a luxury data salesperson, not a car salesman. But it certainly has paid off, and we certainly congratulate you as well on the award that you won here from Collibra. >> Thank you so much, and thank you for the time. Hopefully you've enjoyed our conversation as much as I have. >> I certainly have. Thank you very much. Jacklyn Osborne joining us from Bank of America, the Data Citizen of the Year here at Data Citizens 2021. I'm John Walls, and you've been watching theCUBE.

Published Date : Jun 17 2021

SUMMARY :

data citizen of the Year, one of the culebra Excellence Award winners. Thank you so much. that data matters is driving to that data as an asset and about making finding the relevance of data and delivering it on time to the right people within your that you are embedding into the culture and the DNA of the corporate and not so let's add that layer into this to not only internal, all the communication back that you have a solid basis and that you are taking the measures I shouldn't say content, content has that pejorative term to that you that you that you deal with, And it's really that fit freezes the data that I have the right data as I said earlier in terms of being that this data data guardian, if you will um So whether it's because you need the right information because of your analytics and your models or as you talked about And and and a big part of that is is strategy for the future, right strategy going forward. or in the wrong hands, it's going to bite you in about whether it's today, I love that analogy and on that I'm going to thank you for the time. Thank you so much and thank you for the time. Thank you very much Jacqueline Osborn, joining us on the Bank of America, the data citizen

ENTITIES

Entity	Category	Confidence
Jacqueline	PERSON	0.99+
Jacqueline Osborne	PERSON	0.99+
Jacqueline Osborn	PERSON	0.99+
Jacklyn Osborne	PERSON	0.99+
john Wallace	PERSON	0.99+
2021	DATE	0.99+
amazon	ORGANIZATION	0.99+
UK	LOCATION	0.99+
15 years	QUANTITY	0.99+
Bank of America	ORGANIZATION	0.99+
tomorrow	DATE	0.99+
today	DATE	0.99+
2013	DATE	0.99+
yesterday	DATE	0.99+
facebook	ORGANIZATION	0.98+
john walls	PERSON	0.98+
Bank Of America	ORGANIZATION	0.98+
first one	QUANTITY	0.98+
one	QUANTITY	0.98+
10 years ago	DATE	0.97+
Katie	PERSON	0.97+
18	QUANTITY	0.91+
24 months	QUANTITY	0.89+
culebra	TITLE	0.88+
California	LOCATION	0.88+
C C P A	TITLE	0.87+
next year and a half	DATE	0.87+
Data Citizens 21	ORGANIZATION	0.85+
first regulation	QUANTITY	0.85+
Excellence Award	TITLE	0.78+
two years	QUANTITY	0.78+
Collibra Data	ORGANIZATION	0.76+
single day	QUANTITY	0.73+
chris	PERSON	0.69+
pillar three	OTHER	0.68+
five	QUANTITY	0.59+
Pillar two	OTHER	0.5+
hands	QUANTITY	0.49+
21	QUANTITY	0.45+
Pillar	PERSON	0.3+

Jim Cushman, CPO, Collibra

>> From around the globe, it's theCUBE, covering Data Citizens '21. Brought to you by Collibra. >> We're back talking all things data at Data Citizens '21. My name is Dave Vellante and you're watching theCUBE's continuous coverage, virtual coverage, #DataCitizens21. I'm here with Jim Cushman, who is Collibra's Chief Product Officer and who shared the company's product vision at the event. Jim, welcome, good to see you. >> Thanks Dave, glad to be here. >> Now one of the themes of your session was all around self-service and access to data. This is a big, big point of discussion amongst organizations that we talk to. I wonder if you could speak a little more toward what that means for Collibra and your customers, and maybe some of the challenges of getting there. >> So Dave, our ultimate goal at Collibra has always been to enable self-service access for all customers. Now, one of the challenges is that these knowledge workers are limited in how they can access information. So our goal is to totally liberate them. And why is this important? Well, in and of itself, self-service liberates tens of millions of data-literate knowledge workers. This will drive more rapid, insightful decision-making; it'll drive productivity and competitiveness. And to make this level of adoption possible, the user experience has to be as intuitive as, say, retail shopping, like I mentioned in my previous bit, like you're buying shoes online. But this is a little bit of foreshadowing, and there's an even more profound future than just enabling self-service: we believe that a new class of shopper is coming online, and she may not be as data-literate as our knowledge worker of today. Think of her as an algorithm developer; she builds machine learning or AI. The engagement model for this user will be to kind of build automation, personalized experiences for people to engage with data. But in order to build that automation, she too needs data.
Because she's not data-literate, she needs the equivalent of a personal shopper, someone that can guide her through the experience without actually having her know all the answers to the questions that would be asked. So this level of self-service goes one step further and becomes an automated service, one to really help find the best unbiased and labeled training data to help train an algorithm in the future. >> That's, okay, please continue. >> No, please. And so all of this self and automated service needs to be complemented with kind of a peace of mind that you're letting the right people gain access to it. So when you automate it, it's like, well, geez, are the right people getting access to this? So it has to be governed and secured. This can't become like the Wild Wild West or, what we call, a data flea market where, you know, data's everywhere. So, you know, history does quickly forget the companies that do not adjust to remain relevant. And I think we're in the midst of an exponential differentiation, and Collibra Data Intelligence Cloud is really kind of established to be the key catalyst for companies that will be on the winning side. >> Well, that's big because, I mean, I'm a big believer in putting data in the hands of those folks in the line of business. And of course the big question that always comes up is, well, what about governance? What about security? So to the extent that you can federate that, that's huge. Because data is distributed by its very nature, it's going to stay that way. It's complex. You have to make the technology work in that complex environment, which brings me to this idea of low code or no code. It's gaining a lot of momentum in the industry. Everybody's talking about it, but there are a lot of questions: you know, what can you actually expect from no code and low code? Who are the right, you know, potential users of that? Is there a difference between low and no?
And so from your standpoint, why is this getting so much attention and why now, Jim? >> If you go back, even 25 years ago we were talking about fourth- and fifth-generation languages that people were building. And it really didn't reach the total value that folks were looking for, because it always fell short. And you'd say, listen, if you didn't do all the work it took to get to a certain point, how are you possibly going to finish it? And that's where the 4GLs and 5GLs fell short as a capability. With our stuff, if you really want great self-service, how are you going to be self-service if it still requires somebody to write code? Well, I guess you could do it if the only self-service people are people who write code, well, that's not bad factor. So if you truly want the ability to have something show up at your front door, without you having to call somebody or make any effort to get it, then it needs to generate itself. The beauty of doing a catalog and governance, understanding all the data that is available for choice, is giving someone a selection that uses objective criteria, like: this is the best objectively, because it's quality for what you want, or it's labeled, or it's unbiased, and it has that level of deterministic value to it, versus guessing or subjectivity or what my neighbor used or what I used on my last job. Now that we've given people the power, with confidence, to say this is the one that I want, the next step is, okay, can you deliver it to them without them having to write any code? So imagine being able to generate those instructions from everything that we have in our metadata repository to say this is exactly the data I need you to go get, and perform what we call a distributed query against those data sets and bring it back to them. No code written.
And here's the real beauty, Dave: data pipeline development is a relatively expensive thing today, and that's why people spend a lot of money maintaining these pipelines. But imagine if there was zero cost to building your pipeline; would you spend any money to maintain it? Probably not. So if we can build it for no cost, then why maintain it? Just build it every time you need it. And it's then, again, done on a self-service basis. >> I really like the way you're thinking about this, cause you're right. A lot of times when you hear self-service, it's about making the hardcore developers, you know, be able to do self-service. But the reality is, and you talk about that data pipeline, it's complex: a business person is sitting there waiting for data or wants to put in new data, and it turns out that the smallest unit is actually that entire team. And so you sit back and wait. And so to the extent that you can actually enable self-serve for the business by simplification, that's been the holy grail for a while, isn't it? >> I agree. >> Let's dig a little bit into where you're placing your bets. I mean, you're head of products; you've got to make bets, you know, certainly many, many months if not years in advance. What are your big focus areas of investment right now? >> Yeah, certainly. So one of the things we've done very successfully since our origin over a decade ago was building business user-friendly software in what was predominantly kind of a plumbing or infrastructure area. So business users love working with our software. They can find what they're looking for and they don't need to have some cryptic key of how to work with it. They can think about things in their terms and use our business glossary, and they can navigate through what we call our data intelligence graph and find just what they're looking for. And we don't require a business to change everything just to make it happen.
We give them kind of a universal translator to talk to the data. But with all that wonderful usability, the common compromise that you make is, well, it's only good up to a certain amount of information, kind of like Excel. You know, you can do almost anything with Excel, right? But when you get into large volumes, it becomes problematic, and now you need to, you know, go with a hardcore database and an application on top. So what the industry is pulling us towards is far greater amounts of data, not just millions or even tens of millions, but into the hundreds of millions and billions of things that we need to manage. So we have a huge focus on scale and performance on a global basis, and that's a mouthful, right? Not only are you dealing with large amounts at performance, but you have to do it in a global fashion and make it possible for somebody who might be operating in Southeast Asia to have the same experience with the environment as they would in Los Angeles. And the data therefore needs to go to the user, as opposed to having the user come to the data, as much as possible. So it really does put a lot of emphasis on some of what you call the non-functional requirements, also known as the ilities, and so our ability to bring the data and handle those large enterprise-grade capabilities at scale and performance globally is what's really driving a good number of our investments today. >> I want to talk about data quality. This is a hard topic, but it's one that's so important. And I think it's been really challenging and somewhat misunderstood. When you think about the chief data officer role itself, it kind of emerged from these highly regulated industries. And it came out of data quality, kind of a back-office role that's gone front and center and now is, you know, pretty strategic. Having said that, you know, the prevailing philosophy is, okay, we've got to have this centralized data quality approach and it's going to be imposed throughout.
And it really is a hard problem, and I think about, you know, these hyper-specialized roles, like, you know, the quality engineer and so forth. And again, the prevailing wisdom is, if I could centralize that, it can be lower cost and I can service these lines of business, when in reality the real value is, you know, speed. And so how are you thinking about data quality? You hear so much about it. Why is it such a big deal, and why is it so hard and a priority in the marketplace? Your thoughts. >> Thanks for that. So we of course acquired a data quality company, not to bury the lede, earlier this year, OwlDQ, and the big question is, okay, so why them and why now, not before? Well, at least a decade ago you started hearing people talk about big data. It was probably around 2009 that it was becoming the big talk, and what we don't really talk about when we talk about this ever-expanding data is the byproduct: the velocity of data is increasing dramatically. So the speed at which new data is being presented and the way in which data is changing are dramatic. And why is that important to data quality? Cause data quality historically, for the last 30 years or so, has been a rules-based business where you analyze the data at a certain point in time and you write a rule for it. Now there's already room for error there, cause humans are involved in writing those rules, but now with the increased velocity, the likelihood that a rule is going to atrophy and no longer be valid or useful to you increases exponentially. So we were looking for a technology that was doing it in a new way, similar to the way that we do auto classification when we're cataloging attributes: how do we look at millions of pieces of information around metadata and decide what it is, to put it into context? The ability to automatically generate these rules and then continuously adapt as data changes to adjust these rules is really a game changer for the industry itself. So we chose OwlDQ for that very reason.
It's not only that they had this really kind of modern architecture to automatically generate rules, but also to continuously monitor the data and adjust those rules, cutting out the huge amounts of cost of, clearly, having rules that aren't helping you. And frankly, you know how this works: no one really complains about it until there's the squeaky wheel, you know, you get a fine or exposure, and that's what is causing a lot of issues with data quality. And then why now? Well, I think, and this is my speculation, there's so much movement of data to the cloud right now. And so anyone who's made big investments in data quality historically for their on-premise data warehouses, Netezzas, Teradatas, Oracles, et cetera, or even their data lakes, is now moving to the cloud. And they're saying, hmm, what investments are we going to carry forward that we had on premise? And which ones are we going to start anew from? And data quality seems to be ripe for something new, and so these new investments in data in the cloud are now looking up: let's look at a new, next-generation method of doing data quality. And that's where we're really fitting in nicely. And of course, finally, you can't really do data governance and cataloging without data quality, and data quality without data governance and cataloging is kind of a hollow long-term story. So the three working together is a very powerful story. >> I've got to ask you some Columbo questions about this, cause, you know, you're right. It's rules-based, and so my immediate, you know, reaction is like, okay, what are the rules around COVID or hybrid work, right? If there are static rules, there's so much unknown, and so what you're saying is you've got a dynamic process to do that. So one of my gripes about the whole big data thing, and you know, you referenced that 2009, 2010 period, I loved it, because there were a lot of profound things about Hadoop and a lot of failings.
And one of the challenges is really that there's no context in the big data system. You know, the folks in the data pipeline, they don't have the business context. So my question is, and it sounds like you've got this awesome magic to automate, who adjudicates the dynamic rules? Do humans play a role? What role do they play there? >> Absolutely. There's the notion of sampling. So you can only trust a machine up to a certain point before you want to have some type of a steward or assisted or supervised learning that goes on. So, you know, suspect maybe one out of 10, one out of 20 rules that are generated; you might want to have somebody look at it. There are ways to do the equivalent of supervised learning without actually paying the cost of the supervisor. Let's suppose that you've written a thousand rules for your system that are five years old. And we come in with our ability and we analyze the same data and we generate rules ourselves. We compare the two, and there's absolutely going to be some exact matching, some overlap that validates one another. And that gives you confidence that the machine learning did exactly what you did. And what's the likelihood that you guessed wrong and machine learning guessed wrong in exactly the same way? That seems a pretty, pretty small concern. So now you're really saying, well, why are they different? And now you start to study the samples. And what we learned is that, with our ability to generate between 60 and 70% of these rules, anytime we were different, we were right. Almost every single time; like, only one out of a hundred was it proven that the handwritten rule was a more profound outcome. And of course, it's machine learning. So it learned, and it caught up the next time.
So that's the true power of this innovation is it learns from the data as well as the stewards and it gives you confidence that you're not missing things and you start to trust it, but you should never completely walk away. You should constantly do your periodic sampling. >> And the secret sauce is math. I mean, I remember back in the mid two thousands it was like 2006 timeframe. You mentioned, you know, auto classification. That was a big problem with the federal rules of civil procedure trying to figure out, okay, you know, had humans classifying humans don't scale, until you had, you know, all kinds of support, vector machines and probabilistic, latent semantic indexing, but you didn't have the compute power or the data corpus to really do it well. So it sounds like a combination of you know, cheaper compute, a lot more data and machine intelligence have really changed the game there. Is that a fair assumption? >> That's absolutely fair. I think the other aspect that to keep in mind is that it's an innovative technology that actually brings all that compute as close into the data as possible. One of the greatest expenses of doing data quality was of course, the profiling concept bringing up the statistics of what the data represents. And in most traditional senses that data is completely pulled out of the database itself, into a separate area and now you start talking about terabytes or petabytes of data that takes a long time to extract that much information from a database and then to process through it all. Imagine bringing that profiling closer into the database, what's happening in the NAPE the same space as the data, that cuts out like 90% of the unnecessary processing speed. It also gives you the ability to do it incrementally. So you're not doing a full analysis each time, you have kind of an expensive play when you're first looking at a full database and then maybe over the course of a day, an hour, 15 minutes you've only seen a small segment of change. 
So now it feels more like a transactional analysis process. >> Yeah and that's, you know, again, we talked about the old days of big data, you know the Hadoop days and what was profound was it was all about bringing five megabytes of code to a petabyte of data, but that didn't happen. We shoved it all into a central data lake. I'm really excited for Collibra. It sounds like you guys are really on the cutting edge and doing some really interesting things. I'll give you the last word, Jim, please bring us home. >> Yeah thanks Dave. So one of the really exciting things about our solution is, it's trying to be a combination of best of breed capabilities but also integrated. So to actually create a full and complete story that customers are looking for, you don't want to have them worry about a complex integration in trying to manage multiple vendors and the timing of their releases, et cetera. If you can find one customer that you don't have to say well, that's good enough, but every single component is in fact best of breed that you can find, and it's integrated and they'll manage it as a service, you truly unlock the power of your data-literate individuals in your organization. And again, that goes back to our overall goal. How do we empower the hundreds of millions of people around the world who are just looking for insightful decisions? They feel completely locked; it's as if they're looking for information before the internet and they're kind of limited to whatever their local library has and if we can truly become somewhat like the internet of data, we make it possible for anyone to access it without controls but we still govern it and secure it for privacy laws, I think we do have a chance to change the world for better. >> Great. Thank you so much, Jim. Great conversation, really appreciate your time and your insights. >> Yeah, thank you, Dave. Appreciate it. >> All right and thank you for watching theCUBE's continuous coverage of Data Citizens '21.
My name is Dave Vellante. Keep it right there for more great content. (upbeat music)
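
Editor's note: the incremental profiling Jim describes (one expensive first pass over a full database, then folding in only the small segment of change) can be illustrated with a minimal sketch. This is not Collibra's implementation; the `ColumnProfile` class, its fields, and the sample values are all hypothetical and chosen only to show the idea of updating running statistics instead of re-scanning everything.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ColumnProfile:
    """Running statistics for one numeric column (hypothetical structure)."""
    rows: int = 0
    nulls: int = 0
    total: float = 0.0
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def update(self, batch: Iterable[Optional[float]]) -> "ColumnProfile":
        """Fold only the newly arrived values into the existing profile."""
        for value in batch:
            self.rows += 1
            if value is None:
                self.nulls += 1
                continue
            self.total += value
            self.minimum = value if self.minimum is None else min(self.minimum, value)
            self.maximum = value if self.maximum is None else max(self.maximum, value)
        return self

    @property
    def mean(self) -> Optional[float]:
        """Mean of the non-null values seen so far, or None if there are none."""
        non_null = self.rows - self.nulls
        return self.total / non_null if non_null else None


# Expensive first pass over the data that already exists.
profile = ColumnProfile().update([10.0, 12.0, None, 11.0])

# Later pass: only the small segment of change is processed.
profile.update([13.0, None])
```

Because the profile keeps only counts, a sum, and the extremes, each later update costs time proportional to the new rows rather than to the whole table.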

Published Date : Jun 17 2021


ENTITIES

Entity	Category	Confidence
Jim Cushman	PERSON	0.99+
Dave	PERSON	0.99+
Jim	PERSON	0.99+
Dave Vellante	PERSON	0.99+
90%	QUANTITY	0.99+
Collibra	ORGANIZATION	0.99+
2009	DATE	0.99+
Oracles	ORGANIZATION	0.99+
Netezzas	ORGANIZATION	0.99+
LGQ	ORGANIZATION	0.99+
Los Angeles	LOCATION	0.99+
Excel	TITLE	0.99+
Teradatas	ORGANIZATION	0.99+
two	QUANTITY	0.99+
2010	DATE	0.99+
15 minutes	QUANTITY	0.99+
2006	DATE	0.99+
millions of pieces	QUANTITY	0.99+
millions	QUANTITY	0.99+
tens of millions	QUANTITY	0.99+
an hour	QUANTITY	0.99+
five GLs	QUANTITY	0.99+
Southeast Asia	LOCATION	0.99+
one	QUANTITY	0.99+
four GLs	QUANTITY	0.99+
billions	QUANTITY	0.99+
Hadoop	TITLE	0.99+
hundreds of millions	QUANTITY	0.98+
20 rules	QUANTITY	0.98+
three	QUANTITY	0.98+
70%	QUANTITY	0.98+
each time	QUANTITY	0.98+
one customer	QUANTITY	0.98+
earlier this year	DATE	0.97+
10	QUANTITY	0.97+
today	DATE	0.95+
a decade ago	DATE	0.95+
first	QUANTITY	0.95+
a day	QUANTITY	0.95+
25 years ago	DATE	0.94+
Collibra	PERSON	0.94+
hundreds of millions of people	QUANTITY	0.94+
four	QUANTITY	0.94+
petabytes	QUANTITY	0.91+
over a decade ago	DATE	0.9+
terabytes	QUANTITY	0.9+
theCUBE	ORGANIZATION	0.9+
five years old	QUANTITY	0.88+
CPO	PERSON	0.87+
Wild Wild West	LOCATION	0.86+
tens of millions of data	QUANTITY	0.86+
One	QUANTITY	0.84+
five generational languages	QUANTITY	0.83+
a thousand rules	QUANTITY	0.81+
single component	QUANTITY	0.8+
60	QUANTITY	0.8+
last 30 years	DATE	0.79+
Data Citizens'21	TITLE	0.78+
zero cost	QUANTITY	0.77+
five megabytes of code	QUANTITY	0.76+
OwlDQ	ORGANIZATION	0.7+
single time	QUANTITY	0.69+
Data Citizens '21	EVENT	0.67+
Chief Product Officer	PERSON	0.64+
hundred	QUANTITY	0.63+
two thousands	QUANTITY	0.63+
Data	EVENT	0.58+
#DataCitizens21	EVENT	0.58+
petabyte	QUANTITY	0.49+
COVID	OTHER	0.48+

Michele Goetz, Forrester Research | Collibra Data Citizens'21

>> From around the globe, it's theCUBE, covering Data Citizens '21. Brought to you by Collibra. >> For the past decade organizations have been effecting very deliberate data strategies and investing quite heavily in people, processes and technology, specifically designed to gain insights from data, better serve customers, drive new revenue streams we've heard this before. The results quite frankly have been mixed. As much of the effort is focused on analytics and technology designed to create a single version of the truth, which in many cases continues to be elusive. Moreover, the world of data is changing. Data is increasingly distributed making collaboration and governance more challenging, especially where operational use cases are a priority. Hello, everyone. My name is Dave Vellante and you're watching theCUBE coverage of Data Citizens '21. And we're pleased to welcome Michele Goetz who's the vice president and principal analyst at Forrester Research. Hello, Michele. Welcome to theCUBE. >> Hi, Dave. Thanks for having me today. >> It's our pleasure. So I want to start, you serve have a wide range of roles including enterprise architects, CDOs, chief data officers that is, analyst, the analyst, et cetera, and many data-related functions. And my first question is what are they thinking about today? What's on their minds, these data experts? >> So there's actually two things happening. One is what is the demand that's placed on data for our new intelligent digital systems. So we're seeing a lot of investment and interest in things like edge computing. And then how does that intersect with artificial intelligence to really run your business intelligently and drive new value propositions to be both adaptive to the market as well as resilient to changes that are unforeseen. The second thing is then you create this massive complexity to managing the data, governing the data, orchestrating the data because it's not just a centralized data warehouse environment anymore. 
You have a highly diverse and distributed landscape that you both control internally, as well as taking advantage of third party information. So really what the struggle then becomes is how do you trust the data? How do you govern it, and secure, and protect that data? And then how do you ensure that it's hyper contextualized to the types of value propositions that our intelligent systems are going to serve? >> Well, I think you're hitting on the key issues here. I mean, you're right. The data, and I sort of refer to this as well, is sort of out there, it's distributed at the edge. But generally our data organizations are actually quite centralized and as well you talk about the need to trust the data, obviously that's crucial. But are you seeing the organization change? I know you're talking about this to clients, your discussion about collaboration. How are you seeing that change? >> Yeah, so as you have to bring data into context of the insights that you're trying to get or the intelligence that's automating and scaling out the value streams and outcomes within your business, we're actually seeing a federated model emerge in organizations. So while there's still a centralized data management and data services organization, led typically by enterprise architects for data, a data engineering team that's managing warehouses as well as data lakes, they're creating this great platform to access and orchestrate information, but we're also seeing data, and analytics, and governance teams come together under chief data officers or chief data and analytics officers. And this is really where the insights are being generated from either BI and analytics or from data science itself and having dedicated data engineers and stewards that are helping to access and prepare data for analytic efforts. And then lastly, this is the really interesting part, is when you push data into the edge the goal is that you're actually driving an experience and an application.
And so in that case we are seeing data engineering teams starting to be incorporated into the solutions teams that are aligned to lines of business or divisions themselves. And so really what's happening is if there is a solution consultant who is also overseeing value-based portfolio management when you need to instrument the data to these new use cases and keep up with the pace of the business it's this engineering team that is part of the DevOps work bench to execute on that. So really the balances we need the core, we need to get to the insights and build our models for AI. And then the next piece is how do you activate all that? And there's a team over there to help. So it's really spreading the wealth and expertise where it needs to go. >> Yeah, I love that. You took a couple of things that really resonated with me. You talked about context a couple of times and this notion of a federated model, because historically the sort of big data architecture, the team, they didn't have the context, the business context, and my inference is that's changing and I think that's critical. Your talk at Data Citizens is called how obsessive collaboration fuels scalable DataOps. You talk about the data, the DevOps team. What's the premise you put forth to the audience? >> So the point about obsessive collaboration is sort of taking the hubris out of your expertise on the data. Certainly there's a recognition by data professionals that the business understands and owns their data. They know the semantics, they know the context of it and just receiving the requirements on that was assumed to be okay. And then you could provide a data foundation, whether it's just a lake or whether you have a warehouse environment where you're pulling for your analytics. The reality is that as we move into more of AI machine learning type of model, one, more context is necessary. 
And you're kind of balancing between what are the things that you can ascribe to the data globally which is what data engineers can support. And then there's what is unique about the data and the context of the data that is related to the business value and outcome as well as the feature engineering that is being done on the machine learning models. So there has to be a really tight link and collaboration between the data engineers, the data scientists, and analysts, and the business stakeholders themselves. You see a lot of pods starting up that way to build the intelligence within the system. And then lastly, what do you do with that model? What do you do with that data? What do you do with that insight? You now have to shift your collaboration over to the work bench that is going to pull all these components together to create the experiences and the automation that you're looking for. And that requires a different collaboration model around software development. And still incorporating the business expertise from those stakeholders, so that you're satisfying, not only the quality of the code to run the solution, but the quality towards the outcome that meets the expectation and the time to value that your stakeholders have. So data teams aren't just sitting in the basement or in another part of the organization and digitally disconnected anymore. You're finding that they're having to work much more closely and side by side with their colleagues and stakeholders. >> I think it's clear that you understand this space really well. Hubris out context in, I mean, that's kind of what's been lacking. And I'm glad you said you used the word anymore because I think it's a recognition that that's kind of what it was. They were down in the basement or out in some kind of silo. And I think, and I want to ask you this. 
I come back to organization because I think a lot of organizations look the most cost effective way for us to serve the business is to have a single data team with hyper specialized roles. That'll be the cheapest way, the most efficient way that we can serve them. And meanwhile, the business, which as you pointed out has the context is frustrated. They can't get to data. So there's this notion of a federated governance model is actually quite interesting. Are you seeing actual common use cases where this is being operationalized? >> Absolutely, I think the first place that you were seeing it was within the operational technology use cases. There the use cases where a lot of the manufacturing industrial device. Any sort of IOT based use case really recognized that without applying data and intelligence to whatever process was going to be executed. It was really going to be challenging to know that you're creating the right foundation, meeting the SLA requirements, and then ultimately bringing the right quality and integrity to the data, let alone any sort of data protection and regulatory compliance that has to be necessary. So you already started seeing the solution teams coming together with the data engineers, the solution developers, the analysts, and data scientists, and the business stakeholders to drive that. But that is starting to come back down into more of the IT mindset as well. And so DataOps starts to emerge from that paradigm into more of the corporate types of use cases and sort of parrot that because there are customer experience use cases that have an IOT or edge component to though. We live on our smart phones, we live on our smart watches, we've got our laptops. All of us have been put into virtual collaboration. And so we really need to take into account not just the insight of analytics but how do you feed that forward. 
And so this is really where you're seeing sort of the evolution of DataOps as a competency not only to engineer the data and collaborate but ensure that there sort of an activation and alignment where the value is going to come out, and still being trusted and governed. >> I got kind of a weird question, but I'm going. I was talking to somebody in Israel the other day and they told me masks are off, the economy's booming. And he noted that Israel said, hey, we're going to pay up for the price of a vaccine. The cost per dose out, 28 bucks or whatever it was. And he pointed out that the EU haggled big time and they don't want to pay $19. And as a result they're not as far along. Israel understood that the real value was opening up the economy. And so there's an analogy here which I want to come back to my organization and it relates to the DataOps. Is if the real metric is, hey, I have an idea for a data product. How long does it take to go from idea to monetization? That seems to me to be a better KPI than how much storage I have, or how much geometry petabytes I'm managing. So my question is, and it relates to DataOps. Can that DataOps, should that DataOps individual maybe live, and then maybe even the data engineer live inside of the business and is that even feasible technically with this notion of federated governance? Are you seeing that and maybe talk a little bit more about this DataOps role. Is it. >> Yeah. >> Fungible. >> Yeah, it's definitely fungible. And in fact, when I talked about sort of those three units of there's your core enterprise data services, there's your BI and data, and then there's your line of business. All of those, the engineering and the ops is the DataOps which is living in all of those environments and being as close as possible to where the value proposition is being defined and designed. So absolutely being able to federate that. 
And I think the other piece on DataOps that is really important is recognizing how the practices around continuous integration and continuous deployment using agile methodologies is really reshaping. A lot of the waterfall approaches that were done before where data was lagging 12 to 18 months behind any sort of insights, but a lot of the platforms today assume that you're moving into a standard mature software development life cycle. And you can start seeing returns on investment within a quarter, really, so that you can iterate and then speed that up so that you're delivering new value every two weeks. But it does change the mindset this DataOps team aligned to solution development, aligned to a broader portfolio management of business capabilities and outcomes needs to understand how to appropriately scope the data products that they're delivering to incremental value-based milestones. So the business feels that they're getting improvements over time and not just waiting. So there's an MVP, you move forward on that and optimize, optimize, extend scale. So again, that CICD mindset is helping to not bottleneck and wait for the complete field of dreams to come from your data and your insights. >> Thank you for that, Michelle. I want to come back to this idea of collaboration because over the last decade we've seen attempts, I've seen software come out to try to help the various roles collaborate and some of it's been okay, but you have these hyper specialized roles. You've got data scientists, data engineers, quality engineers, analysts, et cetera. And they tend to be in their own little worlds. But at the end of the day we rely on them all to get answers. So how can these data scientists, all these stewards, how can they collaborate better? What are you seeing there? >> You need to get them onto the same process. That's really what it comes down to. If you're working from different points of view, that's one thing. 
But if you're working from different processes collaborating is really challenging. And I think the one thing that's really come out of this move to machine learning and AI is recognizing that you need processes that reinforce collaboration. So that's number one. So you see agile development in CICD not just for DataOps, not just for DevOps, but also encouraging and propelling these projects and iterations for the data science teams as well or even if there's machine learning engineers incorporated. And then certainly the business stakeholders are inserted within there as appropriate to accept what it is that is going to be developed. So processes is number one. And number two is what is the platform that's going to reinforce those processes and collaboration. And it's really about what's being shared. How do you share? So certainly what we're seeing within the platforms themselves is everybody contributing into some sort of a library where their components and products are being ascribed to and then that's able to help different teams grab those components and build out what those solutions are going to be. And in fact, what gets really cool about that is you don't always need hardcore data scientists anymore as you have this social platform for data product and analytic product development. This is where a lot of the auto ML begins because those who are less data science-oriented but can build an insight pipeline, can grab all the different components from the pipelines to the transformations, to capture mechanisms, to bolting into the model itself and allowing that to be delivered to the application. So really kind of balancing out between process and platforms that enable and encourage, and almost force you to collaborate and manage through sharing. >> Thank you for that. I want to ask you about the role data governance. You've mentioned trust and that's data quality, and you've got teams that are focused on and specialists focused on data quality. 
There's the data catalog. Here's my question. You mentioned edge a couple of times and I can see a lot of that. I mean, today, most AI, or a lot of the value, I would say most, is modeling. And in the future, you mentioned edge, it's going to be a lot of inferencing in real time. And people are maybe not going to have the time or be involved in that decision. So what are you seeing in terms of data governance, federate. We talked about federated governance, this notion of a data catalog and maybe automating data quality without necessarily having it be so labor intensive. What are you seeing in the trends there? >> Yeah, so I think our new environment, our new normal is that you have to be composable, interoperable, and portable. Portability is really the key here. So from a cataloging perspective and governance we would bring everything together into our catalogs and business glossaries. And it would be a reference point, it was like a massive Wiki. Well, that's wonderful, but why just house it in a museum? You really want to activate that. And I think what's interesting about the technologies today for governance is that you can turn those rules, and business logic, and policies into services that are composable components and bring those into the solutions that you're defining. And in that way what happens is that creates portability. You can drive them wherever they need to go. But from the composability and the interoperability portion of that you can put those services in the right place at the right time for what you need for an outcome so that you start to become behaviorally driven on executing on governance rather than trying to write all of the governance down into transformations and controls to where the data lives. You can have quality and observability of that quality and performance right at the edge and in context of behavior and use of that solution.
You can run those services and governance on gateways that are managing and routing information at those edge solutions, where synchronization between the edge and the cloud comes up. And if it's appropriate during synchronization of the data back into the data lake you can run those services there. So there's a lot more flexibility and elasticity for today's modern approaches to cataloging, and glossaries, and governance of data than we had before. And that goes back into what we talked about earlier of like, this is the new wave of DataOps. This is how you bring data products to fruition now. Everything is about activation. >> So how do you see the future of DataOps? I mean, I've kind of been pushing you to a more decentralized model where the business has more control 'cause the business has the context. I mean, I feel as though, hey, we've done a great job of contextualizing our operational systems. The sales team, they know when the data is crap within my CRM, but our data systems are context agnostic generally. And you obviously understand that problem well. But so how do you see the future of DataOps? >> So I think what's kind of interesting about that is we're going to go to governance on read versus governance on write more so. What do I mean by that? That means that from a business perspective there's two sides of it. There's ensuring that where governance is run is, as we talked about before, executing at the appropriate place at the appropriate time. It's semantically domain-centric driven, not logical and systems centric. So that's number one. Number two is also recognizing that business owners or business operations actually play a role in this, because as you're working within your CRM systems, like a Salesforce, for example, you're using an iPaaS like MuleSoft to connect to other applications, connect to other data sources, connect to other analytics sources.
And what's happening there is that the data is being modeled and personalized to whatever view insight our task has to happen within those processes. So even CRM environments where we think of as sort of traditional technologies that we're used to are getting a lift, both in terms of intelligence from the data but also your flexibility and how you execute governance and quality services within that environment. And that actually opens up the data foundations a lot more and avoids you from having to do a lot of moving, copying centralizing data and creating an over-weighted business application and an over, both in terms of the data foundation but also in terms of the types of business services, and status updates, and processes that happen in the application itself. You're drawing those tasks back down to where they should be and where performance can be managed rather than trying to over customize your application environment. And that gives you a lot more flexibility later too for any sort of upgrades or migrations that you want to make because all of the logic is contained back down in a service layer instead. >> Great perspectives, Michelle, you obviously know your stuff and it's been a pleasure having you on. My last question is when you look out there anything that really excites you or any specific research that you're working on that you want to share, that you're super pumped about? >> I think there's two things. One is it's truly incredible the amount of insight and growth that is coming through data profiling and observation. Really understanding and contextualizing data anomalies so that you understand is data helping or hurting the business value and tying it very specifically to processes and metrics, which is fantastic as well as models themselves like really understanding how data inputs and outputs are making a difference whether the model performs or not. 
And then I think the second thing is really the emergence of more active data, active insights. And as what we talked about before your ability to package up services for governance and quality in particular that allow you to scale your data out towards the edge or where it's needed. And doing so not just so that you can run analytics but that you're also driving overall processes and value. So the research around the operationalization and activation of data is really exciting. And looking at the networks and service mesh to bring those things together is kind of where I'm focusing right now because what's the point of having data in a database if it's not providing any value. >> Michele Goetz, Forrester Research, thanks so much for coming on theCUBE. Really awesome perspectives. You're in an exciting space, so appreciate your time. >> Absolutely, thank you. >> And thank you for watching Data Citizens '21 on theCUBE. My name is Dave Vellante. (upbeat music)

Published Date : Jun 17 2021


ENTITIES

Entity	Category	Confidence
Michele Goetz	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Michele	PERSON	0.99+
Dave	PERSON	0.99+
Michelle	PERSON	0.99+
$19	QUANTITY	0.99+
Israel	LOCATION	0.99+
12	QUANTITY	0.99+
28 bucks	QUANTITY	0.99+
first question	QUANTITY	0.99+
two sides	QUANTITY	0.99+
EU	ORGANIZATION	0.99+
two things	QUANTITY	0.99+
Forrester Research	ORGANIZATION	0.99+
today	DATE	0.99+
One	QUANTITY	0.99+
Data Citizens	ORGANIZATION	0.99+
second thing	QUANTITY	0.99+
both	QUANTITY	0.98+
Collibra	ORGANIZATION	0.98+
18 months	QUANTITY	0.98+
Forrester Research	ORGANIZATION	0.98+
one	QUANTITY	0.96+
Israel	ORGANIZATION	0.96+
three units	QUANTITY	0.94+
Data Citizens '21	TITLE	0.94+
DataOps	ORGANIZATION	0.93+
one thing	QUANTITY	0.9+
Hubris	PERSON	0.89+
first place	QUANTITY	0.85+
past decade	DATE	0.84+
agile	TITLE	0.83+
Number two	QUANTITY	0.82+
single data team	QUANTITY	0.82+
DevOps	TITLE	0.81+
last	DATE	0.8+
DataOps	TITLE	0.8+
edge	ORGANIZATION	0.78+
DataOps	OTHER	0.78+
single version	QUANTITY	0.78+
wave	EVENT	0.74+
two weeks	QUANTITY	0.74+
DataOps	EVENT	0.73+
times	QUANTITY	0.73+
SLA	TITLE	0.72+
number two	QUANTITY	0.71+
Salesforce	TITLE	0.7+
CICD	ORGANIZATION	0.67+
number one	QUANTITY	0.65+
CICD	TITLE	0.6+
iPaaS	TITLE	0.59+
Citizens'21	ORGANIZATION	0.56+
couple	QUANTITY	0.42+
MuleSoft	ORGANIZATION	0.41+
theCUBE	TITLE	0.34+

Kirk Haslbeck, Collibra | Collibra Data Citizens'21

>> Narrator: From around the globe. It's theCUBE covering Data Citizens, 21 brought to you by Collibra. >> Hi everybody, John Walls here on theCUBE continuing our coverage of Data Citizens 2021. And I'm with now Kirk Haslbeck was the vice president of engineering at Collibra. Kirk joins us from his home, Kirk good to see you today. Thanks for joining us here on theCUBE. >> Well, thanks for having me, I'm excited to be here. >> Yeah, no, this is all about data quality, right? That's your world, you know, making sure that you're making the most of this great asset, right? That continues to evolve and mature. And yet I'm wondering from your perspective from your side of the fence, I assume data quality has always been a concern, right? Making the most of this asset, wherever it is. And whenever you can get it. >> Yeah, absolutely. I mean, the challenge hasn't slowed down, right? We're looking at more data coming in all the time laws of large numbers, but you kind of have to wonder a lot of the large organizations have been trying to solve this for quite some time, right? So what is going on? Why isn't it just easier to get our arms around it? And there's so many reasons, but if I were to list maybe the top one it's the diminishing value of static rules and a good example of that might just be something as simple as starting with a gender column. And back in the day, we might have assumed that it had to be an M or an F male or female. And over the last couple of years, we've actually seen that column evolve into six or seven different types. So just the very act of assuming that we could go in and write rules about our business and that they're never going to change and that the data's not evolving. And we start to think about zip codes and addresses that are changing, you know, Google street view. However you want to think of it. Every column and every record is just changing all the time. 
And so what, you know, many large organizations have done is they've written maybe forty thousand, fifty thousand rules, and they have to continue to manage them. So I think we all try to get our arms around rule creation. And it's not even just about that. It would also be about, if you had all the rules in place, could you even keep up with them on a day-to-day changing basis? And so one of the largest companies in the U.S. sat down with myself and team early on and said, so what am I up against? I'm really either going to continue to hire a mountain of rule writers, you know, as they put it, per department, to get my arms around this, and that'll never end, or I need to think of a better way, which was the solution that we were ultimately providing at that time. And, you know, what that solution really entails is using data mining to learn and observe all the data that's already there and to curate the rules based on the data itself, right? That's where all the information is. And then ultimately we have this concept of adaptive ruling, which means all the variants in that column, all the new values that come in every day, the row counts, the sizes, are all being managed. It's an automatic program, so that the rule is recalibrating itself. And I think this is where most chief data officers sit back and say, if I have to protect the franchise, right, if I have to put a trusted data program in place, what are my options and how does it scale? And they have to take a really hard look at something like this. >> You know, the process that you're talking about too, it just kind of reminds me of, like, a diet, in that nobody wants to go through that pain, right? We all want to eat what we want to eat, but you're really happy when you get there at the end of the day. You like the way you look, like the way you feel, like the way you act, all those things. So it'd be almost like what you're talking about in terms of this data, you know, in terms of rule setting, right?
Governance and accessibility and all these things, it can be a tough process. Can be, but it certainly seems well worth it, because you make your data all the more valuable and essential to your business. Is that about right? >> Yeah, that's right, that's right. And you know, it's funny you compare it to a diet. Sometimes I think of a patient stress test, you know, almost like a health exam. And we're spending so much time testing the analytics or testing the models and looking at accuracy, and can anybody achieve 89 to 90%, but we're probably not spending enough time testing our data assumptions, right? Running that diet or health check against the data itself. And I would say that every Fortune 100 or even Fortune 1000 probably considers themselves a data-driven business at this point in time, which means they're going to make decisions quickly based on data. And if we really pull that thread a little bit, what's the cost of making decisions on incorrect data? I mean, it's terribly scary as we start to unfold that, so you're absolutely right. They're taking it very seriously. And it takes a lot of thought about how to get enough coverage and how to create trust in that type of environment. >> Yeah, it's almost like, you know, the concept of input bias a little bit here, where if you're assuming that certain data sets are accurate and pertinent, relevant, all those things, and then you're making decisions based on those data sets, you might be looking at kind of an input bias, if I'm hearing you right. Maybe you're not keeping your mind open as to what really should be important or influential in your decision-making in terms of data, and then obviously acting on that appropriately. So you have to decide, maybe on the front side, you know, what data matters, and you help people do that. And then help me make decisions based on good data, basically, right?
>> Right, that's right, and to be fully transparent and candid, we weren't as strong in the what-data-matters piece of it. We were very strong early on in giving you broad coverage, meaning we made no assumptions, right? We wanted to go out and attack the whole surface of the problem and then sort of have a consistent scoring methodology. And as we've partnered and now been acquired by Collibra, which is an exciting path, they are very good at what's called critical data elements and lineage, and doing graph analysis to sort of identify the assets that are most used. And that's where we see a huge benefit in combining those two powers. So you kind of got there quickly, but ultimately we are combining the forces of total coverage at scale with what is most important to you. >> And you're coming from OwlDQ; you were the founder of that, and that was purchased by Collibra. Tell us a little bit about just how that came to be: first off, what OwlDQ did, what that was all about, and then how this marriage, if you will, how this relationship with Collibra evolved and then you were eventually purchased. >> Yeah, absolutely. So, I mean, I had this passion that I couldn't hold back on in the data community. Once you see it this way, where you can use data mining and compute power to curate and manage rules, and then take it much beyond there to predicting and seeing around the corner for tomorrow, you have to go that direction. So that's exactly what myself and team did. And what we started to see with the early adopters of our software was that they were getting a seven-figure return on investment per department. And they were able to replicate this across many departments, so we've had a great lifespan with those customers, staying and growing and expanding. But we were getting a little bit of market pressure from the investment community, as well as that same customer community, that they wanted us to integrate with their data catalog and the data catalog of choice.
Every time, the conversation was Collibra. And interestingly enough, you know, I ran into the likes of Jim Cushman, and, you know, the whole thing unfolds from there. I think they were seeing a little bit of a similar story, saying, doesn't catalog and lineage belong together with quality? And when we sat together, it was like three market forces suggesting the same answer. And as we laid out the roadmap and the integration, we just can't see it any other way. There's no way, I'll be bold and say, that it goes back the other way, not just for this company but for the industry. Data governance and data intelligence will absolutely combine quality, lineage, catalog and all of the above in the future. It is becoming that clear, I think. >> You know, this is kind of a big-picture question about all of that. Data quality right now, what's driving this avid interest that organizations are showing? And it's, you know, small, medium, enterprise, it's everybody. But in your mind, you know, you've been involved in this for a number of years now. You know, why now? What is it now? Is it just that we have so much more data available, that so much of it's in use, that, you know, we know what we have, and we're realizing that what we have is pretty valuable? But you know, what's the driver, what's the big push here? >> Yeah, it is a tough question. And I have gotten this one before, and it's interesting because it's been around since the nineties, right? So it's a very fair question. There's a couple of things I think that are driving it. One, as we start to see more data in Tableau dashboards, and pick your favorite BI tool, you start to realize the data's not correct. You know, you look at your house on Zillow or whatever, you find out it's mislabeled. It doesn't have the right bedrooms. Maybe humans are entering the listings, and as data's become more available visually, we're more critical of it.
And now businesses are becoming more data-driven, where humans aren't involved as much and the actions are automatically being taken. And it becomes an embarrassing moment if your data is incorrect, and we can really measure that cost at this point. You do see some other factors, like cloud migration. Well, that adds a risk to your business. Could you possibly port everything, not just the servers, not just the software, but all of your data into another system, and think that there would be no errors in that process? So as people are kind of creating their next-generation platforms, and then probably even a touch of COVID accelerating that cloud migration adoption and even just technology adoption. So for a multitude of reasons, there's just more data and there's more data quality concerns than ever before. >> So if you're talking to a prospective client right now, which you probably are, you know, what do you want to share with them? Or what would you encourage them to consider in terms of kind of their data venture, their data journey if you will, in terms of, you know, refining what they have, in terms of mining appropriately, in terms of governing it appropriately, all these things that maybe haven't been given a lot of consideration or deep consideration? >> Yeah, I think the two things, although if you listen to my other talks I can talk forever about all of those items. Probably, you know, maybe just do the napkin math of all the tables, all the files, all the Kafka messages, right? All the columns and fields and attributes, and kind of just multiply that out and try to figure out how you would get coverage, and if you could, how you could maintain it. And why shouldn't we be trading compute power for domain knowledge and things at that point? I think that's the first place to start. And probably the second is, actually, the act of traditional data quality rules puts you in a binary situation.
It basically says you will either have a break record or you will not. So it's a yes/no question. What it never will tell you is what the answer should have been. And if you take a deeper look at the solution that we're providing to the market, we're actually predicting to you what the correct value is, and it's a complete paradigm shift. It obviously is much more scientific, but it's much more powerful to get you to the end answer more quickly instead of just going through break records. >> Right. Tremendous capability that you just described. And on that, I'm going to thank you for the time, but just think about it, right? We're not only going to help you make more sense of your data. We're also going to help you make better decisions and show you what that path might be or what you probably should be considering. So it certainly opens up a lot of doors for a lot of companies in that respect. Kirk, thanks for the time. Sorry we didn't have enough time to hear that guitar in the background, but next time I'm going to hold you to it, okay? >> Yeah, that sounds good, John, I really appreciate it. >> All right, very good. Kirk Haslbeck joining us from Collibra. We continue our coverage here at Data Citizens '21 on theCUBE, and I'm John Walls. (bright music)
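
As a rough illustration of the contrast drawn in this conversation between a static, hand-written rule and a rule curated from the data itself, here is a minimal Python sketch. It is not OwlDQ's or Collibra's actual method; the function names, the sample values, and the 1% frequency threshold are all assumptions made up for the example.

```python
from collections import Counter

# Static rule: someone wrote down, once, that only "M" or "F" is valid.
def static_rule(value):
    return value in {"M", "F"}

# Learned rule (hypothetical): accept any value that makes up at least
# `min_share` of the historical records, so the accepted set is
# recalculated from the data every time it is re-run.
def learn_allowed_values(history, min_share=0.01):
    counts = Counter(history)
    total = len(history)
    return {value for value, n in counts.items() if n / total >= min_share}

# Hypothetical column history: 1,000 records, including a newer code "X"
# and a handful of likely typos ("Fm").
history = ["M"] * 480 + ["F"] * 490 + ["X"] * 25 + ["Fm"] * 5
allowed = learn_allowed_values(history)

print(static_rule("X"))   # the static rule flags the new, legitimate code
print("X" in allowed)     # the learned rule has adapted to it
print("Fm" in allowed)    # rare values are still flagged for review
```

Re-running `learn_allowed_values` on fresh data is the only "maintenance" step in this sketch, which is the trade of compute power for hand-written rules that the interview describes. A real system would also need to track row counts, value distributions, and drift over time.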

Published Date : Jun 17 2021

SUMMARY :

brought to you by Collibra. Kirk good to see you today. me, I'm excited to be here. And whenever you can get it. and that the data's not evolving. like the way you feel, And you know, it's funny and you help people do that. of identify the assets that are most used. and then you were eventually purchased. and all of the above in the future. but you know, what's the driver, and the actions are you know, what do you to get you to the end answer I'm going to hold you to it, okay. Yeah, that sounds good, joining us from Collibra, we

ENTITIES

Entity	Category	Confidence
Kirk	PERSON	0.99+
Collibra	ORGANIZATION	0.99+
John	PERSON	0.99+
John Walls	PERSON	0.99+
six	QUANTITY	0.99+
89	QUANTITY	0.99+
forty thousand	QUANTITY	0.99+
Kirk Haslbeck	PERSON	0.99+
Jim Cushman	PERSON	0.99+
second	QUANTITY	0.99+
two powers	QUANTITY	0.99+
two things	QUANTITY	0.99+
one	QUANTITY	0.99+
U.S	LOCATION	0.98+
90%	QUANTITY	0.98+
Tableau	TITLE	0.98+
seven figure	QUANTITY	0.97+
tomorrow	DATE	0.97+
OwlDQ	ORGANIZATION	0.96+
today	DATE	0.95+
three market forces	QUANTITY	0.93+
fifty thousand rules	QUANTITY	0.93+
nineties	DATE	0.93+
One	QUANTITY	0.93+
first	QUANTITY	0.92+
theCUBE	ORGANIZATION	0.91+
Kafka	PERSON	0.88+
first place	QUANTITY	0.85+
seven different types	QUANTITY	0.83+
Data Citizens'21	ORGANIZATION	0.82+
couple things	QUANTITY	0.73+
Google	ORGANIZATION	0.73+
Data Citizens	ORGANIZATION	0.72+
2021	DATE	0.69+
COVID	TITLE	0.69+
fortune 1000	ORGANIZATION	0.66+
Data	EVENT	0.66+
fortune 100	ORGANIZATION	0.66+
street view	TITLE	0.65+
last couple of years	DATE	0.63+
21	EVENT	0.55+
Zillow	ORGANIZATION	0.55+
Data Citizens	TITLE	0.51+
Citizens	ORGANIZATION	0.39+
21	QUANTITY	0.35+

Stijn "Stan" Christiaens | Collibra Data Citizens'21

>>From around the globe, it's theCUBE, covering Data Citizens '21, brought to you by Collibra. >>Hello everyone, John Walls here as we continue our CUBE conversations as part of Data Citizens '21, the conference ongoing. Collibra is at the heart of that, really at the heart of data these days, helping companies and corporations make sense of all the data chaos that they're dealing with, trying to provide new insights, new analyses, being a lot more efficient and effective with your data. That's what Collibra is all about, and their founder and their Chief Data Citizen, if you will, Stan Christiaens, joins us today. And Stan, I love that title, Chief Data Citizen. What is that all about? What does that mean? >>Hey John, thanks for having me over, and hopefully we'll get to the point where the Chief Data Citizen title is clear to you. Thanks, by the way, for giving us the opportunity to speak a little bit about what we're doing with our Chief Data Citizen. We started the company about 13 years ago, in 2008, and over those years as a founder I've worn many different hats, from product to presales to partnerships and a bunch of other things. But ultimately the company reaches a certain point, a certain size, where systems and processes become absolutely necessary if you want to scale further. For us, this is the moment in time when we said, okay, we probably need a data office right now ourselves, something that we've seen with many of our customers. So I said, okay, let me figure out how to lead our own data office and figure out how we can get value out of data using our own software at Collibra itself. And that's where the Chief Data Citizen role comes in. On Friday evening, we like to call that drinking our own champagne; Monday morning, you know, eating our own dog food. But essentially this is what we help our customers do: build out their data offices. So we're doing this ourselves now, and we're very hands-on.
So there's a lot of things we're learning again, just like our customers do. And for me at Collibra, this means that I'm responsible as Chief Data Citizen for our overall data strategy, which talks a lot about data products, as well as our data infrastructure, which is needed to power data products. Now, because we're doing this in the company and also doing this in a way that is helpful to our customers, we're also figuring out how we translate the learnings that we have ourselves and give them back to our customers, to our partners, to the broader ecosystem as a whole. And that's why, if you summarize the strategy, I like to sometimes refer to it as Data Office 2025: it's 2025, what does the data office look like by then? And we recommend that our customers also have that forward-looking view just as well. So if I summarize the answer a little bit, it's very similar to a chief data officer role, but because it has the external evangelization component, helping other data leaders, we like to refer to it as the Chief Data Citizen. >>Yeah, you talk about evangelizing. Obviously with that you're talking about certain kinds of responsibilities and obligations, and when I think of citizenship in general, I think about privileges and rights, and about national citizenship. You're talking about data citizenship. So I assume that with that you're talking about appropriate behaviors, well-defined behaviors, and kind of keeping it between the lanes, basically. Is that how you look at being a data citizen? And if not, how would you describe that to a client, about being a data citizen? >>It's a very good point. As a citizen, you have rights and responsibilities, and the same is exactly true for data citizens. For us, starting with what it is, right, for us the data citizen is somebody who uses data to do their job.
And we've purposely made that definition very broad, because today we believe that everyone in some way uses data to do their job. You know, data is universal. It's critical to business processes, and its importance is only increasing. And we want all the data citizens to have appropriate access to data and the ability to do stuff with data, but also to do that in the right way. And if you think about it, this is not just something that applies to you and your job but also extends beyond the workplace, because as a data citizen you're also a human being, of course. So the way you do data at home, with your friends and family, all of this becomes important as well. And we like to think about it as informed privacy. As data citizens, we think about trust in data all the time, because ultimately everybody's talking today about data as an asset, and data is the new gold and the new oil and the new soil. And there is a ton of value in data, but it's not just organizations themselves that see this. It's also the bad actors out there. We're reading a lot more about data breaches, for example. So ultimately there is no value without risk. So as a data citizen you can achieve value, but you also have to think about how do I avoid these risks? And as an organization, if you manage to combine both of those, that's when you can get the maximum value out of data in a trusted manner. >>Yeah, I think this is a pretty interesting approach that you've taken here, because obviously there are processes with regard to data, right? I mean, you know, that's pretty clear. But there's a culture that you're talking about here: not only are we going to have an operational plan for how we do this certain activity and how we're going to analyze here, input here, perform action on that, whatever, but we're going to have a mindset or an approach mentally that we want our company to embrace.
So if you would, walk me through that process a little bit in terms of creating that kind of culture, which is very different than kind of the X's and O's and the technical side of things. >>Yeah, that's I think where organizations face the biggest challenge, because, you know, maybe they're hiring the best, most unique data scientists in the world, but it's not about what that individual can do, right? It's about what the combination of data citizens across the organization can do. And I think there it starts first by thinking as an individual about the universal Golden Rule: treat others as you would want to be treated yourself, right? The way you would ethically use data at your job, think about that. There's other people and other companies who you would want to do the same thing. Now, from our experience in our own data office at Collibra, as well as what we see with our customers, a lot of that personal responsibility, which is where culture starts, starts with data literacy. And you know, we talked a little bit about Planet Rock and small statues in Brussels, Belgium, where I'm from. But essentially here we speak a couple of languages in Belgium, and for organizations, for individuals, data literacy is very similar. You know, you're able to read and write, which are pretty essential for any job today. And so we want all data citizens to also be able to speak and read and write data fluently, if I can express it this way. And one of the key ways of getting that done and establishing that culture around data lies with the one who leads data in the organization, the chief data officer or however the role is called. They play a very important role in this. The comparison maybe that I always make there is, think about other assets in your organization. You know, you're organized for the money asset, for the talent asset with HR, and a bunch of other assets. So let's talk about the money asset for a little bit, right?
You have a finance department, you have a chief financial officer. And obviously their responsibility is around managing that money asset, but it's also around making others in the organization think about that money asset. And they do that through established processes and responsibilities like budgeting and planning, but also ultimately down to the individual, where, you know, through expense sheets that we all love so much, they make you think about money. So if the CFO makes everyone in the company think about money, the data officer or the data lead has to make everyone in the company think about data as an asset just as well. And those rights, those responsibilities in that culture, they also change, right? Today they're set this and this way because of privacy and policy X and Y and Z. But tomorrow, for example, as with the European Union's new regulation around AI, there's a bunch of new responsibilities you have to think about. >>You know, you mentioned security, and value and risk, which certainly are part and parcel, right? If I have something important, I've got to protect it, because somebody else might want to create some damage, some harm, and steal my value, basically. Well, that's what's happening, as you point out, in the data world these days. So what kind of work are you doing in that regard, in terms of reinforcing the importance of security culture, privacy culture, you know, this kind of protective culture within an organization, so that everybody fully understands the risks, but also the huge upsides if you do enforce this responsibility and these good behaviors, that obviously the company can gain from and then provide value to their client base? So how do you reinforce that within your clients, to spread that culture, if you will, within their organizations? >>Spreading a culture is not always an easy thing.
Especially because a lot of organizations think about the value around data but, to your point, not always about the risks that come associated with it, sometimes just because they don't know about them yet, right? There's new architectures that come into play, like the cloud, and that comes with a whole bunch of new risks. That's why one of the things that we recommend always to our customers, and to data officers in our customers' organizations, is that next to establishing that data literacy, for example, and working on data products, they also partner strongly with other leaders in their organization. On the one hand, for example, the legal folks, where typically you find the aspects around privacy, and on the other hand the information security folks. Because if you're building up a sort of map of your data, look at it like a castle, right, that you're trying to protect. If you don't have a map of your castle with the strong points and weak points and, you know, where people can dig a hole under your wall or what have you, then it's very hard to defend. So you have to be able to get a map of your data, a data map if you will: know what data is out there, who it's being used by, and why and how. And then you want to prioritize that data, which is the most important, what are the most important uses, and put the appropriate protections and controls in place. And it's fundamental that you do that together with your legal and information security partners, because as a data leader you may have the data knowledge, the data expertise, but there's a bunch of other things that come into play when you're trying to protect not just the data but really your company and its data as a whole. >>You know, you were talking about 2025 a little bit ago, and I think good for you. That's quite a crystal ball that you have, you know, looking with the headlights that far down the road.
But I know you have to be; you know, that kind of progressive thinking is very important. What do you see in the long term, for number one, your position as a Chief Data Citizen, if you will, and then the role of the chief data officer, which you think is kind of migrating toward that citizenship, if you will? So maybe put on those long-term vision goggles of yours again and tell me, what do you see as far as these evolving roles and these new responsibilities for people who are CDOs these days? >>Well, 2025 is closer than we think, right? And obviously my crystal ball is as fuzzy as everyone else's, but there's a few trends that you can easily identify and that we've seen by doing this for so long at Collibra. One is the push around data. I think last year, the year 2020, Covid sort of became the executive director of digitalization and forced everyone to think more about digital. And I expect that to continue, right? So that's an important aspect. The second important aspect that I expect to continue for the next couple of years, easily to 2025, is the whole movement to the cloud. So those cloud-native architectures become important, as well as, you know, preparing your data around it and preparing your policies around it, et cetera. I also expect that privacy regulations will continue to increase, as well as the need to protect your data assets. And I expect that a lot of chief data officers will also be very busy building out those data products. So if you follow that trend, then okay, data products are getting more important for chief data officers, then data quality is something that's increasingly important today to get right; otherwise it becomes a garbage-in, garbage-out kind of situation, where your data products are being fed bad food and ultimately their outcomes are very tricky. So for the chief data officers, I think there was about one of them in 2002.
And then in 2019-ish, let's say, there were around 10,000. So there's plenty of upside to go for the chief data officers; there's plenty of roles like that needed across the world. And they've also evolved in responsibility, and I expect that their position, you know, it is really a C-level position today in most organizations, I expect that that trend will also continue to grow. But ultimately, those chief data officers have to think about the business, right? Not just the defensive and offensive positions around data, like policies and regulations, but also the support for businesses who are today shifting very fast, and will continue to, to digital. So those chief data officers will be seen as heroes, especially when they can build out a factory of data products that really supports the business. But at the same time, they have to figure out how to reach out an olive branch to their technical counterparts, because you cannot build that factory of data products, in my mind at least, without the proper infrastructure. And that's where your technical teams come in. And then obviously the partnerships with your legal and information security folks, of course. >>Well, heroes. Everybody wants to be the hero. And I know that you painted a pretty clear path right now as far as the chief data officer is concerned, and their importance and their value to companies down the road. Stan, we thank you very much for the time today and for the insight, and wish you continued success at the conference. Thank you very much. >>Thank you very much. Have a nice day. Stay healthy. >>Thank you very much. Stan Christiaens joining us, talking about chief data citizenship, if you will, as part of Data Citizens '21, the conference being put on by Collibra. I'm John Walls. Thanks for joining us here on theCUBE.

Published Date : Jun 17 2021

SUMMARY :

Citizens '21, brought to you by Collibra. So if I summarize the answer a little bit, it's very similar to a chief And if not, how would you describe that to a client, about being a data So the way you do data So if you would, walk me through that process a little bit in terms of creating the European Union's new regulation around AI, there's a bunch of new responsibilities you have But also the huge upsides if you do enforce this the legal folks, where typically you find the And then the role of the chief data officer, which you think is kind of migrating toward that citizenship responsibility, and I expect that their position, you know, it is really a And I know that you painted a pretty Thank you very much. Thank you very much. Stan Christiaens joining us, talking about chief data citizenship, if you

ENTITIES

Entity	Category	Confidence
Belgium	LOCATION	0.99+
2002	DATE	0.99+
2008	DATE	0.99+
John Wall	PERSON	0.99+
european union	ORGANIZATION	0.99+
john walls	PERSON	0.99+
Clear Bright Self	ORGANIZATION	0.99+
last year	DATE	0.99+
2019	DATE	0.99+
tomorrow	DATE	0.99+
both	QUANTITY	0.99+
culebra	ORGANIZATION	0.99+
today	DATE	0.98+
john	PERSON	0.98+
first	QUANTITY	0.98+
2025	DATE	0.98+
Stijn "Stan" Christiaens	PERSON	0.98+
one	QUANTITY	0.98+
2020	DATE	0.98+
Dan Christians	PERSON	0.98+
monday morning	DATE	0.97+
friday evening	DATE	0.97+
Covid	PERSON	0.97+
Collibra	ORGANIZATION	0.97+
around 10,000	QUANTITY	0.97+
next couple of years	DATE	0.92+
about 13 years ago	DATE	0.9+
brussels	LOCATION	0.85+
second important aspect	QUANTITY	0.8+
cordoba	ORGANIZATION	0.78+
christians	ORGANIZATION	0.62+
uh	ORGANIZATION	0.61+
Planet Rock	LOCATION	0.61+
Data	PERSON	0.58+
Data citizens 21	EVENT	0.56+
about	DATE	0.54+
ISH	ORGANIZATION	0.46+
21	ORGANIZATION	0.41+

Collibra Day 1 Felix Zhamak

>>Hi, Felix. Great to be here. >>Likewise. Um, so when I started reading about data mesh, I think about a year ago, I found myself the more I read about it, the more I find myself agreeing with other principles behind data mesh, it actually took me back to almost the starting of Colibra 13 years ago, based on the research we were doing on semantic technologies, even personally my own master thesis, which was about domain driven ontologies. And we'll talk about domain-driven as it's a key principle behind data mesh, but before we get into that, let's not assume that everybody knows what data measures about. Although we've seen a lot of traction and momentum, which is fantastic to see, but maybe if you could start by talking about some of the key principles and, and a brief overview of what data mesh, uh, Isabella of >>Course, well, they're happy to, uh, so Dana mesh is an approach is a new approach. It's a decentralized, decentralized approach to managing and accessing data and particularly analytical data at scale. So we can break that down a little bit. What is analytical data? Well, analytical data is the data that fuels our reporting as a business intelligence. Most importantly, the machine learning training, right? So it's the data, that's, it's an aggregate view of historical events that happens across organizations, many domains within organizations, or even beyond one organization, right? Um, and today we manage, uh, this analytical data through very centralized solutions. So whether it's a data lake or data warehouse or combinations of the two, and, uh, to be honest, we have kind of outsource the accountability for it, to the data team, right? It doesn't happen within the domains. Uh, what we have found ourselves with is, uh, central button next. 
>>So as we see the growth in the scale of organizations, in terms of the origins of the data and in terms of the great expectations for the data, all of these wonderful use cases that are, that requires access to that, unless we're data, uh, we find ourselves kind of constraints and limited in agility to respond, you know, because we have a centralized bottleneck from team to technology, to architecture. So there's a mesh kind of is that looks at the past what we've done, accidental complexity that we've kind of created and tries to reimagine a different way of, uh, managing and accessing data that can truly scale as this origins of the data grows. As they become available within one organization, we didn't want a cloud or another, and it links down really the approach based on four principles. Uh, so I so far, I haven't tried to be prescriptive as exactly how you implement it. >>I leave that to Elizabeth, to the imaginations of the users. Um, of course I have my opinions, but, but without being prescriptive, I think there are full shifts that needs to happen. One is, uh, we need to start breaking down the, kind of this complex problem of accessing to data around boundaries that can allow this to scale out a solution. So boundaries that are, that naturally fits into that model or domains, right. Our business domain. So, so there's a first principle is the domain ownership of the data. So analytical data will be shared and served and accountable, uh, by the domains where they come from. And then the second dimension of that is, okay. So once we break down this, the ownership of the database on domains, how can we prevent this data siloing? So the second principle is really treating data as a product. >>So considering the success of that data based on the access and usability and the lifelong experience of data analysts, data scientists. So we talk about data as a product and that the third principle is to really make it possible feasible. 
We need to really rethink our data platforms, our infrastructure capabilities, and create a new set of self-serve capabilities that allow domains, in fact, to own their data and to manage the life cycle of their analytical data. So self-serve data infrastructure as a platform is the third principle. And the last principle is really around governance, because we have to think about governance. In fact, when I first wrote it down, this was like a little concern embedded in some of my texts, and I thought, okay, to make this real, we need to think about the security and quality of the data and the accessibility of the data at scale, in a fashion that embraces this autonomous domain ownership. So we have to think about how we can make this real with computational governance. How can we make those domains be part of the governance? Federated computational governance is the fourth principle. So in essence, it's an organizational shift, it's an architectural change, and of course technology needs to change to get us to decentralized access and management of analytical data. >> Yeah, I think that makes a ton of sense. If you want to scale, typically you have to think much more distributed versus centralized. We've seen it in other practices as well, that domain-driven thinking. I think especially around engineering, right? We've seen a lot of the same principles and best practices used to scale engineering teams and not make the same mistakes again. But maybe we can start there, with the core principles around that domain-driven thinking. Can you elaborate a little bit on that? Why is that so important in data organizations and data functions as well? >> Absolutely. I mean, if you look at your organizations, organizations are complex systems, right?
They are made of parts, which are basically the domains, the functions of the business: your automation and your customer management, your sales, marketing. And then the behavior of the organization is the result of an intricate network of dependencies and interactions between these domains. So if we just overlay data on this complex system, it does make sense, in order to scale, to bring the ownership of and access to data right to the domain where it originates, right, to the people who know that data best and are most capable of providing that data. So to optimize response to change, to optimize creating new features, new services, new machine learning models, we've got to think about local optimization, but not at the cost of the global good, right? So domain ownership really talks about giving autonomy to the domains, and accountability to provide their data and model the data in a responsible way, and to be accountable for its quality. So localize some of those responsibilities and empower them, but at the same time think about the global good. How does that domain need to be accountable to the other domains on the mesh? The governance piece covers that. And that leads to some interesting architectural shifts, because when you think about the distribution of the data, then you think about, okay, if I have a machine learning model that needs, you know, three pieces of data from different domains, I end up actually distributing the compute back to those domains too. So it actually starts shifting the architecture as well. We start with ownership. >> Yeah, no, I think that makes a ton of sense, but I can imagine people thinking, well, if you're organizing according to these domains, aren't we going to create different silos, even more silos?
And I think that's where the second principle, thinking of data as a product, comes in. I think that's incredibly powerful. In my mind, it's powerful because it helps us think about usability. It helps us think about the consumer of that data and really packaging it in the right way. And there's one sentence that I've heard you use that I think is incredibly powerful: it's less collecting, more connecting. Can you elaborate on that a little bit? >> Absolutely. I mean, the power and the value of the data is not in how much of it we have got and stored on disk, right? It's really about connecting that data to other data sets to illuminate new insights and higher-order information, and connecting that data to the users, right, when they want to use it. So that's why I think if we shift that thinking from just collecting more in one place, whatever that is, to the ability to connect datasets, then we arrive at a different solution. So I think data as a product, as you said exactly, was a response to the challenges that domain-driven siloing could create. And the idea is that the data that these domains now own needs to be shared, with some accountability and incentive structure, as a product. So if you bring product thinking to data, what does that mean? That means delighting the experience of the data users. Who are they? They're the data analysts, the data scientists. So how can we delight their experience? Their journey starts with a hypothesis: I have a question. Do I have the right data to answer this question with a particular model? Let me discover it, let me find out if it's useful. Do I trust it? So really facilitating that journey. I think we have two choices in that. We have the choice of the source of that data.
The people who really should be accountable for it shrug off the responsibility and say, you know, I dumped this data on some event stream, and somebody downstream, the governance or data team, will take care of turning it into a usable piece of information. And that's what we have done for, you know, almost half a century. Or, let's bring the intention of providing quality data back to the source, and both empower the folks there and make them accountable for providing that data right at the source as a product. And I think by being intentional about that, we're going to remove a lot of the accidental complexity that we have created with, you know, labyrinths of pipelines moving data from one place to another and trying to build quality back into it. And that requires, you know, architectural shifts, organizational shifts, incentive models, the whole package. >> The whole package, absolutely. And we'll talk about that. Federated computational governance is going to be a really important aspect. But the other part of data as a product, next to usability, is also trust, right, if you want to use it. Why is trust also so important if you think about data as a product? >> Well, I mean, maybe we turn this question back to you. Would you buy the shiniest product if you don't trust it, if you don't trust where it comes from? Can I use it? Does it have integrity? I wouldn't. I think it's almost irresponsible to use data that you can't trust, right? And really, the meaning of trust is: do I know enough about this data for it to be useful for the purpose that I'm using it for? So I think trust is absolutely a fundamental characteristic of data as a product. And again, it comes back to bridging the gap between what the data user needs to know to really trust and use that data, to find out whether it's suitable, and what they know today.
So we can bridge that gap with, you know, adding documentation, adding SLOs, adding lineage, all of this additional information, but not only that, also having people who are accountable for providing that integrity and those SLOs and guarantees. So it's really those product owners. So for me, trust is a non-negotiable characteristic of data as a product, like any other consumer product. >> Exactly. Like you said, if you think about consumer products and consumer marketplaces like Uber, Amazon or Airbnb, you have the simple rating as a very simple way of showing trust to those different stakeholders. And then we also say, okay, how do we actually get there? And I think data mesh also talks a little bit about the roles and responsibilities. And I think the overall importance of a data product owner is probably aligned with that importance of trust. >> Yeah, absolutely. I think we can't just wish for these good things to happen without putting the accountability and the right roles in place. And the data product owner is just the starting point for us to stop playing hot potato when it comes to who owns the data and who will be accountable for it. Not so much who's the actual owner of that data, because the owner of the data is you and me, where the data really comes from. But it's the data product owner who's going to be responsible for the life cycle of it. They know when the data gets changed and which consumers need a new field or new information; they make sure that gets carried out, and maybe one day they retire that data. So that long-term ownership, with intimate understanding of the needs of the users of that data, as well as the data itself and the domain itself, and managing the life cycle of that, I think that's a necessary role. And then we have to think about why anybody would want to be a data product owner, right?
What are the incentives we have to set up in the infrastructure, you know, in the organization? And it really comes down to, I think, adopting prior art that exists in the product ownership landscape and bringing it to the data, and treating the data users as the customers, right, to make them happy. So the incentives and KPIs for these data product owners need to be aligned with the happiness of their data users. >> Yep, I love that. The alignment, again, to the consumer, using things we know from product management and product ownership, these roles, and reusing that for data. I think that makes a ton of sense. And it's a good segue to talk a little about governance, right? We mentioned already federated governance, computational governance. And we're seeing that challenge often with our customers: centralizing versus decentralizing. How do we find the right balance? Can you talk a little bit about that in the context of data mesh? How do we do this? >> Yeah, absolutely. I was hoping to pack three concepts into the title of the governance principle, but I thought that would be quite a mouthful. So, as you mentioned, there's the federated aspect, the computational aspect, and, I think, embedded governance, if I could add another phrasing there. And really it's about, as we talked about, how to make it happen. So I think the federation matters because the people who are really in a position, the data product owners, to provide data in a trustworthy and secure way, with integrity, have to have a stake in doing that, right? They have to be accountable, not just for their little domain or big domain, but they also have to have accountability for the mesh. So some of the concerns apply to all of the data products: how we secure them, how we consistently secure them.
How do we model the data, or the schema language, or the SLO metrics, in a way that allows this data to be interoperable so we can join multiple data products? So we have to have, I think, a set of policies, really a minimum set of policies, that we apply globally to all the data products, and then, in a federated fashion, incentivize the data product owners to have a stake in that and make that happen, because there's always going to be a challenge in prioritizing. Would I add another few attributes to my data sets to make my customers happy, or would I adopt this standardized modeling language, right? They have to make that kind of continuous prioritization, and they have to be incentivized to do both, right? And then the other piece of it is, okay, if we want to apply these consistent policies across many data products on the mesh, how would it be physically possible? And the only way I can see it being possible, and I have seen it done in service mesh, is by embedding those policies as computation, as code, into every single data product. And how do we do that? Again, the platform has a big part in it: being able to have these embedded policy engines, and whatever those things are, in the data products, and to be able to run them as computation. So by default, when you become a data product, as part of the scaffolding of that data product, you get all of these computational capabilities to configure your policies according to the global policies. >> No, that makes sense. That makes a ton of sense. >> I'm just curious, really. You've been at this for a while. You've built this system over the 13 years, coming from an academic background. To be honest, we run into your products at lots of our clients, and there's always a chat conversation within ThoughtWorks: do you guys know about this product?
So I'm curious: how do you think data governance technology needs to shift with data mesh, right? And, if I may ask, how would your roadmap change with data mesh? >> Yeah, I think it's a really good question. What I don't want to do is make the mistake that vendors often make and think of data mesh as a product. I think it's a much more holistic mindset change, right, for the organization. Yes, there needs to be a platform enablement component there. And I think, authentically, how we think about governance is very aligned with some of the principles in data mesh, that federated thinking. Our customers know about our communities, domains and operating model. We really support that flexibility. From a roadmap perspective, making that even easier is always a focus area for us. Specifically around data mesh, there are a few things that come to mind. One, I think, is connectivity, right? If you give different teams more ownership and accountability, we're not going to live in a world where all of the data is stored in one location, right? You want to give people and teams the opportunity and the accountability to make their own technology decisions so that they are fit for purpose. So I think for whatever platform, being able to really provide out-of-the-box connectivity to a very wide range of technologies is absolutely critical. On the data as a product thinking, that usability, I think that's top of mind. That's part of our roadmap. You're going to hear us talk about that tomorrow as well. That data consumer: how do we make it as easy as possible for people to discover data that they can trust and that they can access? That thinking is a big part of our roadmap.
So again, making that as easy as possible is a big part of it. And also on the computational aspect that you mentioned, I think we believe in that as well. If it's just documentation, it's going to be really hard to keep that alive, right? And so you have to make it active; we have to get close to the actual data. So if you think about policy enforcement, for example, some of the things we're talking about: it's not just the definition, it's the enforcement. Data quality: that's why we are so excited about our data quality acquisition as well. So these are a couple of the things that we're thinking of. Again, your message around moving from collecting to connecting: we talk about unity, and I think that works really, really well with our mission and vision as well. So Zhamak, thank you so much. I wish we had more time to continue the conversation, but it's been great to have a conversation here. Thank you so much for being here today, and let's continue to work on data mesh. >> I'm excited to see it. Just come to like.

Published Date : Jun 17 2021


ENTITIES

Entity	Category	Confidence
Amazon	ORGANIZATION	0.99+
Felix	PERSON	0.99+
Isabella	PERSON	0.99+
Uber	ORGANIZATION	0.99+
Airbnb	ORGANIZATION	0.99+
Elizabeth	PERSON	0.99+
Felix Zhamak	PERSON	0.99+
13 years	QUANTITY	0.99+
second principle	QUANTITY	0.99+
two	QUANTITY	0.99+
today	DATE	0.99+
one sentence	QUANTITY	0.99+
third principle	QUANTITY	0.99+
second dimension	QUANTITY	0.99+
fourth principle	QUANTITY	0.99+
both	QUANTITY	0.99+
first principle	QUANTITY	0.99+
two choices	QUANTITY	0.98+
Dana	PERSON	0.98+
Emily	PERSON	0.98+
tomorrow	DATE	0.98+
first	QUANTITY	0.98+
one organization	QUANTITY	0.98+
13 years ago	DATE	0.98+
three pieces	QUANTITY	0.97+
a year ago	DATE	0.97+
One	QUANTITY	0.94+
mark	PERSON	0.93+
one location	QUANTITY	0.93+
three concepts	QUANTITY	0.92+
one place	QUANTITY	0.9+
one	QUANTITY	0.86+
eight made	QUANTITY	0.85+
four principles	QUANTITY	0.84+
single data product	QUANTITY	0.79+
Colibra	PERSON	0.76+
Venice	ORGANIZATION	0.73+
half century	DATE	0.63+
Day 1	QUANTITY	0.6+
ThoughtWorks	ORGANIZATION	0.59+

Stijn Stan Christiaens, Co-founder & CTO, Collibra

>> From around the globe, it's theCUBE, covering Data Citizens '21, brought to you by Collibra. >> Hello, everyone, John Walls here, as we continue our CUBE Conversations here as part of Data Citizens '21, the conference ongoing. Collibra is at the heart of that, really at the heart of data these days, helping companies and corporations make sense of all this data chaos that they're dealing with, trying to provide new insights, new analysis, being a lot more efficient and effective with your data. That's what Collibra is all about. And their founder and their chief data citizen, if you will, Stan Christiaens, joins us today. And Stan, I love that title, chief data citizen. What is that all about? What does that mean? >> Hey John, thanks for having me over. And hopefully we'll get to a point where the chief data citizen title is clear to you. Thanks, by the way, for giving us the opportunity to speak a little bit about what we're doing with our chief data citizen. We started the company about 13 years ago, in 2008. And over those years, as a founder I've worn many different hats, from product to pre-sales to partnerships and a bunch of other things. But ultimately the company reaches a certain point, a certain size, where systems and processes become absolutely necessary if you want to scale further. And for us, this is the moment in time where we said, okay, we probably need a data office right now ourselves, something that we've seen with many of our customers. So we said, okay, let me figure out how to lead our own data office and figure out how we can get value out of data using our own software at Collibra itself. And that's where the chief data citizen role comes in. On Friday evening, we like to call that drinking our own champagne; on Monday morning, eating our own dog food. But essentially this is what we help our customers do: build out their data offices. So we're doing this ourselves now, and we're very hands-on.
So there's a lot of things that we're learning, again, just like our customers do. And for me, at Collibra, this means that I'm responsible as chief data citizen for our overall data strategy, which talks a lot about data products, as well as our data infrastructure, which is needed to power data products. Now, because we're doing this in the company and also doing this in a way that is helpful to our customers, we're also figuring out how to translate the learnings that we have ourselves and give them back to our customers, to our partners, to the broader ecosystem as a whole. And that's why, if you summarize the strategy, I like to sometimes refer to it as data office 2025. It's 2025: what does the data office look like by then? And we recommend that our customers also have that forward-looking view. So if I summarize the answer a little bit, it's fairly similar to a chief data officer role, but because it has the external evangelization component, helping other data leaders, we like to refer to it as the chief data citizen. >> Yeah, and you talked about evangelizing. Obviously with that, you're talking about certain kinds of responsibilities and obligations. And when I think of citizenship in general, I think about privileges and rights and, you know, about national citizenship. You're talking about data citizenship, so I assume that with that you're talking about appropriate behaviors and well-defined behaviors, and kind of keeping it between the lanes, basically. Is that how you look at being a data citizen? And if not, how would you describe to a client what being a data citizen is? >> It's a very good point. As a citizen you have rights and responsibilities, and the same is exactly true for a data citizen. Starting with what it is, right: for us, a data citizen is somebody who uses data to do their job.
And we've purposely made that definition very broad because today we believe that everyone in some way uses data to do their job. You know, data is universal. It's critical to business processes and its importance is only increasing. And we want all the data citizens to have appropriate access to data and the ability to do stuff with data, but also to do that in the right way. And if you think about it, this is not just something that applies to you in your job, but also extends beyond the workplace, because as a data citizen, you're also a human being, of course. So the way you do data at home, with your friends and family, all of this becomes important as well. And we like to think about it as informed, privacy-aware data citizens who think about trust in data all the time, because ultimately everybody's talking today about data as an asset, and data is the new gold, and the new oil, and the new soil, and there is a ton of value in data. But as much as organizations themselves see this, so do the bad actors out there. We're reading a lot more about data breaches, for example. So ultimately there's no value without risk. So as a data citizen, you can achieve value, but you also have to think about how to avoid these risks. And as an organization, if you manage to combine both of those, that's when you can get the maximum value out of data in a trusted manner. >> Yeah, I think this is a pretty interesting approach that you've taken here, because obviously there are processes with regard to data, right? I mean, that's pretty clear. But there's also a culture that you're talking about here: not only are we going to have an operational plan for how we do this certain activity and how we're going to analyze here, input here, or perform action on that, whatever, but we're going to have a mindset or an approach mentally that we want our company to embrace.
So, if you would, walk me through that process a little bit in terms of creating that kind of culture, which is very different than kind of the X's and O's and the technical side of things. >> Yeah. That's, I think, where organizations face the biggest challenge, because, you know, maybe they're hiring the best, most unique data scientists in the world, but it's not about what that individual can do, right? It's about what the combination of data citizens across the organization can do. And I think it starts first by thinking as an individual about the universal golden rule: treat others as you would want to be treated yourself, right? Think about the way you would ethically use data at your job. There are other people at other companies who you would want to do the same thing. Now, from our experience in our own data office at Collibra, as well as what we see with our customers, a lot of that personal responsibility, which is where culture starts, starts with data literacy. And, you know, we talked a little bit about Plymouth Rock and the small statues in Brussels, Belgium, where I'm from, but essentially here we speak a couple of languages in Belgium. And for organizations and for individuals, data literacy is very similar. You know, you're able to read and write, which are pretty essential for any job today. And so we want all data citizens to also be able to speak and read and write data fluently, if I can express it this way. And one of the key ways of getting that done and establishing that culture around data lies with the one who leads data in the organization, the chief data officer, or whatever the role is called. They play a very important role in this. A comparison that I always make there is: think about other assets in your organization. You know, you're organized for the money assets, for the talent assets with HR, and a bunch of other assets. So let's talk about the money assets for a little bit, right?
You have a finance department, you have a chief financial officer, and obviously their responsibility is around managing that money asset. But it's also around making others in the organization think about that money. And they do that through established processes and responsibilities like budgeting and planning, but also ultimately down to the individual, where, you know, through expense sheets that we all love so much, they make you think about money. So if the CFO makes everyone in the company think about money, the data officer, or the data lead, has to make everyone in the company think about the data asset just as well. And those rights, those responsibilities and that culture, they also change, right? Today, they're set this way because of privacy and policy X and Y and Z. But tomorrow, for example, with the European Union's new regulation around AI, there's a bunch of new responsibilities you'll have to think about. >> You mentioned security, and value and risk, which are certainly part and parcel, right? If I have something important, I've got to protect it, because somebody else might want to create some damage, some harm, and steal my value, basically. And that's what's happening, as you point out, in the data world these days. So what kind of work are you doing in that regard in terms of reinforcing the importance of security culture, privacy culture, you know, this kind of protective culture within an organization, so that everybody fully understands the risks, but also the huge upsides if you do enforce this responsibility and these good behaviors that obviously the company can gain from, and then provide value to their client base? So how do you reinforce that with your clients to spread that culture, if you will, within their organizations?
>> Spreading a culture is not always an easy thing. And a lot of organizations especially think about the value around data but, to your point, not always about the risks that come associated with it. Sometimes that's just because they don't know about it yet, right? There are new architectures that come into play, like the cloud, and that comes with a whole bunch of new risks. That's why one of the things that we always recommend to our customers, and to data officers in our customers' organizations, is that next to establishing that data literacy, for example, and working on data products, they also partner strongly with other leaders in their organization. On the one hand, for example, the legal folks, where typically you find the aspects around privacy, and on the other hand, the information security folks. Because if you're building up a sort of map of your data, look at it like a castle, right, that you're trying to protect. If you don't have a map of your castle, with the strong points and the weak points, and where people can dig a hole under your wall or what have you, then it's very hard to defend. So you have to be able to get a map of your data, a data map if you will, and know what data is out there, who it's being used by, and why and how. And then you want to prioritize that data: which is the most important, what are the most important uses? And put the appropriate protections and controls in place. And it's fundamental that you do that together with your legal and information security partners, because as a data lead you may have the data knowledge, the data expertise, but there's a bunch of other things that come into play when you're trying to protect not just the data but really your company and its data as a whole.
>> Now, you were talking about 2025 a little bit ago, and I thought, good for you, that's quite a crystal ball that you have, you know, looking with the headlights that far down the road. But I know that kind of progressive thinking is very important. What do you see in the long term for, number one, your kind of position as a chief data citizen, if you will, and then the role of the chief data officer, which you think is kind of migrating toward that citizenship, if you will? So maybe put on those long-term vision goggles of yours again and tell me, what do you see as far as these evolving roles and these new responsibilities for people who are CDOs these days? >> Well, 2025 is closer than we think, right? And obviously my crystal ball is as fuzzy as everyone else's, but there are a few trends that you can easily identify and that we've seen by doing this for so long at Collibra. One is the push around data. I think last year, 2020, was the year where COVID sort of became the executive director of digitalization. It forced everyone to think more about digital, and I expect that to continue. So that's an important aspect. The second important aspect that I expect to continue for the next couple of years, easily until 2025, is the whole movement to the cloud. So these cloud-native architectures become important, as well as, you know, preparing your data around it, preparing your policies around it, et cetera. I also expect that privacy regulations will continue to increase, as well as the need to protect your data assets. And I expect that a lot of chief data officers will also be very busy building out those data products.
So if you take that trend, then, okay, data products are getting more important for chief data officers, then data quality is something that's increasingly important today to get right. Otherwise, it becomes a garbage in, garbage out kind of situation, where your data products are being fed bad food and ultimately their outcomes aren't very clear. So for the chief data officers, I think there was about one of them in 2002, and then in 2019-ish, let's say, there were 10,000. So there's plenty of upside for the chief data officer; there's plenty of roles like that needed across the world. And they've also evolved in responsibility. And I expect that their position, you know, as it is really a C-level position today in most organizations, that trend will also continue to grow. But ultimately those chief data officers have to think about the business, right? Not just the defensive and offensive positions around data, like almost policies and regulations, but also the support for businesses who are today shifting very fast, and will continue to, to digital. So, those chief data officers will be seen as key notes. Especially when they can build out the factory of data products that really supports the business. But at the same time, they have to figure out how to reach an olive branch to their technical counterparts, because you cannot build a factory of data products, in my mind at least, without the proper infrastructure. And that's where your technical teams come in. And then obviously the partnerships with your legal and information security folks, of course. >> Well, heroes, everybody wants to be the hero. And I know you painted a pretty clear path right now, as far as the chief data officer's concerned, and their importance and the value to companies down the road. Stan, we thank you very much for the time today and for the insight, and wish you continued success at the conference. Thank you very much. >> Thank you very much.
Have a nice day. Stay healthy. >> Thank you very much. Stan Christiaens joining us, talking about chief data citizenship, if you will, as part of Data Citizens '21, the conference being put on by Collibra. I'm John Walls. Thanks for joining us here on theCUBE. (upbeat music)

Published Date : Jun 14 2021

SUMMARY :

21 brought to you by Collibra. really at the heart of data these days in the company and also doing this and if not, how would you describe that that applies to you in your job and O's and the technical side of things. or the data lead, has to think, that obviously the company can gain from, the weak points, and you know that you have it, you know and I expect that to continue. as the chief data officer's concerned Thank you very much. citizenship, if you will,

ENTITIES

Entity	Category	Confidence
John	PERSON	0.99+
2002	DATE	0.99+
John Walls	PERSON	0.99+
2025	DATE	0.99+
Stan	PERSON	0.99+
Belgium	LOCATION	0.99+
2019	DATE	0.99+
Collibra	ORGANIZATION	0.99+
Friday evening	DATE	0.99+
tomorrow	DATE	0.99+
last year	DATE	0.99+
10,000	QUANTITY	0.99+
Today	DATE	0.99+
Stijn Stan Christiaens	PERSON	0.99+
2020	DATE	0.99+
Stan Christiaen	PERSON	0.99+
Stan Christiaens	PERSON	0.99+
European union	ORGANIZATION	0.99+
today	DATE	0.98+
first	QUANTITY	0.98+
both	QUANTITY	0.98+
Brussels Belgium	LOCATION	0.96+
one	QUANTITY	0.95+
2008	DATE	0.93+
next couple of years	DATE	0.91+
13 years ago	DATE	0.91+
second important aspect	QUANTITY	0.83+
Plymouth	LOCATION	0.78+
COVID	ORGANIZATION	0.78+
21	QUANTITY	0.7+
Titelist	ORGANIZATION	0.68+
21	DATE	0.52+
data office 2025	ORGANIZATION	0.52+
about	DATE	0.51+

Data Citizens '21 Preview with Felix Van de Maele, CEO, Collibra

>>At the beginning of the last decade, the technology industry was abuzz because we were on the cusp of a new era of data. The promise of so-called big data was that it would enable data-driven organizations to tap a new form of competitive advantage, namely insights from data at a much lower cost. The problem was data became plentiful, but insights, they remained scarce. A rash of technical complexity combined with a lack of trust due to conflicting data sources and inconsistent definitions led to the same story that we've heard for decades. We spent a ton of time and money to create a single version of the truth, and we're further away than we've ever been before. Maybe as an industry, we should be approaching this problem differently. Perhaps it should start with the idea that we have to change the way we serve business users, i.e., those who understand data context. And with me to discuss the evolving data space, his company, and the upcoming Data Citizens conference is Felix Van de Maele, the CEO and founder of Collibra. Felix, welcome. Great to see you. >>Great to see you. Great to be here. >>So tell us a little bit about Collibra and the problem that you're solving. Maybe you could double click on my upfront narrative. >>Yeah, I think you said it really well. We've seen so much innovation over the last couple of years in data, the exploding volume and complexity of data. We've seen a lot of innovation in how to store and process that volume of data more effectively or more cost-effectively. But fundamentally, the source of the problem, being able to really derive insights from that data effectively, whether it's for an AI model or for reporting, is still as difficult as it was, let's say, 10 years ago. And in a way it's only become more difficult. And so what we fundamentally believe is that next to that innovation on the infrastructure side of data, you really need to look at the people and process side of data.
There are so many more people that today consume and produce data to do their job. That's why we talk about data citizens. We have to make it easier for them to find the right data in a way that they can trust, so that there's confidence in that data to be able to make decisions and to be able to trust the output of that model. And that's really what Collibra is focused on, initially around governance: how do you make sure people, or actually companies, know what data they have, and make sure they can trust it and they can use it in a compliant way. And now we've extended that into the only data intelligence platform today in the industry, where we just make it easier for organizations to truly unite around the data across the whole organization, wherever that data is stored, on premises or in the cloud, whoever is actually using or consuming data. That's why we talk about data citizens. >>I think you're right. I think it is more complex. There's just more of it. And there's more pressure on individuals to get advantage from it. But I want to ask you what sets Collibra apart, because I'd like you to explain why you're not just another data company chasing a problem with what's going to be an incremental solution that's really not going to change anything. What sets Collibra apart? >>Yeah, that's a really good question. And I think what fundamentally sets us apart, what makes us unique, is that we look at data, or the problem around data, as truly a business problem and a business function. So we fundamentally believe that if you believe that data is an asset, you really have to run it as a strategic business function, just like your HR function, your people function, your IT function, your sales and marketing function. You have a system to run that function. You have Salesforce to run sales and marketing. You have ServiceNow to run your IT function.
You have Workday to run your people function, but you need the same system to really run your data function. And that's really how we think about Collibra. So we're not another kind of faster, better database. We're not another data management tool that makes the life of a single individual easier. We're really a business application that focuses on how we bring people together in an effective way so that they can collaborate around the data. It creates efficiency, so you don't have to do things ad hoc. You can easily find the right information. You can collaborate effectively. And it creates the confidence to actually be able to do something with the outcomes of it, the results of all of that work. And so fundamentally looking at the problem as a business function that needs a business system, we call it the system of record or system of engagement for the data function, I think is absolutely critical and really unique in our approach. >>So Data Citizens, your big user conference, Data Citizens '21, is coming up June 16th and 17th. theCUBE's stoked because we love talking about data. This is the first time we're bringing theCUBE to that event. So we're really gearing up for it. And I wonder if you could tell us a little bit about the history and the evolution of the Data Citizens conference? >>Absolutely. I think the first one started six years ago, where we had a small event at a hotel in downtown New York, mostly customers, as their user conference, a lot of the banks, which were at the time the main customers, at 60 people. So a very small event. And it's exploded ever since. This year we expect over 5,000 people. So it's really expanded beyond just the user conference to really become more of almost a community conference and an industry conference. So we're really excited. A big part of why we care so much about the conference is that it's an opportunity to build that data citizens community.
That's what we hear from our customers, from all attendees that come to the conference: bringing those people together that all care about the same topic and are passionate about doing more with data, being able to connect people together, is a big part of that. So we're always looking forward to the event from that perspective. >>A lot of competition, of course, for virtual events these days. What's in it for me? Who should attend, and what can attendees expect from Data Citizens '21? >>Yeah, absolutely. The good thing about the virtual event is that everybody can attend. It's free, it's open from across the world, of course. But what we want for people to take away as attendees is that you learn something pragmatic, so the next day on the job, you can do something. You've learned something very specific. We've also been excited and looked at what is possible from an innovation perspective. And so that's how we look at the event. We bring a lot of customers and organizations that are going to share their best practices, very specifically how they are handling data governance, how they're doing data cataloging, how they're doing data privacy. So very specific best practices and tips on how to be successful, but then also industry experts that can paint the picture of where we're going as an industry. What are the best practices? What do we need to think about today to be ready for what's going to come tomorrow? So that's a big focus. Of course, we're going to talk about Collibra and our product. What do we have in store from a product roadmap and innovation perspective? How are we helping these organizations get there faster? And on that aspect, we're bringing in a lot of partners as well. And so that's a big part of that broader ecosystem, which is really interesting. And finally, like I said, it's really around the community, right?
And that's what we hear continuously from the attendees. Just being able to make these connections, learn new people, learn what they're doing, how they've kind of solved certain challenges. We hear that's a really big part of the value proposition. So as an attendee, the good thing is you can join from anywhere. All of the content is going to be available on demand. So later it's going to be available for you to have a look at as well. Plus you're going to be part of, you're going to become part of that data citizens community, which is a really thriving and growing community where you're going to find a lot of like-minded people with the same passion, the same interest, that we can all learn a lot from. >>Well, I rather like the term data citizen. I consider myself a data citizen, and it has implications just in terms of putting data in the hands of business users. So it's sort of central to this event, obviously. What is a data citizen to Collibra? >>Yeah, it's a really core part of our mission and our vision that we believe that today everyone needs data to do their job. Everyone in that sense has become a data citizen, in the sense that they need to be able to easily access trustworthy data. We have to make it easy for people to easily find the right data that they can trust, that they can understand, and that they can do something with to make their job easier. On the other hand, like a citizen, you have rights and you have responsibilities. As a data citizen, you also have the responsibility to treat that data in the right way, to make sure from a privacy and security perspective that data is, again, like I said, treated in the right way. And so that combination of making it easy, making it accessible, democratizing it, but also making sure we treat data in the right way is really important. And that's a core part of what we believe, that everyone is going to become a data citizen.
And so that's a big part of our mission. >>I like that. We're going to enter into a contract: I'll do my part and you'll give me access to that data. I think that's a great philosophy. So the call to action here: June 16th and 17th, go register at citizens.collibra.com. Go register, because it's not just the normal mumbo jumbo. You're going to get some really interesting data. Felix, I'll give you the last word. >>No, like I said, like you said, go register. It's a great event. It's a great community to be part of. June 16th and 17th, you can block it in your calendar. So go to citizens.collibra.com. It's going to be a great event. >>Thanks for helping us preview this event. It's going to be a great event that we're really excited about. Felix, great to see you. And we'll see you on June 16th and 17th. >>Absolutely. >>All right. Thanks for watching, everybody. This is Dave Vellante for theCUBE. We'll see you next time.

Published Date : May 12 2021

SUMMARY :

At the beginning of the last decade, the technology industry was a buzzing because we were on Great to be here. So tell us a little bit about Collibra and the problem that you're solving. effectively or more cost-effectively, but fundamentally the source of the problem as being able to to be able to trust the output of that, uh, of that model. But I, to ask you what sets Culebra apart, And it creates the confidence to actually be able to do something with the the cubes stoked because we love talking about data. So it's really expanded beyond just the user conference to really become more of almost the community Competition, of course, for virtual events these days with them, what's in it for me, what, it's open from across the road, of course, but what we want for people to take Uh, all of the content is going to be available on demand. So it's sort of central to this event, You also have the responsibility to treat So the call to action here, June 16th and 17th, go register@citizensdotcollibra.com It's a great community to be part of June Uh, this event is going to be a great event that really excited about Felix.

ENTITIES

Entity	Category	Confidence
Felix van de Mala	PERSON	0.99+
Felix Van de Maele	PERSON	0.99+
Dave Volante	PERSON	0.99+
Felix	PERSON	0.99+
June 16	DATE	0.99+
June 16th	DATE	0.99+
17th	DATE	0.99+
60 people	QUANTITY	0.99+
tomorrow	DATE	0.99+
register@citizensdotcollibra.com	OTHER	0.99+
today	DATE	0.99+
Collibra	ORGANIZATION	0.99+
McConnell	PERSON	0.99+
six years ago	DATE	0.99+
over 5,000 people	QUANTITY	0.98+
single	QUANTITY	0.97+
Culebra	ORGANIZATION	0.96+
this year	DATE	0.96+
GDPR	TITLE	0.95+
first one	QUANTITY	0.95+
10 years ago	DATE	0.95+
first time	QUANTITY	0.95+
last decade	DATE	0.93+
17	DATE	0.92+
New York	LOCATION	0.89+
21	DATE	0.88+
single version	QUANTITY	0.88+
decades	QUANTITY	0.86+
data citizens	EVENT	0.75+
next day	DATE	0.72+
double	QUANTITY	0.66+
last couple of years	DATE	0.64+
Data Citizens	TITLE	0.63+
more people	QUANTITY	0.61+
ton	QUANTITY	0.53+
Salesforce	ORGANIZATION	0.51+
'21	DATE	0.44+

Felix Van de Maele, CEO, Collibra

(upbeat music) >> At the beginning of the last decade, the technology industry was abuzz because we were on the cusp of a new era of data. The promise of so-called big data was that it would enable data-driven organizations to tap a new form of competitive advantage, namely insights from data at a much lower cost. The problem was data became plentiful, but insights, they remained scarce. A rash of technical complexity combined with a lack of trust due to conflicting data sources and inconsistent definitions led to the same story that we've heard for decades. We spent a ton of time and money to create a single version of the truth. And we're further away than we've ever been before. Maybe as an industry, we should be approaching this problem differently. Perhaps it should start with the idea that we have to change the way we serve business users, i.e., those who understand data context. And with me to discuss the evolving data space, his company, and the upcoming Data Citizens Conference is Felix Van de Maele, the CEO and Founder of Collibra. Felix, welcome. Great to see you. >> Great to see you. Great to be here. >> So tell us a little bit about Collibra and the problem that you're solving. Maybe you could double click on my upfront narrative. >> Yeah, I think you said it really well. We've seen so much innovation over the last couple of years in data, the exploding volume and complexity of data. We've seen a lot of innovation in how to store and process that data, that volume of data, more effectively, more cost-effectively. But fundamentally, the source of the problem, being able to really derive insights from that data effectively, whether it's for an AI model or for reporting, is still as difficult as it was, let's say, 10 years ago. And it only... In a way it's only become more difficult. And so what we fundamentally believe is that next to that innovation on the infrastructure side of data, you really need to look at the people and process side of data.
There are so many more people that today consume and produce data to do their job. That's why we talk about data citizens. We have to make it easier for them to find the right data in a way that they can trust, so that there's confidence in that data to be able to make decisions and to be able to trust the algorithm of that model. And that's really what Collibra is focused on, initially around governance: how do you make sure people, or actually companies, know what data they have, and make sure they can trust it and they can use it in a compliant way. And now we've extended that into the only data intelligence platform today in the industry, where we just make it easier for organizations to truly unite around the data across the whole organization, wherever that data is stored, on premises or in the cloud, whoever is actually using or consuming that data. That's why we talk about data citizens. >> I think you're right. I think it is more complex. There's more of it. And there's more pressure on individuals to get advantage from it. But I want to ask you what sets Collibra apart, because I'd like you to explain why you're not just another data company chasing a problem with what's going to be an incremental solution that's really not going to change anything. What sets Collibra apart? >> Yeah, that's a really good question. And what fundamentally sets us apart, or makes us unique, is that we look at data, or the problem around data, as truly a business problem and a business function. So we fundamentally believe that if you believe that data is an asset, you really have to run it as a strategic business function. Just like you run your HR function, your people function, your IT function, your sales and marketing function. You have a system to run that function. You have Salesforce to run sales and marketing. You have ServiceNow to run your IT function. You have Workday to run your people function. Like, you need the same system to really run your data function.
And that's really how we think about Collibra. So we're not another kind of faster, better database. We're not another data management tool that makes the life of a single individual easier. We're truly a business application that focuses on how we bring people together in an effective way so that they can collaborate around the data. It creates efficiency, so you don't have to do things ad hoc. You can easily find the right information. You can collaborate effectively. And it creates the confidence to actually be able to do something with the outcomes, or with the results of all of that work. And so fundamentally looking at the problem as a business function that needs a business system, we call it the system of record or system of engagement for the data function, I think is absolutely critical and really unique in our approach. >> So Data Citizens, your big user conference, Data Citizens '21, is coming up June 16th and 17th. theCUBE's stoked because we love talking about data. This is the first time we're bringing theCUBE to that event. And so we're really gearing up for it. And I wonder if you can tell us a little bit about the history and the evolution of the Data Citizens conference? >> Absolutely. I think the first one started six years ago, where we had a small event at a hotel in downtown New York, mostly customers, as their user conference, a lot of the banks, which were at the time the main customers, at 60 people. So a very small event. And it's exploded ever since. This year, we expect over 5,000 people. So it's really expanded beyond just a user conference to really become more of almost a community conference and an industry conference. So we're really excited. A big part of why we care so much about the conference is that it's an opportunity to build that data citizens community.
That's what we hear from our customers, from all attendees that come to the conference: bringing those people together that all care about the same topic and are passionate about doing more with data, being able to connect people together, is a big part of that. So we've always... We're always looking forward to the event from that perspective. >> Well, a lot of competition, of course, for virtual events these days. What's in it for me? Who should attend? And what can attendees expect from Data Citizens '21? >> Yeah, absolutely. The good thing about the virtual event is that everybody can attend. It's free, it's open from across the world, of course. But what we want for people to take away as attendees is that you learn something pragmatic. So the next day on the job, you can do something. You've learned something very specific. We've also been excited and looked at what is possible from an innovation perspective. And so that's how we look at the event. We bring a lot of customers and organizations that are going to share their best practices. Very specifically, how they're handling data governance. How they're doing data cataloging. How they're doing data privacy. So very specific best practices and tips on how to be successful, but then also industry experts that can paint a picture of where we're going as an industry. What are the best practices? What do we need to think about today to be ready for what's going to come tomorrow? So that's a big focus. Of course, we're going to talk about Collibra and our product. What do we have in store from a product roadmap and innovation perspective? How are we helping these organizations get there faster? And on that aspect, we're bringing in a lot of partners as well. And so that's a big part of that broader ecosystem, which is really interesting. And finally, like I said, it's really around the community. That's what we hear continuously from the attendees.
Just being able to make these connections, learn new people, learn what they're doing, how they've kind of solved certain challenges. We hear that's a really big part of the value proposition. So as an attendee, the good thing is you can join from anywhere. All of the content is going to be available on demand. So later it's going to be available for you to have a look at as well. Plus you're going to be part, or you're going to become part, of that data citizens community, which is a really thriving and growing community where you're going to find a lot of like-minded people with the same passion, the same interest, that we can all learn a lot from. >> I rather like the term data citizen. I consider myself a data citizen, and it has implications just in terms of putting data in the hands of business users. So it's just sort of central to this event, obviously. What is a data citizen to Collibra? >> Yeah. It's a really core part of our mission and our vision that we believe that today everyone needs data to do their job. Everyone in that sense has become a data citizen, in the sense that they need to be able to easily access trustworthy data. We have to make it easy for people to easily find the right data that they can trust, that they can understand, and that they can do something with to make their job easier. On the other hand, like a citizen, you have rights and you have responsibilities. As a data citizen, you also have the responsibility to treat that data in the right way. To make sure from a privacy and security perspective that data is, again, like I said, treated in the right way. And so that combination of making it easy, making it accessible, democratizing it, but also making sure we treat data in the right way is really important. And it's a core part of what we believe, that everyone is going to become a data citizen. And so that's a big part of our mission. >> I like that. We're going to enter into a contract.
I'll do my part and you'll give me access to that data. I think that's a great philosophy. So the call to action here: June 16th and 17th, go register at citizens.collibra.com. Go register, because it's not just the normal mumbo jumbo. You're going to get some really interesting data. Felix, I'll give you the last word. >> No, like I said, like you said, go register. It's a great event. It's a great community to be part of. June 16th and 17th, you can block it in your calendar. So go to citizens.collibra.com. It's going to be a great event. >> Well, thanks for helping us preview this event. It's going to be a great event that we're really excited about. Felix, great to see you. And we'll see you on June 16th and 17th. >> Absolutely. >> All right. Thanks for watching everyone. This is Dave Vellante for theCUBE. We'll see you next time. (upbeat music)

Published Date : May 10 2021

SUMMARY :

and the upcoming Data Citizens Conference Great to be here. and the problem that you're solving. in that data to be able to make decisions it's really not going to change anything. And it creates the confidence to actually and the evolution of the a lot of the banks, And what can attendees expect and tips on how to be successful, What is a data citizen to Collibra? in the sense that they need to be able So the call to action here, It's a great community to be part of It's going to be a great event We'll see you next time.

ENTITIES

Entity	Category	Confidence
Felix Van De Maele	PERSON	0.99+
Felix Van de Maele	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Felix	PERSON	0.99+
June 16th	DATE	0.99+
citizens.collibra.com	OTHER	0.99+
17th	DATE	0.99+
tomorrow	DATE	0.99+
60 people	QUANTITY	0.99+
Collibra	ORGANIZATION	0.99+
this year	DATE	0.99+
six years ago	DATE	0.99+
Data Citizens	EVENT	0.98+
over 5,000 people	QUANTITY	0.98+
today	DATE	0.98+
first one	QUANTITY	0.98+
last decade	DATE	0.97+
first time	QUANTITY	0.97+
Data Citizens '21	EVENT	0.96+
New York	LOCATION	0.96+
10 years ago	DATE	0.92+
next day	DATE	0.9+
decades	QUANTITY	0.88+
single version	QUANTITY	0.87+
Data Citizens Conference	EVENT	0.86+
single individual	QUANTITY	0.69+
more people	QUANTITY	0.69+
years	DATE	0.61+
ton	QUANTITY	0.6+
double	QUANTITY	0.58+
last couple	DATE	0.52+
Salesforce	ORGANIZATION	0.52+

Analyst Predictions 2023: The Future of Data Management

(upbeat music) >> Hello, this is Dave Vellante with theCUBE, and one of the most gratifying aspects of my role as a host of "theCUBE TV" is I get to cover a wide range of topics. And quite often, we're able to bring to our program a level of expertise that allows us to more deeply explore and unpack some of the topics that we cover throughout the year. And one of our favorite topics, of course, is data. Now, in 2021, after being in isolation for the better part of two years, a group of industry analysts met up at AWS re:Invent and started a collaboration to look at the trends in data and predict what some likely outcomes will be for the coming year. And it resulted in a very popular session that we had last year focused on the future of data management. And I'm very excited and pleased to tell you that the 2023 edition of that predictions episode is back, and with me are five outstanding market analysts: Sanjeev Mohan of SanjMo, Tony Baer of dbInsight, Carl Olofson from IDC, Dave Menninger from Ventana Research, and Doug Henschen, VP and Principal Analyst at Constellation Research. Now, what is it that we're calling you, guys? A data pack like the rat pack? No, no, no, no, that's not it. It's the data crowd, the data crowd, and the crowd includes some of the best minds in the data analyst community. They'll discuss how data management is evolving and what listeners should prepare for in 2023. Guys, welcome back. Great to see you. >> Good to be here. >> Thank you. >> Thanks, Dave. (Tony and Dave faintly speak) >> All right, before we get into 2023 predictions, we thought it'd be good to do a look back at how we did in 2022 and give a transparent assessment of those predictions. So, let's get right into it. We're going to bring these up here, the predictions from 2022, they're color-coded red, yellow, and green to signify the degree of accuracy. And I'm pleased to report there's no red. Well, maybe some of you will want to debate that grading system.
But as always, we want to be open, so you can decide for yourselves. So, we're going to ask each analyst to review their 2022 prediction and explain their rating and what evidence they have that led them to their conclusion. So, Sanjeev, please kick it off. Your prediction was data governance becomes key. I know that's going to knock you guys over, but elaborate, because you had more detail when you double click on that. >> Yeah, absolutely. Thank you so much, Dave, for having us on the show today. And we self-graded ourselves. I could have very easily made my prediction from last year green, but I mentioned why I left it as yellow. I totally fully believe that data governance was in a renaissance in 2022. And why do I say that? You have to look no further than AWS launching its own data catalog called DataZone. Before that, mid-year, we saw Unity Catalog from Databricks went GA. So, overall, I saw there was tremendous movement. When you see these big players launching a new data catalog, you know that they want to be in this space. And this space is highly critical to everything that I feel we will talk about in today's call. Also, if you look at established players, I spoke at Collibra's conference, data.world, work closely with Alation, Informatica, a bunch of other companies, they all added tremendous new capabilities. So, it did become key. The reason I left it as yellow is because I had made a prediction that Collibra would go IPO, and it did not. And I don't think anyone is going IPO right now. The market is really, really down, the funding in VC IPO market. But other than that, data governance had a banner year in 2022. >> Yeah. Well, thank you for that. And of course, you saw data clean rooms being announced at AWS re:Invent, so more evidence. And I like the fact that you included in your predictions some things that were binary, so you dinged yourself there. So, good job. Okay, Tony Baer, you're up next. Data mesh hits reality check.
As you see here, you've given yourself a bright green thumbs up. (Tony laughing) Okay. Let's hear why you feel that was the case. What do you mean by reality check? >> Okay. Thanks, Dave, for having us back again. This is something I just wrote and just tried to get away from, and this just a topic just won't go away. I did speak with a number of folks, early adopters and non-adopters during the year. And I did find that basically that it pretty much validated what I was expecting, which was that there was a lot more, this has now become a front burner issue. And if I had any doubt in my mind, the evidence I would point to is what was originally intended to be a throwaway post on LinkedIn, which I just quickly scribbled down the night before leaving for re:Invent. I was packing at the time, and for some reason, I was doing Google search on data mesh. And I happened to have tripped across this ridiculous article, I will not say where, because it doesn't deserve any publicity, about the eight (Dave laughing) best data mesh software companies of 2022. (Tony laughing) One of my predictions was that you'd see data mesh washing. And I just quickly just hopped on that maybe three sentences and wrote it at about a couple minutes saying this is hogwash, essentially. (laughs) And that just reun... And then, I left for re:Invent. And the next night, when I got into my Vegas hotel room, I clicked on my computer. I saw a 15,000 hits on that post, which was the most hits of any single post I put all year. And the responses were wildly pro and con. So, it pretty much validates my expectation in that data mesh really did hit a lot more scrutiny over this past year. >> Yeah, thank you for that. I remember that article. I remember rolling my eyes when I saw it, and then I recently, (Tony laughing) I talked to Walmart and they actually invoked Martin Fowler and they said that they're working through their data mesh. 
So, it takes a really lot of thought, and it really, as we've talked about, is really as much an organizational construct. You're not buying data mesh >> Bingo. >> to your point. Okay. Thank you, Tony. Carl Olofson, here we go. You've graded yourself a yellow in the prediction of graph databases. Take off. Please elaborate. >> Yeah, sure. So, I realized in looking at the prediction that it seemed to imply that graph databases could be a major factor in the data world in 2022, which obviously didn't become the case. It was an error on my part in that I should have said it in the right context. It's really a three to five-year time period that graph databases will really become significant, because they still need accepted methodologies that can be applied in a business context as well as proper tools in order for people to be able to use them seriously. But I stand by the idea that it is taking off, because for one thing, Neo4j, which is the leading independent graph database provider, had a very good year. And also, we're seeing interesting developments in terms of things like AWS with Neptune and with Oracle providing graph support in Oracle database this past year. Those things are, as I said, growing gradually. There are other companies like TigerGraph and so forth, that deserve watching as well. But as far as becoming mainstream, it's going to be a few years before we get all the elements together to make that happen. Like any new technology, you have to create an environment in which ordinary people without a whole ton of technical training can actually apply the technology to solve business problems. >> Yeah, thank you for that. These specialized databases, graph databases, time series databases, you see them embedded into mainstream data platforms, but there's a place for these specialized databases, I would suspect we're going to see new types of databases emerge with all this cloud sprawl that we have and maybe to the edge. 
>> Well, part of it is that it's not as specialized as you might think it. You can apply graphs to great many workloads and use cases. It's just that people have yet to fully explore and discover what those are. >> Yeah. >> And so, it's going to be a process. (laughs) >> All right, Dave Menninger, streaming data permeates the landscape. You gave yourself a yellow. Why? >> Well, I couldn't think of an appropriate combination of yellow and green. Maybe I should have used chartreuse, (Dave laughing) but I was probably a little hard on myself making it yellow. This is another type of specialized data processing, like Carl was talking about with graph databases: stream processing. And nearly every data platform offers streaming capabilities now. Often, it's based on Kafka. If you look at Confluent, their revenues have grown at more than 50%, continue to grow at more than 50% a year. They're expected to do more than half a billion dollars in revenue this year. But the thing that hasn't happened yet, and to be honest, I didn't necessarily expect it to happen in one year, is that streaming hasn't become the default way in which we deal with data. It's still a sidecar to data at rest. And I do expect that we'll continue to see streaming become more and more mainstream. I do expect perhaps in the five-year timeframe that we will first deal with data as streaming and then at rest, but the worlds are starting to merge. And we even see some vendors bringing products to market, such as K2View, Hazelcast, and RisingWave Labs. So, in addition to all those core data platform vendors adding these capabilities, there are new vendors approaching this market as well. >> I like the tough grading system, and it's not trivial. And when you talk to practitioners doing this stuff, there's still some complications in the data pipeline. And so, but I think, you're right, it probably was a yellow plus. Doug Henschen, data lakehouses will emerge as dominant. 
When you talk to people about lakehouses, practitioners, they all use that term. They certainly use the term data lake, but now, they're using lakehouse more and more. What are your thoughts here? Why the green? What's your evidence there? >> Well, I think, I was accurate. I spoke about it specifically as something that vendors would be pursuing. And we saw yet more lakehouse advocacy in 2022. Google introduced its BigLake service alongside BigQuery. Salesforce introduced Genie, which is really a lakehouse architecture. And it was a safe prediction to say vendors are going to be pursuing this in that AWS, Cloudera, Databricks, Microsoft, Oracle, SAP, Salesforce now, IBM, all advocate this idea of a single platform for all of your data. Now, the trend was also supported in 2023, in that we saw a big embrace of Apache Iceberg in 2022. That's a structured table format. It's used with these lakehouse platforms. It's open, so it ensures portability and it also ensures performance. And that's a structured table that helps with the warehouse side performance. But among those announcements, Snowflake, Google, Cloudera, SAP, Salesforce, IBM, all embraced Iceberg. But keep in mind, again, I'm talking about this as something that vendors are pursuing as their approach. So, they're advocating end users. It's very cutting edge. I'd say the top, leading edge, 5% of companies have really embraced the lakehouse. I think, we're now seeing the fast followers, the next 20 to 25% of firms embracing this idea and embracing a lakehouse architecture. I recall Christian Kleinerman at the big Snowflake event last summer, making the announcement about Iceberg, and he asked for a show of hands for any of you in the audience at the keynote, have you heard of Iceberg? And just a smattering of hands went up. So, the vendors are ahead of the curve. They're pushing this trend, and we're now seeing a little bit more mainstream uptake. >> Good. Doug, I was there. 
It was you, me, and I think, two other hands were up. That was just humorous. (Doug laughing) All right, well, so I liked the fact that we had some yellow and some green. When you think about these things, there's the prediction itself. Did it come true or not? There are the sub predictions that you guys make, and of course, the degree of difficulty. So, thank you for that open assessment. All right, let's get into the 2023 predictions. Let's bring up the predictions. Sanjeev, you're going first. You've got a prediction around unified metadata. What's the prediction, please? >> So, my prediction is that metadata space is currently a mess. It needs to get unified. There are too many use cases of metadata, which are being addressed by disparate systems. For example, data quality has become really big in the last couple of years, data observability, the whole catalog space is actually, people don't like to use the word data catalog anymore, because data catalog sounds like it's a catalog, a museum, if you may, of metadata that you go and admire. So, what I'm saying is that in 2023, we will see that metadata will become the driving force behind things like data ops, things like orchestration of tasks using metadata, not rules. Not saying that if this fails, then do this, if this succeeds, go do that. But it's like getting to the metadata level, and then making a decision as to what to orchestrate, what to automate, how to do data quality check, data observability. So, this space is starting to gel, and I see there'll be more maturation in the metadata space. Even security privacy, some of these topics, which are handled separately. And I'm just talking about data security and data privacy. I'm not talking about infrastructure security. These also need to merge into a unified metadata management piece with some knowledge graph, semantic layer on top, so you can do analytics on it. So, it's no longer something that sits on the side, it's limited in its scope. 
It is actually the very engine, the very glue that is going to connect data producers and consumers. >> Great. Thank you for that. Doug. Doug Henschen, any thoughts on what Sanjeev just said? Do you agree? Do you disagree? >> Well, I agree with many aspects of what he says. I think, there's a huge opportunity for consolidation and streamlining of these as aspects of governance. Last year, Sanjeev, you said something like, we'll see more people using catalogs than BI. And I have to disagree. I don't think this is a category that's headed for mainstream adoption. It's a behind the scenes activity for the wonky few, or better yet, companies want machine learning and automation to take care of these messy details. We've seen these waves of management technologies, some of the latest data observability, customer data platform, but they failed to sweep away all the earlier investments in data quality and master data management. So, yes, I hope the latest tech offers, glimmers that there's going to be a better, cleaner way of addressing these things. But to my mind, the business leaders, including the CIO, only want to spend as much time and effort and money and resources on these sorts of things to avoid getting breached, ending up in headlines, getting fired or going to jail. So, vendors bring on the ML and AI smarts and the automation of these sorts of activities. >> So, if I may say something, the reason why we have this dichotomy between data catalog and the BI vendors is because data catalogs are very soon, not going to be standalone products, in my opinion. They're going to get embedded. So, when you use a BI tool, you'll actually use the catalog to find out what is it that you want to do, whether you are looking for data or you're looking for an existing dashboard. So, the catalog becomes embedded into the BI tool. >> Hey, Dave Menninger, sometimes you have some data in your back pocket. Do you have any stats (chuckles) on this topic? 
>> No, I'm glad you asked, because I'm going to... Now, data catalogs are something that's interesting. Sanjeev made a statement that data catalogs are falling out of favor. I don't care what you call them. They're valuable to organizations. Our research shows that organizations that have adequate data catalog technologies are three times more likely to express satisfaction with their analytics for just the reasons that Sanjeev was talking about. You can find what you want, you know you're getting the right information, you know whether or not it's trusted. So, those are good things. So, we expect to see the capabilities, whether it's embedded or separate. We expect to see those capabilities continue to permeate the market. >> And a lot of those catalogs are driven now by machine learning and things. So, they're learning from those patterns of usage by people when people use the data. (airy laughs) >> All right. Okay. Thank you, guys. All right. Let's move on to the next one. Tony Baer, let's bring up the predictions. You got something in here about the modern data stack. We need to rethink it. Is the modern data stack getting long in the tooth? Is it not so modern anymore? >> I think, in a way, it's gotten almost too modern. It's gotten too, I don't know if it's being long in the tooth, but it is getting long. The modern data stack, it's traditionally been defined as basically you have the data platform, which would be the operational database and the data warehouse. And in between, you have all the tools that are necessary to essentially get that data from the operational realm or the streaming realm for that matter into basically the data warehouse, or as we might be seeing more and more, the data lakehouse. And I think, what's important here is that, or I think, we have seen a lot of progress, and this would be in the cloud, is with the SaaS services. 
And especially you see that in the modern data stack, which is like all these players, not just the MongoDBs or the Oracles or the Amazons have their database platforms. You see the Informaticas, the Fivetrans, and all the other players there have their own SaaS services. And within those SaaS services, you get a certain degree of simplicity, which is it takes all the housekeeping off the shoulders of the customers. That's a good thing. The problem is that what we're getting to unfortunately is what I would call lots of islands of simplicity, which means that it leaves it (Dave laughing) to the customer to have to integrate or put all that stuff together. It's a complex tool chain. And so, what we really need to think about here, we have too many pieces. And going back to the discussion of catalogs, it's like we have so many catalogs out there, which one do we use? 'Cause chances are most organizations do not rely on a single catalog at this point. What I'm calling on all the data providers or all the SaaS service providers, is to literally get it together and essentially make this modern data stack less of a stack, make it more of a blending of an end-to-end solution. And that can come in a number of different ways. Part of it is that data platform providers have been adding services that are adjacent. And there's some very good examples of this. We've seen progress over the past year or so. For instance, MongoDB integrating search. It's a very common, I guess, sort of tool that basically, that the applications that are developed on MongoDB use, so MongoDB then built it into the database rather than requiring an extra Elasticsearch or OpenSearch stack. Amazon just... AWS just did the zero-ETL, which is a first step towards simplifying the process from going from Aurora to Redshift. You've seen same thing with Google, BigQuery integrating basically streaming pipelines. And you're seeing also a lot of movement in database machine learning. 
So, there's some good moves in this direction. I expect to see more of this this year. Part of it's from basically the SaaS platforms adding some functionality. But I also see more importantly, because you're never going to get... This is like asking your data team and your developers, herding cats to standardize on the same tool. In most organizations, that is not going to happen. So, take a look at the most popular combinations of tools and start to come up with some pre-built integrations and pre-built orchestrations, and offer some promotional pricing, maybe not quite two for one, but in other words, get two products for the price of two services or for the price of one and a half. I see a lot of potential for this. And it's to me, if the class was to simplify things, this is the next logical step and I expect to see more of this here. >> Yeah, and you see in Oracle MySQL HeatWave, yet another example of eliminating that ETL. Carl Olofson, today, if you think about the data stack and the application stack, they're largely separate. Do you have any thoughts on how that's going to play out? Does that play into this prediction? What do you think? >> Well, I think, that the... I really like Tony's phrase, islands of simplification. It really says (Tony chuckles) what's going on here, which is that all these different vendors you ask about, about how these stacks work. All these different vendors have their own stack vision. And you can... One application group is going to use one, and another application group is going to use another. And some people will say, let's go to, like you go to an Informatica conference and they say, we should be the center of your universe, but you can't connect everything in your universe to Informatica, so you need to use other things. So, the challenge is how do we make those things work together? As Tony has said, and I totally agree, we're never going to get to the point where people standardize on one organizing system. 
So, the alternative is to have metadata that can be shared amongst those systems and protocols that allow those systems to coordinate their operations. This is standard stuff. It's not easy. But the motive for the vendors is that they can become more active critical players in the enterprise. And of course, the motive for the customer is that things will run better and more completely. So, I've been looking at this in terms of two kinds of metadata. One is the meaning metadata, which says what data can be put together. The other is the operational metadata, which says basically where did it come from? Who created it? What's its current state? What's the security level? Et cetera, et cetera, et cetera. The good news is the operational stuff can actually be done automatically, whereas the meaning stuff requires some human intervention. And as we've already heard from, was it Doug, I think, people are disinclined to put a lot of definition into meaning metadata. So, that may be the harder one, but coordination is key. This problem has been with us forever, but with the addition of new data sources, with streaming data with data in different formats, the whole thing has, it's been like what a customer of mine used to say, "I understand your product can make my system run faster, but right now I just feel I'm putting my problems on roller skates. (chuckles) I don't need that to accelerate what's already not working." >> Excellent. Okay, Carl, let's stay with you. I remember in the early days of the big data movement, Hadoop movement, NoSQL was the big thing. And I remember Amr Awadallah said to us in theCUBE that SQL is the killer app for big data. So, your prediction here, if we bring that up is SQL is back. Please elaborate. >> Yeah. So, of course, some people would say, well, it never left. 
Actually, that's probably closer to true, but in the perception of the marketplace, there's been all this noise about alternative ways of storing, retrieving data, whether it's in key value stores or document databases and so forth. We're getting a lot of messaging that for a while had persuaded people that, oh, we're not going to do analytics in SQL anymore. We're going to use Spark for everything, except that only a handful of people know how to use Spark. Oh, well, that's a problem. Well, how about, and for ordinary conventional business analytics, Spark is like an over-engineered solution to the problem. SQL works just great. What's happened in the past couple years, and what's going to continue to happen is that SQL is insinuating itself into everything we're seeing. We're seeing all the major data lake providers offering SQL support, whether it's Databricks or... And of course, Snowflake is loving this, because that is what they do, and their success certainly points to the success of SQL, even MongoDB. And we were all, I think, at the MongoDB conference where on one day, we hear SQL is dead. They're not teaching SQL in schools anymore, and this kind of thing. And then, a couple days later at the same conference, they announced we're adding a new analytic capability based on SQL. But didn't you just say SQL is dead? So, the reality is that SQL is better understood than most other methods of certainly of retrieving and finding data in a data collection, no matter whether it happens to be relational or non-relational. And even in systems that are very non-relational, such as graph and document databases, their query languages are being built or extended to resemble SQL, because SQL is something people understand. >> Now, you remember when we were in high school and you had to take the... You're debating in the class and you were forced to take one side and defend it. 
So, I was at a Vertica conference one time up on stage with Curt Monash, and I had to take the NoSQL, the world is changing paradigm shift. And so just to be controversial, I said to him, Curt Monash, I said, who really needs ACID compliance anyway? Tony Baer. And so, (chuckles) of course, his head exploded, but what are your thoughts (guests laughing) on all this? >> Well, my first thought is congratulations, Dave, for surviving being up on stage with Curt Monash. >> Amen. (group laughing) >> I definitely would concur with Carl. We actually are definitely seeing a SQL renaissance and if there's any proof of the pudding here, I see lakehouse as being icing on the cake. As Doug had predicted last year, now, (clears throat) for the record, I think, Doug was about a year ahead of time in his predictions that this year is really the year that I see (clears throat) the lakehouse ecosystems really firming up. You saw the first shots last year. But anyway, on this, data lakes will not go away. I've actually, I'm on the home stretch of doing a market, a landscape on the lakehouse. And lakehouse will not replace data lakes in terms of that. There is the need for those, data scientists who do know Python, who know Spark, to go in there and basically do their thing without all the restrictions or the constraints of a pre-built, pre-designed table structure. I get that. Same thing for developing models. But on the other hand, there is huge need. Basically, (clears throat) maybe MongoDB was saying that we're not teaching SQL anymore. Well, maybe we have an oversupply of SQL developers. Well, I'm being facetious there, but there is a huge skills base in SQL. Analytics have been built on SQL. They came with lakehouse and why this really helps to fuel a SQL revival is that the core need in the data lake, what brought on the lakehouse was not so much SQL, it was a need for ACID. And what was the best way to do it? It was through a relational table structure. 
So, the whole idea of ACID in the lakehouse was not to turn it into a transaction database, but to make the data trusted, secure, and more granularly governed, where you could govern down to column and row level, which you really could not do in a data lake or a file system. So, while lakehouse can be queried in a manner, you can go in there with Python or whatever, it's built on a relational table structure. And so, for that end, for those types of data lakes, it becomes the end state. You cannot bypass that table structure as I learned the hard way during my research. So, the bottom line I'd say here is that lakehouse is proof that we're starting to see the revenge of the SQL nerds. (Dave chuckles) >> Excellent. Okay, let's bring back up the predictions. Dave Menninger, this one's really thought-provoking and interesting. We're hearing things like data as code, new data applications, machines actually generating plans with no human involvement. And your prediction is the definition of data is expanding. What do you mean by that? >> So, I think, for too long, we've thought about data as the, I would say facts that we collect the readings off of devices and things like that, but data on its own is really insufficient. Organizations need to manipulate that data and examine derivatives of the data to really understand what's happening in their organization, why has it happened, and to project what might happen in the future. And my comment is that these data derivatives need to be supported and managed just like the data needs to be managed. We can't treat this as entirely separate. Think about all the governance discussions we've had. Think about the metadata discussions we've had. If you separate these things, now you've got more moving parts. We're talking about simplicity and simplifying the stack. So, if these things are treated separately, it creates much more complexity. 
I also think it creates a little bit of a myopic view on the part of the IT organizations that are acquiring these technologies. They need to think more broadly. So, for instance, metrics. Metric stores are becoming much more common part of the tooling that's part of a data platform. Similarly, feature stores are gaining traction. So, those are designed to promote the reuse and consistency across the AI and ML initiatives. The elements that are used in developing an AI or ML model. And let me go back to metrics and just clarify what I mean by that. So, any type of formula involving the data points. I'm distinguishing metrics from features that are used in AI and ML models. And the data platforms themselves are increasingly managing the models as an element of data. So, just like figuring out how to calculate a metric. Well, if you're going to have the features associated with an AI and ML model, you probably need to be managing the model that's associated with those features. The other element where I see expansion is around external data. Organizations for decades have been focused on the data that they generate within their own organization. We see more and more of these platforms acquiring and publishing data to external third-party sources, whether they're within some sort of a partner ecosystem or whether it's a commercial distribution of that information. And our research shows that when organizations use external data, they derive even more benefits from the various analyses that they're conducting. And the last great frontier in my opinion on this expanding world of data is the world of driver-based planning. Very few of the major data platform providers provide these capabilities today. These are the types of things you would do in a spreadsheet. And we all know the issues associated with spreadsheets. They're hard to govern, they're error-prone. 
And so, if we can take that type of analysis, collecting the occupancy of a rental property, the projected rise in rental rates, the fluctuations perhaps in occupancy, the interest rates associated with financing that property, we can project forward. And that's a very common thing to do. What the income might look like from that property, the expenses, we can plan and purchase things appropriately. So, I think, we need this broader purview and I'm beginning to see some of those things happen. And the evidence today I would say, is more focused around the metric stores and the feature stores starting to see vendors offer those capabilities. And we're starting to see the ML ops elements of managing the AI and ML models find their way closer to the data platforms as well. >> Very interesting. When I hear metrics, I think of KPIs, I think of data apps, orchestrate people and places and things to optimize around a set of KPIs. It sounds like a metadata challenge more... Somebody once predicted they'll have more metadata than data. Carl, what are your thoughts on this prediction? >> Yeah, I think that what Dave is describing as data derivatives is in a way, another word for what I was calling operational metadata, which is not about the data itself, but how it's used, where it came from, what the rules are governing it, and that kind of thing. If you have a rich enough set of those things, then not only can you do a model of how well your vacation property rental may do in terms of income, but also how well your application that's measuring that is doing for you. In other words, how many times have I used it, how much data have I used and what is the relationship between the data that I've used and the benefits that I've derived from using it? Well, we don't have ways of doing that. What's interesting to me is that folks in the content world are way ahead of us here, because they have always tracked their content using these kinds of attributes. 
Where did it come from? When was it created, when was it modified? Who modified it? And so on and so forth. We need to do more of that with the structured data that we have, so that we can track how it's used. And also, it tells us how well we're doing with it. Is it really benefiting us? Are we being efficient? Are there improvements in processes that we need to consider? Because maybe data gets created and then it isn't used or it gets used, but it gets altered in some way that actually misleads people. (laughs) So, we need the mechanisms to be able to do that. So, I would say that that's... And I'd say that it's true that we need that stuff. I think, that starting to expand is probably the right way to put it. It's going to be expanding for some time. I think, we're still a distance from having all that stuff really working together. >> Maybe we should say it's gestating. (Dave and Carl laughing) >> Sorry, if I may- >> Sanjeev, yeah, I was going to say this... Sanjeev, please comment. This sounds to me like it supports Zhamak Dehghani's principles, but please. >> Absolutely. So, whether we call it data mesh or not, I'm not getting into that conversation, (Dave chuckles) but data (audio breaking) (Tony laughing) everything that I'm hearing what Dave is saying, Carl, this is the year when data products will start to take off. I'm not saying they'll become mainstream. They may take a couple of years to become so, but this is data products, all this thing about vacation rentals and how is it doing, that data is coming from different sources. I'm packaging it into our data product. And to Carl's point, there's a whole operational metadata associated with it. The idea is for organizations to see things like developer productivity, how many releases am I doing of this? What data products are most popular? 
I'm actually right now in the process of formulating this concept that just like we had data catalogs, we are very soon going to be requiring a data products catalog. So, I can discover these data products. I'm not just creating data products left, right, and center. I need to know, do they already exist? What is the usage? If no one is using a data product, maybe I want to retire and save cost. But this is a data product. Now, there's an associated thing that is also getting debated quite a bit called data contracts. And a data contract to me is literally just formalization of all these aspects of a product. How do you use it? What is the SLA on it, what is the quality that I am prescribing? So, data product, in my opinion, shifts the conversation to the consumers or to the business people. Up to this point when, Dave, you're talking about data and all of data discovery curation is very data producer-centric. So, I think, we'll see a shift more into the consumer space. >> Yeah. Dave, can I just jump in there just very quickly there, which is that what Sanjeev has been saying there, this is really central to what Zhamak has been talking about. It's basically about making, one, data products are about the lifecycle management of data. Metadata is just elemental to that. And essentially, one of the things that she calls for is making data products discoverable. That's exactly what Sanjeev was talking about. >> By the way, did everyone just notice how Sanjeev just snuck in another prediction there? So, we've got- >> Yeah. (group laughing) >> But you- >> Can we also say that he snuck in, I think, the term that we'll remember today, which is metadata museums. >> Yeah, but- >> Yeah. >> And also comment to, Tony, to your last year's prediction, you're really talking about it's not something that you're going to buy from a vendor. >> No. >> It's very specific >> Mm-hmm. >> to an organization, their own data product. So, touche on that one. Okay, last prediction. 
Let's bring them up. Doug Henschen, BI analytics is headed to embedding. What does that mean? >> Well, we all know that conventional BI dashboarding reporting is really commoditized from a vendor perspective. It never enjoyed truly mainstream adoption. Always that 25% of employees are really using these things. I'm seeing rising interest in embedding concise analytics at the point of decision or better still, using analytics as triggers for automation and workflows, and not even necessitating human interaction with visualizations, for example, if we have confidence in the analytics. So, leading companies are pushing for next generation applications, part of this low-code, no-code movement we've seen. And they want to build that decision support right into the app. So, the analytic is right there. Leading enterprise apps vendors, Salesforce, SAP, Microsoft, Oracle, they're all building smart apps with the analytics predictions, even recommendations built into these applications. And I think, the progressive BI analytics vendors are supporting this idea of driving insight to action, not necessarily necessitating humans interacting with it if there's confidence. So, we want prediction, we want embedding, we want automation. This low-code, no-code development movement is very important to bringing the analytics to where people are doing their work. We got to move beyond the, what I call swivel chair integration, between where people do their work and going off to separate reports and dashboards, and having to interpret and analyze before you can go back and take action. >> And Dave Menninger, today, if you want analytics or you want to absorb what's happening in the business, you typically got to go ask an expert, and then wait. So, what are your thoughts on Doug's prediction? >> I'm in total agreement with Doug. I'm going to say that collectively... So, how did we get here? I'm going to say collectively as an industry, we made a mistake. 
We made BI and analytics separate from the operational systems. Now, okay, it wasn't really a mistake. We were limited by the technology available at the time. Decades ago, we had to separate these two systems, so that the analytics didn't impact the operations. You don't want the operations preventing you from being able to do a transaction. But we've gone beyond that now. We can bring these two systems and worlds together and organizations recognize that need to change. As Doug said, the majority of the workforce and the majority of organizations doesn't have access to analytics. That's wrong. (chuckles) We've got to change that. And one of the ways that's going to change is with embedded analytics. 2/3 of organizations recognize that embedded analytics are important and it even ranks higher in importance than AI and ML in those organizations. So, it's interesting. This is a really important topic to the organizations that are consuming these technologies. The good news is it works. Organizations that have embraced embedded analytics are more comfortable with self-service than those that have not, as opposed to turning somebody loose, in the wild with the data. They're given a guided path to the data. And the research shows that 65% of organizations that have adopted embedded analytics are comfortable with self-service compared with just 40% of organizations that are turning people loose in an ad hoc way with the data. So, totally behind Doug's predictions. >> Can I just break in with something here, a comment on what Dave said about what Doug said, which (laughs) is that I totally agree with what you said about embedded analytics. And at IDC, we made a prediction in our future intelligence, future of intelligence service three years ago that this was going to happen. And the thing that we're waiting for is for developers to build... You have to write the applications to work that way. It just doesn't happen automagically. 
Developers have to write applications that reference analytic data and apply it while they're running. And that could involve simple things like complex queries against the live data, which is through something that I've been calling analytic transaction processing. Or it could be through something more sophisticated that involves AI operations as Doug has been suggesting, where the result is enacted pretty much automatically unless the scores are too low and you need to have a human being look at it. So, I think that that is definitely something we've been watching for. I'm not sure how soon it will come, because it seems to take a long time for people to change their thinking. But I think, as Dave was saying, once they do and they apply these principles in their application development, the rewards are great. >> Yeah, this is very much, I would say, very consistent with what we were talking about, I was talking about before, about basically rethinking the modern data stack and going into more of an end-to-end solution. I think, that what we're talking about clearly here is operational analytics. There'll still be a need for your data scientists to go offline just in their data lakes to do all that very exploratory and that deep modeling. But clearly, it just makes sense to bring operational analytics into where people work, into their workspace, and further flatten that modern data stack. >> But with all this metadata and all this intelligence, we're talking about injecting AI into applications, it does seem like we're entering a new era of not only data, but new era of apps. Today, most applications are about filling forms out or codifying processes and require a human input. 
And it seems like there's enough data now and enough intelligence in the system that the system can actually pull data from, whether it's the transaction system, e-commerce, the supply chain, ERP, and actually do something with that data without human involvement, present it to humans. Do you guys see this as a new frontier? >> I think, that's certainly- >> Very much so, but it's going to take a while, as Carl said. You have to design it, you have to get the prediction into the system, and the analytics at the point of decision have to be relevant to that decision point. >> And I also recall basically a lot of the ERP vendors back like 10 years ago, were promising that. And the fact that we're still looking at the promises shows just how difficult, how much of a challenge it is to get to what Doug's saying. >> One element that could be applied in this case is (indistinct) architecture. If applications are developed that are event-driven rather than following the script or sequence that some programmer or designer had preconceived, then you'll have much more flexible applications. You can inject decisions at various points using this technology much more easily. It's a completely different way of writing applications. And it actually involves a lot more data, which is why we should all like it. (laughs) But in the end (Tony laughing) it's more stable, it's easier to manage, easier to maintain, and it's actually more efficient, which is the result of an MIT study from about 10 years ago, and still, we are not seeing this come to fruition in most business applications. >> And do you think it's going to require a new type of data platform database? Today, data's all far-flung. We see that's all over the clouds and at the edge. Today, you cache- >> We need a super cloud. >> You cache that data, you're throwing into memory. I mentioned MySQL HeatWave. 
There are other examples where it's a brute force approach, but maybe we need new ways of laying data out on disk and new database architectures, and just when we thought we had it all figured out. >> Well, without referring to disk, which to my mind, is almost like talking about cave painting. I think, that (Dave laughing) all the things that have been mentioned by all of us today are elements of what I'm talking about. In other words, the whole improvement of the data mesh, the improvement of metadata across the board and improvement of the ability to track data and judge its freshness the way we judge the freshness of a melon or something like that, to determine whether we can still use it. Is it still good? That kind of thing. Bringing together data from multiple sources dynamically and in real time requires all the things we've been talking about. All the predictions that we've talked about today add up to elements that can make this happen. >> Well, guys, it's always tremendous to get these wonderful minds together and get your insights, and I love how it shapes the outcome here of the predictions, and let's see how we did. We're going to leave it there. I want to thank Sanjeev, Tony, Carl, David, and Doug. Really appreciate the collaboration and thought that you guys put into these sessions. Really, thank you. >> Thank you. >> Thanks, Dave. >> Thank you for having us. >> Thanks. >> Thank you. >> All right, this is Dave Vellante for theCUBE, signing off for now. Follow these guys on social media. Look for coverage on siliconangle.com, theCUBE.net. Thank you for watching. (upbeat music)

Published Date : Jan 11 2023


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Doug Henschen	PERSON	0.99+
Dave Menninger	PERSON	0.99+
Doug	PERSON	0.99+
Carl	PERSON	0.99+
Carl Olofson	PERSON	0.99+
Tony Baer	PERSON	0.99+
Tony	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Collibra	ORGANIZATION	0.99+
Curt Monash	PERSON	0.99+
Sanjeev Mohan	PERSON	0.99+
Christian Kleinerman	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Sanjeev	PERSON	0.99+
Constellation Research	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Ventana Research	ORGANIZATION	0.99+
2022	DATE	0.99+
Hazelcast	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
25%	QUANTITY	0.99+
2021	DATE	0.99+
last year	DATE	0.99+
65%	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
today	DATE	0.99+
five-year	QUANTITY	0.99+
TigerGraph	ORGANIZATION	0.99+
Databricks	ORGANIZATION	0.99+
two services	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
David	PERSON	0.99+
RisingWave Labs	ORGANIZATION	0.99+

Breaking Analysis: Grading our 2022 Enterprise Technology Predictions

>>From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR, this is Breaking Analysis with Dave Vellante. >>Making technology predictions in 2022 was tricky business, especially if you were projecting the performance of markets or identifying IPO prospects and making binary forecasts on data, AI, the macro spending climate, and other related topics in enterprise tech. 2022, of course, was characterized by a seesaw economy where central banks were restructuring their balance sheets. The war in Ukraine fueled inflation, supply chains were a mess, and the unintended consequences of the forced march to digital and the acceleration are still being sorted out. Hello and welcome to this week's Wikibon Cube Insights powered by ETR. In this Breaking Analysis, we continue our annual tradition of transparently grading last year's enterprise tech predictions. And you may or may not agree with our self-grading system, but look, we're gonna give you the data and you can draw your own conclusions and tell us what you think. >>All right, let's get right to it. So our first prediction was tech spending increases by 8% in 2022. And as we exited 2021, CIOs were optimistic about their digital transformation plans. You know, they rushed to make changes to their business and were eager to sharpen their focus and continue to iterate on their digital business models and plug the holes in the learnings that they had. And so we predicted that 8% rise in enterprise tech spending, which looked pretty good until Ukraine, and the Fed decided that, you know, it had to rush and make up for lost time. We kind of nailed the momentum in the energy sector, but we can't give ourselves too much credit for that layup. And as of October, Gartner had IT spending growing at just over 5%. I think it was 5.1%. So we're gonna take a C plus on this one and move on. >>Our next prediction was basically kind of a slow ground ball to second base, if I have to be honest, but we felt it was important to highlight that security would remain front and center as the number one priority for organizations in 2022.
As is our tradition, you know, we try to up the degree of difficulty by specifically identifying companies that are gonna benefit from these trends. So we highlighted some possible IPO candidates, which of course didn't pan out. Snyk was on our radar. The company had just had to do another raise and they recently took a valuation hit and it was a down round. They raised 196 million. So a good chunk of cash, but not the IPO that we had predicted. Aqua Security's focus on containers and cloud native, that was a trendy call, and we thought maybe an MSSP or multiple managed security service providers like Arctic Wolf would IPO, but no way that was happening in the crummy market. >>Nonetheless, we think these types of companies, they're still faring well as the talent shortage in security remains really acute, particularly in the sort of mid-size and small businesses that often don't have a SOC. Lacework laid off 20% of its workforce in 2022, and co-CEO Dave Hatfield left the company. So that IPO didn't happen. It was probably too early for Lacework anyway. Meanwhile, you got Netskope, which we've cited as strong in the ETR data, particularly in the Emerging Technology Survey. And then, you know, Illumio holding its own. You know, we never liked that 7 billion price tag that Okta paid for Auth0, but we loved the TAM expansion strategy to target developers beyond sort of Okta's enterprise strength. But we gotta take some points off for the failure thus far of Okta to really nail the integration and the go-to-market model with Auth0 and, you know, bring that into the core Okta.
>>So the focus on endpoint security, that was a winner in 2022, as CrowdStrike led that charge with others holding their own, not the least of which was Palo Alto Networks as it continued to expand beyond its core network security and firewall business, you know, through acquisition. So overall we're gonna give ourselves an A minus for this relatively easy call, but again, we had some specifics associated with it to make it a little tougher. And of course we're watching very closely this coming year, in 2023, the vendor consolidation trend. You know, according to a recent Palo Alto Networks survey with 1300 SecOps pros, on average organizations have more than 30 security tools to manage. So this is a logical way to optimize cost: consolidating vendors and consolidating redundant vendors. The ETR data shows that's clearly a trend that's on the upswing. >>Now moving on, a big theme of 2020 and 2021 of course was remote work and hybrid work and new ways to work and return to work. So we predicted in 2022 that hybrid work models would become the dominant protocol, which clearly is the case. We predicted that about 33% of the workforce would come back to the office in 2022. In September, the ETR data showed that figure was at 29%, but organizations expected that 32% would be in the office, you know, pretty much full-time by year end. That hasn't quite happened, but we were pretty close with the projection, so we're gonna take an A minus on this one. Now, supply chain disruption was another big theme that we felt would carry through 2022. And sure, that sounds like another easy one, but as is our tradition, again we try to put some binary metrics around our predictions to put some meat on the bone, so to speak, and allow us and you to say, okay, did it come true or not? >>So we had some data that we presented last year on supply chain issues impacting hardware spend.
We said at the time, you can see this on the left-hand side of this chart, the PC laptop demand would remain above pre-COVID levels, which would reverse a decade of year-on-year declines, which I think started in around 2011, 2012. Now, while demand is down this year pretty substantially relative to 2021, IDC has worldwide unit shipments for PCs at just over 300 million for '22. If you go back to 2019, you're looking at around, let's say, 260 million units shipped globally, you know, roughly. So, you know, pretty good call there. Definitely much higher than pre-COVID levels. But so, you might be asking, why the B? Well, we projected that 30% of customers would replace security appliances with cloud-based services and that more than a third would replace their internal data center server and storage hardware with cloud services, like 30 and 40% respectively. >>And we don't have explicit survey data on exactly these metrics, but anecdotally we see this happening in earnest. And we do have some data that we're showing here on cloud adoption from ETR's October survey, where the midpoint of workloads running in the cloud is around 34% and forecast, as you can see, to grow steadily over the next three years. So, well look, we understand it's not a one-to-one correlation with our prediction, but it's a pretty good bet that we were right. But we gotta take some points off, we think, for the lack of unequivocal proof. 'Cause again, we always strive to make our predictions in ways that can be measured as accurate or not. Is it binary? Did it happen, did it not? Kind of like an OKR. And you know, we strive to provide data as proof, and in this case it's a bit fuzzy. >>We have to admit that, although we're pretty comfortable that the prediction was accurate. And look, when you make a hard forecast, sometimes you gotta pay the price.
All right, next, we said in 2022 that the big four cloud players would generate 167 billion in IaaS and PaaS revenue, combining for 38% market growth. And our current forecasts are shown here with a comparison to our January 2022 figures. So coming into this year, now where we are today, so currently we expect 162 billion in total revenue and a 33% growth rate. Still very healthy, but not on our mark. So we think AWS is gonna miss our predictions by about a billion dollars, you know, not bad for an 80 billion company. So they're not gonna hit that expectation, though, of getting really close to a hundred billion run rate. We thought they'd exit the year, you know, closer to 25 billion a quarter, and we don't think they're gonna get there. >>Look, we pretty much nailed Azure, and even though our prediction was correct about Google Cloud Platform surpassing Alibaba, we way overestimated the performance of both of those companies. So we're gonna give ourselves a C plus here, and yeah, you might think it's a little bit harsh, we could argue for a B minus to the professor, but the misses on GCP and Alibaba, we think, warrant a self-penalty on this one. All right, let's move on to our prediction about Supercloud. We said it becomes a thing in 2022, and we think by many accounts it has. Despite the naysayers, we're seeing clear evidence that the concept of a layer of value-add that sits above and across clouds is taking shape. And on this slide we showed just some of the pickup in the industry. I mean, one of the most interesting is Cloudflare. >>The biggest supercloud antagonist, Charles Fitzgerald, even predicted that no vendor would ever use the term in their marketing. And that would be proof, if that happened, that Supercloud was a thing, and he said it would never happen. Well, Cloudflare has, and they launched their version of Supercloud at their Developer Week.
Chris Miller of The Register put out a Supercloud block diagram, something else that Charles Fitzgerald was pushing us for, which is rightly so, it was a good call on his part. And Chris Miller actually came up with one that's pretty good. David Linthicum also has produced a block diagram, kind of similar. David uses the term metacloud and he uses the term supercloud kind of interchangeably to describe that trend. And so we're aligned on that front. Brian Gracely has covered the concept on the popular Cloudcast podcast. Berkeley launched the Sky Computing initiative. >>You read through that white paper and many of the concepts highlighted in the Supercloud 3.0 community-developed definition align with that. Walmart launched a platform with many of the supercloud salient attributes. So did Goldman Sachs, so did Capital One, so did Nasdaq. So you know, sorry, you can hate the term, but very clearly the evidence is gathering for the supercloud storm. We're gonna take an A plus on this one. Sorry, haters. Alright, let's talk about data mesh. In our '21 predictions post, we said that in the 2020s, 75% of large organizations are gonna re-architect their big data platforms. So kind of a decade-long prediction. We don't like to do that always, but sometimes it's warranted. And because it was a longer-term prediction, at the time, coming into '22 when we were evaluating our '21 predictions, we took a grade of incomplete because of the sort of decade-long, or better part of the decade, prediction. >>So last year, earlier this year, we said our number seven prediction was data mesh gains momentum in '22, but it's largely confined to narrow data problems with limited scope, as you can see here with some of the key bullets.
So there's a lot of discussion in the data community about data mesh, and while there are an increasing number of examples, JPMorgan Chase, Intuit, HSBC, HelloFresh, and others that are completely rearchitecting parts of their data platform, completely rearchitecting entire data platforms is non-trivial. There are organizational challenges, there are data ownership debates, technical considerations, and in particular two of the four fundamental data mesh principles, the need for a self-service infrastructure and federated computational governance, are challenging. Look, democratizing data and facilitating data sharing creates conflicts with regulatory requirements around data privacy. As such, many organizations are being really selective with their data mesh implementations, and hence our prediction of narrowing the scope of data mesh initiatives. >>I think that was right on. JPMC is a good example of this, where you got a single group within a division narrowly implementing the data mesh architecture. They're using AWS, they're using data lakes, they're using Amazon Glue, creating a catalog and a variety of other techniques to meet their objectives. They're kind of automating data quality, and it was a pretty well thought out and interesting approach. And I think it's gonna be made easier by some of the announcements that Amazon made at the recent, you know, re:Invent, particularly trying to eliminate ETL, better connections between Aurora and Redshift, and better data sharing, the data clean room. So a lot of that is gonna help. Of course, Snowflake has been on this for a while now. Many other companies are facing, you know, limitations, as we said here in this slide, with their Hadoop data platforms. They need to do some new thinking around that to scale. HelloFresh is a really good example of this.
Look, the bottom line is that organizations want to get more value from data, and having centralized, highly specialized teams that own the data problem, it's been a barrier and a blocker to success. The data mesh starts with organizational considerations, as described in great detail by Ash Naseer of Warner Brothers. So take a listen to this clip. >>Yeah, so when people think of Warner Brothers, you always think of like the movie studio, but we're more than that, right? I mean, you think of HBO, you think of TNT, you think of CNN. We have 30-plus brands in our portfolio and each have their own needs. So the idea of a data mesh really helps us, because what we can do is we can federate access across the company so that, you know, CNN can work at their own pace. You know, when there's election season, they can ingest their own data and they don't have to, you know, bump up against, as an example, HBO if Game of Thrones is going on. >>So it's often the case that data mesh is in the eyes of the implementer. And while a company's implementation may not strictly adhere to Zhamak Dehghani's vision of data mesh, that's okay; the goal is to use data more effectively. And despite Gartner's attempts to deposition data mesh in favor of the somewhat confusing, or frankly far more confusing, data fabric concept that they stole from NetApp, data mesh is taking hold in organizations globally today. So we're gonna take a B on this one. The prediction is shaping up the way we envisioned, but as we previously reported, it's gonna take some time. The better part of a decade, in our view. New standards have to emerge to make this vision become reality, and they'll come in the form of both open and de facto approaches. Okay, our eighth prediction last year focused on the face-off between Snowflake and Databricks.
>>And we realize this is a popular topic, and maybe one that's getting a little overplayed, but these are two companies that initially, you know, looked like they were shaping up as partners, and by the way, they are still partnering in the field. But you go back a couple years ago, the idea of using AWS infrastructure, Databricks machine intelligence, and applying that on top of Snowflake as a facile data warehouse, still very viable. But both of these companies, they have much larger ambitions. They got big total available markets to chase and large valuations that they have to justify. So what's happening is, as we've previously reported, each of these companies is moving toward the other firm's core domain and they're building out an ecosystem that'll be critical for their future. So as part of that effort, we said each is gonna become an aggressive investor and maybe start doing some M&A, and they have, in various companies. >>And on this chart that we produced last year, we studied some of the companies that were targets, and we've added some recent investments of both Snowflake and Databricks. As you can see, they've both, for example, invested in Alation. Snowflake's put money into Lacework, the security firm, ThoughtSpot, which is trying to democratize data with AI. Collibra is a governance platform, and you can see Databricks investments in data transformation with dbt Labs, Matillion doing simplified business intelligence, Hunters. So that's, you know, their security investment, and so forth. So other than our thought that we'd see a Databricks IPO last year, this prediction's been pretty spot on. So we'll give ourselves an A on that one. Now, observability has been a hot topic and we've been covering it for a while with our friends at ETR, particularly Erik Bradley. Our number nine prediction last year was basically that if you're not cloud native in observability, you are gonna be in big trouble. >>So everything, guys, gotta go cloud native.
And that's clearly been the case. Splunk, the big player in the space, has been transitioning to the cloud. It hasn't always been pretty, as we reported. Datadog, real momentum. The ELK stack, that's the open source model. You got new entrants that we've cited before, like Observe, Honeycomb, ChaosSearch, and others that we've reported on; they're all born in the cloud. So we're gonna take another A on this one. Admittedly, yeah, it's a reasonably easy call, but you gotta have a few of those in the mix. Okay, our last prediction, our number 10, was around events. Something theCUBE knows a little bit about. We said that a new category of events would emerge as hybrid, and that for the most part has happened. So that's gonna be the mainstay, is what we said. That pure-play virtual events are gonna give way to hybrid. >>And the narrative is that virtual-only events are, you know, good for quick hits, but lousy replacements for in-person events. And you know, that said, organizations of all shapes and sizes, they learned how to create better virtual content and support remote audiences during the pandemic. So when we said that pure play is gonna give way to hybrid, we implied, or specified, that the physical event, that VIP experience, is going to define that overall experience, and those VIP events would create a little FOMO, fear of missing out, and a virtual component would overlay that and serve an audience 10x the size of the physical. We saw two really good examples. Red Hat Summit in Boston, a small event, a couple thousand people, served tens of thousands, you know, online. Second was Google Cloud Next, a VIP event in New York City. >>Everything else was virtual. You know, even examples of our prediction of metaverse-like immersion have popped up, and you know, other companies are doing roadshows, as we predicted. Like, a lot of companies are doing it.
You're seeing that as a major trend, where organizations are going with their sales teams out into the regions and doing a little belly-to-belly action as opposed to the big giant event. That's definitely a trend that we're seeing. So in reviewing this prediction, the grade we gave ourselves is, you know, maybe a bit unfair. You could argue for a higher grade, but the organizations still haven't figured it out. They have hybrid experiences, but they generally do a really poor job of leveraging the afterglow of an event. It still tends to be one and done, let's move on to the next event or the next city. >>Let the sales team pick up the pieces, if they were paying attention. So because of that, we're only taking a B plus on this one. Okay, so that's the review of last year's predictions. You know, overall, if you average out our grade on the 10 predictions, they come out to a B plus. I dunno why we can't seem to get that elusive A, but we're gonna keep trying. Our friends at ETR and we are starting to look at the data for 2023 from the surveys and all the work that we've done on theCUBE and our analysis, and we're gonna put together our predictions. We've had literally hundreds of inbounds from PR pros pitching us. We've got this huge thick folder that we've started to review with our yellow highlighter. And our plan is to review it this month, take a look at all the data, get some ideas from the inbounds, and then the ETR January survey's in the field.
I've got a really heartfelt thank you to Kristen Martin and Cheryl Knight and their team. They help get the word out on social and in our newsletters. Rob Hof is our editor in chief over at SiliconANGLE, who does some great editing for us. Thank you all. Remember, all these episodes are available as podcasts. Wherever you listen, all you do is search Breaking Analysis podcast. We're really getting some great traction there. Appreciate you guys subscribing. I publish each week on wikibon.com and siliconangle.com, or you can email me directly at david.vellante@siliconangle.com or DM me @dvellante, or you can comment on my LinkedIn posts. And please check out etr.ai for the very best survey data in the enterprise tech business. Some awesome stuff in there. This is Dave Vellante for the Cube Insights, powered by ETR. Thanks for watching, and we'll see you next time on Breaking Analysis.

Published Date : Dec 18 2022


ENTITIES

Entity	Category	Confidence
Alex Myerson	PERSON	0.99+
Cheryl Knight	PERSON	0.99+
Ken Schiffman	PERSON	0.99+
Chris Miller	PERSON	0.99+
CNN	ORGANIZATION	0.99+
Rob Hof	PERSON	0.99+
Alibaba	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
5.1%	QUANTITY	0.99+
2022	DATE	0.99+
Charles Fitzgerald	PERSON	0.99+
Dave Hatfield	PERSON	0.99+
Brian Gracely	PERSON	0.99+
2019	DATE	0.99+
Lacework	ORGANIZATION	0.99+
two	QUANTITY	0.99+
GCP	ORGANIZATION	0.99+
33%	QUANTITY	0.99+
Walmart	ORGANIZATION	0.99+
David	PERSON	0.99+
2021	DATE	0.99+
20%	QUANTITY	0.99+
Kristen Martin	PERSON	0.99+
Palo Alto	LOCATION	0.99+
2020	DATE	0.99+
Ash Nair	PERSON	0.99+
Goldman Sachs	ORGANIZATION	0.99+
162 billion	QUANTITY	0.99+
New York City	LOCATION	0.99+
Databricks	ORGANIZATION	0.99+
October	DATE	0.99+
last year	DATE	0.99+
Arctic Wolf	ORGANIZATION	0.99+
two companies	QUANTITY	0.99+
38%	QUANTITY	0.99+
September	DATE	0.99+
Fed	ORGANIZATION	0.99+
JP Morgan Chase	ORGANIZATION	0.99+
80 billion	QUANTITY	0.99+
29%	QUANTITY	0.99+
32%	QUANTITY	0.99+
21 predictions	QUANTITY	0.99+
30%	QUANTITY	0.99+
HBO	ORGANIZATION	0.99+
75%	QUANTITY	0.99+
Game of Thrones	TITLE	0.99+
January	DATE	0.99+
2023	DATE	0.99+
10 predictions	QUANTITY	0.99+
both	QUANTITY	0.99+
22	QUANTITY	0.99+
ThoughtSpot	ORGANIZATION	0.99+
196 million	QUANTITY	0.99+
30	QUANTITY	0.99+
each	QUANTITY	0.99+
last year	DATE	0.99+
Palo Alto Networks	ORGANIZATION	0.99+
2020s	DATE	0.99+
167 billion	QUANTITY	0.99+
Okta	ORGANIZATION	0.99+
Second	QUANTITY	0.99+
Gartner	ORGANIZATION	0.99+
Eric Bradley	PERSON	0.99+
Aqua Securities	ORGANIZATION	0.99+
Dante	PERSON	0.99+
8%	QUANTITY	0.99+
Warner Brothers	ORGANIZATION	0.99+
Intuit	ORGANIZATION	0.99+
Cube Studios	ORGANIZATION	0.99+
each week	QUANTITY	0.99+
7 billion	QUANTITY	0.99+
40%	QUANTITY	0.99+
Snowflake	ORGANIZATION	0.99+

Stijn Christiaens | Data Citizens 22

>>Hey everyone. I'm Lisa Martin covering Data Citizens 22, brought to you by Collibra. This next conversation is gonna focus on the importance of data culture. One of our Cube alumni is back: Stijn Christiaens is Collibra's co-founder and its Chief Data Citizen. Stijn, it's great to have you back on the Cube. >>Hey, Lisa, nice to be here. >>So we're gonna be talking about the importance of data culture, data intelligence, maturity, all those great things. When we think about the data revolution that every business is going through, you know, it's so much more than technology innovation; it also really requires cultural transformation, community transformation. Those are challenging for customers to undertake. Talk to us about what you mean by data citizenship and the role that creating a data culture plays in that journey. >>Right. So as you know, our event is called Data Citizens because we believe that in the end, a data citizen is anyone who uses data to do their job. And we believe that in today's organizations, you have a lot of people, most of the employees in an organization, who are somehow going to be a data citizen, right? So you need to make sure that these people are aware of it. You need to make sure that these people have the skills and competencies to do with data what is necessary. And that's on all levels, right? So what does it mean to have a good data culture? It means that if you're building a beautiful dashboard to try and convince your boss that we need to make this decision, your boss is also open to and able to interpret, you know, the data presented in that dashboard to actually make that decision and take that action, right? >>And once you have that why through the organization, that's when you have a good data culture. Now, that's a continuous effort for most organizations because they're always moving; somehow they're hiring new people.
And it has to be a continuous effort because we've seen that on the one hand, organizations continue to be challenged with controlling their data sources and where all the data is flowing, right? Which in itself creates a lot of risk. But on the other hand of the equation, you have the benefits. You know, you might look at regulatory drivers like, we have to do this, right? But it's much better right now to consider the competitive drivers, for example. And we did an IDC study earlier this year, quite interesting. I can recommend anyone to read it. And one of the conclusions they found, as they surveyed over a thousand people across organizations worldwide, is that the ones who are higher in maturity, >>so the organizations that really look at data as an asset, look at data as a product and actively try to be better at it, do have three times as good a business outcome as the ones who are lower on the maturity scale, right? So you can say, okay, I'm doing this, you know, data culture for everyone, waking them up as data citizens. I'm doing this for competitive reasons, I'm doing this for regulatory reasons. You're trying to bring both of those together, and the ones that get data intelligence right are just going to be more successful and more competitive. That's our view, and that's what we're seeing out there in the market. >>Absolutely. We know that just generally, Stijn, right, the organizations that are really creating a data culture and enabling everybody within the organization to become data citizens are, we know that in theory, more competitive and more successful. But the IDC study that you just mentioned demonstrates they're three times more successful and competitive than their peers. Talk about how Collibra advises customers to create that community, that culture of data, when it might be challenging for an organization to adapt culturally.
>>Of course, of course it's difficult for an organization to adapt, but it's also necessary. As you just said, imagine that, you know, you're a modern day organization with phones, laptops, what have you, and you're not using those IT assets, right? Or, you know, you're delivering them throughout the organization, but not enabling your colleagues to actually do something with that asset. The same thing is true with data today, right? If you are not properly using the data assets and your competitors are, they're going to get more advantage. So as to how you get this done or how you establish this culture, there's a few angles to look at, I would say, Lisa. So one angle is obviously the leadership angle, whereby whoever is the boss of data in the organization, and you typically have multiple bosses there, like chief data officers. Sometimes there's multiple, but they may have a different title, right? >>So I'm just gonna summarize it as a data leader for a second. So whoever that is, they need to make sure that there's a clear vision, a clear strategy for data. And that strategy needs to include the monetization aspect. How are you going to get value from data? Yes. Now that's one part, because then you can clearly see the example of your leadership in the organization and also the business value. And that's important because those people, their job in essence really is to make everyone in the organization think about data as an asset. And I think that's the second part of the equation of getting that culture right: it's not enough to just have that leadership out there, but you also have to get the hearts and minds of the data champions across the organization. You really have to win them over.
And if you have those two combined, and obviously a good technology to, you know, connect those people and have them execute on their responsibilities, such as a data intelligence platform like Collibra, then you have the pieces in place to really start upgrading that culture inch by inch, if you will. >>Yes, I like that. The recipe for success. So you are the co-founder of Collibra. You've worn many different hats along this journey. Now you're building Collibra's own data office. I like how before we went live, we were talking about how Collibra is drinking its own champagne. I always love to hear stories about that. You're speaking at Data Citizens 2022. Talk to us about how you are building a data culture within Collibra and what maybe some of the specific projects are that Collibra's data office is working on. >>Yes, and it is indeed Data Citizens. There are a ton of speakers here, very excited. You know, we have Barb from MIT speaking about data monetization. We have dig pat at the last minute on the agenda. So really exciting agenda. Can't wait to get back out there. But essentially you're right. So over the years at Collibra, we've been doing this now since 2008, so a good 15 years. And I think we have another decade of work ahead in the market, just to be very clear. Data is here to stick around, as are we. And myself, you know, when you start a company, we were four people in a garage, if you will. So everybody's wearing all sorts of hats at that time. But over the years I've run, you know, pre-sales at Collibra, I've run post-sales, partnerships, product, et cetera. And as our company got a little bit biggish, we're now 1,200, something like that, people in the company, I believe systems and processes become a lot more important, right? >>So we said, you know, Collibra isn't the size of our customers yet, but we're getting there in terms of organization, structure, process, systems, et cetera.
So we said, it's really time for us to put our money where our mouth is and to set up our own data office, which is what we were seeing all of our customers are doing, and which is what we're seeing that organizations worldwide are doing. And Gartner was predicting this as well. They said, okay, organizations have an HR unit, they have a finance unit, and over time they'll all have a department, if you will, that is responsible somehow for the data. So we said, okay, let's try to set an example at Collibra. Let's try to set up our own data office in such a way that other people can take away from it, right? >>So we set up a data strategy, we started building data products, took care of the data infrastructure, that sort of good stuff. And in doing all of that, Lisa, exactly as you said, we said, okay, we need to also use our own product and our own practices, right? And from that use, learn how we can make the product better, learn how we can make the practice better, and share that learning with all of the market, of course. And on the Monday mornings, we sometimes refer to that as eating our own dog food; on Friday evenings we refer to that as drinking our own champagne. I like it. So we had the driver to do this, you know, there's a clear business reason. So we included that in the data strategy, and that's a little bit of our origin. >>Now how do we organize this? We have three pillars, and by no means is this a template that everyone should follow. This is just the organization that works at our company, but it can serve as an inspiration. So we have a pillar which is data science: the data product builders, if you will, or the people who help the business build data products. We have the data engineers, who help keep the lights on for that data platform to make sure the data products can run, the data can flow and, you know, the quality can be checked.
And then we have a data intelligence or data governance pillar, where we have those data governance, data intelligence stakeholders who help the business as a sort of data partner to the business stakeholders. So that's how we've organized it. And then we started following the Collibra approach, which is, well, what are the challenges that our business stakeholders have in HR, finance, sales, marketing, all over? >>And how can data help overcome those challenges? And from those use cases, we then just started to build a roadmap and started execution on use case after use case. And a few important ones there are very simple; we see them with all our customers as well. People love talking about the catalog, right? The catalog for the data scientists to know what's in their data lake, for example, and for the people in legal and privacy, so they have their process registry and they can see how the data flows. So that's a popular starting place. And that turns into a marketplace, so that if new analysts and data citizens join Collibra, they immediately have a place to go to, to look and see, okay, what data is out there for me as an analyst or a data scientist or whatever to do my job, right? >>So they can immediately get access to the data. And another one that we did is around trusted business reporting. We're seeing that since 2008, you know, self-service BI allowed everyone to make beautiful dashboards, you know, pie charts. My pet peeve is the pie charts, because I love pie and you shouldn't always be using pie charts. But essentially there's been a proliferation of those reports. And now executives don't really know, okay, should I trust this report or that report? They're reporting on the same thing, but the numbers seem different, right? So that's why we have trusted business reporting.
So we know that if a report, a dashboard, a data product essentially is built, all the right steps are being followed and that whoever is consuming that can be quite confident in the result either right, in that silver or browser Absolutely key. Exactly. Yes. Absolutely. >>Talk a little bit about some of the key performance indicators that you're using to measure the success of the data office. What are some of those KPIs? >>KPIs and measuring is a big topic in the chief data officer profession, I would say, and again, it always varies with respect to your organization, but there's a few that we use that might be of interest to you. So remember we have those three pillars, right? And we have metrics across those pillars. So for example, a pillar on the data engineering side is gonna be more related to that uptime, right? Is the data platform up and running? Are the data products up and running? Is the quality in them good enough? Is it going up? Is it going down? What's the usage? But also, and especially if you're in the cloud and if consumption is a big thing, you have metrics around cost, for example, right? So that's one set of examples. Another one is around the data science and the products. >>Are people using them? Are they getting value from it? Can we calculate that value in a monetary perspective, right? So that we can, to the rest of the business, continue to say we're tracking on those numbers. And those numbers indicate that value is generated and how much value, estimated in that region. And then you have some data intelligence, data governance metrics, which is, for example, you have a number of domains in a data mesh. People talk about being the owner of a data domain, for example, like product or customer. So how many of those domains do you have covered? How many of them are already part of the program? How many of them have owners assigned? How well are these owners organized, executing on their responsibilities?
How many tickets are open, closed? How many data products are built according to process? And so on and so forth. So these are a set of examples of KPIs. There's a lot more, but hopefully those can already inspire the audience. >>Absolutely. So we've talked about the rise of chief data offices; it's only accelerating. You mentioned this is like a 10 year journey. So if you were to look into a crystal ball, what do you see in terms of the maturation of data offices over the next decade? >>So we've seen indeed the role sort of grow up. I think in 2010 there may have been like 10 chief data officers or something. Gartner has exact numbers on them, but then they grew, you know, to 400; they were mostly in financial services, but they expanded then to all industries and then to all of the season. The number is estimated to be about 20,000 right now. Wow. And they evolved in a sort of stack of competencies: defensive data strategy, because the first chief data officers were more regulatory driven, offensive data strategy, support for the digital program, and now all about data products, right? So as a data leader, you now need all of those competences and need to include them in your strategy. >>How is that going to evolve for the next couple of years? I wish I had one of those crystal balls, right? But essentially I think for the next couple of years there's gonna be a lot of people, you know, still moving along with those four levels of the stack. A lot of people I see are still in version one and version two of the chief data officer. So you'll see over the years that's going to evolve to more digital and more data products. So for the next three, five years, my prediction is it's all going to be about data products, because it's an immediate link between the data and the dollar essentially, right? So that's gonna be important, and quite likely some new things will be added on, which nobody can predict yet.
But we'll see those pop up in a few years. >>I think there's gonna be a continued challenge for the chief data officer role to become a real executive role as opposed to, you know, somebody who claims that they're executive, but then they're not, right? So the real reporting level into the board, into the CEO for example, will continue to be a challenging point. But the ones who do get that done will be the ones that are successful. Yeah. And the ones who get that done will be the ones that do it on the basis of data monetization, right? Connecting value to the data and making that very clear to all the data citizens in the organization, right? Really and in that sense, value chain, they'll need to have both, you know, technical audiences and non-technical audiences aligned, of course. And they'll need to focus on adoption. Again, it's not enough to just have your data office be involved in this. It's really important that you're waking up data citizens across the organization and you make everyone in the organization think about data as an asset. >>Absolutely. Because there's so much value that can be extracted if organizations really strategically build that data office and democratize access across all those data citizens. Stijn, this is an exciting arena. We're definitely gonna keep our eyes on this. Sounds like a lot of evolution and maturation coming from the data office perspective and from the data citizen perspective. And as the data show that you mentioned in that IDC study, and you mentioned Gartner as well, organizations have so much more likelihood of being successful and being competitive. So we're gonna watch this space. Stijn, thank you so much for joining me on the Cube at Data Citizens 22. We appreciate it. >>Thanks for having me over. >>From Data Citizens 22, I'm Lisa Martin. You're watching the Cube, the leader in live tech coverage.

Published Date : Nov 1 2022


ENTITIES

Entity	Category	Confidence
Lisa	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Collibra	ORGANIZATION	0.99+
Barb	PERSON	0.99+
2010	DATE	0.99+
Stijn Christiaens	PERSON	0.99+
10 year	QUANTITY	0.99+
Stan	PERSON	0.99+
Stan Christians	PERSON	0.99+
one part	QUANTITY	0.99+
Gartner	ORGANIZATION	0.99+
one angle	QUANTITY	0.99+
2008	DATE	0.99+
1,200	QUANTITY	0.99+
15 years	QUANTITY	0.99+
400	QUANTITY	0.99+
10 chief data officers	QUANTITY	0.99+
two	QUANTITY	0.99+
five years	QUANTITY	0.99+
MIT	ORGANIZATION	0.99+
The Cube	TITLE	0.99+
both	QUANTITY	0.99+
IDC	ORGANIZATION	0.98+
over a thousand people	QUANTITY	0.98+
three pillars	QUANTITY	0.98+
three times	QUANTITY	0.98+
one	QUANTITY	0.98+
One	QUANTITY	0.98+
today	DATE	0.98+
about 20,000	QUANTITY	0.98+
second part	QUANTITY	0.97+
cbra	ORGANIZATION	0.96+
Colibra	ORGANIZATION	0.95+
next couple of years	DATE	0.94+
Data Citizens	EVENT	0.93+
Data Citizens 22	EVENT	0.93+
Monday mornings	DATE	0.92+
earlier this year	DATE	0.92+
next decade	DATE	0.91+
one set	QUANTITY	0.9+
version two	OTHER	0.89+
colibra	ORGANIZATION	0.89+
Friday	DATE	0.86+
Data Citizens 22	ORGANIZATION	0.85+
version one	OTHER	0.82+
Data	EVENT	0.81+
Data Citizen 22	ORGANIZATION	0.81+
first chief data	QUANTITY	0.8+
four levels	QUANTITY	0.77+
three	QUANTITY	0.76+
second	QUANTITY	0.73+
Citizens	ORGANIZATION	0.68+
Data	ORGANIZATION	0.65+
Cube	ORGANIZATION	0.6+
2022	EVENT	0.48+

Data Citizens 22 | Laura Sellers

(light music) >> Welcome to the Cube's virtual coverage of Data Citizens 2022. My name is Dave Vellante, and I'm here with Laura Sellers, who is the Chief Product Officer at Collibra, the host of Data Citizens. Laura, welcome. Good to see you. >> Thank you. Nice to be here. >> You know, your keynote at Data Citizens this year focused on, you know, your mission to drive ease of use and scale. Now, when I think about historically, fast access to the right data at the right time in a form that's really easily consumable, it's been kind of challenging, especially for business users. Can you explain to our audience why this matters so much, and what's actually different today in the data ecosystem to make this a reality? >> Yeah, definitely. So I think what we really need and what I hear from customers every single day is that we need a new approach to data management, and our product team is what inspired me to come to Collibra a little bit over a year ago, was really the fact that they're very focused on bringing trusted data to more users across more sources for more use cases. And so as we look at what we're announcing with these innovations of ease of use and scale, it's really about making teams more productive in getting started with and the ability to manage data across the entire organization. So we've been very focused on richer experiences, a broader ecosystem of partners, as well as a platform that delivers performance, scale, and security that our users and teams need and demand. So as we look at, oh, go ahead. >> I was going to say, you know, when I look back at like the last 10 years, it was all about getting the technology to work, and it was just so complicated, but please carry on. I'd love to hear more about this. >> Yeah, I really, you know, Collibra is a system of engagement for data, and we really are working on bringing that entire system of engagement to life for everyone to leverage here and now. 
So what we're announcing from our ease of use side of the world is first our data marketplace. This is the ability for all users to discover and access data quickly and easily, shop for it, if you will. The next thing that we're also introducing is the new homepage. It's really about the ability to drive adoption and have users find data more quickly. And then the two more areas of the ease of use side of the world is our world of usage analytics. And one of the big pushes and passions we have at Collibra is to help with this data driven culture that all companies are trying to create, and also helping with data literacy. With something like usage analytics, it's really about driving adoption of the Collibra platform, understanding what's working, who's accessing it, what's not. And then finally, we're also introducing what's called Workflow Designer. And we love our workflows at Collibra. It's a big differentiator to be able to automate business processes. The designer is really about a way for more people to be able to create those workflows, collaborate on those workflows, as well as people to be able to easily interact with them. So a lot of exciting things when it comes to ease of use to make it easier for all users to find data. >> Yes, there's definitely a lot to unpack there. You know, you mentioned this idea of shopping for the data. That's interesting to me. Why this analogy, metaphor or analogy? I always get those confused. Let's go with analogy. Why is it so important to data consumers? >> I think when you look at the world of data, and I talked about this system of engagement, it's really about making it more accessible to the masses. And what users are used to is a shopping experience, like your Amazon, if you will. 
And so having a consumer grade experience where users can quickly go in and find the data, trust that data, understand where the data's coming from, and then be able to quickly access it, is the idea of being able to shop for it, just making it as simple as possible and really speeding the time to value for any of the business analysts, data analysts out there. >> Yeah, I think when you see a lot of discussion about rethinking data architectures, putting data in the hands of the users and business people, decentralized data, and of course that's awesome. I love that. But of course, then you have to have self-service infrastructure, and you have to have governance. And those are really challenging. And I think so many organizations, they're facing adoption challenges. You know, when it comes to enabling teams generally, especially domain experts, to adopt new data technologies, you know, like the tech comes fast and furious. You got all these open source projects. It can get really confusing. Of course it risks security, governance, and all that good stuff. You got all this jargon. So where do you see, you know, the friction in adopting new data technologies? What's your point of view, and how can organizations overcome these challenges? >> You're dead on. There's so much technology, and there's so much to stay on top of, which is part of the friction, right? It's just being able to stay ahead of and understand all the technologies that are coming. You also look at as there's so many more sources of data, and people are migrating data to the cloud, and they're migrating to new sources. Where the friction comes is really that ability to understand where the data came from, where it's moving to, and then also to be able to put the access controls on top of it. So people are only getting access to the data that they should be getting access to. 
So one of the other things we're announcing with all of the innovations that are coming is what we're doing around performance and scale. So with all of the data movement, with all of the data that's out there, the first thing we're launching in the world of performance and scale is our world of data quality. It's something that Collibra has been working on for the past year and a half, but we're launching the ability to have data quality in the cloud. So it's currently an on-premise offering, but we'll now be able to carry that over into the cloud for us to manage that way. We're also introducing the ability to push down data quality into Snowflake. So this is, again, one of those challenges is making sure that that data that you have is high quality as you move forward. And so really another, we're just reducing friction. You already have Snowflake stood up. It's not another machine for you to manage. It's just push down capabilities into Snowflake to be able to track that quality. Another thing that we're launching with that is what we call Collibra Protect. And this is that ability for users to be able to ingest metadata, understand where the PII data is, and then set policies up on top of it. So very quickly be able to set policies and have them enforced at the data level. So anybody in the organization is only getting access to the data they should have access to. >> This topic of data quality is interesting. It's something that I've followed for a number of years. It used to be a back office function, you know, and really confined only to highly regulated industries like financial services and healthcare and government. You know, you look back over a decade ago, you didn't have this worry about personal information, GDPR, and, you know, California Consumer Privacy Act, all becomes so much important. 
The cloud has really changed things in terms of performance and scale, and of course, partnering with Snowflake, it's all about sharing data and monetization, anything but a back office function. So it was kind of smart that you guys were early on and of course, attracting them as an investor as well was very strong validation. What can you tell us about the nature of the relationship with Snowflake, and specifically interested in sort of joint engineering and product innovation efforts, you know, beyond the standard go to market stuff? >> Definitely. So you mentioned they were a strategic investor in Collibra about a year ago. A little less than that I guess. We've been working with them though for over a year really tightly with their product and engineering teams to make sure that Collibra is adding real value. Our unified platform is touching, pieces of our unified platform are touching all pieces of Snowflake. And when I say that, what I mean is we're first, you know, able to ingest data with Snowflake, which has always existed. We're able to profile and classify that data. We're announcing with Collibra Protect this week that you're now able to create those policies on top of Snowflake and have them enforced. So again, people can get more value out of their Snowflake more quickly. As far as time to value with our policies, for all business users to be able to create. We're also announcing Snowflake Lineage 2.0. So this is the ability to take stored procedures in Snowflake and understand the lineage of where did the data come from, how was it transformed within Snowflake, as well as the data quality pushdown, as I mentioned. Data quality, you brought it up, it is a new, it is a big industry push, and you know, one of the things I think Gartner mentioned is people are losing up to $15 million without having great data quality. 
So this push down capability for Snowflake really is, again, a big ease of use push for us at Collibra of that ability to push it into Snowflake, take advantage of the data source and the engine that already lives there, and get the right and make sure you have the right quality. >> I mean, the nice thing about Snowflake, if you play in the Snowflake sandbox, you can get sort of a high degree of confidence that the data sharing can be done in a safe way. Bringing Collibra into the story allows me to have that data quality and that governance that I need. You know, we've said many times on the Cube that one of the notable differences in cloud this decade versus last decade, I mean there are obvious differences just in terms of scale and scope, but it's shaping up to be about the strength of the ecosystems. That's really a hallmark of these big cloud players. I mean they're, it's a key factor for innovating, accelerating product delivery, filling gaps in the hyperscale offerings, 'cause you got more stack, you know, much more stack capabilities, and it creates this flywheel momentum as we often say. But, so my question is, how do you work with the hyperscalers? Like whether it's AWS or Google or whomever, and what do you see as your role, and what's the Collibra sweet spot? >> Yeah, definitely. So, you know, one of the things I mentioned early on is the broader ecosystem of partners is what it's all about. And so we have that strong partnership with Snowflake. We also are doing more with Google around, you know, GCP and Collibra Protect there, but also tighter Dataplex integration. So similar to what you've seen with our strategic moves around Snowflake and really covering the broad ecosystem of what Collibra can do on top of that data source, we're extending that to the world of Google as well and the world of Dataplex. We also have great partners in SIs. 
Infosys is somebody we spoke with at the conference who's done a lot of great work with Levi's, as they're really important to help people with their whole data strategy and driving that data driven culture and Collibra being the core of it. >> All right, Laura, we're going to end it there, but I wonder if you could kind of put a bow on this year, the event, your perspectives. So just give us your closing thoughts. >> Yeah, definitely. So I want to say, this is one of the biggest releases Collibra's ever had, definitely the biggest one since I've been with the company a little over a year. We have all these great new product innovations coming to really drive the ease of use, to make data more valuable for users everywhere and companies everywhere. And so it's all about everybody being able to easily find, understand, and trust, and get access to that data going forward. >> Well congratulations on all the progress. It was great to have you on the Cube, first time I believe, and really appreciate you taking the time with us. >> Yes, thank you for your time. >> You're very welcome. Okay, you're watching the coverage of Data Citizens 2022 on the Cube, your leader in enterprise and emerging tech coverage. (light music)

Published Date : Oct 31 2022

Breaking Analysis: We Have the Data…What Private Tech Companies Don’t Tell you About Their Business

>> From The Cube Studios in Palo Alto and Boston, bringing you data driven insights from The Cube at ETR. This is "Breaking Analysis" with Dave Vellante. >> The reverse momentum in tech stocks caused by rising interest rates, less attractive discounted cash flow models, and more tepid forward guidance, can be easily measured by public market valuations. And while there's lots of discussion about the impact on private companies and cash runway and 409A valuations, measuring the performance of non-public companies isn't as easy. IPOs have dried up and public statements by private companies, of course, they accentuate the good and they kind of hide the bad. Real data, unless you're an insider, is hard to find. Hello and welcome to this week's "Wikibon Cube Insights" powered by ETR. In this "Breaking Analysis", we unlock some of the secrets that non-public, emerging tech companies may or may not be sharing. And we do this by introducing you to a capability from ETR that we've not exposed you to over the past couple of years, it's called the Emerging Technologies Survey, and it is packed with sentiment data and performance data based on surveys of more than a thousand CIOs and IT buyers covering more than 400 companies. And we've invited back our colleague, Erik Bradley of ETR to help explain the survey and the data that we're going to cover today. Erik, this survey is something that I've not personally spent much time on, but I'm blown away at the data. It's really unique and detailed. First of all, welcome. Good to see you again. >> Great to see you too, Dave, and I'm really happy to be talking about the ETS or the Emerging Technology Survey. Even our own clients of constituents probably don't spend as much time in here as they should. >> Yeah, because there's so much in the mainstream, but let's pull up a slide to bring out the survey composition. Tell us about the study. How often do you run it? What's the background and the methodology? 
>> Yeah, you were just spot on the way you were talking about the private tech companies out there. So what we did is we decided to take all the vendors that we track that are not yet public and move 'em over to the ETS. And there isn't a lot of information out there. If you're not in Silicon (indistinct), you're not going to get this stuff. So PitchBook and TechCrunch are two out there that give some data on these guys. But what we really wanted to do was go out to our community. We have 6,000 ITDMs in our community. We wanted to ask them, "Are you aware of these companies? And if so, are you allocating any resources to them? Are you planning to evaluate them," and really just kind of figure out what we can do. So this particular survey, as you can see, 1000 plus responses, over 450 vendors that we track. And essentially what we're trying to do here is talk about your evaluation and awareness of these companies and also your utilization. And also if you're not utilizing 'em, then we can also figure out your sales conversion or churn. So this is interesting, not only for the ITDMs themselves to figure out what their peers are evaluating and what they should put in POCs against the big guys when contracts come up. But it's also really interesting for the tech vendors themselves to see how they're performing. >> And you can see 2/3 of the respondents are director level or above. You got 28% is C-suite. There is of course a North America bias, 70, 75% is North America. But these smaller companies, you know, that's when they start doing business. So, okay. We're going to do a couple of things here today. First, we're going to give you the big picture across the sectors that ETR covers within the ETS survey. And then we're going to look at the high and low sentiment for the larger private companies. And then we're going to do the same for the smaller private companies, the ones that don't have as much mindshare.
And then I'm going to put those two groups together and we're going to look at two dimensions, actually three dimensions, which companies are being evaluated the most. Second, companies are getting the most usage and adoption of their offerings. And then third, which companies are seeing the highest churn rates, which of course is a silent killer of companies. And then finally, we're going to look at the sentiment and mindshare for two key areas that we like to cover often here on "Breaking Analysis", security and data. And data comprises database, including data warehousing, and then big data analytics is the second part of data. And then machine learning and AI is the third section within data that we're going to look at. Now, one other thing before we get into it, ETR very often will include open source offerings in the mix, even though they're not companies like TensorFlow or Kubernetes, for example. And we'll call that out during this discussion. The reason this is done is for context, because everyone is using open source. It is the heart of innovation and many business models are super glued to an open source offering, like take MariaDB, for example. There's the foundation and then there's with the open source code and then there, of course, the company that sells services around the offering. Okay, so let's first look at the highest and lowest sentiment among these private firms, the ones that have the highest mindshare. So they're naturally going to be somewhat larger. And we do this on two dimensions, sentiment on the vertical axis and mindshare on the horizontal axis and note the open source tool, see Kubernetes, Postgres, Kafka, TensorFlow, Jenkins, Grafana, et cetera. So Erik, please explain what we're looking at here, how it's derived and what the data tells us. >> Certainly, so there is a lot here, so we're going to break it down first of all by explaining just what mindshare and net sentiment is. You explain the axis. 
We have so many evaluation metrics, but we need to aggregate them into one so that way we can rank against each other. Net sentiment is really the aggregation of all the positive and subtracting out the negative. So the net sentiment is a very quick way of looking at where these companies stand versus their peers in their sectors and sub sectors. Mindshare is basically the awareness of them, which is good for very early stage companies. And you'll see some names on here that have obviously been around for a very long time. And they're clearly going to be bigger on the axis, on the outside. Kubernetes, for instance, as you mentioned, is open source. It's the de facto standard for all container orchestration, and it should be that far up and to the right, because that's what everyone's using. In fact, the open source leaders are so prevalent in the Emerging Technology Survey that we break them out later in our analysis, 'cause it's really not fair to include them and compare them to the actual companies that are providing the support and the security around that open source technology. But no survey, no analysis, no research would be complete without including this open source tech. So what we're looking at here, if I can just get away from the open source names, we see other things like Databricks and OneTrust. They're repeating as top net sentiment performers here. And then also the design vendors. People don't spend a lot of time on 'em, but Miro and Figma. This is their third survey in a row where they're just dominating that sentiment overall. And Adobe should probably take note of that because they're really coming after them. But Databricks, we all know probably would've been a public company by now if the market hadn't turned, but you can see just how dominant they are in a survey of nothing but private companies. And we'll see that again when we talk about the database later.
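Erik's description of the two axes in this passage (net sentiment as the positive responses minus the negative ones, mindshare as awareness) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transcript does not give ETR's actual formula, so the response buckets, the normalization by total citations, and the names `VendorResponses`, `net_sentiment`, and `mindshare` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VendorResponses:
    """Hypothetical survey tallies for one vendor (not ETR's real schema)."""
    positive: int   # e.g. plan to evaluate, evaluating, plan to utilize
    negative: int   # e.g. evaluated with no plan to utilize, churned
    aware: int      # respondents who say they know the vendor
    surveyed: int   # all respondents in the survey


def net_sentiment(r: VendorResponses) -> float:
    """Positive minus negative citations, as a share of all citations.

    Assumption: normalizing by total citations so vendors of different
    sizes can be ranked against each other.
    """
    cited = r.positive + r.negative
    if cited == 0:
        return 0.0
    return (r.positive - r.negative) / cited


def mindshare(r: VendorResponses) -> float:
    """Share of all respondents who are aware of the vendor."""
    if r.surveyed == 0:
        return 0.0
    return r.aware / r.surveyed


if __name__ == "__main__":
    example = VendorResponses(positive=60, negative=20, aware=100, surveyed=1000)
    print(f"net sentiment: {net_sentiment(example):.2f}")
    print(f"mindshare:     {mindshare(example):.2f}")
```

Plotting `net_sentiment` on the vertical axis and `mindshare` on the horizontal axis would give the kind of two-dimensional view discussed on the slide.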
>> And I'll just add, so you see Automation Anywhere on there, the big UiPath competitor, a company that was not able to get to the public markets. They've been trying. Snyk, Peter McKay's company, they've raised a bunch of money, big security player. They're doing some really interesting things in developer security, helping developers secure the data flow. H2O.ai and Dataiku, AI companies. We saw them at the Snowflake Summit. Redis Labs, Netskope in security. So a lot of names that we know that ultimately we think are probably going to be hitting the public market. Okay, here's the same view for private companies with less mindshare, Erik. Take us through this one. >> On the previous slide too, real quickly, I wanted to pull out SecurityScorecard and we'll get back into it. But this is a newcomer, and I couldn't believe how strong their data was, but we'll bring that up in a second. Now, when we go to the ones of lower mindshare, it's interesting to talk about open source, right? Kubernetes was all the way on the top right. Everyone uses containers. Here we see Istio up there. Not everyone is using service mesh as much. And that's why Istio is in the smaller breakout. But still when you talk about net sentiment, it's about the leader, it's the highest one there is. So really interesting to point out. Then we see other names like Collibra on the data side really performing well. And again, as always security, very well represented here. We have Aqua, Wiz, Armis, which is a standout in this survey this time around. They do IoT security. I hadn't even heard of them until I started digging into the data here. And I couldn't believe how well they were doing. And then of course you have Anyscale, which is doing second best in this, and the best name in the survey, Hugging Face, which is a machine learning AI tool. Also doing really well on net sentiment, but they're not as far along on that axis of mindshare just yet.
So these are again, emerging companies that might not be as well represented in the enterprise as they will be in a couple of years. >> Hugging Face sounds like something you do with your two year old. Like you said, you see high performers, Anyscale do machine learning and you mentioned them. They came out of Berkeley. Collibra, governance. InfluxData is on there. InfluxDB's a time series database. And yeah, of course, Alex, if you bring that back up, you get a big group of red dots, right? That's the bad zone, I guess. Sisense does viz, Yellowbrick Data is an MPP database. How should we interpret the red dots, Erik? I mean, is it necessarily a bad thing? Could it be misinterpreted? What's your take on that? >> Sure, well, let me just explain the definition of it first from a data science perspective, right? We're a data company first. So the gray dots that you're seeing that aren't named, that's the mean, that's the average. So in order for you to be on this chart, you have to be at least one standard deviation above or below that average. So that gray is where we're saying, "Hey, this is where the lump of average comes in. This is where everyone normally stands." So you either have to be an outperformer or an underperformer to even show up in this analysis. So by definition, yes, the red dots are bad. You're at least one standard deviation below the average of your peers. It's not where you want to be. And if you're on the lower left, not only are you not performing well from a utilization or an actual usage rate, but people don't even know who you are. So that's a problem, obviously. And the VCs and the PEs out there that are backing these companies, they're the ones who mostly are interested in this data. >> Yeah. Oh, that's a great explanation. Thank you for that. No, nice benchmarking there and yeah, you don't want to be in the red. All right, let's get into the next segment here. Here we're going to look at evaluation rates, adoption and the all-important churn.
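The cutoff Erik describes above (a vendor is only named on the chart if it sits at least one standard deviation above or below the peer average) can be sketched as follows. This is a minimal illustration, not ETR's code: the function name `classify_vendors`, the use of the population standard deviation, and the sample scores are assumptions.

```python
from statistics import mean, pstdev


def classify_vendors(scores: dict[str, float]) -> dict[str, str]:
    """Label each vendor green, red, or gray relative to its peers.

    green: at least one standard deviation above the peer mean
    red:   at least one standard deviation below the peer mean
    gray:  within one standard deviation (the unnamed "lump of average")
    """
    avg = mean(scores.values())
    # Assumption: population standard deviation over the peer group.
    sd = pstdev(scores.values())
    labels = {}
    for vendor, score in scores.items():
        if sd > 0 and score >= avg + sd:
            labels[vendor] = "green"
        elif sd > 0 and score <= avg - sd:
            labels[vendor] = "red"
        else:
            labels[vendor] = "gray"
    return labels


if __name__ == "__main__":
    # Hypothetical net sentiment scores for five made-up vendors.
    sample = {"VendorA": 10, "VendorB": 50, "VendorC": 50, "VendorD": 50, "VendorE": 90}
    for vendor, label in classify_vendors(sample).items():
        print(vendor, label)
```

Only the green and red vendors would be named on a slide like the one being discussed; everything labeled gray stays an anonymous dot.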
First new evaluations. Let's bring up that slide. And Erik, take us through this. >> So essentially I just want to explain what evaluation means is that people will cite that they either plan to evaluate the company or they're currently evaluating. So that means we're aware of 'em and we are choosing to do a POC of them. And then we'll see later how that turns into utilization, which is what a company wants to see, awareness, evaluation, and then actually utilizing them. That's sort of the life cycle for these emerging companies. So what we're seeing here, again, with very high evaluation rates. H2O, we mentioned. SecurityScorecard jumped up again. Chargebee, Snyk, Salt Security, Armis. A lot of security names are up here, Aqua, Netskope, which God has been around forever. I still can't believe it's in an Emerging Technology Survey But so many of these names fall in data and security again, which is why we decided to pick those out Dave. And on the lower side, Vena, Acton, those unfortunately took the dubious award of the lowest evaluations in our survey, but I prefer to focus on the positive. So SecurityScorecard, again, real standout in this one, they're in a security assessment space, basically. They'll come in and assess for you how your security hygiene is. And it's an area of a real interest right now amongst our ITDM community. >> Yeah, I mean, I think those, and then Arctic Wolf is up there too. They're doing managed services. You had mentioned Netskope. Yeah, okay. All right, let's look at now adoption. These are the companies whose offerings are being used the most and are above that standard deviation in the green. Take us through this, Erik. >> Sure, yet again, what we're looking at is, okay, we went from awareness, we went to evaluation. Now it's about utilization, which means a survey respondent's going to state "Yes, we evaluated and we plan to utilize it" or "It's already in our enterprise and we're actually allocating further resources to it." 
Not surprising, again, a lot of open source, the reason why, it's free. So it's really easy to grow your utilization on something that's free. But as you and I both know, as Red Hat proved, there's a lot of money to be made once the open source is adopted, right? You need the governance, you need the security, you need the support wrapped around it. So here we're seeing Kubernetes, Postgres, Apache Kafka, Jenkins, Grafana. These are all open source based names. But if we're looking at names that are non open source, we're going to see Databricks, Automation Anywhere, Rubrik all have the highest mindshare. So these are the names, not surprisingly, all names that probably should have been public by now. Everyone's expecting an IPO imminently. These are the names that have the highest mindshare. If we talk about the highest utilization rates, again, Miro and Figma pop up, and I know they're not household names, but they are just dominant in this survey. These are applications that are meant for design software and, again, they're going after an Autodesk or a CAD or Adobe type of thing. It is just dominant how high the utilization rates are here, which again is something Adobe should be paying attention to. And then you'll see a little bit lower, but also interesting, we see Collibra again, we see Hugging Face again. And these are names that are obviously in the data governance, ML, AI side. So we're seeing a ton of data, a ton of security and Rubrik was interesting in this one, too, high utilization and high mindshare. We know how pervasive they are in the enterprise already. >> Erik, Alex, keep that up for a second, if you would. So yeah, you mentioned Rubrik. Cohesity's not on there. They're sort of the big one. We're going to talk about them in a moment. Puppet is interesting to me because you remember the early days of that sort of space, you had Puppet and Chef and then you had Ansible. Red Hat bought Ansible and then Ansible really took off. 
So it's interesting to see Puppet on there as well. Okay. So now let's look at the churn because this one is where you don't want to be. It's, of course, all red 'cause churn is bad. Take us through this, Erik. >> Yeah, definitely don't want to be here and I don't love to dwell on the negative. So we won't spend as much time. But to your point, there's one thing I want to point out that think it's important. So you see Rubrik in the same spot, but Rubrik has so many citations in our survey that it actually would make sense that they're both being high utilization and churn just because they're so well represented. They have such a high overall representation in our survey. And the reason I call that out is Cohesity. Cohesity has an extremely high churn rate here about 17% and unlike Rubrik, they were not on the utilization side. So Rubrik is seeing both, Cohesity is not. It's not being utilized, but it's seeing a high churn. So that's the way you can look at this data and say, "Hm." Same thing with Puppet. You noticed that it was on the other slide. It's also on this one. So basically what it means is a lot of people are giving Puppet a shot, but it's starting to churn, which means it's not as sticky as we would like. One that was surprising on here for me was Tanium. It's kind of jumbled in there. It's hard to see in the middle, but Tanium, I was very surprised to see as high of a churn because what I do hear from our end user community is that people that use it, like it. It really kind of spreads into not only vulnerability management, but also that endpoint detection and response side. So I was surprised by that one, mostly to see Tanium in here. Mural, again, was another one of those application design softwares that's seeing a very high churn as well. >> So you're saying if you're in both... Alex, bring that back up if you would. So if you're in both like MariaDB is for example, I think, yeah, they're in both. 
They're both green in the previous one and red here, that's not as bad. You mentioned Rubrik is going to be in both. Cohesity is a bit of a concern. Cohesity just brought on Sanjay Poonen. So this could be a go to market issue, right? I mean, 'cause Cohesity has got a great product and they got really happy customers. So they're just maybe having to figure out, okay, what's the right ideal customer profile and Sanjay Poonen, I guarantee, is going to have that company cranking. I mean they had been doing very well on the surveys and had fallen off of a bit. The other interesting things wondering the previous survey I saw Cvent, which is an event platform. My only reason I pay attention to that is 'cause we actually have an event platform. We don't sell it separately. We bundle it as part of our offerings. And you see Hopin on here. Hopin raised a billion dollars during the pandemic. And we were like, "Wow, that's going to blow up." And so you see Hopin on the churn and you didn't see 'em in the previous chart, but that's sort of interesting. Like you said, let's not kind of dwell on the negative, but you really don't. You know, churn is a real big concern. Okay, now we're going to drill down into two sectors, security and data. Where data comprises three areas, database and data warehousing, machine learning and AI and big data analytics. So first let's take a look at the security sector. Now this is interesting because not only is it a sector drill down, but also gives an indicator of how much money the firm has raised, which is the size of that bubble. And to tell us if a company is punching above its weight and efficiently using its venture capital. Erik, take us through this slide. Explain the dots, the size of the dots. Set this up please. >> Yeah. 
So again, the axis is still the same, net sentiment and mindshare, but what we've done this time is we've taken publicly available information on how much capital company is raised and that'll be the size of the circle you see around the name. And then whether it's green or red is basically saying relative to the amount of money they've raised, how are they doing in our data? So when you see a Netskope, which has been around forever, raised a lot of money, that's why you're going to see them more leading towards red, 'cause it's just been around forever and kind of would expect it. Versus a name like SecurityScorecard, which is only raised a little bit of money and it's actually performing just as well, if not better than a name, like a Netskope. OneTrust doing absolutely incredible right now. BeyondTrust. We've seen the issues with Okta, right. So those are two names that play in that space that obviously are probably getting some looks about what's going on right now. Wiz, we've all heard about right? So raised a ton of money. It's doing well on net sentiment, but the mindshare isn't as well as you'd want, which is why you're going to see a little bit of that red versus a name like Aqua, which is doing container and application security. And hasn't raised as much money, but is really neck and neck with a name like Wiz. So that is why on a relative basis, you'll see that more green. As we all know, information security is never going away. But as we'll get to later in the program, Dave, I'm not sure in this current market environment, if people are as willing to do POCs and switch away from their security provider, right. There's a little bit of tepidness out there, a little trepidation. So right now we're seeing overall a slight pause, a slight cooling in overall evaluations on the security side versus historical levels a year ago. >> Now let's stay on here for a second. So a couple things I want to point out. So it's interesting. 
Now Snyk has raised over, I think $800 million but you can see them, they're high on the vertical and the horizontal, but now compare that to Lacework. It's hard to see, but they're kind of buried in the middle there. That's the biggest dot in this whole thing. I think I'm interpreting this correctly. They've raised over a billion dollars. It's a Mike Speiser company. He was the founding investor in Snowflake. So people watch that very closely, but that's an example of where they're not punching above their weight. They recently had a layoff and they got to fine tune things, but I'm still confident they they're going to do well. 'Cause they're approaching security as a data problem, which is probably people having trouble getting their arms around that. And then again, I see Arctic Wolf. They're not red, they're not green, but they've raised fair amount of money, but it's showing up to the right and decent level there. And a couple of the other ones that you mentioned, Netskope. Yeah, they've raised a lot of money, but they're actually performing where you want. What you don't want is where Lacework is, right. They've got some work to do to really take advantage of the money that they raised last November and prior to that. >> Yeah, if you're seeing that more neutral color, like you're calling out with an Arctic Wolf, like that means relative to their peers, this is where they should be. It's when you're seeing that red on a Lacework where we all know, wow, you raised a ton of money and your mindshare isn't where it should be. Your net sentiment is not where it should be comparatively. And then you see these great standouts, like Salt Security and SecurityScorecard and Abnormal. You know they haven't raised that much money yet, but their net sentiment's higher and their mindshare's doing well. So those basically in a nutshell, if you're a PE or a VC and you see a small green circle, then you're doing well, then it means you made a good investment. 
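The idea behind the bubble colors in this exchange (how a company is doing in the survey relative to how much money it has raised) can be sketched with a simple rank comparison. The transcript does not say how ETR computes the color, so this is an assumption throughout: the percentile-rank approach, the `tolerance` parameter, the function names, and the vendor names and figures are all hypothetical.

```python
def percentile_ranks(values: dict[str, float]) -> dict[str, float]:
    """Map each name to its rank in [0, 1], lowest value first."""
    ordered = sorted(values, key=lambda name: values[name])
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 0.5}
    return {name: i / (n - 1) for i, name in enumerate(ordered)}


def funding_relative_color(
    funding: dict[str, float],
    score: dict[str, float],
    tolerance: float = 0.25,
) -> dict[str, str]:
    """Color a vendor by whether its survey score outranks its funding.

    green:   score rank is well above funding rank (punching above its weight)
    red:     score rank is well below funding rank
    neutral: roughly where its funding suggests it should be
    """
    funding_rank = percentile_ranks(funding)
    score_rank = percentile_ranks(score)
    colors = {}
    for name in funding:
        gap = score_rank[name] - funding_rank[name]
        if gap > tolerance:
            colors[name] = "green"
        elif gap < -tolerance:
            colors[name] = "red"
        else:
            colors[name] = "neutral"
    return colors


if __name__ == "__main__":
    # Hypothetical funding (USD millions) and survey scores.
    funding = {"BigRaise": 1000, "MidRaise": 300, "SmallRaise": 50}
    score = {"BigRaise": 0.2, "MidRaise": 0.5, "SmallRaise": 0.8}
    print(funding_relative_color(funding, score))
```

A small raise paired with a strong score comes out green here, which matches the "small green circle" reading Erik gives for investors.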
>> Some of these guys, I don't know, but you see these small green circles. Those are the ones you want to start digging into and maybe help them catch a wave. Okay, let's get into the data discussion. And again, three areas, database slash data warehousing, big data analytics and ML AI. First, we're going to look at the database sector. So Alex, thank you for bringing that up. Alright, take us through this, Erik. Actually, let me just say PostgreSQL. I got to ask you about this. It shows some funding, but that actually could be a mix of EDB, the company that commercializes Postgres, and Postgres the open source database, which is a transaction system and kind of an open source Oracle. You see MariaDB is a database, but open source database. But the company, they've raised over $200 million and they filed an S-4. So Erik, it looks like this might be a little bit of a mashup of companies and open source products. Help us understand this. >> Yeah, it's tough when you start dealing with the open source side and I'll be honest with you, there is a little bit of a mashup here. There are certain names here that are a hundred percent for profit companies. And then there are others that are obviously open source based like Redis is open source, but Redis Labs is the one trying to monetize the support around it. So you're a hundred percent accurate on this slide. I think one of the things here that's important to note though, is just how important open source is to data. If you're going to be going to any of these areas, it's going to be open source based to begin with. And Neo4j is one I want to call out here. It's not one everyone's familiar with, but it's basically a graph database, which is a name that we're seeing on the net sentiment side actually really, really high. When you think about it, it's the third overall net sentiment for a niche database play.
It's not as big on the mindshare 'cause its use cases aren't as often, but third biggest play on net sentiment, I found really interesting on this slide. >> And again, so MariaDB, as I said, they filed an S-4, I think $50 million in revenue, that might even be ARR. So they're not huge, but they're getting there. And by the way, MariaDB, if you don't know, was the company that was formed the day that Oracle bought Sun in which they got MySQL and MariaDB has done a really good job of replacing a lot of MySQL instances. Oracle has responded with MySQL HeatWave, which was kind of the Oracle version of MySQL. So there's some interesting battles going on there. If you think about the LAMP stack, the M in the LAMP stack was MySQL. And so now it's all MariaDB replacing that MySQL for a large part. And then you see again, the red, you know, you got to have some concerns about there. Aerospike's been around for a long time. SingleStore changed their name a couple years ago, last year. Yellowbrick Data, Firebolt was kind of going after Snowflake for a while, but yeah, you want to get out of that red zone. So they got some work to do. >> And Dave, real quick for the people that aren't aware, I just want to let them know that we can cut this data with the public company data as well. So we can cross over this with that because some of these names are competing with the larger public company names as well. So we can go ahead and cross reference like a MariaDB with a Mongo, for instance, or something of that nature. So it's not in this slide, but at another point we can certainly explain on a relative basis how these private names are doing compared to the other ones as well. >> All right, let's take a quick look at analytics. Alex, bring that up if you would. Go ahead, Erik. >> Yeah, I mean, essentially here, I can't see it on my screen, my apologies. I just kind of went to blank on that. So gimme one second to catch up. >> So I could set it up while you're doing that. 
You got Grafana up and to the right. I mean, this is huge, right. >> Got it, thank you. I lost my screen there for a second. Yep. Again, open source name Grafana, absolutely up and to the right. But as we know, Grafana Labs is actually picking up a lot of speed based on Grafana, of course. And I think we might actually hear some noise from them coming this year. The names that are actually a little bit more disappointing that I want to call out are names like ThoughtSpot. It's been around forever. Their mindshare of course is second best here but based on the amount of time they've been around and the amount of money they've raised, it's not actually outperforming the way it should be. We're seeing Moogsoft obviously make some waves. That's very high net sentiment for that company. It's, you know, what, third, fourth position overall in this entire area. Another name like Fivetran, Matillion is doing well. Fivetran, even though it's got a high net sentiment, again, it's raised so much money that we would've expected a little bit more at this point. I know you know this space extremely well, but basically what we're looking at here and to the bottom left, you're going to see some names with a lot of red, large circles that really just aren't performing that well. InfluxData, however, second highest net sentiment. And it's really pretty early on in this stage and the feedback we're getting on this name is the use cases are great, the efficacy's great. And I think it's one to watch out for. >> InfluxData, time series database. The other interesting things I just noticed here, you got Tamr on here, which is that little small green. Those are the ones we were saying before, look for those guys. They might be some of the interesting companies out there and then Observe, Jeremy Burton's company. They do observability on top of Snowflake, not green, but kind of in that gray. So that's kind of cool. Monte Carlo is another one, they're sort of slightly green. 
They are doing some really interesting things in data and data mesh. So yeah, okay. So I can spend all day on this stuff, Erik, phenomenal data. I got to get back and really dig in. Let's end with machine learning and AI. Now this chart, it's similar in its dimensions, of course, except for the money raised. We're not showing that size of the bubble, but AI is so hot. We wanted to cover that here, Erik, explain this please. Why TensorFlow is highlighted and walk us through this chart. >> Yeah, it's funny yet again, right? Another open source name, TensorFlow being up there. And I just want to explain, we do break out machine learning, AI as its own sector. A lot of this of course really is intertwined with the data side, but it is its own area. And one of the things I think that's most important here to break out is Databricks. We started to cover Databricks in machine learning, AI. That company has grown into much, much more than that. So I do want to state to you Dave, and also the audience out there that moving forward, we're going to be moving Databricks out of only the ML/AI into other sectors. So we can kind of value them against their peers a little bit better. But in this instance, you could just see how dominant they are in this area. And one thing that's not here, but I do want to point out is that we have the ability to break this down by industry vertical, organization size. And when I break this down into Fortune 500 and Fortune 1000, both Databricks and TensorFlow are even better than you see here. So it's quite interesting to see that the names that are succeeding are also succeeding with the largest organizations in the world. And as we know, large organizations means large budgets. So this is one area that I just thought was really interesting to point out that as we break it down, the data by vertical, these two names still are the outstanding players. >> I just also want to call out H2O.ai. 
They're getting a lot of buzz in the marketplace and I'm seeing them a lot more. Anaconda, another one. Dataiku consistently popping up. DataRobot is also interesting because of all the kerfuffle that's going on there. The Cube guy, Cube alum, Chris Lynch stepped down as executive chairman. All this stuff came out about how the executives were taking money off the table and didn't allow the employees to participate in that money raising deal. So that's pissed a lot of people off. And so they're now going through some kind of uncomfortable things, which is unfortunate because DataRobot, I noticed, we haven't covered them that much in "Breaking Analysis", but I've noticed them oftentimes, Erik, in the surveys doing really well. So you would think that company has a lot of potential. But yeah, it's an important space that we're going to continue to watch. Let me ask you Erik, can you contextualize this from a time series standpoint? I mean, how has this changed over time? >> Yeah, again, not shown here, but in the data. I'm sorry, go ahead. >> No, I'm sorry. What I meant, I should have interjected. In other words, you would think in a downturn that these emerging companies would be less interesting to buyers 'cause they're more risky. What have you seen? >> Yeah, and it was interesting before we went live, you and I were having this conversation about "Is the downturn stopping people from evaluating these private companies or not," right. In a larger sense, that's really what we're doing here. How are these private companies doing when it comes down to the actual practitioners? The people with the budget, the people with the decision making. And so what I did is, we have historical data as you know, I went back to the Emerging Technology Survey we did in November of '21, right at the crest right before the market started to really fall and everything kind of started to fall apart there. 
And what I noticed is on the security side, very much so, we're seeing less evaluations than we were in November '21. So I broke it down. On cloud security, net sentiment went from 21% to 16% from November '21. That's a pretty big drop. And again, net sentiment is our one aggregate metric for overall positivity, meaning utilization and actual evaluation of the name. Again in database, we saw it drop a little bit from 19% to 13%. However, in analytics we actually saw it stay steady. So it's pretty interesting that yes, cloud security and security in general is always going to be important. But right now we're seeing less overall net sentiment in that space. But within analytics, we're seeing steady with growing mindshare. And also to your point earlier in machine learning, AI, we're seeing steady net sentiment and mindshare has grown a whopping 25% to 30%. So despite the downturn, we're seeing more awareness of these companies in analytics and machine learning and a steady, actual utilization of them. I can't say the same in security and database. They're actually shrinking a little bit since the end of last year. >> You know it's interesting, we were on a round table, Erik does these round tables with CISOs and CIOs, and I remember one time you had asked the question, "How do you think about some of these emerging tech companies?" And one of the executives said, "I always include somebody in the bottom left of the Gartner Magic Quadrant in my RFPs." I think he said, "That's how I found," I don't know, it was Zscaler or something like that years before anybody ever knew of them, "because they're going to help me get to the next level." So it's interesting to see, Erik, in these sectors, how they're holding up in many cases. >> Yeah. It's a very important part for the actual IT practitioners themselves. There's always contracts coming up and you always have to worry about your next round of negotiations. And that's one of the roles these guys play. 
You have to do a POC when contracts come up, but it's also their job to stay on top of the new technology. You can't fall behind. Like everyone's a software company. Now everyone's a tech company, no matter what you're doing. So these guys have to stay in on top of it. And that's what this ETS can do. You can go in here and look and say, "All right, I'm going to evaluate their technology," and it could be twofold. It might be that you're ready to upgrade your technology and they're actually pushing the envelope or it simply might be I'm using them as a negotiation ploy. So when I go back to the big guy who I have full intentions of writing that contract to, at least I have some negotiation leverage. >> Erik, we got to leave it there. I could spend all day. I'm going to definitely dig into this on my own time. Thank you for introducing this, really appreciate your time today. >> I always enjoy it, Dave and I hope everyone out there has a great holiday weekend. Enjoy the rest of the summer. And, you know, I love to talk data. So anytime you want, just point the camera on me and I'll start talking data. >> You got it. I also want to thank the team at ETR, not only Erik, but Darren Bramen who's a data scientist, really helped prepare this data, the entire team over at ETR. I cannot tell you how much additional data there is. We are just scratching the surface in this "Breaking Analysis". So great job guys. I want to thank Alex Myerson. Who's on production and he manages the podcast. Ken Shifman as well, who's just coming back from VMware Explore. Kristen Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our editor in chief over at SiliconANGLE. Does some great editing for us. Thank you. All of you guys. Remember these episodes, they're all available as podcast, wherever you listen. All you got to do is just search "Breaking Analysis" podcast. I publish each week on wikibon.com and siliconangle.com. 
Or you can email me to get in touch david.vellante@siliconangle.com. You can DM me at dvellante or comment on my LinkedIn posts and please do check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for Erik Bradley and The Cube Insights powered by ETR. Thanks for watching. Be well. And we'll see you next time on "Breaking Analysis". (upbeat music)

Published Date : Sep 7 2022


Sandy Carter | AWS Global Public Sector Partner Awards 2021

(upbeat music) >> Welcome to the special CUBE presentation of the AWS Global Public Sector Partner Awards Program. I'm here with the leader of the partner program, Sandy Carter, Vice President, AWS, Amazon Web Services, @Sandy_Carter on Twitter, prolific on social and great leader. Sandy, great to see you again. And congratulations on this great program we're having here. In fact, thanks for coming out for this keynote. >> Well, thank you, John, for having me. You guys always talk about the coolest things. So we had to be part of it. >> Well, one of the things that I've been really loving about the success of public sector, we talked about this before, is that as we start coming out of the pandemic, it's becoming very clear that the cloud has helped a lot of people and your team has done amazing work, just want to give you props for that and say, congratulations, and what a great time to talk about the winners. Because everyone's been working really hard in public sector, because of the pandemic. The internet didn't break. And everyone stepped up with cloud scale and solved some problems. So take us through the award winners and talk about them. Give us an overview of what it is. The criteria and all the specifics. >> Yeah, you got it. So we've been doing this annually, and it's for our public sector partners overall, to really recognize the very best of the best. Now, we love all of our partners, John, as you know, but every year we'd like to really hone in on a couple who really leverage their skills and their ability to deliver a great customer solution. They demonstrate those Amazon leadership principles like working backwards from the customer, having a bias for action, they've engaged with AWS in very unique ways. And as well, they've contributed to our customer success, which is so very important to us and to our customers as well. >> That's awesome. Hey, can we put up a slide, I know we have a slide on the winners, I want to look at them, with the tiles here. 
So here's a list of some of the winners. I see nice little stars on there. Look at the gold star. I know IronNet, CrowdStrike. That's General Keith Alexander's company, I mean, super relevant. Presidio, we've interviewed them before many times, got Palantir in there. And is there another one, I want to take a look at some of the other names here. >> In overall we had 21 categories. You know, we have over 1900 public sector partners today. So you'll notice that in the awards we did a big focus on mission. So things like government, education, health care, we spotlighted some of the brand new technologies like Containers, Artificial Intelligence, Amazon Connect. And we also this year added in awards for innovative use of our programs, like think big for small business and PTP as well. >> Yeah, well, great roundup, looking forward to hearing more about those companies. I have to ask you, because this always comes up, we're seeing more and more ecosystem discussions when we talk about the future of cloud. And obviously, we're going to, you know, be at Mobile World Congress, theCUBE, back in physical form, again, (indistinct) will continue to go on. The notion of ecosystem is becoming a key competitive advantage for companies and missions. So I have to ask you, why are partners so important to your public sector team? Talk about the importance of partners in context to your mission? >> Yeah, you know, our partners are critical. We drive most of our business in public sector through partners. They have great relationships, they've got great skills, and they have, you know, that really unique ability to meet the customer needs. If I just highlighted a couple of things, even using some of our partners who won awards, the first is, you know, migrations are so critical. Andy talked at Reinvent about still 96% of applications still sitting on premises. So anybody who can help us with the velocity of migrations is really critical. 
And I don't know if you knew John, but 80% of our migrations are led by partners. So for example, we gave awards to Collibra and Databricks as best lead migration for data as well as Datacom for best data lead migration as well. And that's because they increase the velocity of migrations, which increases customer satisfaction. They also bring great subject matter expertise, in particular around that mission that you're talking about. So for instance, GDIT won best Mission Solution For Federal, and they had just an amazing solution that was a secure virtual desktop that reduced a federal agencies deployment process, from months to days. And then finally, you know, our partners drive new opportunities and innovate on behalf of our customers. So we did award this year for P to P, Partnering to Partner which is a really big element of ecosystems, but it was won by four points and in quizon, and they were able to work together to implement a data, implement a data lake and an AI, ML solution, and then you just did the startup showcase, we have a best startup delivering innovation too, and that was EduTech (indistinct) Central America. And they won for implementing an amazing student registration and early warning system to alert and risks that may impact a student's educational achievement. So those are just some of the reasons why partners are important. I could go on and on. As you know, I'm so passionate about my partners, >> I know you're going to talk for an hour, we have to cut you off a little there. (indistinct) love your partners so much. You have to focus on this mission thing. It was a strong mission focus in the awards this year. Why are customers requiring much more of a mission focused? Is it because, is it a part of the criteria? I mean, we're seeing a mission being big. Why is that the case? >> Well, you know, IDC, said that IT spend for a mission or something with a purpose or line of business was five times greater than IT. 
We also recently did our CTO study where we surveyed thousands of CTOs. And the biggest and most changing elements today is really not around the technology. But it's around the industry, healthcare, space that we talked about earlier, or government. So those are really important. So for instance, New Reburial, they won Best Mission for Healthcare. And they did that because of their new smart diagnostic system. And then we had a partner win, PA Consulting, for Best Amazon Connect solution around a mission for providing support for those most at risk, the elderly population, those who already had pre-existing conditions, and really making sure they were doing what they called risk shielding during COVID. Really exciting and big, strong focus on mission. >> Yeah, and it's also, you know, we've been covering a lot on this, people want to work for a company that has purpose, and that has missions. I think that's going to be part of the table stakes going forward. I got to ask you on the secrets of success when this came up, I love asking this question, because, you know, we're starting to see the playbooks of what I call post COVID and cloud scale 2.0, whatever you want to call it, as you're starting to see this new modern era of success formulas, obviously, large scale value creation mission. These are points we're hearing in key conversations across the board. What do you see as the secret of success for these partners? I mean, obviously, it's indirect for Amazon, I get that, but they also have their customers, they're your customers' customers. That's been around for a while. But there's a new model emerging. What are the secrets from your standpoint of success? >> You know, it's so interesting, John, that you asked me this, because this is the number one question that I get from partners too. I would say the first secret is being able to work backwards from your customer, not just technology. So take one of our award winners Cognizant. 
They won for their digital tolling solution. And they work backwards from the customer and how to modernize that, or Pariveda, who is one of our best energy solution winners. And again, they looked at some of these major capital projects that oil companies were doing, working backwards from what the customer needed. I think that's number one, working backwards from the customer. Two, is having that mission expertise. So given that you have to have technology, but you also got to have that expertise in the area. We see that as a big secret of our public sector partners. So education cloud, (indistinct) won for education, Effectual won for government and not for profit, Accenture won, really leveraging and showcasing their global expansion around public safety and disaster response. Very important as well. And then I would say the last secret of success is building repeatable solutions using those strong skills. So Deloitte, they have a great solution for migration, including mainframes. And then you mentioned early on, CrowdStrike and IronNet, just think about the skill sets that they have there for repeatable solutions around security. So I think it's really around working backwards from the customer, having that mission expertise, and then building a repeatable solution, leveraging your skill sets. >> That's a great formula for success. You mentioned IronNet, and cybersecurity. One of the things that's coming up is, in addition to having those best practices, there's also like real problems to solve, like, ransomware is now becoming a government and commercial problem, right. So (indistinct) seeing that happen a lot in DC, that's a front burner. That's a societal impact issue. That's like a cybersecurity kind of national security defense issue, but also, it's a technical one. 
And also public sector, through my interviews, I can tell you the past year and a half, there's been a lot of creativity of new solutions, new problems or new opportunities that are not yet identified as problems and I'd love to get your thoughts on my conversation with Jeff Barr yesterday from AWS, who's been blogging all the news and he is a leader in the community. He was saying that he sees like 5G and the edge as new opportunities where it's creative. He compared it to going to the home improvement store where he just goes to buy one thing. He does other things. And so there's a builder culture. And I think this is something that's coming out of your group more, because the pandemic forced these problems, and they forced new opportunities to be creative, and to build. What's your thoughts? >> Yeah, so I see that too. So if you think about builders, you know, we had a Partner Executive Council yesterday, we had 900 executives sign up from all of our partners. And we asked some survey questions like, what are you building with today? And the number one thing was artificial intelligence and machine learning. And I think that's such a new builder's tool today, John, and, you know, one of our partners who won an award for the most innovative AI&ML was Kablamo. And what they did was they used AI&ML to do a risk assessment on bushfires or wildfires in Australia. But I think it goes beyond that. I think it's building for that need. And this goes back to, we always talk about #techforgood. Presidio, I love this award that they won for best nonprofit, the Cherokee Nation, which is one of our, you know, Native American heritage, they were worried about their language going out, like completely out like no one being able to speak it. And so they came to Presidio, and they asked how could we have a virtual classroom platform for the Cherokee Nation? 
And they created this game that's available on your phone, so innovative, so much of a builder's culture to capture that young generation, so they don't lose their language. So I do agree. I mean, we're seeing builders everywhere, we're seeing them use artificial intelligence, containers, security. And we're even starting with quantum, so it is pretty powerful, what you can do as a public sector partner. >> I think the partner equation is just so wide open, because it's always been based on value, adding value, right? So adding value is just what they do. And by the way, you make money doing it if you do a good job of adding value. And, again, I just love riffing on this, because Dave and I talked about this on theCUBE all the time, and it comes up all the time in cloud conversations. The lock in isn't proprietary technology anymore, it's value, and scale. So you're starting to see builders thrive in that environment. So really good points. Great best practice. And I think I'm very bullish on the partner ecosystems in general, and people do it right, flat upside. I got to ask you, though, going forward, because this is the big post COVID kind of conversation. And last time we talked on theCUBE about this, you know, people want to have a growth strategy coming out of COVID. They want to be, they want to have a tailwind, they want to be on the right side of history. No one wants to be in the losing end of all this. So last year in 2021 your goals were very clear, mission, migrations, modernization. What's the focus for the partners beyond 2021? What are you guys thinking to enable them, 21 is going to be a nice on ramp to this post COVID growth strategy? What's the focus beyond 2021 for you and your partners? >> Yeah, it's really interesting, we're going to actually continue to focus on those three M's, mission, migration and modernization. But we'll bring in different elements of it. 
So for example, on mission, we see a couple of new areas that are really rising to the top, Smart Cities now that everybody's going back to work and (indistinct) down, operations and maintenance and global defense and using gaming and simulation. I mean, think about that digital twin strategy and how you're doing that. For migration, one of the big ones we see emerging today is data-lead migration. You know, we have been focused on applications and mainframes, but data has gravity. And so we are seeing so many partners and our customers demanding to get their data from on premises to the cloud so that now they can make real time business decisions. And then on modernization. You know, we talked a lot about artificial intelligence and machine learning. Containers are wicked hot right now, provides you portability and performance. I was with a startup last night that just moved everything they're doing to ECS our Container strategy. And then we're also seeing, you know, crippin, quantum blockchain, no code, low code. So the same big focus, mission migration, modernization, but the underpinnings are going to shift a little bit beyond 2021. >> That's great stuff. And you know, you have first of all people don't might not know that your group partners and Amazon Web Services public sector, has a big surface area. You talking about government, health care, space. So I have to ask you, you guys announced in March the space accelerator and you recently announced that you selected 10 companies to participate in the accelerated program. So, I mean, this is this is a space centric, you know, targeting, you know, low earth orbiting satellites to exploring the surface of the Moon and Mars, which people love. And because the space is cool, let's say the tech and space, they kind of go together, right? So take us through, what's this all about? How's that going? 
What's the selection, give us a quick update, while you're here, on this space accelerator selection, because (indistinct) will have had a big blog post that went out (indistinct). >> Yeah, I would be thrilled to do that. So I don't know if you know this. But when I was young, I wanted to be an astronaut. We just helped through (indistinct), one of our partners, reach Mars. So Clint, who is a retired general, and myself got together, and we decided we needed to do something to help startups accelerate in their space mission. And so we decided to announce a competition for 10 startups to get extra help both from us, as well as a partner, Seraphim, on space. And so we announced it, everybody expected the companies to come from the US, John, they came from 44 different countries. We had hundreds of startups enter, and we took them through this six-week classroom education. So we had our General Clint, you know, helping and teaching them in space, which he's done his whole life, we provided them with AWS credits, they had mentoring by our partner, Seraphim. And we just down selected to 10 startups, that was what Werner's blog post was. If you haven't read it, you should look at some of the amazing things that they're going to do, from, you know, farming asteroids to, you know, helping with some of the, you know, using small vehicles to connect to larger vehicles, when we all get to space. It's very exciting. Very exciting, indeed. >> You have so much good content areas and partners, exploring, it's a very wide vertical or sector that you're managing. Is there any pattern? Well, I want to get your thoughts on post COVID success again, is there any patterns that you're seeing in terms of the partner ecosystem? You know, whether it's business model, or team makeup, or more mindset, or just how they're organizing that that's been successful? Is there like a, do you see a trend? Is there a certain thing, then I've got the working backwards thing, I get that. 
But like, are there any other observations? Because I think people really want to know, am I doing it right? Am I being a good manager, when you know, people are going to be working remotely more? We're seeing more of that. And there's going to be now virtual events, hybrid events, physical events, the world's coming back to normal, but it's never going to be the same. Do you see any patterns? >> Yeah, you know, we're seeing a lot of small partners that are making an entrance and solving some really difficult problems. And because they're so focused on a niche, it's really having an impact. So I really believe that that's going to be one of the things that we see, a focus on individual creators and companies who are really tightly aligned and not trying to do everything, if you will. I think that's one of the big trends. I think the second, and we talked about it a little bit, John, is that you're going to see a lot of focus on mission, because of that purpose. You know, we've talked about #techforgood, with everything going on in the world. As people have been working from home, they've been reevaluating who they are, and what they stand for, and people want to work for a company that cares about people. I just posted my human footer on LinkedIn. And I got my first over a million hits on LinkedIn, just by posting this human footer, saying, you know what, reply to me at a time that's convenient for you, not necessarily for me. So I think we're going to see a lot of this purpose-driven mission, that's going to come out as well. >> Yeah, and I also noticed that, and I was on LinkedIn, I got a similar reaction when I started trying to create more of a community model, not so much have people attend our events, and we need butts in the seats. It was much more personal, like we wanted you to join us, not attend and be like a number. You know, people want to be part of something. This seems to be the new mission. >> Yeah, I completely agree with that.
I think that, you know, people do want to be part of something and they want to be part of the meaning of something too, right. Not just be part of something overall, but to have an impact themselves, personally and individually, not just as a company. And I think, you know, one of the other trends that we saw coming up too, was the focus on technology. And I think low code, no code is giving a lot of people entry into doing things I never thought they could do. So I do think that technology, artificial intelligence, Containers, low code, no code, blockchain, those are going to enable us to do even greater mission-based solutions. >> Low code, no code reduces the friction to create more value, again, back to the value proposition. Adding value is the key to success, your partners are doing it. And of course, being part of something great, like the Global Public Sector Partner Awards list is a good one. And that's what we're talking about here. Sandy, great to see you. Thank you for coming on and sharing your insights and an update and talking more about the 2021 Global Public Sector Partner Awards. Thanks for coming on. >> Thank you, John, always a pleasure. >> Okay, the Global Leaders here presented on theCUBE, again, award winners doing great work in mission, modernization, again, adding value. That's what it's all about. That's the new competitive advantage. This is theCUBE. I'm John Furrier, your host, thanks for watching. (upbeat music)

Published Date : Jun 17 2021

SUMMARY :

Sandy, great to see you again. just want to give you props for and to our customers as well. So here's a list of some of the winners. And we also this year added in awards So I have to ask you, and they have, you know, Why is that the case? And the biggest and most I got to ask you on the secrets of success and I'd love to get your thoughts on And so they came to Presidio, And by the way, you make money doing it And then we're also seeing, you know, And you know, you have first of all that they're going to do, And there's going to be now that that's going to be like we wanted you to join us, And I think, you know, and talking more about the 2021, That's the new competitive advantage.

ENTITIES

Entity	Category	Confidence
Andy	PERSON	0.99+
John	PERSON	0.99+
Dave	PERSON	0.99+
Deloitte	ORGANIZATION	0.99+
Sandy Carter	PERSON	0.99+
Clint	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Sandy	PERSON	0.99+
Amazon Web Services	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
Collibra	ORGANIZATION	0.99+
March	DATE	0.99+
Australia	LOCATION	0.99+
AWS	ORGANIZATION	0.99+
US	LOCATION	0.99+
10 companies	QUANTITY	0.99+
21 categories	QUANTITY	0.99+
Jeff Bar	PERSON	0.99+
Databricks	ORGANIZATION	0.99+
900	QUANTITY	0.99+
80%	QUANTITY	0.99+
yesterday	DATE	0.99+
Mars	LOCATION	0.99+
2021	DATE	0.99+
GDIT	ORGANIZATION	0.99+
five times	QUANTITY	0.99+
first	QUANTITY	0.99+
Accenture	ORGANIZATION	0.99+
10 startups	QUANTITY	0.99+
EduTech	ORGANIZATION	0.99+
Datacom	ORGANIZATION	0.99+
last year	DATE	0.99+
IronNet	ORGANIZATION	0.99+
Keith Alexander	PERSON	0.99+
44 different countries	QUANTITY	0.99+
Global Public Sector Partner Awards	EVENT	0.99+
Two	QUANTITY	0.99+
this year	DATE	0.99+
four points	QUANTITY	0.99+
LinkedIn	ORGANIZATION	0.99+
IDC	ORGANIZATION	0.98+
six week	QUANTITY	0.98+
Presidio	ORGANIZATION	0.98+
@Sandy_Carter	PERSON	0.98+
one	QUANTITY	0.98+
CrowdStrike	ORGANIZATION	0.98+
Moon	LOCATION	0.98+
both	QUANTITY	0.97+
pandemic	EVENT	0.97+
Global Public Sector partner Awards	EVENT	0.97+
Central America	LOCATION	0.97+
last night	DATE	0.97+
today	DATE	0.97+
Reinvent	ORGANIZATION	0.97+
over 1900 public sector partners	QUANTITY	0.96+
first secret	QUANTITY	0.96+
Best Amazon Connect	ORGANIZATION	0.96+
DC	LOCATION	0.96+
Cognizant	PERSON	0.96+
One	QUANTITY	0.95+
Vernors	PERSON	0.95+
an hour	QUANTITY	0.95+
Sarafem	ORGANIZATION	0.95+
Cherokee Nation	ORGANIZATION	0.94+
General	PERSON	0.94+
thousands of CTOs	QUANTITY	0.94+
Pariveda	ORGANIZATION	0.93+
second	QUANTITY	0.93+

Ankit Goel, Aravind Jagannathan, & Atif Malik

>>From around the globe, it's theCUBE, covering Data Citizens '21, brought to you by Collibra. >>Welcome to theCUBE's coverage of Collibra Data Citizens '21. I'm Lisa Martin. I have three guests with me here today from Collibra customer Freddie Mac. Please welcome Jag, chief data officer and vice president of single-family data and decisions. Jag, welcome to theCUBE. >>Thank you, Lisa. Looking forward to being here. >>Excellent. Ankit Goel is here as well, vice president of data transformation and analytics solutions. Ankit, good to have you on the program. >>Thank you, Lisa. Great to be here. >>And Atif Malik, senior director from the single-family division at Freddie Mac, is here as well. Atif, welcome. So we have big congratulations in order. Freddie Mac was just announced at Data Citizens as the winner of the Collibra Excellence Award for Data Program of the Year. Congratulations on that. We're going to unpack that and talk about what that means, but I'd love to get familiar with the three of you. Jag, start with you. Talk to me a little bit about your background, your current role as chief data officer. >>Appreciate it, Lisa, thank you for the opportunity to share our story. My name is Aravind; everyone calls me Jag.
And as you said, I'm the single-family chief data officer at Freddie Mac. For those that don't know, Freddie Mac is a government-sponsored entity that supports the U.S. housing finance system, and single-family deals with the residential side of the marketplace. As CDO, I'm responsible for our managed content, data lineage, data governance, and business architecture, in which Collibra plays an integral role, as well as supporting our shared assets across the enterprise and our data monetization efforts, data product execution, decision modeling, as well as our business intelligence capabilities, including AI and ML for various use cases. As background, I started my career in New York and then moved to Boston, and for the last 20 years I've been living in the Northern Virginia, DC area. I've been fortunate to have been responsible for business operations, as well as to have led and executed large transformation efforts. That background has reinforced the power of data and how it's so critical to meeting our business objectives. Look forward to our dialogue today, Lisa, once again. >>Excellent. You have a great background and clearly not a dull moment in your job with Freddie Mac. Ankit, tell me a little bit about your background, your role, what you're doing at Freddie Mac. >>Definitely. Hi everyone. I'm Ankit Goel, vice president of data transformation and analytics solutions, and I work for Jag. I'm responsible for many of the things he said, including leading our transformation to the cloud and migrating all our existing data assets as part of that transformation journey. I'm also responsible for our business information and business data architecture, decision modeling, business intelligence, and some of the analytics and artificial intelligence. I started my career back in the day as a computer engineer, but I've always been in the financial industry, up in New York and now in the Northern Virginia area. I call myself that bridge between business and technology.
And I would say, I think over the last six years with data I've found that perfect spot where business and technology actually come together to solve real problems and really lead, you know, businesses to the next stage. So thank you, Lisa, for the opportunity today. >>Excellent. And we're going to unpack that; you call yourself the bridge between business and IT, and that's always such an important bridge. We're going to talk about that in just a minute, but Atif, I want to get your background; tell our audience about you. >>I'm Atif Malik, senior director of business data architecture and data transformation at Freddie Mac. I'm responsible for the overall business data architecture and transformation of the existing data onto the cloud data lake. My team is responsible for the Collibra platform and the business analysts that are using and maintaining the data in Collibra, and also driving the data architecture in close collaboration with our engineering teams. My background is, I'm an engineer at heart. I still do a lot of development. This is my first time crossing over the bridge onto the business side of maintaining data and working with data teams. >>Jag, let's talk about digital transformation. Freddie Mac is a 50-year-old and growing company. I always love talking with established businesses about digital transformation. It's pretty challenging. Talk to me about your initial plan and what some of the main challenges were that you were looking to solve. >>Great question, Lisa, and it's definitely pertinent, as you say, in our digital world, figuring out how we need to accomplish it.
If I look at our data modernization, it is a major program and effort in our division. What started as reducing cost, or looking at an infrastructure play, moving from physical data assets to the cloud, as well as enhancing our resiliency, has quickly morphed into meeting business demand and objectives, whether it be for sourcing, servicing or securitization of our loan products. So as we think about creating this digital data marketplace, we are basically forming and empowering a new data ecosystem, in which Collibra is definitely playing a major role. It's more than just a cloud-native data lake; it's bringing some of our current assets and capabilities into this new data landscape. >>So as we think about creating an information hub, part of the challenge, as you say, with 50 years of having millions of loans and millions of data across multiple assets, is figuring out that you still have to care for and feed legacy while you're building the new highway, and figuring out how you best transform, translate and move data and assets to this new platform. What we've been striving for is looking at what is the business demand, or what is the business use case, and what's the value, to help prioritize that transformation. The exciting part is, as you think about new uses of acquiring and distributing data, as well as new use cases for prescriptive and predictive analytics, the power of what we're building in our data lake, this new data ecosystem. We're feeling comfortable we'll meet the business demand, but as any CTO will tell you, demand always outpaces our capacity. And that's why we want to be very diligent in terms of our execution plan. So we're very excited as to what we've accomplished so far this year, and looking forward to the remainder of the year and as we go into 2022. >>Excellent, thanks, Jag. Atif, let's go to you.
As I mentioned in the intro, Freddie Mac has won the Collibra Excellence Award for Data Program of the Year. Again, congratulations on that, but I'd love to understand the Collibra center of excellence that you're building at Freddie Mac. First of all, define what a center of excellence is to Freddie Mac, and then what you're specifically building. >>Yeah, sure. So the Collibra center of excellence provides us the overall framework from a people and process standpoint to focus in on our use of Collibra and on adopting best practices. We can have teams that are focused just on developing best practices and implementing workflows and lineage within Collibra, and implementing and adopting a number of different aspects of Collibra. It provides the central hub of people being domain experts on the tool, who can then be leveraged by different groups within the organization to maintain the tool. >>Another follow-on question, Atif, for you. How does Freddie Mac define data citizens? Is it anybody in finance or sales or marketing or operations? What is that definition of data citizen? >>It's really everyone within the organization. They all consume data in different ways, and we provide a way of governing data and for them to get a better understanding of data from Collibra itself. So it's really everyone within the organization that way. >>Excellent. Ankit, let's go over to you. A big topic at Data Citizens '21 is collaboration. That's probably a word that we've used a ton in the last 15-plus months or so, as every business really pivoted quickly to figure out how do we best collaborate.
But something that you talked about in your intro is being the bridge between business and IT. I want to understand from your perspective, how can data teams help to drive improved collaboration between business and IT? >>The collaboration between business and technology has been a key focus area for us over the last few years. We actually started an agile transformation journey two years ago that we called modern delivery. And that was about moving away from project teams to persistent product teams that brought business and technology together. And we've really been able to pioneer that in the data space within Freddie Mac, where we now have teams with product owners coming from the data team and full-stack IT developers with them, creating these combined teams to meet the business needs. We found that bringing these teams together really removed the barriers that were there in the interaction, and the employee satisfaction has been high. And like you said, over the last 16 months with the pandemic, we've actually seen the productivity stay the same or even go up, because the teams were all working together, they work as a unit, and they all have the sense of ownership versus working on a project that has a finite end date. So, you know, we've been really lucky with having started this two years ago. >>Well, that's great. And congratulations on either maintaining productivity or having it go up during the last 16 months, which have been incredibly challenging. Jag, I want to ask you, what does winning this award from Collibra mean to you and your team, and does this signify that you're really establishing a data-first culture? >>Great question, Lisa, again. I think winning the award, just from a team standpoint, is a great honor. Collibra has been a fantastic partner.
And when I think about the journey of going from spreadsheets, right, that all of us had in the past, to now having all our business glossary terms and lineage, and really being at the forefront of our data monetization. So as we think about moving to the cloud, Collibra is step in step with us as an integral part of that holistic delivery model. Ultimately, as a CDO, it's really the team's honor and effort, because this has been a multi-year journey to get here. And it's great that Collibra as a partner has helped us achieve some of these goals, but also recognized where we are in terms of looking at data as a product, and some of our leading, forefront work in using that holistic delivery to meet our business objectives. So overall, pretty jazzed when we found out that we won the data program of the year at Collibra, and very honored to win this award. >>That's where we've got to bring back "I'm jazzed." I like that. Jag, sticking with you, let's unpack a little bit some of those positive results, those business outcomes that you've seen so far from the data program. What are those? >>Yeah. So again, if you were thinking about a traditional CDO model, what were the terms that would have been used a few years ago? It was around governance, and it may have been viewed as oversight, with maybe less talk of monetization, of what the business values were that you needed to accomplish. Collectively, it's really those three building blocks: managing content, you've got to trust the source, but ultimately it's empowering the business. So the best success that I could point to at Freddie, as you're moving to this digital world, is really empowering the business to figure out the new capabilities and demand and objectives that we're meeting. We're not going to be able to transform the mortgage industry.
Or any industry, if we're still stuck in old-world thinking, and ultimately data is going to be the blood that has to enable those capabilities. >>So if you ask me the best business success, we're no longer saying, okay, I've got my data governance, what do we have to do? It's all embedded together. And as I alluded to, that partnership between business and IT in forming that data-as-a-product approach, where you're now delivering capabilities holistically from program teams all across data. It's no longer an afterthought. As I said a few minutes ago, you're able to then meet the demand, what's current, and how do we want to think about going forward? So it's no longer buzzwords of digital data marketplace. What is the value of that? And that's the success, I think, of our group collectively working across the organization. It's not just one team, it's across the organization. And we have our partners, our operations, everyone from business owners, all swimming in the same direction with, I would say, critical management support. So top of the house, our head of business, my boss, who is the COO, is fully supportive in terms of how we're trying to execute, and that's critical, because when there are potential trade-offs, we're all looking at it collectively as an organization. >>Right. And that's the best viewpoint to have, that sort of centralized, unified vision. And as you say, Jag, the support from up top. Atif, I want to ask you, you established the Collibra center of excellence. What are you focused on now? >>So we're really focused on allowing our users to consume data and understand data, and really democratizing data so that they can really get a better understanding of it. So that's a lot of our focus, engaging with Collibra and getting them to start to define things in the Collibra platform. That's a lot of the focus right now. >>Excellent.
I want to stay with you for one more question, Atif, one that I'm going to ask all of you: what are you most excited about? There's a lot of success that you've talked about in transforming a legacy institution. What are you most excited about, and what are the next steps for the data program? Atif, what are your thoughts? >>Yeah, so really modernizing onto a cloud data lake and allowing all of the users at Freddie Mac to consume data with the level of governance that we need around it is an exciting proposition for me. >>What would you say is most exciting to you? >>I'm really looking forward to the opportunities that artificial intelligence has to offer, not just in the augmented analytics space, but in the overall data management life cycle. There's still a lot of things that are manual in the data management space. And I personally believe artificial intelligence has a huge role to play there. >>And Jag, same question to you. It seems like you have a really strong collaborative team. You have a very collaborative relationship with management and with Collibra. What are you excited about? What's coming down the pipe? >>So Lisa, if I look at it, you know, we sit back here in June 2021; where were we a year ago? And you think about a lot of the capabilities and some of the advancements that we made in just a year, sitting virtually. Using that word, jazzed or enthused, we're feeling really great. We made a lot of accomplishments. I'm excited about what we're going to be doing for the next year. So there are other use cases, and I could talk about AI/ML, and Ankit talked about, you know, our new ecosystem. Seeing those use cases come to fruition, so that we are contributing value from a business standpoint to the organization, is what really keeps me up at night and gets me up in the morning, and I'm really feeling enthused for the entire division. >>Excellent. Well, thank you. I want to thank all three of you for joining me today.
Talking about the successes that Freddie Mac has had transforming in partnership with Collibra. Again, congratulations on the Collibra Excellence Award for the data program. It's been a pleasure talking to all three of you. I'm Lisa Martin. You're watching theCUBE's coverage of Collibra Data Citizens '21.

Published Date : Jun 17 2021

SUMMARY :

21 brought to you by Colibra Welcome to the cubes coverage of Collibra data citizens 21. Good to have you on the program. but I'd love to get familiar with the 3d Jack. has reinforced the power of data and how, how it's so critical to And tell me a little bit about your background, your role, what you're doing at Freddie to solve real problems and, and really lead, um, you know, businesses to the next stage of, We're going to talk about that in just a minute, but I want to get your background, tell our audience about you. Uh, I'm responsible for the overall business data architecture and transformation Talk to me about your initial plan and what some of the main challenges were that Uh, great question, Lisa, and, uh, it's definitely pertinent as you say, building the new highway and figuring out how you best have to transform and translate As I mentioned in the intro of that Freddie Mac has won So the Cleaver center of excellence provides us the overall framework from a people What does that definition of data citizen? So it's really everyone within the organization is being the bridge between business and it, I want to understand from your perspective, over the last 16 months with the pandemic, we've actually seen the productivity this award from Collibra what does this mean to you and your team and the past to now having all our business class returns lineage, I liked that jug sticking with you, let's unpack a little bit, it's really empowering the business to figure out the new capabilities and demand and objectives that we're meeting. 
And as I alluded to And as you say, JAG, the support from, from up top, uh, I'd see if I want to ask you, So that's a lot of our focus and engaging with Collibra and getting them to Want to stay with you one more question and take that I'm gonna ask to all of you, what are you most excited all of the users and, uh, Freddie Mac to consume data with the I'm really looking forward to the opportunities that artificial intelligence has to offer, with Collibra, what are you excited about? So Lisa, if I look at it, you know, we sit back here June, 2021, where were we a year ago? congratulations on the Culebra excellence award for the data program.

ENTITIES

Entity	Category	Confidence
Lisa Martin	PERSON	0.99+
Atif Malik	PERSON	0.99+
Lisa	PERSON	0.99+
Alec Malek	PERSON	0.99+
June, 2021	DATE	0.99+
Boston	LOCATION	0.99+
Ankit Goel	PERSON	0.99+
New York	LOCATION	0.99+
Jack	PERSON	0.99+
Freddie Mac	ORGANIZATION	0.99+
50 years	QUANTITY	0.99+
Arvind	PERSON	0.99+
Aravind Jagannathan	PERSON	0.99+
JAG	PERSON	0.99+
Collibra	ORGANIZATION	0.99+
2022	DATE	0.99+
Kay	PERSON	0.99+
Jackson	PERSON	0.99+
two books	QUANTITY	0.99+
Matt	PERSON	0.99+
Northern Virginia DC	LOCATION	0.99+
Freddie	ORGANIZATION	0.99+
Northern Virginia	LOCATION	0.99+
three guests	QUANTITY	0.99+
today	DATE	0.99+
next year	DATE	0.99+
two years ago	DATE	0.99+
a year ago	DATE	0.98+
Colibra	TITLE	0.98+
first time	QUANTITY	0.98+
this year	DATE	0.97+
Freddy	ORGANIZATION	0.97+
pandemic	EVENT	0.97+
OCHA	ORGANIZATION	0.97+
three	QUANTITY	0.97+
three building blocks	QUANTITY	0.97+
Kleber	ORGANIZATION	0.96+
CDO	ORGANIZATION	0.96+
Freddy	PERSON	0.94+
last 16 months	DATE	0.94+
Mac	ORGANIZATION	0.94+
Colibra	ORGANIZATION	0.93+
one more question	QUANTITY	0.93+
First	QUANTITY	0.93+
50 year old	QUANTITY	0.92+
Kleber	PERSON	0.91+
millions of data	QUANTITY	0.9+
millions of loans	QUANTITY	0.9+
single	QUANTITY	0.89+
few years ago	DATE	0.89+
AIML	ORGANIZATION	0.86+
Culebra excellence award	TITLE	0.85+
Cleaver	PERSON	0.83+
one team	QUANTITY	0.83+
few minutes ago	DATE	0.82+
Freddie Mac	ORGANIZATION	0.81+
3d	QUANTITY	0.81+
Culebra	ORGANIZATION	0.8+
Libra	TITLE	0.8+
U	LOCATION	0.8+
last six years	DATE	0.78+
Garland	ORGANIZATION	0.78+
Columbia	LOCATION	0.74+
Malik	PERSON	0.74+
Kleberg	ORGANIZATION	0.73+
Libra	ORGANIZATION	0.72+

Michael Kuzma, Lockheed Martin

>> Announcer: From around the globe, it's theCUBE, covering Data Citizens '21, brought to you by Collibra. >> Everybody, John Walls here on theCUBE, continuing our coverage of Data Citizens '21 with Michael Kuzma, who is a Senior Data Engineer at Lockheed Martin, but he is not just any Senior Data Engineer. He is the Collibra Ranger of the Year, an outstanding award that certainly honors Michael's dedication to training, evaluation, and development. He is the top dog. And so it is our real pleasure to welcome Michael in this morning. Michael, first off, congratulations on the recognition. I know it is well deserved, but I'm certain it's been a long time in the making for you. So congratulations on that. >> Thanks, John, thanks so much. >> Yeah, let's talk about the award a little bit here because you're the top Collibra Ranger. The fact that you've undergone this intensive training and evaluation process, what is that doing for you in terms of your professional development and what you're able to provide Lockheed Martin? >> Well, I think the ranger program definitely has helped with my understanding of the tool. First of all, we're standing up Collibra as sort of the key pillar of data governance within Lockheed Martin. So it's important to have people who are subject-matter experts on the tool that can help the different business areas to be able to stand up and just extract as much value as they can from it. >> Yeah, why did this matter to you? I mean, a lot of work went into this, and to reach the pinnacle required, I know, sacrifice and commitment on your part and on your team's part for that matter. But why was this of paramount importance to you? >> Well, I think it was partially because I was early on in my Collibra journey when I took the ranger certification and went through it. So it definitely helped to solidify my understanding of the tool and get more into it.
That way I can just provide that value to the customers. We also wanted to see what would it look like for other people at Lockheed Martin to become rangers and get proficient in the tool. So I was kind of the Guinea pig for Lockheed and we were evaluating just how it would help us with standing it up. >> Yeah, I mean, talk about the process, if you will a little bit and share with us just what you went through in terms of how many hours this required, what kind of work you had to do, what kind of training and the evaluation process. So kind of take us through there from A to Z if you will, on your journey. >> Yeah, well, it started off, we had to get a virtual environment stood up just so that we could do some of the exercises that the ranger certification requires. So that was an intensive process of just making sure we had all the infrastructure in place to run the sandbox environment. And then once we got that up, it was mainly doing the exercises of, you're provided with the data landscape. How are you going to represent it in the tool? That way your users both business and technical users could go in and see the data that's in there and be able to get value, be able to get insights from it. And I think it was challenging for sure, to just figure out what all is required for standing up the Collibra environment 'cause that was a piece of the ranger not only how to work the tool, but how to stand it up, how to administrate it and in an effective way and get the metamodel set up in an effective way that way you have that longterm sustainability. So it was good seeing all of those different pieces come together. And then after you put it all together, I had the interviews with the Collibra team where you go over everything you did. So it definitely helps when you have to explain it to somebody they're asking questions. 
It sort of provides you with that dry run for when people in your business area and your company are going to be trying to use the tool and they might not understand it or what value it can provide. So having that interview is almost like a dry run, so that you can then help customers when they have questions and come to you. >> Yeah, how helpful was that? I mean, you raised a point, interesting point and really thought about that. You're basically going before the board, if you will, and answering a lot of hows and whys about your process, your thinking process, and what you put into place and how you implemented the tools, what have you. What did you find interesting about that? Or what did you find out about yourself perhaps in your knowledge base through that process? >> I definitely think it stretched my knowledge base for it. It was definitely nerve-wracking having to go in and explain your rationale to people, but it turned out well. And I feel like if you can explain something, like if you do your prep work and you're able to explain it to somebody else, it sort of proves that you have the true understanding on your side of it. So it was definitely a lot of prep work to just anticipate all the different questions, figure it out on my side first and then be able to answer it effectively. >> Yeah, we all like softballs, but what about curve balls? Were there any curve balls that perhaps came up in that evaluation process? They're like, "Oh, no, I hadn't thought of that. Or I didn't anticipate that." You know sometimes it's those curve balls that really keep us on our toes. >> Yeah, I can't remember any specific questions. I do remember getting thrown some of those curve balls where you give the answer you think is sufficient and then there are the follow-on questions that build on that, where you're like, "Okay, well, I didn't think of that." And so you're trying to think through it on the spot.
So I definitely got some of those. I don't remember the exact questions, but it definitely helps to be prepared. >> Yeah, it keeps you on your toes for sure. You mentioned the value of this, perhaps within Lockheed Martin, and being, I think, a great example for others within your organization. What about just kind of in the data community at large or your colleagues at other enterprises? What would you say to them in terms of the value in pursuing this kind of honor, this kind of recognition, and how it could be put into good use in their work on the day-to-day side of operations? >> Well, I think for people who are early on and trying to stand it up, the video curriculum definitely helped me out for sure. Learning about both the administrative side, as well as how to use the tool as an end user. If you can put yourself in the mindset of an end user, that's where you can really figure out where the most value is going to be coming from. And it was also good just getting that hands-on experience in a sandbox environment, where you could build it out and not have to worry about it breaking anything for your organization, but also figuring out how you are going to set up the metamodel and get it working before people populate the tool. 'Cause it's a lot harder to make updates when people are using it. It's good to try to get that as well established upfront as possible. So it's definitely good to get that hands-on experience with standing that up. And I think it helps you sort of think through all the different intricacies and nuances for standing up your own environment and getting the most value for your company. >> You know, let's talk about Lockheed Martin a little bit, and obviously, I'm going to take it that everybody's pretty well familiar with your work. I mean, 110,000 employees, a worldwide footprint, and obviously security and data security is of critical importance.
What does Collibra do for you in that respect in terms of whatever peace of mind you might get in terms of data privacy and data security and reliability, all these things that really factor, I would assume, into Lockheed Martin's operations? >> Yeah, it does and we're still thinking through all of the things especially with classified information, but it being metadata helps a lot. People are a lot less apprehensive knowing that it's just metadata in the tool. You're not actually keeping the data itself in the tool. So that way we can still have our security pieces on the underlying data. It's more for that discovery piece for us and we're able to see what shared reports are out there, to be able to get lineage for different systems and help people's business understanding of the things that are out there, and the technical users as well, getting value from the lineage and system setups. So I think being able to lock down the view permissions helps too; you know, it puts people's minds at ease if you're able to say, "Okay, well, we can make sure only certain people are able to see this." You know, we have some of those built-in as well. >> Yeah, I mean, that's something I know you've done a lot at Lockheed in terms of working on the tech side and the non-tech side. And trying to explain policies, governance, and determining accessibility and putting the right governance controls in place. From a data perspective, again, sharing your insights, what you have learned in that regard at Lockheed Martin, what would you say to your fellow data colleagues if you will, again, at other enterprises in terms of getting that kind of collaboration and feedback and input from not just the tech side but also the non-tech side of your house? 
>> Yeah, it's definitely important to get that business side as well because the technical users, while they work with it so much, might not understand that business users are not going to know what all of these things mean and that they're going to need some sort of human readable version of it. So we have people from the different business areas, both business representatives and technical representatives, who we work with on a consistent basis to get that continual feedback. And that way we're getting what are the priorities from both sides and seeing sort of where the synergies are across the different business areas as well. That way we're not duplicating effort, but we're trying to make it a comprehensive tool that everybody can use. >> Now I know that your relationship at Lockheed Martin with Collibra goes back some four years now. So you have a maturing relationship for sure. And the value there seems to be pretty well-documented. What would you say to others in your space, again, not just about Collibra, but about the data, evolution of data in general in terms of giving advice to somebody who's looking at this as a career, or maybe somebody who is just now getting into a more sophisticated look at their data footprint? >> Yeah it's definitely a large field. There's always new things to learn. It's always evolving too. So I think that that first step for an individual is to be willing to learn those new things, to learn those new systems, processes, ways of thinking and take on tasks that sort of stretch you in your career. Things that you might not have said yes to before but saying yes could give you more of a comprehensive view of the business or give you a better data view as well. And from the company, it's just trying to figure out where the most value lies. Trying to get everybody sort of on the same page; when it's the wild west, it becomes a lot harder to extract value and move towards value. 
So trying to get everybody standardized but also give them the flexibility for their individual program or business needs but try to keep people to where there's a common understanding of the data. >> Now, spoken by someone who's been there and is doing that, Michael, we appreciate the insights. And once again, congratulations on the honor. It is well-deserved. >> Thank you. Thank you. >> You bet. Michael Kuzma joining us from Lockheed Martin as the Collibra Ranger of the Year. We continue our discussion here, Data Citizens '21 on theCUBE. (upbeat music)

Published Date : Jun 17 2021


ENTITIES

Entity	Category	Confidence
Michael	PERSON	0.99+
John	PERSON	0.99+
Michael Kuzma	PERSON	0.99+
Lockheed	ORGANIZATION	0.99+
Lockheed Martin	ORGANIZATION	0.99+
Lockheed Martin	ORGANIZATION	0.99+
John Walls	PERSON	0.99+
110,000 employees	QUANTITY	0.99+
Collibra	ORGANIZATION	0.99+
both sides	QUANTITY	0.98+
four years	QUANTITY	0.97+
both	QUANTITY	0.97+
Clipper	ORGANIZATION	0.97+
First	QUANTITY	0.96+
first	QUANTITY	0.96+
first step	QUANTITY	0.95+
Data Citizens '21	TITLE	0.95+
this morning	DATE	0.82+
both business	QUANTITY	0.71+
Collibra	LOCATION	0.67+
Data	TITLE	0.62+
'21	TITLE	0.57+
Citizens	ORGANIZATION	0.54+
theCUBE	ORGANIZATION	0.48+
'21	DATE	0.42+
Collibra	TITLE	0.39+
