Stijn Christiaens | Data Citizens 22
>>Hey everyone. I'm Lisa Martin covering Data Citizens 22, brought to you by Collibra. This next conversation is gonna focus on the importance of data culture. One of our CUBE alumni is back: Stijn Christiaens is Collibra's co-founder and its Chief Data Citizen. Stijn, it's great to have you back on theCUBE. >>Hey, Lisa, nice to be here. >>So we're gonna be talking about the importance of data culture, data intelligence, maturity, all those great things. When we think about the data revolution that every business is going through, you know, it's so much more than technology innovation. It also really requires cultural transformation, community transformation. Those are challenging for customers to undertake. Talk to us about what you mean by data citizenship and the role that creating a data culture plays in that journey. >>Right. So as you know, our event is called Data Citizens because we believe that in the end, a data citizen is anyone who uses data to do their job. And we believe that in today's organizations, you have a lot of people, most of the employees in an organization, who are somehow going to be a data citizen, right? So you need to make sure that these people are aware of it. You need to make sure that these people have the skills and competencies to do with data what is necessary. And that's on all levels, right? So what does it mean to have a good data culture? It means that if you're building a beautiful dashboard to try and convince your boss that we need to make this decision, your boss is also open to and able to interpret, you know, the data presented in that dashboard to actually make that decision and take that action, right? >>And once you have that widely through the organization, that's when you have a good data culture. Now, that's a continuous effort for most organizations because they're always moving somehow; they're hiring new people.
And it has to be a continuous effort because we've seen that on the one hand, organizations continue to be challenged with controlling their data sources and where all the data is flowing, right? Which in itself creates a lot of risk. But also on the other side of the equation, you have the benefits. You know, you might look at regulatory drivers like, we have to do this, right? But it's much better right now to consider the competitive drivers, for example. And we did an IDC study earlier this year, quite interesting. I can recommend anyone to read it. And one of the conclusions they found, as they surveyed over a thousand people across organizations worldwide, is that the ones who are higher in maturity, so the organizations that really look at data as an asset, look at data as a product, and actively try to be better at it, have three times as good a business outcome as the ones who are lower on the maturity scale, right? So you can say, okay, I'm doing this, you know, data culture for everyone, awakening them as data citizens. I'm doing this for competitive reasons, I'm doing this for regulatory reasons. You're trying to bring both of those together, and the ones that get data intelligence right are just going to be more successful and more competitive. That's our view, and that's what we're seeing out there in the market. >>Absolutely. We know that just generally, Stijn, right? The organizations that are really creating a data culture and enabling everybody within the organization to become data citizens are, we know that in theory, more competitive and more successful. But the IDC study that you just mentioned demonstrates they're three times more successful and competitive than their peers. Talk about how Collibra advises customers to create that community, that culture of data, when it might be challenging for an organization to adapt culturally.
>>Of course, of course it's difficult for an organization to adapt, but it's also necessary. As you just said, imagine that, you know, you're a modern-day organization with phones, laptops, what have you, and you're not using those IT assets, right? Or, you know, you're delivering them throughout the organization, but not enabling your colleagues to actually do something with that asset. The same thing is true with data today, right? If you are not properly using the data assets and your competitors are, they're going to get more advantage. So as to how you get this done, or how you establish this culture, there's a few angles to look at, I would say, Lisa. So one angle is obviously the leadership angle, whereby whoever is the boss of data in the organization, and you typically have multiple bosses there, like chief data officers. Sometimes there's multiple, but they may have a different title, right? >>So I'm just gonna summarize it as a data leader for a second. So whoever that is, they need to make sure that there's a clear vision, a clear strategy for data. And that strategy needs to include the monetization aspect. How are you going to get value from data? >>Yes. >>Now that's one part, because then you can clearly see the example of your leadership in the organization and also the business value. And that's important because those people, their job in essence really is to make everyone in the organization think about data as an asset. And I think that's the second part of the equation of getting that culture right: it's not enough to just have that leadership out there, but you also have to get the hearts and minds of the data champions across the organization. You really have to win them over.
And if you have those two combined, and obviously a good technology to, you know, connect those people and have them execute on their responsibilities, such as a data intelligence platform like Collibra, then you have the pieces in place to really start upgrading that culture inch by inch, if you will. >>Yes, I like that. The recipe for success. So you are the co-founder of Collibra. You've worn many different hats along this journey. Now you're building Collibra's own data office. I like how, before we went live, we were talking about how Collibra is drinking its own champagne. I always love to hear stories about that. You're speaking at Data Citizens 2022. Talk to us about how you are building a data culture within Collibra and what maybe some of the specific projects are that Collibra's data office is working on. >>Yes, and it is indeed Data Citizens. There are a ton of speakers here, very excited. You know, we have Barb from MIT speaking about data monetization. We have DJ Patil at the last minute on the agenda. So a really exciting agenda. Can't wait to get back out there. But essentially you're right. So over the years at Collibra, we've been doing this now since 2008, so a good 15 years. And I think we have another decade of work ahead in the market, just to be very clear. Data is here to stick around, as are we. And myself, you know, when you start a company, we were four people in a garage, if you will. So everybody's wearing all sorts of hats at that time. But over the years I've run, you know, pre-sales at Collibra, I've run post-sales, partnerships, product, et cetera. And as our company got a little bit biggish, we're now 1,200, something like that, people in the company, I believe systems and processes become a lot more important, right? >>So we said, you know, Collibra isn't the size of our customers yet, but we're getting there in terms of organization, structure, process, systems, et cetera.
So we said, it's really time for us to put our money where our mouth is and to set up our own data office, which is what we were seeing all of our customers are doing, and which is what we're seeing that organizations worldwide are doing. And Gartner was predicting this as well. They said, okay, organizations have an HR unit, they have a finance unit, and over time they'll all have a department, if you will, that is responsible somehow for the data. So we said, okay, let's try to set an example at Collibra. Let's try to set up our own data office in such a way that other people can take away from it, right? >>So we set up a data strategy, we started building data products, took care of the data infrastructure, that sort of good stuff. And in doing all of that, Lisa, exactly as you said, we said, okay, we need to also use our own product and our own practices, right? And from that use, learn how we can make the product better, learn how we can make the practice better, and share that learning with all of the market, of course. And on the Monday mornings, we sometimes refer to that as eating our own dog food. On Friday evenings, we refer to that as drinking our own champagne. >>I like it. >>So we had the driver to do this, you know, there's a clear business reason. So we included that in the data strategy, and that's a little bit of our origin. >>Now how do we organize this? We have three pillars, and by no means is this a template that everyone should follow. This is just the organization that works at our company, but it can serve as an inspiration. So we have a pillar which is data science: the data product builders, if you will, or the people who help the business build data products. We have the data engineers, who help keep the lights on for that data platform to make sure the data products can run, the data can flow and, you know, the quality can be checked.
And then we have a data intelligence or data governance pillar, where we have those data governance, data intelligence stakeholders who help the business as a sort of data partner to the business stakeholders. So that's how we've organized it. And then we started following the Collibra approach, which is, well, what are the challenges that our business stakeholders have in HR, finance, sales, marketing, all over? >>And how can data help overcome those challenges? And from those use cases, we then just started to build a roadmap and started execution on use case after use case. And a few important ones there are very simple; we see them with all our customers as well. People love talking about the catalog, right? The catalog for the data scientists to know what's in their data lake, for example, and for the people in legal and privacy, so they have their process registry and they can see how the data flows. So that's a popular starting place. And that turns into a marketplace, so that if new analysts and data citizens join Collibra, they immediately have a place to go to, to look and see, okay, what data is out there for me as an analyst or a data scientist or whatever to do my job, right? >>So they can immediately get access to the data. And another one that we did is around trusted business reporting. We're seeing that since 2008, you know, self-service BI allowed everyone to make beautiful dashboards, you know, with pie charts. My pet peeve is the pie charts, because I love pie and you shouldn't always be using pie charts. But essentially there's been a proliferation of those reports. And now executives don't really know, okay, should I trust this report or that report? They're reporting on the same thing, but the numbers seem different, right? So that's why we have trusted business reporting.
So we know, if a report, a dashboard, a data product essentially, is built, we know that all the right steps are being followed and that whoever is consuming that can be quite confident in the result. >>Right. [inaudible] Absolutely key. >>Exactly. Yes. >>Absolutely. Talk a little bit about some of the key performance indicators that you're using to measure the success of the data office. What are some of those KPIs? >>KPIs and measuring is a big topic in the chief data officer profession, I would say, and again, it always varies with respect to your organization, but there's a few that we use that might be of interest to you. So remember we have those three pillars, right? And we have metrics across those pillars. So for example, a pillar on the data engineering side is gonna be more related to that uptime, right? Is the data platform up and running? Are the data products up and running? Is the quality in them good enough? Is it going up? Is it going down? What's the usage? But also, and especially if you're in the cloud and if consumption is a big thing, you have metrics around cost, for example, right? So that's one set of examples. Another one is around the data science and the products. >>Are people using them? Are they getting value from it? Can we calculate that value in a monetary perspective, right? So that we can continue to say to the rest of the business that we're tracking on those numbers, and those numbers indicate that value is generated and how much value, estimated, in that region. And then you have some data intelligence, data governance metrics, which is, for example, you have a number of domains. In a data mesh, people talk about being the owner of a data domain, for example, like product or customer. So how many of those domains do you have covered? How many of them are already part of the program? How many of them have owners assigned? How well are these owners organized, executing on their responsibilities?
How many tickets are open, closed? How many data products are built according to process? And so on and so forth. So these are a set of examples of KPIs. There's a lot more, but hopefully those can already inspire the audience. >>Absolutely. So we've talked about the rise of chief data offices; it's only accelerating. You mentioned this is like a 10-year journey. So if you were to look into a crystal ball, what do you see in terms of the maturation of data offices over the next decade? >>So we've seen indeed the role sort of grow up. I think in 2010 there may have been like 10 chief data officers or something. Gartner has exact numbers on them, but then they grew, you know, to 400; they were mostly in financial services, but they expanded then to all industries and then to all of the [inaudible]. The number is estimated to be about 20,000 right now. >>Wow. >>And they evolved in a sort of stack of competencies: defensive data strategy, because the first chief data officers were more regulatory driven, offensive data strategy, support for the digital program, and now all about data products, right? So as a data leader, you now need all of those competencies and need to include them in your strategy. >>How is that going to evolve for the next couple of years? I wish I had one of those crystal balls, right? But essentially I think for the next couple of years there's gonna be a lot of people, you know, still moving along with those four levels of the stack. A lot of people I see are still in version one and version two of the chief data officer. So you'll see over the years that's going to evolve to more digital and more data products. So for the next three to five years, my prediction is it's all going to be about data products, because it's an immediate link between the data and the dollar, essentially, right? So that's gonna be important, and quite likely some new things will be added on, which nobody can predict yet.
But we'll see those pop up in a few years. >>I think there's gonna be a continued challenge for the chief data officer role to become a real executive role as opposed to, you know, somebody who claims that they're executive, but then they're not. Right? So the real reporting level into the board, into the CEO for example, will continue to be a challenging point. But the ones who do get that done will be the ones that are successful. >>Yeah. >>And the ones who get that done will be the ones that do it on the basis of data monetization, right? Connecting value to the data and making that very clear to all the data citizens in the organization, right? Really. And in that value chain, they'll need to have both, you know, technical audiences and non-technical audiences aligned, of course. And they'll need to focus on adoption. Again, it's not enough to just have your data office be involved in this. It's really important that you're waking up data citizens across the organization and you make everyone in the organization think about data as an asset. >>Absolutely. Because there's so much value that can be extracted if organizations really strategically build that data office and democratize access across all those data citizens. Stijn, this is an exciting arena. We're definitely gonna keep our eyes on this. Sounds like a lot of evolution and maturation coming from the data office perspective and from the data citizen perspective. And as the data show that you mentioned in that IDC study, and you mentioned Gartner as well, organizations have so much more likelihood of being successful and being competitive. So we're gonna watch this space. Stijn, thank you so much for joining me on theCUBE at Data Citizens 22. We appreciate it. >>Thanks for having me over. >>From Data Citizens 22, I'm Lisa Martin. You're watching theCUBE, the leader in live tech coverage.
Ankit Goel, Aravind Jagannathan, & Atif Malik
>>From around the globe, it's theCUBE, covering Data Citizens 21, brought to you by Collibra. >>Welcome to theCUBE's coverage of Collibra Data Citizens 21. I'm Lisa Martin. I have three guests with me here today from Collibra customer Freddie Mac. Please welcome Jag, chief data officer and vice president of single family data and decisions. Jag, welcome to theCUBE. >>Thank you, Lisa. Look forward to being here. >>Excellent. Ankit Goel is here as well, vice president of data transformation and analytics solutions. Ankit, good to have you on the program. >>Thank you, Lisa. Great to be here. >>And Atif Malik, senior director from the single family division at Freddie Mac, is here as well. Atif, welcome. So we have big congratulations in order. Freddie Mac was just announced at Data Citizens as the winner of the Collibra Excellence Award for Data Program of the Year. Congratulations on that. We're going to unpack that and talk about what that means, but I'd love to get familiar with the three of you. Jag, start with you. Talk to me a little bit about your background and your current role as chief data officer. >>Appreciate it, Lisa, thank you for the opportunity to share our story. My name is Aravind; call me Jag.
And as you said, I'm the single-family chief data officer at Freddie Mac. For those that don't know, Freddie Mac is a government-sponsored entity that supports the U.S. housing finance system, and single family deals with the residential side of the marketplace. As CDO, I'm responsible for our managed content, data lineage, data governance, and business architecture, where Collibra plays an integral role in that function, as well as supporting our shared assets across the enterprise and our data monetization efforts, data product execution, decision modeling, as well as our business intelligence capabilities, including AI and ML for various use cases. As background, I started my career in New York, then moved to Boston, and for the last 20 years I've been living in the Northern Virginia/DC area, and I'm fortunate to have been responsible for business operations, as well as having led and executed large transformation efforts. That background has reinforced the power of data and how critical it is to meeting our business objectives. Look forward to our dialogue today, Lisa, once again. >>Excellent. You have a great background and clearly not a dull moment in your job with Freddie Mac. Ankit, tell me a little bit about your background, your role, what you're doing at Freddie Mac. >>Definitely. Hi everyone. I'm Ankit Goel, vice president of data transformation and analytics solutions, and I work for Jag. I'm responsible for many of the things he said, including leading our transformation to the cloud and migrating all our existing data assets as part of that transformation journey. I'm also responsible for our business information and business data architecture, decision modeling, business intelligence, and some of the analytics and artificial intelligence. I started my career back in the day as a computer engineer, but I've always been in the financial industry, up in New York and now in the Northern Virginia area. I call myself that bridge between business and technology.
And I would say, I think over the last six years with data, I've found that perfect spot where business and technology actually come together to solve real problems and really lead, you know, businesses to the next stage. So thank you, Lisa, for the opportunity today. >>Excellent. And we're going to unpack that: you call yourself the bridge between business and IT. That's always such an important bridge. We're going to talk about that in just a minute, but I want to get your background; tell our audience about you. >>I'm Atif Malik, senior director of business data architecture and data transformation at Freddie Mac. I'm responsible for the overall business data architecture and transformation of the existing data onto the cloud data lake. My team is responsible for the Collibra platform and the business analysts that are using and maintaining the data in Collibra, and also for driving the data architecture in close collaboration with our engineering teams. My background is that I'm an engineer at heart. I still do a lot of development. This is my first time crossing over the bridge onto the business side of maintaining data and working with data teams. >>Jag, let's talk about digital transformation. Freddie Mac is a 50-year-old and growing company. I always love talking with established businesses about digital transformation. It's pretty challenging. Talk to me about your initial plan and what some of the main challenges were that you were looking to solve. >>Great question, Lisa, and it's definitely pertinent, as you say, in our digital world, figuring out how we need to accomplish it.
If I look at our data modernization, it is a major program and effort in our division. What started as reducing cost, or looking at an infrastructure play, moving from physical data assets to the cloud, as well as enhancing our resiliency, has quickly morphed into meeting business demand and objectives, whether it be for sourcing, servicing, or securitization of our loan products. So where we are, as we think about creating this digital data marketplace, is that we are basically forming and empowering a new data ecosystem, in which Collibra is definitely playing a major role. It's more than just a cloud-native data lake; it's bringing in some of our current assets and capabilities into this new data landscape. >>So as we think about creating an information hub, part of the challenge, as you say, with 50 years of having millions of loans and millions of data across multiple assets, is figuring out that you still have to care for and feed legacy while you're building the new highway, and figuring out how you best have to transform and translate and move data and assets to this new platform. What we've been striving for is looking at what the business demand is, or what the business use case is, and what the value is, to help prioritize that transformation. The exciting part is, as you think about new uses of acquiring and distributing data, as well as new use cases for prescriptive and predictive analytics, the power of what we're building in our data lake, this new data ecosystem. We're feeling comfortable we'll meet the business demand, but as any CTO will tell you, demand always outpaces our capacity. And that's why we want to be very diligent in terms of our execution plan. So we're very excited as to what we've accomplished so far this year, and looking forward as we head into the remainder of the year and as we go into 2022. >>Excellent. Thanks, Jag. Atif, let's go to you.
As I mentioned in the intro, Freddie Mac has won the Collibra Excellence Award for Data Program of the Year. Again, congratulations on that, but I'd love to understand the Collibra center of excellence that you're building at Freddie Mac. First of all, define what a center of excellence is to Freddie Mac, and then what you're specifically building. >>Yeah, sure. So the Collibra center of excellence provides us the overall framework, from a people and process standpoint, to focus in on our use of Collibra and on adopting best practices. We can have teams that are focused just on developing best practices, implementing workflows and lineage within Collibra, and implementing and adopting a number of different aspects of Collibra. It provides the central hub of people being domain experts on the tool, who can then be leveraged by different groups within the organization to maintain the tool. >>Another follow-on question, Atif, for you. How does Freddie Mac define data citizens? Is it anybody in finance or sales or marketing or operations? What is that definition of a data citizen? >>It's really everyone within the organization. They all consume data in different ways, and we provide a way of governing data and for them to get a better understanding of data from Collibra itself. So it's really everyone within the organization, in that way. >>Excellent. Ankit, let's go over to you. A big topic at Data Citizens 21 is collaboration. That's probably a word that we've used a ton in the last 15-plus months or so, as every business really pivoted quickly to figure out how do we best collaborate.
But something that you talked about in your intro is being the bridge between business and IT. I want to understand from your perspective, how can data teams help to drive improved collaboration between business and IT? >>The collaboration between business and technology has been a key focus area for us over the last few years. We actually started an agile transformation journey two years ago that we called modern delivery. And that was about moving away from project teams to persistent product teams that brought business and technology together. And we've really been able to pioneer that in the data space within Freddie Mac, where we now have teams with product owners coming from the data team and full-stack IT developers with them, creating these combined teams to meet the business needs. We found that bringing these teams together really removed the barriers that were there in the interaction, and the employee satisfaction has been high. And like you said, over the last 16 months with the pandemic, we've actually seen the productivity stay the same or even go up, because the teams were all working together. They work as a unit and they all have the sense of ownership, versus working on a project that has a finite end date. So, you know, we've been really lucky with having started this two years ago. >>Well, that's great. And congratulations on either maintaining productivity or having it go up during the last 16 months, which have been incredibly challenging. Jag, I want to ask you, what does winning this award from Collibra mean to you and your team, and does this signify that you're really establishing a data-first culture? >>Great question, Lisa, again. I think winning the award, just from a team standpoint, is a great honor. Collibra has been a fantastic partner.
And when I think about the journey of going from spreadsheets, right, that all of us had in the past, to now having all our business glossary terms and lineage, and really being at the forefront of our data monetization. So as we think about moving to the cloud, Collibra is in step with us as an integral part of that holistic delivery model. Ultimately, as a CDO, it's really the team's honor and effort, because this has been a multi-year journey to get here. And it's great that Collibra, as a partner, has helped us achieve some of these goals, but also recognized where we are in terms of looking at data as a product and being at the forefront in using that holistic delivery to meet our business objectives. So overall, truly jazzed when we found out that we won Data Program of the Year at Collibra, and very honored to win this award. >>That's a word we've got to bring back: jazzed. I like that. Jag, sticking with you, let's unpack a little bit some of those positive results, those business outcomes that you've seen so far from the data program. What are those? >>Yeah. So again, if you were thinking about a traditional CDO model, what were the terms that would have been used a few years ago? It was around governance, and it may have been viewed as oversight, with maybe less talk about monetization, or what the business value was that you needed to accomplish. Collectively, it's really those three building blocks. Managing content: you've got to trust the source. But ultimately it's empowering the business. So the best success that I could point to at Freddie, as you're moving to this digital world, is really empowering the business to figure out the new capabilities and demand and objectives that we're meeting. We're not going to be able to transform the mortgage industry.
We're not going to be able to transform any industry if we're still stuck in old-world thinking, and ultimately data is going to be the blood that has to enable those capabilities. >>So if you ask me about the best business success, we're no longer talking about, okay, I've got my data governance, what do we have to do? It's all embedded together. And as I alluded to, there's that partnership between business and IT in forming data as a product, where now you're delivering capabilities holistically from program teams all across data. It's no longer an afterthought. As I said a few minutes ago, you're able to then meet the demand: what's current, and how do we want to think about going forward? So it's no longer buzzwords like digital data marketplace. What is the value of that? And that's the success, I think, of our group collectively working across the organization. It's not just one team; it's across the organization. And we have our partners, our operations, everyone from business owners, all swimming in the same direction, with, I would say, critical management support. So top of the house, our head of business, my boss, the COO, is fully supportive in terms of how we're trying to execute, and that's critical, because when there are potential trade-offs, we're all looking at it collectively as an organization. >>Right. And that's the best viewpoint to have, that sort of centralized, unified vision. And as you say, Jag, the support from up top. Atif, I want to ask you, you established the Collibra center of excellence. What are you focused on now? >>So we're really focused on allowing our users to consume data and understand data, and really democratizing data so that they can get a better understanding of it. So that's a lot of our focus: engaging with Collibra and getting them to start to define things in the Collibra platform. That's a lot of the focus right now. >>Excellent. 
I want to stay with you for one more question, Atif, one that I'm going to ask all of you. You've talked about a lot of success transforming a legacy institution. What are you most excited about, and what are the next steps for the data program? Atif, what are your thoughts? >>Yeah, so really modernizing onto a cloud data lake and allowing all of the users at Freddie Mac to consume data, with the level of governance that we need around it, is an exciting proposition for me. >>What would you say is most exciting to you? >>I'm really looking forward to the opportunities that artificial intelligence has to offer, not just in the augmented analytics space, but in the overall data management life cycle. There are still a lot of things that are manual in the data management space, and I personally believe artificial intelligence has a huge role to play there. >>And Jag, same question to you. It seems like you have a really strong collaborative team. You have a very collaborative relationship with management and with Collibra. What are you excited about? What's coming down the pipe? >>So Lisa, if I look at it, you know, we sit back here in June 2021. Where were we a year ago? You think about a lot of the capabilities and some of the advancements that we made in just a year, sitting virtually, and, using that word, I'm jazzed, enthused, feeling really great about it. We made a lot of accomplishments. I'm excited about what we're going to be doing for the next year. So there are other use cases, and I could talk about AI/ML, and OCHA talks about, you know, our new ecosystem. Seeing those use cases come to fruition, so that we are contributing value from a business standpoint to the organization, is what really keeps me up at night and gets me up in the morning, and I'm really feeling enthused for the entire division. >>Excellent. Well, thank you. I want to thank all three of you for joining me today. 
Talking about the successes that Freddie Mac has had transforming in partnership with Collibra: again, congratulations on the Collibra excellence award for the data program. It's been a pleasure talking to all three of you. I'm Lisa Martin. You're watching theCUBE's coverage of Collibra Data Citizens '21.
Jim Cushman Product strategy vision | Data Citizens'21
>>Hi everyone, and welcome to Data Citizens. Thank you for making the time to join me and the over 5,000 data citizens like you who are looking to become united by data. My name is Jim Cushman. I serve as the chief product officer at Collibra. I have the benefit of sharing with you the product vision and strategy of Collibra. There are several sections to this presentation, and I can't wait to share them with you. The first is a story of how we're taking a business user and making it possible for him or her to find data, use data, and gain benefit and insight from that data, without relying on anyone in the organization to write code or do the work for them. Next, I'll share with you how Collibra will make it possible to manage metadata at scale, into the billions of assets, and again, load this into our software without writing any code. Third, I will demonstrate to you the integration we have already achieved with our newest product release: data quality that's powered by machine learning. >>Finally, you're going to hear about how Collibra has become the most universally available solution in the market. Now, we all know that data is a critical asset that can make or break an organization. Yet organizations struggle to capture the power of their data, and many remain afraid of how their data could be misused or abused. We also observe that the understanding of and access to data remains in the hands of just a small few. Three out of every four companies continue to struggle to use data to drive meaningful insights. All forward-looking companies are looking for an advantage, a differentiator that will set them apart from their peers and competitors. What if you could improve your organization's productivity by just 5%? Even a modest 5% productivity improvement, compounded over a five-year period, will make your organization 28% more productive. This will leave you with an overwhelming advantage over your competition. 
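The compounding figure can be checked with a few lines of arithmetic. The sketch below assumes the 5% gain is applied once per year for five years, which the talk does not spell out; `compound_gain` is a hypothetical helper name used only for this illustration.

```python
def compound_gain(rate: float, years: int) -> float:
    """Cumulative fractional gain from compounding `rate` once per year.

    Assumption: annual compounding. The talk only says "compounded over
    a five-year period".
    """
    return (1.0 + rate) ** years - 1.0


gain = compound_gain(0.05, 5)
# 1.05 ** 5 is about 1.276, so the gain is about 27.6%,
# which rounds to the 28% quoted in the talk.
print(f"{gain:.1%}")
```

Under that assumption the quoted 28% is a rounded value of roughly 27.6%.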
>>Uniting your data-literate employees with data is the key to your success and, dare I say, survival. To unlock this potential for increased productivity and huge competitive advantage, organizations need to enable self-service access to data for every data-literate knowledge worker. Our ultimate goal at Collibra has always been to enable this self-service for our customers, to empower every knowledge worker to access the data they need when they need it, but with the peace of mind that your data is governed and secure. Just imagine if you had a single integrated solution that could deliver a seamless, governed, no-code user experience of delivering the right data to the right person at the right time, just as simply as ordering a pair of shoes online. That would be quite a magic trick, and one that would place you and your organization on the fast track for success. Let me introduce you to our character here. >>Cliff. Cliff is a business analyst. He doesn't write code. He doesn't know Julia or R or SQL, but he is data literate. When Cliff is presented with data of high quality, and can actually find that data of high quality, Cliff knows what to do with it. Well, we're going to expose Cliff to our software and see how he can find the best data to solve his problem of the day, which is customer churn. Cliff is going to go out and find this information, have it brought back to him, and analyze it in his favorite BI reporting tool, Tableau. Of course, that could be Looker, Power BI, or any other of your favorites. But let's go ahead and get started and see how Cliff can do this without any help from anyone in the organization. So Cliff is going to log into Collibra. >>Being a business user, the first thing he's going to do is look for a business term. He looks for customer churn rate. 
Now, when he brings back churn rate, it shows him the definition of churn rate and various other things that have been attributed to it, such as data domains like product, customer, and order. Now Cliff says, okay, customer is really important, so let me click on that and see what makes up the customer definition. Cliff will scroll through customer and find the various data concepts and attributes that make up the definition of customer, and Cliff knows that customer identifier is a really important aspect of this. It helps link all the data together. So Cliff is going to want to make sure that whatever source he brings in actually has customer identifier in it, and that it's of high quality. Cliff is also interested in things such as email address, credit activity, and credit card. >>But he's now going to say, okay, what data sets actually have customer as a data domain in them? And by the way, while I'm doing it, what else has product and order information? That's, again, relevant to the concept of customer churn. Now, as he goes on, he can actually filter down, because there are a lot of different results that could potentially come back. Again, customer identifier was very important to Cliff, so Cliff further filters on customer identifier, and he further filters on customer churn rate as well. This results in two different datasets that are available to Cliff for selection. Which one to use? Well, he's first presented with some data quality information. You can see that the customer analytics dataset has a data quality score of 76, and the sales data enrichment dataset has a data quality score of 68. That's something he can see right at the front of the box of things that he's looking for, but let's dig in deeper, because the contents really matter. >>So we see again the score of 76, but we also have the chance to find out that this is something that's actually certified. It has a check mark. 
So he knows someone he trusts has actually certified this dataset. You'll see that there are 91 columns that make up this data set. Rather than sifting through all of that information, Cliff is going to say, okay, customer identifier is very important to me; let me search and see what its data quality score is. Very quickly, using a fuzzy search, he finds it and sees, wow, that's a really high data quality score of 98. Well, what's the alternative? The other data set only has 68, but how about its customer identifier? Quickly, he discovers that the data quality for that is only 70. >>So all things being equal, customer analytics is the better data set for what Cliff needs to achieve. But now he wants to look and say, other people have used this; what have they had to say about it? And you can see there are various reviews from peers of his in the organization who have given it five stars. This increases Cliff's confidence that this is a great data set to use. Now Cliff wants to look in a little more detail before he finally commits to using this dataset. Cliff has the opportunity to look at it in the broader context. What other things can I learn about customer analytics, such as what else is it related to? Who else uses it? Where did it come from? Where does it go, and what actually happens to it? And so within our graph of information, we're able to show you a diagram. >>You can see that customer analytics actually comes from the CRM cloud system, and from there you can inherit some wonderful information. We know exactly what CRM cloud is about as an overall system. It's related to other logical models. And here you're actually seeing that it's related to a policy, a policy about PII, or personally identifiable information. 
This gives Cliff almost immediate knowledge that there's going to be some customer information in this dataset, PII, that he's not going to be able to see given his user role in the organization. But Cliff says, hey, that's okay. I actually don't need to see somebody's name and social security number to do my work. I can work with other information in the data file that'll help me understand why our customers are churning and what I can actually do about it. If we dig in deeper, we can see what is personally identifiable information that could cause issues. >>And as we scroll down, take a little bit of a focus on what you'll see here as customer phone, because we'll show that to you a little bit later. These show the various information that, once Cliff has it fulfilled and delivered to him, he will see is actually masked or redacted from his use. Now Cliff might dive in deeper and see more information. And he says, you know what? Another piece that's important to me in my analysis is something called is churned. This is basically a flag indicating whether a customer has actually churned. It's an important flag, of course, because that's the analysis he's performing. Cliff sees that the score is a mere 65. That's not exactly a great data quality score, but Cliff is kind of in a hurry. His boss has come back and said, we need to have this information so we can take action. >>So he's not going to wait around to see if they can go through some long data quality project before he proceeds, but he is going to go ahead and use it. At the speed of thinking, he's going to create a suggestion, an issue. He's going to submit this as a work queue item that informs others who are responsible for the quality of data. 
It tells them that there's an opportunity for improvement to this dataset, which is highly reviewed but may have room for improvement. As Cliff is typing in the explanation that he'll pass along, we can also see that the data quality is made up of multiple components, such as integrity, duplication, accuracy, consistency, and conformity. We see that we can submit this issue and pass it through, and it will go to somebody else who can actually work on it. >>And we'll show that to you a little bit later. But back to Cliff. Cliff says, okay, I'd like to work with this dataset. So he adds it to his data basket. Just like when he's shopping online, Cliff wants the ability to click once and be done with it. Now, it is data, and there's some sensitivity about it. And again, there's an owner of this data whom you need to get permission from. So Cliff is going to provide information to the owner to say, here's why I need this data, how long I need it for, starting on a certain date and ending on a certain date, and ultimately, what purpose I'm going to have with this data. Now, there are other things that Cliff can choose. This one is, how do you want this data delivered to you? >>Now, you'll see down below, there are three options: one is borrow, another is lease, and another is buy. What does that mean? Well, borrow is this idea of, I don't want to have the data that's currently in this CRM cloud database moved somewhere. I don't want it to be persisted anywhere else. I just want to borrow it very short term to use in my Tableau report and then, poof, be gone, because I don't want to create any problems in my organization. Now you also see lease. Lease is a situation where you actually do need to take possession of the data, but only for a time-boxed period; you don't need it for an indefinite amount of time. 
And ultimately, buy is your ability to take possession of the data and have it in perpetuity. So we're going to go forward with our borrow use case, and Cliff is going to submit this, and all the fun starts there. >>So Cliff has submitted the order, and the owner, Joanna, is going to receive the request for the order. Joanna opens up her tasks and sees there's work for her to perform. Now, Joanna has the ability to automate this using the incorporated workflow that we have in Collibra. But for this situation, she's going to manually review that Cliff wants to borrow a specific data set for a certain period of time, and that he wants to use it in a Tableau context. So she reviews it, makes an approval, and submits it. This in turn flips it back to Cliff, who says, okay, what obligations did I just take on in order to work with this data? And he reviews each of the data sharing agreements that you, as an organization, would set up, to say, what are my restrictions for using this data set? >>As Cliff accepts his notices, he has now triggered the process of what we would call fulfillment, or a service broker. In this situation, we're doing virtualized access for the borrow use case. Cliff selects Tableau as his preferred BI and reporting tool, and you can see the various options that are available, from Power BI to Looker, Sisense, and ThoughtSpot. There are others that can be added over time. From there, Cliff will be alerted the minute this data is available to him. So now we're running out and doing a distributed query to get the information, and you see it returned in a raw view. Now what's really interesting is you'll see the customer phone has a bunch of X's in it. If you remember, that's PII, so it's actually being masked. So Cliff can't actually see the raw data. 
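The masking behaviour described here, where a PII column such as customer phone comes back as X's for a user whose role is not allowed to see it, can be sketched roughly as follows. This is an illustrative sketch only, not Collibra's implementation: the function names, the column names, and the `can_view_pii` flag are all assumptions made for the example.

```python
import re


def mask_digits(value: str, mask_char: str = "X") -> str:
    """Replace every digit with `mask_char`, keeping separators in place."""
    return re.sub(r"\d", mask_char, value)


def apply_masking(row: dict, pii_columns: set, can_view_pii: bool) -> dict:
    """Return a copy of `row` with PII columns masked for restricted users.

    `pii_columns` and `can_view_pii` are hypothetical stand-ins for
    whatever the governing PII policy and the user's role would provide.
    """
    if can_view_pii:
        return dict(row)
    return {
        key: mask_digits(str(value)) if key in pii_columns else value
        for key, value in row.items()
    }


# Hypothetical row from the borrowed dataset.
row = {"customer_id": "C-1001", "customer_phone": "555-867-5309", "is_churned": True}
masked = apply_masking(row, {"customer_phone"}, can_view_pii=False)
print(masked["customer_phone"])  # XXX-XXX-XXXX
```

The non-PII fields pass through untouched, which matches the point made in the demo: Cliff can still analyze churn without ever seeing the raw phone number.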
Now Cliff also wants to look at it in a Tableau report and can see the visualization layer, but you also see an incorporation of something we call Collibra on the go. >>Not only do we bring the data to the report, but then we tell you, the reader, how to interpret the report. It could be that there's someone else who wants to use the very same report that Cliff helped create, but they don't understand exactly all the things that Cliff went through. So now they have the ability to get a full interpretation of what data was used, where it came from, and how to interpret some of the fields they see on this report. It's really a clever combination of bringing the data to you and showing you how to use it. Cliff can also see this as a registered asset within Collibra. So the next shopper who comes through might, instead of shopping for the dataset, actually shop for the report itself. And the report is connected with the data set he used. >>So now they have a full bill of materials to run a customer churn report and schedule it anytime they want. So now we've actually turned Cliff into a creator of data assets, and this is where intelligence gets more intelligent, and that's really what we call data intelligence. So let's go back through that magic trick that we just did with Cliff. Cliff went into the software not knowing if the source of data that he was looking for, customer product sales, was even available to him. He went in, very quickly searched and found his dataset, and used facts and facets to filter down to exactly what was available. He compared and contrasted the options that were there, made an observation that there wasn't enough data quality around a certain thing that was important to him, created an idea, basically a suggestion for somebody to follow up on, and was able to put the dataset into his shopping basket, check out, and have it delivered to his front door. >>I mean, that's a bit of a magic trick, right? 
So Cliff was successful in finding the data that he wanted and having it delivered to him. And then, in his preferred tool, he was able to look at it in Tableau. All right, so let's talk about how we're going to make this vision a reality. Our first section here is about performance and scale, but it's also about codeless database registration. How did we get all that stuff into the data catalog and available for Cliff to find? Allow us to introduce you to what we call the asset life cycle. Some of the largest organizations in the world might have upwards of a billion data assets. These are columns and tables, reports, APIs, algorithms, et cetera. These are very high volume and quite technical, and far more information than a business user like Cliff might want to be engaged with. Those very same large organizations may have upwards of, say, 20 to 25 million that are critical data sources and data assets, things that they do need to highly curate and make available. >>So there's a bit of a distillation, a lifecycle of different things you might want to do along the way. And so we're going to share with you how you can automatically register these sources, deal with these very large volumes at speed and at scale, and make them available with just the level of information you need to govern and protect, but also make them available for opportunistic use cases, such as the one we presented with Cliff. So as you recall, when Cliff was trying to look for his dataset, he identified that the is churned data attribute was of low quality. So he passed this over to Eliza, who's a data steward, and she receives this work queue item in a collaborative fashion. And she has to review: what is the request? If you recall, this was the request to improve the data quality for is churned. >>Now she needs to familiarize herself with what Cliff was observing when he was going through his shopping experience. 
So she digs in and wants to look at the quality that he was observing, and sure enough, as she goes down and looks at is churned, she sees that it was a low 65% and now understands exactly what Cliff was referring to. She says, aha, okay, I need to get help. I need to decide whether I have a data quality project to fix the data, or whether I should see if there's another data set in the organization that has better data for this. And so she creates a request that can go over to one of her colleagues who really focuses on data quality. She submits this request and it goes over to her colleague John, who's really familiar with data quality. So John receives the request from Eliza, and you'll see a task showing up in his queue. >>He opens up the request and finds out that Eliza is asking if there's another source out there that actually has good is churned data available. Now, he actually knows quite a bit about the quality of information. So he goes into the data quality console and does a quick look for a dataset that he's familiar with, called customer product sales. He quickly scrolls down and finds the one that's actually been published. That's the one he was looking for, and he opens it up to find out more information: what columns are actually in there. He goes down and finds that is churned is in fact one of the attributes in there, and it does have active rules associated with it to manage the quality. And so he says, well, let's look in more detail and find out, what is the quality of this dataset? >>Oh, it's 86. This is a dramatic improvement over what we've seen before. And we can see that it's trended quite nicely over time; each day, it hasn't degraded in performance. So he responds back to Eliza and says, this data set is actually the data set that you want to bring in. It really will improve things. 
And you'll see that he refers to the refined database within the CRM cloud solution. Once he submits this, it goes back to Eliza and she's able to continue her work. Now when Eliza opens this back up, she's able to very quickly go into the database registration process. She goes into the CRM cloud and selects the community into which she wants to register this data set, the schemas community. And the CRM cloud is the system that she wants to load it in. >>And refined is the database that John told her she should bring in. After a quick description, she's able to click register. And this triggers that automatic, codeless process of going out to the dataset and bringing back its metadata. Now metadata is great, but it's not the end-all be-all. There are a lot of other values that she really cares about. As she's registering this dataset and synchronizing the metadata, she's also asked, would you like to bring in quality information? And so she'll say, yes, of course, I want to enable the quality information from CRM refined. I also want to bring back lineage information to associate with this metadata. And I also want to select profiling and classification information. Now when she selects it, she can also say how often she wants to synchronize this: is it a daily, weekly, or monthly kind of update? >>That's part of the change data capture process. Again, all automated, without the requirement of actually writing code. So she's run this process. Now, after this loads in, she can open up this newly registered dataset and look to see if it has actually solved the problem that Cliff set her out on, which was improved data quality. Looking into the data quality for the is churned attribute shows her that she has fantastic quality. It's at a hundred; it's exactly what she was looking for. 
So she can with confidence suggest that it's done. But she did notice something that she wants to tell John, which is that there are a couple of data quality checks that seem to be missing from this dataset. So again, in a collaborative fashion, she can pass that information along, for validity and completeness, to say, you know what, check for nulls and empties, and send that back. >>So she submits this on to John to work on, and John now has a work item in his task queue. But remember, she's been working on the task for Cliff, and because she has added a much better source for his churn information, she's going to update the task that was sent to her to notify Cliff that the work has been done and that she has a really good data set in there. In fact, if you recall, it was 100% in terms of its data quality. So this will really make life a lot easier for Cliff once he receives that data and processes the churn report analysis next time. So let's talk about these audacious performance goals that we have in mind. Now today, we have really strong performance and amazing usability. Our customers continue to tell us how great our usability is, but they keep asking for more. Well, we've decided to present to you >>something you can start to bank on. This is the performance you can expect from us on the highly curated assets that are available for the business users, as well as the technical and lineage assets that are more available for the developer users and for things that are more warehouse based. You'll see that in Q2 of this year, we're making available 5 million curated assets. Now you might be out there saying, hey, I'm already using the software and I've got over 20 million already. That's fair. We do. 
We have customers that are well over 20 million in terms of assets they're managing, but we wanted to present this to you with zero conditions, no limitations. We wouldn't talk about, well, it depends, et cetera. This is without any conditions. That's what we can offer you without fail. And yes, it can go higher and higher. We're also talking about the speed with which you can ingest the data. Right now, we're ingesting somewhere around 50,000 to a hundred thousand records per hour. And of course, yes, you've probably seen it go quite a bit faster, but we are assuring you that that's the case. What's really impressive is that right now we can also help you manage 250 million technical assets, and we can load them at a speed of 25 million per hour. And you can see how, over the next 18 months, about every two quarters, we show you dramatic improvements, more than doubling for most of them. >>Leading up to the end of 2022, we'll be handling over a billion technical lineage assets, and we'll be loading at a hundred million per hour. That sets the mark for the industry. Earlier this year, we announced a recent acquisition, OwlDQ. OwlDQ brought to us machine learning based data quality. We're now able to introduce to you Collibra Data Quality, the first integrated approach to OwlDQ and Collibra. We've got a demo to follow. I'm really excited to share it with you. Let's get started. So Eliza submitted a task for John to work on, remember, to add checks for null and for empty. So John picks up this task very quickly and looks to see what the request is. And from there he says, ah, yes, we do have a quality check issue when we look at is churned. So he jumps over to the data quality console and says, I need to create a new data quality test. >>So John is able to go into the solution and set up quick rules, automated rules. 
He could inherit rules from other things, but it starts with first identifying the data source that he needs to connect to in order to perform this. And so he chooses the CRM refined data set that was most recently registered by Eliza. You'll see the same score of 86 was the quality score for the dataset. And you'll also see there are four rules that are associated underneath this. Now there are various checks that John can establish on this, but remember, this is a fairly easy request that he received from Eliza. So he's going to go in and choose the actual field, is churned, and from there identify quick rules of an empty check, and that quickly sets up the rule for him. >>And also the null check, equally fast. This one's established and analyzes all the data in there. And this sets up the baseline of data quality for this. Now this data, once it's captured, is periodically brought back to the catalog, so it's available not only to Eliza but also to Cliff next time he were to shop in the environment. As we look through the rules that were created through that very simple user experience, you can see the ones for is empty and is null that were set up. Now, these are various styles that can be set up either manually, or you can set them up through machine learning, or you can inherit them. But the key is to track this rule creation and the metrics that are generated from these rules so that they can be brought back to the catalog and then used in meaningful context by someone who's shopping. And the knowledge that this has neither empty nor null fields, or at least that most of them don't, will now give confidence as you go forward. >>And as you can see, those checks have now been entered in, and you can see that it's a hundred percent quality score for the null check. So with confidence now, John can respond back to Eliza and say, I've inserted them, they're up and running.
And you're in good status. So that was pretty amazing integration, right? And four months after our acquisition, we've already brought that level of integration between Collibra Data Intelligence Cloud and data quality. Now it doesn't stop there. We have really impressive and high sights set. Early next year, we're going to introduce a fully immersive experience where customers can work within Collibra and bring the data quality information all the way in, as well as start to manipulate the rules and generate the machine learning rules on top of it. All of that will be a deeply immersive experience. >>We also have something really clever coming, which we call continuous data profiling, where we bring the power of data quality all the way into the database, so it's continuously running and always making that data available for you. Now, I'd also like to share with you one of the reasons why we are the most universally available software solution in data intelligence. We've previously announced that we're available on AWS and Google Cloud, but today we can announce to you that in Q3 we're going to be available on Microsoft Azure as well. Now it's not just these three cloud providers that we're available on; we've also become available on each of their marketplaces. So if you are buying our software, you can actually go out and make that same purchase from their marketplace and achieve your financial objectives as well. We're very excited about this. These are very important partners for us. >>Now, I'd also like to introduce you to our system integrators. Without them, there's no way we could achieve our objectives of growing so rapidly and dealing with the demand that you customers have had. Accenture, Deloitte, Infosys, and others have been instrumental in making sure that we can serve your needs when you need them. And so it's been a big part of our growth and will be a continued part of our growth as well.
And finally, I'd like to introduce you to our product showcases, where we can go into absolute detail on many of the topics I talked about today, such as data governance with Arco, data privacy with Sergio, data quality with Brian, and finally catalog with Peter. Again, I'd like to thank you all for joining us, and we really look forward to hearing your feedback. Thank you.
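Editor's note: the null and empty checks from the data quality demo above can be pictured with a minimal Python sketch. This is not Collibra's rule engine or API. The function names, the sample column values, and the idea of scoring a rule as the percentage of rows that pass are all assumptions made only for illustration.

```python
from typing import Callable, Optional, Sequence

Value = Optional[str]


def is_null(value: Value) -> bool:
    """A value fails the null check when it is missing entirely."""
    return value is None


def is_empty(value: Value) -> bool:
    """A value fails the empty check when it is present but blank."""
    return value is not None and value.strip() == ""


def pass_rate(values: Sequence[Value], fails: Callable[[Value], bool]) -> float:
    """Percentage of values that pass a rule (hypothetical scoring scheme)."""
    if not values:
        return 100.0  # nothing to fail; assumed convention for an empty column
    passed = sum(1 for v in values if not fails(v))
    return 100.0 * passed / len(values)


# Hypothetical sample of the churn field discussed in the demo.
is_churned = ["yes", "no", None, "", "no"]

null_score = pass_rate(is_churned, is_null)    # one null out of five values
empty_score = pass_rate(is_churned, is_empty)  # one blank out of five values
print(f"null check: {null_score:.0f}%, empty check: {empty_score:.0f}%")
```

A column with no missing values would score 100 on the null rule under this scheme, which matches the way the demo reports a hundred percent score for the null check.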
ENTITIES
Entity | Category | Confidence |
---|---|---|
Joanna | PERSON | 0.99+ |
John | PERSON | 0.99+ |
Brian | PERSON | 0.99+ |
Jim Cushman | PERSON | 0.99+ |
Deloitte | ORGANIZATION | 0.99+ |
Peter | PERSON | 0.99+ |
Eliza | PERSON | 0.99+ |
Accenture | ORGANIZATION | 0.99+ |
cliff | PERSON | 0.99+ |
Arco | ORGANIZATION | 0.99+ |
100% | QUANTITY | 0.99+ |
5 million | QUANTITY | 0.99+ |
250 million | QUANTITY | 0.99+ |
20 | QUANTITY | 0.99+ |
65 | QUANTITY | 0.99+ |
28% | QUANTITY | 0.99+ |
25 million | QUANTITY | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
98 | QUANTITY | 0.99+ |
Cliff | PERSON | 0.99+ |
Collibra | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
5% | QUANTITY | 0.99+ |
first section | QUANTITY | 0.99+ |
68 | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
76 | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
five stars | QUANTITY | 0.99+ |
Culebra | ORGANIZATION | 0.99+ |
LDQ | ORGANIZATION | 0.99+ |
91 columns | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
Al DQ | ORGANIZATION | 0.99+ |
Cleaver | ORGANIZATION | 0.99+ |
86 | QUANTITY | 0.99+ |
one | QUANTITY | 0.98+ |
three | QUANTITY | 0.98+ |
end of 2022 | DATE | 0.98+ |
each day | QUANTITY | 0.98+ |
each | QUANTITY | 0.98+ |
over 20 million | QUANTITY | 0.98+ |
Cliff cliff | PERSON | 0.98+ |
next year | DATE | 0.98+ |
Q1 | DATE | 0.98+ |
70 | QUANTITY | 0.98+ |
ORGANIZATION | 0.98+ | |
Tableau | TITLE | 0.98+ |
Collibra Day 1 Felix Zhamak
>>Hi, Felix. Great to be here. >>Likewise. So when I started reading about data mesh, I think about a year ago, I found that the more I read about it, the more I agreed with the principles behind data mesh. It actually took me back to almost the start of Collibra 13 years ago, based on the research we were doing on semantic technologies, and even personally my own master's thesis, which was about domain-driven ontologies. And we'll talk about domain-driven, as it's a key principle behind data mesh. But before we get into that, let's not assume that everybody knows what data mesh is about, although we've seen a lot of traction and momentum, which is fantastic to see. Maybe you could start by talking about some of the key principles and give a brief overview of what data mesh is about. >>Of course, I'd be happy to. So data mesh is a new approach. It's a decentralized approach to managing and accessing data, particularly analytical data, at scale. So we can break that down a little bit. What is analytical data? Well, analytical data is the data that fuels our reporting, our business intelligence and, most importantly, machine learning training, right? So it's an aggregate view of historical events that happen across organizations, across many domains within an organization, or even beyond one organization, right? And today we manage this analytical data through very centralized solutions, whether it's a data lake or a data warehouse or a combination of the two. And, to be honest, we have kind of outsourced the accountability for it to the data team, right? It doesn't happen within the domains. What we have found ourselves with is central bottlenecks.
>>So as we see the growth in the scale of organizations, in terms of the origins of the data and in terms of the great expectations for the data, all of these wonderful use cases that require access to that data, we find ourselves constrained and limited in our agility to respond, you know, because we have a centralized bottleneck, from team to technology to architecture. So data mesh looks at the past, at what we've done and the accidental complexity that we've created, and tries to reimagine a different way of managing and accessing data that can truly scale as the origins of the data grow, as they become available within one organization, within one cloud or another. And it really boils the approach down to four principles. So far, I haven't tried to be prescriptive as to exactly how you implement it.
We need to really rethink our data platforms, our infrastructure capabilities, and create a new set ourselves of capabilities that allows domain in fact, to own their data in fact, to manage the life cycle of their analytical data. So then self-serve daytime frustration and platform is the fourth principle. And the last principle is really around governance because we have to think about governance. In fact, when I first wrote it down, this was like a little kind of concern in, in embedded in what some of my texts and I thought about, okay, now to make this real, we need to think about securing and quality of the data accessibility of the data at scale, in a fashion that embraces this autonomous domain ownership. So we have to think about how can we make this real with competition of governance? How can we make those domains be part of the governance, federated governance, federally, the competition of governance is the fourth principle. So at insurance it's a organizational shift, it's an architectural change. And of course technology needs to change to get us to decentralize access and management of Emily's school data. >>Yeah, I think that makes a ton of sense. If you want to scale, typically you have to think much more distributed versus centralized at we've seen it in other practices as well, that domain-driven thinking as well. I think, especially around engineering, right? We've seen a lot of the same principles and best practices in order to scale engineering teams and not make the same mistakes again, but maybe we can start there with kind of the core principles around that domain driven thinking. Can you elaborate a little bit on that? Why that is so important than the kind of data organizations, data functions as well? >>Absolutely. I mean, if you look at your organizations, organizations are complex systems, right? 
They are made of parts, which are basically domains, functions of the business: your automation and your customer management, your sales, marketing. And then the behavior of the organization is the result of an intricate, you know, network of dependencies and interactions between these domains. So if we just overlay data on this complex system, it does make sense, in order to scale, to bring the ownership of and access to data right to the domain where it originates, right? To the people who know that data best and are most capable of providing that data. So to optimize response to change, to optimize creating new features, new services, new machine learning models, we've got to think about local optimization, but not at the cost of the global good, right? So domain ownership really talks about giving autonomy to the domains, and accountability to provide their data and model the data in a responsible way, and be accountable for its quality. >>So localize some of those responsibilities and empower them, but at the same time, you know, think about the global good: how does that domain need to be accountable to the other domains on the mesh? The governance piece covers that. And that leads to some interesting architectural shifts, because when you think about the distribution of the data, then you think, okay, if I have a machine learning model that needs, you know, three pieces of data from different domains, I end up actually distributing the compute also back to those domains. So it actually starts shifting architecture as well. We start with ownership. >>No, I think that makes a ton of sense, but I can imagine people thinking, well, if you're organizing according to these domains, aren't we going to create different silos, even more silos?
And I think that's where the second principle, thinking of data as a product, comes in. I think that's incredibly powerful. In my mind, it's powerful because it helps us think about usability. It helps us think about the consumer of that data and really packaging it in the right way. And there's one sentence that I've heard you use that I think is incredibly powerful: it's less collecting, more connecting. Can you elaborate on that a little bit? >>Absolutely. I mean, the power and the value of the data is not in what we have got and stored on disk, right? It's really about connecting that data to other data sets to illuminate new insights and higher-order information, and connecting that data to the users, right, when they want to use it. So that's why I think, if we shift that thinking from just collecting more in one place to the ability to connect datasets, then we arrive at a different solution. So I think data as a product, as you said exactly, was a response to the challenges that domain-driven siloing could create. And the idea is that the data that these domains now own needs to be shared, with some accountability and incentive structure, as a product. So if you bring product thinking to data, what does that mean? >>That means delighting the experience of the data users. Who are they? They're the data analysts and data scientists. So, you know, how can we delight their experience? Their journey starts with a hypothesis: I have a question. Do I have the right data to answer this question with a particular model? Let me discover it, let me find out if it's useful. Do I trust it? So really facilitating that journey. I think we have two choices there. We have the choice of the source of that data.
The people who really should be accountable for it shrug off the responsibility and say, you know, I dumped this data on some event stream, and somebody downstream, the governance or data team, will take care of it and turn it into a usable piece of information. And that's what we have done for, you know, half a century almost. Or, let's bring the intention of providing quality data back to the source, and both empower the folks there and make them accountable for providing that data right at the source, as a product. And I think by being intentional about that, we're going to remove a lot of the accidental complexity that we have created with, you know, labyrinthine pipelines moving data from one place to another and trying to build quality back into it. And that requires, you know, architectural shifts, organizational shifts, incentive models, the whole package. >>The whole package, absolutely. And we'll talk about that. Federated computational governance is going to be a really important aspect. But the other part of data as a product, next to usability, is trust, right? If you want to use it. Why is trust so important if you think about data as a product? >>Well, I mean, maybe we turn this question back to you. Would you buy the shiniest product if you don't trust it, if you don't trust where it comes from? Can I use it? Does it have integrity? I wouldn't. I think it's almost irresponsible to use data that you can't trust, right? And really the meaning of trust is: do I know enough about this data for it to be useful for the purpose that I'm using it for? So I think trust is absolutely a fundamental characteristic of data as a product. And again, it comes back to bridging the gap between what the data user needs to know to really trust and use that data, to find out whether it's suitable, and what they know today.
So we can bridge that gap with, you know, adding documentation, adding SLOs, adding lineage, all of this additional information. But not only that, also having people who are accountable for providing that integrity and those SLOs and guarantees. So it's really those product owners. So for me, trust is a non-negotiable characteristic of data as a product, like any other consumer product. >>Exactly. Like you said, if you think about consumer products and consumer marketplaces, whether it's Uber or Amazon or Airbnb, you have the simple rating as a very simple way of showing trust between those different stakeholders. And then we also say, okay, how do we actually get there? And I think data mesh also talks a little bit about roles and responsibilities. And I think the overall importance of a data product owner is probably aligned with that importance of trust. >>Yeah, absolutely. I think we can't just wish for these good things to happen without putting the accountability and the right roles in place. And the data product owner is just the starting point for us to stop playing hot potato when it comes to, you know, who owns the data and who will be accountable for it. Not so much who's the actual owner of that data, because the owner of the data is you and me, where the data really comes from, but it's the data product owner who's going to be responsible for the life cycle of this. They know when the data gets changed, when consumers need new information, make sure that that gets carried out, and maybe one day retire that data. So that long-term ownership, with an intimate understanding of the needs of the users of that data, as well as the data itself and the domain itself, and managing the life cycle of that, I think that's a necessary role. >>And then we have to think about why anybody would want to be a data product owner, right?
What are the incentives we have to set up in the infrastructure, you know, in the organization? And it really comes down to, I think, adopting prior art that exists in the product ownership landscape and bringing it to data, and treating the data users as the customers, right, to make them happy. So the incentives and KPIs for these data product owners need to be aligned with the happiness of their data users. >>Yep. I love that. The alignment, again, to the consumer, using things we know from product management and product owner roles and reusing that for data. I think that makes a ton of sense. And it's a good segue to talk a little about governance, right? We mentioned already federated governance, computational governance. We see that challenge often with our customers, centralizing versus decentralizing. How do we find the right balance? Can you talk a little bit about that in the context of data mesh? How do we do this? >>Yeah, absolutely. I was hoping to pack three concepts into the title of the governance principle, but I thought that would be quite a mouthful. So, as you mentioned, there's the federated aspect, the computational aspect, and, I think, embedded governance, if I could add another phrasing there. And really it's about, as we talked about, how to make it happen. So I think the federation matters because the people who are really in a position to do this, the product owners, who are in a position to provide data in a trustworthy and secure way, with integrity, have to have a stake in doing that, right? They have to be accountable, not just for their little domain or big domain, but they also have to have accountability for the mesh. So there are some concerns that apply to all of the data products uniformly: how we secure them, how we consistently secure them.
>>How do we model the data, or the schema language, or the SLO metrics, in a way that allows this data to be interoperable, so we can join multiple data products? So we have to have, I think, a set of policies, really a minimum set of policies, that we apply globally to all the data products, and then, in a federated fashion, incentivize the data product owners so they have a stake in that and make it happen. Because there's always going to be a challenge in prioritizing. Would I add another few attributes to my data sets to make my customers happy, or would I adopt this standardized modeling language, right? They have to make that continuous prioritization, and they have to be incentivized to do both, right? And then the other piece of it is, okay, if we want to apply these consistent policies across many data products on the mesh, how would it be physically possible? >>And the only way I can see it being possible, and I have seen it done in service mesh, is by embedding those policies as computation, as code, into every single data product. And how do we do that? Again, the platform has a big part in it: being able to have these embedded policy engines, and whatever those things are, in the data products, and to be able to run them as computation. So by default, when you become a data product, as part of the scaffolding of that data product, you get all of these computational capabilities to configure your policies according to the global policies. >>No, that makes sense. That makes a ton of sense. >>I'm just curious, really. So you've been at this for a while. You've built this system for 13 years and came from an academic background. So, to be honest, we run into your products at lots of our clients, and there's always a chat conversation within ThoughtWorks: do you guys know about this product?
And so on. So I'm curious: how do you think data governance technology needs to shift with data mesh, right? And if I may ask, how would your roadmap change with data mesh? >>Yeah, I think it's a really good question. What I don't want to do is make the mistake that vendors often make and think of data mesh as a product. I think it's a much more holistic mindset change, right, for the organization. Yes, there needs to be a platform enablement component there. And I think how we've thought about governance is authentically very aligned with some of the principles of data mesh, that federated thinking. Our customers know about our communities, domains, and operating model. We really support that flexibility. From a roadmap perspective, making that even easier is always a focus area for us. Specifically around data mesh, there are a few things that come to mind. One, I think, is connectivity, right? If you give different teams more ownership and accountability, we're not going to live in a world where all of the data is stored in one location, right? >>You want to give teams the opportunity and the accountability to make their own technology decisions so that they are fit for purpose. So I think for whatever platform, being able to provide out-of-the-box connectivity to a very wide range of technologies is absolutely critical. On data as a product, on that usability thinking, I think that's top of mind. That's part of our roadmap. You're going to hear us talk about that tomorrow as well. That data consumer: how do we make it as easy as possible for people to discover data that they can trust, that they can access? That thinking is a big part of our roadmap.
So again, making that as easy as possible is a big part of it. >>And also on the computational aspect that you mentioned, I think we believe in that as well. If it's just documentation, it's going to be really hard to keep that alive, right? So you have to make it active, you have to get close to the actual data. So if you think about policy enforcement, for example, some of the things we're talking about, it's not just definition, it's enforcement, and data quality. That's why we are so excited about our data quality acquisition as well. So these are a couple of the things that we're thinking of. Again, your message around going from collecting to connecting. We talk about unity; I think that works really, really well with our mission and vision as well. So Zhamak, thank you so much. I wish we had more time to continue the conversation, but it's been great to have a conversation here. Thank you so much for being here today, and let's continue to work on that, on data mesh. >>I'm excited to see it just come to life.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Amazon | ORGANIZATION | 0.99+ |
Felix | PERSON | 0.99+ |
Isabella | PERSON | 0.99+ |
Uber | ORGANIZATION | 0.99+ |
Airbnb | ORGANIZATION | 0.99+ |
Elizabeth | PERSON | 0.99+ |
Felix Zhamak | PERSON | 0.99+ |
13 years | QUANTITY | 0.99+ |
second principle | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
one sentence | QUANTITY | 0.99+ |
third principle | QUANTITY | 0.99+ |
second dimension | QUANTITY | 0.99+ |
fourth principle | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
first principle | QUANTITY | 0.99+ |
two choices | QUANTITY | 0.98+ |
Dana | PERSON | 0.98+ |
Emily | PERSON | 0.98+ |
tomorrow | DATE | 0.98+ |
first | QUANTITY | 0.98+ |
one organization | QUANTITY | 0.98+ |
13 years ago | DATE | 0.98+ |
three pieces | QUANTITY | 0.97+ |
a year ago | DATE | 0.97+ |
One | QUANTITY | 0.94+ |
mark | PERSON | 0.93+ |
one location | QUANTITY | 0.93+ |
three concepts | QUANTITY | 0.92+ |
one place | QUANTITY | 0.9+ |
one | QUANTITY | 0.86+ |
eight made | QUANTITY | 0.85+ |
four principles | QUANTITY | 0.84+ |
single data product | QUANTITY | 0.79+ |
Colibra | PERSON | 0.76+ |
Venice | ORGANIZATION | 0.73+ |
half century | DATE | 0.63+ |
Day 1 | QUANTITY | 0.6+ |
ThoughtWorks | ORGANIZATION | 0.59+ |
Welcome to Data Citizens '21
>>Welcome to Data Citizens '21. I'm thrilled that so many of you are joining us this year for what I think will be our best conference yet. This is always my favorite moment of the year. And what makes it especially meaningful for me this time is that we've all faced so much uncertainty over the last year. Being able to bring together our community of data citizens, data professionals, customers, and partners gives me so much energy. We all share the same passion to use data to create positive change in our work and in our lives. 2021 has been called a year of transitions, and rightfully so. The pandemic has changed our lives, our businesses, and our society. It has changed our world. There's been a number of notable shifts over the last 18 months, and I'd like to bring up three shifts that I personally connect to. >>And these will likely resonate with many of you too. First is the shift to remote work. At the start of the pandemic, tens of millions of people across many industries transitioned to working from home. This transition happened really fast, and in many cases it happened overnight. For me, not being able to meet our customers and our fellow Collibrians in person, especially during such turbulent times, as we've welcomed over 200 new colleagues, new Collibrians, was especially hard. The second is, of course, the shift towards online retail. In the U.S., e-commerce was forecasted to reach 24% of total retail sales by 2024, but by July 2020, so four years earlier, it had already reached 33%. That has translated into an enormous boost for delivery companies. And finally, the supply chain reinvention. The pandemic revealed the complexity and vulnerabilities in the supply chains of many different companies, from raw materials to freight disruptions to labor shortages. >>The damage from the pandemic was felt everywhere. 
For example, my wife and I have been waiting for over six months for our four-year-old daughter's first bike. Now, many companies are orienting towards data and analytics to reduce costs and better understand, manage, and optimize their entire value chain. Now, the one thing that all of these shifts have in common is that they accelerated the massive growth of digitization. This transition to digital isn't new, but how much it has accelerated hasn't been easy for organizations. In many cases it has happened under enormous pressure. And that digitization has resulted in two related trends. First, an explosion of digital channels, which has created unprecedented amounts of new data. There's more volume and more variety of data than ever before, and it's distributed broadly across organizations. Again, this is not a new trend, but one that has also accelerated. Imagine just the amount of data that is now on TikTok. >>It's also a great example of the responsibilities and risks that come with all of that data. This brings me to the second trend and risk that we had started seeing even before the pandemic: the creation of ever more data silos. These silos result in disjointed and often ineffective data teams. And what is more concerning is that there's often a lack of confidence in the outcome. This leads to an overall lack of trust in the information. We need to solve this. Every day, maybe every hour, every minute, we rely on data to make both transactional and transformative business decisions. Every organization today depends on mission-critical insights and data-critical processes. What happens if suddenly there's a data problem? This could impact our resourcing, our customers, our back office, our entire ecosystem. The integrity and the reliability of data has real immediate and long-term implications for our businesses and our reputations. >>And this will determine the trajectory of our success. 
We all feel the weight of data, the immense opportunities and potential implications associated with it. And this is a lot of pressure to bear, but I believe that we have the ability to take control of our data, to become more effective in how we work, to be more productive, and to ultimately generate faster and better outcomes. I believe this is a pivotal moment as organizations transition from reacting to the pandemic to building a healthy new normal. We have an extraordinary opportunity to make good use of our data, and by doing so, I believe we can achieve extraordinary things. By making trusted data more accessible and more usable, we can do even more. We can get more out of our work. We can put more work into it. We can help our organizations serve more customers and enrich more communities. With trusted data, we have the power to change things for good, and with it, there's no limit to what people, businesses, or society can achieve when we are united by data. >>The world doesn't just run on information. It runs on people living their passion, dreaming big ideas. But without information, without the data, those ideas won't become innovations. That's why at Collibra we're changing how organizations use data, so our customers can change the world. We make data easier to access by making it usable, manageable, and practical. We make it make sense, so people have a common language to share and shape their ideas. And no matter how far and wide that data is scattered, we make sure it's all within reach, connecting the disconnected, joining the disjointed, so people can collaborate and trust that their data won't slow them down, so they can prove that data has the power to change things for good: doing more, enriching more, helping more. Together with Collibra, you can be united by data. >>United by data. All of us here are united by our passion for data. We are all data citizens, and there's so much power in this community. 
Uniting is also what the Collibra Data Intelligence Cloud, our product, does. It unites your entire organization to deliver accurate, trusted data for every use, for every user, and across every source. Managed, trusted, and accessible: these are the crucial elements that will give your teams the ability to easily collaborate and make every data workflow more productive. That's also the experience and the impact of our customers. Take Freddie Mac, for example. It's driving its data ecosystem transformation. With 5.5 billion data points and over two trillion dollars in assets under management, Freddie Mac leveraged Collibra to support the digital transformation and management of its data ecosystem. They eliminate duplicate data spending, improve data lake productivity, and drive enhanced data quality while delivering increased value for their customers. >>It's also at the heart of what Yelp is doing to connect its engineers to trusted data, unleash product innovation, and instill a data-driven culture. And it's why companies like Audi and BT are promoting the importance of data culture and making data easily accessible to the data citizens throughout their organizations. Over the next couple of days, the knowledge shared by our partners, our customers, guest speakers, and Collibrians will inspire and energize you to keep moving forward as change agents united by data. Again, I'm so glad to kick off Data Citizens, and thank you for being here with us.
SUMMARY :
We all share the same passion to use data to create positive change. The pandemic revealed the complexity and vulnerabilities in the supply chains of many different companies, from raw materials to freight disruptions. Imagine just the amount of data that is now on TikTok. It's also a great example of the responsibilities and risks that come with all of that data. With trusted data, we have the power to change things for good. We make data easier to access by making it usable, manageable, and practical. Managed, trusted, and accessible: these are the crucial elements that will give your teams the ability to easily collaborate. It's also at the heart of what Yelp is doing to connect its engineers to trusted data and unleash product innovation.
ENTITIES
Entity | Category | Confidence |
---|---|---|
33% | QUANTITY | 0.99+ |
24% | QUANTITY | 0.99+ |
July, 2020 | DATE | 0.99+ |
Yelp | ORGANIZATION | 0.99+ |
United | ORGANIZATION | 0.99+ |
first bike | QUANTITY | 0.99+ |
2021 | DATE | 0.99+ |
First | QUANTITY | 0.99+ |
2024 | DATE | 0.99+ |
last year | DATE | 0.99+ |
Colibra | ORGANIZATION | 0.99+ |
Berson | LOCATION | 0.99+ |
second | QUANTITY | 0.99+ |
over six months | QUANTITY | 0.99+ |
Freddie Mac | ORGANIZATION | 0.99+ |
pandemic | EVENT | 0.99+ |
5.5 billion data points | QUANTITY | 0.98+ |
four year old | QUANTITY | 0.98+ |
this year | DATE | 0.98+ |
BT | ORGANIZATION | 0.98+ |
both | QUANTITY | 0.98+ |
over 200 new colleagues | QUANTITY | 0.98+ |
four years earlier | DATE | 0.97+ |
second trend | QUANTITY | 0.97+ |
today | DATE | 0.97+ |
U S | LOCATION | 0.96+ |
tens of millions of people | QUANTITY | 0.96+ |
Collibra | ORGANIZATION | 0.95+ |
one thing | QUANTITY | 0.94+ |
last 18 months | DATE | 0.94+ |
trillion dollars | QUANTITY | 0.94+ |
Columbia | ORGANIZATION | 0.93+ |
two related | QUANTITY | 0.85+ |
three shifts | QUANTITY | 0.66+ |
days | DATE | 0.59+ |
United | LOCATION | 0.59+ |
a year | QUANTITY | 0.58+ |
one | QUANTITY | 0.57+ |
Colombians | PERSON | 0.52+ |
hour | QUANTITY | 0.5+ |
Data | ORGANIZATION | 0.45+ |
couple | DATE | 0.36+ |