Breaking Analysis: Snowflake caught in the storm clouds
>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante.

>> A better than expected earnings report in late August got people excited about Snowflake again, but the negative sentiment in the market has weighed heavily on virtually all growth tech stocks, and Snowflake is no exception. As we've stressed many times, the company's management is on a long-term mission to dramatically simplify the way organizations use data. Snowflake is tapping into a multi-hundred-billion-dollar total available market and continues to grow at a rapid pace. In our view, Snowflake is embarking on its third major wave of innovation, data apps, while its first and second waves are still bearing significant fruit. For short-term traders focused on the next 90 or 180 days, that probably doesn't matter. But those taking a longer view are asking, "Should we still be optimistic about the future of this high flyer, or is it just another overhyped tech play?" Hello and welcome to this week's Wikibon Cube Insights powered by ETR. Snowflake's quarter just ended, and in this Breaking Analysis we take a look at the most recent survey data from ETR to see what clues and nuggets we can extract to predict the near-term future and the long-term outlook for Snowflake, which is going to announce its earnings at the end of this month.

Okay, so you know the story. If you've been an investor in Snowflake this year, it's been painful. We said at IPO, "If you really want to own this stock on day one, just hold your nose and buy it." But like most IPOs, we said there would likely be a better entry point in the future, and not surprisingly that's been the case. Snowflake IPOed at a price of $120, which you couldn't touch on day one unless you got into a friends-and-family deal. And if you did, you're still up 5% or so, so congratulations. But at one point last year you were up well over 200%. That's been the nature of this volatile stock, and I certainly can't help you with the timing of the market. Longer term, Snowflake is targeting $10 billion in revenue for fiscal year 2028. A big number. Is it achievable? Is it big enough? Tell you what, let's come back to that.

Shorter term, our expert trader and Breaking Analysis contributor Chip Simonton said he got out of the stock a while ago after having taken a shot at what turned out to be a bear market rally. He pointed out that the stock had been bouncing around the 150 level for the last few months and broke that to the downside last Friday. So he'd expect 150 is where the stock is going to find resistance on the way back up, but there's no sign of support right now. He said maybe at 120, which was the July low and of course the IPO price we just talked about. Perhaps earnings will be a catalyst when Snowflake announces on November 30th, but until the mentality toward growth tech changes, nothing's likely to change dramatically, according to Simonton.

So now that we have that out of the way, let's take a look at the spending data for Snowflake in the ETR survey. Here's a chart that shows the time series breakdown of Snowflake's net score going back to the October 2021 survey. At that time, Snowflake's net score stood at a robust 77%. And remember, net score is a measure of spending velocity. It's a proprietary metric that ETR derives from a quarterly survey of IT buyers, asking the respondents: Are you adopting the platform new? Are you spending 6% or more?
Is your spending flat? Is your spending down 6% or worse? Or are you leaving the platform, that is, decommissioning it? You subtract the percentage of customers that are spending less or churning from the percentage that are spending more or adopting, and you get a net score, expressed as a percentage of customers responding. In this chart we also show Snowflake's N out of the total survey, which ranges between 1,200 and 1,400 respondents each quarter. And in the very last row we show the number of Snowflake respondents coming into the survey from the Fortune 500 and the Global 2000, two very important Snowflake constituencies.

What this data tells us is that Snowflake exited 2021 with very strong momentum and a net score of 82%, which is off the charts, and it was actually accelerating from the previous survey. By April that sentiment had flipped and Snowflake came down to earth with a 68% net score, still highly elevated relative to its peers, but meaningfully down. Why was that? Because we saw a drop in new adds and an increase in flat spend. Then into the July and most recent October surveys, you saw a significant drop in the percentage of customers that were spending more. Notably, the percentage of customers contemplating adding the platform is staying pretty strong, but it is off a bit this past survey. Combined with a slight uptick in planned churn, net score is now down to 60%. That churn uptick, from 0% to 1% and then 3%, is still small, and a 60% net score is still 20 percentage points higher than our highly elevated 40% benchmark that you'll recall from earlier Breaking Analysis episodes. That 40% level is what we consider a milestone; anything above it is quite strong. But again, Snowflake is down, and coming back to churn, while 3% churn is very low, in previous quarters we've seen Snowflake at 0% or 1% decommissions. The last thing to note in this chart is the meaningful uptick in survey respondents citing that they're using the Snowflake platform; that's up to 212 in this survey.

So look, it's hard to imagine that Snowflake doesn't feel the softening in the market like everyone else. Snowflake is guiding for around 60% growth in product revenue against a tough compare from a year ago, with a 2% operating margin. So like every company, the reaction of the street is going to come down to how accurate or conservative the guide is from their CFO.

Now, earlier this year Snowflake acquired a company called Streamlit for around $800 million. Streamlit is an open source Python library that makes it easier to build data apps with machine learning, obviously a huge trend. And like Snowflake generally, its focus is on simplifying the complex, in this case making data science easier to integrate into data apps that business people can use. So we were excited this summer, in the July ETR survey, to see that ETR added some nice data on Streamlit, which we're showing here in comparison to Snowflake's core business: the data warehousing piece on the left-hand side and the Streamlit piece on the right-hand side. Again, we show net score over time from the previous survey for Snowflake's core database and data warehouse offering on the left, as compared to Streamlit on the right. Snowflake's core product had 194 responses in the October 2022 survey; Streamlit had an N of 73, up from 52 in the July survey.
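Before we go on, the net score arithmetic described above is simple enough to sketch in a few lines of Python. This is purely an illustration of the calculation; the survey shares below are invented and are not actual ETR figures.

```python
# Illustrative sketch of an ETR-style net score calculation.
# Inputs are percentages of respondents in each spending category.
def net_score(adopting, spending_more, flat, spending_less, churning):
    # Adopters and increased spenders count positively;
    # decliners and churners count negatively; flat spend drops out.
    return (adopting + spending_more) - (spending_less + churning)

# Hypothetical breakdown for a single survey period (sums to 100%)
example = {
    "adopting": 22.0,       # new to the platform
    "spending_more": 45.0,  # spending up 6% or more
    "flat": 27.0,           # spending roughly flat
    "spending_less": 3.0,   # spending down 6% or worse
    "churning": 3.0,        # decommissioning or leaving
}

print(net_score(**example))  # 61.0, i.e. a net score of 61%
```

Because the flat-spend group drops out of the calculation entirely, a shift from "spending more" to "flat," like the one described above, pulls the net score down even when almost nobody is churning.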
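And since Streamlit keeps coming up, here is a minimal sketch of the kind of data app the library enables. The app and its figures are hypothetical, made up for demonstration, and not anything Snowflake or ETR ships; it's only meant to show why the barrier to building a data app is so low.

```python
# Minimal, hypothetical Streamlit data app: an interactive spending-momentum viewer.
import pandas as pd
import streamlit as st

st.title("Spending momentum explorer (illustrative)")

# Fabricated quarterly net scores, for demonstration only
df = pd.DataFrame({
    "survey": ["Oct 21", "Jan 22", "Apr 22", "Jul 22", "Oct 22"],
    "net_score": [77, 82, 68, 64, 60],
})

threshold = st.slider("Highlight surveys with net score above", 0, 100, 40)

st.line_chart(df.set_index("survey")["net_score"])   # trend over time
st.dataframe(df[df["net_score"] > threshold])        # filtered table
```

Saved as app.py and launched with streamlit run app.py, that handful of lines produces an interactive chart and a filterable table in the browser, which is exactly the kind of simplification play Snowflake is betting data apps will ride.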
So, a significant uptick in people responding that they're adopting Streamlit. That was pretty impressive to us. And it's hard to see on the chart, but the net score stayed pretty constant for Streamlit at 51%; it was 52%, I think, in the previous quarter, well over that magic 40% mark. When you blend it with Snowflake, though, it does bring things down a little bit. There are two key points here. One, the acquisition seems to have gained exposure right out of the gate, as evidenced by the large number of responses. And two, the spending momentum, while lower than Snowflake overall, and while it pulls the blend down when you mix it with Snowflake, is very healthy and steady.

Now let's do a little peer comparison with some of our favorite names in this space. This chart shows net score, or spending velocity, on the Y-axis, and overlap, or presence, pervasiveness if you will, in the data set on the X-axis. That red dotted line again is the 40% highly elevated net score that we like to talk about. The inserted table informs us how the companies are plotted: the net scores and the Ns determine where the dots sit. We're comparing a number of database players, although just a caution: Oracle includes all of Oracle, including its apps. We just put it in there for reference, because it is the leader in database. Right off the bat, Snowflake jumps out with a net score of 64%. The 60% from the earlier chart, again, included Streamlit, so you can see its core database and data warehouse business is actually higher than the blended company average we showed you before, because Streamlit pulls the blend down. When you separate it out, Streamlit sits right on top of Databricks. Isn't that ironic? Only Snowflake and Databricks in this selection of names are above the 40% level. You see MongoDB and Couchbase, they're solid, and Teradata Cloud is actually showing pretty well compared to some of the earlier survey results.

Now let's isolate on the database and data platform sector and see how that shapes up. For this analysis, same XY dimensions, we've added the big giants: AWS, Microsoft and Google. Notice that those three, plus Snowflake, are just at or above the 40% line. Snowflake continues to lead by a significant margin in spending momentum, and it keeps creeping to the right; that's that N we talked about earlier.

Now here's an interesting tidbit. Snowflake is often asked, and I've asked them myself many times, "How are you faring relative to AWS, Microsoft and Google, these big whales with Redshift and Synapse and BigQuery?" And Snowflake has been telling folks that 80% of its business comes from AWS. When Microsoft heard that, they said, "Whoa, wait a minute, Snowflake, let's partner up," because Microsoft is smart and they understand that the market is enormous. If they could do better with Snowflake, one, they may steal some business from AWS. And two, even if Snowflake is winning against some of the Microsoft database products, if it wins on Azure, Microsoft is going to sell more compute, more storage, more AI tools, more other stuff to those customers. AWS, meanwhile, is really aggressive from a partnering standpoint with Snowflake. They're negotiating, not openly, but they're negotiating better prices. They're realizing that when it comes to data, the cheaper you make the offering, the more people are going to consume. Scale economies and operating leverage are really powerful things that kick in at volume.
Microsoft, meanwhile, is coming along; they obviously get it. But Google is seemingly resistant to that type of go-to-market partnership. Rather than lean into Snowflake as a great partner, Google's field force is, in a fashion, fighting it. Google itself at Cloud Next heavily messaged what it calls the open data cloud, which is a direct rip-off of Snowflake. So what can we say about Google? They continue to be kind of behind the curve when it comes to go-to-market.

Now just a brief aside on the competitive posture. I've seen Frank Slootman, CEO of Snowflake, in action with his prior companies and how he depositioned the competition. At Data Domain, he eviscerated a company called Avamar with what he called their expensive and slow post-process architecture. I think he actually called it garbage, if I recall, at one conference I heard him speak at. And he sort of destroyed BMC when he was at ServiceNow, positioning them as the equivalent of the Department of Motor Vehicles. So it's interesting to hear how Snowflake openly talks about the data platforms of AWS, Microsoft, Google and Databricks. I'll give you the short bumper-sticker versions. Redshift is just an on-prem database that AWS morphed to the cloud, which by the way is kind of true; they actually did a brilliant job of it, but it's basically a fact. Microsoft's Synapse: a collection of legacy databases which also kind of morphed to run in the cloud. Even BigQuery, which is considered cloud native by many if not most, is positioned by Snowflake as originally an on-prem database to support Google's ad business, maybe. And Databricks is for those people smart enough to get into Berkeley who love complexity. Now, Snowflake doesn't mention Berkeley as far as I know; that's my addition. But you get the point.

The interesting thing about Databricks and Snowflake is that a while ago on theCUBE I said there was a new workload type emerging around data, where you have the AWS cloud, Snowflake obviously for the cloud database, and Databricks for the data science and ML. You bring those things together and there's this new workload emerging that's going to be very powerful in the future. And it's interesting to see that now the aspirations of all three of these platforms are colliding. That's quite a dynamic, especially when you see both Snowflake and Databricks putting in venture money and getting their hooks into the loyalties of the same companies, like dbt Labs and Collibra. Anyway, Snowflake's posture is: we are the pioneer in cloud native data warehousing, data sharing and now data apps, and our platform is designed for business people that want simplicity. The other guys, yes, they're formidable, but we, Snowflake, have an architectural lead, and of course we run in multiple clouds. So it's pretty strong positioning, or depositioning, you have to admit. Now, I'm not sure I agree with the BigQuery knock completely, I think that's a bit of a stretch, but Snowflake, as we see in the ETR survey data, is winning.

So in thinking about the longer-term future, let's talk about what's different with Snowflake, where it's headed and what the opportunities are for the company. Snowflake put itself on the map by focusing on simplifying data analytics. What's interesting about that is the company's founders are, as you probably know, from Oracle.
Rather than focusing on transactional data, which is Oracle's sweet spot and the stuff they worked on when they were at Oracle, the founders said, "We're going to go somewhere else. We're going to attack the data warehousing problem and the data analytics problem." And they completely re-imagined the database and how it could be applied to solve those challenges, re-imagining what was possible if you had virtually unlimited compute and storage capacity. Of course Snowflake became famous for separating compute from storage, for being able to completely shut down compute so you don't pay for it when you're not using it, for the ability to have multiple clusters hit the same data without making endless copies, and for a consumption-based cloud pricing model. And then everyone on the planet realized, "Wow, that's a pretty good idea," and every venture capitalist in Silicon Valley has been funding companies to copy that move. Today that approach has pretty much become mainstream and table stakes.

But I would argue that Snowflake not only had the lead; when you look at how others are approaching this problem, it's not necessarily as clean and as elegant. Some of the early startups, I think, get it, and maybe had the advantage of starting later, which can be a disadvantage too. But AWS is a good example of what I'm saying here. Its version of separating compute from storage was an afterthought, and given what they had it was actually quite clever and customers like it, but it's more of a, "Okay, we're going to tier to storage to lower cost, we're going to dial down the compute, not completely, we're not going to shut it off, we're going to minimize the compute required." It's really not true separation like, for instance, Snowflake has. Having said that, we're talking about competitors with lots of resources and competitive offerings, so I don't want to make this all about the product, but all things being equal, architecture matters, okay?

So that's the cloud S-curve, the first one we're showing. Snowflake's still on that S-curve, and in and of itself it's got legs, but it's not what's going to power the company to $10 billion. The next S-curve we denote is multi-cloud, in the middle. Now, while 80% of Snowflake's revenue runs on AWS, Microsoft is ramping up, and Google, well, we'll see. But the interesting part of that curve is data sharing and this idea of data clean rooms. It really should be called the data sharing curve, but I have my reasons for calling it multi-cloud. This is all about network effects and data gravity, and you're seeing this play out today, especially in industries like financial services, healthcare and government, highly regulated verticals where folks are super paranoid about compliance. They're not going to share data if they're going to get sued for it or end up on the front page of the Wall Street Journal for some kind of privacy breach. What Snowflake has done is said, "Put all the data in our cloud." Now, of course, that triggers a lot of people, because it's a walled garden, okay? It is. That's the trade-off. It's not the Wild West; it's not Windows, it's Mac, it's more controlled. But the idea is that as different parts of the organization, or even partners, begin to share the data they need, it's got to be governed, it's got to be secure, it's got to be compliant, it's got to be trusted.
So Snowflake introduced the idea of what it calls stable edges, I think that's the term they use, and it tracks a metric around them. A stable edge, think of it as a persistent edge, is an ongoing relationship between two parties that lasts for some period of time, more than a month. It's not a one-shot, one-and-done type of deal, "Oh, we shared it for a day, sent you an FTP, done." No, it's got to have trajectory over time, four weeks or six weeks or some period that's meaningful. And that metric is growing. A related metric they track: I think around 20% of Snowflake customers are actively sharing data today, and then they track the number of those edge relationships that exist. That's something unique, because most data sharing is all about making copies of data, which is great for storage companies but bad for auditors and bad for compliance officers. That trend is just starting out; that middle S-curve is going to hit the base of the steep part of the curve, and it's going to have legs through this decade, we think.

And then the third wave we show here is what we call supercloud. That's why I called it multi-cloud before, so I could invoke supercloud: the idea that you've built a PaaS layer that is purpose-built for a specific objective, in this case building data apps that are cloud native, shareable and governed. It's a long-term trend that's going to take some time to develop; application development platforms can take five to 10 years to mature and gain significant adoption, but this one's unique. This is a critical play for Snowflake. If it's going to compete with the big cloud players, it has to have an app development framework like Snowpark. It has to accommodate new data types like transactional data; that's why it announced this thing called Unistore last June at Snowflake Summit. The pattern forming here is that Snowflake is building layer upon layer with its architecture at the core. It's not, currently anyway, going out and saying, "All right, we're going to buy a company that's got another billion dollars in revenue, and that's how we're going to get to 10 billion." It's not buying its way into new markets through revenue; it's buying smaller companies that can complement Snowflake, that it can turn into revenue for growth and that fit into the data cloud.

Now, as to the $10 billion by fiscal year '28, is that achievable? That's the question. Yeah, I think so. With the momentum, resources, go-to-market, product and management prowess that Snowflake has? Yes, it's definitely achievable. And one could argue $10 billion is too conservative. Indeed, Snowflake CFO Mike Scarpelli will fully admit his forecast is built on existing offerings. He's not including revenue, as I understand it, from all the new stuff in the pipeline, because he doesn't know what it's going to look like; he doesn't know what the adoption is going to look like and doesn't have data on it, not just yet anyway. And of course things can change quite dramatically. It's possible his forecasts for existing businesses don't materialize, or competition picks them off, or a company like Databricks is actually able, in the longer term, to replicate the functionality of Snowflake with open source technologies, which would be a very competitive source of innovation.
But in our view there's plenty of room for growth; the market is enormous, and the real key is, can and will Snowflake deliver on the promise of simplifying data? Of course, we've heard this before, from data warehouses, data marts, data lakes, master data management, ETL, data movers and data copiers, Hadoop, and a raft of technologies that have not lived up to expectations. And we've also, by the way, seen some tremendous successes in the software business, with the likes of ServiceNow and Salesforce. So will Snowflake be the next great software name and hit that $10 billion magic mark? I think so. Let's reconnect in 2028 and see.

Okay, we'll leave it there today. I want to thank Chip Simonton for his input to today's episode. Thanks to Alex Myerson, who's on production and manages the podcast, and Ken Schiffman as well. Kristin Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our Editor in Chief over at SiliconANGLE; he does some great editing for us. Check it out for all the news. Remember, all these episodes are available as podcasts; wherever you listen, just search "Breaking Analysis podcast." I publish each week on wikibon.com and siliconangle.com, or you can get in touch by email at david.vellante@siliconangle.com, DM me @dvellante, or comment on our LinkedIn posts. And please do check out etr.ai; they've got the best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights, powered by ETR. Thanks for watching, thanks for listening, and we'll see you next time on Breaking Analysis. (upbeat music)
Collibra Data Citizens 22
>> Collibra is a company that was founded in 2008, right before the so-called modern big data era kicked into high gear. The company was one of the first to focus its business on data governance. Historically, data governance and data quality initiatives were back office functions, largely confined to regulated industries that had to comply with public policy mandates. But as the cloud went mainstream, the tech giants showed us how valuable data could become, and the value proposition for data quality and trust evolved from a primarily compliance-driven issue to a lynchpin of competitive advantage. Data in the decade of the 2010s, though, was largely about getting the technology to work. Highly centralized technical teams were formed, with hyper-specialized skills, to develop data architectures and processes to serve the myriad data needs of organizations. It resulted in a lot of frustration with data initiatives for most organizations that didn't have the resources of the cloud guys and the social media giants to really attack their data problems and turn data into gold. This is why today, for example, there's quite a bit of momentum toward rethinking monolithic data architectures. You hear about initiatives like data mesh and the idea of data as a product; they're gaining traction as a way to better serve the data needs of decentralized business unit users, and you hear a lot about data democratization. These decentralization efforts around data are great, but they create a new set of problems. Specifically, how do you deliver a self-service infrastructure to business users and domain experts? The cloud is definitely helping with that, but also, how do you automate governance? This becomes especially tricky as protecting data privacy has become more and more important.

In other words, while it's enticing to experiment and run fast and loose with data initiatives, kind of like the Wild West, to find new veins of gold, it has to be done responsibly. As such, the idea of data governance has had to evolve to become more automated and intelligent. Governance and data lineage are still fundamental to ensuring trust as data moves like water through an organization; no one is going to use data that isn't trusted. Metadata has become increasingly important for data discovery and data classification. As data flows through an organization, the ability to continuously check for data flaws and automate data quality has become a functional requirement of any modern data management platform. And finally, data privacy has become a critical adjacency to cybersecurity. So you can see how data governance has evolved into a much richer set of capabilities than it was 10 or 15 years ago.

Hello and welcome to theCUBE's coverage of Data Citizens, made possible by Collibra, a leader in so-called data intelligence and the host of Data Citizens 2022, which is taking place in San Diego. My name is Dave Vellante and I'm one of the hosts of our program, which is running in parallel to Data Citizens. Now, at theCUBE we like to say we extract the signal from the noise, and over the next couple of days we're gonna feature some of the themes from the keynote speakers at Data Citizens, and we'll hear from several of the executives. Felix Van de Maele, who is the co-founder and CEO of Collibra, will join us, along with one of the other founders of Collibra, Stan Christiaens, who's gonna join my colleague Lisa Martin.
I'm gonna also sit down with Laura Sellers, the Chief Product Officer at Collibra. We'll talk about some of the announcements and innovations they're making at the event, and then we'll dig in further to data quality with Kirk Hasselbeck.
Well, the other thing we've seen over the last couple of years is that the level of scrutiny that organizations are under respect to data, as data becomes more mission critical, as data becomes more impactful than important, the level of scrutiny with respect to privacy, security, regulatory compliance, as only increasing as well, which again, is really difficult in this environment of continuous innovation, continuous change, continuous growing complexity and fragmentation. >>So it's become much more acute. And, and to your earlier point, we do live in a different world and and the the past couple of years we could probably just kind of brute for it, right? We could focus on, on the top line. There was enough kind of investments to be, to be had. I think nowadays organizations are focused or are, are, are, are, are, are in a very different environment where there's much more focus on cost control, productivity, efficiency, How do we truly get value from that data? So again, I think it just another incentive for organization to now truly look at data and to scale it data, not just from a a technology and infrastructure perspective, but how do you actually scale data from an organizational perspective, right? You said at the the people and process, how do we do that at scale? And that's only, only only becoming much more important. And we do believe that the, the economic environment that we find ourselves in today is gonna be catalyst for organizations to really dig out more seriously if, if, if, if you will, than they maybe have in the have in the best. >>You know, I don't know when you guys founded Collibra, if, if you had a sense as to how complicated it was gonna get, but you've been on a mission to really address these problems from the beginning. How would you describe your, your, your mission and what are you doing to address these challenges? >>Yeah, absolutely. We, we started Colli in 2008. So in some sense and the, the last kind of financial crisis, and that was really the, the start of Colli where we found product market fit, working with large finance institutions to help them cope with the increasing compliance requirements that they were faced with because of the, of the financial crisis and kind of here we are again in a very different environment, of course 15 years, almost 15 years later. But data only becoming more important. But our mission to deliver trusted data for every user, every use case and across every source, frankly, has only become more important. So what has been an incredible journey over the last 14, 15 years, I think we're still relatively early in our mission to again, be able to provide everyone, and that's why we call it data citizens. We truly believe that everyone in the organization should be able to use trusted data in an easy, easy matter. That mission is is only becoming more important, more relevant. We definitely have a lot more work ahead of us because we are still relatively early in that, in that journey. >>Well, that's interesting because, you know, in my observation it takes seven to 10 years to actually build a company and then the fact that you're still in the early days is kind of interesting. I mean, you, Collibra's had a good 12 months or so since we last spoke at Data Citizens. Give us the latest update on your business. What do people need to know about your, your current momentum? >>Yeah, absolutely. 
Again, there's a lot of tailwind; organizations are only now maturing their data practices, and we've seen that influence a lot of the business growth and the broader adoption of the platform. We work with some of the largest organizations in the world, whether it's Adobe, Heineken, Bank of America and many more. We have now over 600 enterprise customers, all industry leaders, in every single vertical. So it's really exciting to see that and to continue to partner with those organizations. On the partnership side, again, a lot of momentum in the market with some of the cloud partners like Google, Amazon, Snowflake, Databricks and others, right? As those new, modern data infrastructures and modern data architectures all move to the cloud, it's a great opportunity for us, our partners and of course our customers, to help them transition to the cloud even faster. And so we see a lot of excitement and momentum there.

We made an acquisition about 18 months ago around data quality and data observability, which we believe is an enormous opportunity. Of course, data quality isn't new, but I think there are a lot of reasons why we're so excited about quality and observability now. One is around leveraging AI and machine learning, again to drive more automation. And the second is that those data pipelines that are now being created in the cloud, in these modern data architectures, have become mission critical. They've become real time. And so monitoring and observing those data pipelines continuously has become absolutely critical, so we're really excited about that as well. And on the organizational side, I'm sure you've heard the term data mesh, something that's gaining a lot of momentum, rightfully so. It's really the type of governance that we've always believed in: federated, focused on domains, giving a lot of ownership to different teams. I think that's the way to scale data organizations, and so that aligns really well with our vision, and from a product perspective we've seen a lot of momentum with our customers there as well.

>> Yeah, you know, a couple of things there. I mean, the acquisition of OwlDQ, you know, Kirk Hasselbeck and their team. It's interesting: data quality used to be this back office function, really confined to highly regulated industries. It's come to the front office; it's top of mind for chief data officers. Data mesh, you mentioned: you guys are a connective tissue for all these different nodes on the data mesh. That's key. And of course we see you at all the shows. You're a critical part of many ecosystems, and you're developing your own ecosystem. So let's chat a little bit about the products. We're gonna go deeper into products later on at Data Citizens 22, but we know you're debuting some new innovations, whether it's under the covers in security, making data more accessible for people, or just dealing with workflows and processes, as you talked about earlier. Tell us a little bit about what you're introducing.

>> Yeah, absolutely. We're super excited, a ton of innovation.
And if we think about the big theme, like I said, we're still relatively early in this journey toward that mission of data intelligence, that really bold and compelling mission. Some customers are just starting on that journey, and we wanna make it as easy as possible for organizations to actually get started, because we know how important that is. And for the organizations and customers that have been with us for some time, there's still a tremendous amount of opportunity to expand the platform further, and again to make it easier to accomplish that mission and vision around the data citizen: that everyone has access to trustworthy data in a very easy way. So that's really the theme of a lot of the innovation we're driving: a lot of ease of adoption and ease of use, but also, how do we make sure that Collibra becomes a mission-critical enterprise platform from a security, performance, architecture, scale and supportability standpoint, so that we're truly able to deliver that kind of enterprise, mission-critical platform? That's the big theme from an innovation perspective, and from a product perspective there's a lot of new innovation that we're really excited about.

A couple of highlights. One is around the data marketplace. Again, a lot of our customers have plans in that direction. How do we make it easy? How do we make available a true shopping experience, so that anybody in your organization can, in a very easy, search-first way, find the right data product, find the right data set, and then consume it? Usage analytics: how do we help organizations drive adoption, tell them where things are working really well and where they have opportunities? Homepages, again, to make things easy for anyone in your organization to get started. And you mentioned the workflow designer: we have a very powerful enterprise platform, and one of our key differentiators is the ability to really drive a lot of automation through workflows. Now we've provided a new low code, no code workflow designer experience, so customers can really take it to the next level.

There's a lot more. New product around Collibra Protect, which, in partnership with Snowflake, which has been a strategic investor in Collibra, is focused on how we make access governance easier: how are we able to make sure that as you move to the cloud, things like access management and masking around sensitive data and PII are managed in a much more effective way? Really excited about that product. There's more around data quality: how do we get that deployed as easily, quickly and widely as we can? Moving that to the cloud has been a big part of our strategy, so we've launched more data quality cloud product, as well as making use of the native compute capabilities in platforms like Snowflake, Databricks, Google, Amazon and others. And so we're delivering a capability we call pushdown: actually pushing down the compute and the data quality monitoring into the underlying platform, which again, from a scale, performance and ease-of-use perspective, is gonna make a massive difference. And then, more broadly, we talked a little bit about the ecosystem. Again, integrations: we talk about being able to connect to every source.
Integrations are absolutely critical, and we're really excited to deliver new integrations with Snowflake, Azure and Google Cloud Storage as well. So there's a lot coming out. The team has been hard at work, and we're really excited about what we're bringing to market.

>> Yeah, a lot going on there. I wonder if you could give us your closing thoughts. I mean, you talked about the marketplace, you think about data mesh, you think of data as a product, one of the key principles, you think about monetization. This is really different than what we've been used to in data, where just getting the technology to work has been so hard. So how do you see the future? Give us your closing thoughts, please.

>> Yeah, absolutely. And I think we're really at this pivotal moment, and I think you said it well. We all know the constraints and the challenges with data, how to actually do data at scale. And while we've seen a ton of innovation on the infrastructure side, we fundamentally believe that just getting a faster database is important, but it's not gonna fully solve the challenges and truly deliver on the opportunity. And that's why now is really the time to deliver this data intelligence vision, this data intelligence platform. We are still early, and making it as easy as we can is our mission. So I'm really excited to see how the market's gonna evolve over the next few quarters and years. I think the trend is clearly there. When we talk about data mesh, this kind of federated approach focused on data products, it's just another signal that we believe a lot of organizations now understand the need to go beyond just the technology. They really need to think about how to actually scale data as a business function, just like we've done with IT, with HR, with sales and marketing, with finance. That's how we need to think about data. I think now is the time, given the economic environment that we're in, with much more focus on control, productivity and efficiency. Now is the time we need to look beyond just the technology and infrastructure, and think about how to scale data, how to manage data at scale.

>> Yeah, it's a new era. The next 10 years of data won't be like the last, as I always say. Felix, thanks so much, and good luck in San Diego. I know you're gonna crush it out there.

>> Thank you, Dave.

>> Yeah, it's a great spot for an in-person event, and of course the content post-event is gonna be available at collibra.com, and you can of course catch theCUBE coverage at thecube.net and all the news at siliconangle.com. This is Dave Vellante for theCUBE, your leader in enterprise and emerging tech coverage.

>> Hi, I'm Jay from Collibra's Data Office. Today I want to talk to you about Collibra's Data Intelligence Cloud. We often say Collibra is a single system of engagement for all of your data. Now, when I say data, I mean data in the broadest sense of the word, including reference data and metadata: think of metrics, reports, APIs, systems, policies, and even business processes that produce or consume data. The beauty of this platform is that it ensures all of your users have an easy way to find, understand, trust, and access data. But how do you get started? Well, here are seven steps to help you get going. One, start with the data. What's data intelligence without data?
Leverage the Collibra Data Catalog to automatically profile and classify your enterprise data wherever that data lives: databases, data lakes or data warehouses, whether in the cloud or on premise. Two, you'll then wanna organize the data, and you'll do that with data communities. This can be by department, line of business or functional team, however your organization organizes work and accountability, and for that you'll establish community owners. Communities make it easy for people to navigate through the platform and find the data, and they help create a sense of belonging for users. An important and related side note here: we find it's typical in many organizations that data is thought of as just an asset, and IT and data offices are viewed as its owners and as the central teams performing analytics as a service provider to the enterprise. We believe data is more than an asset; it's a true product that can be converted to value. And that also means establishing business ownership of data, where strategy and ROI come together with subject matter expertise.

Okay, three. Next, back to those communities: there, the data owners should explain and define their data, not just the tables and columns, but also the related business terms, metrics and KPIs. These objects, which we call assets, are typically organized into business glossaries and data dictionaries. I definitely recommend starting with the topics that are most important to the business. Four, these steps now enable you and your users to have some fun with it: linking everything together builds your knowledge graph, also known as a metadata graph, by relating these assets to one another. For example, relating a data set to a KPI to a report now enables your users to see what we call the lineage diagram, which visualizes where the data in your dashboards actually came from, what the data means and who's responsible for it. Speaking of which, here's five: leverage the Collibra Trusted Business Reporting solution on the marketplace, which comes with workflows for those owners to certify their reports, KPIs and data sets. This helps foster trust in their data. Six, easy-to-navigate dashboards, or landing pages, right in your platform for your company's business processes are the most effective way for everyone to better understand and take action on data. Here's a pro tip: use the dashboard design kit on the marketplace to help you build compelling dashboards. Finally, seven: promote the value of this to your users, and be sure to schedule enablement office hours and new employee onboarding sessions to get folks excited about what you've built and implemented. Better yet, invite all of those community and data owners to these sessions so they can show off the value they've created. Those are my seven tips to get going with Collibra. I hope these have been useful. For more information, be sure to visit collibra.com.

>> Welcome to theCUBE's coverage of Data Citizens 2022, Collibra's customer event. My name is Dave Vellante. With us is Kirk Hasselbeck, who's the vice president of Data Quality at Collibra. Kirk, good to see you. Welcome.

>> Thanks for having me, Dave. Excited to be here.

>> You bet. Okay, we're gonna discuss data quality and observability. It's a hot trend right now. You founded a data quality company, OwlDQ, and it was acquired by Collibra last year. Congratulations. And now you lead data quality at Collibra. So we're hearing a lot about data quality right now.
Why is it such a priority? Take us through your thoughts on that.

>> Yeah, absolutely. It's definitely exciting times for data quality, which, you're right, has been around for a long time. So why now, and why is it so much more exciting than it used to be? It used to be thought of as a bit stale, but we all know that companies use more data than ever before, and the variety has changed and the volume has grown. And while I think that remains true, there are a couple of other hidden factors at play that explain why everyone's so interested in this now. You could break it down simply: think about it, Dave, if you and I were gonna build a new healthcare application and monitor the heartbeat of individuals, imagine if we get that wrong, what the ramifications could be, what those incidents would look like. Or maybe better yet, we try to build a new trading algorithm with a crossover strategy where the 50-day average crosses the 10-day average. Imagine if the data underlying the inputs to that is incorrect. We would probably have major financial ramifications in that sense. So it kind of starts there, where everybody's realizing that we're all data companies, and if we are using bad data, we're likely making incorrect business decisions.

But I think there are a couple of other things at play. You know, I bought a car not too long ago, and my dad called and said, "How many cylinders does it have?" And I realized in that moment I might have failed him, because I didn't know. I used to ask those types of questions, about anti-lock brakes and cylinders, and whether it's manual or automatic, and I realized I now just buy a car that I hope works. It's so complicated, with all the computer chips, that I really don't know that much about it. And that's what's happening with data. We're just loading so much of it, and it's so complex, that the way companies consume it in the IT function is that they bring in a lot of data and then syndicate it out to the business. And it turns out that the individuals loading and consuming all of this data for the company may not actually know that much about the data itself, and that's not even their job anymore. So we'll talk more about that in a minute, but that's really what's setting the foreground for this observability play and why everybody's so interested: we're becoming less close to the intricacies of the data, and we just expect it to always be there and be correct.

>> You know, the other thing too about data quality: for years we did the MIT CDOIQ event; we didn't do it last year, Covid messed everything up. But the observation I would make is that data quality, it used to be called information quality, used to be this back office function, and then it became sort of front office with financial services and government and healthcare, these highly regulated industries. And then the whole chief data officer thing happened, and people flipped the bit from data as a risk to data as an asset. And now, as we say, we're gonna talk about observability. So the whole quality issue has really become front and center, because data's so fundamental, hasn't it?

>> Yeah, absolutely. I mean, let's imagine we pull up our phones right now and I go to my favorite stock ticker app and I check out the NASDAQ market cap.
I really have no idea if that's the correct number. I know it's a number, it looks large, it's in a numeric field. And that's kind of what's going on. There are so many numbers, and they're coming from all of these different sources and data providers, and they're getting consumed and passed along. But there isn't really a way to tactically put controls on every number and metric across every field we plan to monitor. With the scale that we've achieved, though, even in the early days before Collibra, what's been so exciting is that we have these types of observation techniques, these data monitors, that can actually track the past performance of every field at scale. And why that's so interesting, and why I think the CDO is listening intently to this topic nowadays, is that maybe we could surface all of these problems with the right data observability solution, at the right scale, and then just be alerted on breaking trends. So we're sort of shifting away from this world where you must write a condition, and when that condition breaks, that's what was always known as a break record. But what about breaking trends and root cause analysis? And is it possible to do that with less human intervention? I think most people are seeing now that it's going to have to be a software tool and a computer system. It's not ever going to be based on one or two domain experts anymore.

>> So how does data observability relate to data quality? Are they sort of two sides of the same coin? Are they cousins? What's your perspective on that?

>> Yeah, it's super interesting. It's an emerging market, so the language is changing and the topic areas are changing. The way I like to break it down, because the lingo is constantly a moving target in this space, is breaking records versus breaking trends. I could write a condition: when this thing happens it's wrong, and when it doesn't, it's correct. Or I could look for a trend, and I'll give you a good example. Everybody's talking about fresh data and stale data, and why would that matter? Well, if your data never arrived, or only part of it arrived, or it didn't arrive on time, it's likely stale, and there is no condition you could write that would show you all the goods and the bads. That was kind of your traditional approach to data quality, break records. The modern-day approach is: you lost a significant portion of your data, or it did not arrive on time to make that decision accurately, and that's a hidden concern. Some people call this freshness; we call it stale data. But it all points to the same idea: the thing that you're observing may not be a data quality condition anymore, it may be a breakdown in the data pipeline. And with thousands of data pipelines in play for every company out there, there's more than a couple of these happening every day.

>> So what's the Collibra angle on all this? You made the acquisition, you've got data quality and observability coming together, you guys have a lot of expertise in this area, but you hear about provenance of data, you just talked about stale data, the whole trend toward real time. How is Collibra approaching the problem, and what's unique about your approach?

>> Well, I think where we're fortunate is, with our background, myself and the team, we sort of lived this problem for a long time, back in the Wall Street days about a decade ago.
And we saw it from many different angles. What we came up with, before it was called data observability or reliability, was basically the underpinnings of that. So we're a little bit ahead of the curve there; when most people evaluate our solution, it's more advanced than some of the observation techniques that currently exist. But we've also always covered data quality, and we believe that people want to know more, they need more insights, and they want to see break records and breaking trends together so they can correlate the root cause. And we hear that all the time: "I have so many things going wrong, just show me the big picture. Help me find the thing that, if I were to fix it today, would make the most impact." So we're really focused on root cause analysis, business impact, and connecting it with lineage and catalog metadata. And as that grows, you can actually achieve total data governance. At this point, with the acquisition of what was a lineage company years ago, and then my company, OwlDQ, now Collibra Data Quality, Collibra may be the best positioned for total data governance and intelligence in the space.

>> Well, you mentioned financial services a couple of times, and some examples. Remember the flash crash in 2010? Nobody had any idea what that was; they just said, "Oh, it's a glitch," so they didn't understand the root cause of it. So this is a really interesting topic to me. And we know at Data Citizens 22 that you're announcing, you've gotta announce new products, right? It's your yearly event. What's new? Give us a sense as to what products are coming out, specifically around data quality and observability.

>> Absolutely. There's always a next thing on the forefront, and the one right now is these hyperscalers in the cloud. So you have databases like Snowflake and BigQuery, and Databricks with Delta Lake, and SQL pushdown. Ultimately what that means is that a lot of people are storing and loading data even faster, in a SaaS-like model, and we've started to hook into these databases. While we've always worked with these same databases in the past, and they're supported today, we're now doing something called native database pushdown, where the entire compute and data activity happens in the database. Why that is so interesting and powerful now is that everyone's concerned with something called egress: did the data that I've spent all this time and money securing with my security team ever leave my hands? Did it ever leave my secure VPC, as they call it? And with these native integrations that we're building and about to unveil, and here's a sneak peek for next week at Data Citizens, we're now doing all compute and data operations in databases like Snowflake. What that means is that with no install and no configuration, you can log into the Collibra Data Quality app and have all of your data quality running inside the database that you've probably already picked as your go-forward, secured database of choice. So we're really excited about that. And I think if you look at the whole landscape of network cost, egress cost, data storage and compute, what people are realizing is that it's extremely efficient to do it in the way we're about to release here next week.

>> So this is interesting, because what you just described, you know, you mentioned Snowflake, you mentioned Google, and actually, yeah, you mentioned Databricks. You know, Snowflake has the data cloud.
>>So this is interesting, because what you just described — you mentioned Snowflake, you mentioned Google, and actually, yeah, you mentioned Databricks. Snowflake has the data cloud; if you put everything in the data cloud, okay, you're cool. But then Google's got the open data cloud, if you watched Google Next, and Databricks doesn't call it the data cloud, but they have something like the open source data cloud. So you have all these different approaches, and up until now there's really been no way, I'm hearing, to understand the relationships between all of those and have confidence across them. It's like Zhamak Dehghani says: you should just be a node on the mesh. I don't care if it's a data warehouse or a data lake or where it comes from — it's a point on that mesh, and I need tooling to be able to have confidence that my data is governed and has the proper lineage and provenance. That's what you're bringing to the table. Did I get that right? >>Yeah, that's right. And for us, it's not that we haven't been working with those great cloud databases; it's the fact that we can send them the instructions now — we can send them the operating ability to crunch all of the calculations, the governance, the quality — and get the answers. What that does is deliver basically zero network cost, zero egress cost, zero latency. So when you log into BigQuery tomorrow using our tool, or Snowflake, for example, you have instant data quality metrics, instant profiling, instant lineage, and access and privacy controls — things of that nature that just become less onerous. What we're seeing is there's so much technology out there, just like all of the major brands that you mentioned, but how do we make it easier? The future is about fewer clicks, faster time to value, faster scale, and eventually lower cost, and we think that positions us to be the leader there. >>I love this example, because Barry talks about how the cloud guys are going to own the world, and of course now we're seeing that the ecosystem is finding so much white space to add value and connect across clouds — sometimes we call it supercloud, or interclouding. All right, Kirk, give us your final thoughts on the trends that we've talked about and on Data Citizens 22. >>Absolutely. Well, one big trend is discovery and classification — we're seeing that across the board. People used to know it was a zip code, and nowadays, with the amount of data that's out there, they want to know where everything is, where their sensitive data is, and whether it's redundant — tell me everything, inside of three to five seconds. And with that comes wanting to know, in all of these hyperscale databases, how fast they can get controls and insights out of their tools. So I think we're going to see more one-click solutions, more SaaS-based solutions, and solutions that hopefully prove faster time to value on all of these modern cloud platforms. >>Excellent. All right, Kirk Hasselbeck, thanks so much for coming on the Cube and previewing Data Citizens 22. Appreciate it. >>Thanks for having me, Dave. >>You're welcome. Right, and thank you for watching. Keep it right there for more coverage from the Cube. Welcome to the Cube's virtual coverage of Data Citizens 2022. My name is Dave Vellante, and I'm here with Laura Sellers, who's the Chief Product Officer at Collibra, the host of Data Citizens. Laura, welcome. Good to see you. >>Thank you. Nice to be here. >>Yeah, your keynote at Data Citizens this year focused on your mission to drive ease of use and scale.
Now, when I think about it, historically, fast access to the right data at the right time, in a form that's easily consumable, has been kind of challenging, especially for business users. Can you explain to our audience why this matters so much, and what's actually different today in the data ecosystem to make this a reality? >>Yeah, definitely. What we really need — and what I hear from customers every single day — is a new approach to data management. What inspired me to come to Collibra a little over a year ago was really the fact that they're very focused on bringing trusted data to more users, across more sources, for more use cases. So as we look at what we're announcing with these innovations in ease of use and scale, it's really about making teams more productive in getting started with, and being able to manage, data across the entire organization. We've been very focused on richer experiences, a broader ecosystem of partners, and a platform that delivers the performance, scale, and security that our users and teams need and demand. So as we look at — oh, go ahead. >>I was going to say, when I look back at the last 10 years, it was all about getting the technology to work, and it was just so complicated. But please carry on — I'd love to hear more about this. >>Yeah. Collibra is a system of engagement for data, and we really are working on bringing that entire system of engagement to life for everyone to leverage here and now. What we're announcing on the ease-of-use side is, first, our data marketplace: the ability for all users to discover and access data quickly and easily — shop for it, if you will. The next thing we're introducing is the new homepage; it's really about driving adoption and having users find data more quickly. The two other areas on the ease-of-use side are usage analytics and workflow designer. One of the big pushes and passions we have at Collibra is to help with the data-driven culture that all companies are trying to create, and also to help with data literacy. With something like usage analytics, it's really about driving adoption of the Collibra platform: understanding what's working, who's accessing it, and what's not. And then finally we're also introducing what's called workflow designer. We love our workflows at Collibra — it's a big differentiator to be able to automate business processes — and the designer is really a way for more people to be able to create those workflows, collaborate on those workflows, and easily interact with them. So a lot of exciting things when it comes to ease of use, to make it easier for all users to find data. >>Yes, there's definitely a lot to unpack there. You mentioned this idea of shopping for the data. That's interesting to me. Why this analogy — metaphor or analogy, I always get those confused; let's go with analogy — why is it so important to data consumers? >>I think when you look at the world of data, and I talked about this system of engagement, it's really about making it more accessible to the masses. And what users are used to is a shopping experience, like your Amazon, if you will.
So having a consumer-grade experience where users can quickly go in and find the data, trust that data, understand where the data is coming from, and then be able to quickly access it — that's the idea of being able to shop for it: making it as simple as possible and really speeding the time to value for any of the business analysts and data analysts out there. >>Yeah. You see a lot of discussion about rethinking data architectures, putting data in the hands of the users and business people, decentralized data — and of course that's awesome, I love that. But then you have to have self-service infrastructure, and you have to have governance, and those are really challenging. I think so many organizations are facing adoption challenges when it comes to enabling teams generally, and especially domain experts, to adopt new data technologies. The tech comes fast and furious, you've got all these open source projects, it gets really confusing, and of course it risks security, governance, and all that good stuff, plus all this jargon. So where do you see the friction in adopting new data technologies? What's your point of view, and how can organizations overcome these challenges? >>You're dead on. There's so much technology, and there's so much to stay on top of, which is part of the friction — just being able to stay ahead of, and understand, all the technologies that are coming. You also have so many more sources of data, and people are migrating data to the cloud and to new sources. Where the friction comes in is the ability to understand where the data came from, where it's moving to, and then to be able to put the access controls on top of it, so people are only getting access to the data they should be getting access to. So one of the other things we're announcing, with all of the innovations that are coming, is what we're doing around performance and scale. With all of the data movement, with all of the data that's out there, the first thing we're launching in the world of performance and scale is data quality. It's something Collibra has been working on for the past year and a half, and we're launching the ability to have data quality in the cloud — it's currently an on-premise offering, but we'll now be able to carry that over into the cloud and manage it that way. We're also introducing the ability to push down data quality into Snowflake. Again, one of those challenges is making sure that the data you have is high quality as you move forward, and so this is really about reducing friction: you already have Snowflake stood up, it's not another machine for you to manage, it's just pushdown capability into Snowflake to be able to track that quality. Another thing we're launching with that is what we call Collibra Protect. This is the ability for users to ingest metadata, understand where the PII data is, and then set policies on top of it — so very quickly you can set policies and have them enforced at the data level, and anybody in the organization is only getting access to the data they should have access to.
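As a rough illustration of what "policies enforced at the data level" can look like in Snowflake itself, here is a hedged sketch that applies a native masking policy so only an approved role sees raw PII. This shows a generic Snowflake mechanism, not how Collibra Protect is actually implemented; the role, table, and column names are hypothetical.

```python
# Hedged sketch: enforce a PII policy "at the data level" with a native Snowflake
# masking policy, applied here via the standard Python connector.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="governance_admin", password="***",
    warehouse="GOVERNANCE_WH", database="CRM", schema="PUBLIC",
)
cur = conn.cursor()

# Anyone who is not in the PII_READER role sees a masked value instead of the raw email.
cur.execute("""
CREATE MASKING POLICY IF NOT EXISTS mask_email AS (val STRING) RETURNS STRING ->
    CASE WHEN CURRENT_ROLE() IN ('PII_READER') THEN val ELSE '*** MASKED ***' END
""")
cur.execute("ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY mask_email")
conn.close()
```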
>>This topic of data quality is interesting — it's something that I've followed for a number of years. It used to be a back-office function, really confined to highly regulated industries like financial services, healthcare, and government. If you look back over a decade ago, you didn't have this worry about personal information; GDPR and the California Consumer Privacy Act have made it so much more important. The cloud has really changed things in terms of performance and scale, and of course partnering with Snowflake is all about sharing data and monetization — anything but a back-office function. So it was kind of smart that you guys were early on, and of course attracting them as an investor as well was very strong validation. What can you tell us about the nature of the relationship with Snowflake, and specifically I'm interested in joint engineering or product innovation efforts, beyond the standard go-to-market stuff? >>Definitely. So you mentioned they became a strategic investor in Collibra about a year ago — a little less than that, I guess. We've been working with them, though, for over a year, really tightly with their product and engineering teams, to make sure that Collibra is adding real value. Our unified platform is touching all pieces of Snowflake. What I mean by that is, first, we're able to ingest data with Snowflake, which has always existed; we're able to profile and classify that data; and we're announcing with Collibra Protect this week that you're now able to create those policies on top of Snowflake and have them enforced. So again, people can get more value out of their Snowflake more quickly, as far as time to value, with our policies that all business users are able to create. We're also announcing Snowflake Lineage 2.0. This is the ability to take stored procedures in Snowflake and understand the lineage — where did the data come from, and how was it transformed within Snowflake — as well as the data quality pushdown, as I mentioned. Data quality — you brought it up — is a big industry push, and one of the things I think Gartner mentioned is that people are losing up to $15 million by not having great data quality. So this pushdown capability for Snowflake really is, again, a big ease-of-use push for us at Collibra: the ability to push it into Snowflake, take advantage of the data source and the engine that already lives there, and make sure you have the right quality.
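To give a feel for what deriving lineage from stored SQL can mean in principle, here is a toy sketch that scans procedure text for INSERT INTO ... SELECT statements and records source-to-target edges. Real lineage products parse SQL with a proper parser and handle far more cases; this regex example and its table names are purely illustrative and are not a description of Snowflake Lineage 2.0.

```python
# Toy lineage extraction: find "INSERT INTO <target> ... FROM/JOIN <source>" pairs.
import re

procedure_body = """
INSERT INTO analytics.daily_revenue
SELECT order_date, SUM(amount) FROM raw.orders GROUP BY order_date;

INSERT INTO analytics.customer_summary
SELECT c.id, COUNT(o.order_id)
FROM raw.customers c JOIN raw.orders o ON o.customer_id = c.id
GROUP BY c.id;
"""

edges = []
for stmt in procedure_body.split(";"):
    target = re.search(r"INSERT\s+INTO\s+([\w.]+)", stmt, re.IGNORECASE)
    sources = re.findall(r"(?:FROM|JOIN)\s+([\w.]+)", stmt, re.IGNORECASE)
    if target:
        edges += [(src, target.group(1)) for src in sources]

for src, dst in edges:
    print(f"{src} -> {dst}")
```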
>>I mean, the nice thing about Snowflake is that if you play in the Snowflake sandbox, you can get a high degree of confidence that the data sharing can be done in a safe way, and bringing Collibra into the story allows me to have the data quality and the governance that I need. We've said many times on the Cube that one of the notable differences in cloud this decade versus last decade — there are obvious differences just in terms of scale and scope — is that it's shaping up to be about the strength of the ecosystems. That's really a hallmark of these big cloud players; it's a key factor for innovating, accelerating product delivery, and filling gaps in the hyperscale offerings, because you get more mature stack capabilities, and it creates this flywheel momentum, as we often say. So my question is: how do you work with the hyperscalers — whether it's AWS or Google or whomever — and what do you see as your role, and what's the Collibra sweet spot? >>Yeah, definitely. One of the things I mentioned early on is that the broader ecosystem of partners is what it's all about. We have that strong partnership with Snowflake. We're also doing more with Google around GCP and Collibra Protect there, and also tighter Dataplex integration. So, similar to what you've seen with our strategic moves around Snowflake, and really covering the broad ecosystem of what Collibra can do on top of that data source, we're extending that to the world of Google and the world of Dataplex. We also have great partners in the SIs. Infosys is somebody we spoke with at the conference who's done a lot of great work with Levi's; they're really important to help people with their whole data strategy and with driving that data-driven culture, with Collibra at the core of it. >>Laura, we're going to end it there, but I wonder if you could put a bow on this year and the event from your perspective — just give us your closing thoughts. >>Yeah, definitely. I want to say this is one of the biggest releases Collibra's ever had — definitely the biggest one since I've been with the company, a little over a year. We have all these great new product innovations coming to really drive ease of use and to make data more valuable for users everywhere and companies everywhere. It's all about everybody being able to easily find, understand, trust, and get access to that data going forward. >>Well, congratulations on all the progress. It was great to have you on the Cube — the first time, I believe — and I really appreciate you taking the time with us. >>Yes, thank you for your time. >>You're very welcome. Okay, you're watching the coverage of Data Citizens 2022 on the Cube, your leader in enterprise and emerging tech coverage. >>So data modernization oftentimes means moving some of your storage and compute to the cloud, where you get the benefit of scale and security and so on. But ultimately it doesn't take away the silos that you have: we have more locations, more tools, and more processes with which we try to get value from this data. To do that at scale in an organization, the people involved in this process have to understand each other. So you need to unite those people across those tools, processes, and systems with a shared language. When I say "customer," do you understand the same thing as you're hearing "customer"? Are we counting them in the same way? That shared language unites us, and that gives the organization as a whole the opportunity to get the maximum value out of their data assets — and then they can democratize data, so everyone can properly use that shared language to find, understand, and trust the data assets that are available. >>And that's where Collibra comes in. We provide a centralized system of engagement that works across all of those locations and combines all of those different user types across the whole business. At Collibra, we say we're united by data, and that also means that we're united by data with our customers. So here is some data about some of our customers. There was the case of an online do-it-yourself platform who grew their revenue almost three times from a marketing campaign that put the right product in the hands of the right people.
Another case that comes to mind is a financial services organization who saved over 800K every year because they were able to reuse the same data in different kinds of reports. Before, it was spread out over different tools and processes and silos, and now the platform brought them together, so they realized: oh, we're actually using the same data — let's find a way to make this more efficient. And the last example that comes to mind is that of a large home mortgage loan provider with a very complex landscape, a very complex architecture — legacy and the cloud, et cetera. They're using our software, our platform, to unite all the people and those processes and tools, to get a common view of data and to manage their compliance at scale. >>Hey everyone, I'm Lisa Martin covering Data Citizens 22, brought to you by Collibra. This next conversation is going to focus on the importance of data culture. One of our Cube alumni is back: Stan Christians is Collibra's co-founder and its Chief Data Citizen. Stan, it's great to have you back on the Cube. >>Hey Lisa, nice to be here. >>So we're going to be talking about the importance of data culture, data intelligence, maturity — all those great things. When we think about the data revolution that every business is going through, it's so much more than technology innovation; it also really requires cultural transformation and community transformation, and those are challenging for customers to undertake. Talk to us about what you mean by data citizenship, and the role that creating a data culture plays in that journey. >>Right. So as you know, our event is called Data Citizens because we believe that, in the end, a data citizen is anyone who uses data to do their job. And we believe that in today's organizations, most of the employees are somehow going to be a data citizen, right? So you need to make sure that these people are aware of it, and that people have the skills and competencies to do with data what's necessary. So what does it mean to have a good data culture? It means that if you're building a beautiful dashboard to try and convince your boss that we need to make this decision, your boss is also open to, and able to interpret, the data presented in that dashboard, to actually make that decision and take that action. And once you have that across the organization, that's when you have a good data culture. Now, that's a continuous effort for most organizations, because they're always moving, they're hiring new people — and it has to be a continuous effort because, on the one hand, organizations continue to be challenged to know their data sources and where all the data is flowing, which in itself creates a lot of risk, while on the other hand of the equation you have the benefit. You might look at regulatory drivers — we have to do this — but it's much better right now to consider the competitive drivers, for example. We did an IDC study earlier this year, quite interesting, and I can recommend anyone to read it. One of the conclusions they found, as they surveyed over a thousand people across organizations worldwide, is that the ones who are higher in maturity —
the organizations that really look at data as an asset, look at data as a product, and actively try to be better at it — have three times as good a business outcome as the ones who are lower on the maturity scale. So you can say: okay, I'm doing this data culture work for everyone, waking them up as data citizens, and I'm doing this for competitive reasons — you're trying to bring both of those together, and the ones who get data intelligence right are the ones that are successful and competitive. That's what we're seeing out there in the market. >>Absolutely. We know, just generally, Stan, that the organizations that are really creating a data culture and enabling everybody within the organization to become data citizens are — we know that in theory — more competitive and more successful. But the IDC study that you just mentioned demonstrates they're three times more successful and competitive than their peers. Talk about how Collibra advises customers to create that community, that culture of data, when it might be challenging for an organization to adapt culturally. >>Of course it's difficult for an organization to adapt, but it's also necessary, as you just said. Imagine that you're a modern-day organization — laptops, what have you — and you're not using those, or you're delivering them throughout the organization but not enabling your colleagues to actually do something with that asset. The same thing is true with data today: if you're not properly using the data asset and competitors are, they're going to get more advantage. So as to how you get this done and establish it, there are a couple of angles to look at, Lisa. One angle is obviously the leadership, whereby whoever is the boss of data in the organization — you typically have multiple bosses there, like chief data officers; sometimes there are multiple, or they may have a different title, so I'm just going to summarize it as a data leader for a second — whoever that is, they need to make sure that there's a clear vision and a clear strategy for data. And that strategy needs to include the monetization aspect: how are you going to get value from data? That's one part, because then you can show leadership in the organization and also the business value, and those people's job, in essence, really is to make everyone in the organization think about data as an asset. And I think the second part of the equation of getting that right is that it's not enough to just have that leadership out there; you also have to get the hearts and minds of the data champions across the organization — you really have to win them over. If you have those two combined, and obviously a good technology to connect those people and have them execute on their responsibilities, such as a data intelligence platform like Collibra's, then you're in place to really start upgrading that culture inch by inch, if you will. >>Yes, I like that — the recipe for success. So you are the co-founder of Collibra; you've worn many different hats along this journey, and now you're building Collibra's own data office. I like how, before we went live, we were talking about how Collibra is drinking its own champagne — I always love to hear stories about that. You're speaking at Data Citizens 2022.
Talk to us about how you are building a data culture within Collibra, and what maybe some of the specific projects are that Collibra's data office is working on. >>Yes, and it is indeed Data Citizens — there are a ton of speakers here, and I'm very excited. We have Barb from MIT speaking about data monetization, we have Dilla joining at the last minute — so a really exciting agenda, and I can't wait to get back out there, essentially. So, over the years — we've been doing this since 2008, so a good number of years, and I think we have another decade of work ahead in the market, just to be very clear: data is here to stick around, as are we. When you start a company, we were four people, so everybody's wearing all sorts of hats at the time, and over the years I've run presales, sales, partnerships, product, et cetera. As our company got a little bit bigger — we're now a thousand-plus people in the company — systems and processes become a lot more important. So we said: Collibra is getting to the size of our customers in terms of organization structure, processes, systems, et cetera, so it's really time for us to put our money where our mouth is and stand up our own data office, which is what we were seeing at customers' organizations worldwide. Organizations have HR units, they have a finance unit, and over time they'll all have a data department, if you will, that is responsible somehow for the data. So we said, okay, let's try to set an example that other people can take away from. We set up a data strategy, we started building data products, we took care of the data infrastructure — that sort of good stuff. And in doing all of that — Lisa, exactly as you said — we said, okay, we need to also use our own product and our own practices, and from that use, learn how we can make the product better and the practices better, and share that learning with everyone. On Monday mornings we sometimes refer to that as eating our own dog food; on Friday evenings we refer to it as drinking our own champagne. >>I like it. >>So we had a driver to do this — there's a clear business reason — so we included that in the data strategy, and that's a little bit of our origin. Now, how do we organize this? We have three pillars, and by no means is this a template that everyone should follow; this is just the organization that works at our company, but it can serve as an inspiration. We have a pillar which is data science — the data product builders, if you will, or the people who help the business build data products. We have the data engineers, who help keep the lights on for that data platform, to make sure that the data products can run, the data can flow, and the quality can be checked. And then we have the data intelligence, or data governance, builders, where we have the data governance and data intelligence stakeholders who help the business as a sort of data partner to the business stakeholders. So that's how we've organized it. And then we started following the Collibra approach, which is: well, what are the challenges that our business stakeholders have in HR, finance, sales, marketing, all over? How can data help overcome those challenges? And from those use cases we then just started to build a map and started executing on the use cases. The important ones are very simple — we see them with our customers as well — people talk about the catalog, right?
The catalog is for the data scientists to know what's in their data lake, for example, and for the people in privacy, so they have their process registry and can see how the data flows. So that's a starting place, and that turns into a marketplace, so that when new analysts and data citizens join Collibra, they immediately have a place to go to, to look and see: okay, what data is out there for me, as an analyst or a data scientist or whatever, to do my job? So they can immediately get access to the data. Another one we see is around trusted business reporting. Since self-service BI allowed everyone to make beautiful dashboards — pie charts; my pet peeve is the pie chart, because I love pie and you shouldn't always be using pie charts — there's been a proliferation of those reports. And now executives don't really know: okay, should I trust this report or that report? They're reporting on the same thing, but the numbers seem different. So that's why we have trusted business reporting: we know that if a dashboard — a data product, essentially — is built, all the right steps are being followed, and whoever is consuming it can be quite confident in the result. >>Exactly. Yes. >>Absolutely. Talk a little bit about some of the key performance indicators that you're using to measure the success of the data office. What are some of those KPIs? >>KPIs and measuring is a big topic in the chief data officer profession, I would say, and again, it always varies with your organization, but there are a few that we use that might be of interest. We use those pillars, and we have metrics across those pillars. So, for example, a pillar on the data engineering side is going to be more related to uptime: is the data platform up and running, are the data products up and running, is the quality in them good enough, is it going up, is it going down, what's the usage? But also — and especially if you're in the cloud and consumption is a big thing — you have metrics around cost, for example. So that's one set of examples. Another one is around the data science side and the products: are people using them, are they getting value from them, can we calculate that value in a dollar perspective? So that we can continue to say to the rest of the business: we're tracking all those numbers, and those numbers indicate that value is being generated, and this is how much value we estimate. And then you have some data intelligence and data governance metrics. For example, you have a number of domains in a data mesh — people talk about being the owner of a data domain, for example product or customer. So how many of those domains do you have covered? How many of them are already part of the program? How many of them have owners assigned? How well are these owners organized and executing on their responsibilities? How many tickets are open and closed? How many data products are built according to process? And so on and so forth. So these are a set of examples of KPIs — there are a lot more, but hopefully those can already inspire the audience. >>Absolutely. So we've talked about the rise of chief data offices — it's only accelerating. You mentioned this is like a 10-year journey. So if you were to look into a crystal ball, what do you see in terms of the maturation of data offices over the next decade?
>>So we've seen indeed the role sort of grow up. I think in 2010 there may have been like 10 chief data officers or something — Gartner has exact numbers on them — but then the role grew across industries, and the number is estimated to be about 20,000 right now. Wow. And they evolved through a sort of stack of competencies: defensive data strategy, because the first chief data officers were more regulatory driven; then offensive data strategy; then support for the digital program; and now it's all about data products. So as a data leader, you now need all of those competencies and need to include them in your strategy. How is that going to evolve over the next couple of years? I wish I had one of those crystal balls, right? But essentially I think for the next couple of years there are going to be a lot of people still moving along those four levels of the stack — a lot of the people I see are still in version one or version two of the chief data officer. So you'll see that evolve over the years toward more digital and more data products. For the next years, my prediction is it's all about data products, because that's an immediate link between the data and the business value, essentially. So that's going to be important, and quite likely some new things will be added on which nobody can predict yet, but we'll see those pop up in a few years. I also think there's going to be a continued challenge for the chief data officer role to become a real executive role, as opposed to somebody who claims that they're an executive but then isn't. So the real reporting level — into the board, into the CEO, for example — will continue to be a challenging point. But the ones who do get that done will be the ones that are successful, and the ones who get there will be the ones that do it on the basis of data monetization: connecting value to the data and making that value clear to all the data citizens in the organization. And in that sense, they'll need to have both technical audiences and non-technical audiences aligned, of course, and they'll need to focus on adoption. Again, it's not enough to just have your data office be involved in this; it's really important that you're waking up data citizens across the organization and making everyone in the organization think about data as an asset. >>Absolutely, because there's so much value that can be extracted when organizations really strategically build that data office and democratize access across all those data citizens. Stan, this is an exciting arena. We're definitely going to keep our eyes on this — it sounds like a lot of evolution and maturation coming from the data office perspective and from the data citizen perspective. And as the data show — you mentioned that IDC study, and you mentioned Gartner as well — organizations have so much more likelihood of being successful and competitive. So we're going to watch this space. Stan, thank you so much for joining me on the Cube at Data Citizens 22. We appreciate it. >>Thanks for having me over. >>From Data Citizens 22, I'm Lisa Martin. You're watching the Cube, the leader in live tech coverage. >>Okay, this concludes our coverage of Data Citizens 2022, brought to you by Collibra. Remember, all these videos are available on demand at thecube.net. And don't forget to check out siliconangle.com for all the news, and wikibon.com for our weekly Breaking Analysis series, where we cover many data topics and share survey research from our partner ETR, Enterprise Technology Research.
If you want more information on the products announced at Data Citizens, go to collibra.com. There are tons of resources there. You'll find analyst reports, product demos. It's really worthwhile to check those out. Thanks for watching our program and digging into Data Citizens 2022 on the Cube, your leader in enterprise and emerging tech coverage. We'll see you soon.
IBM DataOps in Action Panel | IBM DataOps 2020
from the Cube Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a Cube conversation. Hi everybody, welcome to this special digital event where we're focusing in on data ops — data ops in action — with generous support from our friends at IBM. Let me set up the situation here. There's a real problem going on in the industry, and that's that people are not getting the most out of their data. Data is plentiful, but insights perhaps aren't. What's the reason for that? Well, it's really a pretty complicated situation for a lot of organizations. There are data silos, there are challenges with skill sets and a lack of skills, there are tons of tools out there — a sort of tool sprawl — the data pipeline is not automated, and the business lines oftentimes don't feel as though they own the data, so that creates some real concerns around data quality and a lot of finger-pointing. The opportunity here is to really operationalize the data pipeline and infuse AI into that equation, and really attack the cost-cutting and revenue-generation opportunities that are there in front of you. Think about this: virtually every application this decade is going to be infused with AI, and if it's not, it's not going to be competitive. And so we have organized a panel of great practitioners to really dig into these issues. First I want to introduce Victoria Stassi, who's an industry expert in data quality and governance, now at Northwestern Mutual. Vic, great to see you again, thanks for coming on. >>Excellent, nice to see you as well. >>And Caitlin Alfre is the director of the AI Accelerator and also part of the chief data officer organization at IBM, which has actually eaten some of its own cooking — let me say it that way. Caitlin, great to see you again. And Steve Lewis — good to see you again — senior vice president and director of data management at Associated Bank. Steve, thanks for coming on. >>Thanks, Dave. Good to be here. >>All right guys, so you heard my setup: it's about operationalizing and getting the most insight. Data is wonderful; insights aren't always. But getting insight in real time is critical in this decade. Give us a sense as to where each of you is on that journey. Victoria, let's start with you, because you're brand new to Northwestern Mutual, but you have a lot of deep expertise in healthcare, manufacturing, and financial services. Where do you see the general industry climate, and then we'll talk about the journeys that you're on, both personally and professionally. >>Sure. I think right now the big thing is that you need to have speed to insight. As I've experienced going through many organizations, they're all facing the same challenges today, and a lot of those challenges are: where does my data live? Is my data trusted — meaning has it been curated, has it been qualified, is it ready to use? What we often see happen is that the business knows its KPIs, it knows its business metrics, but it can't find where that data lives. There's abundant data all over the place, but it's disparate, and it's replicated because it's not well managed. That's a lot of what governance — and the platform and tools that governance offers — gives organizations: I can tell you where the data is, I can tell you what's trusted, so that when you quickly access information and bring back answers to business questions, it's one answer, not many answers that leave the business questioning which is the right path, which is the correct answer, which way do I go at the executive level. That's the biggest challenge. Where we want the industry to go moving forward is, one, breaking that down and allowing that information to be published quickly, and data virtualization is enabling that. A lot of what you see today is that it takes most businesses a long time to build out large warehouses at an enterprise level, and we need to pivot quicker. So a lot of what businesses are doing is leaning toward taking advantage of data virtualization, allowing them to connect to these data sources and bring that information back quickly, so they don't have to replicate it across different systems or different applications, and then to be able to provide those answers back quickly — also allowing seamless access for the analysts who are running at full speed, trying to find the answers as quickly as they can.
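As a rough sketch of what data virtualization looks like in practice, the example below issues one federated query that joins a table in an operational database with a table in a data lake, without copying either data set first. Trino is used here only as a representative federation engine — it is not necessarily the tooling the panel uses — and the host, catalogs, and table names are hypothetical.

```python
# Hedged sketch of the data virtualization idea: query data where it lives instead
# of replicating it into a warehouse first.
import trino

conn = trino.dbapi.connect(
    host="trino.internal.example.com", port=8080,
    user="analyst", catalog="postgresql", schema="public",
)
cur = conn.cursor()

# One SQL statement joins a table in an operational Postgres database with a table
# in a data lake (Hive catalog) -- no copy of either data set is made beforehand.
cur.execute("""
SELECT c.segment, COUNT(*) AS open_claims
FROM postgresql.public.claims cl
JOIN hive.curated.customers c ON c.customer_id = cl.customer_id
WHERE cl.status = 'OPEN'
GROUP BY c.segment
""")
for segment, open_claims in cur.fetchall():
    print(segment, open_claims)
```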
>>Great, okay. And I want to get into the how-to. Steve, let me go to you. One of the things that we talked about earlier was just infusing this mindset of a data culture, and thinking about data as a service. So talk a little bit about how you got started — what was the starting point for you? >>Sure. I think the biggest thing for us was to change that mindset from data being just for reporting on things that have happened in the past, to starting to use data and feed those insights into our actual applications, so that we're providing those insights in real time through the applications as they're consumed — helping with customer experience, helping with personalization and optimization of the applications. The way we started down that path — or the journey that we're still on — was to get the foundation laid first. Part of that has been making sure we have access to all that data, whether it's through virtualization, like Vic talked about, or whether it's through having more of the data collected in a data lake, where we have all of that foundational data available, as opposed to waiting for people to ask for it. That's been the biggest culture shift for us: having that availability of data, ready to provide those insights, as opposed to having to make the businesses or the applications ask for that data. >>Caitlin, when I first met Inderpal Bhandari, I was asking him, okay, what's the role of the CDO, and he mentioned a number of things, but two of the things that stood out were: you've got to understand how data affects the monetization of your company — and that doesn't mean selling the data; what role does it play in helping cut costs or increase revenue or productivity or customer service, et cetera. The other thing he said was you've got to align with the lines of business. It all sounded good — and this was several years ago — and IBM took it upon itself to drink its own champagne, or I was going to say dogfooding, whatever. But it's not easy to just flip a switch, infuse AI, and automate the data pipeline. You guys had to go through some real pain to get there, and you did — you were early on, you took some arrows, and now you're helping your customers benefit from that. So talk about some of the use cases where you guys have applied this — obviously one of the biggest organizations in the world — and what the real challenges were.
>>Sure, happy to. We've been on this journey for about four years now — we stood up our chief data office in 2016 — and you're right, it was all about getting our data strategy written and executed internally, and we wanted to be very transparent, because, as you mentioned, there are a lot of challenges and you have to think differently about the value. So we wrote that data strategy at that time for the enterprise, and then we quickly pivoted to see the real opportunity and value of infusing AI across all of our business needs. To your question on a couple of specific use cases: I'd say we invested that time getting the platform built and implemented, and then we were able to take advantage of it. One particular example that I've been really excited about: I have a practitioner on my team who's a supply chain expert, and a couple of years ago he started building out a supply chain solution so that we can better mitigate our risk in the event of a natural disaster, like an earthquake or hurricane, anywhere around the world. And because we invested the time in getting the data pipelines right — getting all of that created and cleaned, with the quality handled — we were able in recent weeks to add the really critical COVID-19 data, deliver that out to our employees internally for their preparation purposes, make it available to our nonprofit partners, and now we're starting to see our first customers take advantage too, with the health and well-being of their employees in mind. So that's an example where — and I'm seeing a lot of the clients I work with do the same — they invest in the data and AI readiness, and then they're able to take advantage of all of that work very quickly, in an agile fashion, and just spin up those solutions. >>Well, I think one of the keys there, Caitlin, is that we can talk about that in a COVID-19 context, but that notion of business resiliency is going to carry through — it's going to live on in this post-COVID world, isn't it? >>Absolutely. I think for all of us, the importance of investing in the business continuity and resiliency type of work, so that we know what to do in the event of either a natural disaster or something beyond, will be grounded in that, and I think it will only become more important for us to be able to act quickly. So the investment in those platforms and the approach that we're taking — and that I see many of us taking — will really be grounded in that resiliency. >>So Vic and Steve, I want to dig into this a little bit, because we use this concept of data ops — we're stealing from DevOps — and there are similarities, but there are also differences. Let's talk about the data pipeline. Think of the data pipeline as a sort of quasi-linear process where you're ingesting data, maybe using tools like Kafka or whatever your favorite is; then you're transforming that data; then you've got discovery — you've got to do some exploration, you've got to figure out your metadata catalog; then you're trying to analyze that data to get some insights; and then ultimately you want to operationalize it. You could come up with your own data pipeline, but generally that sort of concept is, I think, well accepted. There are different roles, though, and unlike DevOps, where it might be the same developer who's implementing security policies and picking up the operations, in data ops there might be different roles — and in fact very often there are: data science, maybe an IT role, data engineering, analysts, et cetera. So Vic, I wonder if you could talk about the challenges in managing and automating that data pipeline, applying data ops, and how practitioners can overcome them.
>>Yeah. A perfect example would be a client that I was just recently working with, where we actually built up a team using agile methodologies — that framework — so we could rapidly ingest data and then prove out that the data is fit for purpose. We talk a lot about big data, and that is really where a lot of industries are going: they're trying to add enrichment to their own data sources, so they're purchasing these third-party data sets. In doing so, you make that initial purchase, but what many companies do today is they have no real way to vet that. They'll purchase the information, they won't vet it up front, they'll bring it into an environment, and it's going to take them time to understand whether the data is of quality or not — and by the time they do, the sale is typically done, and they're not going to get anything back. In this most recent case, we took an unstructured data source, brought it in, and ingested it with modelers using this agile team, and within two weeks we were able to bring the data in from the third-party vendor — what we consider rapid prototyping — profile the data, understand whether the data was of quality or not, and quickly figure out that it wasn't. In doing that, we were able to contact the vendor, tell them, you know what, sorry, the data set isn't up to snuff, we'd like our money back, and we're not going to go forward with it. That's enabling businesses to be smarter about what they're doing with third-party purchases today, because as much as businesses want to rely on their own data, they also want to rely on enrichment data from third-party sources, and that's really what data ops is allowing us to do. It's allowing us to think at a broader, higher level: how do we bring the information in, and what structures can we store it in so that it doesn't necessarily have to be modeled first? A modeler is great, but if we have to take time to model all the information before we even know we want to use it, that's going to slow the process down, and that slows the business down. The business is looking for us to speed up all of our processes. A lot of what we heard in the past is that IT tends to slow us down, and that's where we're trying to change that perception in the industry: no, we're actually here to speed you up — we have the tools and technologies to do so, and they're only getting better. I would also call out data scientists — that's another piece of the pie for us. If we can bring the information in, quickly catalog it, bring in the metadata on the back-end data assets, and then supply that information back to the scientists, then gone are the days where scientists are asking for connections to all these different data sources, waiting days for access requests to be approved, just to find out — once they figure out what the relationship diagram and the design of that back-end database look like, how to get to it, and after writing the code to get to it — that this is not the information they need, that Sally next to them pointed them to the wrong information. That's where the catalog comes in. That's where data ops and data governance come in: having that catalog, that metadata management platform, available to them means they can go into the catalog without having to request access to anything, and within five minutes they can see the structures, what the tables look like, what the fields look like — are these the metrics I need to bring back answers to the business? That's data ops. It's allowing us to speed all of that up — taking things that took months down to weeks, down to days, down to hours.
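As a rough illustration of the rapid-prototyping step described above, here is a minimal sketch that profiles a sample of a third-party feed and applies a couple of fitness checks before the purchase is finalized. The file name, columns, and thresholds are invented for the example; a real vetting process would obviously involve far more checks.

```python
# Hedged sketch: quickly profile a vendor sample to decide if it is fit for purpose.
import pandas as pd

df = pd.read_csv("vendor_feed_sample.csv")

# Basic per-column profile: type, null rate, distinct values.
profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "null_pct": df.isna().mean().round(3),
    "distinct": df.nunique(),
})
print(profile)

# Simple fitness checks: too many nulls in a key field, or duplicate keys,
# and the data set fails the vetting step.
issues = []
if df["account_id"].isna().mean() > 0.02:
    issues.append("account_id has more than 2% nulls")
if df["account_id"].duplicated().any():
    issues.append("account_id contains duplicates")

print("FAILED vetting:" if issues else "Passed vetting", issues)
```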
>> So Steve, I wonder if you could pick up on that and help us understand what DataOps means to you. As I mentioned up front and as we talked about in our previous conversation, the demand for data access was through the roof, and you've gone from that to more of a self-service environment where it's not IT owning the data, it's really the business owning the data. What does all this DataOps stuff mean in your world?
>> Sure, I think it's very similar. It's how we enable and speed up access to that data, with the right controls and the right processes, and build scalability and agility into all of it so we're doing this at scale. Data is much more rapidly available; we can discover new data and determine if it's right or, more importantly, if it's wrong, similar to what Vic described. It's how we enable the business to make the right decision about whether they're going down the right path, and the catalog is a big part of that. We've also introduced a lot of frameworks around scale, so the ability to rapidly ingest data and make it available has been key for us, and we've focused on a prototyping environment, that sandbox mentality: how do we rapidly stand those up for users, still provide some controls, but give people the ability to do that exploration. What we're finding is that by providing the platform and the foundational layers, the use cases evolve and come out of that, as opposed to having the use cases first and then building from them. We're shifting the mentality within the organization to say: we don't know what we need yet, let's start to explore. That's the data scientist mentality and culture; it's more a way of thinking than an actual project or implementation.
>> Well, I think that cultural aspect is important. Caitlin, you guys are an AI company, or at least that's part of what you do, but for four decades, maybe centuries, companies have been organized around different things, by manufacturing plant or sales channel or whatever it is. How has the chief data officer organization within IBM been able to transform itself and really infuse a data culture across the entire company?
>> One of the approaches we've taken, and we talk about it as the blueprint to drive AI transformation so we can deliver these really high-value use cases, is that beyond the data and the technology, which we've just pressed on, the organizational piece and the culture are so important: the change management, enabling and equipping our data stewards. I'll give one specific example I've been really excited about. When we were building our platform and starting to pull in structured and unstructured data, our data stewards were spending a lot of time manually tagging and creating business metadata about that data, and we identified that as a real pain point, costing us money and valuable resources. So we started to automate the metadata generation, in partnership with our deep learning practitioners and some of the models they were able to build, and we pushed that capability out into our product last year. One of the really exciting things for me is that our data stewards, who are so valuable for the expertise and skills they bring, have reported that it's really changed the way they work: it has sped up their process and enabled them to move on to higher-value activities and business benefits, so they're very happy from an organizational adoption point of view. There are ways to identify those use cases; for this one we drove significant productivity savings, we empowered our data stewards, whom we really value, made their jobs easier and more efficient, and helped them move on to things they're more excited about doing. That's another example of the approach we've taken.
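Caitlin's automated metadata generation is, at its core, a classifier that proposes business tags a steward can accept or correct. What follows is a deliberately toy, hypothetical sketch; the tag vocabulary and training snippets are invented, and IBM's actual deep learning models are of course far more sophisticated than a TF-IDF baseline.

```python
# Toy auto-tagger suggesting business metadata for new columns, in the spirit
# of the steward-assist capability Caitlin describes. Vocabulary is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny labeled sample: column name + description -> business tag.
training_text = [
    "cust_email contact email address of the customer",
    "policy_premium annual premium amount in usd",
    "icd10_code primary diagnosis code for the claim",
    "acct_holder_name full legal name of the account holder",
]
training_tags = ["contact_info", "financial", "clinical", "pii"]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(training_text, training_tags)

new_columns = [
    "member_dob date of birth of the insured member",
    "claim_paid_amt total amount paid on the claim",
]
for text, tag in zip(new_columns, model.predict(new_columns)):
    # A steward reviews these suggestions rather than trusting them blindly.
    print(f"{text!r:60} -> suggested tag: {tag}")
```

The win Caitlin points to is less about model accuracy and more about the workflow: stewards review suggestions instead of typing every tag from scratch.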
>> So the cultural piece, the people piece, is key, and we talked a little bit about the process. I want to get into the tech. Steve, what's the tech? We have this bevy of tools, and I mentioned a number of them up front: different data stores, open-source tooling, IBM tooling. What are the critical components of the technology that people should be thinking about?
>> From an ingestion perspective, we're trying to do a lot within a Python framework and scalable ingestion pipeline frameworks. On the catalog side, we've gone with IBM Cloud Pak for Data, which provides a platform for a lot of these tools to stay integrated together, from the discovery of data sources, the cataloging, the documentation of those sources, all the way through the actual advanced analytics with Python models and R models and the open-source IDEs, combined with the ability to do some data prep and refinery work. Having all of that in an integrated platform was key for us; it let us roll out more of these tools in bulk as opposed to point solutions, so that's been a big focus area. Then on the analytics side, and the web-versus-IDE question, there are a lot of different components you can go to, whether it's MuleSoft, whether it's AWS and some of the native functionality out there; you mentioned Kafka before, and Kinesis streams and other streaming technologies, those are all in the toolbox we're starting to look at. One of the keys is that we're trying to make decisions as close to real time as possible, as opposed to the business having to wait weeks or months and, by the time they get insights, it's late and really rearview mirror.
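Steve's point about deciding "as close to real time as possible" usually comes down to a small consumer sitting on a stream and applying a rule as events arrive rather than in a nightly batch. A minimal, hypothetical sketch with the kafka-python client follows; the topic name, broker address, and threshold rule are all invented for illustration.

```python
# Minimal streaming decision loop; topic, broker, and rule are hypothetical.
# pip install kafka-python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "claims-events",                      # invented topic name
    bootstrap_servers="localhost:9092",   # invented broker address
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

HIGH_COST_THRESHOLD = 50_000  # invented business rule

for message in consumer:
    claim = message.value
    # Decide while the event is still fresh, instead of in tomorrow's batch.
    if claim.get("amount", 0) > HIGH_COST_THRESHOLD:
        print(f"flag claim {claim.get('id')} for review: {claim.get('amount')}")
```

The same shape applies whether the transport is Kafka, Kinesis, or anything else Steve lists; the design choice is simply to push the rule to where the data shows up.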
>> So Vic, your focus in your career has been a lot on data quality, governance, and master data management. From a data quality standpoint, what are some of the key tools you're familiar with and have used that have really enabled you to operationalize that data pipeline?
>> I would say the IBM tools are definitely the ones I have the most experience with, and also Informatica; those are, to me, the two top players. IBM has definitely come to the table with a suite; like Steve said, Cloud Pak for Data is really a one-stop shop, allowing quick, seamless access for the business user, versus the previous versions IBM had rolled out, where you went into different user interfaces to find your information. That can become clunky, it adds to the process, and it can leave a bad taste in people's mouths, because they don't want to navigate from system to system to system just to get their information. So Cloud Pak, to me, brings everything to the table in a one-stop-shop type of environment. Informatica is working on the same thing, but I would tell you they haven't come up with a solution that comes close to what IBM has done with Cloud Pak for Data; I'd be interested to see if they can bring that to market. The IBM suite of tools allows for profiling, for analytics, for metadata management, for access to Db2 Warehouse on Cloud; those are the tools I've implemented in the past, along with Cloud Object Storage, bringing it all together to provide that one stop. At Northwestern right now we're working with Collibra. I think Collibra is a great tool, a great governance catalog, but that's really what it's made for: it's a governance catalog. You have to bring other pieces to the table for it to serve up everything Cloud Pak does today: the advanced profiling, the data virtualization Cloud Pak enables, the machine learning at the level where you can actually work with R and Python code and put R notebooks inside the platform. Those are some of the pieces missing in some of the other vendors' tools today.
>> So one of the things you're hearing here is the theme of openness. We've talked about a lot of tools, and not all IBM tools; there are many, and people want to use what they want to use. So Caitlin, from an IBM perspective, what's your commitment to openness, number one, but also, since we've talked a lot about Cloud Paks, to simplifying the experience for your clients?
>> Well, I thank Steve and Victoria for speaking to their experience; I really appreciate the feedback. Part of our approach has been to take on the challenges we've had ourselves. I mentioned some of the capabilities we brought forward in our Cloud Pak for Data product, one being automating metadata generation, and that was something we had to solve for our own data challenges and needs. So we will continue to source our use cases from, and ground them in, a practitioner perspective of what we're trying to do, solve, and build. And the approach we've been taking is co-creation: we roll these capabilities out in the product, work with customers like Steve and Victoria, really solicit feedback, have our dev teams push that out, and just be very open and transparent. We want to deliver a seamless experience, we want to do it in partnership, and we'll continue to solicit feedback, improve, and roll out. That has been our approach and will continue to be, and I really appreciate the partnerships we've been able to foster.
>> So we don't have a ton of time, but I want to go to the practitioners on the panel and ask about key performance indicators. When I think about DevOps, some of the things we measure are the elapsed time to deploy applications, start to finish, the amount of rework that has to be done, and the quality of the deliverable. What are the KPIs, Victoria, that are indicators of success in operationalizing the data pipeline?
>> Well, I would definitely say your ability to deliver quickly. How fast can you deliver, and is that quicker than what you were able to do in the past? What is the user experience like? Have you measured the amount of time users were spending to bring information to the table in the past, versus the time to delivery now, of information, of business answers to business questions? Those are the key performance indicators to me. They tell you that the suite we've put in place today is providing information quickly, that I can get my business answers quicker than I could before, and that the information is accurate. Being able to measure whether what I've given back is quality, or whether it's the wrong information and I've got to go back and gather it from somewhere else, tells us that the tools we've put in place are working: my teams are working quicker and answering the questions they need to, accurately. That's when we know we're on the right path.
>> Steve, anything you'd add to that?
>> I think she covered a lot of it: the people component, and the data quality scoring for all the different data attributes, coming up with a metric for how to measure that and then showing the trend over time to show it's getting better. The other one we track is overall data availability: how much data are we providing to our users, and showing that trend. When I first started, we had somewhere in the neighborhood of 500 files that had been brought into the warehouse, published, and made available, on the order of a couple of thousand fields. We've grown that to thousands of tables now available, so it's been hundreds of percent of scale just in the availability of that data: how much is out there, how much is ready for people to dig in, put into their analytics and their models, and get those back into the downstream applications. That's another key metric we're starting to track as well.
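Both KPI answers, Victoria's time-to-answer and Steve's attribute-level quality score trended over time, reduce to very small calculations once the underlying checks exist. A hedged sketch of how such a score and its trend might be computed; the attribute names, pass rates, and load dates are invented for illustration.

```python
# Toy data quality score per attribute, trended across load dates.
# Attribute names and pass rates are invented for illustration.
from statistics import mean

# pass_rate = share of records passing that attribute's validation rule.
loads = {
    "2020-03-01": {"member_id": 0.999, "dob": 0.91, "claim_amt": 0.95},
    "2020-04-01": {"member_id": 0.999, "dob": 0.94, "claim_amt": 0.97},
    "2020-05-01": {"member_id": 1.000, "dob": 0.97, "claim_amt": 0.98},
}

def quality_score(pass_rates: dict) -> float:
    """Unweighted average of attribute pass rates, scaled to 0-100."""
    return round(100 * mean(pass_rates.values()), 1)

previous = None
for load_date, rates in sorted(loads.items()):
    score = quality_score(rates)
    trend = "" if previous is None else f" ({score - previous:+.1f} vs prior load)"
    print(f"{load_date}: quality score {score}{trend}")
    previous = score
```

The substance of Steve's point is the trend line, not the formula: the same dashboard that shows the score improving also shows the availability counts (files, fields, tables) growing release over release.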
>> So, last question. I said at the top that every application is going to need to be infused with AI this decade, otherwise it's not going to be as competitive as it could be. For those that are maybe stuck in their journey and don't really know where to get started, I'll start with Caitlin, go to Victoria, and then Steve will bring us home: what advice would you give to the people that need to get going on this?
>> My advice is to poll the folks that are either producing or accessing your data and figure out where the pain is. I mentioned some of the data management challenges we were seeing; those processes were taking weeks, were prone to error, and were highly manual, so that was ripe for an AI project. Identify the use cases that are causing the most rework and manual effort; you can move really quickly there, and as you build the platform out you're able to spin those up in an accelerated fashion. If you identify that and figure out the business impact you can drive very early on, you can get going and start really seeing the value.
>> Yeah, I would say Caitlin hit it on the head, but I'd add to that. First and foremost, in my opinion, the important thing is data governance. You need to implement data governance at an enterprise level. Many organizations do it, but they have silos of governance. You really need an enterprise data governance platform that consists of a true framework: an operating model, charters, data domain owners, data domain stewards, data custodians. All of that needs to be defined, and while that may take some work in the beginning, the payoff down the line is that much greater. It allows your business to truly own the data, and once they own the data and take part in classifying the data assets for technologists and analysts, you can start to eliminate some of the technical debt that most organizations have acquired. You can look at which systems can be turned off and which systems you see value in, truly build out a capability matrix, start mapping systems to capabilities, and say where you have gaps or redundancy and what you can get rid of. That's the first piece. The second piece is really leveraging the tools that are out there today, the IBM tools and others as well, that enable some of the newer, next-generation capabilities, AI for example, allowing automation, which means the analysts in place today can access information quicker and deliver it accurately, like we've been talking about, because it's been classified and that pre-work has been done. It's never too late to start, and once you start, it acts as a domino effect, where you see everything else fall into place.
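The operating model Victoria outlines, domains with named owners, stewards, and custodians plus a capability matrix that exposes gaps and redundancy, is often easiest to see as a small declarative structure. This is a hypothetical sketch; every domain, role, and system name here is invented and stands in for whatever an organization actually defines.

```python
# Hypothetical governance operating model and capability matrix,
# mirroring the roles and the redundancy check Victoria describes.
governance_model = {
    "claims": {"owner": "VP Claims", "steward": "J. Ortiz", "custodian": "claims-dba"},
    "member": {"owner": "VP Membership", "steward": "A. Chen", "custodian": "mdm-team"},
}

# capability -> systems currently claiming to provide it (all names invented)
capability_matrix = {
    "party_master": ["legacy_crm", "mdm_hub"],
    "claims_intake": ["claims_portal"],
    "document_ocr": [],
}

for capability, systems in capability_matrix.items():
    if len(systems) > 1:
        print(f"{capability}: redundancy across {systems}, candidate for consolidation")
    elif not systems:
        print(f"{capability}: gap, no system provides this today")

for domain, roles in governance_model.items():
    missing = [r for r in ("owner", "steward", "custodian") if not roles.get(r)]
    if missing:
        print(f"domain {domain}: undefined roles {missing}")
```

Whether this lives in a spreadsheet, a governance catalog, or a few lines of configuration matters much less than Victoria's point that the roles and mappings are written down at all.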
>> All right, thank you. And Steve, bring us home: advice for your peers that want to get started?
>> Sure. Like the others said, I think everything they covered is valid and accurate. The thing I would add, from a starting perspective: if you haven't started, start. Don't try to overthink it or over-plan it; just do something and start to show progress and value. The use cases will come, even if you think you're not there yet; it's amazing, once you have the foundational components in place, how some of these things start to come out of the woodwork. So get it started, keep it going, take an iterative approach and keep an open mindset that encourages exploration and enablement. Look your organization in the eye and ask: why are there silos, why do things look like this, what are our problems, what are the things getting in our way, and focus on and tackle those areas, as opposed to putting up more rails and more boundaries that encourage a silo mentality. Really focus on that enablement. And the last comment is on scale: everything should be focused on scale. What you think is a one-time process today, you're going to do again; we've all been there, you're going to do it a thousand times. So prepare for that, prepare as if you're going to do everything a thousand times, and start to instill that culture within your organization.
>> Great advice, guys: data, bringing machine intelligence and AI to really drive insights, and scaling with a cloud operating model no matter where that data lives. It's really great to have three such knowledgeable practitioners. Caitlin, Victoria, and Steve, thanks so much for coming on theCUBE and helping support this panel. All right, and thank you for watching, everybody. Remember, this panel was part of the raw material that went into a CrowdChat we hosted on May 27th, crowdchat.net/dataops, so go check that out. This is Dave Vellante for theCUBE, thanks for watching. [Music]
**Summary and Sentiment Analysis are not shown because of an improper transcript**
ENTITIES
Entity | Category | Confidence |
---|---|---|
Steve Lewis | PERSON | 0.99+ |
Caitlyn Toria | PERSON | 0.99+ |
Steve | PERSON | 0.99+ |
Linda Barragan | PERSON | 0.99+ |
Dave Volante | PERSON | 0.99+ |
two weeks | QUANTITY | 0.99+ |
Victoria Stassi | PERSON | 0.99+ |
Caitlin Alfre | PERSON | 0.99+ |
two hours | QUANTITY | 0.99+ |
Vic | PERSON | 0.99+ |
two days | QUANTITY | 0.99+ |
May 27th | DATE | 0.99+ |
500 files | QUANTITY | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
Python | TITLE | 0.99+ |
five minutes | QUANTITY | 0.99+ |
30 new purchases | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
Caitlin | PERSON | 0.99+ |
Clinton | PERSON | 0.99+ |
first piece | QUANTITY | 0.99+ |
first book | QUANTITY | 0.99+ |
Dave | PERSON | 0.99+ |
second piece | QUANTITY | 0.99+ |
Boston | LOCATION | 0.99+ |
Sally | PERSON | 0.99+ |
today | DATE | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
hundreds of percent | QUANTITY | 0.98+ |
Stephen Victoria | PERSON | 0.98+ |
one | QUANTITY | 0.98+ |
Northwestern Mutual | ORGANIZATION | 0.98+ |
Kaitlin | PERSON | 0.97+ |
four decades | QUANTITY | 0.97+ |
first | QUANTITY | 0.97+ |
two top players | QUANTITY | 0.97+ |
several years ago | DATE | 0.96+ |
about four years | QUANTITY | 0.96+ |
first customers | QUANTITY | 0.95+ |
tons of tools | QUANTITY | 0.95+ |
Kailyn | PERSON | 0.95+ |
both | QUANTITY | 0.95+ |
two | QUANTITY | 0.94+ |
Northwestern | ORGANIZATION | 0.94+ |
Northwestern | LOCATION | 0.93+ |
each | QUANTITY | 0.91+ |
Crouch | PERSON | 0.91+ |
CBO | ORGANIZATION | 0.91+ |
DevOps | TITLE | 0.91+ |
two of | QUANTITY | 0.89+ |
AI | ORGANIZATION | 0.87+ |
things | QUANTITY | 0.87+ |
three such knowledgeable practitioners | QUANTITY | 0.87+ |
Josh Rogers, Syncsort | Big Data NYC 2017
>> Announcer: Live from Midtown Manhattan, it's theCUBE. Covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors.
>> Welcome back everyone, live here in New York City for theCUBE's coverage of our fifth annual event that we put on ourselves in conjunction with Strata Hadoop, now called Strata Data. It's theCUBE, and we're covering the scene here at Hadoop World going back to 2010, eight years of coverage. I'm John Furrier, co-host of theCUBE. Usually Dave Vellante is here, but he's down covering the Splunk conference, and who was there yesterday but none other than Josh Rogers, my next guest, the CEO of Syncsort. You were with Dave Vellante yesterday, live on theCUBE in Washington, DC for Splunk .conf, kind of a Big Data conference, but a proprietary, branded event for themselves. This is more of an industry event here at Big Data NYC that we put on. Welcome back; glad you flew up on the Concorde, the private jet.
>> Early morning, but it was fine.
>> Good to see you, CEO of Syncsort. You guys have been busy. The folks watching in theCUBE community know you've been on many times; for the folks learning more about theCUBE every day, you've had an interesting transformation as a company. Take a minute to talk about where you've come from and where you are today. Certainly a ton of corporate development activity on your end; as you see the opportunities, you're moving on them. Take a minute to explain.
>> So, it's been a great journey so far, and there's a lot more work to do, but Syncsort is one of the first software companies, founded in the late 60's, and today it has an unparalleled franchise in the mainframe space. Over the last 10 years or so we branched out into open systems and delivered high-performance data integration solutions. About four years ago we really started to invest in the Big Data space; we had a DNA around performance and scale, and we felt that would be relevant there. We delivered a Hadoop-focused product, and today we focus that product on helping customers ingest mainframe data assets into Hadoop clusters, along with other types of data, but with that specific focus. That has led us into understanding a bigger market space that we call Big Iron to Big Data, and what we see in the marketplace is that customers are adapting.
>> Just before you get into that, I love that term, Big Iron to Big Data; you know I love Big Iron. It used to be a term for the mainframe, for the younger generation out there. But you're really saying you've leveraged experience with the installed base, activity at that scale, call it batch, single-threaded, whatever you want to call it. As you got into the game of Big Data, you then saw other opportunities, did I get that right? You got into the game with some Hadoop, then you realized, whoa, I can do some large scale. What was that opportunity?
>> The opportunity is that large enterprises are absolutely investing heavily in the next generation of analytic technologies, in a new stack. Hadoop is a part of that, Spark is a part of that. They're rapidly adopting these new infrastructures to drive deeper analytics, to answer bigger questions and improve their business in multiple dimensions. The opportunity we saw was that the ability for those enterprises to integrate this new kind of architecture with the legacy architectures, the old architectures that were powering key applications and producing key data, was a challenge. There were multiple technology challenges and there were cultural challenges. We had this kind of expertise on both sides of the house, and we found that to be unique in the marketplace. So we put a lot of effort into understanding and defining the challenges in that Big Iron to Big Data space that, once solved, help customers maximize the value of these investments in next-generation architectures. We define the problem as having two components. One is that people are generating more and more data, more and more touch points, and driving more and more transactions with their customers, which generates increased load on the compute environments; if I have a mainframe, I want to figure out how to run it as efficiently as possible, contain my costs, and maximize availability and uptime. At the same time, I've got all this new data I can start to analyze, but I've got to get it from where it's produced into this next-generation system, and there are a lot of challenges there. So we started to isolate the specific use cases that present customers challenges and deliver very differentiated solutions. The overarching positioning is around solving the Big Iron to Big Data challenge.
>> You've done some acquisitions and been successful. I want to talk a little bit about the ones you like right now, the ones that happened over the past year or two; I think you've done five in the past two years, and a couple of key notable ones set you up and give you pole position in some of these big markets. After that I want to talk about your ecosystem opportunity. But on the acquisitions, what's working for you? What have been the big deals?
>> The larger acquisition we did in 2016 was a company called Trillium, a long-time leader in the data quality space, and the opportunity we saw with Trillium was to complement our data movement and integration capabilities. A natural complement, but focused very specifically on driving value in this next-generation architecture, particularly in things like Hadoop. What I'd like to be able to do is apply best-in-class data quality routines directly in that environment. From our experience delivering these Big Data solutions, we knew we could take a lot of that technology and create really powerful solutions that leverage the native capabilities of Hadoop but add a layer of proven technology for best-in-class data quality. Probably the biggest news of the last few weeks is that we were acquired by a new private equity partner called Centerbridge Partners. That acquisition actually acquired both Syncsort and a company called Vision Solutions, and we've combined those organizations.
>> John: When did that happen?
>> The deal was announced in early July and it closed in the middle of August. Vision Solutions is a really interesting company. They're the leader in high availability for the IBM i market; IBM i was originally called AS/400, it's had a couple of different names, and it's a dominant kind of market position. What we liked about that business was, A, that market position, four thousand customers, generally large enterprise, and also market-leading capability around data replication in real time. And we saw IBM...
>> Migration data, disaster recovery kind of thing?
>> It's DR, it's high availability, it's migrations, and it's also change data capture, actually, all leveraging common technology elements. But it also represents a market-leading franchise in IBM i, which is in many ways very similar to the mainframe: optimized for transactional systems, hard to get at.
>> Sounds like you're reconstructing the mainframe in the cloud.
>> It's not so much that; it's the recognition that those compute systems still run the world. They still run all the transactions.
>> Well, some say the cloud is a software mainframe.
>> I think over time you'll see that; we don't see it in our business today. There is a cloud aspect to our business, but it's not about moving those transactional applications into the cloud yet, although I suspect that happens at some point. Our interest was more that these are the systems producing the world's data, and that data is hard to get to.
>> They are big, big power sources for data, and they're not going anywhere.
>> So we've got the expertise to source that data into these next-generation systems, and that's a tricky problem for a lot of customers, and not something...
>> It's a problem they have, and you guys have basically cornered the market on it.
>> So think about Big Iron to Big Data as these two components: being able to source data and make it productive in these next-generation analytics systems, and also being able to run those existing systems as efficiently as possible.
>> All right, so how do you talk to customers? I've asked this question before, so I'll just ask again: oh, Syncsort, now you've got Vision, you guys are just a bunch of old mainframe guys. What do you know about cloud native? A lot of the hipsters and young guns out there might not know about some of the things you're doing on the cutting edge, because even though you have the power base of these old big systems, they're throwing off massive amounts of data that aren't going anywhere, and you're still integrated into some cutting-edge technology. Talk about that narrative.
>> So the folks that we target...
>> I used cloud only as an example. Shiny, cool, new toys.
>> The organizations we target, our customers and prospects, are generally large, complex, global enterprises. They are making significant investments in Hadoop and Splunk and these next-generation environments. We approach them and say: we believe that to get full value out of your investments in these next-generation technologies, it would be helpful to have your most critical data assets available. That's hard, and we can help you do it, in a number of ways you won't find anywhere else. That includes features in our products and it includes experts on the ground. And we're seeing huge demand, because Hadoop, you can see it in the Cloudera and Hortonworks results and the scale of revenue, is a real foundational component of data management at this point. Enterprises are embracing it. If they can't solve that integration challenge between the systems that produce all the data and where they want to analyze it, there's a big value gap. We think we're uniquely positioned to solve it: one, because we've got the technical expertise, and two, they're all our customers at this point; we have six thousand customers.
>> You guys have executed very well, I've just got to say; you're steadily taking territory, and you've got a great strategy. You don't overplay your hand or get over your skis, whatever you want to call it; you figure it out, see if it's a fit, and if it is, grab it; if not, you move on. You also have relationships, so let's talk about your ecosystem. What is your ecosystem, and what is your partner strategy?
>> I'll talk a little bit about the overall strategy and how partners fit into it. Our strategy is to identify specific use cases that are common and challenging in our customer set and fall within this Big Iron to Big Data umbrella, and then to deliver a solution that is highly differentiated. The third piece is to partner very closely with the emerging platform vendors in the Big Data space, because we're solving an integration challenge for them: Cloudera, Hortonworks, Splunk. We launched a relationship with Collibra in the middle of the year; we just announced that relationship.
>> Yeah, the benefit for them is they don't do the heavy lifting; you've got that covered.
>> We can solve a lot of the pain points they have getting their platforms set up.
>> That's hard to replicate on their end; it's not like they're going to go build it.
>> Cloudera and Hortonworks don't have mainframe skills. They don't understand how to go access it.
>> Classic partnering example.
>> But the other piece is that we do real engineering work in these partnerships. We build, we write code to integrate with and add value to these platforms.
>> It's not a Barney deal, it's not an optical deal.
>> Absolutely.
>> That kind of engineering-backed partnership seems to be back in vogue, thank God: people say they're going to do a deal and then they back it by actually following through. What about other partnerships? Where do they fit in your business; are people coming to you, are you going to them?
>> We certainly have people coming to us. The key thing, the number one driver, is customers. As we understand use cases, as customers introduce us to new challenges they're facing, we look not just at how we solve them but also at what other platforms we're integrating with, and if we believe we can add unique value to a partner, we'll approach that partner.
>> Let's talk customers. Give me some customer use cases you're working on right now that you think are notable, worth highlighting.
>> Sure. We do a lot in the financial services space; we have a number of customers...
>> Where there's mainframes.
>> Where there are a lot of mainframes, but it's not just financial services. Here's an interesting one: an insurance company that was looking to transition its mainframe archive strategy. They have regulations around how long they have to keep data, and they had been using traditional mainframe archive technology, very expensive on an annual basis and also inflexible; they didn't have access to the data.
>> And performance, too. At the end of the day, don't forget performance.
>> They want performance, but this was more of an archive use case, and what they really wanted was the ability to both access the data and lower the cost of storing it for the time required from a regulation perspective. So they decided they wanted to store it in the cloud, in S3. There's complicated data movement there, and a complicated data translation process; you need to understand the mainframe, and you need to understand AWS and S3 and all those components. We had all those pieces and all that expertise and were able to solve it, and we're doing that with a few different customers now. It's an example where there's a great ROI, a lot more business flexibility, and a modernization aspect that's very attractive.
>> Well, great to hear from you today. I'm glad you made it up here; again, you were in DC yesterday, so thanks for coming in and checking out two shows. You're certainly pounding the pavement, as they say in New York, to quote a New Yorker phrase. What's new for you guys, what's coming out? More acquisitions happening? What's the outlook for Syncsort?
>> We're always active on the M&A front. We certainly have a pipeline of activity, and there are a lot of different interesting spaces and adjacencies we're exploring right now. There's nothing I can really talk about there.
>> Can you talk about the categories you're looking at?
>> Sure: things around metadata management, things around real-time data movement, cloud opportunities. There are some interesting opportunities in the artificial intelligence and machine learning space. Those are all...
>> Deep learning.
>> Deep learning, those are all interesting spaces for us to think about. Security is another space that's interesting. So we're pretty active in a lot of adjacencies.
>> Classic adjacent markets that you're looking at. So you take it one step at a time.
>> But then we try to innovate after the catch, so we made three announcements this week: transaction tracing for Ironstream, and a kind of refresh of the data quality for Hadoop approach. So we'll continue to innovate organically as well.
>> Final question: the whole private equity thing. That's done, they put a big bag of money in and brought the two companies together. Are there structural changes, management changes? You're the Syncsort CEO; is there a new company name?
>> The combined companies will operate under the Syncsort name, and I'll serve as the CEO.
>> Syncsort is the remaining name, and you now have another company under it.
>> Yes, that's right.
>> And the cash they put in, probably a boatload of cash for corporate development.
>> The announced deal value was a little over $1.2 billion.
>> So you get a checkbook and you're looking to buy companies?
>> We are. We're going to continue; as I said to Dave yesterday, I like to believe we've proved the hypothesis and we're in about the second inning. Can't wait to keep playing the game.
>> It's interesting, just real quick while I've got you here, we've got a break coming up. The private equity move is a good move in these transitional markets; you and I have talked about this in the past off-camera. It's a great thing to do if you're public and not really knocking it out of the park: kill the 90-day shot clock, go private, retool, and re-emerge stronger; there seems to be a lot of movement there.
>> We've never been public, but I will say the Centerbridge team has been terrific, a lot of resources there, and while we're still very quarterly focused, I think we've got a great partner and look forward to continuing.
>> The waves are coming, the big waves are coming, so get your big surfboard out, as we say in California. Josh, thanks for spending the time. Josh Rogers, CEO of Syncsort, here on theCUBE. More live coverage in New York after this break. Stay with us for day two of three days of coverage of Big Data NYC 2017, our event that we hold every year here in conjunction with Hadoop World, right around the corner. I'm John Furrier; we'll be right back.
SUMMARY :
Josh Rogers, CEO of Syncsort, joins John Furrier at Big Data NYC 2017 to discuss the company's evolution from its mainframe roots into high-performance data integration and its "Big Iron to Big Data" strategy: helping large enterprises feed mainframe and IBM i data into next-generation platforms such as Hadoop and Splunk while running those legacy systems efficiently. Rogers reviews recent corporate development, including the Trillium data quality acquisition and the Centerbridge Partners deal that combined Syncsort with Vision Solutions at a value of just over $1.2 billion, and describes engineering-level partnerships with Cloudera, Hortonworks, Splunk, and Collibra. Customer examples include an insurance company moving its mainframe archive to S3 for lower cost and greater flexibility. He closes with the week's product announcements, such as transaction tracing for Ironstream and a refreshed data quality for Hadoop approach, and notes that the company remains active on M&A in adjacencies like metadata management, real-time data movement, cloud, security, and AI.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
John | PERSON | 0.99+ |
New York | LOCATION | 0.99+ |
Josh Rogers | PERSON | 0.99+ |
2016 | DATE | 0.99+ |
California | LOCATION | 0.99+ |
$1.2 billion | QUANTITY | 0.99+ |
Syncsort | ORGANIZATION | 0.99+ |
July | DATE | 0.99+ |
John Furrier | PERSON | 0.99+ |
Josh | PERSON | 0.99+ |
two companies | QUANTITY | 0.99+ |
Centerbridge Partners | ORGANIZATION | 0.99+ |
New York City | LOCATION | 0.99+ |
90 day | QUANTITY | 0.99+ |
Washington, DC | LOCATION | 0.99+ |
yesterday | DATE | 0.99+ |
2010 | DATE | 0.99+ |
Centerbridge | ORGANIZATION | 0.99+ |
three days | QUANTITY | 0.99+ |
Vision Solutions | ORGANIZATION | 0.99+ |
Cloudera | ORGANIZATION | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
five | QUANTITY | 0.99+ |
DC | LOCATION | 0.99+ |
Big Iron | ORGANIZATION | 0.99+ |
third piece | QUANTITY | 0.99+ |
Calibra | ORGANIZATION | 0.99+ |
Hadoop World | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
two ways | QUANTITY | 0.99+ |
two components | QUANTITY | 0.99+ |
early July | DATE | 0.99+ |
Trillium | ORGANIZATION | 0.99+ |
Hadoop | TITLE | 0.99+ |
both sides | QUANTITY | 0.99+ |
SiliconANGLE Media | ORGANIZATION | 0.99+ |
late 60's | DATE | 0.98+ |
today | DATE | 0.98+ |
middle of August | DATE | 0.98+ |
M&A | ORGANIZATION | 0.98+ |
this week | DATE | 0.98+ |
AWS | ORGANIZATION | 0.98+ |
six thousand customers | QUANTITY | 0.98+ |
NYC | LOCATION | 0.98+ |
Midtown Manhattan | LOCATION | 0.98+ |
one step | QUANTITY | 0.98+ |
Splunk | ORGANIZATION | 0.98+ |
four thousand customers | QUANTITY | 0.98+ |
eight years | QUANTITY | 0.98+ |
vision solutions | ORGANIZATION | 0.98+ |
over $1.2 billion | QUANTITY | 0.97+ |
both | QUANTITY | 0.97+ |
Barney | ORGANIZATION | 0.97+ |
S3 | TITLE | 0.97+ |
Ironstream | ORGANIZATION | 0.97+ |
Splunk Conference | EVENT | 0.97+ |
About 4 years ago | DATE | 0.96+ |
two | QUANTITY | 0.96+ |
past year | DATE | 0.96+ |
three announcements | QUANTITY | 0.96+ |
Concord | LOCATION | 0.95+ |
theCUBE | ORGANIZATION | 0.95+ |