Starburst Panel Q1
>>In 2011, early Facebook employee and Cloudera co-founder Jeff Ocker famously said the best minds of my generation are thinking about how to get people to click on ads. And that sucks. Let's face it more than a decade later organizations continue to be frustrated with how difficult it is to get value from data and build a truly agile data driven enterprise. What does that even mean? You ask? Well, it means that everyone in the organization has the data they need when they need it. In a context that's relevant to advance the mission of an organization. Now that could mean cutting costs could mean increasing profits, driving productivity, saving lives, accelerating drug discovery, making better diagnoses, solving, supply chain problems, predicting weather disasters, simplifying processes, and thousands of other examples where data can completely transform people's lives beyond manipulating internet users to behave a certain way. We've heard the prognostications about the possibilities of data before and in fairness we've made progress, but the hard truth is the original promises of master data management, enterprise data, warehouses, data, Mars, data hubs, and yes, even data lakes were broken and left us wanting for more welcome to the data doesn't lie, or does it a series of conversations produced by the cube and made possible by Starburst data. >>I'm your host, Dave Lanta and joining me today are three industry experts. Justin Borgman is this co-founder and CEO of Starburst. Richard Jarvis is the CTO at EMI health and Theresa tongue is cloud first technologist at Accenture. Today we're gonna have a candid discussion that will expose the unfulfilled and yes, broken promises of a data past we'll expose data lies, big lies, little lies, white lies, and hidden truths. And we'll challenge, age old data conventions and bust some data myths. We're debating questions like is the demise of a single source of truth. Inevitable will the data warehouse ever have feature parody with the data lake or vice versa is the so-called modern data stack simply centralization in the cloud, AKA the old guards model in new cloud close. How can organizations rethink their data architectures and regimes to realize the true promises of data can and will and open ecosystem deliver on these promises in our lifetimes, we're spanning much of the Western world today. Richard is in the UK. Teresa is on the west coast and Justin is in Massachusetts with me. I'm in the cube studios about 30 miles outside of Boston folks. Welcome to the program. Thanks for coming on. Thanks for having us. Let's get right into it. You're very welcome. Now here's the first lie. The most effective data architecture is one that is centralized with a team of data specialists serving various lines of business. What do you think Justin? >>Yeah, definitely a lie. My first startup was a company called hit adapt, which was an early SQL engine for IDU that was acquired by Teradata. And when I got to Teradata, of course, Terada is the pioneer of that central enterprise data warehouse model. One of the things that I found fascinating was that not one of their customers had actually lived up to that vision of centralizing all of their data into one place. They all had data silos. They all had data in different systems. They had data on-prem data in the cloud. You know, those companies were acquiring other companies and inheriting their data architecture. So, you know, despite being the industry leader for 40 years, not one of their customers truly had everything in one place. So I think definitely history has proven that to be a lie. >>So Richard, from a practitioner's point of view, you know, what, what are your thoughts? I mean, there, there's a lot of pressure to cut cost, keep things centralized, you know, serve the business as best as possible from that standpoint. What, what is your experience, Joe? >>Yeah, I mean, I think I would echo Justin's experience really that we, as a business have grown up through acquisition, through storing data in different places sometimes to do information governance in different ways to store data in, in a platform that's close to data experts, people who really understand healthcare data from pharmacies or from, from doctors. And so, although if you were starting from a Greenfield site and you were building something brand new, you might be able to centralize all the data and all of the tooling and teams in one place. The reality is that that businesses just don't grow up like that. And, and it's just really impossible to get that academic perfection of, of storing everything in one place. >>Y you know, Theresa, I feel like Sarbanes Oxley kinda saved the data warehouse, you know? Right. But you actually did have to have a single version of the truth for certain financial data, but really for those, some of those other use cases, I, I mentioned, I, I do feel like the industry has kinda let us down. What's your take on this? Where does it make sense to have that sort of centralized approach versus where does it make sense to maybe decentralized? >>I, I think you gotta have centralized governance, right? So from the central team, for things like swans Oxley, for things like security, for certain very core data sets, having a centralized set of roles, responsibilities to really QA, right. To serve as a design authority for your entire data estate, just like you might with security, but how it's implemented has to be distributed. Otherwise you're not gonna be able to scale. Right? So being able to have different parts of the business really make the right data investments for their needs. And then ultimately you're gonna collaborate with your partners. So partners that are not within the company, right. External partners, we're gonna see a lot more data sharing and model creation. And so you're definitely going to be decentralized. >>So, you know, Justin, you guys last, geez, I think it was about a year ago, had a session on, on data mesh. It was a great program. You invited JAK, Dani, of course, she's the creator of the data mesh. And her one of our fundamental premises is that you've got this hyper specialized team that you've gotta go through. And if you want anything, but at the same time, these, these individuals actually become a bottleneck, even though they're some of the most talented people in the organization. So I guess question for you, Richard, how do you deal with that? Do you, do you organize so that there are a few sort of rock stars that, that, you know, build cubes and, and the like, and, and, and, or have you had any success in sort of decentralizing with, you know, your, your constituencies, that data model? >>Yeah. So, so we absolutely have got rockstar, data scientists and data guardians. If you like people who understand what it means to use this data, particularly as the data that we use at emos is very private it's healthcare information. And some of the, the rules and regulations around using the data are very complex and, and strict. So we have to have people who understand the usage of the data, then people who understand how to build models, how to process the data effectively. And you can think of them like consultants to the wider business, because a pharmacist might not understand how to structure a SQL query, but they do understand how they want to process medication information to improve patient lives. And so that becomes a, a consulting type experience from a, a set of rock stars to help a, a more decentralized business who needs to, to understand the data and to generate some valuable output. >>Justin, what do you say to a, to a customer or prospect that says, look, Justin, I'm gonna, I got a centralized team and that's the most cost effective way to serve the business. Otherwise I got, I got duplication. What do you say to that? >>Well, I, I would argue it's probably not the most cost effective and, and the reason being really twofold. I think, first of all, when you are deploying a enterprise data warehouse model, the, the data warehouse itself is very expensive, generally speaking. And so you're putting all of your most valuable data in the hands of one vendor who now has tremendous leverage over you, you know, for many, many years to come, I think that's the story of Oracle or Terra data or other proprietary database systems. But the other aspect I think is that the reality is those central data warehouse teams is as much as they are experts in the technology. They don't necessarily understand the data itself. And this is one of the core tenets of data mash that that jam writes about is this idea of the domain owners actually know the data the best. >>And so by, you know, not only acknowledging that data is generally decentralized and to your earlier point about, so Oxley, maybe saving the data warehouse, I would argue maybe GDPR and data sovereignty will destroy it because data has to be decentralized for, for those laws to be compliant. But I think the reality is, you know, the data mesh model basically says, data's decentralized, and we're gonna turn that into an asset rather than a liability. And we're gonna turn that into an asset by empowering the people that know the data, the best to participate in the process of, you know, curating and creating data products for, for consumption. So I think when you think about it, that way, you're going to get higher quality data and faster time to insight, which is ultimately going to drive more revenue for your business and reduce costs. So I think that that's the way I see the two, the two models comparing and con contrasting. >>So do you think the demise of the data warehouse is inevitable? I mean, I mean, you know, there Theresa you work with a lot of clients, they're not just gonna rip and replace their existing infrastructure. Maybe they're gonna build on top of it, but the, what does that mean? Does that mean the ed w just becomes, you know, less and less valuable over time, or it's maybe just isolated to specific use cases. What's your take on that? >>Listen, I still would love all my data within a data warehouse would love it. Mastered would love it owned by essential team. Right? I think that's still what I would love to have. That's just not the reality, right? The investment to actually migrate and keep that up to date. I would say it's a losing battle. Like we've been trying to do it for a long time. Nobody has the budgets and then data changes, right? There's gonna be a new technology. That's gonna emerge that we're gonna wanna tap into. There's gonna be not enough investment to bring all the legacy, but still very useful systems into that centralized view. So you keep the data warehouse. I think it's a very, very valuable, very high performance tool for what it's there for, but you could have this, you know, new mesh layer that still takes advantage of the things. I mentioned, the data products in the systems that are meaningful today and the data products that actually might span a number of systems. Maybe either those that either source systems, the domains that know it best, or the consumer based systems and products that need to be packaged in a way that be really meaningful for that end user, right? Each of those are useful for a different part of the business and making sure that the mesh actually allows you to lose all of them. >>So, Richard, let me ask you, you take, take Gemma's principles back to those. You got, you know, the domain ownership and, and, and data as product. Okay, great. Sounds good. But it creates what I would argue or two, you know, challenges self-serve infrastructure let's park that for a second. And then in your industry, one of the high, most regulated, most sensitive computational governance, how do you automate and ensure federated governance in that mesh model that Theresa was just talking about? >>Well, it absolutely depends on some of the tooling and processes that you put in place around those tools to be, to centralize the security and the governance of the data. And, and I think, although a data warehouse makes that very simple, cause it's a single tool, it's not impossible with some of the data mesh technologies that are available. And so what we've done at EMI is we have a single security layer that sits on top of our data mesh, which means that no matter which user is accessing, which data source, we go through a well audited well understood security layer. That means that we know exactly who's got access to which data field, which data tables. And then everything that they do is, is audited in a very kind of standard way, regardless of the underlying data storage technology. So for me, although storing the data in one place might not be possible understanding where your source of truth is and securing that in a common way is still a valuable approach and you can do it without having to bring all that data into a single bucket so that it's all in one place. >>And, and so having done that and investing quite heavily in making that possible has paid dividends in terms of giving wider access to the platform and ensuring that only data that's available under GDPR and other regulations is being used by, by the data users. >>Yeah. So Justin mean Democrat, we always talk about data democratization and you know, up until recently, they really haven't been line of sight as to how to get there. But do you have anything to add to this because you're essentially taking, you know, doing analytic queries and with data, that's all dispersed all over the, how are you seeing your customers handle this, this challenge? >>Yeah, I mean, I think data products is a really interesting aspect of the answer to that. It allows you to, again, leverage the data domain owners, people know the data, the best to, to create, you know, data as a product ultimately to be consumed. And we try to represent that in our product as effectively, almost eCommerce, like experience where you go and discover and look for the data products that have been created in your organization. And then you can start to consume them as, as you'd like. And so really trying to build on that notion of, you know, data democratization and self-service, and making it very easy to discover and, and start to use with whatever BI tool you, you may like, or even just running, you know, SQL queries yourself. >>Okay. G guys grab a sip of water. After the short break, we'll be back to debate whether proprietary or open platforms are the best path to the future of data excellence. Keep it right there.
SUMMARY :
famously said the best minds of my generation are thinking about how to get people to Teresa is on the west coast and Justin is in Massachusetts with me. So, you know, despite being the industry leader for 40 years, not one of their customers truly had So Richard, from a practitioner's point of view, you know, what, what are your thoughts? you might be able to centralize all the data and all of the tooling and teams in one place. Y you know, Theresa, I feel like Sarbanes Oxley kinda saved the data warehouse, I, I think you gotta have centralized governance, right? of rock stars that, that, you know, build cubes and, and the like, And you can think of them like consultants Justin, what do you say to a, to a customer or prospect that says, look, Justin, I'm gonna, you know, for many, many years to come, I think that's the story of Oracle or Terra data or other proprietary But I think the reality is, you know, the data mesh model basically says, I mean, you know, there Theresa you work with a lot of clients, they're not just gonna rip and replace their existing you know, new mesh layer that still takes advantage of the things. But it creates what I would argue or two, you know, Well, it absolutely depends on some of the tooling and processes that you put in place around And, and so having done that and investing quite heavily in making that possible But do you have anything to add to this because you're essentially taking, you know, the best to, to create, you know, data as a product ultimately to be consumed. open platforms are the best path to the future of
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Lanta | PERSON | 0.99+ |
Dani | PERSON | 0.99+ |
Richard | PERSON | 0.99+ |
Justin Borgman | PERSON | 0.99+ |
Justin | PERSON | 0.99+ |
Jeff Ocker | PERSON | 0.99+ |
Theresa | PERSON | 0.99+ |
Richard Jarvis | PERSON | 0.99+ |
Teresa | PERSON | 0.99+ |
Massachusetts | LOCATION | 0.99+ |
Teradata | ORGANIZATION | 0.99+ |
40 years | QUANTITY | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
UK | LOCATION | 0.99+ |
two | QUANTITY | 0.99+ |
Joe | PERSON | 0.99+ |
GDPR | TITLE | 0.99+ |
JAK | PERSON | 0.99+ |
2011 | DATE | 0.99+ |
Starburst | ORGANIZATION | 0.99+ |
Boston | LOCATION | 0.99+ |
thousands | QUANTITY | 0.99+ |
two models | QUANTITY | 0.99+ |
EMI | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
Gemma | PERSON | 0.99+ |
Terada | ORGANIZATION | 0.99+ |
Accenture | ORGANIZATION | 0.99+ |
Each | QUANTITY | 0.99+ |
first lie | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
first startup | QUANTITY | 0.98+ |
Cloudera | ORGANIZATION | 0.98+ |
Today | DATE | 0.98+ |
SQL | TITLE | 0.98+ |
first technologist | QUANTITY | 0.97+ |
one place | QUANTITY | 0.97+ |
Democrat | ORGANIZATION | 0.97+ |
single | QUANTITY | 0.97+ |
about 30 miles | QUANTITY | 0.97+ |
one | QUANTITY | 0.96+ |
three industry experts | QUANTITY | 0.95+ |
more than a decade later | DATE | 0.94+ |
One | QUANTITY | 0.94+ |
hit adapt | ORGANIZATION | 0.94+ |
Terra data | ORGANIZATION | 0.93+ |
Greenfield | LOCATION | 0.92+ |
single source | QUANTITY | 0.91+ |
single tool | QUANTITY | 0.91+ |
Oxley | PERSON | 0.91+ |
one vendor | QUANTITY | 0.9+ |
single bucket | QUANTITY | 0.9+ |
single version | QUANTITY | 0.88+ |
about a year ago | DATE | 0.85+ |
Theresa tongue | PERSON | 0.83+ |
emos | ORGANIZATION | 0.82+ |
Mars | ORGANIZATION | 0.8+ |
swans Oxley | PERSON | 0.77+ |
IDU | TITLE | 0.69+ |
first | QUANTITY | 0.59+ |
a second | QUANTITY | 0.55+ |
Sarbanes Oxley | ORGANIZATION | 0.53+ |
Mastered | PERSON | 0.45+ |
Q1 | QUANTITY | 0.37+ |