Today’s Data Challenges and the Emergence of Smart Data Fabrics
(intro music) >> Now, as we all know, businesses are awash with data, from financial services to healthcare to supply chain and logistics and more. Our activities, and increasingly, actions from machines are generating new and more useful information in much larger volumes than we've ever seen. Now, meanwhile, our data-hungry society's expectations for experiences are increasingly elevated. Everybody wants to leverage and monetize all this new data coming from smart devices and innumerable sources around the globe. All this data, it surrounds us, but more often than not, it lives in silos, which makes it very difficult to consume, share, and make valuable. These factors, combined with new types of data and analytics, make things even more complicated. Data from ERP systems to images, to data generated from deep learning and machine learning platforms, this is the reality that organizations are facing today. And as such, effectively leveraging all of this data has become an enormous challenge. So, today, we're going to be discussing these modern data challenges and the emergence of so-called "Smart Data Fabrics" as a key solution to said challenges. To do so, we're joined by thought leaders from InterSystems. This is a really creative technology provider that's attacking some of the most challenging data obstacles. InterSystems tells us that they're dedicated to helping customers address their critical scalability, interoperability, and speed-to-value challenges. And in this first segment, we welcome Scott Gnau, he's the global Head of Data Platforms at InterSystems, to discuss the context behind these issues and how smart data fabrics provide a solution. Scott, welcome. Good to see you again. >> Thanks a lot. It's good to be here. >> Yeah. So, look, you and I go back, you know, several years and, you know, you've worked in Tech, you've worked in Data Management your whole career. You've seen many data management solutions, you know, from the early days. 
And then we went through the Hadoop era together and you've come across a number of customer challenges that sort of changed along the way. And they've evolved. So, what are some of the most pressing issues that you see today when you're talking to customers and, you know, put on your technical hat if you want to. >> (chuckles) Well, Dave, I think you described it well. It's a perfect storm out there. You know, combined with there's just data everywhere and it's coming up on devices, it's coming from new different kinds of paradigms of processing and people are trying to capture and harness the value from this data. At the same time, you talked about silos and I've talked about data silos through my entire career. And I think the interesting thing about it is for so many years we've talked about, "We've got to reduce the silos and we've got to integrate the data, we've got to consolidate the data." And that was a really good paradigm for a long time. But frankly, the perfect storm that you described? The sources are just too varied. The required agility for a business unit to operate and manage their customers is creating an enormous pressure and I think ultimately, silos aren't going away. So, there's a realization that, "Okay, we're going to have these silos, we want to manage them, but how do we really take advantage of data that may live across different parts of our business and in different organizations?" And then of course, the expectation of the consumer is at an all-time high, right? They expect that we're going to treat them and understand their needs or they're going to find some other provider. So, you know, pulling all of this together really means that, you know, our customers and businesses around the world are struggling to keep up and it's forcing a new paradigm shift in underlying data management, right? 
We started, you know, many, many years ago with data marts and then data warehouses and then we graduated to data lakes, where we expanded beyond just traditional transactional data into all kinds of different data. And at each step along the way, we help businesses to thrive and survive and compete and win. But with the perfect storm that you've described, I think those technologies are now just a piece of the puzzle that is really required for success. And this is really what's leading to data fabrics and data meshes in the industry. >> So what are data fabrics? What problems do they solve? How do they work? Can you just- >> Yeah. So the idea behind it is, and this is not to the exclusion of other technologies that I described in data warehouses and data lakes and so on, but data fabrics kind of take the best of those worlds but add in the notion of being able to do data connectivity with provenance as a way to integrate data versus data consolidation. And when you think about it, you know, data has gravity, right? It's expensive to move data. It's expensive in terms of human cost to do ETL processes where you don't have known provenance of data. So, being able to play data where it lies and connect the information from disparate systems to learn new things about your business is really the ultimate goal. You think about in the world today, we hear about issues with the supply chain and supply and logistics is a big issue, right? Why is that an issue? Because all of these companies are data-driven. They've got lots of access to data. They have formalized and automated their processes, they've installed software, and all of that software is in different systems within different companies. But being able to connect that information together, without changing the underlying system, is an important way to learn and optimize for supply and logistics, as an example. And that's a key use case for data fabrics. 
Being able to connect, have provenance, not interfere with the operational system, but glean additional knowledge by combining multiple different operational systems' data together. >> And to your point, data is by its very nature, you know, distributed around the globe, it's on different clouds, it's in different systems. You mentioned "data mesh" before. How do data fabrics relate to this concept of data mesh? Are they competing? Are they complementary? >> Ultimately, we think that they're complementary. And we actually like to talk about smart data fabrics as a way to kind of combine the best of the two worlds. >> What is that? >> The biggest thing really is there's a lot around data fabric architecture that talks about centralized processing. And in data meshes, it's more about distributed processing. Ultimately, we think a smart data fabric will support both and have them be interchangeable and be able to be used where it makes the most sense. There are some things where it makes sense to process, you know, for a local business unit, or even on a device for real-time kinds of implementations. There are some other areas where centralized processing of multiple different data sources makes sense. And what we're saying is, "Your technology and the architecture that you define behind that technology should allow for both where they make the most sense." >> What's the bottom line business benefit of implementing a data fabric? What can I expect if I go that route? >> I think there are a couple of things, right? Certainly, being able to interact with customers in real time and being able to manage through changes in the marketplace is certainly a key concept. Time-to-value is another key concept. You know, if you think about the supply and logistics discussion that I had before, right? No company is going to rewrite their ERP operational system. It's how they manage and run their business. 
But being able to glean additional insights from that data combined with data from a partner combined with data from a customer or combined with algorithmic data that, you know, you may create some sort of forecast and that you want to fit into. And being able to combine that together without interfering with the operational process and get those answers quickly is an important thing. So, seeing through the silos and being able to do the connectivity, being able to have interoperability, and then, combining that with flexibility on the analytics and flexibility on the algorithms you might want to run against that data. Because in today's world, of course, you know, certainly there's the notion of predictive modeling and relational theory, but also now adding in machine learning, deep learning algorithms, and having all of those things kind of be interchangeable is another important concept behind data fabric. So you're not relegated to one type of processing. You're saying, "It's data and I have multiple different processing engines and I may want to interchange them over time." >> So, I know, well actually, you know, when you said "real time", I infer from that, I don't have a zillion copies of the data and it's not in a bunch of silos. Is that a correct premise? >> You try to minimize your copies of the data? >> Yeah. Okay. >> There's certainly, there's a nirvana that says, "There's only ever one copy of data." That's probably impossible. But you certainly don't want to be forced into making multiple copies of data to support different processing engines unnecessarily. >> And so, you've recently made some enhancements to the data fabric capability that takes it, you know, ostensibly to the next level. Is that the smart piece? Is that machine intelligence? Can you describe what's in there? >> Well, you know, ultimately, the business benefit is to be able to have a single source of the truth for a company. 
And so, what we're doing is combining multiple technologies in a single set of software that makes that software agile and supportable and not fragile for deployment of applications. At its core, what we're saying is, you know, we want to be able to consume any kind of data and I think your data fabric architecture is predicated on the fact that you're going to have relational data, you're going to have document data, you may have key-value store data, you may have images, you may have other things, and you want to be able to not be limited by the kind of data that you want to process. And so that certainly is what we build into our product set. And then, you want to be able to have any kind of algorithm, where appropriate, run against that data without having to do a bunch of massive ETL processes or make another copy of the data and move it somewhere else. And so, to that end, taking our award-winning engine, which, you know, provides traditional analytic capabilities and relational capabilities, we've now integrated machine learning. So, you basically can bring machine learning algorithms to the data without having to move data to the machine learning algorithm. What does that mean? Well, number one, your application developer doesn't have to think differently to take advantage of the new algorithm. So that's a really good thing. The other thing that happens is if you're playing that algorithm where the data actually exists from your operational system, that means the round trip from running the model to inferring some decision you want to make to actually implementing that decision can happen instantaneously, as opposed to, you know, other kinds of architectures, where you may want to make a copy of the data and move it somewhere else. That takes time, latency. Now the data gets stale, your model may not be as efficient because you're running against stale data. 
We've now taken all of that off the table by being able to pull that processing inside the data fabric, inside of the single source of truth. >> And you got to manage all that complexity. So you got one system, so that makes it, you know, cost-effective, and you're bringing modern tooling to the platform. Is that right? >> That's correct. >> How can people learn more and maybe continue the conversation with you if they have other questions? (both chuckle) >> Call or write. >> Yeah. >> Yeah, I mean, certainly, check out our website. We've got a lot of information about the different kinds of solutions, the different industries, the different technologies. Reach out: scottg@intersystems.com. >> Excellent. Thank you, Scott. Really appreciate it and great to see you again. >> Good to see you. >> All right, keep it right there. We have a demo coming up next. You want to see smart data fabrics in action? Stay tuned. (ambient music)
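The two ideas Scott describes in this segment, connecting data where it lives rather than consolidating it, and bringing the algorithm to the data rather than exporting the data, can be illustrated with a small sketch. This is not InterSystems' API or product behavior. It is a toy under stated assumptions: two SQLite files stand in for separate operational systems (an ERP and a logistics system), the table and column names are hypothetical, and `late_risk` is a made-up scoring rule standing in for a trained model.

```python
import sqlite3
import tempfile
from pathlib import Path


def build_source(path, ddl, insert_sql, rows):
    """Create one stand-in 'operational system' as its own SQLite file."""
    conn = sqlite3.connect(path)
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    conn.close()


def late_risk(days_in_transit, promised_days):
    """Hypothetical scoring rule standing in for a trained model."""
    if promised_days <= 0:
        return 1.0
    raw = days_in_transit / promised_days - 0.5
    return round(min(1.0, max(0.0, raw)), 2)


def score_in_place(workdir):
    """Join two separate systems and score rows without copying either one."""
    erp_path = str(Path(workdir) / "erp.db")
    logistics_path = str(Path(workdir) / "logistics.db")

    # Hypothetical ERP system: what was promised to each customer.
    build_source(
        erp_path,
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, promised_days INTEGER)",
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "acme", 4), (2, "globex", 10)],
    )
    # Hypothetical logistics system: what is actually happening in transit.
    build_source(
        logistics_path,
        "CREATE TABLE shipments (order_id INTEGER, days_in_transit INTEGER)",
        "INSERT INTO shipments VALUES (?, ?)",
        [(1, 5), (2, 3)],
    )

    conn = sqlite3.connect(erp_path)
    # Connect the second system in place instead of running ETL into a copy.
    conn.execute("ATTACH DATABASE ? AS logistics", (logistics_path,))
    # Register the scoring function so it runs inside the query engine.
    conn.create_function("late_risk", 2, late_risk)
    rows = conn.execute(
        """
        SELECT o.order_id,
               o.customer,
               late_risk(s.days_in_transit, o.promised_days) AS risk
        FROM orders AS o
        JOIN logistics.shipments AS s ON s.order_id = o.order_id
        ORDER BY o.order_id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        for row in score_in_place(workdir):
            print(row)
```

Neither source schema is changed and no intermediate copy is written; the join and the scoring both happen at query time, which is the property the conversation emphasizes. A real fabric would add provenance tracking, security, and connectors to heterogeneous systems, none of which this sketch attempts.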
ENTITIES
Entity | Category | Confidence |
---|---|---|
Scott | PERSON | 0.99+ |
InterSystems | ORGANIZATION | 0.99+ |
Dave | PERSON | 0.99+ |
Scott Gnau | PERSON | 0.99+ |
scottg@intersystems.com | OTHER | 0.99+ |
one system | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
one copy | QUANTITY | 0.99+ |
today | DATE | 0.98+ |
first segment | QUANTITY | 0.98+ |
single | QUANTITY | 0.97+ |
each step | QUANTITY | 0.96+ |
two worlds | QUANTITY | 0.96+ |
single source | QUANTITY | 0.96+ |
single set | QUANTITY | 0.94+ |
Today | DATE | 0.91+ |
many years ago | DATE | 0.84+ |
zillion copies | QUANTITY | 0.73+ |
one type | QUANTITY | 0.71+ |
one | QUANTITY | 0.64+ |
Lie 3, Today’s Modern Data Stack Is Modern | Starburst
(energetic music) >> Okay, we're back with Justin Borgman, CEO of Starburst, Richard Jarvis is the CTO of EMIS Health, and Teresa Tung is the cloud-first technologist from Accenture. We're on to lie number three. And that is the claim that today's "Modern Data Stack" is actually modern. So (chuckles), I guess that's the lie. Or, is that it's not modern. Justin, what do you say? >> Yeah, I think new isn't modern. Right? I think it's the new data stack. It's the cloud data stack, but that doesn't necessarily mean it's modern. I think a lot of the components actually are exactly the same as what we've had for 40 years. Rather than Teradata, you have Snowflake. Rather than Informatica, you have Fivetran. So, it's the same general stack, just, y'know, a cloud version of it. And I think a lot of the challenges that have plagued us for 40 years still maintain. >> So, let me come back to you, Justin. Okay, but there are differences, right? You can scale. You can throw resources at the problem. You can separate compute from storage. You really, there's a lot of money being thrown at that by venture capitalists, and Snowflake, you mentioned, and its competitors. So that's different. Is it not? Is that not at least an aspect of modern, dial it up, dial it down? So what do you say to that? >> Well, it is. It's certainly taking, y'know, what the cloud offers and taking advantage of that. But it's important to note that the cloud data warehouses out there are really just separating their compute from their storage. So it's allowing them to scale up and down, but your data's still stored in a proprietary format. You're still locked in. You still have to ingest the data to get it even prepared for analysis. So a lot of the same structural constraints that exist with the old enterprise data warehouse model on-prem still exist. Just yes, a little bit more elastic now because the cloud offers that. >> So Teresa, let me go to you, 'cause you have cloud-first in your title. 
So, what say you to this conversation? >> Well, even the cloud providers are looking towards more of a cloud continuum, right? So the centralized cloud as we know it, maybe data lake, data warehouse in the central place, that's not even how the cloud providers are looking at it. They have query services. Every provider has one that really expands those queries to be beyond a single location. And if we look at a lot of where the future goes, right? That's going to very much follow the same thing. There's going to be more edge. There's going to be more on-premise, because of data sovereignty, data gravity, because you're working with different parts of the business that have already made major cloud investments in different cloud providers, right? So, there's a lot of reasons why the modern, I guess, the next modern generation of the data stack needs to be much more federated. >> Okay, so Richard, how do you deal with this? You've obviously got, you know, the technical debt, the existing infrastructure, it's on the books. You don't want to just throw it out. A lot of conversation about modernizing applications, which a lot of times is, you know, a microservices layer on top of legacy apps. How do you think about the Modern Data Stack? >> Well, I think probably the first thing to say is that the stack really has to include the processes and people around the data as well. It's all well and good changing the technology. But if you don't modernize how people use that technology, then you're not going to be able to scale, because just 'cause you can scale CPU and storage doesn't mean you can get more people to use your data to generate you more value for the business. And so what we've been looking at is really changing, very much aligned to data products and data mesh. How do you enable more people to consume the service and have the stack respond in a way that keeps costs low? 
Because that's important for our customers consuming this data, but also allows people to occasionally run enormous queries and then tick along with smaller ones when required. And it's a good job we did, because during COVID all of a sudden we had enormous pressures on our data platform to answer really important, life-threatening queries. And if we couldn't scale both our data stack and our teams, we wouldn't have been able to answer those as quickly as we did. So I think the stack needs to support a scalable business, not just the technology itself. >> Well, thank you for that. So Justin, let's try to break down what the critical aspects are of the modern data stack. So you think about the past, you know, five, seven years. Cloud obviously has given a different pricing model. De-risked experimentation. You know, we talked about the ability to scale up, scale down, but I'm taking away that that's not enough. Based on what Richard just said, the modern data stack has to serve the business and enable the business to build data products. I buy that. I'm a big fan of the data mesh concepts, even though we're early days. So what are the critical aspects if you had to think about, you know, maybe putting some guardrails and definitions around the modern data stack? What does that look like? What are some of the attributes and principles there? >> Of how it should look like or how- >> Yeah. What it should be? >> Yeah. Yeah. Well, I think, you know, Teresa mentioned this in a previous segment: the data warehouse is not necessarily going to disappear. It just becomes one node, one element of the overall data mesh. And I certainly agree with that. So by no means are we suggesting that, you know, Snowflake or Redshift or whatever cloud data warehouse you may be using is going to disappear, but it's not going to become the end-all, be-all. It's not the central single source of truth. 
And I think that's the paradigm shift that needs to occur. And I think it's also worth noting that those who were the early adopters of the modern data stack were primarily digital-native, born-in-the-cloud young companies who had the benefit of idealism. They had the benefit of starting with a clean slate. That does not reflect the vast majority of enterprises. And even those companies, as they grow up and mature out of that ideal state, they go buy a business. Now they've got something on another cloud provider that has a different data stack, and they have to deal with that heterogeneity. That is just change, and change is a part of life. And so I think there is an element here that is almost philosophical. It's like, do you believe in an absolute ideal where I can just fit everything into one place, or do I believe in reality? And I think the far more pragmatic approach is really what data mesh represents. So to answer your question directly, I think it's adding, you know, the ability to access data that lives outside of the data warehouse, maybe living in open data formats in a data lake, or accessing operational systems as well. Maybe you want to directly access data that lives in an Oracle database or a Mongo database or what have you. So creating that flexibility to really future-proof yourself from the inevitable change that you will encounter over time. >> So thank you. So Teresa, based on what Justin just said, my takeaway there is it's inclusive: whether it's a data mart, data hub, data lake, or data warehouse, it's just a node on the mesh. Okay. I get that. Does that include, Teresa, on-prem data? Obviously it has to. What are you seeing in terms of the ability to take that data mesh concept on-prem? I mean, most implementations I've seen in data mesh, frankly, really aren't, you know, adhering to the philosophy there. Maybe it's a data lake and maybe it's using Glue.
You look at what JPMC is doing, HelloFresh, a lot of stuff happening on the AWS cloud in that, you know, closed stack, if you will. What's the answer to that, Teresa? >> I mean, I think it's a killer case for data mesh. The fact that you have valuable data sources on-prem, and yet you still want to modernize and take the best of cloud. Cloud is still, like we mentioned, there are a lot of great reasons for it around the economics and the ability to tap into the innovation that the cloud providers are giving around data and AI architecture. It's an easy button. So the mesh allows you to have the best of both worlds. You can start using the data products on-prem, or in the existing systems that are working already. It's meaningful for the business. At the same time, you can modernize the ones that make business sense because they need better performance, they need, you know, something that is cheaper, or maybe just tapping into better analytics to get better insights, right? So you're going to be able to stretch and really have the best of both worlds. That, again, going back to Richard's point, is meaningful to the business. Not everything has to have that one-size-fits-all set of tools. >> Okay. Thank you. So Richard, you know, talking about data as product, I wonder if you could give us your perspectives here. What are the advantages of treating data as a product? What role do data products have in the modern data stack? We talk about monetizing data. What are your thoughts on data products? >> So for us, one of the most important data products that we've been creating is taking data that is healthcare data across a wide variety of different settings.
So information about patients' demographics, about their treatment, about their medications and so on, and taking that into a standard format that can be utilized by a wide variety of different researchers, because misinterpreting that data or having the data not presented in the way that the user is expecting means that you generate the wrong insight. In any business that's clearly not a desirable outcome, but when that insight is as critical as it might be in healthcare or some security settings, you really have to have gone to the trouble of understanding the data and presenting it in a format that everyone can clearly agree on, and then letting people consume it in a very structured, managed way, even if that data comes from a variety of different sources in the first place. And so our data product journey has really begun by standardizing data across a number of different silos through the data mesh, so we can present it out both internally and, through the right governance, externally to researchers. >> So that data product, through whatever APIs, is accessible, it's discoverable, but it's obviously got to be governed as well. You mentioned appropriately provided internally. >> Yeah. >> But also, you know, to external folks as well. So you've architected that capability today? >> We have, and because the data is standard it can generate value much more quickly, and we can be sure of the security and value that it's providing, because the data product isn't just about formatting the data into the correct tables. It's understanding what it means to redact the data, or to remove certain rows from it, or to interpret what a date actually means. Is it the start of the contract or the start of the treatment or the date of birth of a patient?
These things can be lost in the data storage without having the proper product management around the data to say, in a very clear business context, what does this data mean, and what does it mean to process this data for a particular use case. >> Yeah, it makes sense. It's got the context. If the domains own the data, you know, you've got to cut through a lot of the centralized teams, the technical teams that are data agnostic. They don't really have that context. All right, let's end. Justin, how does Starburst fit into this modern data stack? Bring us home. >> Yeah. So I think for us it's really providing our customers with, you know, the flexibility to operate and analyze data that lives in a wide variety of different systems. Ultimately giving them that optionality, you know, and optionality provides the ability to reduce costs and store more in a data lake rather than a data warehouse. It provides the ability for the fastest time to insight, to access the data directly where it lives. And ultimately, with this concept of data products that we've now, you know, incorporated into our offering as well, you can really create and curate, you know, data as a product to be shared and consumed. So we're trying to help enable the data mesh, you know, model and make that an appropriate complement to, you know, the modern data stack that people have today. >> Excellent. Hey, I want to thank Justin, Teresa, and Richard for joining us today. You guys are great. Big believers in the data mesh concept, and I think, you know, we're seeing the future of data architecture. So thank you. Now, remember, all these conversations are going to be available on thecube.net for on-demand viewing. You can also go to starburst.io. They have some great content on the website, and they host some really thought-provoking interviews, and they have awesome resources. Lots of data mesh conversations over there and really good stuff in the resource section. So check that out.
Thanks for watching "Data Doesn't Lie... or Does It?", made possible by Starburst Data. This is Dave Vellante for the Cube, and we'll see you next time. (upbeat music)