Lie 3, Today’s Modern Data Stack Is Modern | Starburst

(energetic music) >> Okay, we're back with Justin Borgman, CEO of Starburst, Richard Jarvis, CTO of EMIS Health, and Teresa Tung, cloud first technologist from Accenture. We're on to lie number three. And that is the claim that today's "Modern Data Stack" is actually modern. So (chuckles), I guess that's the lie. Or, is that it's not modern. Justin, what do you say? >> Yeah, I think new isn't modern. Right? I think it's the new data stack. It's the cloud data stack, but that doesn't necessarily mean it's modern. I think a lot of the components, actually, are exactly the same as what we've had for 40 years. Rather than Teradata, you have Snowflake. Rather than Informatica, you have Fivetran. So, it's the same general stack, just, y'know, a cloud version of it. And I think a lot of the challenges that have plagued us for 40 years still remain. >> So, let me come back to you Justin. Okay, but there are differences, right? You can scale. You can throw resources at the problem. You can separate compute from storage. There's a lot of money being thrown at that by venture capitalists, and Snowflake, you mentioned, and its competitors. So that's different. Is it not? Is that not at least an aspect of modern: dial it up, dial it down? So what do you say to that? >> Well, it is. It's certainly taking, y'know, what the cloud offers and taking advantage of that. But it's important to note that the cloud data warehouses out there are really just separating their compute from their storage. So it's allowing them to scale up and down, but your data's still stored in a proprietary format. You're still locked in. You still have to ingest the data to get it even prepared for analysis. So a lot of the same structural constraints that exist with the old enterprise data warehouse model on-prem still exist. Just, yes, a little bit more elastic now because the cloud offers that. >> So Teresa, let me go to you, 'cause you have cloud-first in your title.
So, what say you to this conversation? >> Well, even the cloud providers are looking towards more of a cloud continuum, right? So the centralized cloud as we know it, maybe data lake, data warehouse in the central place, that's not even how the cloud providers are looking at it. They have query services. Every provider has one that really expands those queries to be beyond a single location. And if we look at a lot of where the future goes, right? That's going to very much follow the same thing. There's going to be more edge. There's going to be more on-premise, because of data sovereignty, data gravity, because you're working with different parts of the business that have already made major cloud investments in different cloud providers, right? So, there's a lot of reasons why the modern, I guess, the next modern generation of the data stack needs to be much more federated. >> Okay, so Richard, how do you deal with this? You've obviously got, you know, the technical debt, the existing infrastructure, it's on the books. You don't want to just throw it out. A lot of conversation about modernizing applications, which a lot of times is, you know, a microservices layer on top of legacy apps. How do you think about the Modern Data Stack? >> Well, I think probably the first thing to say is that the stack really has to include the processes and people around the data as well. It's all well and good changing the technology, but if you don't modernize how people use that technology, then you're not going to be able to scale, because just 'cause you can scale CPU and storage doesn't mean you can get more people to use your data to generate more value for the business. And so what we've been looking at is really changing, very much aligned to data products and data mesh: how do you enable more people to consume the service and have the stack respond in a way that keeps costs low?
Because that's important for our customers consuming this data, but it also allows people to occasionally run enormous queries and then tick along with smaller ones when required. And it's a good job we did, because during COVID all of a sudden we had enormous pressures on our data platform to answer really important, life-threatening queries. And if we couldn't scale both our data stack and our teams, we wouldn't have been able to answer those as quickly as we had. So I think the stack needs to support a scalable business, not just the technology itself. >> Well, thank you for that. So Justin, let's try to break down what the critical aspects are of the modern data stack. So you think about the past, you know, five, seven years. Cloud obviously has given a different pricing model. Derisked experimentation. You know, we talked about the ability to scale up, scale down, but I'm taking away that that's not enough. Based on what Richard just said, the modern data stack has to serve the business and enable the business to build data products. I buy that. I'm, you know, a big fan of the data mesh concepts, even though we're early days. So what are the critical aspects? If you had to think about, you know, maybe putting some guardrails and definitions around the modern data stack, what does that look like? What are some of the attributes and principles there? >> Of how it should look like or how... >> Yeah. What it should be? >> Yeah. Yeah. Well, I think, you know, Teresa mentioned this in a previous segment, that the data warehouse is not necessarily going to disappear. It just becomes one node, one element of the overall data mesh. And I certainly agree with that. So by no means are we suggesting that, you know, Snowflake or Redshift or whatever cloud data warehouse you may be using is going to disappear, but it's not going to become the end-all, be-all. It's not the central single source of truth.
And I think that's the paradigm shift that needs to occur. And I think it's also worth noting that those who were the early adopters of the modern data stack were primarily digital-native, born-in-the-cloud young companies who had the benefit of idealism. They had the benefit of starting with a clean slate. That does not reflect the vast majority of enterprises. And even those companies, as they grow up, mature out of that ideal state. They go buy a business. Now they've got something on another cloud provider that has a different data stack, and they have to deal with that heterogeneity. That is just change, and change is a part of life. And so I think there is an element here that is almost philosophical. It's like, do you believe in an absolute ideal where I can just fit everything into one place, or do I believe in reality? And I think the far more pragmatic approach is really what data mesh represents. So to answer your question directly, I think it's adding, you know, the ability to access data that lives outside of the data warehouse, maybe living in open data formats in a data lake, or accessing operational systems as well. Maybe you want to directly access data that lives in an Oracle database or a Mongo database or what have you. So creating that flexibility to really future-proof yourself from the inevitable change that you will encounter over time. >> So thank you. So Teresa, based on what Justin just said, my takeaway there is it's inclusive: whether it's a data mart, data hub, data lake, data warehouse, it's just a node on the mesh. Okay. I get that. Does that include, Teresa, on-prem data? Obviously it has to. What are you seeing in terms of the ability to take that data mesh concept on-prem? I mean, most implementations I've seen of data mesh, frankly, really aren't, you know, adhering to the philosophy there. Maybe it's data lake and maybe it's using Glue.
You look at what JPMC is doing, HelloFresh, a lot of stuff happening on the AWS cloud in that, you know, closed stack, if you will. What's the answer to that, Teresa? >> I mean, I think it's a killer case for data mesh. The fact that you have valuable data sources on-prem, and then yet you still want to modernize and take the best of cloud. Cloud is still, like we mentioned, there's a lot of great reasons for it around the economics and the ability to tap into the innovation that the cloud providers are giving around data and AI architecture. It's an easy button. So the mesh allows you to have the best of both worlds. You can start using the data products on-prem, or in the existing systems that are working already. It's meaningful for the business. At the same time, you can modernize the ones that make business sense because it needs better performance. It needs, you know, something that is cheaper, or maybe just tapping into better analytics to get better insights, right? So you're going to be able to stretch and really have the best of both worlds. That, again, going back to Richard's point, is meaningful for the business. Not everything has to have that one-size-fits-all set of tools. >> Okay. Thank you. So Richard, you know, talking about data as a product, I wonder if you could give us your perspectives here. What are the advantages of treating data as a product? What role do data products have in the modern data stack? We talk about monetizing data. What are your thoughts on data products? >> So for us, one of the most important data products that we've been creating is taking data that is healthcare data across a wide variety of different settings.
So information about patients, demographics, about their treatment, about their medications and so on, and taking that into a standard format that can be utilized by a wide variety of different researchers, because misinterpreting that data or having the data not presented in the way that the user is expecting means that you generate the wrong insight. And in any business that's clearly not a desirable outcome, but when that insight is so critical, as it might be in healthcare or some security settings, you really have to have gone to the trouble of understanding the data, presenting it in a format that everyone can clearly agree on, and then letting people consume it in a very structured, managed way, even if that data comes from a variety of different sources in the first place. And so our data product journey has really begun by standardizing data across a number of different silos through the data mesh, so we can present it out both internally and, through the right governance, externally to researchers. >> So that data product, through whatever APIs, is accessible, it's discoverable, but it's obviously got to be governed as well. You mentioned appropriately provided internally. >> Yeah. >> But also, you know, external folks as well. So you've architected that capability today? >> We have, and because the data is standard it can generate value much more quickly, and we can be sure of the security and value that that's providing, because the data product isn't just about formatting the data into the correct tables. It's understanding what it means to redact the data or to remove certain rows from it or to interpret what a date actually means. Is it the start of the contract or the start of the treatment or the date of birth of a patient?
These things can be lost in the data storage without having the proper product management around the data to say, in a very clear business context, what does this data mean, and what does it mean to process this data for a particular use case. >> Yeah, it makes sense. It's got the context. If the domains own the data, you know, you've got to cut through a lot of the centralized teams, the technical teams that are data agnostic. They don't really have that context. All right, let's end. Justin, how does Starburst fit into this modern data stack? Bring us home. >> Yeah. So I think for us it's really providing our customers with, you know, the flexibility to operate and analyze data that lives in a wide variety of different systems. Ultimately giving them that optionality, you know, and optionality provides the ability to reduce costs, store more in a data lake rather than a data warehouse. It provides the ability for the fastest time to insight, to access the data directly where it lives. And ultimately, with this concept of data products that we've now, you know, incorporated into our offering as well, you can really create and curate, you know, data as a product to be shared and consumed. So we're trying to help enable the data mesh, you know, model and make that an appropriate complement to, you know, the modern data stack that people have today. >> Excellent. Hey, I want to thank Justin, Teresa, and Richard for joining us today. You guys are great. Big believers in the data mesh concept, and I think, you know, we're seeing the future of data architecture. So thank you. Now, remember, all these conversations are going to be available on thecube.net for on-demand viewing. You can also go to starburst.io. They have some great content on the website, and they host some really thought-provoking interviews, and they have awesome resources. Lots of data mesh conversations over there and really good stuff in the resource section. So check that out.
Thanks for watching "The Data Doesn't Lie... Or Does It?" made possible by Starburst Data. This is Dave Vellante for theCUBE, and we'll see you next time. (upbeat music)
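
The federation idea discussed in this segment, one query reaching data that lives in more than one system without ingesting it into a central store first, can be illustrated with a small sketch. This is not Starburst's or any vendor's API. It is a toy that assumes two separate in-memory SQLite databases stand in for a "warehouse" and a "lake"; the table names (`customers`, `events`), the sample rows, and the helper `spend_by_customer` are all hypothetical.

```python
import sqlite3

# Assumption: "main" plays the role of a data warehouse and "lake" plays the
# role of a separate data store. Both are independent in-memory databases.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS lake")

# Each store owns its own table (hypothetical schemas).
conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE lake.events (customer_id INTEGER, amount REAL)")

conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO lake.events VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 7.5)],
)


def spend_by_customer(connection):
    """Join across both stores in one statement, without copying data first."""
    query = """
        SELECT c.name, SUM(e.amount) AS total
        FROM main.customers AS c
        JOIN lake.events AS e ON e.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    return connection.execute(query).fetchall()


result = spend_by_customer(conn)
print(result)
```

The point of the sketch is only the shape of the query: the caller names two sources in a single SQL statement and a single access point resolves both. A real federated engine would also have to handle remote connectors, access control, and pushing work down to each source, none of which is modeled here.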

Published Date : Aug 22 2022


Lie 2, An Open Source Based Platform Cannot Give You Performance and Control | Starburst

>>We're back with Justin Borgman of Starburst and Richard Jarvis of EMIS Health. Okay. We're gonna get into lie number two, and that is this: an open source based platform cannot give you the performance and control that you can get with a proprietary system. Is that a lie? Justin, the enterprise data warehouse has been pretty dominant and has evolved and matured. Its stack has matured over the years. Why is it not the default platform for data? >>Yeah, well, I think that's become a lie over time. So I think, you know, if we go back 10 or 12 years ago with the advent of the first data lakes, really around Hadoop, that probably was true, that you couldn't get the performance that you needed to run fast, interactive SQL queries in a data lake. Now a lot's changed in 10 or 12 years. I remember in the very early days, people would say, you'll never get performance because you need to be columnar. You need to store data in a columnar format. And then, you know, columnar formats were introduced to data lakes. You have Parquet, ORC file, and Avro that were created to ultimately deliver performance out of that. So, okay. We got, you know, largely over the performance hurdle. You know, more recently people will say, well, you don't have the ability to do updates and deletes like a traditional data warehouse. And now we've got the creation of new data formats, again, like Iceberg and Delta and Hudi, that do allow for updates and deletes. So I think the data lake has continued to mature. And I remember a quote from, you know, Curt Monash many years ago where he said, you know, it takes six or seven years to build a functional database. I think that's right. And now we've had almost a decade go by. So, you know, these technologies have matured to really deliver very, very close to the same level of performance and functionality of cloud data warehouses.
So I think the reality is that's become a lie, and now we have large, giant hyperscale internet companies that, you know, don't have the traditional data warehouse at all. They do all of their analytics in a data lake. So I think we've proven that it's very much possible today. >>Thank you for that. And so Richard, talk about your perspective as a practitioner in terms of what open brings you versus, I mean, closed. Open is a moving target. I remember Unix used to be open systems, and so it is an evolving, you know, spectrum. But from your perspective, what does open give you that you can't get from a proprietary system, or what are you fearful of in a proprietary system? >>I suppose for me open buys us the ability to be unsure about the future, because one thing that's always true about technology is it evolves in a direction slightly different to what people expect. And what you don't want to end up having done is backed yourself into a corner that then prevents you from innovating. So if you have chosen the technology and you've stored trillions of records in that technology, and suddenly a new way of processing or machine learning comes out, you wanna be able to take advantage. Your competitive edge might depend upon it. And so I suppose for us, we acknowledge that we don't have perfect vision of what the future might be. And so by backing open storage technologies, we can apply a number of different technologies to the processing of that data. And that gives us the ability to remain relevant and innovate on our data storage. And we have bought our way out of any performance concerns, because we can use cloud scale infrastructure to scale up and scale down as we need. And so we don't have the concern that we don't have enough hardware today to process what we want to achieve. We can just scale up when we need it and scale back down. So open source has really allowed us to maintain being at the cutting edge.
>>So Justin, let me play devil's advocate here a little bit. I've talked to Zhamak about this and, you know, obviously her vision is that data mesh is open source, open source tooling, and it's not proprietary. You know, you're not gonna buy a data mesh. You're gonna build it with open source tooling, and vendors like you are gonna support it. But come back to sort of today: you can get to market with a proprietary solution faster. I'm gonna make that statement. You tell me if it's a lie. And then you can say, okay, we support Apache Iceberg. We're gonna support open source tooling. Take a company like VMware, not really in the data business, but the way they embraced Kubernetes and, you know, every new open source thing that comes along, they say, we do that too. Why can't proprietary systems do that and be as effective? >>Yeah, well, I think at least within the data landscape, saying that you can access open data formats like Iceberg or others is a bit disingenuous, because really what you're selling to your customer is a certain degree of performance, a certain SLA. And, you know, those cloud data warehouses that can reach beyond their own proprietary storage drop all the performance that they were able to provide. So it reminds me kind of, again, of going back 10 or 12 years ago when everybody had a connector to Hadoop and they thought that was the solution, right? But the reality was, you know, a connector was not the same as running workloads in Hadoop back then. And I think similarly, you know, being able to connect to an external table that lives in an open data format, you know, you're not going to get the performance that your customers are accustomed to. And at the end of the day, they're always going to be predisposed, they're always going to be incentivized to get that data ingested into the data warehouse, 'cause that's where they have control.
And you know, the bottom line is the database industry has really been built around vendor lock-in. I mean, from the start. How many people love Oracle today, but are customers nonetheless? I think, you know, lock-in is part of this industry. And I think that's really what we're trying to change with open data formats. >>Well, it's interesting. It reminds me of when I, you know, I see the gas price, the TSR gas price, I drive up and then I say, oh, that's the cash price. Credit card, I gotta pay 20 cents more, but okay. But so the argument then, so let me come back to you, Justin. So what's wrong with saying, hey, we support open data formats, but yeah, you're gonna get better performance if you keep it in our closed system? Are you saying that long term that's gonna come back and bite you, 'cause you're gonna end up, you mentioned Oracle, you mentioned Teradata. Yeah. By implication, you're saying that's where Snowflake customers are headed. >>Yeah, absolutely. I think this is a movie that, you know, we've all seen before. At least those of us who've been in the industry long enough to see this movie play a couple times. So I do think that's the future. And I think, you know, I loved what Richard said. I actually wrote it down, 'cause I thought it was an amazing quote. He said, it buys us the ability to be unsure of the future. That pretty much says it all. The future is unknowable, and the reality is, using open data formats, you remain interoperable with any technology you want to utilize. If you want to use Spark to train a machine learning model and you wanna use Starburst to query via SQL, that's totally cool. They can both work off the same exact, you know, data sets. By contrast, if you're, you know, focused on a proprietary model, then you're kind of locked in again to that model.
I think the same applies to data sharing, to data products, to a wide variety of aspects of the data landscape, that a proprietary approach kind of closes you in and locks you in. >>So I would say this, Richard, I'd love to get your thoughts on it, 'cause I talk to a lot of Oracle customers, not as many Teradata customers, but a lot of Oracle customers, and they, you know, they'll admit, yeah, you know, they're jamming us on price and the license cost, but we do get value out of it. And so my question to you, Richard, is: do the, let's call it data warehouse systems or the proprietary systems, are they gonna deliver a greater ROI sooner? And is that the allure that customers, you know, are attracted to, or can open platforms deliver as fast an ROI? >>I think the answer to that is it can depend a bit. It depends on your business's skillset. So we are lucky that we have a number of proprietary teams that work in databases that provide our operational data capability. And we have teams of analytics and big data experts who can work with open data sets and open data formats. And so for those different teams, they can get to an ROI more quickly with different technologies. For the business, though, we can't do better for our operational data stores than proprietary databases. Today we can back off very tight SLAs to them. We can demonstrate reliability from millions of hours of those databases being run at enterprise scale. But for an analytics workload, where increasingly our business is growing in that direction, we can't do better than open data formats with cloud-based data mesh type technologies. And so it's not a simple answer, that one will always be the right answer for our business. We definitely have times when proprietary databases provide a capability that we couldn't easily represent or replicate with open technologies. >>Yeah. Richard, stay with you.
You mentioned, you know, some things before that strike me. You know, the Databricks-Snowflake, you know, thing is always a lot of fun for analysts like me. You've got Databricks coming at it. Richard, you mentioned you have a lot of rockstar data engineers. Databricks coming at it from a data engineering heritage. You get Snowflake coming at it from an analytics heritage. Those two worlds are colliding. People like Sanjeev Mohan said, you know what? I think it's actually harder to play in the data engineering. So, i.e., it's easier for the data engineering world to go into the analytics world versus the reverse. But thinking about up-and-coming engineers and developers preparing for this future of data engineering and data analytics, how should they be thinking about the future? What's your advice to those young people? >>So I think I'd probably fall back on general programming skill sets. So the advice that I saw years ago was if you have open source technologies, the Pythons and Javas, on your CV, you command a 20% pay hike over people who can only do proprietary programming languages. And I think that's true of data technologies as well. And from a business point of view, that makes sense. I'd rather spend the money that I save on proprietary licenses on better engineers, because they can provide more value to the business that can innovate us beyond our competitors. So I think my advice to people who are starting here or trying to build teams to capitalize on data assets is begin with open license, free capabilities, because they're very cheap to experiment with. And they generate a lot of interest from people who want to join you as a business. And you can make them very successful early doors with your analytics journey. >>It's interesting.
Again, analysts like myself, we do a lot of TCO work and have over the last 20-plus years, and in the world of Oracle, you know, normally it's the staff that's the biggest nut in total cost of ownership. Not in Oracle. The license cost is by far the biggest component in the blame pie. All right, Justin, help us close out this segment. We've been talking about this sort of data mesh, open, closed, Snowflake, Databricks. Where does Starburst, sort of as this engine for the data lake, the data lakehouse, the data warehouse, fit in this world? >>Yeah. So our view on how the future ultimately unfolds is we think that data lakes will be a natural center of gravity for a lot of the reasons that we described: open data formats, lowest total cost of ownership, because you get to choose the cheapest storage available to you. Maybe that's S3 or Azure Data Lake Storage or Google Cloud Storage, or maybe it's on-prem object storage that you bought at a really good price. So ultimately storing a lot of data in a data lake makes a lot of sense, but I think what makes our perspective unique is we still don't think you're gonna get everything there either. We think that basically centralization of all your data assets is just an impossible endeavor. And so you wanna be able to access data that lives outside of the lake as well. So we kind of think of the lake as maybe the biggest place by volume in terms of how much data you have, but to have comprehensive analytics and to truly understand your business and understand it holistically, you need to be able to go access other data sources as well. And so that's the role that we wanna play: to be a single point of access for our customers, provide the right level of fine-grained access controls so that the right people have access to the right data, and ultimately make it easy to discover and consume via, you know, the creation of data products as well. >>Great. Okay. Thanks guys.
Right after this quick break, we're gonna be back to debate whether the cloud data model that we see emerging and the so-called modern data stack is really modern, or is it the same wine, new bottle, when it comes to data architectures. You're watching theCUBE, the leader in enterprise and emerging tech coverage.
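
Justin's point in this segment about needing a column format for fast analytic queries can be made concrete with a toy comparison. The sketch below does not use Parquet, ORC, or any real file format; it assumes plain Python lists and dictionaries as stand-ins for row-oriented and column-oriented layouts, and the field names and values (`patient_id`, `age`, `region`) are hypothetical.

```python
# Row-oriented layout: all fields of a record are stored together, so an
# aggregate over one field still walks every full record.
rows = [
    {"patient_id": 1, "age": 34, "region": "north"},
    {"patient_id": 2, "age": 51, "region": "south"},
    {"patient_id": 3, "age": 47, "region": "north"},
]


def to_columns(records):
    """Pivot a list of row dictionaries into one list of values per field."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


# Column-oriented layout: all values of a field are stored together, so an
# aggregate over "age" reads only the "age" list and skips the other fields.
columns = to_columns(rows)

average_age_from_rows = sum(record["age"] for record in rows) / len(rows)
average_age_from_columns = sum(columns["age"]) / len(columns["age"])

print(average_age_from_rows, average_age_from_columns)
```

Both layouts give the same answer. The difference a columnar format exploits is how much data has to be read to get there, which matters when the table has many fields and the query touches only a few.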

Published Date : Aug 22 2022


Lie 1, The Most Effective Data Architecture Is Centralized | Starburst

(bright upbeat music) >> In 2011, early Facebook employee and Cloudera co-founder Jeff Hammerbacher famously said, "The best minds of my generation are thinking about how to get people to click on ads, and that sucks!" Let's face it. More than a decade later, organizations continue to be frustrated with how difficult it is to get value from data and build a truly agile and data-driven enterprise. What does that even mean, you ask? Well, it means that everyone in the organization has the data they need when they need it in a context that's relevant to advance the mission of an organization. Now, that could mean cutting costs, could mean increasing profits, driving productivity, saving lives, accelerating drug discovery, making better diagnoses, solving supply chain problems, predicting weather disasters, simplifying processes, and thousands of other examples where data can completely transform people's lives beyond manipulating internet users to behave a certain way. We've heard the prognostications about the possibilities of data before and in fairness we've made progress, but the hard truth is the original promises of master data management, enterprise data warehouses, data marts, data hubs, and yes even data lakes were broken and left us wanting for more. Welcome to The Data Doesn't Lie... Or Does It? A series of conversations produced by theCUBE and made possible by Starburst Data. I'm your host, Dave Vellante, and joining me today are three industry experts. Justin Borgman is the co-founder and CEO of Starburst, Richard Jarvis is the CTO at EMIS Health, and Teresa Tung is cloud first technologist at Accenture. Today, we're going to have a candid discussion that will expose the unfulfilled, and yes, broken promises of a data past. We'll expose data lies: big lies, little lies, white lies, and hidden truths. And we'll challenge, age old data conventions and bust some data myths. We're debating questions like is the demise of a single source of truth inevitable? 
Will the data warehouse ever have feature parity with the data lake or vice versa? Is the so-called modern data stack simply centralization in the cloud, AKA the old guard's model in new cloud clothes? How can organizations rethink their data architectures and regimes to realize the true promises of data? Can and will an open ecosystem deliver on these promises in our lifetimes? We're spanning much of the Western world today. Richard is in the UK, Teresa is on the West Coast, and Justin is in Massachusetts with me. I'm in theCUBE studios, about 30 miles outside of Boston. Folks, welcome to the program. Thanks for coming on. >> Thanks for having us. >> Okay, let's get right into it. You're very welcome. Now, here's the first lie. The most effective data architecture is one that is centralized with a team of data specialists serving various lines of business. What do you think Justin? >> Yeah, definitely a lie. My first startup was a company called Hadapt, which was an early SQL engine for Hadoop that was acquired by Teradata. And when I got to Teradata, of course, Teradata is the pioneer of that central enterprise data warehouse model. One of the things that I found fascinating was that not one of their customers had actually lived up to that vision of centralizing all of their data into one place. They all had data silos. They all had data in different systems. They had data on prem, data in the cloud. Those companies were acquiring other companies and inheriting their data architecture. So despite being the industry leader for 40 years, not one of their customers truly had everything in one place. So I think definitely history has proven that to be a lie. >> So Richard, from a practitioner's point of view, what are your thoughts? I mean, there's a lot of pressure to cut cost, keep things centralized, serve the business as best as possible from that standpoint. What does your experience show? 
>> Yeah, I mean, I think I would echo Justin's experience really that we as a business have grown up through acquisition, through storing data in different places sometimes to do information governance in different ways to store data in a platform that's close to data experts people who really understand healthcare data from pharmacies or from doctors. And so, although if you were starting from a greenfield site and you were building something brand new, you might be able to centralize all the data and all of the tooling and teams in one place. The reality is that businesses just don't grow up like that. And it's just really impossible to get that academic perfection of storing everything in one place. >> Teresa, I feel like Sarbanes-Oxley have kind of saved the data warehouse, right? (laughs) You actually did have to have a single version of the truth for certain financial data, but really for some of those other use cases I mentioned, I do feel like the industry has kind of let us down. What's your take on this? Where does it make sense to have that sort of centralized approach versus where does it make sense to maybe decentralize? >> I think you got to have centralized governance, right? So from the central team, for things like Sarbanes-Oxley, for things like security, for certain very core data sets having a centralized set of roles, responsibilities to really QA, right? To serve as a design authority for your entire data estate, just like you might with security, but how it's implemented has to be distributed. Otherwise, you're not going to be able to scale, right? So being able to have different parts of the business really make the right data investments for their needs. And then ultimately, you're going to collaborate with your partners. So partners that are not within the company, right? External partners. We're going to see a lot more data sharing and model creation. And so you're definitely going to be decentralized. 
>> So Justin, you guys last, jeez, I think it was about a year ago, had a session on data mesh. It was a great program. You invited Zhamak Dehghani. Of course, she's the creator of the data mesh. One of her fundamental premises is that you've got this hyper specialized team that you've got to go through if you want anything. But at the same time, these individuals actually become a bottleneck, even though they're some of the most talented people in the organization. So I guess, a question for you Richard. How do you deal with that? Do you organize so that there are a few sort of rock stars that build cubes and the like or have you had any success in sort of decentralizing with your constituencies that data model? >> Yeah. So we absolutely have got rockstar data scientists and data guardians, if you like. People who understand what it means to use this data, particularly the data that we use at EMIS is very private, it's healthcare information. And some of the rules and regulations around using the data are very complex and strict. So we have to have people who understand the usage of the data, then people who understand how to build models, how to process the data effectively. And you can think of them like consultants to the wider business because a pharmacist might not understand how to structure a SQL query, but they do understand how they want to process medication information to improve patient lives. And so that becomes a consulting type experience from a set of rock stars to help a more decentralized business who needs to understand the data and to generate some valuable output. >> Justin, what do you say to a customer or prospect that says, "Look, Justin. I got a centralized team and that's the most cost effective way to serve the business. Otherwise, I got duplication." What do you say to that? >> Well, I would argue it's probably not the most cost effective, and the reason being really twofold. 
I think, first of all, when you are deploying a enterprise data warehouse model, the data warehouse itself is very expensive, generally speaking. And so you're putting all of your most valuable data in the hands of one vendor who now has tremendous leverage over you for many, many years to come. I think that's the story at Oracle or Teradata or other proprietary database systems. But the other aspect I think is that the reality is those central data warehouse teams, as much as they are experts in the technology, they don't necessarily understand the data itself. And this is one of the core tenets of data mesh that Zhamak writes about is this idea of the domain owners actually know the data the best. And so by not only acknowledging that data is generally decentralized, and to your earlier point about Sarbanes-Oxley, maybe saving the data warehouse, I would argue maybe GDPR and data sovereignty will destroy it because data has to be decentralized for those laws to be compliant. But I think the reality is the data mesh model basically says data's decentralized and we're going to turn that into an asset rather than a liability. And we're going to turn that into an asset by empowering the people that know the data the best to participate in the process of curating and creating data products for consumption. So I think when you think about it that way, you're going to get higher quality data and faster time to insight, which is ultimately going to drive more revenue for your business and reduce costs. So I think that that's the way I see the two models comparing and contrasting. >> So do you think the demise of the data warehouse is inevitable? Teresa, you work with a lot of clients. They're not just going to rip and replace their existing infrastructure. Maybe they're going to build on top of it, but what does that mean? Does that mean the EDW just becomes less and less valuable over time or it's maybe just isolated to specific use cases? What's your take on that? 
>> Listen, I still would love all my data within a data warehouse. I would love it mastered, would love it owned by a central team, right? I think that's still what I would love to have. That's just not the reality, right? The investment to actually migrate and keep that up to date, I would say it's a losing battle. Like we've been trying to do it for a long time. Nobody has the budgets and then data changes, right? There's going to be a new technology that's going to emerge that we're going to want to tap into. There's going to be not enough investment to bring all the legacy, but still very useful systems into that centralized view. So you keep the data warehouse. I think it's a very, very valuable, very high performance tool for what it's there for, but you could have this new mesh layer that still takes advantage of the things I mentioned: the data products in the systems that are meaningful today, and the data products that actually might span a number of systems. Maybe either those that either source systems with the domains that know it best, or the consumer-based systems or products that need to be packaged in a way that'd be really meaningful for that end user, right? Each of those are useful for a different part of the business and making sure that the mesh actually allows you to use all of them. >> So, Richard, let me ask you. Take Zhamak's principles back to those. You got the domain ownership and data as product. Okay, great. Sounds good. But it creates what I would argue are two challenges: self-serve infrastructure, let's park that for a second, and then in your industry, one of the most regulated, most sensitive, computational governance. How do you automate and ensure federated governance in that mesh model that Teresa was just talking about? >> Well, it absolutely depends on some of the tooling and processes that you put in place around those tools to centralize the security and the governance of the data. 
And I think although a data warehouse makes that very simple 'cause it's a single tool, it's not impossible with some of the data mesh technologies that are available. And so what we've done at EMIS is we have a single security layer that sits on top of our data mesh, which means that no matter which user is accessing which data source, we go through a well audited, well understood security layer. That means that we know exactly who's got access to which data field, which data tables. And then everything that they do is audited in a very kind of standard way regardless of the underlying data storage technology. So for me, although storing the data in one place might not be possible, understanding where your source of truth is and securing that in a common way is still a valuable approach, and you can do it without having to bring all that data into a single bucket so that it's all in one place. And so having done that and investing quite heavily in making that possible has paid dividends in terms of giving wider access to the platform, and ensuring that only data that's available under GDPR and other regulations is being used by the data users. >> Yeah. So Justin, we always talk about data democratization, and up until recently, there really hasn't been line of sight as to how to get there, but do you have anything to add to this because you're essentially doing analytic queries with data that's all dispersed all over. How are you seeing your customers handle this challenge? >> Yeah, I mean, I think data products is a really interesting aspect of the answer to that. It allows you to, again, leverage the data domain owners, the people who know the data the best, to create data as a product ultimately to be consumed. And we try to represent that in our product as effectively an almost e-commerce-like experience where you go and discover and look for the data products that have been created in your organization, and then you can start to consume them as you'd like. 
And so really trying to build on that notion of data democratization and self-service, and making it very easy to discover and start to use with whatever BI tool you may like or even just running SQL queries yourself. >> Okay guys, grab a sip of water. After the short break, we'll be back to debate whether proprietary or open platforms are the best path to the future of data excellence. Keep it right there. (bright upbeat music)
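The single security layer Richard describes in this segment, one audited checkpoint in front of every data source, can be sketched in a few lines. This is only an illustration of the idea under stated assumptions, not EMIS's or Starburst's implementation: the `SecurityLayer` class, its `grants` structure, and the source, table, and column names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SecurityLayer:
    """Hypothetical single checkpoint in front of many data sources."""

    # Maps a user to the (source, table, column) triples they may read.
    grants: dict[str, set[tuple[str, str, str]]]
    # Every decision is recorded in the same shape, whatever the backend is.
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, user: str, source: str, table: str, columns: list[str]) -> bool:
        allowed = self.grants.get(user, set())
        denied = [c for c in columns if (source, table, c) not in allowed]
        self.audit_log.append(
            {
                "when": datetime.now(timezone.utc).isoformat(),
                "user": user,
                "source": source,
                "table": table,
                "columns": list(columns),
                "denied": denied,
                "decision": "deny" if denied else "allow",
            }
        )
        return not denied


# Illustrative names only: one user, one grant on a made-up pharmacy source.
layer = SecurityLayer(
    grants={"pharmacist_1": {("pharmacy_db", "medications", "drug_code")}}
)
print(layer.authorize("pharmacist_1", "pharmacy_db", "medications", ["drug_code"]))
print(layer.authorize("pharmacist_1", "pharmacy_db", "medications", ["patient_name"]))
print(len(layer.audit_log))
```

The point of the design is that the grant check and the audit record live in one place, so the answer to "who can see which field" does not depend on where the data is stored.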

Published Date : Aug 22 2022


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Richard	PERSON	0.99+
Justin Borgman	PERSON	0.99+
Justin	PERSON	0.99+
Richard Jarvis	PERSON	0.99+
Teresa Tung	PERSON	0.99+
Jeff Hammerbacher	PERSON	0.99+
Teresa	PERSON	0.99+
Teradata	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Massachusetts	LOCATION	0.99+
Zhamak Dehghani	PERSON	0.99+
UK	LOCATION	0.99+
2011	DATE	0.99+
two challenges	QUANTITY	0.99+
Hadapt	ORGANIZATION	0.99+
40 years	QUANTITY	0.99+
Starburst	ORGANIZATION	0.99+
two models	QUANTITY	0.99+
thousands	QUANTITY	0.99+
Boston	LOCATION	0.99+
Facebook	ORGANIZATION	0.99+
Sarbanes-Oxley	ORGANIZATION	0.99+
Each	QUANTITY	0.99+
first lie	QUANTITY	0.99+
Accenture	ORGANIZATION	0.99+
GDPR	TITLE	0.99+
Today	DATE	0.98+
today	DATE	0.98+
SQL	TITLE	0.98+
Starburst Data	ORGANIZATION	0.98+
EMIS Health	ORGANIZATION	0.98+
Cloudera	ORGANIZATION	0.98+
one	QUANTITY	0.98+
first startup	QUANTITY	0.98+
one place	QUANTITY	0.98+
about 30 miles	QUANTITY	0.98+
One	QUANTITY	0.97+
More than a decade later	DATE	0.97+
EMIS	ORGANIZATION	0.97+
single bucket	QUANTITY	0.97+
first technologist	QUANTITY	0.96+
three industry experts	QUANTITY	0.96+
single tool	QUANTITY	0.96+
single version	QUANTITY	0.94+
Zhamak	PERSON	0.92+
theCUBE	ORGANIZATION	0.91+
single source	QUANTITY	0.9+
West Coast	LOCATION	0.87+
one vendor	QUANTITY	0.84+
single security layer	QUANTITY	0.81+
about a year ago	DATE	0.75+
Hadoop	ORGANIZATION	0.68+
Is	TITLE	0.65+
a second	QUANTITY	0.64+
EDW	ORGANIZATION	0.57+
examples	QUANTITY	0.55+
echo	COMMERCIAL_ITEM	0.54+
twofold	QUANTITY	0.5+
Lie	TITLE	0.35+

SAP: McDermott steps down, major integration challenges lie ahead

>> From the SiliconANGLE media office in Boston, Massachusetts, it's theCUBE. Now, here's your host, Dave Vellante. >> Hi, everybody, welcome to this CUBE Insights, powered by ETR. In this episode of the Breaking Analysis, we're going to take a look at SAP. Thursday, October 10th, SAP surprised the Street, they announced early, they preannounced their earnings, and at the same time they timed that with the announcement that CEO, longtime CEO Bill McDermott was stepping down, his contract was up for renewal in January of 2020, and he decided that he's going to turn it over to a co-CEO structure that I'll talk about a little bit, so that was big news. Spending on SAP has been holding pretty steady over the last several quarters, I'll share some ETR data with you. It's been quite a run by Bill McDermott, he started out as CEO, I think it was February of 2010, as co-CEO with Jim Hagemann Snabe, and then two years later was named the sole CEO and I'll share some data on that in terms of the performance of SAP during his tenure. But the bottom line is, we expect, based on the spending data, some continued momentum from SAP, I'll show you some data that shows a little bit of a mix in the numbers, ETR basically just dropped a report on Friday that I'll share with you as well, but the bottom line is we see some major challenges ahead for SAP, specifically from a technology integration point that I'll talk to you about, and it really is not showing up yet in the spending numbers, but it's something that we're keeping an eye on, and it's something that we want to share with you, our community. So Alex, if you wouldn't mind bringing up the first slide, here. I'll make some key points, really around SAP's Q3 earnings and the CEO news. 
So as I say, they pre-announced earnings on October 10th after the close, 10% revenue growth, which is a nice, healthy double digit revenue growth, cloud was up considerably, Bill McDermott made the big emphasis when he was doing the rounds on how their cloud revenue is growing faster than competitors, 33%, but definitely from a smaller base, but their license revenue, their traditional on-prem business, continues to be under pressure and decline. SAP is a strong services business, services and maintenance business, and they're up to 12,000 customers with HANA, I'll make some comments on HANA a little bit later. This may have some implications for Europe, we've been saying that Europe is over-banked, that banking is soft based on the ETR spending data, so this may be a little bit of a bright spot for Europe. Of course SAP, with its ERP business, is strong in manufacturing, with anybody who has a supply chain, so this may be a good sign for Europe, that's something that we're watching. And then, as I say, McDermott steps down, we're going back to the dual CEO structure. Jennifer Morgan, who headed the cloud business, is a longtime SAP employee, and she essentially is going to be taking that role of the customer-facing CEO. Christian Klein really has history in product development and HANA, he did a stint in finance at SuccessFactors, and is really an operations guru, so back to that dual CEO role that you saw with Snabe and McDermott, where McDermott was really the front-facing, sales-facing individual, and Snabe was the product person. So that's kind of an interesting structure, we saw that in Oracle before Mark Hurd stepped down with Safra Catz as co-CEO, so it's not a unique structure, although it's certainly not common in the industry. 
The next thought I want to share with you is one that you may have seen before, every time that ETR does a survey, and this is data, fresh data from the October survey, every time they do a survey, they take spending intentions and they ask folks, "Are you spending more, "are you spending less, are you spending the same, "are you adding to the platform, "are you subtracting from the platform?" So they essentially ignore the, for this net score that I'm showing you now, they ignore the people that aren't spending, that are staying the same, flat, and they take the more minus the less, subtract amount, you get a net score, and the net score here is 27%. This is not uncommon for, from the data that I've seen out of ETR for a large company established legacy provider like SAP. Net score 27% is not great, but it's a holding steady score, it's not in the negatives, it's not in the red zone, and so you can see here that 32% of the survey respondents were saying they're going to spend more, 54% basically flat, but only a smaller number, 6%, saying they're going to spend less, so it's reasonable for SAP, but if you look at the trendline, Alex, bring up the next slide, look at the spending trendline from the survey for SAP since the July 16 survey, they do this every quarter, and so the blue is the net score, that green minus the red that I've talked about in the past, and you can see that sort of steady decline, but this is not a disaster, what it is, is it's a sign of spending momentum relative to previous years or previous quarters, and you can see the yellow line is also declining, that's market share, what that means is market share in terms of spend relative to other initiatives, so the categories that SAP participates in, enterprise software, et cetera, spending on SAP relative to other sectors has been in decline. If you look at, Alex, if you bring up the next slide, look at the SaaS business, you'll see that it's a much happier story. 
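The net score arithmetic described here, the share spending more minus the share spending less with the flat respondents ignored, can be written out as a small sketch. It is not ETR's actual methodology: the function name is hypothetical, and only the three percentages quoted in the segment (32% more, 54% flat, 6% less) are used. Those three sum to 92%, so the remaining respondents presumably sit in the "adding to" and "subtracting from the platform" buckets, whose values the transcript does not give; that is likely why the simple difference below does not land exactly on the quoted 27%.

```python
def net_score(pct_spending_more: float, pct_spending_less: float) -> float:
    """Share of respondents spending more minus share spending less, in points."""
    return pct_spending_more - pct_spending_less


# Percentages quoted in the segment; the other buckets are not given.
quoted = {"more": 32.0, "flat": 54.0, "less": 6.0}

# Difference using only the quoted figures (flat respondents are ignored).
print(net_score(quoted["more"], quoted["less"]))

# Share of respondents not covered by the three quoted buckets.
print(100.0 - sum(quoted.values()))
```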
SAP's made a number of acquisitions that I'll talk about in a moment, of cloud/SaaS players, so you can see their SaaS position has been holding firm, ETR cites Concur, SuccessFactors, Ariba, Callidus, they're kind of remaining stable versus a year ago, and you can see the market share's kind of ticking up, so pretty solid from the new growth, that high growth area, and that's something that the Street really pays very close attention to. The next data point that I want to show you on the next slide is actually quite fascinating, so SAP beat its forecasts, though it didn't beat and raise expectations for the rest of the year, and what this shows is ETR's regression analysis, what the quants at ETR do, is they crunch the numbers, and they compare them to the consensus on Wall Street, and they actually forecast higher or lower, where they think that earnings are going to come in based on their spending data, so you can see here that green, you see that little RPM meter, they're in the green, that's where you want to be, 359 basis points ahead of the median forecast, so they're saying, so the ETR second half spending 10.4% versus consensus of 6.8%, very positive sign. I think it's no coincidence that SAP recorded a beat for the quarter, so based on that data collected in that October survey, it looks like there's some momentum for SAP. Now the next slide I want to show you is the stock chart, this is kind of the scorecard, if you will, for Bill McDermott's tenure, and you can see, so I went back to 2010, as I say, he started in 2010, as a co-CEO with Jim Snabe, and then look at the performance here, I mean it's been pretty solid. And so you see today it's up around 10%, as I say, they announced the earnings beat, they announced their revenue beat, and they basically affirmed expectations, maybe raised them a little bit going forward. The reason why the stock is up is the beat, but also McDermott has put in place sort of an efficiency improvement and a restructuring. 
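The basis-point comparison quoted in this segment is simple arithmetic: one percentage point is 100 basis points, so the gap between a 10.4% forecast and a 6.8% consensus is about 360 basis points. The sketch below uses a hypothetical helper name and the rounded figures as spoken; the 359 quoted from the slide presumably comes from unrounded inputs that the transcript does not give.

```python
def spread_in_basis_points(forecast_pct: float, consensus_pct: float) -> int:
    """Gap between two percentages in basis points (1 point = 100 bps)."""
    # round() absorbs floating-point noise in the subtraction.
    return round((forecast_pct - consensus_pct) * 100)


# Rounded figures as spoken in the segment.
print(spread_in_basis_points(10.4, 6.8))
```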
They've made a promise to improve operating margins by 1% a year over the next five years. They've made a promise to get cloud gross margins to 75% by 2023. They've done a restructuring, I think it affects around 4400 people, and they're hiring data scientists and AI experts and machine learning people, and RPA folks, they acquired an RPA company a while ago, and kind of just threw that in 'cause it's such a hot space. Software coders all around the world, China, US, Europe, all over the place. And so that restructuring, the Street loves when you restructure, you cut the dead wood, so to speak. With all due respect to the folks that might be affected by this, but the Street loves that. So you're seeing the combination of the beat, and the uptick or the efficiencies taking place in the quarter, and they timed that with the McDermott announcement because they wanted to, I'm sure, time it with some positive news, so you can see the stocks up today, so that's kind of a scorecard on Bill McDermott, I have to say, pretty impressive performance over the last 10 years, or nearly 10 years. But here's the thing, we see some major challenges coming forth with SAP, and I want to talk about that a little bit. Before I do, Alex, if you would play the video from Bill McDermott answering a question that John Furrier asked several years ago, and then we'll come back and talk about it. >> I had a meeting with the CEO yesterday, and this is a very common conversation. He grew his business by acquisition, and now he's got a federation of a whole bunch of companies, and he feels like a holding company. What he wants to do is consolidate these businesses onto a common platform. He won't do it overnight because you can't shut down businesses, but the vision over the next few years is consolidate everything onto one common SAP platform, and take all the databases out and standardize everything on HANA. >> Now here's what's ironic. The core success of SAP historically has been what? 
It's been that they have a single, unified system, the general ledger and all the financial data and all the supply chain data, all of that is in the same place, accessible, single version of the truth if you will. What's ironic is SAP's made 31 acquisitions in the last nearly 10 years under the tenure of Bill McDermott. So in a way, SAP is becoming a tech holding company, kind of picking up on some of the things that Bill McDermott said in his little clip there. In our view, SAP's big technical challenge is to get all this stuff working together. As you all know, it's nontrivial when you make a lot of acquisitions, billions and billions of dollars of acquisitions, which by the way, they promised to stop that torrid pace of multibillion dollar acquisitions, very difficult to pull those together. Let's look at some of those acquisitions that they've made, Ariba, Concur, SuccessFactors, SuccessFactors is interesting because SuccessFactors was kind of talent management, you had kind of core HR from SAP and it's kind of been a challenge to put those things together. Think about the legacy R3 and R4 and all the on-prem manufacturing stuff that SAP still runs, that customers still run. Acquisition of Sybase, Callidus, so... SAP's answer to all this integration is to put everything in memory in HANA. So the motivation for HANA, however, in many ways was to compete more effectively with Oracle and not have to rely so much on the Oracle database and get people off Oracle. But here's the thing that SAP didn't do that Oracle did do, and I think, my opinion, Oracle got right. Oracle did Fusion, they bit the bullet and did Oracle Fusion, it took the better part of a decade, it actually took more than a decade, but every time Oracle buys a company, every SaaS application that it jams into the Red Stack runs Fusion middleware, and runs the Oracle database. So, it's not the case with HANA. 
So it's kind of an integration nightmare, it's very very complex what SAP has got handed to the new regime. I think this is a daunting task, and I think this might be in part why the timing of Bill McDermott stepping down, I mean he sees that this is going to be a heavy lift, it's going to need more of a product-focused leadership team, that's why I think it's smart that SAP has maybe gone back to that two-headed monster of two CEOs, one that's customer-facing and one that's more product-oriented and R&D-oriented because they have a major integration challenge ahead of them. So as I say, SAP has promised to stop making these multibillion dollar acquisitions, they got to get to work on integration, which is going to be a major portion of the task in the next five years, so spending data from ETR shows some positive momentum relative to consensus, now remember, the Street works in a quarter, so they're on a quarterly shot clock, so if the Street says, "You're going to do this for earnings," and they do this, well, that means higher EPS, so the stock's going to go up. If you do this and you come in below, that means the stock's going to go down, so these are very tactical kinds of things. We're talking here about more longer terms, this could be a five to seven year integration challenge if not more, remember, it took Oracle 10 years plus in terms of integrating Fusion, so that's something that you need to keep an eye on, especially if you're a customer and you're getting pitched all these different services and cloud services, just got to think about the architecture for integration. Okay, this is Dave Vellante with CUBE Insights powered by ETR, thanks for watching, we'll see you next time. (techno music)

Published Date : Oct 11 2019


ENTITIES

Entity	Category	Confidence
Jennifer Morgan	PERSON	0.99+
Dave Vellante	PERSON	0.99+
January of 2020	DATE	0.99+
Jim Snabe	PERSON	0.99+
Bill McDermott	PERSON	0.99+
2010	DATE	0.99+
February of 2010	DATE	0.99+
October 10th	DATE	0.99+
Mark Hurd	PERSON	0.99+
Christian Klein	PERSON	0.99+
July 16	DATE	0.99+
five	QUANTITY	0.99+
October	DATE	0.99+
Friday	DATE	0.99+
Thursday, October 10th	DATE	0.99+
10.4%	QUANTITY	0.99+
Alex	PERSON	0.99+
75%	QUANTITY	0.99+
6.8%	QUANTITY	0.99+
31 acquisitions	QUANTITY	0.99+
John Furrier	PERSON	0.99+
Concur	ORGANIZATION	0.99+
Bill McDermot	PERSON	0.99+
SAP	ORGANIZATION	0.99+
10%	QUANTITY	0.99+
33%	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
Ariba	ORGANIZATION	0.99+
27%	QUANTITY	0.99+
32%	QUANTITY	0.99+
billions	QUANTITY	0.99+
6%	QUANTITY	0.99+
SuccessFactors	ORGANIZATION	0.99+
US	LOCATION	0.99+
HANA	TITLE	0.99+
Jim Hagemann Snabe	PERSON	0.99+
yesterday	DATE	0.99+
10 years	QUANTITY	0.99+
2023	DATE	0.99+
Europe	LOCATION	0.99+
Boston, Massachusetts	LOCATION	0.99+
seven year	QUANTITY	0.99+
one	QUANTITY	0.98+
ETR	ORGANIZATION	0.98+
Snabe	PERSON	0.98+
today	DATE	0.98+
two years later	DATE	0.98+
around 4400 people	QUANTITY	0.98+
Safra Catz	PERSON	0.98+
Callidus	ORGANIZATION	0.98+
second half	QUANTITY	0.98+

Ruchir Puri, IBM and Tom Anderson, Red Hat | AnsibleFest 2022

>>Good morning, live from Chicago. It's theCUBE on the floor at AnsibleFest 2022. This is day two of our wall to wall coverage. Lisa Martin here with John Furrier. John, we're gonna be talking next in the segment with two alumni about what Red Hat and IBM are doing to give Ansible users AI superpowers. As one of our alumni guests said, just off the keynote stage, we're nearing an inflection point in AI. >>The power of AI with Ansible is really gonna be an innovative, I think an inflection point for a long time because Ansible does such great things. This segment's gonna explore that innovation, bringing AI and making people more productive and more importantly, you know, this whole low code, no code, kind of right in the sweet spot of the skills gap. So should be a great segment. >>Great segment. Please welcome back two of our alumni. Ruchir Puri is here, Chief Scientist, IBM Research, and IBM Fellow. And Tom Anderson joins us once again, VP and general manager at Red Hat. Gentlemen, great to have you on the program. Great to have you back. >>Thank you for having us. >>And thanks for joining us fresh off the keynote stage. Really enjoyed your keynote this morning. Very exciting news. You have a project called Project Wisdom. We're talking about this inflection point in AI. Tell the audience, the viewers, what is Project Wisdom? And how does wisdom differ from intelligence? >>I think Project Wisdom is really about, as I said, sort of combining two major forces that are in many ways disrupting and, and really constructing many aspects of our society, which are software and AI together. Yeah. And I truly believe it's gonna result in a seismic shift on how not just enterprises, but society carries forward. And as I said, intelligence is, is, I would argue at least artificial intelligence is more, in some ways mechanical, if I may say it, it's about algorithms, it's about data, it's about compute. Wisdom is all about what is truly important to bring out. 
It's not just about bringing out an insight; when you bring out a decision, it's about being able to explain that decision as well. It's almost like humans have wisdom. Machines have intelligence, and it's about Project Wisdom. That's why we called it wisdom. >>Because it is about being an assistant augmenting humans. Just like, be there with the humans and almost think of it as behave and interact with them as another colleague would, versus intelligence, which is, you know, as I said, more mechanical: it's about data, compute, algorithms crunched together. And we wanna bring the power of Project Wisdom and artificial intelligence to developers to, as you said, close the skills gap, to be able to really make them more productive and have Wisdom for Ansible be their assistant. Yeah. To be able to get things for them that they would find in many ways mundane, in many ways hard to find, and again, be an assistant and augment them. >>You know what's interesting, I want to get into the origin, how it all happened, but interesting, IBM Research, well known for the deep tech, big engineering. And you guys have been doing this for a long time, so congratulations. But it's interesting here at this event, even on stage here at the event, you're starting to see the automation come in. So the question comes up, scale. So what happens, IBM buys Red Hat, you go raid the IP treasure trove of AI. I mean, 'cause this is kind of like bringing two killer apps together. The Ansible configuration automation layer with AI, just kind of a, >>Yeah, it's an amazing relationship. I was gonna say marriage, but I don't wanna say marriage cause I may be >>Last. I didn't mean to say raid the treasure trove, but the kind of >>Like, oh my God. An amazing relationship where we bring all this expertise around automation, obviously around IP and application infrastructure automation, and IBM Research, Ruchir and his team, bring this amazing capacity and experience around AI. 
Bringing those two things together and applying AI to automation for our teams is so incredibly fantastic. I just can't contain my enthusiasm about it. And you could feel it in the keynote this morning that Ruchir was doing, the energy in the room, and when folks saw that, it's just amazing. >>The geeks are gonna love it for sure. But here I wanna get into the whole evolution. Computers on computers. Remember the old days, Thinking Machines was a company generations ago that I think was sold or went outta business, but self-learning, learning machines, computers programming computers was actually on your slide. You kind of pieced out this next wave of AI and machine learning, starting with expert systems, really kind of, I'd almost say static, but like, okay, programs. Yeah, yeah. And then now with machine learning, and that big debate was unsupervised, supervised, which is not really perfect. Deep learning, which now explores some things, but now we're at another wave. Take us through the thought there, explaining what this transition looks like and why. >>I think we are, as I said, really at an inflection point in the journey of AI. And with AI, I think it's fair to say data is the pain of AI; without data, AI doesn't exist. But if I were to train AI with what is known as supervised learning, or data that is labeled, you are almost sort of limited because there are only so many people who have that expertise. And interestingly, they all have day jobs. So they're not just gonna sit around and label this for you. Some people may be available, but you know, this is not, again, as Tom said, we are really trying to apply it to some very sort of key domains which require subject matter expertise. This is not like labeling cats and dogs that everybody else in the world knows. The community's very large, but still the skills to go around are not that many. 
>>And I truly believe, to apply AI to the world of, you know, enterprise information technology automation, you have to have unsupervised learning, and that's the only way to scale. Yeah. And these two trends, really about, you know, information technology percolating across every enterprise, and unsupervised learning, which is learning on this very large amount of data with, of course, very large compute, with some very powerful algorithms like transformer architectures and others which have been disrupting the domain of natural language as well, are coming together with what I described as foundation models. Yeah. Which anybody who plays with it, you'll be blown away. That's literally blown away. >>And you call that self-supervision at scale, which is kind of the foundation. So I have to ask you, 'cause this comes up a lot with cloud, cloud scale, everyone tells horizontally scalable cloud, but vertically specialized applications where domain expertise and data plays. So the better the data, the better the self-supervision, the better the learning. But if it's horizontally scalable, there's a lot to learn. So how do you create that data ops where the machines are gonna be peaked to maximize what's addressable, but what's also in the domain too? You gotta have that kind of diversity. Can you share your thoughts on that? >>Absolutely. So in the domain of foundation models, there are two main stages, I would say. One is what I'll describe as pre-training, which is, think of it as the machine in this particular case is knowledgeable about the domain of code in general. It knows the syntax of Python, JavaScript, Go, C, Java, and so on, and also YAML as well, which one would argue is the domain of information technology. And once you get to that level, it's almost like having a developer who knows all of this but may not be an expert at Ansible just yet. He or she can be an expert at Ansible but is not there yet. 
That's what I'll call background knowledge. And also, in the case of foundation models, they are very adept at natural language as well. So they can connect natural language to code, but they are not yet expert at the domain of Ansible. >>Now, the second stage of learning is called fine-tuning, which is about this data ops where I take data, which is sort of the SME data in this particular case. And it's curated. So this is not just generic data you pick off GitHub, where you don't know what exists out there. This is data which is governed, which we know is of high quality as well. And you think of it as, you specialize the generic, pre-trained AI with that data. And those two stages, including the governance of the data that goes into it, result in this sort of really breakthrough technology that we've been calling Project Wisdom. Our first application is Ansible, but just watch that area. There are many more to come, and I'm really excited about this partnership with Red Hat because, across IBM and research, I think if there is one place where we can find an excited, open source, open developer community, it is Red Hat. >>Yeah. >>Tom, talk about the role of open source and Project Wisdom, the involvement of the community, and maybe Ruchir, any feedback that you've gotten since coming off stage? I'm sure you were mobbed. >>Yeah, so for us this is, it's called Project Wisdom, not Product Wisdom. Right? Sorry. Right. And so, no, you didn't say that, but I wanna just emphasize that it is a project, and for us that is a key word in the upstream community, that this is where we're inviting the community to jump on board with us and bring their expertise. All these people that are here will start to participate. They're excited about it. They'll bring their expertise and experience, and that fine-tuning of the model will just get better and better. 
So we're really excited about introducing this now and involving the community because it's super nuts. Everything that Red Hat does is around the community and this is no different. And so we're really excited about Project Wisdom. >>That's interesting. The project piece, because if you see in today's world the innovation strategy before where we are now, go back to, say, 15 years ago, it was all standards, you've gotta have standards bodies. You can still innovate and differentiate, but yet with open source and community, it's a blending of research and practitioners. I think that to me is a big story here, that what you guys are demonstrating is the combination of research and practitioners in the project. Yes. So how does this play out? 'Cause this is kind of like how things are gonna get done in the cloud, 'cause Amazon's not gonna just standardize their stack at higher level services, nor is Azure, and they might get some plumbing commonalities below, but for Project Wisdom to be successful, it doesn't need to have standards. If I get this right, if I'm on point here, what do you guys think about that? React to that. Yeah, >>So I definitely think there is standardization in terms of what we will call the MLOps pipeline for models to be deployed and managed and operated. Models are like any other code: there's standardization on the DevOps pipeline, there's standardization on the machine learning pipeline. And these models will be deployed in the cloud because they need to scale. The only way to scale to, you know, thousands of users is through cloud. And there are standard pipelines that we are working on and architecting together with the Red Hat community, leveraging open source packages. Yeah. It is really to help scale out the AI models of Wisdom together. And another point I wanted to pick up on, just what Tom said: I've been sort of in the area of productizing AI for long now, having experience with Watson as well. 
The only scenario where I've seen AI being successful is the scenario where it meets the criteria of what I describe as the flywheel of AI. >>What do I mean by flywheel of AI? It cannot be that some research people build a model. It may be wowing, but you roll it out and there's no feedback. Yeah, exactly. Okay. We are duh. So actually, the only way: the more people use these models, the more they give you feedback, the better it gets, because it knows what is right and what is not right. It will never be right the first time. Actually, you know, the data it is trained on is a depiction of reality. Yeah. It is not reality in itself. Yeah. Reality is a constantly moving target, and the only way to make AI successful is to close that loop with the community. And that's why I just wanted to reemphasize the point on why community is that important, >>Actually. And what's interesting, Tom, is this is a difference between standards bodies, old school, and communities. Because developers are very efficient in their feedback. Yes. They jump to patterns that serve their needs, whether it's self-service or whatever. You can kind of see what's going on. Yeah. It's either working or not. Yeah, yeah, >>Yeah. We get immediate feedback from the community and we know real fast when something isn't working and when something is working. There are no problems with the flow of data between the members of the community and the developers themselves. So yeah, it's great. It's gonna be fantastic. The energy around Project Wisdom already, I bet. We're gonna go down to the Project Wisdom session, the breakout session, and I bet you the room will be overflowing. >>How do people get involved, real quick? Take a minute to explain how I would get involved. I'm a community member. Yep. I'm watching this video, I'm intrigued. This has got me enthusiastic. How do I get more confident with this opportunity? 
>>So first of all, you go to red hat.com/project Wisdom and you register your interest if you wanna participate. We're gonna start growing this process, bringing people in, getting ready to make the service available for people to start using and experimenting with, and start getting their feedback. So this is the beginning of a journey. This isn't, you know, the midpoint of a journey, this is the beginning. You know, even though the work has been going on for a year, this is the beginning of the community journey now. And so we're gonna start working together through channels like Discord and whatnot to be able to exchange information and bring people in. >>What are some of the key use cases, maybe Ruchir, starting with you, maybe dream use cases that you think the community will help to really uncover as we're looking at Project Wisdom really helping in this transformation of AI? >>So if I focus on, let's say, Ansible itself, there are much wider use cases, but Ansible itself. And you know, I would say I had not realized, I've been working on AI for good, for long, but I had not realized the excitement and the power of the Ansible community itself. It's very large, it's very bottoms-up, which I love actually. But as I went to a lot of CTOs and CIOs of a lot of our customers as well, it was becoming clear, the use cases of, you know, I've got a thousand Ansible developers or IT or automation experts. They write code all the time. I don't know what all of this code is about. So the system administrators, managers, they're trying to figure out sort of how to organize all of this together, and think of it as Google for finding all of this automation code, automation content. 
>>And I'm very excited about not just the use cases that we demonstrated today, that is the beginning of the journey, but to be able to help enterprises in finding the right code through natural language interfaces, generating the code, helping developers debug their code as well. Giving them predictive insights: this may happen, just watch out for it when you deploy this. Something like that happened before, just watch out for it as well. So I'm excited about the entire life cycle of IT automation, not just at build time, but also at the time of deployment, at the time of management. This is just the start of a journey, but there are many exciting use cases for Ansible and beyond. >>It's gonna be great to watch this as it unfolds. Obviously just announcing this today. We thank you both so much for joining us on the program, talking about Project Wisdom and sharing how the community can get involved. So you're gonna have to come back next year. We're gonna have to talk about what's going on. 'Cause I imagine with the excitement of the community and the volume of the community, this is just the tip of the iceberg. Absolutely. >>This is absolutely exactly. You're excited about. >>Excellent. And you should be. Congratulations. Thanks again for joining us. We really appreciate your insights. Thank you. >>Thank you for having us. >>For our guests and John Furrier, I'm Lisa Martin and you're watching theCUBE live from Chicago at AnsibleFest 22. This is day two of wall-to-wall coverage on theCUBE. Stick around. Our next guest joins us in just a minute.

Published Date : Oct 19 2022


Starburst The Data Lies FULL V2b

>>In 2011, early Facebook employee and Cloudera co-founder Jeff Hammerbacher famously said, "The best minds of my generation are thinking about how to get people to click on ads. And that sucks." Let's face it, more than a decade later, organizations continue to be frustrated with how difficult it is to get value from data and build a truly agile, data-driven enterprise. What does that even mean, you ask? Well, it means that everyone in the organization has the data they need when they need it, in a context that's relevant to advance the mission of an organization. Now that could mean cutting costs, could mean increasing profits, driving productivity, saving lives, accelerating drug discovery, making better diagnoses, solving supply chain problems, predicting weather disasters, simplifying processes, and thousands of other examples where data can completely transform people's lives beyond manipulating internet users to behave a certain way. We've heard the prognostications about the possibilities of data before, and in fairness we've made progress, but the hard truth is the original promises of master data management, enterprise data warehouses, data marts, data hubs, and yes, even data lakes were broken and left us wanting for more. Welcome to The Data Doesn't Lie... or Does It?, a series of conversations produced by theCUBE and made possible by Starburst Data. >>I'm your host, Dave Vellante, and joining me today are three industry experts. Justin Borgman is the co-founder and CEO of Starburst. Richard Jarvis is the CTO at EMIS Health, and Teresa Tung is cloud first technologist at Accenture. Today we're gonna have a candid discussion that will expose the unfulfilled and yes, broken promises of data past. We'll expose data lies, big lies, little lies, white lies, and hidden truths. And we'll challenge age-old data conventions and bust some data myths. We're debating questions like: is the demise of a single source of truth 
inevitable? Will the data warehouse ever have feature parity with the data lake, or vice versa? Is the so-called modern data stack simply centralization in the cloud, AKA the old guard's model in new cloud clothes? How can organizations rethink their data architectures and regimes to realize the true promises of data? Can and will an open ecosystem deliver on these promises in our lifetimes? We're spanning much of the Western world today. Richard is in the UK. Teresa is on the West Coast, and Justin is in Massachusetts with me. I'm in theCUBE studios about 30 miles outside of Boston. Folks, welcome to the program. Thanks for coming on. Thanks for having us. Let's get right into it. You're very welcome. Now here's the first lie. The most effective data architecture is one that is centralized, with a team of data specialists serving various lines of business. What do you think, Justin? >>Yeah, definitely a lie. My first startup was a company called Hadapt, which was an early SQL engine for Hadoop that was acquired by Teradata. And when I got to Teradata, of course, Teradata is the pioneer of that central enterprise data warehouse model. One of the things that I found fascinating was that not one of their customers had actually lived up to that vision of centralizing all of their data into one place. They all had data silos. They all had data in different systems. They had data on prem, data in the cloud. You know, those companies were acquiring other companies and inheriting their data architecture. So, you know, despite being the industry leader for 40 years, not one of their customers truly had everything in one place. So I think definitely history has proven that to be a lie. >>So Richard, from a practitioner's point of view, you know, what are your thoughts? I mean, there's a lot of pressure to cut cost, keep things centralized, you know, serve the business as best as possible from that standpoint. What does your experience show? 
>>Yeah, I mean, I think I would echo Justin's experience really, that we as a business have grown up through acquisition, through storing data in different places, sometimes to do information governance in different ways, to store data in a platform that's close to data experts, people who really understand healthcare data from pharmacies or from doctors. And so, although if you were starting from a greenfield site and you were building something brand new, you might be able to centralize all the data and all of the tooling and teams in one place, the reality is that businesses just don't grow up like that. And it's just really impossible to get that academic perfection of storing everything in one place. >>You know, Teresa, I feel like Sarbanes-Oxley kinda saved the data warehouse, you know, right? You actually did have to have a single version of the truth for certain financial data, but really for some of those other use cases I mentioned, I do feel like the industry has kinda let us down. What's your take on this? Where does it make sense to have that sort of centralized approach versus where does it make sense to maybe decentralize? >>I think you gotta have centralized governance, right? So from the central team, for things like Sarbanes-Oxley, for things like security, for certainly very core data sets, having a centralized set of roles and responsibilities to really QA, right, to serve as a design authority for your entire data estate, just like you might with security. But how it's implemented has to be distributed. Otherwise you're not gonna be able to scale. Right? So being able to have different parts of the business really make the right data investments for their needs. And then ultimately you're gonna collaborate with your partners. So partners that are not within the company, right, external partners. We're gonna see a lot more data sharing and model creation. And so you're definitely going to be decentralized. 
>>So, you know, Justin, you guys, geez, I think it was about a year ago, had a session on data mesh. It was a great program. You invited Zhamak Dehghani. Of course, she's the creator of the data mesh. And one of her fundamental premises is that you've got this hyper-specialized team that you've gotta go through if you want anything. But at the same time, these individuals actually become a bottleneck, even though they're some of the most talented people in the organization. So I guess, question for you, Richard, how do you deal with that? Do you organize so that there are a few sort of rock stars that, you know, build cubes and the like, or have you had any success in sort of decentralizing that data model with, you know, your constituencies? >>Yeah. So we absolutely have got rock star data scientists and data guardians, if you like. People who understand what it means to use this data, particularly as the data that we use at EMIS is very private; it's healthcare information. And some of the rules and regulations around using the data are very complex and strict. So we have to have people who understand the usage of the data, then people who understand how to build models, how to process the data effectively. And you can think of them like consultants to the wider business, because a pharmacist might not understand how to structure a SQL query, but they do understand how they want to process medication information to improve patient lives. And so that becomes a consulting-type experience from a set of rock stars to help a more decentralized business who needs to understand the data and to generate some valuable output. >>Justin, what do you say to a customer or prospect that says, look, Justin, I got a centralized team and that's the most cost-effective way to serve the business. Otherwise I got duplication. What do you say to that? 
>>Well, I would argue it's probably not the most cost effective, and the reason being really twofold. I think, first of all, when you are deploying an enterprise data warehouse model, the data warehouse itself is very expensive, generally speaking. And so you're putting all of your most valuable data in the hands of one vendor who now has tremendous leverage over you, you know, for many, many years to come. I think that's the story at Oracle or Teradata or other proprietary database systems. But the other aspect, I think, is that the reality is those central data warehouse teams, as much as they are experts in the technology, don't necessarily understand the data itself. And this is one of the core tenets of data mesh that Zhamak writes about, this idea that the domain owners actually know the data the best. >>And so by, you know, not only acknowledging that data is generally decentralized, and to your earlier point about Sarbanes-Oxley maybe saving the data warehouse, I would argue maybe GDPR and data sovereignty will destroy it, because data has to be decentralized to be compliant with those laws. But I think the reality is, you know, the data mesh model basically says data's decentralized, and we're gonna turn that into an asset rather than a liability. And we're gonna turn that into an asset by empowering the people that know the data the best to participate in the process of, you know, curating and creating data products for consumption. So I think when you think about it that way, you're going to get higher quality data and faster time to insight, which is ultimately going to drive more revenue for your business and reduce costs. So I think that's the way I see the two models comparing and contrasting. >>So do you think the demise of the data warehouse is inevitable? I mean, you know, Teresa, you work with a lot of clients, they're not just gonna rip and replace their existing infrastructure. 
Maybe they're gonna build on top of it, but what does that mean? Does that mean the EDW just becomes, you know, less and less valuable over time, or is it maybe just isolated to specific use cases? What's your take on that? >>Listen, I still would love all my data within a data warehouse. Would love it mastered, would love it owned by a central team. Right? I think that's still what I would love to have. That's just not the reality, right? The investment to actually migrate and keep that up to date, I would say it's a losing battle. Like, we've been trying to do it for a long time. Nobody has the budgets, and then data changes, right? There's gonna be a new technology that's gonna emerge that we're gonna wanna tap into. There's going to be not enough investment to bring all the legacy, but still very useful, systems into that centralized view. So you keep the data warehouse. I think it's a very, very valuable, very high performance tool for what it's there for, but you could have this, you know, new mesh layer that still takes advantage of the things I mentioned: the data products in the systems that are meaningful today, and the data products that actually might span a number of systems, maybe either the source systems for the domains that know it best, or the consumer-based systems and products that need to be packaged in a way that's really meaningful for that end user, right? Each of those is useful for a different part of the business, and you're making sure that the mesh actually allows you to use all of them. >>So, Richard, let me ask you. Take Zhamak's principles back to those. You got, you know, domain ownership and data as product. Okay, great. Sounds good. But it creates what I would argue are two, you know, challenges. Self-serve infrastructure, let's park that for a second. 
And then, in your industry, one of the most highly regulated, most sensitive: computational governance. How do you automate and ensure federated governance in that mesh model that Teresa was just talking about? >>Well, it absolutely depends on some of the tooling and the processes that you put in place around those tools to centralize the security and the governance of the data. And I think, although a data warehouse makes that very simple, 'cause it's a single tool, it's not impossible with some of the data mesh technologies that are available. And so what we've done at EMIS is we have a single security layer that sits on top of our data mesh, which means that no matter which user is accessing which data source, we go through a well-audited, well-understood security layer. That means that we know exactly who's got access to which data fields, which data tables. And then everything that they do is audited in a very kind of standard way, regardless of the underlying data storage technology. So for me, although storing the data in one place might not be possible, understanding where your source of truth is and securing that in a common way is still a valuable approach, and you can do it without having to bring all that data into a single bucket so that it's all in one place. And so having done that, and investing quite heavily in making that possible, has paid dividends in terms of giving wider access to the platform and ensuring that only data that's available under GDPR and other regulations is being used by the data users. >>Yeah. So Justin, I mean, we always talk about data democratization and, you know, up until recently, there really hasn't been line of sight as to how to get there. But do you have anything to add to this, because you're essentially, you know, doing analytic queries with data that's all dispersed all over. How are you seeing your customers handle this challenge? >>Yeah. 
I mean, I think data products is a really interesting aspect of the answer to that. It allows you to, again, leverage the data domain owners, the people who know the data the best, to create, you know, data as a product, ultimately to be consumed. And we try to represent that in our product as effectively an almost e-commerce-like experience, where you go and discover and look for the data products that have been created in your organization. And then you can start to consume them as you'd like. And so really trying to build on that notion of, you know, data democratization and self-service, and making it very easy to discover and start to use with whatever BI tool you may like, or even just running, you know, SQL queries yourself. >> Okay, guys, grab a sip of water. After this short break, we'll be back to debate whether proprietary or open platforms are the best path to the future of data excellence. Keep it right there. >> Your company has more data than ever, and more people trying to understand it, but there's a problem. Your data is stored across multiple systems. It's hard to access, and that delays analytics and ultimately decisions. The old method of moving all of your data into a single source of truth is slow and definitely not built for the volume of data we have today or where we are headed. While your data engineers spend over half their time moving data, your analysts and data scientists are left waiting, feeling frustrated, unproductive, and unable to move the needle for your business. But what if you could spend less time moving or copying data? What if your data consumers could analyze all your data quickly? >> Starburst helps your teams run fast queries on any data source. We help you create a single point of access to your data, no matter where it's stored.
And we support high concurrency. We solve for speed and scale. Whether it's fast SQL queries on your data lake or faster queries across multiple data sets, Starburst helps your teams run analytics anywhere. You can't afford to wait for data to be available. Your team has questions that need answers now. With Starburst, the wait is over. You'll have faster access to data with enterprise-level security, easy connectivity, and 24/7 support from experts. Organizations like Zalando, Comcast, and FINRA rely on Starburst to move their businesses forward. Contact our Trino experts to get started. >> We're back with Justin Borgman of Starburst and Richard Jarvis of EMIS Health. Okay, we're gonna get to lie number two, and that is this: an open source-based platform cannot give you the performance and control that you can get with a proprietary system. Is that a lie? Justin, the enterprise data warehouse has been pretty dominant and has evolved and matured. Its stack has matured over the years. Why is it not the default platform for data? >> Yeah, well, I think that's become a lie over time. So I think, you know, if we go back 10 or 12 years ago, with the advent of the first data lake, really around Hadoop, that probably was true, that you couldn't get the performance that you needed to run fast, interactive SQL queries in a data lake. Now a lot's changed in 10 or 12 years. I remember in the very early days, people would say, you'll never get performance because you need to be columnar. You need to store data in a columnar format. And then, you know, columnar formats were introduced to data lakes. You have Parquet, ORC file, and Avro that were created to ultimately deliver performance out of that. So, okay. We got, you know, largely over the performance hurdle. You know, more recently people will say, well, you don't have the ability to do updates and deletes like a traditional data warehouse.
>> And now we've got the creation of new data formats, again, like Iceberg and Delta and Hudi, that do allow for updates and deletes. So I think the data lake has continued to mature. And I remember a quote from, you know, Curt Monash many years ago, where he said, you know, it takes six or seven years to build a functional database. I think that's right. And now we've had almost a decade go by. So, you know, these technologies have matured to really deliver very, very close to the same level of performance and functionality as cloud data warehouses. So I think the reality is that's become a lie, and now we have large, giant hyperscale internet companies that, you know, don't have the traditional data warehouse at all. They do all of their analytics in a data lake. So I think we've proven that it's very much possible today. >> Thank you for that. And so Richard, talk about your perspective as a practitioner in terms of what open brings you versus closed. I mean, look, open is a moving target. I remember Unix used to be open systems, and so it is an evolving, you know, spectrum. But from your perspective, what does open give you that you can't get from a proprietary system, or what are you fearful of in a proprietary system? >> I suppose for me, open buys us the ability to be unsure about the future, because one thing that's always true about technology is it evolves in a direction slightly different to what people expect. And what you don't want to end up having done is backed yourself into a corner that then prevents you from innovating. So if you have chosen a technology and you've stored trillions of records in that technology, and suddenly a new way of processing or machine learning comes out, you wanna be able to take advantage, and your competitive edge might depend upon it. And so I suppose for us, we acknowledge that we don't have perfect vision of what the future might be.
And so by backing open storage technologies, we can apply a number of different technologies to the processing of that data. And that gives us the ability to remain relevant and innovate on our data storage. And we have bought our way out of any performance concerns because we can use cloud-scale infrastructure to scale up and scale down as we need. And so we don't have the concerns that we don't have enough hardware today to process what we want to achieve. We can just scale up when we need it and scale back down. So open source has really allowed us to remain at the cutting edge. >> So Justin, let me play devil's advocate here a little bit. I've talked to Zhamak about this, and, you know, obviously her vision is that the data mesh is open source, with open source tooling, and it's not proprietary. You know, you're not gonna buy a data mesh. You're gonna build it with open source tooling, and vendors like you are gonna support it. But to come back to sort of today, you can get to market with a proprietary solution faster. I'm gonna make that statement. You tell me if it's a lie. And then you can say, okay, we support Apache Iceberg. We're gonna support open source tooling. Take a company like VMware, not really in the data business, but look at the way they embraced Kubernetes and, you know, every new open source thing that comes along. They say, we do that too. Why can't proprietary systems do that and be as effective? >> Yeah, well, I think, at least within the data landscape, saying that you can access open data formats like Iceberg or others is a bit disingenuous, because really what you're selling to your customer is a certain degree of performance, a certain SLA, and, you know, those cloud data warehouses that can reach beyond their own proprietary storage drop all the performance that they were able to provide.
So it reminds me, kind of, of, again, going back 10 or 12 years ago, when everybody had a connector to Hadoop and they thought that was the solution, right? But the reality was, you know, a connector was not the same as running workloads in Hadoop back then. And I think similarly, you know, being able to connect to an external table that lives in an open data format, you know, you're not going to give it the performance that your customers are accustomed to. And at the end of the day, they're always going to be predisposed, they're always going to be incentivized, to get that data ingested into the data warehouse, 'cause that's where they have control. And, you know, the bottom line is the database industry has really been built around vendor lock-in. I mean, from the start, how many people love Oracle today, but are customers nonetheless? I think, you know, lock-in is part of this industry. And I think that's really what we're trying to change with open data formats. >> Well, that's interesting. It reminds me, you know, when I see the gas price, the teaser gas price, I drive up, and then I say, oh, that's the cash price. Credit card, I gotta pay 20 cents more. But okay. So the argument then, let me come back to you, Justin. So what's wrong with saying, hey, we support open data formats, but yeah, you're gonna get better performance if you keep it in our closed system? Are you saying that long term, that's gonna come back and bite you, 'cause you're gonna end up... you mentioned Oracle, you mentioned Teradata. Yeah. By implication, you're saying that's where Snowflake customers are headed. >> Yeah, absolutely. I think this is a movie that, you know, we've all seen before. At least those of us who've been in the industry long enough to see this movie play a couple of times. So I do think that's the future. And I think, you know, I loved what Richard said. I actually wrote it down.
'Cause I thought it was an amazing quote. He said, it buys us the ability to be unsure of the future. That pretty much says it all. The future is unknowable, and the reality is, using open data formats, you remain interoperable with any technology you want to utilize. If you want to use Spark to train a machine learning model and you want to use Starburst to query via SQL, that's totally cool. They can both work off the same exact, you know, data sets. By contrast, if you're, you know, focused on a proprietary model, then you're kind of locked in again to that model. I think the same applies to data sharing, to data products, to a wide variety of aspects of the data landscape, that a proprietary approach kind of closes you in and locks you in. >> So I would say this, Richard, I'd love to get your thoughts on it, 'cause I talk to a lot of Oracle customers, not as many Teradata customers, but a lot of Oracle customers, and, you know, they'll admit, yeah, you know, they're jamming us on price and the license cost they give, but we do get value out of it. And so my question to you, Richard, is: the, let's call it, data warehouse systems or the proprietary systems, are they gonna deliver a greater ROI sooner? And is that an allure that customers, you know, are attracted to, or can open platforms deliver ROI as fast? >> I think the answer to that is it can depend a bit. It depends on your business's skill set. So we are lucky that we have a number of proprietary teams that work in databases that provide our operational data capability. And we have teams of analytics and big data experts who can work with open data sets and open data formats. And so those different teams can get to an ROI more quickly with different technologies. For the business, though, we can't do better for our operational data stores than proprietary databases. Today we can back off very tight SLAs to them.
We can demonstrate reliability from millions of hours of those databases being run at enterprise scale. But for an analytics workload, where increasingly our business is growing in that direction, we can't do better than open data formats with cloud-based data mesh type technologies. And so it's not a simple answer that one will always be the right answer for our business. We definitely have times when proprietary databases provide a capability that we couldn't easily represent or replicate with open technologies. >> Yeah. Richard, stay with you. You mentioned, you know, some things before that strike me. You know, the Databricks-Snowflake, you know, thing is a lot of fun for analysts like me. You've got Databricks coming at it... Richard, you mentioned you have a lot of rockstar data engineers... Databricks coming at it from a data engineering heritage. You get Snowflake coming at it from an analytics heritage. Those two worlds are colliding. People like Sanjeev Mohan said, you know what, I think it's actually harder to play in the data engineering world. So, i.e., it's easier for the data engineering world to go into the analytics world versus the reverse. But thinking about up-and-coming engineers and developers preparing for this future of data engineering and data analytics, how should they be thinking about the future? What's your advice to those young people? >> So I think I'd probably fall back on general programming skill sets. So the advice that I saw years ago was, if you have open source technologies, the Pythons and Javas, on your CV, you command a 20% pay hike over people who can only do proprietary programming languages. And I think that's true of data technologies as well. And from a business point of view, that makes sense. I'd rather spend the money that I save on proprietary licenses on better engineers, because they can provide more value to the business that can innovate us beyond our competitors.
So I think my advice to people who are starting here, or trying to build teams to capitalize on data assets, is begin with open, license-free capabilities, because they're very cheap to experiment with. And they generate a lot of interest from people who want to join you as a business. And you can make them very successful early doors with your analytics journey. >> It's interesting. Again, analysts like myself, we do a lot of TCO work and have over the last 20-plus years. And in the world of Oracle, you know... normally it's the staff that's the biggest nut in total cost of ownership. Not in Oracle. It's the license cost that is by far the biggest component in the blame pie. All right, Justin, help us close out this segment. We've been talking about this sort of data mesh, open, closed, Snowflake, Databricks. Where does Starburst, sort of as this engine for the data lake, the data lakehouse, the data warehouse, fit in this world? >> Yeah. So our view on how the future ultimately unfolds is we think that data lakes will be a natural center of gravity for a lot of the reasons that we described: open data formats, lowest total cost of ownership, because you get to choose the cheapest storage available to you. Maybe that's S3 or Azure Data Lake Storage or Google Cloud Storage, or maybe it's on-prem object storage that you bought at a really good price. So ultimately, storing a lot of data in a data lake makes a lot of sense. But I think what makes our perspective unique is we still don't think you're gonna get everything there either. We think that basically centralization of all your data assets is just an impossible endeavor. And so you wanna be able to access data that lives outside of the lake as well.
So we kind of think of the lake as maybe the biggest place by volume in terms of how much data you have, but to have comprehensive analytics, and to truly understand your business and understand it holistically, you need to be able to go access other data sources as well. And so that's the role that we wanna play: to be a single point of access for our customers, provide the right level of fine-grained access controls so that the right people have access to the right data, and ultimately make it easy to discover and consume via, you know, the creation of data products as well. >> Great. Okay. Thanks, guys. Right after this quick break, we're gonna be back to debate whether the cloud data model that we see emerging and the so-called modern data stack is really modern, or is it the same wine, new bottle, when it comes to data architectures? You're watching theCUBE, the leader in enterprise and emerging tech coverage. >> Your data is capable of producing incredible results, but data consumers are often left in the dark without fast access to the data they need. Starburst makes your data visible from wherever it lives. Your company is acquiring more data in more places, more rapidly than ever. To rely solely on a data centralization strategy, whether it's in a lake or a warehouse, is unrealistic. A single source of truth approach is no longer viable, but disconnected data silos are often left untapped. We need a new approach. One that embraces distributed data.
One that enables fast and secure access to any of your data from anywhere. With Starburst, you'll have the fastest query engine for the data lake that allows you to connect and analyze your disparate data sources, no matter where they live. Starburst provides the foundational technology required for you to build towards the vision of a decentralized data mesh. Starburst Enterprise and Starburst Galaxy offer enterprise-ready connectivity, interoperability, and security features for multiple regions, multiple clouds, and ever-changing global regulatory requirements. The data is yours. And with Starburst, you can perform analytics anywhere in light of your world. >> Okay. We're back with Justin Borgman, CEO of Starburst. Richard Jarvis is the CTO of EMIS Health, and Teresa Tung is the cloud first technologist from Accenture. We're on to lie number three. And that is the claim that today's modern data stack is actually modern. So I guess that's the lie. Or is it that it's not modern? Justin, what do you say? >> Yeah. I mean, I think new isn't modern, right? I think it's the new data stack. It's the cloud data stack, but that doesn't necessarily mean it's modern. I think a lot of the components actually are exactly the same as what we've had for 40 years. Rather than Teradata, you have Snowflake. Rather than Informatica, you have Fivetran. So it's the same general stack, just, you know, a cloud version of it. And I think a lot of the challenges that have plagued us for 40 years still maintain. >> So let me come back to you, Justin. Okay, but there are differences, right? I mean, you can scale. You can throw resources at the problem. You can separate compute from storage. You really, you know, there's a lot of money being thrown at that by venture capitalists, and Snowflake, you mentioned, its competitors. So that's different. Is it not? Is that not at least an aspect of modern, dial it up, dial it down? So what do you say to that?
>> Well, it is. It's certainly taking, you know, what the cloud offers and taking advantage of that. But it's important to note that the cloud data warehouses out there are really just separating their compute from their storage. So it's allowing them to scale up and down, but your data's still stored in a proprietary format. You're still locked in. You still have to ingest the data to get it even prepared for analysis. So a lot of the same sort of structural constraints that exist with the old enterprise data warehouse model on-prem still exist. Just, yes, a little bit more elastic now because the cloud offers that. >> So Teresa, let me go to you, 'cause you have cloud first in your title. So what say you to this conversation? >> Well, even the cloud providers are looking towards more of a cloud continuum, right? So the centralized cloud as we know it, maybe data lake, data warehouse in the central place, that's not even how the cloud providers are looking at it. They have new query services. Every provider has one that really expands those queries to be beyond a single location. And if we look at a lot of where the future goes, right, that's gonna very much follow the same thing. There's gonna be more edge. There's gonna be more on-premise because of data sovereignty, data gravity, because you're working with different parts of the business that have already made major cloud investments in different cloud providers. Right? So there's a lot of reasons why the modern, I guess the next modern generation of the data stack, needs to be much more federated. >> Okay. So Richard, how do you deal with this? You've obviously got, you know, the technical debt, the existing infrastructure. It's on the books. You don't wanna just throw it out. A lot of conversation about modernizing applications, which a lot of times is, you know, a microservices layer on top of legacy apps. How do you think about the modern data stack?
>> Well, I think probably the first thing to say is that the stack really has to include the processes and people around the data as well. It's all well and good changing the technology, but if you don't modernize how people use that technology, then you're not going to be able to scale, because just 'cause you can scale CPU and storage doesn't mean you can get more people to use your data to generate more value for the business. And so what we've been looking at is really changing, very much aligned to data products and data mesh: how do you enable more people to consume the service and have the stack respond in a way that keeps costs low? Because that's important for our customers consuming this data, but it also allows people to occasionally run enormous queries and then tick along with smaller ones when required. And it's a good job we did, because during COVID, all of a sudden we had enormous pressures on our data platform to answer really important, life-threatening queries. And if we couldn't scale both our data stack and our teams, we wouldn't have been able to answer those as quickly as we did. So I think the stack needs to support a scalable business, not just the technology itself. >> Well, thank you for that. So Justin, let's try to break down what the critical aspects are of the modern data stack. So you think about the past, you know, five, seven years. Cloud obviously has given a different pricing model, de-risked experimentation, you know, that we talked about, the ability to scale up, scale down. But I'm taking away that that's not enough, based on what Richard just said. The modern data stack has to serve the business and enable the business to build data products. I buy that. I'm a big fan of the data mesh concepts, even though we're early days. So what are the critical aspects, if you had to think about, you know, maybe putting some guardrails and definitions around the modern data stack? What does that look like?
What are some of the attributes and principles there? >> Of how it should look, or how... >> It's, yeah, what it should be. >> Yeah. Yeah. Well, I think, you know, Teresa mentioned this in a previous segment, that the data warehouse is not necessarily going to disappear. It just becomes one node, one element of the overall data mesh. And I certainly agree with that. So by no means are we suggesting that, you know, Snowflake or Redshift or whatever cloud data warehouse you may be using is going to disappear, but it's not going to become the end-all be-all. It's not the central single source of truth. And I think that's the paradigm shift that needs to occur. And I think it's also worth noting that those who were the early adopters of the modern data stack were primarily digital-native, born-in-the-cloud young companies who had the benefit of idealism. They had the benefit of starting with a clean slate. That does not reflect the vast majority of enterprises. >> And even those companies, as they grow up, mature out of that ideal state. They go buy a business. Now they've got something on another cloud provider that has a different data stack, and they have to deal with that heterogeneity. That is just change, and change is a part of life. And so I think there is an element here that is almost philosophical. It's like, do you believe in an absolute ideal where I can just fit everything into one place, or do I believe in reality? And I think the far more pragmatic approach is really what data mesh represents. So to answer your question directly, I think it's adding, you know, the ability to access data that lives outside of the data warehouse, maybe living in open data formats in a data lake, or accessing operational systems as well. Maybe you want to directly access data that lives in an Oracle database or a Mongo database or what have you.
So creating that flexibility to really future-proof yourself from the inevitable change that you will encounter over time. >> So thank you. So Teresa, based on what Justin just said, my takeaway there is it's inclusive. Whether it's a data mart, data hub, data lake, data warehouse, it's just a node on the mesh. Okay. I get that. Does that include on-prem data? Obviously it has to. What are you seeing in terms of the ability to take that data mesh concept on-prem? I mean, most implementations I've seen in data mesh, frankly, really aren't, you know, adhering to the philosophy. Maybe it's a data lake and maybe it's using Glue. You look at what JPMC is doing, HelloFresh, a lot of stuff happening on the AWS cloud in that, you know, closed stack, if you will. What's the answer to that, Teresa? >> I mean, I think it's a killer case for data mesh, the fact that you have valuable data sources on-prem, and then yet you still wanna modernize and take the best of cloud. Cloud is still, like we mentioned, there's a lot of great reasons for it, around the economics and the ability to tap into the innovation that the cloud providers are giving around data and AI architecture. It's an easy button. So the mesh allows you to have the best of both worlds. You can start using the data products on-prem or in the existing systems that are working already. It's meaningful for the business. At the same time, you can modernize the ones that make business sense, because it needs better performance, it needs, you know, something that is cheaper, or maybe just to tap into better analytics to get better insights, right? So you're gonna be able to stretch and really have the best of both worlds. That, again, going back to Richard's point, is meaningful for the business. Not everything has to have that one-size-fits-all set of tools. >> Okay. Thank you.
So Richard, you know, talking about data as product, wonder if you could give us your perspectives here. What are the advantages of treating data as a product? What role do data products have in the modern data stack? We talk about monetizing data. What are your thoughts on data products? >> So for us, one of the most important data products that we've been creating is taking data that is healthcare data across a wide variety of different settings, so information about patients' demographics, about their treatment, about their medications and so on, and taking that into a standard format that can be utilized by a wide variety of different researchers. Because misinterpreting that data, or having the data not presented in the way that the user is expecting, means that you generate the wrong insight. And in any business, that's clearly not a desirable outcome. But when that insight is so critical, as it might be in healthcare or some security settings, you really have to have gone to the trouble of understanding the data, presenting it in a format that everyone can clearly agree on, and then letting people consume it in a very structured, managed way, even if that data comes from a variety of different sources in the first place. And so our data product journey has really begun by standardizing data across a number of different silos through the data mesh, so we can present it out both internally and, through the right governance, externally to researchers. >> So that data product, through whatever APIs, is accessible, it's discoverable, but it's obviously gotta be governed as well. You mentioned you appropriately provide it internally. Yeah. But also, you know, to external folks as well.
So you've architected that capability today? >> We have, and because the data is standard, it can generate value much more quickly, and we can be sure of the security and value that that's providing. Because the data product isn't just about formatting the data into the correct tables. It's understanding what it means to redact the data, or to remove certain rows from it, or to interpret what a date actually means. Is it the start of the contract, or the start of the treatment, or the date of birth of a patient? These things can be lost in the data storage without having the proper product management around the data to say, in a very clear business context, what does this data mean? And what does it mean to process this data for a particular use case? >> Yeah, it makes sense. It's got the context. If the domains own the data, you've gotta cut through a lot of the centralized teams, the technical teams that are data agnostic. They don't really have that context. All right. Let's end. Justin, how does Starburst fit into this modern data stack? Bring us home. >> Yeah. So I think for us, it's really providing our customers with, you know, the flexibility to operate and analyze data that lives in a wide variety of different systems, ultimately giving them that optionality. You know, and optionality provides the ability to reduce costs, store more in a data lake rather than a data warehouse. It provides the ability for the fastest time to insight, to access the data directly where it lives. And ultimately, with this concept of data products that we've now, you know, incorporated into our offering as well, you can really create and curate, you know, data as a product to be shared and consumed. So we're trying to help enable the data mesh, you know, model and make that an appropriate complement to, you know, the modern data stack that people have today. >> Excellent.
Hey, I wanna thank Justin, Teresa, and Richard for joining us today. You guys are great. I'm a big believer in the data mesh concept, and I think, you know, we're seeing the future of data architecture. So thank you. Now, remember, all these conversations are gonna be available on thecube.net for on-demand viewing. You can also go to starburst.io. They have some great content on the website, and they host some really thought-provoking interviews, and they have awesome resources, lots of data mesh conversations over there, and really good stuff in the resource section. So check that out. Thanks for watching "The Data Doesn't Lie... or Does It?", made possible by Starburst Data. This is Dave Vellante for theCUBE, and we'll see you next time. >> The explosion of data sources has forced organizations to modernize their systems and architecture and come to terms with the fact that one size does not fit all for data management. Today, your teams are constantly moving and copying data, which requires time, management, and, in some cases, double paying for compute resources. Instead, what if you could access all your data anywhere, using the BI tools and SQL skills your users already have? And what if this also included enterprise security and fast performance? With Starburst Enterprise, you can provide your data consumers with a single point of secure access to all of your data, no matter where it lives. With features like strict, fine-grained access control, end-to-end data encryption, and data masking, Starburst meets the security standards of the largest companies. Starburst Enterprise can easily be deployed anywhere and managed with insights, where data teams holistically view their cluster's operation and query execution, so they can reach meaningful business decisions faster. All this with the support of the largest team of Trino experts in the world, delivering fully tested, stable releases and available to support you 24/7 to unlock the value in all of your data.
You need a solution that easily fits with what you have today and can adapt to your architecture. Tomorrow. Starbust enterprise gives you the fastest path from big data to better decisions, cuz your team can't afford to wait. Trino was created to empower analytics anywhere and Starburst enterprise was created to give you the enterprise grade performance, connectivity, security management, and support your company needs organizations like Zolando Comcast and FINRA rely on Starburst to move their businesses forward. Contact us to get started.

Published Date : Aug 22 2022

Starburst The Data Lies FULL V1

>> In 2011, early Facebook employee and Cloudera co-founder Jeff Hammerbacher famously said, "The best minds of my generation are thinking about how to get people to click on ads. And that sucks." Let's face it: more than a decade later, organizations continue to be frustrated with how difficult it is to get value from data and build a truly agile, data-driven enterprise. What does that even mean, you ask? Well, it means that everyone in the organization has the data they need when they need it, in a context that's relevant to advance the mission of the organization. Now, that could mean cutting costs, increasing profits, driving productivity, saving lives, accelerating drug discovery, making better diagnoses, solving supply chain problems, predicting weather disasters, simplifying processes, and thousands of other examples where data can completely transform people's lives beyond manipulating internet users to behave a certain way. We've heard the prognostications about the possibilities of data before, and in fairness, we've made progress. But the hard truth is the original promises of master data management, enterprise data warehouses, data marts, data hubs, and yes, even data lakes were broken and left us wanting more. Welcome to "The Data Doesn't Lie... or Does It?", a series of conversations produced by theCUBE and made possible by Starburst Data. >> I'm your host, Dave Vellante, and joining me today are three industry experts. Justin Borgman is the co-founder and CEO of Starburst, Richard Jarvis is the CTO at EMIS Health, and Teresa Tung is cloud first technologist at Accenture. Today we're going to have a candid discussion that will expose the unfulfilled and, yes, broken promises of a data past. We'll expose data lies: big lies, little lies, white lies, and hidden truths. And we'll challenge age-old data conventions and bust some data myths. We're debating questions like: Is the demise of a single source of truth inevitable? 
Will the data warehouse ever have feature parity with the data lake, or vice versa? Is the so-called modern data stack simply centralization in the cloud, a.k.a. the old guard's model in new cloud clothes? How can organizations rethink their data architectures and regimes to realize the true promises of data? Can and will an open ecosystem deliver on these promises in our lifetimes? We're spanning much of the Western world today. Richard is in the UK, Teresa is on the West Coast, and Justin is in Massachusetts with me. I'm in theCUBE studios, about 30 miles outside of Boston. Folks, welcome to the program. Thanks for coming on. >> Thanks for having us. >> Let's get right into it. You're very welcome. Now, here's the first lie: the most effective data architecture is one that is centralized, with a team of data specialists serving various lines of business. What do you think, Justin? >> Yeah, definitely a lie. My first startup was a company called Hadapt, which was an early SQL engine for Hadoop that was acquired by Teradata. And when I got to Teradata (of course, Teradata is the pioneer of that central enterprise data warehouse model), one of the things that I found fascinating was that not one of their customers had actually lived up to that vision of centralizing all of their data into one place. They all had data silos. They all had data in different systems. They had data on-prem, data in the cloud. Those companies were acquiring other companies and inheriting their data architecture. So despite being the industry leader for 40 years, not one of their customers truly had everything in one place. So I think history has definitely proven that to be a lie. >> So Richard, from a practitioner's point of view, what are your thoughts? I mean, there's a lot of pressure to cut costs, keep things centralized, serve the business as best as possible from that standpoint. What does your experience show? 
>> Yeah, I mean, I think I would echo Justin's experience, really. We as a business have grown up through acquisition, through storing data in different places, sometimes to do information governance in different ways, sometimes to store data in a platform that's close to data experts, people who really understand healthcare data from pharmacies or from doctors. And so, although if you were starting from a greenfield site and building something brand new, you might be able to centralize all the data and all of the tooling and teams in one place, the reality is that businesses just don't grow up like that. And it's just really impossible to get that academic perfection of storing everything in one place. >> You know, Teresa, I feel like Sarbanes-Oxley kind of saved the data warehouse, right? You actually did have to have a single version of the truth for certain financial data. But really, for some of those other use cases I mentioned, I do feel like the industry has kind of let us down. What's your take on this? Where does it make sense to have that sort of centralized approach, versus where does it make sense to maybe decentralize? >> I think you've got to have centralized governance, right? So from the central team, for things like Sarbanes-Oxley, for things like security, for certain very core data sets, having a centralized set of roles and responsibilities to really QA, right, to serve as a design authority for your entire data estate, just like you might with security. But how it's implemented has to be distributed. Otherwise you're not going to be able to scale, right? So being able to have different parts of the business really make the right data investments for their needs. And then ultimately you're going to collaborate with your partners, partners that are not within the company, right? External partners. We're going to see a lot more data sharing and model creation. And so you're definitely going to be decentralized. 
>> So, Justin, you guys, geez, I think it was about a year ago, had a session on data mesh. It was a great program. You invited Zhamak Dehghani; of course, she's the creator of the data mesh. And one of her fundamental premises is that you've got this hyper-specialized team that you've got to go through if you want anything. But at the same time, these individuals actually become a bottleneck, even though they're some of the most talented people in the organization. So I guess a question for you, Richard: how do you deal with that? Do you organize so that there are a few sort of rock stars that build cubes and the like, or have you had any success in sort of decentralizing that data model with your constituencies? >> Yeah. So we absolutely have got rock star data scientists and data guardians, if you like: people who understand what it means to use this data, particularly as the data that we use at EMIS is very private. It's healthcare information, and some of the rules and regulations around using the data are very complex and strict. So we have to have people who understand the usage of the data, then people who understand how to build models, how to process the data effectively. And you can think of them like consultants to the wider business, because a pharmacist might not understand how to structure a SQL query, but they do understand how they want to process medication information to improve patient lives. And so that becomes a consulting-type experience from a set of rock stars to help a more decentralized business that needs to understand the data and to generate some valuable output. >> Justin, what do you say to a customer or prospect that says, "Look, Justin, I've got a centralized team, and that's the most cost-effective way to serve the business. Otherwise I've got duplication"? What do you say to that? 
>> Well, I would argue it's probably not the most cost-effective, and the reason is really twofold. First of all, when you are deploying an enterprise data warehouse model, the data warehouse itself is very expensive, generally speaking. And so you're putting all of your most valuable data in the hands of one vendor who now has tremendous leverage over you for many, many years to come. I think that's the story at Oracle or Teradata or other proprietary database systems. But the other aspect, I think, is that the reality is those central data warehouse teams, as much as they are experts in the technology, don't necessarily understand the data itself. And this is one of the core tenets of data mesh that Zhamak writes about: this idea that the domain owners actually know the data the best. And so, not only acknowledging that data is generally decentralized, and to your earlier point about Sarbanes-Oxley maybe saving the data warehouse, I would argue maybe GDPR and data sovereignty will destroy it, because data has to be decentralized to be compliant with those laws. But I think the reality is the data mesh model basically says data's decentralized, and we're going to turn that into an asset rather than a liability. And we're going to turn that into an asset by empowering the people who know the data the best to participate in the process of curating and creating data products for consumption. So I think when you think about it that way, you're going to get higher quality data and faster time to insight, which is ultimately going to drive more revenue for your business and reduce costs. So that's the way I see the two models comparing and contrasting. >> So do you think the demise of the data warehouse is inevitable? I mean, Teresa, you work with a lot of clients. They're not just going to rip and replace their existing infrastructure. 
Maybe they're going to build on top of it, but what does that mean? Does that mean the EDW just becomes less and less valuable over time, or is it maybe just isolated to specific use cases? What's your take on that? >> Listen, I still would love all my data within a data warehouse. Would love it mastered, would love it owned by a central team, right? I think that's still what I would love to have. That's just not the reality, right? The investment to actually migrate and keep that up to date, I would say it's a losing battle. We've been trying to do it for a long time. Nobody has the budgets, and then data changes, right? There's going to be a new technology that emerges that we're going to want to tap into. There's not going to be enough investment to bring all the legacy but still very useful systems into that centralized view. So you keep the data warehouse. I think it's a very, very valuable, very high-performance tool for what it's there for. But you could have this new mesh layer that still takes advantage of the things I mentioned: the data products in the systems that are meaningful today, and the data products that actually might span a number of systems, maybe either the source systems for the domains that know it best, or the consumer-based systems and products that need to be packaged in a way that's really meaningful for that end user, right? Each of those is useful for a different part of the business, and the point is making sure that the mesh actually allows you to use all of them. >> So, Richard, let me ask you. Take Zhamak's principles, back to those. You've got domain ownership and data as product. Okay, great. Sounds good. But it creates what I would argue are two challenges. Self-serve infrastructure, let's park that for a second. 
And then in your industry, one of the most highly regulated, most sensitive: computational governance. How do you automate and ensure federated governance in that mesh model that Teresa was just talking about? >> Well, it absolutely depends on some of the tooling and processes that you put in place around those tools to centralize the security and the governance of the data. And I think, although a data warehouse makes that very simple because it's a single tool, it's not impossible with some of the data mesh technologies that are available. And so what we've done at EMIS is we have a single security layer that sits on top of our data mesh, which means that no matter which user is accessing which data source, we go through a well-audited, well-understood security layer. That means that we know exactly who's got access to which data fields and which data tables. And then everything that they do is audited in a very standard way, regardless of the underlying data storage technology. So for me, although storing the data in one place might not be possible, understanding where your source of truth is and securing that in a common way is still a valuable approach, and you can do it without having to bring all that data into a single bucket so that it's all in one place. And so having done that, investing quite heavily in making that possible has paid dividends in terms of giving wider access to the platform and ensuring that only data that's available under GDPR and other regulations is being used by the data users. >> Yeah. So Justin, we always talk about data democratization, and up until recently there really hasn't been a line of sight as to how to get there. But do you have anything to add to this? Because you're essentially doing analytic queries with data that's all dispersed all over. How are you seeing your customers handle this challenge? >> Yeah. 
I mean, I think data products are a really interesting aspect of the answer to that. They allow you to, again, leverage the data domain owners, the people who know the data the best, to create data as a product ultimately to be consumed. And we try to represent that in our product as effectively an almost e-commerce-like experience, where you go and discover and look for the data products that have been created in your organization, and then you can start to consume them as you'd like. And so we're really trying to build on that notion of data democratization and self-service, and making it very easy to discover and start to use with whatever BI tool you may like, or even just running SQL queries yourself. >> Okay, guys, grab a sip of water. After this short break, we'll be back to debate whether proprietary or open platforms are the best path to the future of data excellence. Keep it right there. >> Your company has more data than ever, and more people trying to understand it. But there's a problem. Your data is stored across multiple systems. It's hard to access, and that delays analytics and ultimately decisions. The old method of moving all of your data into a single source of truth is slow and definitely not built for the volume of data we have today or where we are headed. While your data engineers spend over half their time moving data, your analysts and data scientists are left waiting, feeling frustrated, unproductive, and unable to move the needle for your business. But what if you could spend less time moving or copying data? What if your data consumers could analyze all your data quickly? >> Starburst helps your teams run fast queries on any data source. We help you create a single point of access to your data, no matter where it's stored. 
And we support high concurrency. We solve for speed and scale. Whether it's fast SQL queries on your data lake or faster queries across multiple data sets, Starburst helps your teams run analytics anywhere. You can't afford to wait for data to be available. Your team has questions that need answers now. With Starburst, the wait is over. You'll have faster access to data with enterprise-level security, easy connectivity, and 24/7 support from experts. Organizations like Zalando, Comcast, and FINRA rely on Starburst to move their businesses forward. Contact our Trino experts to get started. >> We're back with Justin Borgman of Starburst and Richard Jarvis of EMIS Health. Okay, we're going to get to lie number two, and that is this: an open source based platform cannot give you the performance and control that you can get with a proprietary system. Is that a lie? Justin, the enterprise data warehouse has been pretty dominant and has evolved and matured. Its stack has matured over the years. Why is it not the default platform for data? >> Yeah, well, I think that's become a lie over time. So I think if we go back 10 or 12 years ago, with the advent of the first data lakes really around Hadoop, it probably was true that you couldn't get the performance that you needed to run fast, interactive SQL queries in a data lake. Now a lot's changed in 10 or 12 years. I remember in the very early days, people would say you'll never get performance because you need to be columnar. You need to store data in a columnar format. And then columnar formats were introduced to data lakes. You have Parquet, ORC, and Avro that were created to ultimately deliver performance out of that. So, okay, we got largely over the performance hurdle. More recently, people will say, well, you don't have the ability to do updates and deletes like a traditional data warehouse. 
And now we've got the creation of new data formats, again, like Iceberg and Delta and Hudi, that do allow for updates and deletes. So I think the data lake has continued to mature. And I remember a quote from Curt Monash many years ago where he said it takes six or seven years to build a functional database. I think that's right. And now we've had almost a decade go by. So these technologies have matured to really deliver very, very close to the same level of performance and functionality as cloud data warehouses. So I think the reality is that's become a lie, and now we have giant hyperscale internet companies that don't have the traditional data warehouse at all. They do all of their analytics in a data lake. So I think we've proven that it's very much possible today. >> Thank you for that. And so Richard, talk about your perspective as a practitioner in terms of what open brings you versus closed. I mean, look, open is a moving target. I remember Unix used to be open systems, and so it is an evolving spectrum. But from your perspective, what does open give you that you can't get from a proprietary system, or what are you fearful of in a proprietary system? >> I suppose for me, open buys us the ability to be unsure about the future, because one thing that's always true about technology is it evolves in a direction slightly different to what people expect. And what you don't want is to end up having backed yourself into a corner that then prevents you from innovating. So if you have chosen a technology and you've stored trillions of records in that technology, and suddenly a new way of processing or machine learning comes out, you want to be able to take advantage, and your competitive edge might depend upon it. And so I suppose for us, we acknowledge that we don't have perfect vision of what the future might be. 
And so by backing open storage technologies, we can apply a number of different technologies to the processing of that data. And that gives us the ability to remain relevant and innovate on our data storage. And we have bought our way out of any performance concerns, because we can use cloud-scale infrastructure to scale up and scale down as we need. And so we don't have the concern that we don't have enough hardware today to process what we want to achieve. We can just scale up when we need it and scale back down. So open source has really allowed us to stay at the cutting edge. >> So Justin, let me play devil's advocate here a little bit. I've talked to Zhamak about this, and obviously her vision is that the data mesh is open source, open source tooling, and it's not proprietary. You're not going to buy a data mesh. You're going to build it with open source tooling, and vendors like you are going to support it. But to come back to today, you can get to market with a proprietary solution faster. I'm going to make that statement; you tell me if it's a lie. And then you can say, okay, we support Apache Iceberg, we're going to support open source tooling. Take a company like VMware, not really in the data business, but look at the way they embraced Kubernetes, and every new open source thing that comes along, they say, "We do that too." Why can't proprietary systems do that and be as effective? >> Yeah, well, I think at least within the data landscape, saying that you can access open data formats like Iceberg or others is a bit disingenuous, because really what you're selling to your customer is a certain degree of performance, a certain SLA, and those cloud data warehouses that can reach beyond their own proprietary storage drop all the performance that they were able to provide. 
So it reminds me, again, of going back 10 or 12 years ago, when everybody had a connector to Hadoop and they thought that was the solution, right? But the reality was a connector was not the same as running workloads in Hadoop back then. And I think similarly, being able to connect to an external table that lives in an open data format, you're not going to give it the performance that your customers are accustomed to. And at the end of the day, they're always going to be predisposed, they're always going to be incentivized, to get that data ingested into the data warehouse, because that's where they have control. And the bottom line is the database industry has really been built around vendor lock-in from the start. I mean, how many people love Oracle today but are customers nonetheless? I think lock-in is part of this industry, and I think that's really what we're trying to change with open data formats. >> Well, that's interesting. It reminds me of when I see the teaser gas price. I drive up and then I say, oh, that's the cash price; with a credit card I've got to pay 20 cents more. But okay. So the argument then, let me come back to you, Justin: what's wrong with saying, "Hey, we support open data formats, but you're going to get better performance if you keep it in our closed system"? Are you saying that long term that's going to come back and bite you? Because you mentioned Oracle, you mentioned Teradata. By implication, you're saying that's where Snowflake customers are headed. >> Yeah, absolutely. I think this is a movie that we've all seen before, at least those of us who've been in the industry long enough to see this movie play a couple of times. So I do think that's the future. And I loved what Richard said. I actually wrote it down.
Because I thought it was an amazing quote. He said it buys us the ability to be unsure of the future. That pretty much says it all. The future is unknowable, and the reality is, using open data formats, you remain interoperable with any technology you want to utilize. If you want to use Spark to train a machine learning model and you want to use Starburst to query via SQL, that's totally cool. They can both work off the same exact data sets. By contrast, if you're focused on a proprietary model, then you're kind of locked in again to that model. I think the same applies to data sharing, to data products, to a wide variety of aspects of the data landscape: a proprietary approach kind of closes you in and locks you in. >> So I would say this, Richard, and I'd love to get your thoughts on it, because I talk to a lot of Oracle customers, not as many Teradata customers, but a lot of Oracle customers, and they'll admit, "Yeah, they're jamming us on price and the license cost, but we do get value out of it." And so my question to you, Richard, is: do the, let's call them data warehouse systems, or the proprietary systems, deliver a greater ROI sooner? And is that the allure that customers are attracted to, or can open platforms deliver ROI as fast? >> I think the answer to that is it can depend a bit. It depends on your business's skill set. So we are lucky that we have a number of proprietary teams that work in databases that provide our operational data capability. And we have teams of analytics and big data experts who can work with open data sets and open data formats. And so those different teams can get to an ROI more quickly with different technologies. For the business, though, we can't do better for our operational data stores than proprietary databases. Today we can back very tight SLAs with them.
We can demonstrate reliability from millions of hours of those databases being run at enterprise scale. But for an analytics workload, where increasingly our business is growing in that direction, we can't do better than open data formats with cloud-based data mesh type technologies. And so it's not a simple answer, that one will always be the right answer for our business. We definitely have times when proprietary databases provide a capability that we couldn't easily replicate with open technologies. >> Yeah. Richard, stay with you. You mentioned some things before that strike me. The Databricks-Snowflake thing is a lot of fun for analysts like me. Richard, you mentioned you have a lot of rock star data engineers. You've got Databricks coming at it from a data engineering heritage, and you've got Snowflake coming at it from an analytics heritage. Those two worlds are colliding. People like Sanjeev Mohan have said, you know what, I think it's actually harder to play in data engineering; i.e., it's easier for the data engineering world to go into the analytics world than the reverse. But thinking about up-and-coming engineers and developers preparing for this future of data engineering and data analytics, how should they be thinking about the future? What's your advice to those young people? >> So I think I'd probably fall back on general programming skill sets. The advice that I saw years ago was that if you have open source technologies, the Pythons and Javas, on your CV, you command a 20% pay hike over people who can only do proprietary programming languages. And I think that's true of data technologies as well. And from a business point of view, that makes sense. I'd rather spend the money that I save on proprietary licenses on better engineers, because they can provide more value to the business and innovate us beyond our competitors.
So I think my advice to people who are starting here, or trying to build teams to capitalize on data assets, is begin with open-license, free capabilities, because they're very cheap to experiment with. And they generate a lot of interest from people who want to join you as a business. And you can make them very successful early doors with your analytics journey. >>It's interesting. Again, analysts like myself, we do a lot of TCO work and have over the last 20-plus years. And in the world of Oracle, you know, normally it's the staff that's the biggest nut in total cost of ownership. Not in Oracle. It's the license cost that is by far the biggest component in the blame pie. All right, Justin, help us close out this segment. We've been talking about this sort of data mesh, open, closed, Snowflake, Databricks. Where does Starburst, sort of as this engine for the data lake, the data lakehouse, the data warehouse, fit in this world? >>Yeah. So our view on how the future ultimately unfolds is we think that data lakes will be a natural center of gravity for a lot of the reasons that we described: open data formats, lowest total cost of ownership, because you get to choose the cheapest storage available to you. Maybe that's S3 or Azure Data Lake Storage, or Google Cloud Storage, or maybe it's on-prem object storage that you bought at a really good price. So ultimately storing a lot of data in a data lake makes a lot of sense. But I think what makes our perspective unique is we still don't think you're gonna get everything there either. We think that basically centralization of all your data assets is just an impossible endeavor. And so you wanna be able to access data that lives outside of the lake as well.
So we kind of think of the lake as maybe the biggest place by volume in terms of how much data you have, but to have comprehensive analytics and to truly understand your business and understand it holistically, you need to be able to go access other data sources as well. And so that's the role that we wanna play: to be a single point of access for our customers, provide the right level of fine-grained access controls so that the right people have access to the right data, and ultimately make it easy to discover and consume via, you know, the creation of data products as well. >>Great. Okay. Thanks, guys. Right after this quick break, we're gonna be back to debate whether the cloud data model that we see emerging and the so-called modern data stack is really modern, or is it the same wine in a new bottle when it comes to data architectures? You're watching theCUBE, the leader in enterprise and emerging tech coverage. >>Your data is capable of producing incredible results, but data consumers are often left in the dark without fast access to the data they need. Starburst makes your data visible from wherever it lives. Your company is acquiring more data in more places, more rapidly than ever. To rely solely on a data centralization strategy, whether it's in a lake or a warehouse, is unrealistic. A single source of truth approach is no longer viable, but disconnected data silos are often left untapped. We need a new approach. One that embraces distributed data.
One that enables fast and secure access to any of your data from anywhere. With Starburst, you'll have the fastest query engine for the data lake that allows you to connect and analyze your disparate data sources no matter where they live. Starburst provides the foundational technology required for you to build towards the vision of a decentralized data mesh. Starburst Enterprise and Starburst Galaxy offer enterprise-ready connectivity, interoperability, and security features for multiple regions, multiple clouds, and ever-changing global regulatory requirements. The data is yours. And with Starburst, you can perform analytics anywhere in light of your world. >>Okay. We're back with Justin Borgman, CEO of Starburst. Richard Jarvis is the CTO of EMIS Health, and Teresa Tung is the cloud first technologist from Accenture. We're on lie number three. And that is the claim that today's "modern data stack" is actually modern. So I guess that's the lie. Or, is that it's not modern. Justin, what do you say? >>Yeah. I mean, I think new isn't modern, right? I think it's the new data stack. It's the cloud data stack, but that doesn't necessarily mean it's modern. I think a lot of the components actually are exactly the same as what we've had for 40 years. Rather than Teradata, you have Snowflake. Rather than Informatica, you have Fivetran. So it's the same general stack, just, you know, a cloud version of it. And I think a lot of the challenges that have plagued us for 40 years still maintain. >>So let me come back to you, Justin. Okay, but there are differences, right? I mean, you can scale. You can throw resources at the problem. You can separate compute from storage. You really, you know, there's a lot of money being thrown at that by venture capitalists, and Snowflake, you mentioned, its competitors. So that's different, is it not? Is that not at least an aspect of modern: dial it up, dial it down? So what do you say to that?
>>Well, it is. It's certainly taking, you know, what the cloud offers and taking advantage of that. But it's important to note that the cloud data warehouses out there are really just separating their compute from their storage. So it's allowing them to scale up and down, but your data's still stored in a proprietary format. You're still locked in. You still have to ingest the data to get it even prepared for analysis. So a lot of the same sort of structural constraints that exist with the old enterprise data warehouse model on-prem still exist. Just, yes, a little bit more elastic now because the cloud offers that. >>So Teresa, let me go to you, 'cause you have cloud-first in your title. So what say you to this conversation? >>Well, even the cloud providers are looking towards more of a cloud continuum, right? So the centralized cloud as we know it, maybe data lake, data warehouse in the central place, that's not even how the cloud providers are looking at it. They have new query services. Every provider has one that really expands those queries to be beyond a single location. And if we look at a lot of where the future goes, right, that's gonna very much follow the same thing. There's gonna be more edge. There's gonna be more on-premise because of data sovereignty, data gravity, because you're working with different parts of the business that have already made major cloud investments in different cloud providers. Right? So there's a lot of reasons why the modern, I guess, the next modern generation of the data stack needs to be much more federated. >>Okay. So Richard, how do you deal with this? You've obviously got, you know, the technical debt, the existing infrastructure. It's on the books. You don't wanna just throw it out. A lot of conversation about modernizing applications, which a lot of times is, you know, a microservices layer on top of legacy apps. How do you think about the modern data stack?
>>Well, I think probably the first thing to say is that the stack really has to include the processes and people around the data as well. It's all well and good changing the technology, but if you don't modernize how people use that technology, then you're not going to be able to scale, because just 'cause you can scale CPU and storage doesn't mean you can get more people to use your data to generate more value for the business. And so what we've been looking at is really changing, very much aligned to data products and data mesh: how do you enable more people to consume the service and have the stack respond in a way that keeps costs low? Because that's important for our customers consuming this data, but also allows people to occasionally run enormous queries and then tick along with smaller ones when required. And it's a good job we did, because during COVID all of a sudden we had enormous pressures on our data platform to answer really important, life-threatening queries. And if we couldn't scale both our data stack and our teams, we wouldn't have been able to answer those as quickly as we had. So I think the stack needs to support a scalable business, not just the technology itself. >>Well, thank you for that. So Justin, let's try to break down what the critical aspects are of the modern data stack. So you think about the past, you know, five, seven years. Cloud obviously has given a different pricing model, de-risked experimentation, you know, that we talked about, the ability to scale up, scale down. But what I'm taking away is that that's not enough. Based on what Richard just said, the modern data stack has to serve the business and enable the business to build data products. I buy that. I'm a big fan of the data mesh concepts, even though we're early days. So what are the critical aspects? If you had to think about, you know, maybe putting some guardrails and definitions around the modern data stack, what does that look like?
What are some of the attributes and principles there? >>Of how it should look, or how... >>Yeah. What it should be. >>Yeah. Well, I think, you know, Teresa mentioned this in a previous segment, about how the data warehouse is not necessarily going to disappear. It just becomes one node, one element of the overall data mesh. And I certainly agree with that. So by no means are we suggesting that, you know, Snowflake or Redshift or whatever cloud data warehouse you may be using is going to disappear, but it's not going to become the end-all be-all. It's not the central single source of truth. And I think that's the paradigm shift that needs to occur. And I think it's also worth noting that those who were the early adopters of the modern data stack were primarily digital-native, born-in-the-cloud young companies who had the benefit of idealism. They had the benefit of starting with a clean slate. That does not reflect the vast majority of enterprises. >>And even those companies, as they grow up, mature out of that ideal state. They go buy a business. Now they've got something on another cloud provider that has a different data stack, and they have to deal with that heterogeneity. That is just change, and change is a part of life. And so I think there is an element here that is almost philosophical. It's like, do you believe in an absolute ideal where I can just fit everything into one place, or do I believe in reality? And I think the far more pragmatic approach is really what data mesh represents. So to answer your question directly, I think it's adding, you know, the ability to access data that lives outside of the data warehouse, maybe living in open data formats in a data lake, or accessing operational systems as well. Maybe you want to directly access data that lives in an Oracle database or a Mongo database or what have you.
So creating that flexibility to really future-proof yourself from the inevitable change that you will encounter over time. >>So thank you. So, based on what Justin just said, my takeaway there is it's inclusive. Whether it's a data mart, data hub, data lake, data warehouse, it's just a node on the mesh. Okay. I get that. Does that include on-prem data? Obviously it has to. What are you seeing in terms of the ability to take that data mesh concept on-prem? I mean, most implementations I've seen in data mesh, frankly, really aren't, you know, adhering to the philosophy. Maybe it's a data lake and maybe it's using Glue. You look at what JPMC is doing, HelloFresh, a lot of stuff happening on the AWS cloud in that, you know, closed stack, if you will. What's the answer to that, Teresa? >>I mean, I think it's a killer case for data mesh, the fact that you have valuable data sources on-prem, and then yet you still wanna modernize and take the best of cloud. Cloud is still, like we mentioned, there's a lot of great reasons for it around the economics and the ability to tap into the innovation that the cloud providers are giving around data and AI architecture. It's an easy button. So the mesh allows you to have the best of both worlds. You can start using the data products on-prem or in the existing systems that are working already. It's meaningful for the business. At the same time, you can modernize the ones that make business sense, because it needs better performance, it needs, you know, something that is cheaper, or maybe just to tap into better analytics to get better insights, right? So you're gonna be able to stretch and really have the best of both worlds. That, again, going back to Richard's point, is meaningful to the business. Not everything has to have that one-size-fits-all set of tools. >>Okay. Thank you.
So Richard, you know, talking about data as product, I wonder if you could give us your perspectives here. What are the advantages of treating data as a product? What role do data products have in the modern data stack? We talk about monetizing data. What are your thoughts on data products? >>So for us, one of the most important data products that we've been creating is taking data that is healthcare data across a wide variety of different settings, so information about patients' demographics, about their treatment, about their medications and so on, and taking that into a standard format that can be utilized by a wide variety of different researchers. Because misinterpreting that data, or having the data not presented in the way that the user is expecting, means that you generate the wrong insight. And in any business, that's clearly not a desirable outcome. But when that insight is so critical, as it might be in healthcare or some security settings, you really have to have gone to the trouble of understanding the data, presenting it in a format that everyone can clearly agree on, and then letting people consume it in a very structured, managed way, even if that data comes from a variety of different sources in the first place. And so our data product journey has really begun by standardizing data across a number of different silos through the data mesh, so we can present it out both internally and, through the right governance, externally to researchers. >>So that data product, through whatever APIs, is accessible, it's discoverable, but it's obviously gotta be governed as well. You mentioned you appropriately provide it internally, yeah, but also, you know, to external folks as well.
So you've architected that capability today? >>We have. And because the data is standard, it can generate value much more quickly, and we can be sure of the security and value that that's providing, because the data product isn't just about formatting the data into the correct tables. It's understanding what it means to redact the data, or to remove certain rows from it, or to interpret what a date actually means. Is it the start of the contract, or the start of the treatment, or the date of birth of a patient? These things can be lost in the data storage without having the proper product management around the data to say, in a very clear business context, what does this data mean? And what does it mean to process this data for a particular use case? >>Yeah, it makes sense. It's got the context. If the domains own the data, you gotta cut through a lot of the centralized teams, the technical teams that are data agnostic. They don't really have that context. All right. Let's end. Justin, how does Starburst fit into this modern data stack? Bring us home. >>Yeah. So I think for us, it's really providing our customers with, you know, the flexibility to operate and analyze data that lives in a wide variety of different systems, ultimately giving them that optionality. You know, optionality provides the ability to reduce costs, to store more in a data lake rather than a data warehouse. It provides the ability for the fastest time to insight, to access the data directly where it lives. And ultimately, with this concept of data products that we've now, you know, incorporated into our offering as well, you can really create and curate, you know, data as a product to be shared and consumed. So we're trying to help enable the data mesh, you know, model and make that an appropriate complement to, you know, the modern data stack that people have today. >>Excellent.
Hey, I wanna thank Justin, Teresa, and Richard for joining us today. You guys are great. I'm a big believer in the data mesh concept, and I think, you know, we're seeing the future of data architecture. So thank you. Now, remember, all these conversations are gonna be available on thecube.net for on-demand viewing. You can also go to starburst.io. They have some great content on the website, and they host some really thought-provoking interviews, and they have awesome resources, lots of data mesh conversations over there, and really good stuff in the resource section. So check that out. Thanks for watching "The Data Doesn't Lie, or Does It?", made possible by Starburst Data. This is Dave Vellante for theCUBE, and we'll see you next time. >>The explosion of data sources has forced organizations to modernize their systems and architecture and come to terms with the fact that one size does not fit all for data management. Today, your teams are constantly moving and copying data, which requires time, management, and in some cases, double paying for compute resources. Instead, what if you could access all your data anywhere, using the BI tools and SQL skills your users already have? And what if this also included enterprise security and fast performance? With Starburst Enterprise, you can provide your data consumers with a single point of secure access to all of your data, no matter where it lives. With features like strict, fine-grained access control, end-to-end data encryption, and data masking, Starburst meets the security standards of the largest companies. Starburst Enterprise can easily be deployed anywhere and managed with Insights, where data teams holistically view their clusters' operation and query execution, so they can reach meaningful business decisions faster. All this with the support of the largest team of Trino experts in the world, delivering fully tested, stable releases and available to support you 24/7. To unlock the value in all of your data,
you need a solution that easily fits with what you have today and can adapt to your architecture tomorrow. Starburst Enterprise gives you the fastest path from big data to better decisions, 'cause your team can't afford to wait. Trino was created to empower analytics anywhere, and Starburst Enterprise was created to give you the enterprise-grade performance, connectivity, security, management, and support your company needs. Organizations like Zalando, Comcast, and FINRA rely on Starburst to move their businesses forward. Contact us to get started.

Published Date : Aug 20 2022

Starburst Panel Q1

>>In 2011, early Facebook employee and Cloudera co-founder Jeff Hammerbacher famously said, "The best minds of my generation are thinking about how to get people to click on ads. And that sucks." Let's face it: more than a decade later, organizations continue to be frustrated with how difficult it is to get value from data and build a truly agile, data-driven enterprise. What does that even mean, you ask? Well, it means that everyone in the organization has the data they need when they need it, in a context that's relevant to advance the mission of an organization. Now that could mean cutting costs, could mean increasing profits, driving productivity, saving lives, accelerating drug discovery, making better diagnoses, solving supply chain problems, predicting weather disasters, simplifying processes, and thousands of other examples where data can completely transform people's lives beyond manipulating internet users to behave a certain way. We've heard the prognostications about the possibilities of data before, and in fairness we've made progress, but the hard truth is the original promises of master data management, enterprise data warehouses, data marts, data hubs, and yes, even data lakes were broken and left us wanting for more. Welcome to "The Data Doesn't Lie, or Does It?", a series of conversations produced by theCUBE and made possible by Starburst Data. >>I'm your host, Dave Vellante, and joining me today are three industry experts. Justin Borgman is the co-founder and CEO of Starburst. Richard Jarvis is the CTO at EMIS Health, and Teresa Tung is cloud first technologist at Accenture. Today we're gonna have a candid discussion that will expose the unfulfilled and, yes, broken promises of a data past. We'll expose data lies: big lies, little lies, white lies, and hidden truths. And we'll challenge age-old data conventions and bust some data myths. We're debating questions like: is the demise of a single source of truth inevitable?
Will the data warehouse ever have feature parity with the data lake, or vice versa? Is the so-called modern data stack simply centralization in the cloud, AKA the old guard's model in new cloud clothes? How can organizations rethink their data architectures and regimes to realize the true promises of data? Can and will an open ecosystem deliver on these promises in our lifetimes? We're spanning much of the Western world today. Richard is in the UK. Teresa is on the West Coast, and Justin is in Massachusetts with me. I'm in theCUBE studios about 30 miles outside of Boston. Folks, welcome to the program. Thanks for coming on. >>Thanks for having us. >>Let's get right into it. You're very welcome. Now here's the first lie: the most effective data architecture is one that is centralized, with a team of data specialists serving various lines of business. What do you think, Justin? >>Yeah, definitely a lie. My first startup was a company called Hadapt, which was an early SQL engine for Hadoop that was acquired by Teradata. And when I got to Teradata, of course, Teradata is the pioneer of that central enterprise data warehouse model. One of the things that I found fascinating was that not one of their customers had actually lived up to that vision of centralizing all of their data into one place. They all had data silos. They all had data in different systems. They had data on-prem, data in the cloud. You know, those companies were acquiring other companies and inheriting their data architecture. So, you know, despite being the industry leader for 40 years, not one of their customers truly had everything in one place. So I think definitely history has proven that to be a lie. >>So Richard, from a practitioner's point of view, you know, what are your thoughts? I mean, there's a lot of pressure to cut cost, keep things centralized, you know, serve the business as best as possible from that standpoint. What is your experience?
>>Yeah, I mean, I think I would echo Justin's experience, really, that we as a business have grown up through acquisition, through storing data in different places, sometimes to do information governance in different ways, to store data in a platform that's close to data experts, people who really understand healthcare data from pharmacies or from doctors. And so, although if you were starting from a greenfield site and you were building something brand new, you might be able to centralize all the data and all of the tooling and teams in one place, the reality is that businesses just don't grow up like that. And it's just really impossible to get that academic perfection of storing everything in one place. >>You know, Teresa, I feel like Sarbanes-Oxley kinda saved the data warehouse, you know? Right. You actually did have to have a single version of the truth for certain financial data, but really for some of those other use cases I mentioned, I do feel like the industry has kinda let us down. What's your take on this? Where does it make sense to have that sort of centralized approach, versus where does it make sense to maybe decentralize? >>I think you gotta have centralized governance, right? So from the central team, for things like Sarbanes-Oxley, for things like security, for certain very core data sets, having a centralized set of roles and responsibilities to really QA, right, to serve as a design authority for your entire data estate, just like you might with security. But how it's implemented has to be distributed. Otherwise you're not gonna be able to scale, right? So being able to have different parts of the business really make the right data investments for their needs. And then ultimately you're gonna collaborate with your partners. So partners that are not within the company, right, external partners. We're gonna see a lot more data sharing and model creation. And so you're definitely going to be decentralized.
>>So, you know, Justin, you guys last, geez, I think it was about a year ago, had a session on data mesh. It was a great program. You invited Zhamak Dehghani. Of course, she's the creator of the data mesh. And one of her fundamental premises is that you've got this hyper-specialized team that you've gotta go through if you want anything. But at the same time, these individuals actually become a bottleneck, even though they're some of the most talented people in the organization. So I guess a question for you, Richard: how do you deal with that? Do you organize so that there are a few sort of rock stars that, you know, build cubes and the like, or have you had any success in sort of decentralizing with, you know, your constituencies, that data model? >>Yeah. So we absolutely have got rockstar data scientists and data guardians, if you like, people who understand what it means to use this data, particularly as the data that we use at EMIS is very private. It's healthcare information. And some of the rules and regulations around using the data are very complex and strict. So we have to have people who understand the usage of the data, then people who understand how to build models, how to process the data effectively. And you can think of them like consultants to the wider business, because a pharmacist might not understand how to structure a SQL query, but they do understand how they want to process medication information to improve patient lives. And so that becomes a consulting-type experience from a set of rock stars to help a more decentralized business who needs to understand the data and to generate some valuable output. >>Justin, what do you say to a customer or prospect that says, look, Justin, I got a centralized team, and that's the most cost-effective way to serve the business. Otherwise I got duplication. What do you say to that?
>>Well, I would argue it's probably not the most cost-effective, and the reason is really twofold. I think, first of all, when you are deploying an enterprise data warehouse model, the data warehouse itself is very expensive, generally speaking. And so you're putting all of your most valuable data in the hands of one vendor who now has tremendous leverage over you, you know, for many, many years to come. I think that's the story of Oracle or Teradata or other proprietary database systems. But the other aspect, I think, is that the reality is those central data warehouse teams, as much as they are experts in the technology, don't necessarily understand the data itself. And this is one of the core tenets of data mesh that Zhamak writes about, this idea that the domain owners actually know the data the best. And so by, you know, not only acknowledging that data is generally decentralized, and to your earlier point about Sarbanes-Oxley maybe saving the data warehouse, I would argue maybe GDPR and data sovereignty will destroy it, because data has to be decentralized to be compliant with those laws. But I think the reality is, you know, the data mesh model basically says data's decentralized, and we're gonna turn that into an asset rather than a liability. And we're gonna turn that into an asset by empowering the people that know the data the best to participate in the process of, you know, curating and creating data products for consumption. So I think when you think about it that way, you're going to get higher-quality data and faster time to insight, which is ultimately going to drive more revenue for your business and reduce costs. So I think that's the way I see the two models comparing and contrasting. >>So do you think the demise of the data warehouse is inevitable? I mean, you know, Teresa, you work with a lot of clients; they're not just gonna rip and replace their existing infrastructure.
Maybe they're gonna build on top of it, but what does that mean? Does that mean the EDW just becomes, you know, less and less valuable over time, or is it maybe just isolated to specific use cases? What's your take on that? >>Listen, I still would love all my data within a data warehouse. Would love it mastered, would love it owned by a central team, right? I think that's still what I would love to have. That's just not the reality, right? The investment to actually migrate and keep that up to date, I would say it's a losing battle. Like, we've been trying to do it for a long time. Nobody has the budgets, and then data changes, right? There's gonna be a new technology that's gonna emerge that we're gonna wanna tap into. There's gonna be not enough investment to bring all the legacy, but still very useful, systems into that centralized view. So you keep the data warehouse. I think it's a very, very valuable, very high-performance tool for what it's there for. But you could have this, you know, new mesh layer that still takes advantage of the things I mentioned: the data products in the systems that are meaningful today, and the data products that actually might span a number of systems, maybe either the source systems, the domains that know it best, or the consumer-based systems and products that need to be packaged in a way that's really meaningful for that end user, right? Each of those is useful for a different part of the business, and you're making sure that the mesh actually allows you to use all of them. >>So, Richard, let me ask you. Take Zhamak's principles, back to those. You've got, you know, the domain ownership and data as product. Okay, great. Sounds good. But it creates what I would argue are two, you know, challenges. Self-serve infrastructure, let's park that for a second.
And then, in your industry, one of the most highly regulated, most sensitive: computational governance. How do you automate and ensure federated governance in that mesh model that Teresa was just talking about? >>Well, it absolutely depends on some of the tooling, and the processes that you put in place around those tools, to centralize the security and the governance of the data. And I think, although a data warehouse makes that very simple 'cause it's a single tool, it's not impossible with some of the data mesh technologies that are available. And so what we've done at EMIS is we have a single security layer that sits on top of our data mesh, which means that no matter which user is accessing which data source, we go through a well-audited, well-understood security layer. That means that we know exactly who's got access to which data fields, which data tables. And then everything that they do is audited in a very kind of standard way, regardless of the underlying data storage technology. So for me, although storing the data in one place might not be possible, understanding where your source of truth is and securing that in a common way is still a valuable approach, and you can do it without having to bring all that data into a single bucket so that it's all in one place. And so having done that, and investing quite heavily in making that possible, has paid dividends in terms of giving wider access to the platform and ensuring that only data that's available under GDPR and other regulations is being used by the data users. >>Yeah. So Justin, I mean, we always talk about data democratization, and you know, up until recently there really hasn't been line of sight as to how to get there. But do you have anything to add to this? Because you're essentially, you know, doing analytic queries with data that's all dispersed all over. How are you seeing your customers handle this challenge?
>>Yeah, I mean, I think data products is a really interesting aspect of the answer to that. It allows you to, again, leverage the data domain owners, the people who know the data the best, to create, you know, data as a product, ultimately to be consumed. And we try to represent that in our product as effectively an almost e-commerce-like experience, where you go and discover and look for the data products that have been created in your organization. And then you can start to consume them as you'd like. And so we're really trying to build on that notion of, you know, data democratization and self-service, and making it very easy to discover and start to use with whatever BI tool you may like, or even just running, you know, SQL queries yourself. >>Okay, guys, grab a sip of water. After this short break, we'll be back to debate whether proprietary or open platforms are the best path to the future of data excellence. Keep it right there.
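
Editor's illustration: Richard describes a single security layer that every request passes through, so the same access check and the same audit record apply regardless of which underlying store holds the data. The sketch below is one minimal way to picture that idea in Python. It is not EMIS's or Starburst's implementation; the class name, the grant format, and the user, source, table, and column names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SecurityLayer:
    """Hypothetical single checkpoint in front of many data sources."""

    # Assumed policy format: user -> set of (source, table, column) grants.
    grants: dict[str, set[tuple[str, str, str]]]
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, user: str, source: str, table: str,
                  columns: list[str]) -> bool:
        """Allow a read only if every requested column is granted.

        The attempt is written to the audit log whether or not it is
        allowed, and the record has the same shape for every source.
        """
        user_grants = self.grants.get(user, set())
        allowed = bool(columns) and all(
            (source, table, column) in user_grants for column in columns
        )
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "source": source,
            "table": table,
            "columns": list(columns),
            "allowed": allowed,
        })
        return allowed


# Two different (hypothetical) sources, one common check and audit trail.
layer = SecurityLayer(grants={
    "pharmacist_1": {
        ("pharmacy_db", "dispensing", "drug_code"),
        ("pharmacy_db", "dispensing", "quantity"),
    },
})
ok = layer.authorize("pharmacist_1", "pharmacy_db", "dispensing",
                     ["drug_code", "quantity"])
denied = layer.authorize("pharmacist_1", "gp_records", "patients",
                         ["patient_id"])
```

The design point the sketch tries to capture is that the data stays where it is; only the decision about who may read which field, and the record of what they did, is centralized.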

Published Date : Aug 2 2022

Anthony Lye & Jonsi Stefansson, NetApp | AWS re:Invent 2019

>>Live from Las Vegas, it's theCUBE, covering AWS re:Invent 2019. Brought to you by Amazon Web Services and Intel, along with its ecosystem partners. >>Hey, welcome back to theCUBE. Lisa Martin at AWS re:Invent in Vegas, at a very busy Sands Expo Center. Pleased to be joined by my co-host this afternoon, Justin Warren, founder and chief analyst at PivotNine. Justin, we're hosting together again. >>We are. It's great to be here. >>It's great to have you. So Justin and I are pleased to welcome a couple of our CUBE alumni back to the program, a couple of guys from NetApp. We have Anthony Lye, the SVP and GM of the Cloud Business Unit. Welcome back. >>Thank you very much. Great to be here. >>And color-coordinating with Anthony is Jonsi Stefansson, Chief Technology Officer and VP of Cloud. Welcome back. >>Thank you. Thank you. >>Very sharply dressed guys. >>Thank you. Thank you. The good news is that there are no suits anymore, so we're not going to have to wear ties. >>Comfortable. Guys, NetApp and AWS: this event is even bigger than last year, which I can't even believe, with 65,000 or so folks. But Anthony, let's start with you. Talk to us about what's new with the NetApp and AWS partnership, and a little bit about the evolution of it. >>Yeah. I mean, you know, we started on AWS, oh my gosh, it must be almost five or six years ago now, and we made a conscious effort to port our operating system to AWS, which was no small task. It's taken us a few years, but we're really starting to hit our stride now. We've been very successful; we're onboarding customers at an ever-increasing rate. We've added more services, and we just continue to love the cloud as a platform for development. We can go so fast, and we can do things in an environment like AWS that, frankly, you just couldn't do on premise. You know, the complexity and heterogeneity of on premise was always a challenge.
The cloud for us is an amazing platform where we can go very, very fast. >>And from a customer demand standpoint, Jonsi, talk to me about that as chief technologist. One of the interesting things that Andy Jassy shared yesterday, which surprised me: 97% of IT spend is still on prem. So we know that, regardless of the M word, multi-cloud, customers are living in that multi-cloud world. Whether it's by strategy, a lot of it's not; a lot of it's inherited, right? But they have to have that choice, right? It's gonna depend on the data, the workload, et cetera. What can you tell us, when you're talking with customers, about how they are driving NetApp's evolution of its partnership with a public provider like AWS? >>So actually, I don't know if it's the desired state to be running in a hybrid, multi-cloud fashion, but it's driven by strategy, and it's usually driven by specific workloads and finding the best home for your application or your workloads at any given time. Because it's ultimately unrealistic for on-premise customers to try to compete with, like, machine and deep learning algorithms and the rate of development and rate of, basically, evolution in the cloud. So you always have to be there to be able to stay competitive. So it's becoming a part of the strategy, even though it was probably the developers that drove a lot of cloud adoption to begin with, maybe not in favor of the CIO. You had, like, a lot of cloud sprawl, but it's no longer sprawl. It's part of the strategy for every company, in my view. >>We heard from Andy Jassy in the keynote yesterday about transformation being an important thing. And he also highlighted a lot of enterprise. NetApp has a long history with enterprise, yes, a very solid reputation with enterprise. So it feels to me like this is an enterprise show.
Now that the enterprise has really arrived with the cloud, what are you seeing from the customers that you've already had for a long time, who know NetApp, are familiar with it, trust NetApp, and are now exploring the cloud and doing more than just dipping their toe in the water? What are they actually doing with the cloud and NetApp together? >>You know, we see an ever-growing list of workloads. I think when people make decisions in the cloud, they're not making those traditional horizontal decisions anymore. They're making workload-by-workload-by-workload decisions. And I think NetApp's history and performance on premise has given customers peace of mind. Now in the cloud, they sort of know that what's been highly reliable, highly scalable for them on premise, they can now have that same confidence in the cloud. So we started, just like Amazon, off seeing secondary workloads like DR, backup, DevOps, but now we're seeing big primaries go: SAP, big database workloads, e-commerce, a lot of HPC, high-performance compute. We're doing very well in the oil and gas and pharmaceutical industries, where file has been really lacking on the public cloud. I think we leaned in as a company years ago and put a concerted effort into making it there. And I think now the workloads are confident that we're there and we can give them the throughput, the performance, and the protocols. And now we're seeing big, big workloads come over to the public clouds. >>And he did make a big deal about transformation being important. And a lot of that was around the operational model, not just the pure technology. So what about the operating model? How are you seeing enterprises transform? A lot of them have traditionally just taken a workload, done a bit of lift and shift, and put it in the cloud. Where are they now transforming the way they actually operate things because of cloud? >>Absolutely.
I mean, they have to. They have to adopt the new technologies and new ways of doing business. So I mean, I think they are actually accelerating that. To Anthony's point, I think this is not a single partnership. We have a very unique story: we're partnering with all of them and have really deep engineering relationships with all of them. And they are now able to go after enterprise-type workloads that they haven't been able to go after before. So that's why it's such a strategic relationship that we have with all of them. That sort of brings in the freedom of choice. You can basically go anywhere. That, in my opinion, is the true hybrid cloud story. Data has always been really difficult, but with the data management capabilities of NetApp, it's really easy to move, migrate, or replicate across on premise to your hyperscaler of choice. >>I mean, I think, you know, if you're in an enterprise right now, you know, you're a CEO, you're probably scared to death of, like, being Ubered, you know? Exactly. So speed has now become what we say is the new scale. It used to be that scale was your advantage. And now, if you're not fast, you could be killed any day by some of these startups who just build a mobile app, and all of a sudden they've gotten between you and the customer and you've lost. And I think CEOs are now asking: How fast are we going? How many application developers do we have? How many data scientists do we have? And because of that, they're seeing Amazon as a platform for speed. And so that's just that paranoia. I think digital transformation is driving everybody to the cloud. >>You're right. If we look at the transformation of a business, Andy Jassy and John Furrier talked about this in that exclusive interview that they did the other day.
And Andy said, if you're a legacy enterprise and you're looking at your existing market-share segment, and you're not thinking there's somebody else sitting there in the side mirror, well, objects in mirror are closer than they appear. If you're not getting ready for that, you're going to be on the wrong side of that equation. But if we look at cloud, it has had an impact on traditional storage. One of NetApp's taglines is "data driven." If we look at transformation, and even at the transformation of cloud in and of itself, data is at the heart of everything. So talk to us about NetApp's transformation, as cloud is something that you're enabling on prem, hybrid, and multi-cloud, as you talked about. How is your advantage allowing customers to not only be data driven, but to find value in that data that gives them the differentiation they need against the guy or girl that's right behind them, ready to take over? >>Well, I think, you know, if you're an enterprise, the one asset you have is data. You have history now. >>A liability now, as well as an asset. >>Can they do anything with it? Do they know where it is? Do they know how to use it, where it should be, you know? Is it secured? Is it protected? All of those things. It's very hard for enterprises to answer those questions. What NetApp, I think, has done incredibly well, by leaning in as much as we did onto AWS, is give our customers the absolute choice to leave our on-premise business, and a lot of people, I think, years ago thought we were crazy. But because we've now expanded our footprint to allow customers to run anywhere without any fear of lock-in, people will start to see us not as a storage vendor but as a strategic partner, and that strategic partnership has really come about because of our willingness to let people move the data and manage the data wherever they needed it to be.
And that's something our customers have said: you know, you used to be a storage vendor along with the other storage vendors, and now all of a sudden we're having conversations with you about strategy, where the data should be, you know, who's using it, is it secured. All of those kinds of conversations we're having with customers. >>You mentioned moving data, and that was something that again came up in the keynote yesterday. Andy mentioned that, hey, maybe instead of taking the data to the compute, we should bring the compute to the data. That's something that NetApp has actually long talked about. I remember when you used to mention data fabric; it was something about, we want to take your data and then make it available to where the compute is. I'd like you to talk us through that, particularly in light of AI and ML, which is on the tip of everybody's tongue. I think it's possibly reaching the peak of the hype cycle at the moment. So what are customers actually doing with their data to actually analyze it? Are they actually seeing real value from machine learning and AI, or are we still just kicking the tires on that? >>I mean, the biggest problem with deep learning and machine learning is accumulating enough data, and being able to lessen that data gravity by being able to move it. Then you can take advantage of SageMaker in AWS, BigQuery in Google, whatever fits your needs. And then, if you want to store the results back on premise, that's what we enable with the data fabric. Having that free-flowing workload migration has to account for data. It's not enough to just move your application. That's the key for machine learning, data lakes, and others. >>Absolutely. In terms of speed, Anthony mentioned that that's the new scale.
How is flash changing the game? >>From our perspective, you know, flash is a media type, but, you know, the prices have come down now, so that on price-performance flash is an obvious thing. And a lot of people are, I think, now making on-premise decisions to get rid of spinning disk and replace it with flash because the ROI is so good: TCO, the mean time between failures. There are so many advantages that, for a lot of workloads, it's a better decision. Of course, you know, AWS provides a whole bunch of media, and again, you're like a kid in a candy store. You know, as a developer, you look at Amazon and you're like, oh my God. Back in the day, we had to make, like, an Oracle decision, and everything was Oracle. And now you can just move things around and take advantage of all sorts of different utilities, and you piece together an application very differently. And so you're able to really think. To Jonsi's point, people are telling us they have to have a data strategy, and then, based on the data strategy, they will leverage the right storage with the right protocols. They'll then bring that to compute, whatever compute is necessary. I think data science is, you know, a little fashion-conscious right now. You know, everybody wants to say how many data scientists they have on their teams. They're looking for needles in haystacks. Some of them are finding them; some of them are not. I think it makes companies very, very nervous, so they're trying as hard as they can to leverage that technology. >>And Jonsi, where is that data strategy conversation happening? If we think about the four essentials that Andy Jassy talked about yesterday for transformation, one of the first things he said was that it has to be a top-down, senior-level decision. Then it's going to be aggressively pushed down through the organization.
Are you seeing this data strategy at the CEO level yet? >>Yeah, we are. But I'm also seeing it much lower, I mean, with the data engineers, with the developers, because it is extremely important to be developing on top of production data, specifically if you're doing machine and deep learning. So I think it's both. I think the decision authority has actually moved lower in the company, where the developers or the site reliability engineers are actually choosing more of the technology to use that fits the product they are actually creating. Of course, the strategy happens at the top, but the influencers and the decision makers, in my opinion, have been moving lower within the organization. So I'm basically contradicting what Jassy said, but to me that is also important. The days of a CTO or CEO forcing a specific platform or strategy onto developers, those days are hopefully gone. >>I think if you're a CEO, you know, of any company in any industry, you have to be a tech company. You know, it used to be a tech industry, and now every company in the world is tech. Everyone's building apps. Everyone's using data. Everybody's, you know, trying to figure out machine learning. And so I think what's happening is CEOs are increasingly becoming technically literate. They have to. Exactly. They're dead if they're not. I mean, you know, if you're an insurance company, your primary platform is now digital; if you're a medical company, your primary platform is digital. So I think that's a great stat. I saw that about two and a half years ago, the number of software engineering jobs in non-tech surpassed the number of jobs in tech. So we used to have our little industry, and all the software engineers came to work for tech companies. Now there are more jobs outside the tech segment for engineers than there are in tech. >>Well, and you brought up Uber a minute ago, and I think of a couple of company examples, and my last question for you is real
rapid, and it's about industries. You look at Uber, for example, and the fact that the taxi cab companies were traditional and weren't really eager to, you know, appify their organizations and meet the consumer demand. And then you look at Airbnb and how that's revolutionized hospitality, or Peloton and how it's revolutionized fitness. Last question, Jonsi, let's go to you. Looking at all of the transformation that cloud has enabled and can enable, you mentioned oil and gas, but is there any industry that you see right now that is just at the tipping point, able to blow the door wide open if they transform successfully? >>Well, I mean, we are working with a lot of pharma companies and genome sequencing companies that are actually working with sensitive data. I mean, these are people's medical histories and everything, and we're seeing them moving now into the cloud. If those companies can move to the cloud, anybody can move to the cloud. I mean, this sort of compliance scaremongering, that you cannot move to the cloud because of PCI or HIPAA, those days are over, because with AWS, Microsoft, and Google, that's the first thing they do. They have stricter compliance than most on-premise, homemade data centers. So I see that industry really moving into the cloud now. >>Who knows what AWS re:Invent 2020 will look like. Gentlemen, I wish we had more time, but thank you both, Jonsi and Anthony, for talking with Justin and me today, sharing what's new with NetApp and what you guys are enabling customers to do in multi-cloud. We appreciate your time. For my co-host, Justin Warren, I'm Lisa Martin. You're watching theCUBE from AWS re:Invent '19 from Vegas. Thanks for watching.

Published Date : Dec 4 2019

Ravi Pendekanti, Dell EMC & Glenn Gainor, Sony Innovation Studios | Dell Technologies World 2019

>> Live from Las Vegas, it's theCUBE, covering Dell Technologies World 2019. Brought to you by Dell Technologies and its ecosystem partners. >> Welcome back to Las Vegas. Lisa Martin with John Furrier. You're watching theCUBE live at Dell Technologies World 2019. This is our second full day of double CUBE set coverage. We've got a really cool conversation coming up for you. We've got Ravi Pendekanti, one of our alumni, back on theCUBE, as VP of product management, server solutions. Ravi, welcome back. >> Thank you, Lisa. Much appreciated. >> And you brought some Hollywood. >> Yes. >> Glenn Gainor, president of Sony Innovation Studios. Glenn, welcome to theCUBE. >> Thank you very much. It's great to be here. >> So, I love this intersection of Hollywood and technology. But you're a filmmaker. >> Yeah. I have been making movies for many years. I started off making motion pictures, executive produced them, and oversaw production for them at one of our movie labels called Screen Gems, which is part of Sony Pictures. >> We've seen a tremendous amount of evolution of the creative process being really fueled by technology, and vice versa. Sony Innovation Studios is not quite one year old. This is a really exciting venture. Tell us about that, and what the impetus was to start this company. >> You know, the genesis for it was based out of necessity, because, you know, I love making movies, been doing it for a long time. And the challenge of making good pictures is resources. You never get enough money, believe it or not, and you never get enough time. That's everybody's issue, particularly time management. And I thought, well, you know, we've got a pretty good technology company behind us. What if we looked inward towards technology to help us find solutions?
And so Innovation Studios was born out of that idea, and what was exciting about it was to know that we had invited partners to the game, right here with Dell, so that we could make movies and television shows and commercials and even enterprise solutions, leaning into state-of-the-art and cutting-edge technology. >> And what's some of the work product that you guys envision coming out of this mission? You mentioned commercials, TV. Is it going to be like an artist studio, actors, actresses and all? Take us through what this is going to look like. How does it get built out? >> I lean into my career as a producer to answer that one and say it's going to enable. That's one of the greatest things about being a producer: enabling stories, inspiring ideas to be greenlit that may not have been able to be done before. And there's a key reason why we can do that, because one of our key technologies is what we call volumetric image acquisition. That's a lot of words. You'd probably say, what the heck is that? But volumetric image acquisition is our ability to capture the real world, this analog world, and digitize it, bring it into our servers using the power of Dell, and then live in that new environment, which is now a virtual set. And that virtual set is made out of billions and trillions and quadrillions of points, much like the matter around us. And that's a difference, because many people use pixels, which are an interpretation of light. We're using points, which are representative of the world around us, so it's a whole revolutionary way of looking at it. But what it allows us to do is actually film in it, in a 30K moving volume. >> It's like a monster green screen for the world, in a way. >> In a way, you're interacting around it, because you have parallax. So these cameras could be photographing us, and for all you know, we may not be here. We could be at Stage 7 at Innovation Studios and not physically here, but you couldn't tell the difference. 
>> This is like cloud computing. In the tech world, you don't have to provision these resources, you just get what you want. This is Hollywood looking at the artistry, enabling faster, more agile storytelling. You don't need to go set up a town and go get the permit. All the heavy lifting, you're shooting in this new digital realm. >> That's right. Exactly. Now, I love going on location, and there's a lot to celebrate about going on location, but we can't always get to that location. Think of all the locations that we want to be in that are >> off limits. Space, the moon. >> I haven't been, but on set I've walked on virtual moons and I've walked on set moons. But what if we did a volumetric image acquisition of someone's set of the moon? Now we have that, and then we can walk around it. Or what if there's a great club, a nightclub, that says, "Guys, we want you to shoot here, but we have performances Monday night, Tuesday night, Wednesday night." You know, they have a job. What if we grabbed that, image-acquired it, and then you could be there anytime you want? >> Ravi, we could go for an hour here. This is just a great topic. >> I completely agree with you. >> theCUBE, you could sponsor a CUBE in this new world. We could run theCUBE 24/7. >> That's absolutely right. And we don't even have... >> Talk about the relationship with Dell, with Dell Technologies, because you're enabling new capabilities, a new kind of artistry. It's just totally cool. I want to get back to that in a second. But you guys are involved. What's your role? How do you get involved? Tell the story about your >> John, I mean, first and foremost, one of the things that Glenn didn't mention is he's actually got about 50 movies to his credit. So the guy actually knows this stuff, which is absolutely fantastic. So we said, how do you go take that to the next level? 
So what else is better than trying to work something out wherein, together, between what Glenn and his team do at Sony Innovation Labs, or Studios, sorry, and what we at Dell Technologies can do, we try and actually stretch the boundaries of our technology to the next extent? When he talks about a kazillion bytes of data, right, one followed by however many zeros, we have to be able to process the data quickly. We have to be able to go out and do the rendering. We probably have to go out and do whatever is needed to make a high-quality movie, and that, I think, in a way, is actually giving us an opportunity to go back and test the boundaries of the technology we're building. We believe this is the first of its kind in the media industry. If we can learn together from this experience, we can actually go ahead and do other things in other industries too, maybe. And we were just talking about how we could also take this further. He's got his labs here in Los Angeles. We're thinking maybe one of the next things we do, based on the learnings we get, is take it to other parts of the world. And if we are successful, we might even take it to other industries. What if we could go do something to help in the field of medicine? >> I was just thinking that, right? Yes. >> Think about it, Lisa, John. I mean, it's phenomenal. I mean, this is something Michael always talks about: how do we as Dell Technologies help in the progress of humankind? And if this is something that we can learn from, I think it's going to be phenomenal. >> I think that's so interesting. Not only is that a good angle for Dell Technologies, the thing that strikes me is the access to artistry, voices, new voices that may be missed in the vetting process the old way. But, you know, you've got to know where we're going. 
You know, in venture capital, we've seen this with the democratization of seed labs and incubators, where, if you can create access for the storytellers and the artists, we're going to have more exposure to people we might have missed. But also as things change, whether it's ray tracing and streaming, as we saw on the gaming side, or volumetric things, you're going to have a better canvas, more paintbrushes on the creative side, and more artists. Is that the mission, to get access, to get those artists in there? Is that part of the core mission? Because you're going to be essentially incubating new opportunities really fast. >> It's very important to me personally. I know it speaks to the values of both Sony and Dell. I like to call it the democratization of storytelling. You know, I've been very blessed, again, as a Hollywood producer, and we maybe curate a certain kind of movie, a certain kind of experience. But there are so many voices around the world that need to be heard, and there are so many stories that otherwise can't be enabled. Imagine a story that perhaps is a unique, special voice but requires distance. It requires five disparate locations. Perhaps it's in London, Piccadilly Circus, and in Times Square. And perhaps it's over to Abu Dhabi and Libya somewhere, because that's part of the story. We can now collapse geography and bring those locations to a central place and allow a story to be told that may not otherwise have been able to be created. And that's vital to the fabric of storytelling worldwide. >> It's going to change the creative process too. You don't have to have that waterfall kind of mentality, like we talk about in tech. You've got totally distributed content, decentralized potentially. The creative process is going to change with all the tools, and also the visual tools. >> That's right. >> It's almost becoming unlimited. >> You want it to be unlimited. You want the human spirit to be unlimited. You want to be able to elevate people. 
That's the great thing about what we're trying to achieve and will achieve. >> You're right. I mean, it is interesting. You know, we were just talking about this too. As an example, Shark Tank, right? I mean, they obviously did the filming and stuff, and they didn't have access, let's say, to the right studio. But the fact is, they had all this done. You know, they had all the rendering, they had the capture already done. You could now go out and do your shoot without having all the space you needed. >> That's right. In the case of Shark Tank, which shoots at Sony Pictures Studios, they knew they had a real estate issue. The fact of the matter is, there's a limited number of sound stages around the world. They needed two sound stages and only had access to one. So we went in and did a volumetric image acquisition of their exit interview stage, their set. And then when it came time to shoot the second half of season 10, 100 contestants went into a virtual set and were filmed in that set. And the funny thing is, one of the guys in the truck, you know how you have the camera trucks offstage, he leaned into the mic and said, "You guys, could you move that plant a couple of inches to the left?" And somebody said, "I don't think we can do it right now." He said, "We're on a movie lot. You can move a plant." They said, "No, it's physically not there. We're at Innovation Studios." He goes, "Oh, that's right. It's virtual." >> So he was fooled. >> He was fooled, in a way. >> We've been hashing it out within our team since we heard about some of the things Glenn and team are doing. Think about this. If you have to teach people, when we are running short of doctors, right? If you could, with this technology and the learnings that come from here, have an expert surgeon do surgery once, and you've captured it, it would be nice. 
Just imagine taking that learning to the new surgeons of the future and training them, so they can get into the act without actually doing it. So my point in all this is, this is where I think we can take technology to that next level, where we can not only learn from one specific industry but potentially put it to human good, not only preparing the next generation of doctors but also taking it to the next level. >> This was a great theme that Michael Dell put out there about these new kinds of use cases: the time is now to do it. Before, maybe you couldn't get there with technology, it was maybe aspirational. Hey, let's do it. I could see that. Glenn, I want to ask you specifically. The time is now. This is all kind of coming together. Timing's pretty good. It's only going to get better. There's going to be good tech mojo coming for the creative side. Where were we before? Because I can almost imagine this is not a new vision for you. You've probably seen it, and now it's here. What was it like before? Compare and contrast where you were a few years ago, maybe decades ago, with now. What's different? Why is this so important? >> You know, for me, there's a fundamental change in how we can create content and how we can tell stories. It used to be that the two most expensive words in the movie and TV industry were "what if." Today, the most important words to me are "what if," because what if we could collapse geography? What if we could empower a new story? Technology is at a place where, if we can dream it, chances are we can make it a reality. We're changing the dynamics of how we make content. It used to be lights, action, camera. I think it's now lights, action, compute power, action. You know, it's that kind of difference. >> That is an amazing vision. 
I think society now has opportunities to take that from distance learning to distance connections to distance sharing of experiences. Whether it's immersive, virtual, or analog face-to-face, it could really be powerful. >> And this is not even a year old. >> That's right. >> So if you look at your launch, you said, I think, June 4th, 2018. Where do you go from here? I mean, like we said, this is unlimited possibilities. But besides putting Ravi in the movies, naturally. >> Yes, of course. We have a star here. >> Who, me? >> I've got to say, he's got star power. >> What's next here, exactly? >> Very exciting. I will say, with Shark Tank, the Advanced Imaging Society gave an award for being the first volumetric set ever put out on the airwaves. For that television show, it's a great honor. We have already captured Men in Black. We captured a 50,000-square-foot stage that had the Men in Black headquarters, and it has been used for commercials to market the film that comes out this June. We have captured sets for television shows in the hope that they get a second season, and one television show called up and said, "Guys, we got the second season," so they don't have to go back to what was a very expensive set, and a beautiful set. We captured that set. It reminds me of a story a friend of mine in production tells: "The greatest gift I have is building a beautiful set, and to me, the biggest challenge is when they say, remember that set you built four years ago? I need that again." Now you can go back to it. >> It's hard to replicate the exact set. You capture it digitally, it lives. >> That's exactly it. >> And this is amazing. I mean, I'd love to do a CUBE set and do, like, a simulcast virtually. >> So this is the next thing, John and Lisa. You guys could be sitting anywhere going forward. >> We don't have to be really sitting here. >> You could be doing what you have to do. 
And, you know, you've got everything rendered, captured. >> We don't have to come to Vegas 20 times a year. >> We build the set once. >> You know, we want to see you here, believe me, so I'll take that. >> The visual is a really beautiful thing. So with holograms, we're seeing people doing concerts with holograms. Frank Zappa just did a hologram concert. But bringing real people in from communities around the world, with the localization and diversity, right into a content mixture is just so powerful. >> Actually, you said something very interesting, John, which is one of the other themes too: if you have a globally connected society and you want to try and personalize it to a particular nation or ethnic group, you can do that easily now, because you can probably pop in actors from the local area with the same set. Think about it. >> That's really right. >> There's a cascade of transformations that this is going to generate. I mean, just thinking of how different even acting schools and drama schools will be, teaching people how to behave in these virtual environments, right? >> How to immerse themselves in these environments. And we have tricks up our sleeves that can put the actor in that moment, through projection mapping and other techniques that allow filmmakers and actors to actually understand the world they're about to step into. Rather than using a green screen and saying, "OK, there's going to be a creature over here, there are going to be blue waterfalls over there," they'll actually be able to see that environment, because that environment will exist before they step on the stage. >> Well, great job on the Dell partnership. My final question, Glenn, is for you, since you're awesome and you've got a great vision, so smart and experienced. I've been really thinking a lot about how visualization and artistry are coming together, and how siloed disciplines like music, they do great music, but they're not translating to the graphics. 
I was just talking to someone about ray tracing and the impact of GPUs for immersive experiences, which we're seeing on the client side of the house at Dell. So you've got the back-end stuff, volumetrics. And so, as artists of the next generation come up, there is now a link between the visual, the audio, and the storytelling. It's not siloed. >> It is not. >> I want to get your vision on how you see this playing out, and your advice for young artists who might be, you know, looked at as contrarian: what do you know, that's not how we do it. >> Well, the beautiful thing is that there are new ways to tell stories. You know, Hollywood has evolved over the last century. If you look at the studios that still exist, they have all evolved, and that's why they still exist. Great storytellers evolve. We tell stories differently, so long as we can emotionally relate to the story that's being told. I say, do it in your own voice. The cinematic power is among us. We're blessed that when we look back, we have that shared experience. Whether it's anime from Japan or traditional animation from Walt Disney, everybody shares a similar history. Now it's an opportunity to author our new stories, and we can do that with physical assets and volumetric assets, and we can blend the real and the unreal with the compute power. The world is our oyster. >> Wow. >> What a nice mic drop right there. >> Exactly. That is a mic drop. The transformation of Hollywood, it's really like the tip of the iceberg. Unlimited story potential. Thank you, Glenn. Thank you. This has been fascinating. I cannot wait to hear, see, feel, and touch what's next for Sony Innovation Studios with your technology power. We appreciate your time. >> Thank you. Thank you both. >> Our pleasure. For John Furrier, I'm Lisa Martin. You're watching theCUBE, live from Dell Technologies World 2019. We've just wrapped up day two. We'll see you tomorrow.

Published Date : May 1 2019

SUMMARY :

Lisa Martin and John Furrier talk with Ravi Pendekanti of Dell EMC and Glenn Gainor, president of Sony Innovation Studios, at Dell Technologies World 2019. Gainor explains that Innovation Studios, launched in June 2018, grew out of the perennial shortage of money and time in film production, and describes volumetric image acquisition: capturing a real-world location as a point-based virtual set, brought onto Dell servers, that productions can then film in. Examples include the Shark Tank exit interview stage, where 100 contestants were filmed on a virtual set for the second half of season 10, and a 50,000-square-foot Men in Black headquarters stage used for commercials. Pendekanti describes how the workload stretches Dell's data processing and rendering technology and suggests the lessons could carry to other parts of the world and to other fields, such as training surgeons. Gainor calls the approach the democratization of storytelling and advises young artists to tell stories in their own voice.

ENTITIES

Entity	Category	Confidence
John Ferrier	PERSON	0.99+
John	PERSON	0.99+
Lisa Martin	PERSON	0.99+
John Carrier	PERSON	0.99+
Michael	PERSON	0.99+
Sony Pictures	ORGANIZATION	0.99+
Del Technologies	ORGANIZATION	0.99+
Glenn	PERSON	0.99+
Robbie	PERSON	0.99+
Ravi Pendekanti	PERSON	0.99+
Michael Dell	PERSON	0.99+
Sony Innovation Studios	ORGANIZATION	0.99+
Lisa	PERSON	0.99+
Glendon	PERSON	0.99+
second season	QUANTITY	0.99+
Monday night	DATE	0.99+
London	LOCATION	0.99+
Las Vegas	LOCATION	0.99+
Abu Dhabi	LOCATION	0.99+
Vegas	LOCATION	0.99+
Frank Zappa	PERSON	0.99+
Los Angeles	LOCATION	0.99+
Esteem	PERSON	0.99+
Glenn Gainor	PERSON	0.99+
Times Square	LOCATION	0.99+
del Technologies	ORGANIZATION	0.99+
Sony	ORGANIZATION	0.99+
Piccadilly Circus	LOCATION	0.99+
Japan	LOCATION	0.99+
Sony Innovation Labs	ORGANIZATION	0.99+
kazillion bytes	QUANTITY	0.99+
tomorrow	DATE	0.99+
Tuesday night	DATE	0.99+
first	QUANTITY	0.99+
first volume	QUANTITY	0.99+
next year	DATE	0.99+
Wednesday night	DATE	0.99+
one	QUANTITY	0.99+
second half	QUANTITY	0.99+
thirty K	QUANTITY	0.99+
both	QUANTITY	0.99+
one hundred contestants	QUANTITY	0.99+
Dell EMC	ORGANIZATION	0.98+
Khun	PERSON	0.98+
Robbie Pender County	PERSON	0.98+
four years ago	DATE	0.98+
Glenn Glenn ER	PERSON	0.98+
second	QUANTITY	0.98+
five disparate locations	QUANTITY	0.98+
billions and trillions	QUANTITY	0.98+
Dale	PERSON	0.97+
fifty thousand square foot	QUANTITY	0.97+
two most expensive words	QUANTITY	0.97+
one television show	QUANTITY	0.97+
Day two	QUANTITY	0.97+
Ackerson Ball	PERSON	0.97+
Thie Advanced Imaging Society	ORGANIZATION	0.96+
Shark Tank	TITLE	0.96+
Andi	PERSON	0.96+
Hollywood	ORGANIZATION	0.96+
twenty times a year	QUANTITY	0.94+
second full day	QUANTITY	0.93+
Del Technologies	ORGANIZATION	0.92+
Screen Gems	ORGANIZATION	0.92+
last century	DATE	0.92+
Del Partnership	ORGANIZATION	0.91+
few years ago	DATE	0.91+
a year	QUANTITY	0.91+
today	DATE	0.91+

Ravi Pendekanti, Dell EMC & Glenn Gainor, Sony Innovation Studios | Dell Technologies World 2019

>> Live from Las Vegas, it's theCUBE, covering Dell Technologies World 2019. Brought to you by Dell Technologies and its ecosystem partners. >> Welcome back to Las Vegas. Lisa Martin with John Furrier. You're watching theCUBE live at Dell Technologies World 2019. This is our second full day of double CUBE set coverage. We've got a really cool conversation coming up for you. We've got Ravi Pendekanti, one of our alumni, back on theCUBE, VP of Product Management, Server Solutions. Ravi, welcome back. >> Thank you, Lisa. Much appreciated. >> And you brought some Hollywood? Yes, Glenn Gainor, president of Sony Innovation Studios. Glenn, welcome to theCUBE. >> Thank you very much. It's great to be here. >> So I love this intersection of Hollywood and technology. You're a filmmaker. >> Yeah, I have been making movies for many years. I started off making motion pictures, executive produced them and oversaw production for them at one of our movie labels called Screen Gems, which is part of Sony Pictures. >> We've seen a tremendous amount of evolution of the creative process being really fueled by technology, and vice versa. Sony Innovation Studios is not quite one year old. This is a really exciting venture. Tell us about that, and what the impetus was to start this company. >> You know, the genesis for it was based out of necessity, because I looked at it and, well, you know, I love making movies, been doing it for a long time. And the challenge of making good pictures is resources. You never get enough money, believe it or not, and you never get enough time. That's everybody's issue, particularly time management. And I thought, well, you know, we've got a pretty good technology company behind us. What if we looked inward towards technology to help us find solutions? 
And so Innovation Studios was born out of that idea, and what was exciting about it was to know that we had invited partners to the game, right here with Dell, so that we could make movies and television shows and commercials and even enterprise solutions, leaning into state-of-the-art and cutting-edge technology. >> And what's some of the work product that you guys envision coming out of this mission? You mentioned commercials, TV. Is it going to be like an artist studio, actors, actresses and all? Take us through what this is going to look like. How does it get built out? >> I lean into my career as a producer to answer that one and say it's going to enable. That's one of the greatest things about being a producer: enabling stories, inspiring ideas to be greenlit that may not have been able to be done before. And there's a key reason why we can do that, because one of our key technologies is what we call volumetric image acquisition. That's a lot of words. You'd probably say, what the heck is that? But volumetric image acquisition is our ability to capture the real world, this analog world, and digitize it, bring it into our servers using the power of Dell, and then live in that new environment, which is now a virtual set. And that virtual set is made out of billions and trillions and quadrillions of points, much like the matter around us. And that's a difference, because many people use pixels, which are an interpretation of light. We're using points, which are representative of the world around us, so it's a whole revolutionary way of looking at it. But what it allows us to do is actually film in it, in a 30K moving volume. >> It's like a monster green screen for the world, in a way. >> In a way, you're interacting around it, because you have parallax. So these cameras could be photographing us, and for all you know, we may not be here. We could be at Stage 7 at Innovation Studios and not physically here, but you couldn't tell the >> difference. 
This is like cloud computing. In the tech world, you don't have to provision these resources, you just get what you want. This is Hollywood looking at the artistry, enabling faster, more agile storytelling. You don't need to go set up a town and go get the permit. All the heavy lifting, you're shooting in this new digital realm. >> That's right. Exactly. Now, I love going on location, and there's a lot to celebrate about going on location, but we can't always get to that location. Think of all the locations that we want to be in that are >> off limits. Space, the moon. >> I haven't been, but on set I've walked on virtual moons and I've walked on set moons. But what if we did a volumetric image acquisition of someone's set of the moon? Now we have that, and then we can walk around it. Or what if there's a great club, a nightclub, that says, "Guys, we want you to shoot here, but we have performances Monday night, Tuesday night, Wednesday night." You know, they have a job. What if we grabbed that, image-acquired it, and then you could be there anytime you want? >> Ravi, we could go for an hour here. This is just a great topic. >> I completely agree with you. >> theCUBE, you could sponsor a CUBE in this new world. We could run theCUBE 24/7. >> That's absolutely right. And we don't even have... >> Talk about the relationship with Dell, with Dell Technologies, because you're enabling new capabilities, a new kind of artistry. It's just totally cool. I want to get back to that in a second. But you guys are involved. What's your role? How do you get involved? Tell the story about your >> John, I mean, first and foremost, one of the things Glenn didn't mention is he's actually got about 50 movies to his credit. So the guy actually knows this stuff, which is absolutely fantastic. So we said, how do you go take that to the next level? 
So what else is better than trying to work something out wherein, together, between what Glenn and his team do at Sony Innovation Labs, or Studios, sorry, and what we at Dell Technologies can do, we try and actually stretch the boundaries of our technology to the next extent? When he talks about a kazillion bytes of data, right, one followed by however many zeros, we have to be able to process the data quickly. We have to be able to go out and do the rendering. We probably have to go out and do whatever is needed to make a high-quality movie, and that, I think, in a way, is actually giving us an opportunity to go back and test the boundaries of the technology we're building. We believe this is the first of its kind in the media industry. If we can learn together from this experience, we can actually go ahead and do other things in other industries too, maybe. And we were just talking about how we could also take this further. He's got his labs here in Los Angeles. We're thinking maybe one of the next things we do, based on the learnings we get, is take it to other parts of the world. And if we are successful, we might even take it to other industries. What if we could go do something to help in the field of medicine? >> I was just thinking that, right? Yes. >> Think about it, Lisa, John. I mean, it's phenomenal. I mean, this is something Michael always talks about: how do we as Dell Technologies help in the progress of humankind? And if this is something that we can learn from, I think it's going to be phenomenal. >> I think that's so interesting. Not only is that a good angle for Dell Technologies, the thing that strikes me is the access to artistry, voices, new voices that may be missed in the vetting process the old way. But, you know, you've got to know where we're going. 
You know, in venture capital, we've seen this with the democratization of seed labs and incubators, where, if you can create access for the storytellers and the artists, we're going to have more exposure to people we might have missed. But also as things change, whether it's ray tracing and streaming, as we saw on the gaming side, or volumetric things, you're going to have a better canvas, more paintbrushes on the creative side, and more action. Is that the mission, to get access, to get those artists in there? Is that part of the core mission? Because you're going to be essentially incubating new opportunities really fast. >> It's very important to me personally. I know it speaks to the values of both Sony and Dell. I like to call it the democratization of storytelling. You know, I've been very blessed, again, as a Hollywood producer, and we maybe curate a certain kind of movie, a certain kind of experience. But there are so many voices around the world that need to be heard, and there are so many stories that otherwise can't be enabled. Imagine a story that perhaps is a unique, special voice but requires distance. It requires five disparate locations. Perhaps it's in London, Piccadilly Circus, and in Times Square. And perhaps it's over to Abu Dhabi and Libya somewhere, because that's part of the story. We can now collapse geography and bring those locations to a central place and allow a story to be told that may not otherwise have been able to be created. And that's vital to the fabric of storytelling worldwide. >> It's going to change the creative process too. You don't have to have that waterfall kind of mentality, like we talk about in tech. You've got totally distributed content, decentralized potentially. The creative process is going to change with all the tools, and also the visual tools. >> That's right. >> It's almost becoming unlimited. >> You want it to be unlimited. You want the human spirit to be unlimited. You want to be able to elevate people. 
That's the great thing about what we're trying to achieve and will achieve. >> You're right. I mean, it is interesting; we were just talking about this too. As an example, Shark Tank, right? They obviously did the filming and so on, and then they didn't have access to, let's say, the right studio. But the fact is, they had all this done. They had all the rendering; they had the capture already done. You can now go out and do your shoot without having all the space you needed. >> That's right. In the case of Shark Tank, which shoots at Sony Pictures Studios, they knew they had a real estate issue. The fact of the matter is, there's a limited number of sound stages around the world. They needed two sound stages and only had access to one. So we went in and did a volumetric image acquisition of their exit-interview stage, their set. And then when it came time to shoot the second half of season ten, one hundred contestants went onto a virtual set and were filmed on that set. And the funny thing is, one of the guys in the truck (you know how you have the camera trucks offstage) leaned into the mic and said, "Hey guys, could you move that plant a couple of inches to the left?" And somebody said, "Uh, I don't think we can do it right now." He said, "We're on a movie lot. You can move a plant." They said, "No, it's physically not there. We're at Innovation Studios." He goes, "Oh, that's right. It's virtual. Never mind." >> So he was fooled. >> He was fooled, in a way. >> We've been hashing it out within the team. When we heard about some of the things Glenn and team are doing, think about this: we have to teach people at a time when we are running short of doctors, right? With this technology and the learnings that come from here, if you could have an expert surgeon do a surgery once and capture it, that would be nice.
Just imagine taking that learning to the new surgeons of the future and training them, so they can get into the act without actually doing it. So my point in all this is that this is where I think we can take technology to that next level, where we not only learn from one specific industry but can potentially put it to human good, not only in preparing the next generation of doctors but in taking it to the next level. >> This was a great theme, too, that Michael Dell put out there about these new kinds of use cases: the time is now. Before, maybe you couldn't get there with technology; it was maybe aspirational. Now, let's do it. I could see that. Glenn, I want to ask you specifically. The time is now. This is all kind of coming together. Timing's pretty good. It's only going to get better. It's going to be good tech, tech mojo coming for the creative side. Where were we before? Because I could almost imagine this is not a new vision for you. You've probably seen it coming, and now it's here. What was it like before? Compare and contrast where you were a few years ago, maybe decades. What's different now? Why is this so important? >> You know, for me, there's a fundamental change in how we can create content and how we can tell stories. It used to be that the two most expensive words in the movie and TV industry were "what if." Today the most important words to me are "what if." Because what if we could collapse geography? What if we could empower a new story? Technology is at a place where, if we can dream it, chances are we can make it a reality. We're changing the dynamics of how we make content. It used to be lights, action, camera. I think it's now lights, action, compute power, action; you know, it's that kind of difference. >> That is an amazing vision. I think society now has opportunities to take that from distance learning to distance connections, to distance sharing of experiences, whether it's immersion, virtual, analog, or face to face.
It could really be powerful. >> Yeah, and this is not even a year old. >> That's right. >> So if you look at your launch, you said, I think, June fourth, twenty eighteen. Where do you go from here? I mean, like we said, this is unlimited possibilities. But besides putting Ravi in the movie, naturally. >> Yes, of course. I have a star here. >> So I've got to say, he's got star power. >> What's the next year look like? >> Very exciting. I will say, for Shark Tank, the Advanced Imaging Society gave an award for being the first volumetric set ever put out on the airwaves. For that television show, it was a great honor. We have already captured Men in Black. We captured a fifty-thousand-square-foot stage that had the Men in Black headquarters; it has been used for commercials to market the film that comes out this June. We have captured sets for television shows in the hope that they get a second season, and one television show called up and said, "Guys, we got the second season," so they don't have to go back to what was a very expensive and beautiful set. We captured that set. It reminds me of a story from production. A friend of mine said, "Every year, the greatest gift I have is building a beautiful set." And to me, the biggest challenge is when I say, "Remember that set you built four years ago? I need that again." >> Now, rather than rebuilding the exact set, you capture it digitally. It lives. >> That's exactly it. >> And this is amazing. I mean, I'd love to do a CUBE set and do, like, a simulcast, virtually. >> So this is the next thing, John and Lisa. You guys could be sitting anywhere going forward. We don't have to be really sitting here; you could be doing what you have to do, and you've got everything rendered and captured. >> We don't have to come to Vegas twenty times a year.
>> We build the set once. >> You want to see it here, believe that. >> So I'll take that. The visual is a really beautiful thing. With holograms, we're seeing people doing concerts; Frank Zappa just did a hologram concert. But bringing in real people from communities around the world, bringing that localization and diversity right into the content mix, is just so powerful. >> Actually, you said something very interesting, John, which is one of the other themes too: if you have a globally connected society and you want to try to personalize it for a particular nation or ethnic group, you can do that easily now, because you can probably pop in actors from the local area in the same setting. Think about it. >> You're surely right. >> There's a cascade of transformations that this is going to generate. I mean, just thinking of how different even acting schools and drama schools will be, teaching people how to behave in these virtual environments, right? >> How to immerse themselves in these environments. And we have tricks up our sleeves that can put the actor in that moment through projection mapping and other techniques that allow filmmakers and actors to actually understand the world they're about to step into. Rather than standing at a green screen and being told, "OK, there's going to be a creature over here, and there are going to be blue waterfalls over there," they'll actually be able to see that environment, because that environment will exist before they step on the stage. >> Well, great job on the Dell partnership. My final question, Glenn, since you're awesome and have a great vision, so smart and experienced: I've been thinking a lot about how visualization and artistry are coming together, and how siloed disciplines like music do great music but aren't translating to the graphics. I was just talking about ray tracing and the impact of GPUs on immersive experiences, which we're seeing on the client side of the house.
At Dell, you've got the back-end stuff and the volumetrics. And so, as artists, the next generation, come up, there is now a link between the visual, the audio, and the storytelling. It's not siloed. >> It is not. >> I want to get your vision: how do you see this playing out, and what's your advice for young artists who might be, you know, looked at as contrary and told, "What do you know? That's not how we do it"? >> Well, the beautiful thing is that there are new ways to tell stories. You know, Hollywood has evolved over the last century. If you look at the studios that still exist, they have all evolved, and that's why they do exist. Great storytellers evolve. We tell stories differently, so long as we can emotionally relate to the story that's being told. I say do it in your own voice. The cinematic power is among us. We're blessed that when we look back, we have that shared experience; whether it's anime from Japan or traditional animation from Walt Disney, everybody shares a similar history. Now it's an opportunity to author our new stories, and we can do that with physical assets and volumetric assets, and we can blend the real and the unreal. With the compute power, the world is our oyster. >> Wow. >> What a nice mic drop right there. >> Exactly. That is a mic drop: the transformation of Hollywood. And it's really, I think, the tip of the iceberg. Unlimited story potential. Thank you, Glenn. Thank you. This has been fascinating. I cannot wait to hear, see, feel, and touch what's next for Sony Innovation Studios with your technology power. We appreciate your time. >> Yeah, thank you. Thank you both. >> Our pleasure. For John Furrier, I'm Lisa Martin. You're watching theCUBE live from Dell Technologies World twenty nineteen. We've just wrapped up day two; we'll see you tomorrow.

Published Date : May 1 2019


ENTITIES

Entity	Category	Confidence
Lisa Martin	PERSON	0.99+
John Farrier	PERSON	0.99+
Michael	PERSON	0.99+
John Ferrier	PERSON	0.99+
John	PERSON	0.99+
Michael Dell	PERSON	0.99+
Robbie	PERSON	0.99+
Sony Pictures	ORGANIZATION	0.99+
Del Technologies	ORGANIZATION	0.99+
Glenn	PERSON	0.99+
Glendon	PERSON	0.99+
Lisa	PERSON	0.99+
Ravi Pendakanti	PERSON	0.99+
second season	QUANTITY	0.99+
Abu Dhabi	LOCATION	0.99+
Esteem	PERSON	0.99+
Sony Innovation Studios	ORGANIZATION	0.99+
Vegas	LOCATION	0.99+
Monday night	DATE	0.99+
Sony	ORGANIZATION	0.99+
Glenn Gainor	PERSON	0.99+
del Technologies	ORGANIZATION	0.99+
Los Angeles	LOCATION	0.99+
Times Square	LOCATION	0.99+
Japan	LOCATION	0.99+
Frank Zappa	PERSON	0.99+
Las Vegas	LOCATION	0.99+
Sony Innovation Labs	ORGANIZATION	0.99+
tomorrow	DATE	0.99+
Tuesday night	DATE	0.99+
one hundred contestants	QUANTITY	0.99+
thirty K	QUANTITY	0.99+
kazillion bytes	QUANTITY	0.99+
London Piccadilly Circus	LOCATION	0.99+
first	QUANTITY	0.99+
Robbie Pender County	PERSON	0.99+
five disparate locations	QUANTITY	0.99+
one	QUANTITY	0.99+
both	QUANTITY	0.99+
second half	QUANTITY	0.98+
four years ago	DATE	0.98+
next year	DATE	0.98+
Sony Animation	ORGANIZATION	0.98+
first volume	QUANTITY	0.98+
fifty thousand square foot	QUANTITY	0.98+
Khun	PERSON	0.98+
second	QUANTITY	0.98+
a year	QUANTITY	0.98+
Dell EMC	ORGANIZATION	0.98+
Hollywood	ORGANIZATION	0.98+
Thie Advanced Imaging Society	ORGANIZATION	0.97+
one television show	QUANTITY	0.97+
Wednesday night	DATE	0.97+
Shark Tank	TITLE	0.97+
Day two	QUANTITY	0.96+
Dale	PERSON	0.96+
billions and trillions	QUANTITY	0.96+
twenty times a year	QUANTITY	0.96+
Glenn Glenn er	PERSON	0.95+
one year old	QUANTITY	0.93+
an hour	QUANTITY	0.93+
second full day	QUANTITY	0.92+
two most expensive words	QUANTITY	0.92+
last century	DATE	0.91+
few years ago	DATE	0.91+
Del Technologies	ORGANIZATION	0.91+
june	DATE	0.9+
today	DATE	0.89+

Eric Siegel, Predictive Analytics World - #SparkSummit - #theCUBE

>> Announcer: Live from San Francisco it's theCUBE Covering Spark Summit 2017, brought to you by Databricks. >> Welcome back to theCUBE. You are watching coverage of Spark Summit 2017. It's day two, we've got so many new guests to talk to today. We already learned a lot, right George? >> Yeah, I mean we had some, I guess, pretty high bandwidth conversations. >> Yes, well I expect we're going to have another one here too, because the person we have is the founder of Predictive Analytics World, it's Eric Siegel, Eric welcome to the show. >> Hey thanks Dave, thanks George. You go by Dave or David? >> Dave: Oh you can call me sir, and that would be. >> I was calling you, should I, can I bow? >> Oh no we are bowing to you, you're the author of the book, Predictive Analytics, I love the subtitle, the Power to Predict Who Will Click, Buy, Lie or Die. >> And that sums up the industry right? >> Right, so if people are new to the industry, that's sort of an informal definition of predictive analytics, basically also known as machine learning. Where you're trying to make predictions for each individual, whether it's a customer for marketing, a suspect for fraud or law enforcement, a voter for political campaigning, a patient for healthcare. So, in general it's on that level, it's a prediction for each individual. So how does data help make those predictions? And then you can only imagine just how many ways in which predicting on that level helps organizations improve all their activities. >> Well we know you were on the keynote stage this morning. Could you maybe summarize for the CUBE audience, what a couple of the top themes that you were talking about? >> Yeah, I covered two advanced topics. I wanted to make sure this pretty technical audience was aware of because a lot of people aren't and one is called uplift modeling, so that's optimizing for persuasion for things like marketing and also for healthcare, actually. And for political campaigning. 
So when you do predictive analytics for targeting marketing, the traditional approach is: let's predict, will this person buy if I contact them? Because then, okay, maybe it's a good idea to spend the two dollars to send them a brochure; it's a marketing treatment, right? But there is actually a slightly different question that would drive even better decisions, which is not "will this person buy" but "would contacting them, sending them the brochure, influence them to buy?" Will it increase the chance that we get that positive outcome? That's a different question, and it doesn't correspond to standard predictive modeling or machine learning methods. So uplift modeling, also known as net lift modeling or persuasion modeling, is a way to create a predictive model like any other, except that its target is: is it a good idea to contact this person, because it will increase the chances that they have a positive outcome? So that's the first of the two. And I crammed this all into 20 minutes. The other one is a little more commonly known, but I think people would like to revisit it, and it's called p-hacking or vast search, where you can be fooled by randomness in data relatively easily. In the era of big data there is this all-too-common pitfall where you find a predictive insight in the data and it turns out it was actually just a random perturbation. How do you know the difference? >> Dave: Fake news, right? >> Okay, fake news, except that in this case it was generated by a computer, right? And then there is a statistical test that makes it look like it's actually statistically significant and we should give credibility to it.
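The uplift question described above (does contacting a person change their chance of buying?) can be sketched with a simple comparison of contacted and non-contacted groups. This is a minimal illustration, not the method from the keynote: the segment names, the counts, and the helper functions `make_records` and `uplift_by_segment` are all hypothetical, and it assumes the contacted and non-contacted groups are comparable, as in a randomized test.

```python
from collections import defaultdict


def make_records(segment, contacted, buyers, total):
    """Build hypothetical (segment, contacted, bought) rows; the first `buyers` rows bought."""
    return [(segment, contacted, i < buyers) for i in range(total)]


def uplift_by_segment(records):
    """Return {segment: P(buy | contacted) - P(buy | not contacted)}."""
    # counts[segment][contacted] holds [number_of_buyers, number_of_people]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, contacted, bought in records:
        cell = counts[segment][contacted]
        cell[0] += int(bought)
        cell[1] += 1

    uplift = {}
    for segment, cells in counts.items():
        treated_buyers, treated_total = cells[True]
        control_buyers, control_total = cells[False]
        if treated_total == 0 or control_total == 0:
            continue  # uplift cannot be estimated without both groups
        uplift[segment] = treated_buyers / treated_total - control_buyers / control_total
    return uplift


# Hypothetical campaign data, invented purely for illustration.
records = (
    make_records("loyal", True, 30, 100)
    + make_records("loyal", False, 30, 100)
    + make_records("undecided", True, 15, 100)
    + make_records("undecided", False, 5, 100)
)
uplift = uplift_by_segment(records)
for segment, value in sorted(uplift.items()):
    print(f"{segment}: uplift = {value:+.2f}")
```

In this made-up data the "loyal" segment buys 30% of the time whether or not it is contacted, so a plain response model would rank it first even though the brochure changes nothing there. The "undecided" segment moves from 5% to 15% when contacted, which is where the contact actually persuades.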
So you can avert it; you have to compensate for the fact that you are trying lots, that you are evaluating many different predictive insights or hypotheses, whatever you want to call them, and make sure that the one you are believing has been checked for the possibility that it was just random luck. That pitfall is known as p-hacking. >> Alright, so uplift modeling and p-hacking. George, do you want to drill in on those a little bit? >> Yeah, I want to start from maybe the vocabulary of our audience, where they say uplift modeling goes beyond prediction. And for the second one, with p-hacking, is that where you're essentially playing with the parameters of the model to find the difference between correlation and causation, going from prediction to prescription? >> It's not about causation. Correlation is what you get when you get a predictive insight or some component of a predictive model where you see these things connected, so one is predictive of the other. Now, the fact that this does not entail causation is a really good point to remind people of. But even before you address that question, the first question is: is this correlation actually legit? Is there really a correlation between these things? Is this an actual finding? Or did it just happen to be the case in this particular limited sample of data that I have access to at the moment? So is it a real link or correlation in the first place, before you even start asking any question about causality. And it does relate to what you alluded to with regard to tuning parameters, because it's closely related to the issue of overfitting. People who do predictive modeling are very familiar with overfitting. The standard practice, which all tool implementations of machine learning and predictive modeling follow, is to hold aside an evaluation set called the test set, so you don't get to cheat. It creates a predictive model.
It learns from the data, does the number crunching; it's mostly automated, right? And it comes out with this beautiful model that does well at predicting, and then you evaluate it, you assess it over this held-aside... Oh, my thing's falling off here. >> Dave: Just a second on your... >> So then you evaluate it on this held-aside set. It was quarantined, so you didn't get to cheat. You didn't get to look at it when you were creating the model. So it serves as an objective performance measure. The problem, and here is the huge irony, is with the things that we get from data, the predictive insights. There was one famous one that was broadcast too loudly, because it's not nearly as credible as they first thought: that an orange used car is a better one to buy because it's less likely to be a lemon. That's what it looked like in this one data set. The problem is when you have a single insight that's relatively simple, just using the car's color to make the prediction. A predictive model is much more complex and deals with lots of other attributes, not just the color: for example, make, year, model, everything on that individual car or individual person. You can imagine all the attributes; that's the point of the modeling process, the learning process: how do you consider multiple things? If it's just a really simple thing based on the car color, then even many of the most advanced data science practitioners kind of forget that there is still potential to effectively overfit, that you might have found something that doesn't apply in general and only applies over this particular set of data. So that's where the trap lies, and they don't necessarily hold themselves to the high standard of having this held-aside test set. So it's kind of an ironic thing: the things most likely to make the headlines, like orange cars, are simpler and easier to understand, but it's less well understood that they could be wrong.
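The vast-search trap and the held-aside check described above can be sketched as a small simulation. It is an illustration under stated assumptions: every feature and outcome here is a pure coin flip, so any "insight" found is luck by construction; `rate_gap`, `vast_search`, and `bonferroni_alpha` are hypothetical helper names; and the Bonferroni adjustment is one common way to compensate for many comparisons, not necessarily the one meant in the interview.

```python
import random


def rate_gap(feature, outcome):
    """Absolute difference in outcome rate between rows where the feature is 1 and where it is 0."""
    ones = [y for x, y in zip(feature, outcome) if x == 1]
    zeros = [y for x, y in zip(feature, outcome) if x == 0]
    if not ones or not zeros:
        return 0.0  # no comparison possible if one group is empty
    return abs(sum(ones) / len(ones) - sum(zeros) / len(zeros))


def vast_search(n_features=200, n_rows=200, seed=0):
    """Pick the best-looking of many random features, then re-check it on held-aside rows."""
    rng = random.Random(seed)

    def coin_flips(n):
        return [rng.randint(0, 1) for _ in range(n)]

    discovery_outcome = coin_flips(n_rows)
    holdout_outcome = coin_flips(n_rows)

    best_gap, best_holdout_gap = -1.0, 0.0
    for _ in range(n_features):
        discovery_feature = coin_flips(n_rows)
        holdout_feature = coin_flips(n_rows)
        gap = rate_gap(discovery_feature, discovery_outcome)
        if gap > best_gap:
            # Remember how the current front-runner behaves on data it was not selected on.
            best_gap = gap
            best_holdout_gap = rate_gap(holdout_feature, holdout_outcome)
    return best_gap, best_holdout_gap


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after compensating for n_tests comparisons."""
    return alpha / n_tests


discovery_gap, holdout_gap = vast_search()
print(f"best gap on discovery data: {discovery_gap:.3f}")
print(f"same feature on held-aside data: {holdout_gap:.3f}")
print(f"per-test threshold for 200 tests at 0.05: {bonferroni_alpha(0.05, 200)}")
```

Because the winner is chosen as the maximum over many noisy candidates, its discovery-set gap is inflated by selection; the held-aside gap is an honest estimate of the same feature, which is the role the quarantined test set plays for a full model.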
>> You know keying off that, that's really interesting, because we've been hearing for years that what's made, especially deep learning relevant over the last few years is huge compute up in the cloud and huge data sets. >> Yeah. >> But we're also starting to hear about methods of generating a sort of synthetic data so that if you don't have, I don't know what the term is, organic training data, and then test data, we're getting to the point where we can do high quality models with less. >> Yes, less of that training data. And did you. >> Tell us. >> Did you interview with the keynote speaker from Stanford about that? >> No, I only saw part of his. >> Yeah his speech yesterday. That's an area that I'm relatively new to but it sounds extremely important because that is the bottleneck. He called it, if data's the new oil, he's calling it the new-new oil. Which is more specific than data, it's training data. So all of the machine learning or predictive modeling methods of which we speak, are, in most cases, what's called supervised learning. So the thing that makes it supervised is you have a bunch of examples where you already know the answer. So you're trying to figure out is this picture of a cat or of a dog, that means you need to have a whole bunch of data from which to learn, the training data, where you've already got it labeled. You already know the correct answer. In many business applications just because of history you know who did or didn't respond to your marketing, you know who did or did not turn out to be fraudulent. History is experience in which to learn, it's in the data, so you do have that labeled, yes, no, like you already know the answer, you don't need to predict on them, it's in the past but you use that as training data. So we have that in many cases. 
But for something like classifying an image, and we're trying to figure out does this have a picture of a cat somewhere in the image, or whatever all these big image classification problems, you do need, often, a manual effort to label the data. Have the positive and negative examples, that's what's called training data, the learning data. It's actually called training data. There's definitely a bottleneck so anything that can be done to avert that bottleneck decrease the amount that we need, or find ways to make, sort of, rough training data that may serve as a building block for the modeling process this kind of thing. That's not my area of expertise, sounds really intriguing though. >> What about, and this may be further out on the horizon but one thing we are hearing about is the extreme shortage of data scientists who need to be teamed up with domain experts to figure out the knobs, the variables to create these elaborate models. We're told that even if you're doing the traditional, statistical, machine learning models, that eventually deep learning can help us identify the features or the variables just the way they sort of identify you know ears and whiskers and a nose and then figure out from that the cat. That's something that's in the near term, the medium term in terms of helping to augment what the data scientist does? >> It's in the near term and that's why everyone's excited about deep learning right now is that, basically the reason we built these machines called computers is because they automate stuff. Pretty much anything that you can think of and define well, you can program. Then you've got a machine that does it. Of course one of the things we wanted to learn, to do actually, is to learn from data. Now, it's literally really very analogous to what it means for a human to learn. You've got a limited number of examples that you're trying to draw generalizations from those. 
When you go to bigger-scale problems, where the thing you're classifying isn't just a customer and all the things you know about the customer (are they likely to commit fraud, yes or no), it becomes a level more complex when it's an image, right? An image is worth a thousand words, and maybe literally more than a thousand words' worth of data if it's high resolution. So how do you process that? Well, there's all sorts of research, like: we can define the thing that tries to find arcs and circles and edges and this kind of thing, or we can try to, once again, let that be automatic. Let the computer do that. So deep learning is a way to allow that. Spark is a way to make it operate quickly, but there's another level of scale other than speed. That level of scale is how complex a task you can leave up to the automaton to do by itself. That's what deep learning does: it scales in that respect. It has the ability to automate more layers of that complexity, as far as finding those kinds of what might be domain-specific features in images. >> Okay, but I'm thinking not just the, help me figure out speech to text and natural language understanding or classify. >> Anything with a signal where there's a high-bandwidth amount of data coming in that you want to classify. >> OK, so does that extend to where I'm building a very elaborate predictive model, not on "is there a cat in the video or in the picture" so much as, I guess you called it, is there an uplift potential and how big is that potential, in the context of making a sale on an eCommerce site? >> So what you just tapped into is that when you go to marketing and many other business applications, you don't actually need high accuracy; what you have to do is have a prediction that's better than guessing.
So for example, if I get a 1% response rate to my marketing campaign, but I can find a pocket that's got 3% response rate, it may be very much rocket science to define and learn from the data how to define that specifically defined sub-segment that has a higher response rate, or whatever it is. But the 3% isn't like, I have high confidence this person's definitely going to buy, it's still just 3%, but that difference can make a huge difference and can improve the bottom line marketing by a factor of five and that kind of thing. It's not necessarily about accuracy. If you've got an image and you need to know is there a picture of a car, or is this traffic light green or red, somewhere in this image, then there's certain application areas, self driving cars what have you, it does need to be accurate right. But maybe there's more potential for it to be accurate because there's more predictability inherent to that problem. Like I can predict that there's a traffic light that has a green light somewhere in an image because there is enough label data and the nature of the problem is more tractable because it's not as challenging to find where the traffic light is, and then which color it is. You need it to scale, to reach that level of classification performance in terms of accuracy or whatever measure you use for certain applications. >> Are you seeing like new methodologies like reinforcement learning or deep learning where the models are adversarial where they make big advances in terms of what they can learn without a lot of supervision? Like the ones where. >> It's more self learning and unsupervised. >> Sort of glue yourself onto this video game screen we'll give you control of the steering wheel and you figure out how to win. >> Having less required supervision, more self-learning, anomaly detection or clustering, these are some of the unsupervised ones. 
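The response-rate arithmetic from the start of this answer (a 1% baseline versus a 3% pocket) can be worked through to see how a threefold lift in response can become a larger multiple in profit. This is a sketch under assumptions: the two-dollar contact cost comes from the brochure example earlier in the interview, while the $400 value per response and the 1,000 contacts are hypothetical figures chosen so the example lands on the "factor of five" mentioned above.

```python
def campaign_profit(contacts, response_rate_pct, value_per_response, cost_per_contact):
    """Profit in dollars from contacting `contacts` people at a given response rate (whole percent)."""
    responders = contacts * response_rate_pct // 100
    return responders * value_per_response - contacts * cost_per_contact


CONTACTS = 1000            # hypothetical campaign size
VALUE_PER_RESPONSE = 400   # hypothetical dollars earned per responder
COST_PER_CONTACT = 2       # dollars per brochure, as in the interview's example

baseline = campaign_profit(CONTACTS, 1, VALUE_PER_RESPONSE, COST_PER_CONTACT)
targeted = campaign_profit(CONTACTS, 3, VALUE_PER_RESPONSE, COST_PER_CONTACT)

print(f"baseline profit: ${baseline}")             # 10 responders: 4000 - 2000 = 2000
print(f"targeted profit: ${targeted}")             # 30 responders: 12000 - 2000 = 10000
print(f"improvement factor: {targeted / baseline}")  # 5.0
```

The contact cost is fixed per brochure, so tripling the response rate more than triples the profit; with other values per response the multiple would differ.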
When it comes to vision, there are parts of the process that can be unsupervised, in the sense that you don't need labels on your target, like "is there a car in the picture." But it can still learn the feature detection in a way that doesn't need that supervised data. Although image classification in general, deep learning on that level, is not my area of expertise. That's a very up-and-coming part of machine learning, but it's only needed when you have these high-bandwidth inputs like an entire image, high resolution, or a video, or high-bandwidth audio. So it's signal-processing type problems where you start to need that kind of deep learning. >> Great discussion, Eric; just a couple of minutes to go in this segment. I want to make sure I give you a chance to talk about Predictive Analytics World. What's your affiliation with that, and what do you want theCUBE audience to know? >> Oh sure. Predictive Analytics World: I'm the founder. It's the leading cross-vendor event focused on commercial deployment of predictive analytics and machine learning. Our main event, a few times a year, is a broad-scope, business-focused event, but we also have industry-vertical specialized events just for financial services, healthcare, workforce, manufacturing and government applications of predictive analytics and machine learning. So there are a number a year: two weeks from now in Chicago, October in New York, and you can see the full agendas at PredictiveAnalyticsWorld.com. >> Alright, great short commercial there. 30 seconds. >> It's the elevator pitch. >> Answered the toughest question in 30 seconds. What's the toughest question you got after your keynote this morning? Maybe a hallway conversation or... >> What's the toughest question I got after my keynote? >> Dave: From one of the attendees. >> Oh, the question that always comes up is: how do you get this level of complexity across to non-technical people, or your boss, or your colleagues, or your friends and family?
By the way, that's something I worked really hard on with the book, which is meant for all readers, although the last few chapters have... >> How do you get executive sponsors to get what you're doing? >> Well, as I say, give them the book. Because the point of the book is that it's pop science: it's accessible, it's analytically driven, it's entertaining, it keeps it relevant, but it does address advanced topics at the end of the book. So it's sort of an industry overview kind of thing. The bottom line there, in general, is that you want to focus on the business impact. It's what I mentioned briefly a second ago: if we can improve target marketing this much, it will increase profit by a factor of five, something like that. So you start with that and then answer any questions they have about how it works and what makes it credible that it really has that much potential for the bottom line. When you're a techie, you're inclined to go the other way; you start with the technology that you're excited about. That's my background, so that's sort of the definition of being a geek: that you're more enamored with the technology than the value it produces. Because it's amazing that it works, and it's exciting, it's interesting, it's scientifically challenging. But when you're talking to the decision makers, you have to start with the eventual carrot at the end of the stick, which is the value. >> The business outcome. >> Yeah. >> Great, well, that's going to be the last word. That might even make it onto our CUBE Gems segment, great sound bites. George, thanks again, great questions, and Eric, the author of Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie or Die, thank you for being on the show. We appreciate your time. >> Eric: Sure, yeah, thank you, great to meet you. >> Thank you for watching theCUBE. We'll be back in just a few minutes with our next guest here at Spark Summit 2017.

Published Date : Jun 7 2017

SUMMARY :

brought to you by Databricks. to talk to today. Yeah, I mean we had some, I guess, because the person we have is the founder You go by Dave or David? I love the subtitle, the Power to Predict Who Will Click, And then you can only imagine just how many ways what a couple of the top themes that you were talking about? there is this all to common pitfall where you find and make sure that the one that you are believing George do you want to drill on those a little bit. is that where you're essentially of a predictive model where you see these things connected The problem is, that when you have a single insight over the last few years is huge compute up in the cloud so that if you don't have, I don't know what the term is, Yes, less of that training data. it's in the data, so you do have that labeled, That's something that's in the near term, the medium term and all the things you know about the customer, help me figure out speech to text that you want to classify. so much as I guess you called it, So what you just tapped into was Are you seeing like new methodologies like and unsupervised. and you figure out how to win. that you don't need labels on your target ad what do you want theCUBE audience to know? in Chicago, October in New York and you can see what the toughest question you got is how do you get this level of complexity is that you want to focus on the business impact. and Eric the author of Predictive Analytics, the Power Thank you for watching theCUBE we'll be back

ENTITIES

Entity	Category	Confidence
George	PERSON	0.99+
Dave	PERSON	0.99+
Eric Siegel	PERSON	0.99+
David	PERSON	0.99+
Eric	PERSON	0.99+
Chicago	LOCATION	0.99+
1%	QUANTITY	0.99+
two dollars	QUANTITY	0.99+
3%	QUANTITY	0.99+
San Francisco	LOCATION	0.99+
New York	LOCATION	0.99+
30 seconds	QUANTITY	0.99+
yesterday	DATE	0.99+
first question	QUANTITY	0.99+
20 minutes	QUANTITY	0.99+
Predictive Analytics	TITLE	0.99+
Spark Summit 2017	EVENT	0.99+
more than a thousand words	QUANTITY	0.98+
Predictive Analytics World	ORGANIZATION	0.98+
first	QUANTITY	0.98+
one	QUANTITY	0.98+
second one	QUANTITY	0.98+
each individual	QUANTITY	0.98+
two	QUANTITY	0.98+
today	DATE	0.97+
second	QUANTITY	0.97+
October	DATE	0.97+
two weeks	QUANTITY	0.97+
two advanced topics	QUANTITY	0.97+
first place	QUANTITY	0.96+
the Power to Predict Who Will Click, Buy, Lie or Die	TITLE	0.94+
Predictive Analytics, the Power to Predict Who Will Click, Buy, Lie or Die	TITLE	0.94+
Databricks	ORGANIZATION	0.94+
single insight	QUANTITY	0.93+
Stanford	ORGANIZATION	0.91+
five	QUANTITY	0.9+
this morning	DATE	0.87+
CUBE	ORGANIZATION	0.86+
a thousand words	QUANTITY	0.84+
first thought	QUANTITY	0.82+
Predictive Analytics	ORGANIZATION	0.77+
a year	QUANTITY	0.72+
theCUBE	ORGANIZATION	0.72+
day two	QUANTITY	0.7+
one famous	QUANTITY	0.69+
PredictiveAnalyticsWorld.com	ORGANIZATION	0.66+
times a year	QUANTITY	0.66+
second ago	DATE	0.66+
World	EVENT	0.63+
#theCUBE	ORGANIZATION	0.57+
years	QUANTITY	0.56+
last	DATE	0.56+
factor	QUANTITY	0.52+
years	DATE	0.49+
minutes	QUANTITY	0.48+
five	OTHER	0.33+

Randy Bias, Juniper - OpenStack Summit 2017 - #OpenStackSummit - #theCUBE

>> Voiceover: Live from Boston, Massachusetts, it's the Cube, covering OpenStack Summit 2017. Brought to you by the OpenStack Foundation, Red Hat, and additional ecosystem support. >> Welcome back, I'm Stu Miniman joined by John Troyer. This is SiliconANGLE Media's production of the Cube at OpenStack Summit. We're the worldwide leader in tech coverage, live tech coverage. Happy to welcome back to the program someone we've had on so many times we can't keep track. He is the creator of the term Pets versus Cattle, he is one of the OGs of the cloud group, Randy, you know, wrote about everything before most of it was done. So good to see you, thank you for joining us. >> Thanks for having me. >> Alright, so Randy, coming into this show we felt that it was a bit of resetting expectations, people not understanding, you know, where infrastructure's going, a whole hybrid multi-cloud world, so, I mean you've told us all how it's going to go, so where are we today, what have people been getting wrong, what's your take coming into this week and what you've seen? >> Well, I've said it before, which is that the public clouds have done more than just deliver compute, storage and networking on demand. What they've really done is they've built these massive development organizations. They're very sophisticated, that are, you know, that really come from that Webscale background and move at a velocity that's really different than anything we've seen before, and I think the hope in the early days of OpenStack was that we would achieve a similar kind of velocity and momentum, but I think the reality is that it just hasn't really materialized; that while there are a lot of projects and there are a lot of contributors the coordination between them is very poor, and you know it's just not the, like architectural oversight that we really needed isn't there. 
I, a couple years ago at the OpenStack Silicon Valley gave a presentation called The Lie of the Benevolent Dictator, and I charted a course for how we could actually have more of a technical architecture oversight, and just that really fell on deaf ears. And so we continue to do the same thing and expect different results and I just, that's a little disappointing for me. >> Yeah. So what is your view of hybrid cloud? You know, no disagreement, you look at what the public cloud companies, especially the big three, the development that they can do, Amazon, a thousand new features a year, Google, what they can do with data, Microsoft has a whole lot of applications and communities around them. We're mostly talking about private cloud here, it was a term that you fought against for many years, we've had great debates on it, so how does that hybrid play out? 'Cause customers, they're keeping on premises. Edge fits into a lot of this too, so it's, there's not one winner, it's not a zero sum game, but how does that hybrid cloud work? >> Yeah so, I didn't fight against private cloud, I qualified it. I said if it's going to be a private cloud it's got to be built and look and smell the way that the public cloud was. Alright? If it's just VMware with VMs on demand, that's not a private cloud. That was my position. And then in terms of hybrid cloud, you know, I don't think we're there yet. I've presented on this at many different OpenStacks, you can see it in the past, and I sort of laid out what needs to happen and that didn't happen. But I think there's hope, and I think the hope comes in the form of Kubernetes, and to a certain degree, Helm. 
And the reason that Kubernetes with Helm is very powerful is that Kubernetes gives us a compute abstraction, so that you don't care if you're on the public cloud, or you know OpenStack or VMware or whatever, and then what Helm gives us is charts, so ways to deploy services, not just software, and so what we could think about doing in the future is building hybrid cloud based off of Kubernetes and Helm. >> Yeah, so Randy since last time we talked you've got a new role, you're now with Juniper. Juniper had done a Contrail acquisition. You know, quite a few years back you wrote a good blueprint on one of the Juniper forums about the OpenContrail communities. So tell us a little bit about your role, your goals, in that community. >> So OpenContrail has been a primarily Juniper initiative, and we're going to press the reset button on the OpenContrail community. I'm going to do it tonight and call for people to sort of get involved in doing that reset, and when I say reset I mean, wipe the operating system, reload it from scratch, and do it really as a community, not just as a Juniper run initiative, and so people inside Juniper are very excited about this, and what we're trying to do is that we believe that the path forward for OpenContrail is ubiquitous adoption. So rather than playing for just the pieces that we have, which we've done a great job of, we want to take the world's best SDN controller and we want to make sure everybody uses it, because we think in aggregate that's good for not only the entire community but also Juniper. >> So, love the idea of kind of rebooting the community in the open, right, because you have to be transparent about these sort of things. >> Randy: Yeah, that's right. >> What are the community segments that you would like to see join you here in the OpenContrail? What kind of users, what kind of companies would you like to see come in to the tent? 
>> Well anybody's welcome, but we want to start with all of our key stakeholders that exist today, so first one, and arguably one of the most important is our competitors, right so we're hoping to have Mirantis at the table, maybe Ericsson, Huawei, anybody. Cisco, hey come join the party. Second is that we have done really well in SaaS and in gaming, and we'd like to see all of those companies come to the table as well, Workday, Symantec, and so on. The third segment is enterprises, we've done well in financial services, we think that that's a really important segment because they're the leading edge of enterprises typically, and the fourth is the carriers, obviously incredibly important for Juniper, folks like AT&T, Deutsche Telekom, all those companies we'd love to see come to the table. And then that's really the primary focus, and then anybody else who wants to show up, anybody who wants to develop in Contrail in the future we'd love to have there. >> Well with open source communities, right, there's always a balance of the contributors and developers versus operators, and we can use the word contributors in a lot of roles. Some open source communities, much more developer focused, >> Randy: That's right. >> Others more operator focused, where do you see this OpenContrail community starting out? >> So where it's been historically is more of our end users and operators. >> I think that's interesting and an interesting twist because I think sometimes open source communities get stuck with just the people who can contribute code, and I'm from an operator community myself, >> Randy: Right. >> So I think that's really interesting. >> We still want all those people but I think what has happened is that when people have come in and they wanted to be more sort of on the developer side, the community hasn't been friendly to them. >> John: Okay. >> Randy: And so we want, that's a key thing that we want to change. 
You know when we were talking to certain carriers they came and they said look, it's great you're going to do this, we want to be a part of it, and one of the things we'd like to contribute is more advanced testing around VNFs. And I just look at that and I'm just like that's what we need, right? Juniper is not, can't carry all the water on having, you know, sophisticated test suites for VNFs and more advanced networking use cases, but the carriers are deep into this and we'd love to have them come and bring that. So not just developers, but also QA, people who want to increase the code quality, the architectural quality, and the aggregate value of OpenContrail. >> Okay, Randy can you help place OpenContrail where it fits in this kind of networking spectrum, especially, there's open source things, we've talked about VPP a couple times on theCube here. The joke for many years was SDN still does nothing, NFV solutions have grown, have been a huge use case, is really where the early money for big deployments has been for OpenStack. Where does OpenContrail fit, where does it kind of compare and contrast against some of the other options out there? >> I'm going to answer that slightly differently. I've been skeptical about SDN overlays for a long time, and now I am helping with one of the world's best SDN overlays, and what's changed for me is that in the last year I've seen key customers of Contrail's, of Juniper's actually do something very interesting, right. You've got an SDN overlay, it's complex, it's hard to deploy, you got to wonder, why should I do this? Well I thought the same thing about virtualization, right, until I figured out, sort of what was the killer app. 
And what we've seen is a company, one of our customers, and several others, but one in particular I can talk about publicly, Riot Games, take containers and OpenContrail and marry them so that you have an abstraction around compute, and an abstraction around networking, so that their developers can write to that, and they don't care whether that's running on top of public cloud, private cloud, or in some partner's data center globally. And in fact they're going to talk about that today at OpenContrail Days at 3:30, and are going to present a lot more details, and that's amazing to me because by abstracting away and disintermediating the public clouds, you actually have more power, right. You can build your own framework. And if you're using Kubernetes as a baseline you can do a lot more on top of that compute and network abstraction. >> You talked about OpenContrail Days, again my first summit, I've actually been impressed by the foundation, acknowledging there's a huge landscape of open source and other technologies around there, OpenStack itself doesn't invent everything. Can you talk a little bit about that kind of attitude of bringing, I mean we talk about Kubernetes and that sort of thing, but all the other CNCF projects, monitoring, even components like etcd, right, we're talking about here at this conference. So, can you talk a little bit about how OpenStack can interact with the rest of the open source and cloud native at-large community? >> That's sort of a tough question John. >> John: Okay. >> I mean the reason I say that is like the origins of OpenStack are very much NIH and there has been a very disturbing tendency to sort of re-invent the wheel. A great example is Keystone, still to this day I don't know why Keystone exists and why we created a whole new authentication standard when there were dozens and dozens of battle-tested, battle-hardened protocols and bits of code that existed prior. 
It's great that we're getting a little bit better at that but I still sense that the origins of the community and some of the technical leadership have resistance to organizing and working with outside components and playing nice. So, it's better but it's not great, it's not where it should be. Really OpenStack needs to be broken down into a lot of different projects that can compete with each other and all run in parallel without having to be so tightly wound together. It's still disappointing to me that we aren't doing that today. >> Randy, wonder if you could give us a little bit of a personal reflection, you've been involved in cloud many years, we've talked about some of the state of it, where do you think enterprises are when they think about their IT, how IT relates to business, some of the big challenges they're facing, and kind of this rapid pace of change that's happening in our industry right now? >> Yeah well the pressures just increase. The need to pick up speed and to move faster and to have a greater velocity, that's not going away, that seems to be like an incredible macro-trend that's just going to keep driving people towards the next event. But what I see is that the tension between the infrastructure IT teams and the line of business hasn't really started to get resolved. You see a lot of enterprises back into using DevOps as a way to try to fix the culture change problems but it's just not happening fast enough. I have a lot of concerns that basically private cloud or private infrastructure for enterprises will just not materialize in the way it needs to for the next generation. And that the line of business will continue to just keep moving to public cloud. All the while all the money that's being reinvested in the public cloud is increasing their capabilities in terms of feature sets and security capabilities and so on. 
I just, I don't see the materialization of private cloud happening very well at this point in time and I don't see any trendlines that tell me it's going to change. >> Yeah, what recommendations do you give today to the OpenStack Foundation? I know that you haven't been shy in the past about giving guidance as to the direction, what do you think needs to happen to be able to help customers along that journey that they need? >> I don't give any guidance to the OpenStack Foundation anymore, I'm not on the Board of Directors, and frankly I gave a lot of advice in the past that fell on deaf ears and people were unwilling to make the changes that were necessary I think to create success. And even though I was eventually proven right, there doesn't seem to be an appetite for change. I would say that the hard partition between the Board of Directors and the Technical Committee that was created at the outset with the founding of the Foundation has led to a big problem which is that there are simply business concerns that are technical concerns and there are technical concerns which are business concerns and the actual structure of the Foundation does not allow that to occur because of that hard partition between them. So people on the Board of Directors can't actually tell the TC that they'd like to see certain technical changes because they're business concerns and the Technical Committee can't tell the Board of Directors they'd like to see business changes made because they're technical concerns around them. And I think that's, it's fundamentally broken until the bylaws are fixed. >> So Randy beyond what we've talked about already what's exciting you these days, you look at like the serverless trend, is that something that you find intriguing or maybe contrary view on it, what's exciting you these days? >> Serverless is really interesting. In fact I'd like to see serverless at the edge. 
I think it would be fascinating if Amazon Web Services could sell a serverless capability that was actually running in the mobile carrier's edge. So like on the mobile towers or in central offices. But you could do distributed computation for IoT literally at the very edge of the network, that would be incredibly powerful. So I am very interested in serverless in that regard. With Kubernetes, I think that this is the future, I think I've seen most of the other initiatives start to fail at this point. Docker Incorporated just hasn't made the progress they need to, hopefully a change in leadership will fix that. But it does mean that more and more people are gravitating towards Kubernetes and that's a good thing because whereas OpenStack has historically got no opinion, Kubernetes is a much more prescriptive model and I think that actually leads to faster innovation, a greater pace of change and combined with Helm charts, I think that we're going to see an ecosystem develop around Kubernetes that actually could be a counterweight to the public clouds and really be sort of cloud agnostic. Private, public, at the edge, who cares? >> Randy Bias, always appreciated your very opinionated viewpoints on everything that's happening here. Pleasure to catch up with you as always. John and I will be back with lots more coverage here from OpenStack Summit in Boston, thanks for watching the Cube.

Published Date : May 10 2017

SUMMARY :

Brought to you by the OpenStack Foundation, Red Hat, He is the creator of the term Pets versus Cattle, The Lie of the Benevolent Dictator, especially the big three, the development and look and smell the way that the public cloud was. a good blueprint on one of the Juniper forums and call for people to sort of get involved So, love the idea of kind of rebooting and the fourth is the carrier's obviously and we can use the word contributors in a lot of roles. of our end users and operators. the community hasn't been friendly to them. and the aggregate value of OpenContrail. of the other options out there. is that in the last year I've seen key customers by the foundation, acknowledging there's a huge landscape but I still sense that the origins of the community And that the line of business will continue of the Foundation does not allow that to occur and I think that actually leads to faster innovation, Pleasure to catch up with you as always.

ENTITIES

Entity	Category	Confidence
Randy	PERSON	0.99+
John	PERSON	0.99+
Red Hat	ORGANIZATION	0.99+
John Troyer	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
AT&T	ORGANIZATION	0.99+
Huawei	ORGANIZATION	0.99+
Juniper	ORGANIZATION	0.99+
Deutsche Telekom	ORGANIZATION	0.99+
OpenStack Foundation	ORGANIZATION	0.99+
Stu Miniman	PERSON	0.99+
OpenStack Foundation	ORGANIZATION	0.99+
Randy Bias	PERSON	0.99+
Ericsson	ORGANIZATION	0.99+
Symantec	ORGANIZATION	0.99+
Boston	LOCATION	0.99+
Cisco	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
NIH	ORGANIZATION	0.99+
The Lie of the Benevolent Dictator	TITLE	0.99+
Amazon	ORGANIZATION	0.99+
Docker Incorporated	ORGANIZATION	0.99+
Second	QUANTITY	0.99+
one	QUANTITY	0.99+
last year	DATE	0.99+
Boston, Massachusetts	LOCATION	0.99+
OpenStack Summit	EVENT	0.99+
fourth	QUANTITY	0.99+
Kubernetes	TITLE	0.98+
third segment	QUANTITY	0.98+
today	DATE	0.98+
SiliconANGLE Media	ORGANIZATION	0.98+
OpenContrail	ORGANIZATION	0.98+
Keystone	ORGANIZATION	0.98+
one winner	QUANTITY	0.98+
OpenStack Summit 2017	EVENT	0.98+
tonight	DATE	0.97+
#OpenStackSummit	EVENT	0.97+
this week	DATE	0.97+
first one	QUANTITY	0.97+
Pets versus Cattle	TITLE	0.96+
OpenContrail	TITLE	0.96+
Openstack	ORGANIZATION	0.96+
first summit	QUANTITY	0.94+
Workday	ORGANIZATION	0.93+
Contrail	ORGANIZATION	0.93+
Mirantis	ORGANIZATION	0.93+
3:30	DATE	0.9+
The Cloud Group	ORGANIZATION	0.89+
of	ORGANIZATION	0.89+
Helm	ORGANIZATION	0.89+
OpenStack	TITLE	0.88+
OpenStack foundation	ORGANIZATION	0.87+
Juniper	PERSON	0.87+
OpenStack	ORGANIZATION	0.86+

Search Results for Lie: