Tammy Whyman, AWS & Kurt Schaubach, Federated Wireless | MWC Barcelona 2023


 

>> Announcer: theCUBE's live coverage is made possible by funding from Dell Technologies, creating technologies that drive human progress. (upbeat music) (background indistinct chatter)
>> Good morning from Barcelona, everyone. It's theCUBE live at MWC23, day three of our four days of coverage. Lisa Martin here with Dave Nicholson. Dave, we have had some great conversations. Can't believe it's day three already. Anything sticking out at you from a thematic perspective that really caught your eye the last couple of days?
>> I guess I go back to our experience with the generalized world of information technology, and a lot of the parallels between what's been happening in other parts of the economy and what's happening in the telecom space now. So it helps me understand some of the complexity when I tie it back to things that I'm aware of.
>> A lot of complexity, but a big ecosystem that's growing. We're going to be talking more about the ecosystem next, and what they're doing to really enable customers, CSPs, to deliver services. We've got two guests here: Tammy Whyman joins us, the global head of telco partners at AWS, and Kurt Schaubach, CTO of Federated Wireless. Welcome to theCUBE, guys.
>> Thank you.
>> Thank you.
>> Great to have you here on day three. Lots of announcements, lots of news at MWC. But Tammy, there have been a lot of announcements from partners with AWS this week. Talk to us a little bit more about, first of all, the partner program, and then let's unpack some of those announcements. One of them is with Federated Wireless.
>> Sure. Yeah. So AWS created the partner program 10 years ago, when it really started to understand the value of bringing together the ecosystem. I think we're starting to see how this is becoming a reality. So now, 100,000 partners later, across 150 countries, and with 70% of those partners outside of the US, it's truly global in nature, with partners being ISVs and GSIs.
And then in the telco space, we're actually looking at how we help CSPs become partners of AWS and bring in new revenue streams. So that's how we started having the discussions around Federated Wireless.
>> Talk a little bit about Federated Wireless, Kurt. Give the audience an overview of what you guys are doing, and then maybe give us some commentary on the partnership.
>> Sure. So we're a shared spectrum and private wireless company, and we actually started working with AWS about five years ago to take this model that we developed to perfect the use of shared spectrum, to enable enterprise communications and bring the power of 5G to the enterprise, and bring it to all of the AWS customers and partners. So now we're one of the partner network participants, and we're working very closely with the AWS team on bringing this really unique form of connectivity to all sorts of different enterprise use cases: from solving manufacturing and warehouse logistics issues, to providing connectivity to mines, to enhancing the experience for students on a university campus. So it's a really exciting partnership. Everything that we deliver on an end-to-end basis, from design and deployment to bringing the infrastructure on-prem, all runs on AWS. (background indistinct chatter)
>> So a lot of the conversations that we've had sort of start with this concept of the radio access network, and frankly, at least in the public domain, cellular sites. And so it's grounded in the physical reality of these towers, with boxes of equipment on the tower and at the base of the tower, connected to other things. Where do AWS and Federated Wireless fit in that model, in terms of equipment at the base of a tower versus having that be off-premises in some way or another? Give us more of a flavor for the physical reality of what you guys are doing.
>> Yeah, I'll start.
>> Yeah, Tammy.
>> I'll hand it over to the real expert, but from an AWS perspective, what we're finding is, really, I don't know if it's even a convergence or kind of a de-layering of the network. Customers don't care if they're on Wi-Fi, on public spectrum, or on private spectrum. What they want are networks that are able to talk to each other, and to provide the right connectivity at the right time, with the right pricing model. So moving to the cloud gives us the flexibility to offer that quality of service, and to bring in a larger ecosystem of partners, as the networks become almost disaggregated.
>> So does the AWS strategy focus solely on things that are happening in, say, AWS locations or AWS data centers? Or is AWS also getting into the arena of what I would refer to as an Outpost, in AWS parlance, where physical equipment running a stack might actually be located where the communications towers are? What does that mix look like in terms of your strategy?
>> Yeah, certainly as customers are looking at hybrid cloud environments, we started looking at how we can use Outposts as part of the network. So we've got some great use cases where we're taking Outposts into the edge of operators' networks, and really starting to have radio in the cloud. We launched with Dish earlier, and now we're starting to see some other announcements, like the one we've made with Nokia about having RAN in the cloud as well. So using Outposts is one of our key strategies. It creates, again, a lot of flexibility for the hybrid cloud environment, and brings a lot of compute power to the edge of the network.
>> Let's talk about some of the announcements. Tammy, I was reading that AWS is expanding its telecom and private 5G network support. You've also unveiled the AWS Telco Network Builder service. Talk about that, and who it's targeted for. What does an operator do with AWS on this?
Or maybe you guys can talk about that together.
>> Sure, I can start. So for the Network Builder, the persona it's aimed at would be the network engineer within the CSPs. There was a bit of difficulty when you wanted to design a telco network on AWS versus the way that network engineers would traditionally design. I'm going to call them protocols, but I can imagine an engineer saying, "I really want to build this on the cloud, but they're making me move away from my typical way of designing a network and move it into a cloud world." So what we did was create this template, saying: you can build the network as you always do, and we're going to put the magic behind it to translate it into a cloud world. It's really about facilitating, and taking some of the friction out of, building the network.
>> What was the catalyst for that? I think it's Dish and Swisscom you've been working with, but talk about the catalyst for doing that, and how it's facilitating change, because part of that is change management, with how network engineers actually function and how they work.
>> Absolutely, yeah. We listen to customers, and we try to understand what those friction points are and what would make it easier, and that was one we heard consistently. So we wanted to apply a bit of our experience, and the way that we're able to use data, and translate that using code, so that you build a network in your traditional way, and then it spits out the formula to build the network in the cloud.
>> Got it. Kurt, I saw that there was just an announcement that Federated Wireless made with JBG Smith. Talk to us more about that. What will Federated help them create, and how are you all working together?
>> Sure. So JBG Smith is the exclusive redeveloper of an area just on the other side of the Potomac from Washington, DC, called National Landing.
And it's about half the size of Manhattan, so it's an enormous area that's getting redeveloped. It's the home of Amazon's new HQ2 location. And JBG Smith is investing, in addition to the commercial real estate, in digital placemaking: a place where people live, work, play, and connect. Part of that is bringing an enhanced level of connectivity to people's homes, to residents, and to the enterprise, and private wireless is a key component of that. So when we talk about private wireless, what we're doing with AWS is giving an enterprise the freedom to operate a network independent of a mobile network operator. That means everything from the RAN to the core, to the applications that run on this network, sits within the domain of the enterprise, merging 5G and edge compute and driving new business outcomes. That's really the most important thing. We can talk a lot about 5G here at MWC, but what the enterprise really cares about are new business outcomes: how do they become more efficient? That's really what private wireless helps enable.
>> So help us connect the dots. When we talk about private wireless, we've definitely been in learning mode here. Well, I'll speak for myself, going around and looking at some of the exhibits and seeing how things work. And I know that I wasn't necessarily 100% clear on the connection between a 5G private wireless network today and where Wi-Fi still comes into play. So if I'm a new resident in this area, happily living near the amazing new presence of AWS on the East Coast, and I want to use my mobile device, how am I connected into that private wireless network? What does that look like as a practical matter?
>> So that example you've just referred to is really something that we enable through neutral host. In fact, what we're able to do through this private network is also create carrier connectivity: basically create a pipe, almost, for the carriers to be able to reach a consumer device like that.
A lot of private wireless is also about driving business outcomes with enterprises. Take the work that we're doing with Cal Poly out in California, for example, to enable a new 5G innovation platform. This is driving all sorts of new 5G research and innovation with the university, and new applications around IoT. They need the ability to do that indoors and outdoors, in a way that's free from the domain of connectivity to a mobile network operator, and with the freedom and flexibility to merge that with edge compute. Those are some really important components. We're also doing a lot of work in things like warehouses. Think of a warehouse as a very complex RF environment. You want to bring in robotics, you want to bring in better inventory management, and Wi-Fi just isn't an effective means of providing really reliable indoor coverage. You need more secure networks, you need lower latency, and you need the ability to move more data around, again merging new applications with edge compute. That's where private wireless really shines.
>> So this is where we do the shout-out to my daughter, Rachel Nicholson, who is currently a junior at Cal Poly San Luis Obispo. Rachel, get plenty of sleep and get your homework done.
>> Lisa: She'd better be studying.
>> I held up my mobile device, and I should have said, full disclosure, we have spotty cellular service where I live. So I think of this as a Wi-Fi connected device, in fact. So maybe I confused the issue, at least.
>> Tammy, talk to us a little bit about the architecture, from an AWS perspective, that is enabling JBG Smith and Cal Poly. Is this an edge architecture? Give us a little bit more of an understanding of what that actually, technically looks like.
>> All right, I would love to pass this one over to Kurt.
>> Okay.
>> So I'm sorry, just in terms of?
>> Wanting to understand the AWS architecture. This is an edge-based architecture hosted on what? On AWS Snow? Application storage?
Give us a picture of what that looks like.
>> Right. So the beauty of this is the simplicity of it. We're able to bring an AWS Snowball or Snowcone edge appliance that runs a packet core. We're able to run some workloads on that locally, some applications, but we also obviously have the ability to bring that out to the public cloud. So depending on what the user application is, we look at anything from the AWS Snow Family to Outposts, and develop templates or solutions depending on what the customer workloads demand. But the innovation that's happened, especially around the packet core, and how we can make it so compact and able to run on such a capable appliance, is really powerful.
>> Yeah, and I will add that a lot of the diversification of the different connectivity modules that we have has been driven by the needs of the telco industry. So the adaptation of Outposts to run at the edge, the Snow Family: the telco industry is really leading a lot of the developments that AWS takes to market, because of the nature of having to have networks that can disconnect, ruggedized environments, the latency requirements, and the numerous use cases that our telco customers are facing with their end customers. It really allows us to adapt and bring the right network to the right place and the right environment. And even the same customer may have different satellite offices or remote sites with different connectivity needs.
>> Right. So it sounds like that collaboration between AWS and telco is quite strong, and symbiotic, it sounds like.
>> Tammy: Absolutely.
>> So we talked about a number of the announcements. In our final minutes, I want to talk about Integrated Private Wireless, which was just announced last week. What is that? Who are the users going to be? And I understand T-Mobile is involved there.
>> Yes. Yeah.
So this is a program that we launched based on what we're seeing as a convergence of the ecosystem of private wireless. We wanted to create a program offering spectrum that is regulated as well, and we wanted to offer that in more of a multi-country environment. So we launched with T-Mobile, Telefónica, KDDI, and a number of others as a start, to begin bringing regulated spectrum into the picture, as well as other ISVs who are going to be bringing unique use cases. Because when you look at it, well, we've got the connectivity into this environment, the mine or the port, but what are the use cases? So ISVs who provide, maybe, asset tracking, or some of the health and safety applications, and we bring them in as part of the program. And I think an important piece is the actual discoverability of this, because if you're a buyer on the other side, where do you start? So we created a portal with this group of ISVs and partners, so that one could come and lay out their needs, then start picking through, and the ecosystem is recommended to them. So it's really a way to discover, and also to procure, a private wireless network much more easily than could be done in the past.
>> That's a great service.
>> And we're learning a lot from the market. What we're doing together in our partnership, through a lot of these ruggedized, remote-location deployments, mines, clearing underbrush in forest areas to prevent forest fires: there's a tremendous number of applications for private wireless that the conventional carrier networks just aren't prioritized to serve, and you need a different level of connectivity. Privacy is a big concern as well, and data security: keeping data on premises, which is another big application that we're able to drive through these edge compute platforms.
>> Awesome.
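As an aside, the discovery portal Tammy describes (a buyer states their needs and the portal recommends matching ecosystem partners) is, at its core, a matching exercise. Below is a minimal, purely hypothetical sketch of that idea; the partner names, capability tags, and ranking rule are all invented for illustration and are not the actual AWS portal.

```python
# Hypothetical sketch of a partner-discovery portal: a buyer states needs,
# and partners whose capability tags cover those needs are recommended.
# Partner names and tags are invented for illustration.
CATALOG = {
    "AssetTrackCo": {"asset-tracking", "iot"},
    "SafeSiteISV": {"health-safety", "iot"},
    "PortOpsISV": {"logistics", "asset-tracking"},
}

def recommend(needs):
    """Return partners ranked by how many of the buyer's needs they cover."""
    scored = [
        (name, len(tags & needs))
        for name, tags in CATALOG.items()
        if tags & needs  # keep only partners covering at least one need
    ]
    # Sort by coverage (descending), then by name for a stable order.
    return [name for name, score in sorted(scored, key=lambda p: (-p[1], p[0]))]

print(recommend({"asset-tracking", "iot"}))
# ['AssetTrackCo', 'PortOpsISV', 'SafeSiteISV']
```

A real portal would add procurement steps and richer matching, but the "state needs, get a recommended ecosystem" flow is the shape described in the interview.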
Guys, thank you so much for joining us on the program, talking about what AWS and Federated are doing together, and how you're really helping to evolve the telco landscape and make life ultimately easier for all the Nicholsons to connect over Wi-Fi or private 5G.
>> Keep us in touch. And from two Californians, you had us when you said clear the brush, prevent fires.
>> You did. Thanks guys, it was a pleasure having you on the program.
>> Thank you.
>> Thank you.
>> Our pleasure. For our guests, and for Dave Nicholson, I'm Lisa Martin. You're watching theCUBE live from our third day of coverage of MWC23. Stick around, Dave and I will be right back with our next guest. (upbeat music)
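Stepping back from the conversation: the AWS Telco Network Builder idea discussed above (let network engineers describe the network in the terms they are used to, and translate that description into cloud deployments) can be sketched in miniature. The toy below is purely illustrative, not the actual TNB service; real TNB works from standardized NFV templates, and the input format and resource mapping here are invented.

```python
# Toy illustration of the "describe the network your way, translate it to
# cloud resources" idea behind a service like AWS Telco Network Builder.
# The mapping table and spec format are invented for illustration.
NF_TO_CLOUD = {
    "packet-core": {"service": "eks-cluster", "nodes": 3},
    "ran-cu": {"service": "outpost-instance", "nodes": 2},
    "firewall": {"service": "ec2-instance", "nodes": 1},
}

def translate(network_spec):
    """Map a network engineer's list of network functions to cloud stubs."""
    resources = []
    for nf in network_spec["functions"]:
        blueprint = NF_TO_CLOUD.get(nf)
        if blueprint is None:
            raise ValueError(f"no cloud blueprint for network function: {nf}")
        resources.append({"function": nf, **blueprint})
    return {"name": network_spec["name"], "resources": resources}

spec = {"name": "demo-5g-core", "functions": ["packet-core", "firewall"]}
deployment = translate(spec)
print(deployment["resources"][0]["service"])  # prints: eks-cluster
```

The point of the sketch is the friction removal Tammy describes: the engineer's input stays in network terms, and the translation to cloud constructs happens behind the template.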

Published Date : Mar 1 2023


Ahmad Khan, Snowflake & Kurt Muehmel, Dataiku | Snowflake Summit 2022


 

>> Hey everyone. Welcome back to theCUBE's live coverage of Snowflake Summit 22, live from Las Vegas, Caesars Forum. Lisa Martin here with Dave Vellante. We've got a couple of guests here, and we're going to be talking about everyday AI. You want to know what that means? You're in the right spot. Kurt Muehmel joins us, the chief customer officer at Dataiku, and Ahmad Khan, the head of AI and ML strategy at Snowflake. Guys, great to have you on the program.
>> It's wonderful to be here. Thank you so much.
>> So we want to understand, Kurt, what everyday AI means, but before we do that, for the audience who might not be familiar with Dataiku, give them a little bit of an overview. What do you guys do, what's your mission, and maybe a little bit about the partnership?
>> Yeah, great. Very happy to do so, and thanks so much for this opportunity. Well, Dataiku, we are a collaborative platform for enterprise AI. What that means is it's software that sits on top of incredible infrastructure, notably Snowflake, and allows people from different backgrounds, data analysts, data scientists, data engineers, all to come together, to work together, to build out machine learning models, and ultimately the AI that's going to be the future of their business. So we're very excited to be here, and very proud to be a very close partner of Snowflake.
>> So Ahmad, what is Snowflake's AI strategy? Is it to partner? Where do you pick up? Frank said today, we're not doing it all. The ecosystem, by design.
>> Yeah, absolutely. So we believe in best of breed. Look, we think that we're the best data platform, and for data science and machine learning, we want our customers to really use the best tool for their use cases. And Dataiku is our leading partner in that space.
And so when you talk about machine learning and data science, people talk about training a model, but the really difficult parts and challenges come before you train the model: how do you get access to the right data? And then after you train the model, how do you run the model, and how do you manage the model? That's very, very important, and that's where our partnership with Dataiku comes into place. Snowflake provides the platform that can process data at scale for the pre-processing bit, and Dataiku comes in and really simplifies the process of deploying the models and managing the models.
>> Got it. Thank you.
>> Kurt, Dataiku talks about everyday AI. I want to break that down. What do you mean by that, and how is this partnership with Snowflake empowering you to deliver that to companies?
>> Yeah, absolutely. So everyday AI, for us, is kind of a future state that we are building towards, where we believe that AI will become so pervasive in all of the business processes, all the decision making that organizations have to go through, that it's no longer this special thing that we talk about. It's just the day-to-day life of our businesses. And we can't do that without partners like Snowflake, because they're bringing together all of that data, and ensuring that there is the computational horsepower behind it. We heard that this morning in some of the keynote, talking about that broad democratization and, let's call it, the pressure that's going to put on the underlying infrastructure. So ultimately, everyday AI for us is where companies own that AI capability, building it themselves with very broad participation in its development, and all that work then is pushed down into best-of-breed infrastructure, notably, of course, Snowflake.
>> Well, you said push down. There's a term in the industry, pushdown optimization. What does that mean? How is it evolving? Why is it so important?
>> Ahmad, do you want to take a first stab at that?
>> Yeah, absolutely. So when you're processing and shaping data before you train a model, you have to do it at scale. That data is coming from all different sources; it's human-generated and machine-generated data, and we're talking millions and billions of rows. You have to make sense of it. You have to transform that data into the right kind of features, the right kind of signals, that inform the machine learning model you're trying to train. And so that's where any kind of large-scale data processing is automatically pushed down by Dataiku into Snowflake's scalable infrastructure. You don't get into memory issues; you don't get into situations where your pipeline runs overnight and doesn't finish in time. You can really take advantage of the scalable nature of cloud computing using Snowflake's infrastructure. So a lot of that processing is actually getting pushed down from Dataiku into the scalable Snowflake compute engine.
>> How does this affect the life of a data scientist? You always hear that a data scientist spends 80% of their time wrangling data. I presume there's an infrastructure component around that. We heard this morning you're making infrastructure, my words, self-serve. Does this directly address that problem? And what else are you doing to address that 80% problem?
>> It certainly does, right? That's how you solve for data scientists needing on-demand access to computing resources, or, of course, to the underlying data: by ensuring that that work doesn't have to run on their laptop, and doesn't have to run on some constrained physical machines in a data center somewhere. Instead, it gets pushed down into Snowflake and can be executed at scale, with incredible parallelization. Now, what's really important is the ongoing development between the two products and within that technology. Today Snowflake announced the introduction of Python within Snowpark, which is really, really exciting, because it opens up this capability to a much wider audience. Dataiku provides that both through a visual interface and, historically, since last year, through Java UDFs, but those are kind of the two extremes: you have a no-code or low-code population on one side, and a very high-code population on the other. This Python integration really allows us to reach the fat center of the data science population, for whom Python really is the lingua franca that they've been learning for decades now.
>> Sure. So talking about the data scientist, I want to elevate that a little bit, because you both are enterprise companies, Dataiku and Snowflake. Kurt, as the chief customer officer, obviously you're with customers all the time. If we look at the macro environment and all the challenges, companies have to be data companies these days; if you're not, you're not going to be successful. The question is how: extract insights and value, and act on them. I'm just curious whether your customer conversations are elevating up to the C-suite, or the board, in terms of democratizing access to data to be competitive, to build new products and services. We've seen tremendous momentum in customer growth on the Snowflake side. But what are you hearing from customers as they're dealing with some of these current macro pains?
>> Yeah, I think that is the conversation today. At the C-level, it's not only how do we leverage new infrastructure. Most of them now are starting to have Snowflake; I think Frank said 50% of the Fortune 500, so we can say most have that in place. But now the question is: how do we ensure we're getting access to that data, to that computational horsepower, for a broader group of people, so that it becomes truly a transformational initiative, and not just an IT initiative, not just a technology initiative, but really a core business initiative? And that really has been a pivot. I've been with my company now for almost eight years, and we've really seen that discussion go from much more niche discussions at the team or departmental level to a truly corporate, strategic level. How do we build AI into our corporate strategy? How do we really do that in practice?
>> We hear a lot about, hey, I want to inject data, AI, and machine intelligence into applications. And we've talked about how those are separate stacks: you've got the data and analytics stack over here, the application development stack, and the databases off in the corner. We see you guys bringing those worlds together. My question is: what does that stack look like? I took a snapshot, I think it was Frank's presentation today. He had infrastructure at the lowest level, then live data. So infrastructure is cloud; live data, that's multiple data sources coming in; workload execution, you made some announcements there; application development, that's the tooling that is needed; and then marketplace, that's how you bring together this ecosystem. Monetization is how you turn data into data products and make money. Is that the stack, the new stack that's emerging here? Are you guys defining that?
>> Absolutely. You talked about the 80% of the time being spent by data scientists, and part of that is actually discovering the right data, being able to give the right access to the right people, and being able to go and discover that data. So you go from that angle all the way to processing and training a model, and then all the predictions and insights coming out of the model are consumed downstream by data applications. And so the two major announcements I'm super excited about today are, first, the ability to run Python in Snowflake, which is Snowpark. As a Python developer, you can now bring the processing to where the data lives, rather than move the data out to where the processing lives. So both SQL developers and Python developers are fully enabled. And then the predictions coming out of models trained by Dataiku are used downstream by these data applications, for most of our customers. That's where the second announcement, with Streamlit, is super exciting: I can write a complete data application without writing a single line of JavaScript, CSS, or HTML. I can write it completely in Python. It makes me super excited, as a Python developer myself.
>> And you guys have joint customers headed in this direction, doing this today. Can you talk about that?
>> Yeah, we do. There are a few that we're very proud of: well-known companies like REI or Emeritus. But one that was mentioned this morning by Frank again is Novartis, the pharmaceutical company. They have been extremely successful in accelerating their AI and ML development by expanding access to their data. And that's a combination of both the Dataiku layer, allowing for that work to be developed in that workspace, and, of course, the underlying platform of Snowflake, without which they would not have been able to realize those gains. They were talking about very significant increases in efficiency, in everything from data access to model development to deployment. It's really, honestly, inspiring to see.
>> And it was great to see Novartis mentioned on the main stage: massive time to value there. We've actually got them on the program later this week. Another joint customer you mentioned is REI. We'll let you go, because you're off to do a session with REI, is that right?
>> Yes, that's exactly right. So we're going to be doing a fireside chat, talking about, in fact, much of the same: all of the success that they've had in accelerating their analytics workflow development, and the actual development of AI capabilities within, of course, that beloved brand.
>> Excellent. Guys, thank you so much for joining Dave and me, talking about everyday AI and what you're doing together, Dataiku and Snowflake, to empower organizations to actually achieve it and live it. We appreciate your insights.
>> Thank you both.
>> Thank you for having us.
>> Our pleasure. For our guests, and for Dave Vellante, I'm Lisa Martin. You're watching theCUBE's live coverage of Snowflake Summit 22 from Las Vegas. Stick around, our next guest joins us momentarily.
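As a closing illustration: the pushdown optimization Ahmad describes (record the transformations, then compile them into SQL that executes inside the warehouse, instead of pulling rows out to the client) can be sketched with a toy lazy pipeline. This is not Dataiku's or Snowpark's actual API; the class below is invented to show the shape of the idea.

```python
# Toy sketch of pushdown: chained operations are recorded lazily, then
# compiled into one SQL statement that runs where the data lives.
# The API is invented for illustration; it mimics dataframe-style engines.
class LazyTable:
    def __init__(self, table, columns=None, predicates=None):
        self.table = table
        self.columns = columns or ["*"]
        self.predicates = predicates or []

    def select(self, *cols):
        # Return a new pipeline; nothing executes yet.
        return LazyTable(self.table, list(cols), self.predicates)

    def where(self, predicate):
        return LazyTable(self.table, self.columns, self.predicates + [predicate])

    def to_sql(self):
        # Compile the recorded operations into a single pushed-down query.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(self.predicates)
        return sql

query = (LazyTable("events")
         .select("user_id", "event_ts")
         .where("event_type = 'click'")
         .where("event_ts >= '2022-06-01'"))
print(query.to_sql())
# SELECT user_id, event_ts FROM events WHERE event_type = 'click' AND event_ts >= '2022-06-01'
```

The practical payoff is the one Ahmad names: the client never materializes millions of rows in local memory, because the warehouse executes the whole compiled statement.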

Published Date : Jun 14 2022


Tom Clancy, UiPath & Kurt Carlson, William & Mary | UiPath FORWARD III 2019


 

(upbeat music) >> Announcer: Live from Las Vegas, it's theCUBE! Covering UiPath FORWARD Americas 2019. Brought to you by UiPath. >> Welcome back, everyone, to theCUBE's live coverage of UiPath FORWARD, here in Sin City, Las Vegas, Nevada. I'm your host, Rebecca Knight, co-hosting alongside Dave Vellante. We have two guests for this segment. We have Kurt Carlson, Associate Dean for faculty and academic affairs of the Mason School of Business at the College of William & Mary. Thanks for coming on the show. >> Thank you for having me. >> Rebecca: And we have Tom Clancy, the SVP of learning at UiPath, thank you so much. >> Great to be here. >> You're a Cube alum, so thank you for coming back. >> I've been here a few times. >> A Cube veteran, I should say. >> I think 10 years or so. >> So we're talking today about a robot for every student. This was just announced in August: William & Mary is the first university in the US to provide automation software to every undergraduate student, thanks to a four-million-dollar investment from UiPath. Tell us a little bit about this program, Kurt, how it works and what you're trying to do here. >> Yeah, so first of all, thanks to Tom and the people at UiPath for making this happen. This is a bold and incredible initiative, one that, frankly, when we had it initially, we thought that maybe we could get a robot for every student; we weren't sure that other people would be willing to go along with that. But UiPath was. They see the vision, and so it was really a meeting of the minds on a common purpose. The idea was pretty simple: this technology is transforming the world in a way that, we think, is going to transform the way that students actually are students. But it's certainly transforming the world that our students are going into. And so, we want to give them exposure to it.
We wanted to try and be the first business school on the planet that actually prepares students not just for the way RPA is being used today, but the way that it's going to be used when AI starts to take hold, when it becomes the gateway to AI three, four, five years down the road. So we talked to UiPath, they thought it was a really good idea, and we went all in on it. All of our starting juniors in the business school have robots right now; they've all been trained through the Academy live session, and we're putting together a course. It's very exciting. >> So, Tom, you've always been an innovator when it comes to learning. Here's my question: how come we didn't learn this stuff when we were in college? We learned Fortran. >> I don't know, I only learned BASIC, so I can't speak to that. >> So you know, last year we talked about how you're scaling learning, some of the open philosophy that you have. So give us the update on how you're pushing learning forward, and why the College of William & Mary. >> Okay, so if you buy into a bot for every worker, or a bot for every desktop, that's a lot of bots, that's a lot of desktops, right? There are studies out there from the research companies that say there are somewhere between 100 and 200 million people that need to be educated on RPA, RPA/AI. So if you buy into that, which we do, then traditional learning isn't going to do it. We're going to miss the boat. So we have a multi-pronged approach. The first thing is to democratize RPA learning. Two and a half years ago we created UiPath Academy, and it's 100% free. After two and a half years, we've had 451,000 people go through the academy courses; that's huge. But we think there's a lot more. Over the next three years we think we'll train at least two million people. But the challenge still is, if we train five million people, there's still a hundred million that need to know about it.
So, the second biggest thing we're doing is this: last year at this event, we announced our academic alliance program. We had one university; now we're approaching 400 universities. But what we're doing with William & Mary is a lot more than just providing a course, and I'll let Kurt talk to that. There is so much more that we could be doing to educate our students and our youth, upskilling and reskilling the existing workforce. When you break down that hundred million people, they come from a lot of different backgrounds, and we're trying to touch as many people as we can. >> You guys are really out ahead of the curve. You saw this a little bit with data science, with some colleges leaning in. So what led you guys to the decision to actually invest in and prioritize RPA? >> Yeah, I think what we're trying to accomplish requires incredibly smart students. It requires students that can sit at the interface between what we would think of today as an RPA developer and a decision maker who would be stroking the check or signing the contract. There's got to be somebody that sits in that space who understands enough about how you would actually execute this implementation: what the right buildout is, how we're going to build a portfolio of bots, how we're going to prioritize the different processes that we might automate, how we're going to balance processes that might have a nice ROI but be harder to absorb for the individual whose process is being automated against processes that the individual would love to have automated but might not have as great an ROI. How do you balance that whole set of things? So what we've done is worked with UiPath to bring together the ideas of automation with the ideas of being a strategic thinker in process automation, and we're designing a course in collaboration to help train our students to hit the ground running. >> Rebecca, it's really visionary, isn't it?
I mean, it's not just about using the tooling, it's about how to apply the tooling to create competitive advantage or change lives. >> I used to cover business education for the Financial Times, so I completely agree that this really is a game changer for the students: to have this kind of access to technology and the ability to explore this leading edge of software robotics, and to graduate from college, this isn't even graduate school, already having these skills. So tell me, Kurt, what are they doing? What is the course, what does it look like, how are they using this in the classroom? >> The course is a one-credit course. It's 14 hours, but it actually turns into about 42 when you add the work that goes on outside of class. They're learning about the large conceptual issues: how do you prioritize which processes; what process should you go through to make sure that you measure in advance of implementation, so that you can do an audit on the back end and have proof points on the effectiveness (so you've got to measure in advance); creating a portfolio of prospective processes and then scoring them, and how do you do that. So they're learning all that conceptual business-strategy and implementation material in the first half. And to keep them engaged with the software, we're giving them small skills, we're calling them skillets, in every one of those sessions, which add up to having a fully automated and programmed robot. Then they're going to go into a series of days where in each one they learn a big skill. And the big skills are ones that are going to be useful for the students in their lives as people, in their lives as students, and in their lives as entrepreneurs using RPA to create new ventures, or in the organizations they go to. We've worked with UiPath and with our alums who've implemented this, folks at EY, Booz.
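The portfolio-scoring exercise Kurt describes, ranking candidate processes by a blend of estimated ROI and how easily the affected team can absorb the change, can be sketched roughly as below. The fields, weights, and example processes are illustrative assumptions, not the William & Mary or UiPath methodology.

```python
# Hypothetical sketch of scoring a portfolio of automation candidates.
# Each process gets a blended score of estimated ROI and ease of adoption
# (both normalized to 0..1); the portfolio is then ranked by that score.

def score(process, roi_weight=0.6, adoption_weight=0.4):
    # blend the two normalized inputs with illustrative weights
    return (roi_weight * process["roi"] +
            adoption_weight * process["ease_of_adoption"])

candidates = [
    {"name": "invoice matching",  "roi": 0.9, "ease_of_adoption": 0.2},
    {"name": "report generation", "roi": 0.5, "ease_of_adoption": 0.9},
    {"name": "data entry",        "roi": 0.8, "ease_of_adoption": 0.6},
]

portfolio = sorted(candidates, key=score, reverse=True)
for p in portfolio:
    print(f'{p["name"]}: {score(p):.2f}')
```

Note how a high-ROI process can still rank last when adoption is hard, which is exactly the balancing act described above; measuring these inputs before implementation is what makes the back-end audit possible.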
In fact, we went up to DC; we had a three-hour meeting with these folks about what skills students need to learn, and they told us. And so we built three big classes, each around one of those skills, so that our students come out with the ability to be business translators, not necessarily the hardcore programmers. We're not going to prevent them from doing that, but the goal is to be these business translators that sit between the programming and the decision makers. >> That's huge because, you know, my son's a senior in college. He and his friends all either want to work for Amazon, Google, an investment bank, or one of the big SIs, right? So this is a perfect role for a consultant to go in and advise. Tom, I wanted to ask you, and you and I have known each other for a long time, but one of the reasons I think you were successful at your previous company is that you weren't just focused on a narrow vendor view, how to make metrics work, for instance. I presume you're taking the same philosophy here. It transcends UiPath and is really more about the category, if you will, the potential. Can you talk about that? >> So we listen to our customers, and now we listen to the universities too, and they're going to help guide us to where we need to go. Most companies in tech, you work with marketing, you work with engineering, and you build product courses. And you also try to sell those courses, because it's a really good P&L when you sell training. We don't think that's right for the industry, for UiPath, for our customers, or our partners. So when we democratize learning, everything else falls into place. So as we go forward, we have a bunch of ideas. As we get more into AI, you'll see more AI-type courses. We team with 400 universities now; by the end of next year, we'll probably have a thousand universities signed up.
And so there's a lot of subject matter expertise, and they come to us with ideas. You mentioned a 14-hour course; we have a four-hour course, and we also have a 60-hour course. So we want to be as flexible as possible, because different universities want to apply it in different ways. We also heard about Lean Six Sigma, I mean, sorry, Lean RPA, so we might build a course on Lean RPA, because that's really important. Solution architect is one of the biggest gaps in the industry right now, so we look at where these gaps are, we listen to everybody, and then we just execute. >> Well, it's interesting you said Six Sigma; we have Jean Younger coming on, she's a Six Sigma expert. I don't know if she's a black belt, but she's pretty sure. She talks about how to apply RPA to make business processes Six Sigma, where you would never spend the time and money otherwise, I mean, if it's an airplane engine, for sure, but now, so that's kind of transformative. Kurt, I'm curious as to how you, as a college, market this. You know, it's a very competitive industry, if you will. So how do you see this attracting students and separating you guys from the pack? >> Well, there are two separate things: how we actively try to take advantage of this, and what effects it's having already on enrollments to the business school. Well, students at William & Mary get admitted to William & Mary, and they're fantastic, amazingly good undergraduate students. The best students at William & Mary come to the Raymond A. Mason School of Business. If you take the undergraduate GPA of students in the business school, they're top five in the country. And what we've seen since we announced this is that our applications to the business school are up. I don't know that it's a one-to-one correlation. >> Tom: I think it is. >> I believe it's a strong predictor, in part because it's such an easy sell.
And so when we talked to those alums and friends in DC and said, tell us why our students should do this, they said: if for no other reason, we are hiring students that have these skills into data science lines at salaries in the mid-90s. When I said that to my students, they fell out of their chairs. So there's incredible opportunity here for them; that's the easy way to market it internally. It aligns with things that are happening at William & Mary: trying to be innovative, nimble, and entrepreneurial. We've been talking about being innovative, nimble, and entrepreneurial for longer than we've been doing it; we believe we're getting there, and we believe this is the type of activity that fits. As far as promoting it, we're telling everybody that will listen that this is interesting, and people are listening. You know, there's the standard sort of marketing strategy that goes around, and we are coordinating with UiPath on that. But internally, this sells pretty easily. This is something people are looking for; we're going to make it ready for the world the way that it's going to be now and in the future. >> Well, I imagine the big consultants are hovering as well. You mentioned DC: Booz Allen Hamilton, and Accenture, EY, Deloitte, PwC, IBM itself. They all want the best and the brightest, and now you're going to have this skill set that is a sweet spot for their businesses. >> Kurt: That's the plan. >> I'm just thinking back to who these people are; these are 19- and 20-year-olds. They've never experienced the dreariness of work and the drudge tasks that we all know well. So, in terms of this whole business-translator idea, that they're going to be the people who sit in the middle and can speak both languages: what kind of skills are you trying to impart to them? Because it is a whole different skill set.
>> Our vision is that in two or three years, the nodes in the processes that currently make implementing RPA complex and require significant programmer skills, these places where, right now, there's a human making a relatively mundane decision, but it's still a model, there's a decision node there, we think AI is going to take over. AI is simply going to put models into those decision nodes. We also think a lot of the programming that takes place, and you're seeing it now with StudioX, a lot of that programming is going to go away. And what that's going to do is elevate the business process from the mundane to what would currently be considered a human-intelligence process. When we get into that space, people skills are going to be really important, prioritizing is going to be really important, identifying organizations that are ripe for this at this moment in time, and which processes to automate. Those are the kinds of skills we're trying to get students to develop, and what we're selling it as, partly, is: this is going to make you ready for the world the way we think it's going to be, a bit of a guess. But we're also saying, if you don't want to automate mundane processes, then come with us on a different magic carpet ride. And that magic carpet ride is: imagine all the processes that don't exist right now because nobody would ever conceive of them, because they couldn't possibly be sustained, or they would be too mundane. Now think about those processes through a business lens, so take a business student and think about all the potential when you look at it that way. So this course that we're building has that; everything in the course is wrapped in that. And so, at the end of the course, they're going to be doing a project, and the project is to bring a new process to the world that doesn't currently exist.
Don't program it, don't worry about whether or not you have a team that could actually execute it. Just conceive of a process that doesn't currently exist and imagine, with the potential of RPA, how we would make it happen. We think we're going to be able to bring a lot of students along through that innovative lens, even though they are 19 and 20, because 19- and 20-year-olds love innovation, even though they've never submitted a procurement report. >> Exactly! >> Or an innovation presentation. >> We'll need to do a Cube follow-up on that. What Kurt just said is the reason why, Tom, I think this market is being way undercounted. I think it's hard for the IDCs and the Forresters, because they look back and say how big was it last year, how fast are these companies growing, but, to your point, there are so many unknown processes that could be attacked. The TAM on this could be enormous. >> We agree. >> Yeah, I know you do, but I think it's a point worth mentioning, because it touches so many different parts of every organization that people perhaps don't realize the impact it could have. >> You know, listening to you, Kurt: when you look at these young kids, at least compared to me, all the coding and setting up a robot, that's the easy part; they'll pick that up right away. It's really the thought process that goes into identifying new opportunities, and that's what I think you're challenging them to do. But learning how to do robots, I think, is going to be pretty easy for this new digital generation. >> Piece of cake. >> Tom and Kurt, thank you so much for coming on theCUBE; a really fascinating conversation. >> Thank you. >> Thanks, you guys. >> I'm Rebecca Knight, for Dave Vellante. Stay tuned for more of theCUBE's live coverage of UiPath FORWARD. (upbeat music)
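Kurt's "models in decision nodes" idea above, a hard-coded human rule in an automation flow being replaced by a learned decision function, can be sketched in a few lines. The tiny nearest-centroid "model" and the invoice-approval scenario are illustrative assumptions, not how any specific RPA platform implements this.

```python
# Hypothetical illustration of swapping a mundane fixed rule for a model
# learned from historical decisions. The "model" here is a toy
# nearest-centroid classifier standing in for a real AI service.

def rule_based_approve(invoice):
    # the original mundane decision node: approve anything under a limit
    return invoice["amount"] < 1000

def train_centroid_model(history):
    # average the amounts of historically approved vs. rejected invoices
    approved = [h["amount"] for h in history if h["approved"]]
    rejected = [h["amount"] for h in history if not h["approved"]]
    a_mean = sum(approved) / len(approved)
    r_mean = sum(rejected) / len(rejected)

    def model(invoice):
        # approve if the amount is closer to the "approved" centroid
        x = invoice["amount"]
        return abs(x - a_mean) <= abs(x - r_mean)

    return model

history = [
    {"amount": 200, "approved": True},
    {"amount": 450, "approved": True},
    {"amount": 5000, "approved": False},
    {"amount": 8000, "approved": False},
]
model = train_centroid_model(history)
# the learned node can approve cases the rigid rule would reject
print(rule_based_approve({"amount": 1200}), model({"amount": 1200}))
```

The business-translator role described in the interview sits exactly at this seam: deciding which decision nodes are safe and valuable to hand to a model, rather than writing the model itself.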

Published Date : Oct 15 2019


Kurt Kuckein, DDN Storage, and Darrin Johnson, NVIDIA | CUBEConversation, Sept 2018


 

(upbeat music) >> Hi, I'm Peter Burris, and welcome to another Cube Conversation from our fantastic studios in beautiful Palo Alto, California. Today we're going to be talking about what infrastructure can do to accelerate AI, and specifically we're going to use a burgeoning relationship between DDN and NVIDIA to describe what we can do to accelerate AI workloads by using higher-performance, smarter, and more focused infrastructure for computing. To have this conversation we've got two great guests: Kurt Kuckein, who is senior director of marketing at DDN, and Darrin Johnson, who is global director of technical marketing for enterprise at NVIDIA. Kurt, Darrin, welcome to theCUBE. >> Thank you very much. >> So let's get going on this, because this is a very, very important topic, and I think it all starts with this notion that there is a relationship that you guys have put forward. Kurt, why don't you describe it? >> Sure. So what we're announcing today is DDN's A3I architecture powered by NVIDIA. It is a full rack-level solution, a reference architecture that's been fully integrated and fully tested to deliver an AI infrastructure very simply, very completely. >> So if we think about why this is important: AI workloads clearly put a special stress on the underlying technology. Darrin, talk to us a little bit about the nature of these workloads and why, in particular, things like GPUs and other technologies are so important to make them go fast. >> Absolutely. As you probably know, AI is all about the data. Whether you're doing medical imaging, whether you're doing natural language processing, whatever it is, it's all driven by the data. The more data that you have, the better results that you get, but to drive that data into the GPUs you need great I/O, and that's why we're here today: to talk about DDN and the partnership, and how to bring that I/O to the GPUs on our DGX platforms. >> So if we think about what you described: a lot of small files, often randomly distributed, with nonetheless very high-priority jobs that just can't stop midstream and start over. >> Absolutely. And if you think about the history of high-performance computing, which is very similar to AI, really the I/O is just that: lots of files that you have to get there with low latency and high throughput. And that's why DDN's nearly twenty years of experience working in that exact same domain is perfect, because you get the parallel file system, which gives you that throughput, gives you that low latency, and just helps drive the GPUs. >> You mentioned HPC and 20 years of experience. It used to be that in HPC you'd have scientists with a bunch of graduate students setting up some of these big honkin' machines, but now we're moving into the commercial domain. You don't have graduate students running around, you don't have very low-cost, high-quality people; you have, you know, a lot of administrators, good people, but with a lot to learn. So how does this relationship actually start bringing AI within reach of the commercial world? >> That's exactly where this reference architecture comes in, right? A customer doesn't need to start from scratch. They have a design now that allows them to quickly implement AI, something that's really easily deployable. We've fully integrated this solution; DDN has made changes to our parallel file system appliance to integrate directly within the DGX-1 environment. That makes it even easier to deploy and to extract the maximum performance without having to run around and tune a bunch of knobs and change a bunch of settings. It's really going to work out of the box. >> And NVIDIA has done more than just the DGX-1; it's more than hardware. You've done a lot of optimization of different AI toolkits. Talk about that, Darrin. >> Yeah. So, going back to the example I used, researchers in the past with HPC: what we have today are data scientists. Data scientists understand PyTorch, they understand TensorFlow, they understand the frameworks. They don't want to understand the underlying file system, networking, RDMA, InfiniBand, any of that. They just want to be able to come in, run their TensorFlow, get the data, get the results, and keep turning that, whether it's a single GPU or 90 DGXs, or as many DGXs as you want. So this solution helps bring that to customers much more easily, so those data scientists don't have to be system administrators. >> So, a reference architecture that makes things easier, but this is about more than just some of these commercial deployments; there's also the overall ecosystem: new application providers, application developers. How is this going to impact the aggregate ecosystem that's growing up around the need to deliver AI-related outcomes? >> Well, I think one point that Darrin was getting to there, and one of the big effects, is that as these ecosystems reach the point where they're going to need to scale, that's somewhere DDN has tons of experience. Many customers are starting off with smaller data sets; they still need the performance, and a parallel file system in that case is going to deliver that performance. But then as they grow, going from one GPU to 90 DGXs is going to require an incredible amount of both performance scalability and probably capacity scalability from their I/O. And that's another thing that we've made easy with A3I: being able to scale that environment seamlessly within a single namespace, so that people don't have to deal with a lot of tuning and turning of knobs to make this stuff work really well and drive the outcomes that they need as they're successful. In the end, it is the application that's most important to both of us, not the infrastructure. It's making the discoveries faster, it's processing information out in the field faster, it's doing analysis of the MRI faster; it's helping the doctors, helping anybody who's using this to really make faster decisions, better decisions. >> Exactly. And just to add to that: in the automotive industry you have data sets that run from 50 to 500 petabytes, and you need access to all that data all the time, because you're constantly training and retraining to create better models and better autonomous vehicles. You need the performance to do that. DDN helps bring that to bear, and with this reference architecture simplifies it, so you get the value-add of NVIDIA GPUs plus the ecosystem of software, plus DDN. It's a match made in heaven. >> Darrin Johnson of NVIDIA, Kurt Kuckein of DDN, thanks very much for being on theCUBE. >> Thank you very much. >> And I'm Peter Burris. Once again, I'd like to thank you for watching this Cube Conversation. Until next time. (upbeat music)
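The I/O pattern the guests describe, storage fast enough that GPUs never stall waiting for the next batch, can be illustrated with a toy prefetching loop. The queue depth, the fake reader thread, and the `train` function are illustrative assumptions for the sketch, not DDN or NVIDIA code; a real deployment would use a parallel file system behind a framework data loader.

```python
# Toy sketch of why storage throughput matters for GPU training: a
# background thread prefetches batches into a bounded queue so the
# consumer loop (standing in for the GPU) never waits on I/O.

import queue
import threading

def reader(batches, out_q):
    for b in batches:
        out_q.put(b)           # simulates reading a batch from storage
    out_q.put(None)            # sentinel: no more data

def train(batches, prefetch_depth=2):
    q = queue.Queue(maxsize=prefetch_depth)
    t = threading.Thread(target=reader, args=(batches, q))
    t.start()
    processed = 0
    while True:
        batch = q.get()
        if batch is None:
            break
        processed += len(batch)   # stand-in for a training step
    t.join()
    return processed

print(train([[1, 2], [3, 4, 5], [6]]))   # all six samples consumed
```

Scaling from one GPU to many multiplies the demand on the reader side, which is why the interview keeps returning to parallel file system throughput rather than raw capacity.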

Published Date : Oct 4 2018


9_20_18 DDN Nvidia Launch about Benchmarking with PETER & KURT KUCKEIN


 

(microphone not on) >> be 47 (laughter) >> Are you ready? >> Here we go, alright and, three, two... >> You know, it's great to see real benchmarking data, because this is a very important domain and there is not a lot of benchmarking information out there around some of these other products that are available. But let's try to turn that benchmarking information into business outcomes, and to do that we've got Kurt Kuckein back from DDN. Kurt, welcome back. Let's talk a bit about how these high-value outcomes that business seeks with AI are going to be achieved as a consequence of this new performance, faster capabilities, etcetera. >> So there are a couple of considerations. The first consideration, I think, is just the selection of AI infrastructure itself. Right, we have customers telling us constantly that they don't know where to start. Now they have readily available reference architectures that tell them: here's something you can implement and get installed quickly; you're up and running, running your AI from day one. >> So the decision process for what to get is reduced. >> Exactly. >> Okay. >> Number two is you're unlocking all ends of the investment with something like this, right? You're maximizing the performance on the GPU side. You're maximizing the performance on the ingest side for the storage. You're maximizing the throughput of the entire system, so you're really gaining the most out of your investment there. And not just gaining the most out of the investment, but truly accelerating the application, and that's the end goal, right, that we're looking for with customers. Plenty of people can deliver fast storage, but if it doesn't impact the application and deliver faster results, cut run times down, then what are you really gaining from having fast storage? And so that's where we're focused: we're focused on application acceleration.
>> So simpler architecture, faster implementation based on that, integrated capabilities, ultimately all resulting in better application performance. >> Better application performance, and in the end something that's more reliable as well. >> Kurt, thanks again for being on The Cube. >> Thanks for having me.

Published Date : Sep 28 2018



9_20_18 DDN Nvidia Launch AI & Storage with PETER & KURT KUCKEIN


 

>> Hi, I'm Peter Burris, welcome to another Cube Conversation from our wonderful studios in beautiful Palo Alto, California. Great conversation today, we're going to be talking about the relationship between AI, business, and especially some of the new infrastructure technologies in the storage part of the stack. And to join me in this endeavor is Kurt Kuckein, who's a senior director of product marketing at DDN. Kurt Kuckein, welcome to The Cube. >> Thanks, Peter, happy to be here. >> So tell us a little bit about DDN to start. >> So DDN is a storage company that's been around for 20 years. We've got a legacy in high-performance computing, and that's where we see a lot of similarities with this new AI workload. DDN is well-known in that HPC community; if you look at the top 100 supercomputers in the world, we're attached to 75 percent of them, and so we have a fundamental understanding of that type of scalable need. That's where we're focused: we're focused on performance requirements, we're focused on scalability requirements, which can mean multiple things, right, it can mean the scaling of performance, it can mean the scaling of capacity, and we're very flexible. >> Well let me stop you and say, so you've got a lot of customers in the high-performance world, and a lot of those customers are at the vanguard of moving to some of these new AI workloads. What are customers saying? With this significant engagement that you have with the best and the brightest out there, what are they saying about this transition to AI?
Well I think it's fascinating that we kind of have a bifurcated customer base here, where we have those traditionalists who probably have been looking at AI for over 40 years, right, and they've been exploring this idea and they've gone through the peaks and troughs in the promise of AI, and then contraction because CPUs weren't powerful enough. Now we've got this emergence of GPUs in the supercomputing world, and if you look at how the supercomputing world has expanded in the last few years, it is through investment in GPUs. And then we've got an entirely different segment, which is a much more commercial segment, and they're maybe newly invested in this AI arena, right, they don't have the legacy of 30, 40 years of research behind them, and they are trying to figure out exactly, you know, what do I do here? A lot of companies are coming to us saying, hey, I have an AI initiative. Well, what's behind it? Well, we don't know yet, but we've got to have something, and they don't understand where this infrastructure is going to come from. >> So the general availability of AI technologies, and obviously Flash has been a big part of that, very high-speed networks within data centers, virtualization certainly helps as well, now opens up the possibility of bringing these algorithms, some of which have been around for a long time but have required very specialized, bespoke configurations of hardware, to the enterprise. That still begs the question: there are some differences between high-performance computing workloads and AI workloads. What are the similarities, and then let's explore some of the differences. >> So the biggest similarity, I think, is just that it's an intractable, hard IO problem, right, at least from the storage perspective.
It requires a lot of high throughput, depending on where those IO characteristics are from, it can be very small-file, high-op-intensive type workflows, but it needs the ability of the entire infrastructure to deliver all of that seamlessly from end to end. >> So really high-performance throughput so that you can get to the data you need and keep this computing element saturated. >> Keeping the GPU saturated is really the key, that's where the huge investment is. >> So how do AI and HPC workloads differ? >> So how they're fundamentally different is often AI workloads operate on a smaller scale in terms of the amount of capacity, at least today's AI workloads. As soon as a project encounters success, what our forecast is, is those things will take off and you'll want to apply those algorithms bigger and bigger data sets. But today, you know, we encounter things like 10-terabyte data sets, 50-terabyte data sets and a lot of customers are focused only on that. But what happens when you're successful, how do you scale your current infrastructure to petabytes and multi-petabytes when you'll need it in the future? >> So when I think of HPC, I think of often very, very big batch jobs, very, very large, complex data sets. When I think about AI, like image processing or voice processing, whatever else it might be, I think of a lot of small files, randomly accessed. >> Right. >> That require nonetheless some very complex processing, that you don't want to have to restart all the time. >> Right. >> And a degree of simplicity that's required to make sure that you have the people that can do it. Have I got that right? >> You've got it right. Now one, I think, misconception is, is on the HPC side, right, that whole random small file thing has come in in the last five, 10 years and it's something DDN's been working on quite a bit, right. 
Our legacy was in high-performance throughput workloads, but the workloads have evolved so much on the HPC side as well, and, as you posited at the beginning, so much of it has become AI and deep-learning research >> Right, so they look a lot more alike. >> They do look a lot more alike. >> So if we think about the revolving relationship now between some of these new data-first workloads, AI-oriented, change the way the business operates types of stuff, what do you anticipate is going to be the future of the relationship between AI and storage? >> Well, what we foresee really is that the explosion in AI needs and AI capabilities is going to mimic what we already see and really drive what we see on the storage side, right? We've been showing that graph for years and years and years of just everything going up and to the right, but as AI starts working on itself and improving itself, as the collection means keep getting better and more sophisticated and have increased resolutions, whether you're talking about cameras or in life sciences, acquisition capabilities just keep getting better and better and the resolutions get better and better, it's more and more data, right? And you want to be able to expose a wide variety of data to these algorithms; that's how they're going to learn faster. And so what we see is that the data-centric part of the infrastructure is going to need to scale, even if you're starting today with a smaller workload. >> Kurt Kuckein, DDN, thanks very much for being on The Cube. >> Thanks for having me. >> And once again, this is Peter Burris with another Cube Conversation, thank you very much for watching. Until next time. (electronic whooshing)
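The point Kurt keeps returning to — keeping the GPU saturated by overlapping storage reads with compute — can be sketched in a few lines. This is an illustrative sketch, not DDN's implementation: the function names are invented, a thread feeding a bounded queue stands in for a real prefetching data pipeline, and `time.sleep` simulates storage latency.

```python
import queue
import threading
import time

def load_batch(i):
    """Simulated storage read; in practice this is the I/O-bound step."""
    time.sleep(0.01)  # stand-in for reading a batch from disk
    return list(range(i * 4, i * 4 + 4))

def prefetcher(n_batches, out_q):
    """Reads ahead on a background thread so compute never waits on storage."""
    for i in range(n_batches):
        out_q.put(load_batch(i))
    out_q.put(None)  # sentinel: no more batches

def run(n_batches=8, depth=2):
    q = queue.Queue(maxsize=depth)  # bounded: prefetch only `depth` batches ahead
    t = threading.Thread(target=prefetcher, args=(n_batches, q))
    t.start()
    results = []
    while True:
        batch = q.get()
        if batch is None:
            break
        results.append(sum(batch))  # stand-in for the GPU compute step
    t.join()
    return results

print(run())
```

While the "GPU" works on batch *n*, the prefetcher is already pulling batch *n+1* off storage, which is the whole idea behind high-throughput ingest for AI training.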

Published Date : Sep 28 2018



Olivier Frank & Kurt Bager | HPE Discover 2017 Madrid


 

>> Announcer: Live from Madrid, Spain, it's theCUBE, covering HPE Discover Madrid 2017, brought to you by Hewlett Packard Enterprise. >> Welcome back to Madrid, everybody, this is theCUBE, the leader in live tech coverage. My name is Dave Vellante, I'm here with Peter Burris, this is day one of HPE Discover Madrid. Olivier Frank is here, he's the Worldwide Senior Sales Director for Alliances for IoT at HPE, and Kurt Bayer, otherwise known as Bager in English, in America. He's Vice President of IoT Solutions for EMEA at PTC, did I get that right? >> Yeah you did it. >> Bayer? All right, well thank you for sharing that with me. Welcome to theCUBE, gentlemen. Olivier, let me start with you. The relationship between PTC and HPE is not brand new. You guys got together a while back. What catalyzed that getting together? >> Yeah, it's a great question, and thank you for inviting us, it's a great pleasure to be on theCUBE, and for me the first time, so thank you for that. >> Welcome. >> Yeah, you know, the partnership is all about action and doing things together, so we did start about a year ago with, you may remember, Flowserve, an industrial pump that we showcased, and since then we've been working very closely together to actually allow our customers to go and test the technology themselves. So I would say the partnership has matured, we now have two live environments that customers can visit, one in Europe, in Germany, in Aachen, with RWTH University, and one in the US, near Houston, with Texmark, who you know because you also came to the show. >> Right, okay, Kurt, give us the update on PTC. Company's been in business for a long time, IoT is like a tailwind. >> It is, that's right. PTC is mostly known for CAD and PLM, so for 30 years they made 3D CAD software for when you design and make an aircraft or car engine.
But over the last five years, PTC has moved heavily into IoT, and spent a billion on acquiring and designing a software platform that can connect and calculate and show in augmented reality. >> So let me build on that, because PTC as a CAD company, as a PLM company, has done a phenomenal job of using software and technology to be able to design things to a level of specificity and tolerance that just wasn't able to be done before, and it's revolutionized how people build products. But now, because technology's advanced, you can leverage that information in your drawings, in your systems to create a new kind of an artifact, a digital twin, that allows a business that's working closely with you to actually render that in an IoT sense and add intelligence to it. Have I got that right? >> You got it exactly right. So making the copy. We can draw it and we can design the physical part, and we can make the digital twin of the physical part with sensors. So in that way you can loop back and see if the calculation, the design, the engineering you have made is the right fit, or you need to change things. You can optimize the product by having the live digital twin of the thing that you've designed physically. >> So it's like a model, except it's not a model. It's like a real-world instantiation. A model is an estimate, right? A digital twin is actual real data. >> It's fed by live data, so you have a real copy of what's going on. And we use it not only for closing the loop of designing products, but also to optimize on the industrial floor, to optimize operations and the manufacturing of things, and we use it to connect things, so you can do predictive maintenance, or you can turn products into a service: instead of selling an asset, the company can buy by click, by use, plus the products are connected.
>> I want to really amplify this, Dave, 'cause it's really important, I want to test this with you, 'cause the whole concept of using technology, IoT technology, to improve the operational efficiency, to improve the serviceability, to evolve your business models, your ability to do that is tied back to the fidelity of the models you're using for the things that are delivering the services, and I don't think the world fully understands the degree to which it's a natural leap from CAD and related technologies into building the digital artifacts that are gonna be necessary to make that all work. Have I got that right? >> You got it completely right. So it is moving to having live information from the physical object. So if you go to augmented reality, you have the opportunity to look at things and get live information about temperature, power, streaming of water, and all these things that go on inside the product. You also have the opportunity to understand if there's something wrong with the product; you can click on it and you can be directed on how to change and service things with the augmented reality, all built from the CAD drawing in the beginning that is combined with sensor information and >> And simulate, and test, and all the other things that are hard, but obviously to do that, you need a whole bunch of other technology, and I guess that's where HPE comes in. >> Exactly. >> Absolutely. In fact, to bounce on that thought, we talk a lot about connected operations, where, you know, we are showing the digital twin, but one of the new use cases that we're showing on the floor here is what we call smart product engineering.
So we're basically using the CAD environment of (mumbles), running on that Edgeline with edge compute, you know, enterprise compute capability, manageability and security, and running on that same platform then simulation from companies like Ansys, right, and then doing 3D printing, printing prototypes, and basically instrumenting the prototype. We're using a bike, the saddle stem of a bike, as a showcase, and they are able to connect and collect the data. We're partnering with National Instruments, who are also well-known, and reinject the real data into the digital model. So again, the engineers can compare their thoughts and their design assumptions with the real physical prototype, and accelerate time to market. >> PTC's been a leader in starting with the CAD and then pulling it through product life cycle management, PLM. So talk about how this is going to alter the way PLM becomes a design tool for digital business. If I'm right. >> You're right, it becomes an industrial innovation platform, from creating the product to the full life cycle of it. >> Peter: All the way up to the business model. >> All the way up to the business model. And talking about analytics, so if you have a lot of data and you want to make sure you get some decisions made fast about predictive maintenance, that's an area where we are partnering with HP, so we have a lot of power close in the edge, close to the products, that can do the calculations from the devices, from the product, and get some fast results in order to do predictive maintenance, and only send the results away from the location. >> So what are some of the things you guys are most excited about, Olivier?
Well, really excited about making those use cases, be it the smart product engineering or the predictive maintenance, you know, work for our customers. So behind the scenes we have great solutions; now we're partnering on the sales front to kind of go together to customers. We have a huge install base on both sides, and we're picking the right customers interested in this digital transformation and making it real for them, because we know it's a journey, we know it's kind of the crawl, walk, run, and it's really about accelerating, you know, turning insights into information and into actions, and that's really where we are very much excited to work together. >> So it's not just, so the collaboration's extending to go-to-market is what I'm hearing. And so what's the uptake been like, what are customers, customers must be asking you, "Where do I start?" What do you tell them? >> Before you start, it's important that you have a business case, a business value, that you understand what you want to achieve by integrating an IoT solution. That's important. Then you need to figure out what is the data, what is the fast solution I need to take, and then you can start deciding on the planning of your implementation of the IoT. >> Can I go back one step further? >> Yep. >> You tell me if I got that. And that one step further is, look, every... Innovation and adoption happens faster when you can take an existing asset and create new value. >> Kurt: Exactly. >> So isn't PTC actually starting by saying, hey, you've already got these designs, you've already got these models. Reuse them, create new life, give 'em new life, create new value with 'em. Do things in ways that now you can work with your customers totally differently, and isn't that kind of where it starts?
It does, and you already have a good portion of what you need, so in order to get fast value out of your new product, or the new thing you can do with the product by connecting the products, PTC and HP is a good platform to move on. >> Yeah, and the pretesting, precertifying, packaging the software with the hardware, is allowing our customers to go faster to proof of concept and then to production. So we have a number of workshops customers can come to, again as I mentioned at the beginning, in Germany, in Aachen, or in Houston at our Texmark facility, where we can basically walk the talk with customers and start those early POCs, defining the business success factors, the business value they want to take out of it, and basically get the ball rolling. But it's really exciting because we're touching really some of the key digital transformations of our enterprise customers. >> And don't forget that you need a partner that can do a good job in service, because you need an organization that can help you get it through, and HP is a strong service organization too. >> Well this idea of the intelligent edge has a lot of obviously executive support at Hewlett Packard Enterprise, that keeps buzzing at theCUBE today. Meg Whitman's in the house, she's right next door, and we're gonna do a quick cutaway to Meg, give her a shoutout, trying to get her over here to talk about her six-year tenure here, but you know, that top-down executive support has been so critical in terms of HPE getting early into the edge, IoT, intelligent edge you call it. Tom Bradicich obviously a leader, he's coming on. You mentioned National Instruments, PTC, you guys were first, really, from a traditional IT business to really get into that space.
We're also the first to converge OT and IT, so we're showing on the floor what we're doing in end-of-line quality testing for automotive, for example, taking the PXI hardware standard, which is for instrumentation and real-time data acquisition, into our converged systems. So what I found is really amazing. You take the same architecture, and we can do it edge to core to cloud, right, that's very powerful. One software framework, one IT architecture that spans it all. >> Peter: Not some time in the future, but right now. >> Yeah, right now. >> So we talk about a three-tier, maybe even a four-tier data model, where you've got data at the edge, real time, maybe you don't persist all of it or a lot of it. >> We call it experience data, or primary data, at the edge; vet data, or secondary data; and then business optimization data at the top level, that's at the cloud. >> So let's unpack that a little bit and get your perspective. So the edge, obviously you're talking about real-time decision making, autonomous cars, you're not gonna go back to the cloud to make that decision. Then, well, you call it core, what did you call it? >> The hybrid IT. >> The vet, the vet. That's an aggregation point, right, to collect a lot of the data from the edge, and then cloud maybe is where you do the deep analysis and the deep modeling. And that cloud can be on-prem, or it can be on the public cloud. Is that a reasonable data model for the flow of data for edge and IoT? >> I believe it is, because some of these products generate a lot of data, and you need to be able to handle that data, and honestly, connectivity is not for free, and sometimes it's difficult if it's on the industry floor, the manufacturing floor; you need good connectivity, but you still have limitations. So if you can do the local analytics and then you only send the results to the core, then it's a perfect model.
And then there's a lot of regulation around data, so for many countries, and especially in Europe, there are boundaries around the data; it's not all data that you can move to a cloud, especially if it's out of the country. So the model makes a good hybrid between speed, connectivity, analytics and the legislation problem. >> Dave: And you've both got solutions at each layer? >> Absolutely, so in fact... So PTC can run at the edge, at the core or in the cloud, and of course we are powering the three pillars. And I think what's also interesting to know is that with the advances in artificial intelligence, as was explored during the main session, it is pivotal that you keep a lot of data in order to learn from those data, right? So I think it's quite fascinating that we're going to store more and more data, probably make some of it useful right away, and maybe store some that we come back to. That's why we're working also with companies like OSIsoft, an historian, which is collecting this time-stamped data for later utilization. But I wanted also to say that what's great working with PTC is that it's kind of a workflow in media, in terms of collecting the data, contextualizing them, and then visualization and then analytics. But we're developing a rich ecosystem, because in this complex world of IoT, again, it's kind of an art and a science, and the ability to partner, ourselves but also our, let's say, friendly partners, is very, very critical. >> Dave: Guys, oh good, last word. >> I will say we started with a digital twin, and for some companies they might be late to get the digital twin. The longer you have been collecting data from a live product >> The better the model gets >> The stronger you will be, >> Peter: Better fidelity. >> The better the model you can build, because you have the bigger data. So it's a matter of getting the data into the twin. >> That's exactly what our research suggests. We've got a lot of examples of this.
It's the difference between sampling and having an entire corpus of data. >> Kurt: Exactly. >> Kurt, Olivier, thanks very much for coming on theCUBE. >> Thank you. >> Thank you so much. >> Great segment, guys. Okay, keep it right there everybody, Dave Vellante for Peter Burris, we'll be back in Madrid right after this short break.
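The digital-twin and edge-analytics ideas Kurt and Olivier describe reduce to a small pattern: a software copy of a physical asset, fed by live sensor readings, that runs its analytics locally and forwards only a compact result to the core. A minimal sketch — the asset name, temperature limit, and readings are invented for illustration; this is the general pattern, not PTC's or HPE's implementation:

```python
from statistics import mean

class DigitalTwin:
    """A live software copy of a physical asset, fed by sensor readings."""

    def __init__(self, asset_id, temp_limit=80.0):
        self.asset_id = asset_id
        self.temp_limit = temp_limit
        self.readings = []

    def ingest(self, temp_c):
        """Raw samples accumulate at the edge; they never leave the site."""
        self.readings.append(temp_c)

    def summary(self):
        """Local analytics: only this small record travels to the core/cloud."""
        return {
            "asset": self.asset_id,
            "mean_temp": round(mean(self.readings), 2),
            "needs_service": max(self.readings) > self.temp_limit,
        }

twin = DigitalTwin("pump-7")
for t in [71.2, 73.9, 85.1, 70.4, 79.8, 88.6]:
    twin.ingest(t)
print(twin.summary())
```

The raw readings stay at the edge, sidestepping the connectivity and data-residency constraints discussed above, while the aggregated summary is all the core tier needs for predictive maintenance.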

Published Date : Nov 28 2017



Collibra Data Citizens 22


 

>>Collibra is a company that was founded in 2008, right before the so-called modern big data era kicked into high gear. The company was one of the first to focus its business on data governance. Now, historically, data governance and data quality initiatives were back-office functions, and they were largely confined to regulated industries that had to comply with public policy mandates. But as the cloud went mainstream, the tech giants showed us how valuable data could become, and the value proposition for data quality and trust evolved from a primarily compliance-driven issue to becoming a lynchpin of competitive advantage. But data in the decade of the 2010s was largely about getting the technology to work. You had these highly centralized technical teams that were formed, and they had hyper-specialized skills to develop data architectures and processes to serve the myriad data needs of organizations. >>And it resulted in a lot of frustration with data initiatives for most organizations that didn't have the resources of the cloud guys and the social media giants to really attack their data problems and turn data into gold. This is why today, for example, there's quite a bit of momentum toward rethinking monolithic data architectures. You see, you hear about initiatives like data mesh and the idea of data as a product. They're gaining traction as a way to better serve the data needs of decentralized business unit users; you hear a lot about data democratization. So these decentralization efforts around data, they're great, but they create a new set of problems. Specifically, how do you deliver, like, a self-service infrastructure to business users and domain experts? Now the cloud is definitely helping with that, but also, how do you automate governance? This becomes especially tricky as protecting data privacy has become more and more important.
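Automating governance starts with something simple: declarative data quality rules evaluated continuously against every record, with violations surfaced instead of eyeballed. A minimal sketch of that idea — the rule names, thresholds, and rows are invented for illustration; this is the general pattern, not Collibra's implementation:

```python
def check_quality(rows, rules):
    """Apply declarative quality rules to each row; collect violating row indices per rule."""
    violations = {name: [] for name in rules}
    for i, row in enumerate(rows):
        for name, rule in rules.items():
            if not rule(row):
                violations[name].append(i)
    return violations

# Illustrative rules: a completeness check and a simple range check.
rules = {
    "email_present": lambda r: bool(r.get("email")),
    "age_in_range": lambda r: 0 <= r.get("age", -1) <= 120,
}

rows = [
    {"email": "a@x.com", "age": 34},
    {"email": "", "age": 29},
    {"email": "b@x.com", "age": 150},
]
print(check_quality(rows, rules))
```

Run on every batch as data flows through a pipeline, a report like this is what turns data quality from a periodic audit into the continuous, functional requirement described above.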
>>In other words, while it's enticing to experiment and run fast and loose with data initiatives, kinda like the Wild West, to find new veins of gold, it has to be done responsibly. As such, the idea of data governance has had to evolve to become more automated and intelligent. Governance and data lineage are still fundamental to ensuring trust as data moves like water through an organization. No one is gonna use data that isn't trusted. Metadata has become increasingly important for data discovery and data classification. As data flows through an organization, the ability to continuously check for data flaws and automate data quality has become a functional requirement of any modern data management platform. And finally, data privacy has become a critical adjacency to cyber security. So you can see how data governance has evolved into a much richer set of capabilities than it was 10 or 15 years ago. >>Hello and welcome to the Cube's coverage of Data Citizens, made possible by Collibra, a leader in so-called data intelligence and the host of Data Citizens 2022, which is taking place in San Diego. My name is Dave Vellante and I'm one of the hosts of our program, which is running in parallel to Data Citizens. Now at the Cube we like to say we extract the signal from the noise, and over the next couple of days we're gonna feature some of the themes from the keynote speakers at Data Citizens, and we'll hear from several of the executives. Felix Van de Maele, who is the co-founder and CEO of Collibra, will join us, along with one of the other founders of Collibra, Stijn Christiaens, who's gonna join my colleague Lisa Martin. I'm gonna also sit down with Laura Sellers, she's the Chief Product Officer at Collibra. We'll talk about some of the announcements and innovations they're making at the event, and then we'll dig in further to data quality with Kirk Haslbeck.
He's an amazingly smart dude who founded OwlDQ, a company that he sold to Collibra last year. Now many companies didn't make it through the Hadoop era; they missed the industry waves and became driftwood. Collibra, on the other hand, has evolved its business. They've leveraged the cloud, expanded their product portfolio, and leaned in heavily to some major partnerships with cloud providers, as well as receiving a strategic investment from Snowflake earlier this year. So it's a really interesting story that we're thrilled to be sharing with you. Thanks for watching, and I hope you enjoy the program. >>Last year, the Cube covered Data Citizens, Collibra's customer event. And the premise that we put forth prior to that event was that despite all the innovation that's gone on over the last decade or more with data, starting with the Hadoop movement, we had data lakes, we had Spark, the ascendancy of programming languages like Python, the introduction of frameworks like TensorFlow, the rise of AI, low code, no code, et cetera, businesses still find it's too difficult to get more value from their data initiatives. And we said at the time, maybe it's time to rethink data innovation. While a lot of the effort has been focused on more efficiently storing and processing data, perhaps more energy needs to go into thinking about the people and the process side of the equation, meaning making it easier for domain experts to gain insights from data, trust the data, and begin to use that data in new ways, fueling data products, monetization and insights. Data Citizens 2022 is back, and we're pleased to have Felix Van de Maele, who is the founder and CEO of Collibra, on the Cube. We're excited to have you, Felix. Good to see you again. >>Likewise Dave. Thanks for having me again. >>You bet.
All right, we're going to get the update from Felix on the current data landscape, how he sees it, why data intelligence is more important now than ever, and get current on what Collibra has been up to over the past year and what's changed since Data Citizens 2021. And we may even touch on some of the product news. So Felix, we're living in a very different world today with businesses and consumers. They're struggling with things like supply chains and uncertain economic trends, and we're not just snapping back to the 2010s. That's clear, and that's really true as well in the world of data. So what's different in your mind in the data landscape of the 2020s from the previous decade, and what challenges does that bring for your customers? >>Yeah, absolutely. And I think you said it well, Dave, in the intro: that rising complexity and fragmentation in the broader data landscape hasn't gotten any better over the last couple of years. When we talk to our customers, that level of fragmentation and complexity, how do we find data that we can trust, that we know we can use, has only gotten more difficult. So that trend is continuing; what is changing is that the trend has become much more acute. The other thing we've seen over the last couple of years is the level of scrutiny that organizations are under with respect to data. As data becomes more mission critical, as data becomes more impactful and important, the level of scrutiny with respect to privacy, security and regulatory compliance is only increasing as well, which again is really difficult in this environment of continuous innovation, continuous change, and continuously growing complexity and fragmentation. >>So it's become much more acute. And to your earlier point, we do live in a different world. In the past couple of years we could probably just kind of brute-force it, right? We could focus on the top line.
There was enough investment to be had. I think nowadays organizations are in a very different environment, where there's much more focus on cost control, productivity and efficiency. How do we truly get value from that data? So again, I think it's just another incentive for organizations to now truly look at data and to scale it, not just from a technology and infrastructure perspective, but how do you actually scale data from an organizational perspective, right? You said it: the people and process. How do we do that at scale? And that's only becoming more important. And we do believe that the economic environment we find ourselves in today is going to be a catalyst for organizations to really dig in more seriously, if you will, than they maybe have in the past. >>You know, I don't know when you guys founded Collibra if you had a sense as to how complicated it was going to get, but you've been on a mission to really address these problems from the beginning. How would you describe your mission, and what are you doing to address these challenges? >>Yeah, absolutely. We started Collibra in 2008, so in some sense in the last financial crisis. And that was really the start of Collibra, where we found product-market fit working with large financial institutions to help them cope with the increasing compliance requirements they were faced with because of the financial crisis. And here we are again in a very different environment, of course, almost 15 years later. But data is only becoming more important, and our mission to deliver trusted data for every user, every use case and across every source, frankly, has only become more important.
So while it has been an incredible journey over the last 14, 15 years, I think we're still relatively early in our mission to, again, be able to provide everyone, and that's why we call it data citizens. We truly believe that everyone in the organization should be able to use trusted data in an easy manner. That mission is only becoming more important and more relevant. We definitely have a lot more work ahead of us, because we are still relatively early in that journey. >>Well, that's interesting because, you know, in my observation it takes seven to 10 years to actually build a company, and the fact that you're still in the early days is kind of interesting. I mean, Collibra's had a good 12 months or so since we last spoke at Data Citizens. Give us the latest update on your business. What do people need to know about your current momentum? >>Yeah, absolutely. Again, there's a long tail of organizations that are only now maturing their data practices, and we've seen that influence a lot of the business growth we've seen, with broader adoption of the platform. We work with some of the largest organizations in the world, like Adobe, Heineken, Bank of America, and many more. We now have over 600 enterprise customers, all industry leaders, in every single vertical. So it's really exciting to see that and continue to partner with those organizations. On the partnership side, again, there's a lot of momentum in the markets with some of the cloud partners like Google, Amazon, Snowflake, Databricks and others, right? As those new modern data infrastructures and modern data architectures all move to the cloud, it's a great opportunity for us, our partners and of course our customers, to help them transition to the cloud even faster.
>>And so we see a lot of excitement and momentum there. We made an acquisition about 18 months ago around data quality and data observability, which we believe is an enormous opportunity. Of course, data quality isn't new, but I think there are a lot of reasons why we're so excited about quality and observability now. One is around leveraging AI and machine learning, again to drive more automation. And the second is that those data pipelines that are now being created in the cloud, in these modern data architectures, have become mission critical. They've become real time. And so monitoring and observing those data pipelines continuously has become absolutely critical, so we're really excited about that as well. And on the organizational side, I'm sure you've heard the term data mesh, something that's gaining a lot of momentum, rightfully so. It's really the type of governance that we've always believed in: federated, focused on domains, giving a lot of ownership to different teams. I think that's the way to scale data organizations. And so that aligns really well with our vision, and from a product perspective we've seen a lot of momentum with our customers there as well. >>Yeah, you know, a couple things there. I mean, the acquisition of OwlDQ, you know, Kirk Haslbeck and their team. It's interesting: data quality used to be this back office function, largely confined to highly regulated industries. It's come to the front office; it's top of mind for chief data officers. Data mesh: you mentioned you guys are a connective tissue for all these different nodes on the data mesh. That's key. And of course we see you at all the shows. You're a critical part of many ecosystems, and you're developing your own ecosystem. So let's chat a little bit about the products.
We're going to go deeper into products later on at Data Citizens 22, but we know you're debuting some new innovations, you know, whether it's under the covers in security, or making data more accessible for people just dealing with workflows and processes, as you talked about earlier. Tell us a little bit about what you're introducing. >>Yeah, absolutely. We're super excited, a ton of innovation. And if we think about the big theme, like I said, we're still relatively early in this journey towards that mission of data intelligence, that really bold and compelling mission. Our customers are just starting on that journey, and we want to make it as easy as possible for organizations to actually get started, because we know it's important that they do. And for the organizations and customers that have been with us for some time, there's still a tremendous amount of opportunity to expand the platform further. And again, to make it easier to accomplish that mission and vision around the data citizen: that everyone has access to trustworthy data in a very easy way. So that's really the theme of a lot of the innovation that we're driving. >>A lot of it is ease of adoption and ease of use, but also, how do we make sure that Collibra becomes this kind of mission critical enterprise platform, from a security, performance, architecture, scale and supportability perspective, so that we're truly able to deliver that kind of enterprise mission critical platform. And so that's the big theme. From a product perspective, there's a lot of new innovation that we're really excited about. A couple of highlights. One is around the data marketplace. Again, a lot of our customers have plans in that direction: how to make it easy.
How do we make available a true kind of shopping experience, so that anybody in your organization can, in a very easy, search-first way, find the right data product, find the right dataset, and then consume that data. Usage analytics: how do we help organizations drive adoption, show them where things are working really well and where they have opportunities. Homepages, again, to make things easy for anyone in your organization to get started with Collibra. You mentioned the workflow designer: again, we have a very powerful enterprise platform. >>One of our key differentiators is the ability to really drive a lot of automation through workflows. And now we've provided a new low code, no code workflow designer experience, so customers can really take it to the next level. There's a lot more new product around Collibra Protect, which, in partnership with Snowflake, which has been a strategic investor in Collibra, is focused on how we make access governance easier. How are we able to make sure that as you move to the cloud, things like access management and masking around sensitive data, PII data, are managed in a much more effective way? We're really excited about that product. There's more around data quality. Again, how do we get that deployed as easily and quickly and widely as we can? Moving that to the cloud has been a big part of our strategy.
Again, integrations, we talk about being able to connect to every source. Integrations are absolutely critical and we're really excited to deliver new integrations with Snowflake, Azure and Google Cloud storage as well. So there's a lot coming out. The, the team has been work at work really hard and we are really, really excited about what we are coming, what we're bringing to markets. >>Yeah, a lot going on there. I wonder if you could give us your, your closing thoughts. I mean, you, you talked about, you know, the marketplace, you know, you think about data mesh, you think of data as product, one of the key principles you think about monetization. This is really different than what we've been used to in data, which is just getting the technology to work has been been so hard. So how do you see sort of the future and, you know, give us the, your closing thoughts please? >>Yeah, absolutely. And I, and I think we we're really at this pivotal moment, and I think you said it well. We, we all know the constraint and the challenges with data, how to actually do data at scale. And while we've seen a ton of innovation on the infrastructure side, we fundamentally believe that just getting a faster database is important, but it's not gonna fully solve the challenges and truly kind of deliver on the opportunity. And that's why now is really the time to deliver this data intelligence vision, this data intelligence platform. We are still early, making it as easy as we can. It's kind of, of our, it's our mission. And so I'm really, really excited to see what we, what we are gonna, how the marks gonna evolve over the next, next few quarters and years. I think the trend is clearly there when we talk about data mesh, this kind of federated approach folks on data products is just another signal that we believe that a lot of our organization are now at the time. >>The understanding need to go beyond just the technology. 
We really need to think about how we actually scale data as a business function, just like we've done with IT, with HR, with sales and marketing, with finance. That's how we need to think about data. I think now is the time, given the economic environment that we're in, with much more focus on cost control, productivity and efficiency. Now is the time we need to look beyond just the technology and infrastructure and think about how to scale data, how to manage data at scale. >>Yeah, it's a new era. The next 10 years of data won't be like the last, as I always say. Felix, thanks so much, and good luck in San Diego. I know you're going to crush it out there. >>Thank you Dave. >>Yeah, it's a great spot for an in-person event, and of course the content post-event is going to be available at collibra.com, and you can of course catch the Cube coverage at thecube.net and all the news at siliconangle.com. This is Dave Vellante for the Cube, your leader in enterprise and emerging tech coverage. >>Hi, I'm Jay from Collibra's Data Office. Today I want to talk to you about Collibra's Data Intelligence Cloud. We often say Collibra is a single system of engagement for all of your data. Now, when I say data, I mean data in the broadest sense of the word, including reference data and metadata. Think of metrics, reports, APIs, systems, policies, and even business processes that produce or consume data. Now, the beauty of this platform is that it ensures all of your users have an easy way to find, understand, trust, and access data. But how do you get started? Well, here are seven steps to help you get going. One, start with the data. What's data intelligence without data? Leverage the Collibra data catalog to automatically profile and classify your enterprise data wherever that data lives: databases, data lakes or data warehouses, whether in the cloud or on premise. >>Two, you'll then want to organize the data, and you'll do that with data communities.
This can be by department, line of business or functional team, however your organization organizes work and accountability. And for that you'll establish community owners. Communities make it easy for people to navigate through the platform and find the data, and they help create a sense of belonging for users. An important and related side note here: we find it's typical in many organizations that data is thought of as just an asset, and IT and data offices are viewed as the owners of it, really the central teams performing analytics as a service provider to the enterprise. We believe data is more than an asset; it's a true product that can be converted to value. And that also means establishing business ownership of data, where strategy and ROI come together with subject matter expertise. >>Okay, three. Next, back to those communities: there, the data owners should explain and define their data, not just the tables and columns, but also the related business terms, metrics and KPIs. These objects, which we call assets, are typically organized into business glossaries and data dictionaries. I definitely recommend starting with the topics that are most important to the business. Four, these are the steps that enable you and your users to have some fun with it. Linking everything together builds your knowledge graph, also known as a metadata graph, by linking or relating these assets together. For example, linking a data set to a KPI to a report now enables your users to see what we call the lineage diagram that visualizes where the data in your dashboards actually came from, what the data means and who's responsible for it. Speaking of which, here's five. Leverage the Collibra trusted business reporting solution on the marketplace, which comes with workflows for those owners to certify their reports, KPIs, and data sets. >>This helps foster trust in their data.
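The linking described in step four, a data set related to a KPI related to a report, can be sketched as a tiny metadata graph. This is an illustrative sketch only; the asset names and helper functions are hypothetical, not Collibra's API:

```python
# Minimal sketch of a metadata knowledge graph: assets are nodes,
# "derived from" relations are edges. All names are illustrative only.
from collections import defaultdict

relations = defaultdict(list)  # asset -> list of upstream assets

def link(downstream, upstream):
    """Relate an asset to the asset it is derived from."""
    relations[downstream].append(upstream)

# A dataset feeds a KPI, which feeds a report (step four's example).
link("kpi:churn_rate", "dataset:crm.customers")
link("report:quarterly_churn", "kpi:churn_rate")

def lineage(asset):
    """Walk upstream relations to produce a lineage trail."""
    trail = [asset]
    for upstream in relations.get(asset, []):
        trail.extend(lineage(upstream))
    return trail

print(lineage("report:quarterly_churn"))
# ['report:quarterly_churn', 'kpi:churn_rate', 'dataset:crm.customers']
```

Walking those upstream edges is essentially what a lineage diagram visualizes: where the number on a dashboard actually came from.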
Six, easy-to-navigate dashboards or landing pages, right in your platform, for your company's business processes are the most effective way for everyone to better understand and take action on data. Here's a pro tip: use the dashboard design kit on the marketplace to help you build compelling dashboards. Finally, seven: promote the value of this to your users, and be sure to schedule enablement office hours and new employee onboarding sessions to get folks excited about what you've built and implemented. Better yet, invite all of those community and data owners to these sessions so that they can show off the value that they've created. Those are my seven tips to get going with Collibra. I hope these have been useful. For more information, be sure to visit collibra.com. >>Welcome to the Cube's coverage of Data Citizens 2022, Collibra's customer event. My name is Dave Vellante. With us is Kirk Haslbeck, who's the vice president of Data Quality at Collibra. Kirk, good to see you. Welcome. >>Thanks for having me, Dave. Excited to be here. >>You bet. Okay, we're going to discuss data quality and observability. It's a hot trend right now. You founded a data quality company, OwlDQ, and it was acquired by Collibra last year. Congratulations. And now you lead data quality at Collibra. So we're hearing a lot about data quality right now. Why is it such a priority? Take us through your thoughts on that. >>Yeah, absolutely. It's definitely exciting times for data quality, which, you're right, has been around for a long time. So why now, and why is it so much more exciting than it used to be? The statement may be a bit stale, but we all know that companies use more data than ever before, and the variety has changed and the volume has grown. And while I think that remains true, there are a couple other hidden factors at play that everyone's so interested in, as to why this is becoming so important now.
And I guess you could break this down simply and think about it: if, Dave, you and I were going to build, you know, a new healthcare application and monitor the heartbeat of individuals, imagine if we get that wrong, what the ramifications could be, what those incidents would look like. Or maybe better yet, we try to build a new trading algorithm with a crossover strategy, where the 50-day average crosses the 10-day average. >>And imagine if the data underlying the inputs to that is incorrect. We will probably have major financial ramifications in that sense. So, you know, it kind of starts there, where everybody's realizing that we're all data companies, and if we are using bad data, we're likely making incorrect business decisions. But I think there are two other things at play. You know, I bought a car not too long ago, and my dad called and said, How many cylinders does it have? And I realized in that moment, you know, I might have failed him, because I didn't know. And I used to ask those types of questions about anti-lock brakes and cylinders, and, you know, whether it's manual or automatic. And I realized I now just buy a car that I hope works. And it's so complicated with all the computer chips, I really don't know that much about it. >>And that's what's happening with data. We're just loading so much of it, and it's so complex, that the way companies consume it in the IT function is that they bring in a lot of data and then they syndicate it out to the business. And it turns out that the individuals loading and consuming all of this data for the company actually may not know that much about the data itself, and that's not even their job anymore. So we'll talk more about that in a minute, but that's really what's setting the foreground for this observability play and why everybody's so interested.
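To make the trading example concrete: a toy moving-average crossover, with the conventional short window compared against the long window, shows how a single bad tick in the input data silently flips the signal. This is an illustrative sketch under assumed numbers, not a real trading strategy:

```python
# Toy moving-average crossover: illustrates how one corrupted input
# value flips a business decision. Numbers are invented for the demo.
def moving_avg(series, window):
    return sum(series[-window:]) / window

def crossover_signal(prices, short=10, long=50):
    """'buy' when the short-window average is above the long-window one."""
    return "buy" if moving_avg(prices, short) > moving_avg(prices, long) else "sell"

clean = [100.0 + 0.5 * i for i in range(60)]   # steadily rising market
assert crossover_signal(clean) == "buy"

corrupt = clean[:-1] + [0.0]                   # one bad tick poisons the short average
assert crossover_signal(corrupt) == "sell"     # same market, opposite decision
```

The point is Kirk's: nothing errors out, the pipeline runs fine, and the algorithm confidently emits the wrong answer.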
It's because we're becoming less close to the intricacies of the data, and we just expect it to always be there and be correct. >>You know, the other thing too about data quality: for years we did the MIT CDOIQ event, we didn't do it last year, Covid messed everything up. But the observation I would make, your thoughts: data quality, it used to be information quality, used to be this back office function, and then it became sort of front office with financial services and government and healthcare, these highly regulated industries. And then the whole chief data officer thing happened, and people sort of flipped the bit from data as a risk to data as an asset. And now, as we say, we're going to talk about observability. And so the whole quality issue has really become front and center, just because data's so fundamental, hasn't it? >>Yeah, absolutely. I mean, let's imagine we pull up our phones right now, and I go to my favorite stock ticker app and check out the NASDAQ market cap. I really have no idea if that's the correct number. I know it's a number, it looks large, it's in a numeric field. And that's kind of what's going on. There are so many numbers, and they're coming from all of these different sources and data providers, and they're getting consumed and passed along. But there isn't really a way to tactically put controls on every number and metric across every field we plan to monitor. But with the scale that we've achieved in early days, even before Collibra, and what's been so exciting, is we have these types of observation techniques, these data monitors, that can actually track past performance of every field at scale.
And why that's so interesting, and why I think the CDO is listening intently nowadays to this topic, is: maybe we could surface all of these problems with the right data observability solution at the right scale, and then just be alerted on breaking trends. So we're sort of shifting away from this world where you must write a condition, and when that condition breaks, that was always known as a break record. But what about breaking trends and root cause analysis? And is it possible to do that, you know, with less human intervention? And so I think most people are seeing now that it's going to have to be a software tool and a computer system. It's not ever going to be based on one or two domain experts anymore. >>So how does data observability relate to data quality? Are they sort of two sides of the same coin? Are they cousins? What's your perspective on that? >>Yeah, it's super interesting. It's an emerging market, so the language is changing a lot. The way that I like to break it down, because the lingo is constantly a moving target in this space, is really breaking records versus breaking trends. I could write a condition: when this thing happens, it's wrong, and when it doesn't, it's correct. Or I could look for a trend, and I'll give you a good example. You know, everybody's talking about fresh data and stale data, and why would that matter? Well, if your data never arrived, or only part of it arrived, or it didn't arrive on time, it's likely stale, and there will not be a condition that you could write that would show you all the goods and the bads. That was kind of your traditional approach of data quality break records. But your modern day approach is: you lost a significant portion of your data, or it did not arrive on time to make that decision accurately, on time. And that's a hidden concern.
Some people call this freshness, we call it stale data, but it all points to the same idea: the thing that you're observing may not be a data quality condition anymore. It may be a breakdown in the data pipeline. And with thousands of data pipelines in play for every company out there, there's more than a couple of these happening every day. >>So what's the Collibra angle on all this? You made the acquisition, you've got data quality and observability coming together, you guys have a lot of expertise in this area. But you hear about provenance of data, you just talked about stale data, the whole trend toward real time. How is Collibra approaching the problem, and what's unique about your approach? >>Well, I think where we're fortunate is, with our background, myself and the team, we lived this problem for a long time, you know, in the Wall Street days about a decade ago. And we saw it from many different angles. And what we came up with, before it was called data observability or reliability, was basically the underpinnings of that. So we're a little bit ahead of the curve there; when most people evaluate our solution, it's more advanced than some of the observation techniques that currently exist. But we've also always covered data quality, and we believe that people want to know more, they need more insights, and they want to see break records and breaking trends together so they can correlate the root cause. And we hear that all the time: I have so many things going wrong, just show me the big picture, help me find the thing that, if I were to fix it today, would make the most impact. So we're really focused on root cause analysis, business impact, connecting it with lineage and catalog metadata.
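The break-record versus breaking-trend distinction can be sketched in a few lines: a hand-written rule (row count must be positive) passes on a day when most of the load went missing, while a trend monitor that has learned the field's past behavior flags it. This is a minimal illustration with invented numbers, not Collibra's implementation:

```python
# Break records vs. breaking trends, in miniature. Data is invented.
from statistics import mean, stdev

def break_record(value, rule):
    """Traditional data quality: a hand-written condition passes or fails."""
    return not rule(value)

def breaking_trend(history, value, sigmas=3.0):
    """Observability-style check: flag values that drift from learned behavior."""
    mu, sd = mean(history), stdev(history)
    return abs(value - mu) > sigmas * sd

daily_rows = [9800, 10150, 9950, 10020, 10100, 9900, 10080]  # past loads

# A partial load of 4,200 rows arrives. The naive rule sees nothing wrong,
# but the trend monitor flags it as far outside learned behavior.
assert not break_record(4200, rule=lambda rows: rows > 0)
assert breaking_trend(daily_rows, 4200)
```

The rule never fires because nobody thought to write "alert me when volume halves"; the trend check fires precisely because no one had to write it.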
And as that grows, you can actually achieve total data governance. At this point, with the acquisition of what was a lineage company years ago, and then my company OwlDQ, now Collibra Data Quality, Collibra may be the best positioned for total data governance and intelligence in the space. >>Well, you mentioned financial services a couple of times, and some examples. Remember the flash crash in 2010? Nobody had any idea what that was; they just said, Oh, it's a glitch, so they didn't understand the root cause of it. So this is a really interesting topic to me. So we know at Data Citizens 22 that you're announcing, you've got to announce new products, right, at your yearly event. What's new? Give us a sense as to what products are coming out, but specifically around data quality and observability. >>Absolutely. There's always a next thing on the forefront, and the one right now is these hyperscalers in the cloud. So you have databases like Snowflake and BigQuery, and Databricks with Delta Lake and SQL pushdown. And ultimately what that means is a lot of people are storing and loading data even faster in a SaaS-like model. And we've started to hook into these databases. And while we've always worked with the same databases in the past, and they're supported, today we're doing something called native database pushdown, where the entire compute and data activity happens in the database. And why that is so interesting and powerful now is everyone's concerned with something called egress. Did my data, that I've spent all this time and money with my security team securing, ever leave my hands? Did it ever leave my secure VPC, as they call it? >>And with these native integrations that we're building and about to unveil, here's kind of a sneak peek for next week at Data Citizens: we're now doing all compute and data operations in databases like Snowflake.
And what that means is, with no install and no configuration, you could log into the Collibra data quality app and have all of your data quality running inside the database that you've probably already picked as your go-forward, secured database of choice. So we're really excited about that. And I think if you look at the whole landscape of network cost, egress cost, data storage and compute, what people are realizing is it's extremely efficient to do it in the way that we're about to release here next week. >>So this is interesting, because what you just described, you know, you mentioned Snowflake, you mentioned Google, and actually you mentioned Databricks. You know, Snowflake has the data cloud. If you put everything in the data cloud, okay, you're cool. But then Google's got the open data cloud, if you heard Google Next, and now Databricks doesn't call it the data cloud, but they have like the open source data cloud. So you have all these different approaches, and there's really no way, up until now I'm hearing, to really understand the relationships between all those and have confidence across them. You know, it's like Zhamak Dehghani says: you should just be a node on the mesh. And I don't care if it's a data warehouse or a data lake or where it comes from, but it's a point on that mesh, and I need tooling to be able to have confidence that my data is governed and has the proper lineage and provenance. And that's what you're bringing to the table. Is that right? Did I get that right? >>Yeah, that's right. And for us, it's not that we haven't been working with those great cloud databases, but it's the fact that we can send them the instructions now; we can send them the operating ability to crunch all of the calculations, the governance, the quality, and get the answers. And what that's doing is basically zero network cost, zero egress cost, zero latency.
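The pushdown idea Kirk describes can be sketched as composing SQL that the warehouse executes itself, so only aggregate metrics travel back out. This is a minimal sketch under assumptions: the function and table names are hypothetical, and a real connector (Snowflake, BigQuery, etc.) would actually run the query:

```python
# Sketch of "pushdown": instead of pulling rows out of the warehouse to
# profile them, compose SQL so the warehouse computes the metric itself.
# Table and column names are hypothetical examples.

def null_rate_sql(table, column):
    """SQL that computes a null-rate metric entirely inside the database."""
    return (
        f"SELECT COUNT(*) AS total_rows, "
        f"SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END) AS null_rows "
        f"FROM {table}"
    )

query = null_rate_sql("analytics.orders", "customer_id")
# Only this string travels to the warehouse, and only two aggregate numbers
# come back, so no row-level data ever leaves the secure VPC (zero egress).
print(query)
```

That asymmetry, one query string out and a couple of numbers back, is what makes the network, egress, and latency costs effectively vanish.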
And so if you were to log into BigQuery tomorrow using our tool, or say Snowflake, for example, you have instant data quality metrics, instant profiling, instant lineage, and access privacy controls, things of that nature that just become less onerous. What we're seeing is there's so much technology out there, just like all of the major brands that you mentioned, but how do we make it easier? The future is about fewer clicks, faster time to value, faster scale, and eventually lower cost. And we think that this positions us to be the leader there. >>I love this example because, you know, Barry talks about, wow, the cloud guys are gonna own the world, and of course now we're seeing that the ecosystem is finding so much white space to add value and connect across clouds. Sometimes we call it supercloud, or interclouding. All right, Kirk, give us your final thoughts on the trends that we've talked about and Data Citizens 22. >>Absolutely. Well, I think one big trend is discovery and classification. We're seeing that across the board. People used to know it was a zip code, and nowadays, with the amount of data that's out there, they wanna know where everything is, where their sensitive data is, if it's redundant. Tell me everything, inside of three to five seconds. And with that, they want to know how fast they can get controls and insights out of their tools in all of these hyperscale databases. So I think we're gonna see more one-click solutions, more SaaS-based solutions, and solutions that hopefully prove faster time to value on all of these modern cloud platforms. >>Excellent. All right, Kirk Haslbeck, thanks so much for coming on theCUBE and previewing Data Citizens 22. Appreciate it. >>Thanks for having me, Dave. >>You're welcome. All right, and thank you for watching. Keep it right there for more coverage from theCUBE. Welcome to theCUBE's virtual coverage of Data Citizens 2022.
My name is Dave Vellante, and I'm here with Laura Sellers, who's the Chief Product Officer at Collibra, the host of Data Citizens. Laura, welcome. Good to see you. >>Thank you. Nice to be here. >>Yeah, your keynote at Data Citizens this year focused on, you know, your mission to drive ease of use and scale. Now, historically, fast access to the right data at the right time, in a form that's really easily consumable, has been kind of challenging, especially for business users. Can you explain to our audience why this matters so much, and what's actually different today in the data ecosystem to make this a reality? >>Yeah, definitely. So I think what we really need, and what I hear from customers every single day, is a new approach to data management. What inspired me to come to Collibra a little over a year ago was really the fact that they're very focused on bringing trusted data to more users, across more sources, for more use cases. And so as we look at what we're announcing with these innovations of ease of use and scale, it's really about making teams more productive in getting started with, and having the ability to manage, data across the entire organization. So we've been very focused on richer experiences, a broader ecosystem of partners, as well as a platform that delivers the performance, scale and security that our users and teams need and demand. So as we look at... oh, go ahead. >>I was gonna say, you know, when I look back at, like, the last 10 years, it was all about getting the technology to work, and it was just so complicated. But please carry on, I'd love to hear more about this. >>Yeah, you know, Collibra is a system of engagement for data, and we really are working on bringing that entire system of engagement to life for everyone to leverage here and now. So what we're announcing, from our ease of use side of the world, is first our data marketplace.
This is the ability for all users to discover and access data quickly and easily, to shop for it, if you will. The next thing that we're also introducing is the new homepage. It's really about the ability to drive adoption and have users find data more quickly. And then the two more areas on the ease of use side of the world are usage analytics and workflow designer. One of the big pushes and passions we have at Collibra is to help with this data-driven culture that all companies are trying to create, and also to help with data literacy. With something like usage analytics, it's really about driving adoption of the Collibra platform, understanding what's working, who's accessing it, what's not. And then finally, we're also introducing what's called workflow designer. We love our workflows at Collibra; it's a big differentiator to be able to automate business processes. The designer is really a way for more people to be able to create those workflows and collaborate on those workflows, as well as for people to be able to easily interact with them. So a lot of exciting things when it comes to ease of use, to make it easier for all users to find data. >>Yes, there's definitely a lot to unpack there. You know, you mentioned this idea of shopping for the data. That's interesting to me. Why this analogy? Metaphor or analogy, I always get those confused; let's go with analogy. Why is it so important to data consumers? >>I think when you look at the world of data, and I talked about this system of engagement, it's really about making it more accessible to the masses. And what users are used to is a shopping experience like your Amazon, if you will.
And so having a consumer-grade experience, where users can quickly go in and find the data, trust that data, understand where the data's coming from, and then be able to quickly access it, is the idea of being able to shop for it. It's just making it as simple as possible and really speeding the time to value for any of the business analysts and data analysts out there. >>Yeah, you see a lot of discussion about rethinking data architectures, putting data in the hands of the users and business people, decentralized data, and of course that's awesome. I love that. But of course, then you have to have self-service infrastructure, and you have to have governance, and those are really challenging. And I think so many organizations are facing adoption challenges when it comes to enabling teams generally, especially domain experts, to adopt new data technologies. You know, the tech comes fast and furious. You've got all these open source projects, and it gets really confusing. Of course it risks security, governance and all that good stuff. You've got all this jargon. So where do you see the friction in adopting new data technologies? What's your point of view, and how can organizations overcome these challenges? >>You're dead on. There's so much technology, and there's so much to stay on top of, which is part of the friction, right? It's just being able to stay ahead of, and understand, all the technologies that are coming. You also look at how there are so many more sources of data, and people are migrating data to the cloud and migrating to new sources. Where the friction comes in is really that ability to understand where the data came from and where it's moving to, and then also to be able to put the access controls on top of it, so people are only getting access to the data that they should be getting access to.
So one of the other things we're announcing, with all of the innovations that are coming, is what we're doing around performance and scale. With all of the data movement, with all of the data that's out there, the first thing we're launching in the world of performance and scale is our world of data quality. It's something that Collibra has been working on for the past year and a half, but we're launching the ability to have data quality in the cloud. So it's currently an on-premise offering, but we'll now be able to carry that over into the cloud for us to manage that way. We're also introducing the ability to push down data quality into Snowflake. So this is, again, one of those challenges: making sure that the data you have is high quality as you move forward. And so really we're just reducing friction. You already have Snowflake stood up; it's not another machine for you to manage. It's just pushdown capabilities into Snowflake to be able to track that quality. Another thing that we're launching with that is what we call Collibra Protect. And this is the ability for users to ingest metadata, understand where the PII data is, and then set policies up on top of it. So you can very quickly set policies and have them enforced at the data level, so anybody in the organization is only getting access to the data they should have access to. >>This topic of data quality is interesting. It's something that I've followed for a number of years. It used to be a back office function, you know, really confined only to highly regulated industries like financial services, healthcare and government. You look back over a decade ago, you didn't have this worry about personal information. GDPR and, you know, the California Consumer Privacy Act all become so much more important.
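The policy idea Laura describes behind Collibra Protect (classify where the PII lives, then enforce policies at the data level) can be sketched with toy data structures. This is purely illustrative: the classifications, roles, and actions below are invented, not the product's real model or API.

```python
# Toy sketch of data-level policy enforcement: columns carry a
# classification, policies map classification + role to an action, and
# every read is resolved through the policy table. Hypothetical names.

POLICIES = {
    # classification -> {role -> action}
    "PII": {"analyst": "mask", "privacy_officer": "allow"},
    "PUBLIC": {"analyst": "allow", "privacy_officer": "allow"},
}

COLUMNS = {"email": "PII", "order_total": "PUBLIC"}

def resolve(column: str, role: str) -> str:
    """Return 'allow', 'mask', or 'deny' for a role reading a column."""
    classification = COLUMNS.get(column, "UNCLASSIFIED")
    return POLICIES.get(classification, {}).get(role, "deny")

print(resolve("email", "analyst"))          # masked for ordinary analysts
print(resolve("email", "privacy_officer"))  # visible to the privacy team
print(resolve("order_total", "analyst"))    # public data flows freely
```

The key property, as in the interview, is that the rule is attached to the data's classification rather than to any one report or tool, so it travels with the data everywhere it is consumed.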
The cloud has really changed things in terms of performance and scale, and of course partnering with Snowflake, it's all about sharing data and monetization, anything but a back office function. So it was kind of smart that you guys were early on, and of course attracting them as an investor as well was very strong validation. What can you tell us about the nature of the relationship with Snowflake? I'm specifically interested in sort of joint engineering or product innovation efforts, you know, beyond the standard go-to-market stuff. >>Definitely. So you mentioned they're a strategic investor in Collibra, about a year ago, a little less than that, I guess. We've been working with them, though, for over a year, really tightly with their product and engineering teams, to make sure that Collibra is adding real value. Our unified platform is touching all pieces of Snowflake. And when I say that, what I mean is we're first, you know, able to ingest data with Snowflake, which has always existed. We're able to profile and classify that data. We're announcing with Collibra Protect this week that you're now able to create those policies on top of Snowflake and have them enforced. So again, people can get more value out of their Snowflake more quickly, as far as time to value, with our policies for all business users to be able to create. We're also announcing Snowflake Lineage 2.0. So this is the ability to take stored procedures in Snowflake and understand the lineage of where the data came from and how it was transformed within Snowflake, as well as the data quality pushdown. As I mentioned, data quality, you brought it up, it is a big industry push, and, you know, one of the things I think Gartner mentioned is people are losing up to $15 million without having great data quality.
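The stored-procedure lineage Laura mentions boils down to extracting source-to-target edges from the SQL that moved the data. A deliberately simplified sketch follows (a real lineage parser handles far more than this regex; the statement and table names are made up):

```python
import re

# Simplified lineage extraction: find the INSERT target and every
# FROM/JOIN source in a SQL statement, and emit (source, target) edges.

def lineage_edges(sql: str) -> list[tuple[str, str]]:
    target_match = re.search(r"INSERT\s+INTO\s+(\w+(?:\.\w+)*)", sql, re.I)
    if not target_match:
        return []
    target = target_match.group(1)
    sources = re.findall(r"(?:FROM|JOIN)\s+(\w+(?:\.\w+)*)", sql, re.I)
    return [(src, target) for src in sources]

stmt = """
INSERT INTO marts.daily_revenue
SELECT o.order_date, SUM(o.amount)
FROM raw.orders o
JOIN raw.customers c ON c.id = o.customer_id
GROUP BY o.order_date
"""
print(lineage_edges(stmt))
```

Chaining edges like these across every statement in a procedure is what lets a catalog answer "where did this number come from?" for a downstream dashboard.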
So this pushdown capability for Snowflake really is, again, a big ease of use push for us at Collibra: that ability to push it into Snowflake, take advantage of the data source and the engine that already lives there, and make sure you have the right quality. >>I mean, the nice thing about Snowflake is, if you play in the Snowflake sandbox, you can get a, you know, high degree of confidence that the data sharing can be done in a safe way. Bringing Collibra into the story allows me to have that data quality and that governance that I need. You know, we've said many times on theCUBE that one of the notable differences in cloud this decade versus last decade, I mean, obviously there are differences just in terms of scale and scope, but it's shaping up to be about the strength of the ecosystems. That's really a hallmark of these big cloud players. It's a key factor for innovating, accelerating product delivery, and filling gaps in the hyperscale offerings, because you've got more mature stack capabilities, and, you know, it creates this flywheel momentum, as we often say. So my question is, how do you work with the hyperscalers, like whether it's AWS or Google or whomever, and what do you see as your role, and what's the Collibra sweet spot? >>Yeah, definitely. So, you know, one of the things I mentioned early on is the broader ecosystem of partners is what it's all about. And so we have that strong partnership with Snowflake. We also are doing more with Google around, you know, GCP and Collibra Protect there, but also tighter Dataplex integration. So similar to what you've seen with our strategic moves around Snowflake, and really covering the broad ecosystem of what Collibra can do on top of that data source, we're extending that to the world of Google as well, and the world of Dataplex.
We also have great partners in SIs. Infosys is somebody we spoke with at the conference, who's done a lot of great work with Levi's, as they're really important to help people with their whole data strategy and driving that data-driven culture, with Collibra being the core of it. >>Laura, we're gonna end it there, but I wonder if you could kind of put a bow on this year, the event, your perspectives. So just give us your closing thoughts. >>Yeah, definitely. I wanna say this is one of the biggest releases Collibra's ever had, definitely the biggest one since I've been with the company, a little over a year. We have all these great new product innovations coming to really drive the ease of use, to make data more valuable for users everywhere and companies everywhere. And so it's all about everybody being able to easily find, understand, trust, and get access to that data going forward. >>Well, congratulations on all the progress. It was great to have you on theCUBE for the first time, I believe, and really appreciate you taking the time with us. >>Yes, thank you for your time. >>You're very welcome. Okay, you're watching the coverage of Data Citizens 2022 on theCUBE, your leader in enterprise and emerging tech coverage. >>So data modernization oftentimes means moving some of your storage and compute to the cloud, where you get the benefit of scale and security and so on. But ultimately it doesn't take away the silos that you have. We have more locations, more tools, and more processes with which we try to get value from this data. To do that at scale in an organization, the people involved in this process have to understand each other. So you need to unite those people across those tools, processes, and systems with a shared language. When I say customer, do you understand the same thing as when you hear customer?
Are we counting them in the same way, so that shared language unites us and gives the opportunity for the organization as a whole to get the maximum value out of their data assets? And then they can democratize data, so everyone can properly use that shared language to find, understand, and trust the data assets that are available. >>And that's where Collibra comes in. We provide a centralized system of engagement that works across all of those locations and combines all of those different user types across the whole business. At Collibra, we say united by data, and that also means that we're united by data with our customers. So here is some data about some of our customers. There was the case of an online do-it-yourself platform who grew their revenue almost three times, from a marketing campaign that provided the right product in the right hands of the right people. Another case that comes to mind is from a financial services organization who saved over 800K every year, because they were able to reuse the same data in different kinds of reports. Before, it was spread out over different tools and processes and silos, and now the platform brought them together, so they realized, oh, we're actually using the same data, let's find a way to make this more efficient. And the last example that comes to mind is that of a large home mortgage loan provider, where they have a very complex landscape, a very complex architecture, legacy in the cloud, et cetera. And they're using our software, they're using our platform, to unite all the people and those processes and tools, to get a common view of data to manage their compliance at scale. >>Hey everyone, I'm Lisa Martin covering Data Citizens 22, brought to you by Collibra. This next conversation is gonna focus on the importance of data culture. One of our CUBE alumni is back: Stan Christiaens is Collibra's co-founder and its Chief Data Citizen. Stan, it's great to have you back on theCUBE.
>>Hey Lisa, nice to be here. >>So we're gonna be talking about the importance of data culture, data intelligence, maturity, all those great things. When we think about the data revolution that every business is going through, you know, it's so much more than technology innovation. It also really requires cultural transformation, community transformation. Those are challenging for customers to undertake. Talk to us about what you mean by data citizenship and the role that creating a data culture plays in that journey. >>Right. So as you know, our event is called Data Citizens, because we believe that, in the end, a data citizen is anyone who uses data to do their job. And we believe that in today's organizations you have a lot of people, most of the employees in an organization, who are somehow gonna be a data citizen, right? So you need to make sure that these people are aware of it. You need to make sure that people have the skills and competencies to do with data what's necessary, and so on, right? So what does it mean to have a good data culture? It means that if you're building a beautiful dashboard to try and convince your boss we need to make this decision, your boss is also open to it and able to interpret, you know, the data presented in the dashboard, to actually make that decision and take that action, right? >>And once you have that dynamic across the organization, that's when you have a good data culture. Now, that's a continuous effort for most organizations, because they're always moving somehow, they're hiring new people. And it has to be a continuous effort, because we've seen that, on the one hand, organizations continue to be challenged with their data sources and where all the data is flowing, right? Which in itself creates a lot of risk. But on the other hand of the equation, you have the benefit. You know, you might look at regulatory drivers, like, we have to do this, right?
But it's much better right now to consider the competitive drivers, for example. And we did an IDC study earlier this year, quite interesting, I can recommend it to anyone. One of the conclusions they found, as they surveyed over a thousand people across organizations worldwide, is about the ones who are higher in maturity. >>So the organizations that really look at data as an asset, look at data as a product, and actively try to be better at it, have three times as good a business outcome as the ones who are lower on the maturity scale, right? So you can say, okay, I'm doing this, you know, data culture for everyone, waking them up as data citizens, and I'm doing this for competitive reasons. You're trying to bring both of those together, and the ones that get data intelligence right are successful and competitive. That's what we're seeing out there in the market. >>Absolutely. We know that, just generally, Stan, right, the organizations that are really creating a data culture and enabling everybody within the organization to become data citizens are, we know that in theory, more competitive and more successful. But the IDC study that you just mentioned demonstrates they're three times more successful and competitive than their peers. Talk about how Collibra advises customers to create that community, that culture of data, when it might be challenging for an organization to adapt culturally. >>Of course, of course it's difficult for an organization to adapt, but it's also necessary. As you just said, imagine that, you know, you're a modern-day organization, laptops, what have you, and you're not using those, right? Or, you know, you're delivering them throughout the organization, but not enabling your colleagues to actually do something with that asset. The same thing is true with data today, right? If you're not properly using the data asset and competitors are, they're gonna get more advantage. >>So as to how you get this done, how you establish this: there are a couple of angles to look at, Lisa. One angle is obviously the leadership, whereby whoever is the boss of data in the organization, and you typically have multiple bosses there, like chief data officers, sometimes there are multiple, but they may have a different title, right? So I'm just gonna summarize it as a data leader for a second. >>So whoever that is, they need to make sure that there's a clear vision, a clear strategy for data. And that strategy needs to include the monetization aspect: how are you going to get value from data? Now, that's one part, because then you can show leadership in the organization and also the business value. And that's important, because those people, their job, in essence, really is to make everyone in the organization think about data as an asset. And I think that's the second part of the equation of getting that right: it's not enough to just have that leadership out there, but you also have to get the hearts and minds of the data champions across the organization. You really have to win them over. And if you have those two combined, and obviously a good technology to, you know, connect those people and have them execute on their responsibilities, such as a data intelligence platform like Collibra's, then the pieces are in place to really start upgrading that culture inch by inch, if you will. >>Yes, I like that, the recipe for success. So you are the co-founder of Collibra. You've worn many different hats along this journey. Now you're building Collibra's own data office. I like how, before we went live, we were talking about how Collibra is drinking its own champagne; I always love to hear stories about that. You're speaking at Data Citizens 2022. Talk to us about how you are building a data culture within Collibra, and what maybe some of the specific projects are that Collibra's data office is working on. >>Yes, and it is indeed Data Citizens. There are a ton of speakers here; I'm very excited.
You know, we have Barb from MIT speaking about data monetization. We have Dilla at the last minute. So, really exciting agenda, can't wait to get back out there, essentially. So we've been doing this since 2008, a good number of years, and I think we have another decade of work ahead in the market, just to be very clear. Data is here to stick around, as are we. And myself, you know, when you start a company, we were four people, if you will, so everybody was wearing all sorts of hats at the time. But over the years I've run, you know, presales, sales, partnerships, product, et cetera. And as our company got a little bit biggish, we're now a thousand-something people in the company, >>I believe systems and processes become a lot more important. So we said, you know, Collibra is nearing the size of our customers, we're getting there in terms of organization structure, process, systems, et cetera. So we said it's really time for us to put our money where our mouth is and set up our own data office, which is what we were seeing at customers' organizations worldwide. These organizations have HR units, they have a finance unit, and over time they'll all have a data department, if you will, that is responsible somehow for the data. So we said, okay, let's try to set an example that other people can take away from, right? So we set up a data strategy, we started building data products, took care of the data infrastructure, all that sort of good stuff. And in doing all of that, Lisa, exactly as you said, we said, okay, we need to also use our own product and our own practices, and from that use, learn how we can make the product better, learn how we can make the practice better, and share that learning with the market. And on Monday mornings, we sometimes refer to that as eating our own dog food; on Friday evenings, we refer to that as drinking our own champagne. >>I like it. >>So we had the driver to do this. You know, there's a clear business reason. So we included that in the data strategy, and that's a little bit of our origin. Now, how do we organize this? We have three pillars, and by no means is this a template that everyone should follow; this is just the organization that works at our company, but it can serve as an inspiration. So we have a pillar which is data science: the data product builders, if you will, or the people who help the business build data products. We have the data engineers, who help keep the lights on for that data platform, to make sure that the data products can run, the data can flow, and, you know, the quality can be checked. >>And then we have a data intelligence, or data governance, builders pillar, where we have those data governance, data intelligence stakeholders who help the business as a sort of data partner to the business stakeholders. So that's how we've organized it. And then we started following the Collibra approach, which is, well, what are the challenges that our business stakeholders have in HR, finance, sales, marketing, all over? And how can data help overcome those challenges? And from those use cases, we then just started to build a map and started executing on the use cases. And the important ones are very simple. We see them with our customers as well: people talking about the catalog, right? The catalog for the data scientists to know what's in their data lake, for example, and for the people in data privacy, so they have their process registry and they can see how the data flows. >>So that's a starting place, and that turns into a marketplace, so that if new analysts and data citizens join Collibra, they immediately have a place to go to, to look at and see, okay, what data is out there for me as an analyst or a data scientist or whatever to do my job, right? So they can immediately get access to data. And another one that we focus on is around trusted business reporting.
We're seeing that, since, you know, self-service BI allowed everyone to make beautiful dashboards, you know, pie charts, and my pet peeve is the pie chart, because I love pie and you shouldn't always be using pie charts, essentially there's become a proliferation of those reports. And now executives don't really know, okay, should I trust this report or that report? They're reporting on the same thing, but the numbers seem different, right? So that's why we have trusted business reporting. So we know that if a dashboard, a data product essentially, is built, all the right steps are being followed, and whoever is consuming it can be quite confident in the result, right? >>Absolutely. >>Exactly. Yes. >>Absolutely. Talk a little bit about some of the key performance indicators that you're using to measure the success of the data office. What are some of those KPIs? >>KPIs and measuring is a big topic in the chief data officer profession, I would say. And again, it always varies with your organization, but there are a few that we use that might be of interest. We use those pillars, right? And we have metrics across those pillars. So, for example, a pillar on the data engineering side is gonna be more related to uptime, right? Is the data platform up and running? Are the data products up and running? Is the quality in them good enough? Is it going up? Is it going down? What's the usage? But also, and especially if you're in the cloud and if consumption's a big thing, you have metrics around cost, for example, right? So that's one set of examples. Another one is around the data science and products: are people using them? Are they getting value from them? Can we calculate that value in, say, a monetary perspective, right?
So that we can continue to say to the rest of the business: we're tracking all those numbers, and those numbers indicate that value is generated, and this is how much value we estimate in that region. And then you have some data intelligence, data governance metrics, which is, for example, you have a number of domains in a data mesh. People talk about being the owner of a data domain, for example, like product or customer. So how many of those domains do you have covered? How many of them are already part of the program? How many of them have owners assigned? How well are these owners organized and executing on their responsibilities? How many tickets are open and closed? How many data products are built according to process? And so forth. So these are a set of examples of KPIs. There are a lot more, but hopefully those can already inspire the audience. >>Absolutely. So we've talked about the rise of chief data offices; it's only accelerating. You mentioned this is like a 10-year journey. So if you were to look into a crystal ball, what do you see in terms of the maturation of data offices over the next decade? >>So we've seen indeed the role sort of grow up. I think in 2010 there may have been like 10 chief data officers or something; Gartner has exact numbers on them. But then it grew, you know, across industries, and the number is estimated to be about 20,000 right now. >>Wow. >>And they evolved in a sort of stack of competencies: defensive data strategy, because the first chief data officers were more regulatory-driven; offensive data strategy; support for the digital program; and now it's all about data products, right? So as a data leader, you now need all of those competencies and need to include them in your strategy. How is that going to evolve for the next couple of years? I wish I had one of those crystal balls, right?
But essentially, I think for the next couple of years there are gonna be a lot of people, you know, still moving along those four levels of the stack. A lot of people I see are still in version one and version two of the chief data officer role. So you'll see over the years that it's gonna evolve to more digital and more data products. So for the next years, my prediction is it's all about data products, because there's an immediate link between the data and the value, essentially, right? So that's gonna be important, and quite likely some new things will be added on, which nobody can predict yet, but we'll see those pop up in a few years. I think there's gonna be a continued challenge for the chief data officer role to become a real executive role, as opposed to, you know, somebody who claims that they're an executive but then is not, right? >>So the real reporting level into the board, into the CEO, for example, will continue to be a challenging point. But the ones who do get that done will be the ones that are successful, and the ones who get it done will be the ones that do it on the basis of data monetization, right? Connecting value to the data, and making that value clear to all the data citizens in the organization, right? And in that sense, they'll need to have both, you know, technical audiences and non-technical audiences aligned, of course. And they'll need to focus on adoption. Again, it's not enough to just have your data office be involved in this. It's really important that you're waking up data citizens across the organization, and you make everyone in the organization think about data as an asset. >>Absolutely, because there's so much value that can be extracted when organizations really strategically build that data office and democratize access across all those data citizens. Stan, this is an exciting arena. We're definitely gonna keep our eyes on this. Sounds like a lot of evolution and maturation coming, from the data office perspective and from the data citizen perspective.
And as the data show, in that IDC study you mentioned, and you mentioned Gartner as well, organizations are that much more likely to be successful and competitive. So we're gonna watch this space. Stan, thank you so much for joining me on theCUBE at Data Citizens '22. We appreciate it. >> Thanks for having me over. >> From Data Citizens '22, I'm Lisa Martin, you're watching theCUBE, the leader in live tech coverage. >> Okay, this concludes our coverage of Data Citizens 2022, brought to you by Collibra. Remember, all these videos are available on demand at thecube.net. And don't forget to check out siliconangle.com for all the news, and wikibon.com for our weekly Breaking Analysis series, where we cover many data topics and share survey research from our partner ETR, Enterprise Technology Research. If you want more information on the products announced at Data Citizens, go to collibra.com. There are tons of resources there. You'll find analyst reports, product demos. It's really worthwhile to check those out. Thanks for watching our program and digging into Data Citizens 2022 on theCUBE, your leader in enterprise and emerging tech coverage. We'll see you soon.
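The data-office KPIs Stan walked through earlier, domain coverage, owner assignment, and ticket throughput, amount to a simple roll-up over governance records. A minimal sketch of that roll-up follows; all records and field names are hypothetical illustrations, not a Collibra API:

```python
# Hypothetical sketch of the governance KPIs discussed above: how many
# domains are covered and owned, plus ticket throughput. The data
# structures and field names are invented for illustration.

def governance_kpis(domains, tickets):
    """Roll up data-office KPIs from domain and ticket records."""
    total = len(domains)
    onboarded = sum(1 for d in domains if d["in_program"])
    owned = sum(1 for d in domains if d["owner"] is not None)
    return {
        "domains_total": total,
        "pct_in_program": round(100 * onboarded / total, 1) if total else 0.0,
        "pct_with_owner": round(100 * owned / total, 1) if total else 0.0,
        "tickets_open": sum(1 for t in tickets if t["status"] == "open"),
        "tickets_closed": sum(1 for t in tickets if t["status"] == "closed"),
    }

domains = [
    {"name": "customer", "in_program": True, "owner": "alice"},
    {"name": "product", "in_program": True, "owner": None},
    {"name": "finance", "in_program": False, "owner": None},
]
tickets = [{"status": "open"}, {"status": "closed"}, {"status": "closed"}]
print(governance_kpis(domains, tickets))
```

In practice these percentages would feed the kind of value-tracking dashboard described in the interview, reported alongside the data monetization metrics.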

Published Date : Nov 2 2022


Kirk Haslbeck, Collibra | Data Citizens '22


 

(bright upbeat music) >> Welcome to theCUBE's coverage of Data Citizens 2022, Collibra's customer event. My name is Dave Vellante. With us is Kirk Hasselbeck, who's the Vice President of Data Quality at Collibra. Kirk, good to see you. Welcome. >> Thanks for having me, Dave. Excited to be here. >> You bet. Okay, we're going to discuss data quality, observability. It's a hot trend right now. You founded a data quality company, OwlDQ, and it was acquired by Collibra last year. Congratulations! And now you lead data quality at Collibra. So we're hearing a lot about data quality right now. Why is it such a priority? Take us through your thoughts on that. >> Yeah, absolutely. It's definitely exciting times for data quality, which, you're right, has been around for a long time. So why now, and why is it so much more exciting than it used to be? The topic may seem a bit stale, but we all know that companies use more data than ever before, and the variety has changed and the volume has grown. And while I think that remains true, there are a couple other hidden factors at play that everyone's so interested in, as to why this is becoming so important now. And I guess you could kind of break this down simply and think about, if Dave, you and I were going to build, you know, a new healthcare application and monitor the heartbeat of individuals, imagine if we get that wrong, what the ramifications could be, what those incidents would look like. Or maybe better yet, we try to build a new trading algorithm with a crossover strategy, where the 50 day average crosses the 10 day average. And imagine if the data underlying the inputs to that is incorrect. We'd probably have major financial ramifications in that sense. So it kind of starts there, where everybody's realizing that we're all data companies, and if we are using bad data, we're likely making incorrect business decisions. But I think there's kind of two other things at play.
I bought a car not too long ago and my dad called and said, "How many cylinders does it have?" And I realized in that moment, I might have failed him, because I didn't know. And I used to ask those types of questions about anti-lock brakes and cylinders and if it's manual or automatic, and I realized I now just buy a car that I hope works. And it's so complicated with all the computer chips, I really don't know that much about it. And that's what's happening with data. We're just loading so much of it. And it's so complex that the way companies consume it in the IT function is that they bring in a lot of data and then they syndicate it out to the business. And it turns out that the individuals loading and consuming all of this data for the company actually may not know that much about the data itself, and that's not even their job anymore. So, we'll talk more about that in a minute, but that's really what's setting the foreground for this observability play and why everybody's so interested. It's because we're becoming less close to the intricacies of the data, and we just expect it to always be there and be correct. >> You know, the other thing too about data quality, and for years we did the MIT CDOIQ event, we didn't do it last year, COVID messed everything up. But the observation I would make there, and I'd love your thoughts, is that data quality, it used to be called information quality, used to be this back office function, and then it became sort of front office with financial services and government and healthcare, these highly regulated industries. And then the whole chief data officer thing happened, and people were realizing, well, they sort of flipped the bit from data as a risk to data as an asset. And now, as we say, we're going to talk about observability. And so it's really become front and center, just the whole quality issue, because data's so fundamental, hasn't it? >> Yeah, absolutely.
I mean, let's imagine we pull up our phones right now and I go to my favorite stock ticker app and I check out the NASDAQ market cap. I really have no idea if that's the correct number. I know it's a number, it looks large, it's in a numeric field. And that's kind of what's going on. There's so many numbers, and they're coming from all of these different sources and data providers, and they're getting consumed and passed along. But there isn't really a way to tactically put controls on every number and metric across every field we plan to monitor. But with the scale that we've achieved, in early days, even before Collibra, and what's been so exciting, is we have these types of observation techniques, these data monitors, that can actually track past performance of every field at scale. And why that's so interesting, and why I think the CDO is listening intently to this topic nowadays, is that maybe we could surface all of these problems with the right data observability solution, at the right scale, and then just be alerted on breaking trends. So we're sort of shifting away from this world of must write a condition, and then when that condition breaks, that was always known as a break record. But what about breaking trends and root cause analysis? And is it possible to do that with less human intervention? And so I think most people are seeing now that it's going to have to be a software tool and a computer system. It's not ever going to be based on one or two domain experts anymore. >> So, how does data observability relate to data quality? Are they sort of two sides of the same coin? Are they cousins? What's your perspective on that? >> Yeah, it's super interesting. It's an emerging market, so the language is changing a lot, and the topic areas are changing. The way that I like to break it down, because the lingo is a constantly moving target in this space, is really breaking records versus breaking trends.
And I could write a condition: when this thing happens it's wrong, and when it doesn't, it's correct. Or I could look for a trend, and I'll give you a good example. Everybody's talking about fresh data and stale data, and why would that matter? Well, if your data never arrived, or only part of it arrived, or it didn't arrive on time, it's likely stale, and there will not be a condition that you could write that would show you all the goods and the bads. That was kind of your traditional approach of data quality break records. But your modern day approach is: you lost a significant portion of your data, or it did not arrive on time to make that decision accurately on time. And that's a hidden concern. Some people call this freshness, we call it stale data, but it all points to the same idea: the thing that you're observing may not be a data quality condition anymore. It may be a breakdown in the data pipeline. And with thousands of data pipelines in play for every company out there, there's more than a couple of these happening every day. >> So what's the Collibra angle on all this? You made the acquisition, you've got data quality and observability coming together, you guys have a lot of expertise in this area, but you hear provenance of data, you just talked about stale data, the whole trend toward real time. How is Collibra approaching the problem, and what's unique about your approach?
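The break-record versus breaking-trend distinction described above can be sketched in a few lines. In this hypothetical illustration (all data and thresholds invented), the trend check learns a baseline from past row-count arrivals and flags a day whose volume deviates sharply, which is the stale or partial-data case a hand-written rule would miss:

```python
# Hypothetical contrast of the two approaches: a "break record"
# (a hand-written condition on a value) versus a "breaking trend"
# (a deviation from behavior learned from history).
from statistics import mean, stdev

def break_record(value, condition):
    """Traditional data-quality rule: flag when the condition fails."""
    return not condition(value)

def breaking_trend(history, today, k=3.0):
    """Flag today's value if it sits far outside the learned baseline."""
    mu, sigma = mean(history), stdev(history)
    return abs(today - mu) > k * max(sigma, 1e-9)

daily_rows = [10_200, 9_950, 10_100, 10_050, 9_900]  # past daily arrivals
print(break_record(-5, lambda v: v >= 0))    # rule violated: True
print(breaking_trend(daily_rows, 4_000))     # likely stale/partial: True
print(breaking_trend(daily_rows, 10_020))    # normal arrival: False
```

A real observability system would track many such signals per pipeline and attach root-cause context; this only shows why a learned baseline catches the missing-data case that no single condition covers.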
But we've also always covered data quality, and we believe that people want to know more, they need more insights, and they want to see break records and breaking trends together so they can correlate the root cause. And we hear that all the time: I have so many things going wrong, just show me the big picture. Help me find the thing that, if I were to fix it today, would make the most impact. So we're really focused on root cause analysis, business impact, connecting it with lineage and catalog, metadata. And as that grows, you can actually achieve total data governance. At this point, with the acquisition of what was a lineage company years ago, and then my company OwlDQ, now Collibra Data Quality, Collibra may be the best positioned for total data governance and intelligence in the space. >> Well, you mentioned financial services a couple of times, and some examples. Remember the flash crash in 2010? Nobody had any idea what that was, they just said, "Oh, it's a glitch." So they didn't understand the root cause of it. So this is a really interesting topic to me. So we know at Data Citizens '22 that you're announcing, you got to announce new products, right? Your yearly event. What's new? Give us a sense as to what products are coming out, but specifically around data quality and observability. >> Absolutely. There's always a next thing on the forefront, and the one right now is these hyperscalers in the cloud. So you have databases like Snowflake and BigQuery, and Databricks, Delta Lake and SQL pushdown. And ultimately what that means is a lot of people are storing and loading data even faster in a SaaS-like model. And we've started to hook into these databases. And while we've always worked with these same databases in the past, and they're supported today, we're now doing something called native database pushdown, where the entire compute and data activity happens in the database. And why that is so interesting and powerful now is everyone's concerned with something called egress.
Did my data, that I've spent all this time and money with my security team securing, ever leave my hands? Did it ever leave my secure VPC, as they call it? And with these native integrations that we're building, and about to unveil here as kind of a sneak peek for next week at Data Citizens, we're now doing all compute and data operations in databases like Snowflake. And what that means is, with no install and no configuration, you could log into the Collibra Data Quality app and have all of your data quality running inside the database that you've probably already picked as your go-forward, secured database of choice. So we're really excited about that. And I think if you look at the whole landscape of network cost, egress cost, data storage and compute, what people are realizing is it's extremely efficient to do it in the way that we're about to release here next week. >> So this is interesting, because what you just described, you mentioned Snowflake, you mentioned Google, oh, actually you mentioned, yeah, Databricks. Snowflake has the data cloud. If you put everything in the data cloud, okay, you're cool. But then Google's got the open data cloud, if you heard Google Next. And now Databricks doesn't call it the data cloud, but they have like the open source data cloud. So you have all these different approaches, and there's really no way, up until now I'm hearing, to really understand the relationships between all those and have confidence across them. It's like (indistinct) you should just be a node on the mesh. And I don't care if it's a data warehouse or a data lake, or where it comes from, but it's a point on that mesh, and I need tooling to be able to have confidence that my data is governed and has the proper lineage, provenance. And that's what you're bringing to the table. Is that right? Did I get that right? >> Yeah, that's right.
And for us, it's not that we haven't been working with those great cloud databases, but it's the fact that we can send them the instructions now. We can send them the operating ability to crunch all of the calculations, the governance, the quality, and get the answers. And what that's doing, it's basically zero network cost, zero egress cost, zero latency of time. And so when you log into BigQuery tomorrow using our tool, or let's say Snowflake, for example, you have instant data quality metrics, instant profiling, instant lineage, and access privacy controls, things of that nature that just become less onerous. What we're seeing is there's so much technology out there, just like all of the major brands that you mentioned, but how do we make it easier? The future is about less clicks, faster time to value, faster scale, and eventually lower cost. And we think that this positions us to be the leader there. >> I love this example, because everybody talks about, wow, the cloud guys are going to own the world, and of course now we're seeing that the ecosystem is finding so much white space to add value, connect across clouds. Sometimes we call it supercloud, or interclouding. Alright, Kirk, give us your final thoughts on the trends that we've talked about and Data Citizens '22. >> Absolutely. Well, I think one big trend is discovery and classification. Seeing that across the board, people used to know it was a zip code, and nowadays, with the amount of data that's out there, they want to know where everything is, where their sensitive data is, if it's redundant, tell me everything, inside of three to five seconds. And with that comes, they want to know, in all of these hyperscale databases, how fast they can get controls and insights out of their tools. So I think we're going to see more one click solutions, more SaaS-based solutions, and solutions that hopefully prove faster time to value on all of these modern cloud platforms. >> Excellent, all right.
Kirk Haslbeck, thanks so much for coming on theCUBE and previewing Data Citizens '22. Appreciate it. >> Thanks for having me, Dave. >> You're welcome. All right, and thank you for watching. Keep it right there for more coverage from theCUBE.
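The pushdown idea Kirk described, where the profiling computation runs inside the database so only small aggregates ever cross the network, can be illustrated with a toy example. SQLite stands in for the cloud warehouse here, and nothing below is the actual Collibra or Snowflake API:

```python
# Toy illustration of "native database pushdown": build the profiling
# SQL and let the database execute it, so the raw table never moves;
# only a one-row aggregate comes back (near-zero egress).
import sqlite3

def profile_query(table, column):
    """Build a column-profiling query that runs entirely in-database."""
    return (
        f"SELECT COUNT(*) AS row_count, "
        f"COUNT({column}) AS non_null, "
        f"COUNT(DISTINCT {column}) AS distinct_vals, "
        f"MIN({column}) AS min_val, MAX({column}) AS max_val "
        f"FROM {table}"
    )

# In-memory SQLite table standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_total REAL)")
conn.executemany("INSERT INTO orders VALUES (?)",
                 [(10.0,), (25.5,), (None,), (10.0,)])
row = conn.execute(profile_query("orders", "order_total")).fetchone()
print(row)  # a single row of metrics is all that crosses the "network"
```

The same shape applies to any warehouse that accepts SQL: the tool composes the query, the engine does the compute, and the monitoring system stores only the aggregates.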

Published Date : Oct 24 2022



Day 1 Keynote Analysis | CrowdStrike Fal.Con 2022


 

(upbeat music) >> Hello everyone, and welcome to Fal.Con 2022, CrowdStrike's big user conference. You're watching theCUBE. My name is Dave Vallante. I'm here with my co-host David Nicholson. CrowdStrike is a company that was founded over 10 years ago, about 11 years, almost to the day. They're a $2 billion company in revenue terms, growing at about 60% a year, and they've committed to Wall Street a path to $5 billion by mid-decade. They've got a $40 billion market cap. They're free cash flow positive and trying to build essentially a generational company, with a growing TAM and a modern platform. CrowdStrike has the fundamental belief that the unstoppable breach is a myth. David Nicholson, even though CSOs don't believe that, CrowdStrike is on a mission. Right? >> I didn't hear the phrase zero trust mentioned in the keynote. >> Right. >> What was mentioned was this idea that CrowdStrike isn't simply a tool, it's a platform. And obviously it takes a platform to get to 5 billion. >> Yeah. So let's talk about the keynote. George Kurtz, the CEO, came on. I thought the keynote was measured, but very substantive. There was not a lot of hype in there. Most security conferences, the two exceptions are this one and Reinforce, Amazon's big security conference. Steven Schmidt, the first time I was at Reinforce, said that all this narrative, "security is such a bad industry," "we're not doing a great job," "it's so scary," doesn't help the industry. George Kurtz sort of took a similar message. And you know what, Dave? When I think of security outside the context of IT, I think of, like, security guards >> Right. >> protecting the billionaires. Right? That's a powerful, you know, positive thing. It's not really a defensive movement, even though it is defensive. So that was kind of his posture there.
But he talked about essentially what I'd call, not his words, permanent changes in the cyber defense industry subsequent to the pandemic. Again, he didn't specifically mention the pandemic, but he alluded to, you know, this new world that we live in. Fal.Con is a hundred sessions, eight tracks. And really his contention is we're in the early innings. These guys have got 20,000 customers, and I think they've got the potential to have hundreds of thousands. >> Yeah. So, if I'm working with a security company, I want them to be measured. I'm not looking for hype. I don't want those guards to be in disco shirts. I want them in black suits. So, you know, the point about measured is, I think, a positive one. I was struck by the competence of the people who were on stage today. I have seen very large companies become kind of bureaucratic, and sometimes you don't get the best of the best up on stage. And we saw a lot of impressive folks. >> Yeah, Michael Sentonas will get up, but before we get to him, a couple points that Kurtz made. He said, "digital transformation is needed to bring modern architectures to IT. And that brings modern security." And he laid out that whole sort of old way, new way, very Andy Jassy-like old guard, new guard. He didn't hit on it that hard, but he basically said "security is all about mitigating risk." And he mentioned that the CSO, I say CSO, he says CISO, has a seat at the board. Now, many CSOs are board level participants. And then he went into the sort of four pillars of workload and the areas that they focus on. So workload to them is endpoint, identity, and then data. They don't touch network security. That's where they partner with the likes of Cisco, >> Right. >> And Palo Alto Networks. But then they went deep into identity threat protection; data, which is their observability platform from an acquisition called Humio; and then they went big time into XDR.
We're going to talk about all this stuff. He said, "data is the new digital currency." He talked a lot about how they're now renaming Humio to LogScale. That's their Splunk killer. We're going to talk about that all week. And he talked a little bit about the single agent architecture, which is kind of the linchpin of CrowdStrike's architecture. And then Michael Sentonas, the CTO, came on and did a deep dive into each of those, and really went deep into XDR, extended detection and response, XDR building on EDR. >> Yeah. I think the subject of XDR is something we'll be touching on a lot in the next two days. I thought the extension into observability was very interesting: when you look at performance metrics, gathering those things in and being able to use a single agent to do so. That speaks to this idea that they are a platform and not just a tool. It's easy to say that you aspire to be a platform; I think that's a proof point. On the subject, by the way, of their fundamental architecture: over the years, there have been times when saying that your infrastructure requires an agent would've been a deal killer. People say "No agents!" They've stuck to their guns, because they know that the best way to deliver what they deliver is to have an agent in the environment. And it has proven to be the right strategy. >> Well, this is one of the things I want to explore with the technical architects that come on here today: how do you build a lightweight agent that can do everything that you say it's going to do? Because they started out at endpoint, and then they've extended it to all these other modules, you know, identity. They're now into observability. They've got this data platform. They just announced another acquisition; they bought Preempt, which is their identity play. They announced Responsify, responsify? Reposify, which sort of extends the observability and gives them visualization, or visibility.
And I'm like, how do you do that? How do you keep an agent lightweight? That's one of the things I want to better understand. And then the other is, as you get into XDR, I thought Michael Sentonas was pretty interesting. He was at Black Hat last month. He did a little video, you know. >> That was great. >> Man in the street, what's XDR, what's XDR, what's XDR. I thought the best response was, somebody said "a holistic approach to endpoint security." And so it's really an evolution of EDR. So we're going to talk about that. But how do you keep an agent lightweight and still support all these other capabilities? That's something I really want to dig into, you know, without getting bloated. >> Yeah. I think it's all about the TLAs, Dave. It's about SDKs and APIs, and having an ecosystem of partners that will look at the lightweight agent and then develop around it. Again, going back to the idea of platform, it's critical. If you're trying to do it all on your own, you get bloat. If you try to be all things to all people with your agent, if you try to reverse engineer every capability that's out there, it doesn't work. >> Well, that's one of the things that, again, I want to explore, because CrowdStrike is trying to be a generational company. In the Breaking Analysis that we published this week, one of the things I said is, "In order to be a generational company you have to have a strong ecosystem." Now, the ecosystem here is respectable, you know, but it's obviously not AWS class. You know, I think Snowflake is a really good example, ServiceNow. This feels to me like ServiceNow circa 2013. >> Yeah. >> And we've seen how ServiceNow has evolved. You know, Okta bought Auth0 to give them the developer angle. We heard a little bit about a developer platform today. I want to dig into that some more. And we heard a lot about everybody hates their DLP: I want to get rid of my DLP, data loss prevention. And so, and the same thing with the SIM.
At one of the ETR round tables, our colleague Eric Bradley said, "If it weren't for the compliance requirements, I would replace my SIEM with XDR." And so that's again another interesting topic. CrowdStrike: cloud native, lightweight agent, some really interesting tuck-in acquisitions. Great go-to-market, not super hype, just product that works and gets stuff done, you know; seems to have a really good, bright future. >> Yeah, no, I would agree. Definitely. No hype necessary. Just constant execution moving forward. It's clearly something that will be increasingly in demand. Another subject that came up that I thought was interesting, in the keynote, was this idea of security for elections, extending into the realm of misinformation and disinformation, which are both very, very loaded terms. It'll be very interesting to see how security works its way into that realm in the future. >> Yeah, yeah. >> Yeah. >> Yeah, this guy, Kevin Mandia, who is the CEO of Mandiant, which just got acquired. Google just closed the deal for $5.4 billion. I thought that was kind of light, by the way; I thought Mandiant was worth more than that. Still a good number. And Kevin, you know, was the founder and, >> Great guy. >> they were self-funded. >> Yeah, yeah, impressive. >> So, I thought he was really impressive. He talked about election security in terms of hardening, you know, the election infrastructure, but then, boom, he went right to what I see as the biggest issue, disinformation. And so I'm sitting there asking myself, okay, how do you deal with that? And what he talked about was mapping network effects and monitoring network effects, >> Right. >> to see who's pumping the disinformation, and building streams to really monitor those network effects, positive, you know, factual or non-factual information. Because a lot of times, you know, networks will pump factual information to build credibility. Right? >> Right.
>> And get street cred, earn that trust. You know, you talk about zero trust. And then pump disinformation into the network. So they've now got a track. We have Kevin Mandia on later with Sean Henry, who's the CSO, chief security officer, of CrowdStrike. >> More TLAs. Well, so, you can think of it as almost the modern equivalent of the political ad where the candidate at the end says, I support this ad, or, I stand behind whatever's in this ad. Forget about trying to define what is dis- or misinformation, what is opinion versus fact. Let's have a standard for exposing where the information is coming from. So if you're reading something and there is something easily decodable that says this information is coming from a troll farm of a thousand bots, then you can examine the underlying ethos behind where this information is coming from, and you can take that into consideration. Personally, I'm not a believer in trying to filter stuff out. Put the garbage out there; just make sure people know where the garbage is coming from so they can make decisions about it. >> So I've got a thought on that, because Kevin Mandia touched on it. Again, I want to ask about this. He said, so this whole idea of, you know, detecting the bots and monitoring the networks. Then he said something to the effect of, "You can go on the offensive." And I'm thinking, okay, what does that mean? So for instance, you see it all the time. Anytime I see some kind of fact put out there, I've got to start reading the comments, because I like to see both sides, you know. I'm right down the middle. And you'll go down, and like 40 comments down, you're like, oh, this is fake. This video was edited, >> Right. >> da, da, da, da, and then a bunch of other people. But then the bots take over and that gets buried. So maybe going on the offensive is, to your point,
go ahead and put it out there. But then the positive bots say, okay, by the way, this is fake news. This is an edited video, FYI. And this is who put it out, and here's the bot graph, or something like that. And then you attack the bots with more bots, and then everybody can sort of see it, you know? And it's not like you have to, you know, email your friend saying, "Hey dude, this is fake news." >> Right, right. >> You know, do some research. >> Yeah. >> Put the research out there in volume is what you're saying. >> Yeah. So it's just, I thought it was an interesting segue into another area of security under the heading of election security. That is fraught with a lot of danger if done incorrectly; you know, you get into the realm of opinion making. And we should be free to see information, but we also should have access to information about where the information is coming from. >> The other narrative that you hear: so, everything's down today again, and I haven't checked lately, but security generally, we wrote about this in our Breaking Analysis, security somewhat has held up in the stock market better than the broad tech market. Why? And the premise is, George Kurtz said this on the last earnings call, that "security is non-discretionary." At the same time he did say that sales cycles are getting a little longer, but we see this as a positive for CrowdStrike. Because CrowdStrike, their mission, or one of their missions, is to consolidate all these point tools. We've talked many, many times in theCUBE, and in Breaking Analysis, and on SiliconANGLE, and on Wikibon, about how the security business uses too many point tools. You know this as a former CTO. And now you've got all these stovepipes; the number one challenge CSOs face is lack of talent. CrowdStrike's premise is they can consolidate that with the Falcon platform and have a single point of control.
"Single pane of glass" to use that bromide. So, the question is, is security really non-discretionary? My answer to that is yes and no. It is to a sense, because security is the number one priority. You can't be lax on security. But at the same time the CSO doesn't have an open checkbook, >> Right. >> He or she can't just say, okay, I need this. I need that. I need this. There's other competing initiatives that have to be taken in balance. And so, we've seen in the ETR spending data, you know. By the way, everything's up relative to where it was, pre you know, right at the pandemic, right when, pandemic year everything was flat to down. Everything's up, really up last year, I don't know 8 to 10%. It was expected to be up 8% this year, let's call it 6 to 7% in 21. We were calling for 7 to 8% this year. It's back down to like, you know, 4 or 5% now. It's still healthy, but it's softer. People are being more circumspect. People aren't sure about what the fed's going to do next. Interest rates, you know, loom large. A lot of uncertainty out here. So, in that sense, I would say security is not non-discretionary. Sorry for the double negative. What's your take? >> I think it's less discretionary. >> Okay. >> Food, water, air. Non-discretionary. (David laughing) And then you move away in sort of gradations from that point. I would say that yeah, it is, it falls into the category of less-discretionary. >> Alright. >> Which is a good place to be. >> Dave Nicholson and David Vallante here. Two days of wall to wall coverage of Fal.Con 2022, CrowdStrike's big user conference. We got some great guests. Keep it right there, we'll be right back, right after this short break. (upbeat music)

Published Date : Sep 20 2022



Lie 2, An Open Source Based Platform Cannot Give You Performance and Control | Starburst


 

>> We're back with Justin Borgman of Starburst and Richard Jarvis of EMIS Health. Okay, we're going to get into lie number two, and that is this: an open source based platform cannot give you the performance and control that you can get with a proprietary system. Is that a lie? Justin, the enterprise data warehouse has been pretty dominant, and its stack has evolved and matured over the years. Why is it not the default platform for data? >> Yeah, well, I think that's become a lie over time. So I think, you know, if we go back 10 or 12 years ago, with the advent of the first data lakes really around Hadoop, it probably was true that you couldn't get the performance that you needed to run fast, interactive SQL queries in a data lake. Now, a lot's changed in 10 or 12 years. I remember in the very early days, people would say you'll never get performance because you need to store data in a columnar format. And then, you know, columnar formats were introduced to the data lake. You have Parquet, ORC, and Avro files that were created to ultimately deliver performance out of that. So, okay, we got largely over the performance hurdle. You know, more recently people will say, well, you don't have the ability to do updates and deletes like a traditional data warehouse. And now we've got the creation of new table formats, again, like Iceberg and Delta and Hudi, that do allow for updates and deletes. So I think the data lake has continued to mature. And I remember a quote from Curt Monash many years ago where he said, you know, it takes six or seven years to build a functional database. I think that's right. And now we've had almost a decade go by. So, you know, these technologies have matured to really deliver very, very close to the same level of performance and functionality as cloud data warehouses.
So I think the reality is that's become a lie, and now we have giant hyperscale internet companies that, you know, don't have a traditional data warehouse at all. They do all of their analytics in a data lake. So I think we've proven that it's very much possible today. >> Thank you for that. And so Richard, talk about your perspective as a practitioner in terms of what open brings you versus closed. I mean, open is a moving target; I remember Unix used to be "open systems," so it is an evolving, you know, spectrum. But from your perspective, what does open give you that you can't get from a proprietary system, or what are you fearful of in a proprietary system? >> I suppose for me, open buys us the ability to be unsure about the future, because one thing that's always true about technology is that it evolves in a direction slightly different to what people expect, and what you don't want is to have backed yourself into a corner that then prevents you from innovating. So if you have chosen a technology and you've stored trillions of records in that technology, and suddenly a new way of processing or machine learning comes out, you want to be able to take advantage; your competitive edge might depend upon it. And so I suppose for us, we acknowledge that we don't have perfect vision of what the future might be. And so by backing open storage technologies, we can apply a number of different technologies to the processing of that data. And that gives us the ability to remain relevant and innovate on our data storage. And we have bought our way out of any performance concerns, because we can use cloud-scale infrastructure to scale up and scale down as we need. So we don't have the concern that we don't have enough hardware today to process what we want to achieve; we can just scale up when we need it and scale back down. So open source has really allowed us to remain at the cutting edge.
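Justin's point about columnar formats is easy to see in miniature: a row layout forces a scan to touch every field of every record, while a column layout lets an aggregate read only the one column it needs, which is the heart of why Parquet- and ORC-style files made SQL on a data lake fast. A toy sketch, with plain Python lists standing in for on-disk pages (real formats add compression and encodings on top):

```python
# Toy contrast of row-oriented vs column-oriented layout.
# The access-pattern difference, not the exact numbers, is the point.

rows = [
    {"user_id": 1, "country": "US", "spend": 120.0},
    {"user_id": 2, "country": "UK", "spend": 80.0},
    {"user_id": 3, "country": "US", "spend": 45.5},
]

# Row layout: summing spend walks every field of every record.
total_row_layout = sum(r["spend"] for r in rows)

# Column layout: each field stored contiguously; the aggregate
# reads exactly one "column chunk" and ignores the rest.
columns = {
    "user_id": [1, 2, 3],
    "country": ["US", "UK", "US"],
    "spend":   [120.0, 80.0, 45.5],
}
total_col_layout = sum(columns["spend"])

assert total_row_layout == total_col_layout == 245.5
print(total_col_layout)  # 245.5
```

On a laptop-sized list both paths are instant; on terabytes in object storage, skipping the untouched columns is the difference Justin is describing.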
>> So Justin, let me play devil's advocate here a little bit. And I've talked to Zhamak about this, and you know, obviously her vision is that data mesh is open source: open source tooling, and it's not proprietary. You know, you're not going to buy a data mesh; you're going to build it with open source tooling, and vendors like you are going to support it. But come back to today: you can get to market with a proprietary solution faster. I'm going to make that statement; you tell me if it's a lie. And then you can say, okay, we support Apache Iceberg, we're going to support open source tooling. Take a company like VMware, not really in the data business, but the way they embraced Kubernetes, and, you know, every new open source thing that comes along, they say, we do that too. Why can't proprietary systems do that and be as effective? >> Yeah, well, I think at least within the data landscape, saying that you can access open data formats like Iceberg or others is a bit disingenuous, because really what you're selling to your customer is a certain degree of performance, a certain SLA, and you know, those cloud data warehouses that reach beyond their own proprietary storage drop all the performance that they were able to provide. So it reminds me, again, of going back 10 or 12 years ago, when everybody had a connector to Hadoop and thought that was the solution, right? But the reality was, you know, a connector was not the same as running workloads in Hadoop back then. And I think similarly, you know, being able to connect to an external table that lives in an open data format, you're not going to give it the performance that your customers are accustomed to. And at the end of the day, they're always going to be predisposed, they're always going to be incentivized, to get that data ingested into the data warehouse, because that's where they have control.
And you know, the bottom line is the database industry has really been built around vendor lock-in. I mean, from the start. How many people love Oracle today? But they're our customers nonetheless. I think, you know, lock-in is part of this industry, and I think that's really what we're trying to change with open data formats. >> Well, it's interesting. It reminds me of when I see the gas station price: I drive up and then I say, oh, that's the cash price; on a credit card I've got to pay 20 cents more. But okay. So the argument then, so let me come back to you, Justin. What's wrong with saying, hey, we support open data formats, but yeah, you're going to get better performance if you keep it in our closed system? Are you saying that long term that's going to come back and bite you? You mentioned Oracle, you mentioned Teradata; by implication, you're saying that's where Snowflake customers are headed. >> Yeah, absolutely. I think this is a movie that, you know, we've all seen before, at least those of us who've been in the industry long enough to see this movie play a couple of times. So I do think that's the future. And I think, you know, I loved what Richard said. I actually wrote it down, because I thought it was an amazing quote. He said, "It buys us the ability to be unsure of the future." That pretty much says it all. The future is unknowable, and the reality is, using open data formats, you remain interoperable with any technology you want to utilize. If you want to use Spark to train a machine learning model, and you want to use Starburst to query via SQL, that's totally cool. They can both work off the same exact, you know, data sets. By contrast, if you're focused on a proprietary model, then you're kind of locked in, again, to that model.
I think the same applies to data sharing, to data products, to a wide variety of aspects of the data landscape that a proprietary approach closes off and locks you into. >> So I would say this, Richard; I'd love to get your thoughts on it, because I talk to a lot of Oracle customers, not as many Teradata customers, but a lot of Oracle customers, and they'll admit, yeah, you know, they're jamming us on price and the license cost, but we do get value out of it. And so my question to you, Richard, is: do the, let's call them data warehouse systems, or the proprietary systems, deliver a greater ROI sooner? And is that the allure that customers, you know, are attracted to, or can open platforms deliver as fast an ROI? >> I think the answer to that is, it can depend a bit. It depends on your business's skill set. So we are lucky that we have a number of proprietary teams that work in databases that provide our operational data capability, and we have teams of analytics and big data experts who can work with open data sets and open data formats. And so those different teams can get to an ROI more quickly with different technologies. For the business, though, we can't do better for our operational data stores than proprietary databases. Today we can back off very tight SLAs to them; we can demonstrate reliability from millions of hours of those databases being run at enterprise scale. But for an analytics workload, where increasingly our business is growing in that direction, we can't do better than open data formats with cloud-based data mesh type technologies. And so it's not a simple answer; one will not always be the right answer for our business. We definitely have times when proprietary databases provide a capability that we couldn't easily represent or replicate with open technologies. >> Yeah. Richard, stay with you.
You mentioned, you know, some things before that strike me. You know, the Databricks-Snowflake thing is always a lot of fun for analysts like me. You've got Databricks coming at it; Richard, you mentioned you have a lot of rockstar data engineers. Databricks is coming at it from a data engineering heritage; you've got Snowflake coming at it from an analytics heritage. Those two worlds are colliding. People like Sanjeev Mohan have said, you know what, I think it's actually harder to play in the data engineering world; i.e., it's easier for the data engineering world to go into the analytics world than the reverse. But thinking about up-and-coming engineers and developers preparing for this future of data engineering and data analytics, how should they be thinking about the future? What's your advice to those young people? >> So I think I'd probably fall back on general programming skill sets. So the advice that I saw years ago was, if you have open source technologies, the Pythons and Javas, on your CV, you command a 20% pay hike over people who can only do proprietary programming languages. And I think that's true of data technologies as well. And from a business point of view, that makes sense. I'd rather spend the money that I save on proprietary licenses on better engineers, because they can provide more value to the business and can innovate us beyond our competitors. So my advice to people who are starting here, or trying to build teams to capitalize on data assets, is: begin with open-license, free capabilities, because they're very cheap to experiment with, and they generate a lot of interest from people who want to join you as a business. And you can make them very successful early doors with your analytics journey. >> It's interesting.
Again, analysts like myself, we do a lot of TCO work, and have over the last 20-plus years. And in the world of Oracle, you know, normally it's the staff that's the biggest nut in total cost of ownership; not with Oracle, where the license cost is by far the biggest component of the pie. All right, Justin, help us close out this segment. We've been talking about this sort of data mesh, open versus closed, Snowflake, Databricks. Where does Starburst, sort of as this engine for the data lake, the data lakehouse, the data warehouse, fit in this world? >> Yeah. So our view on how the future ultimately unfolds is, we think that data lakes will be a natural center of gravity for a lot of the reasons that we described: open data formats, lowest total cost of ownership, because you get to choose the cheapest storage available to you. Maybe that's S3 or Azure Data Lake Storage or Google Cloud Storage, or maybe it's on-prem object storage that you bought at a really good price. So ultimately, storing a lot of data in a data lake makes a lot of sense. But I think what makes our perspective unique is, we still don't think you're going to get everything there either. We think that basically centralization of all your data assets is just an impossible endeavor, and so you want to be able to access data that lives outside of the lake as well. So we kind of think of the lake as maybe the biggest place by volume in terms of how much data you have. But to have comprehensive analytics and to truly understand your business holistically, you need to be able to go access other data sources as well. And so that's the role that we want to play: to be a single point of access for our customers, provide the right level of fine-grained access controls so that the right people have access to the right data, and ultimately make it easy to discover and consume via, you know, the creation of data products as well. >> Great. Okay. Thanks guys.
Right after this quick break, we're going to be back to debate whether the cloud data model that we see emerging, the so-called modern data stack, is really modern, or is it the same wine in a new bottle when it comes to data architectures. You're watching theCUBE, the leader in enterprise and emerging tech coverage.
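Justin's "single point of access" role, one engine querying the lake plus data that lives elsewhere, can be illustrated with SQLite's `ATTACH`, which lets one connection join tables from two separate databases in a single SQL statement. Starburst does this at scale across object stores, warehouses, and operational systems; this stdlib sketch only shows the shape of the idea, with made-up table names.

```python
import sqlite3

# Two independent "sources": pretend one is the data lake and the
# other an operational store. A single connection federates them.
con = sqlite3.connect(":memory:")                  # primary ("lake")
con.execute("CREATE TABLE sales (sku TEXT, qty INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("a1", 3), ("b2", 5)])

con.execute("ATTACH DATABASE ':memory:' AS ops")   # second source
con.execute("CREATE TABLE ops.products (sku TEXT, name TEXT)")
con.executemany("INSERT INTO ops.products VALUES (?, ?)",
                [("a1", "widget"), ("b2", "gadget")])

# One SQL statement spans both sources.
rows = con.execute(
    """SELECT p.name, s.qty
       FROM sales s JOIN ops.products p ON s.sku = p.sku
       ORDER BY p.name"""
).fetchall()
print(rows)  # [('gadget', 5), ('widget', 3)]
```

The consumer writes one query and never needs to know, or care, which source each table lives in; that is the federation argument in miniature.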

Published Date : Aug 22 2022



Starburst The Data Lies FULL V2b


 

>> In 2011, early Facebook employee and Cloudera co-founder Jeff Hammerbacher famously said, "The best minds of my generation are thinking about how to get people to click on ads. And that sucks." Let's face it: more than a decade later, organizations continue to be frustrated with how difficult it is to get value from data and build a truly agile, data-driven enterprise. What does that even mean, you ask? Well, it means that everyone in the organization has the data they need when they need it, in a context that's relevant to advance the mission of the organization. Now, that could mean cutting costs, it could mean increasing profits, driving productivity, saving lives, accelerating drug discovery, making better diagnoses, solving supply chain problems, predicting weather disasters, simplifying processes, and thousands of other examples where data can completely transform people's lives beyond manipulating internet users to behave a certain way. We've heard the prognostications about the possibilities of data before, and in fairness we've made progress. But the hard truth is, the original promises of master data management, enterprise data warehouses, data marts, data hubs, and yes, even data lakes were broken and left us wanting more. Welcome to "The Data Doesn't Lie... or Does It?", a series of conversations produced by theCUBE and made possible by Starburst Data.
Inevitable will the data warehouse ever have featured parody with the data lake or vice versa is the so-called modern data stack, simply centralization in the cloud, AKA the old guards model in new cloud close. How can organizations rethink their data architectures and regimes to realize the true promises of data can and will and open ecosystem deliver on these promises in our lifetimes, we're spanning much of the Western world today. Richard is in the UK. Teresa is on the west coast and Justin is in Massachusetts with me. I'm in the cube studios about 30 miles outside of Boston folks. Welcome to the program. Thanks for coming on. Thanks for having us. Let's get right into it. You're very welcome. Now here's the first lie. The most effective data architecture is one that is centralized with a team of data specialists serving various lines of business. What do you think Justin? >>Yeah, definitely a lie. My first startup was a company called hit adapt, which was an early SQL engine for hit that was acquired by Teradata. And when I got to Teradata, of course, Teradata is the pioneer of that central enterprise data warehouse model. One of the things that I found fascinating was that not one of their customers had actually lived up to that vision of centralizing all of their data into one place. They all had data silos. They all had data in different systems. They had data on prem data in the cloud. You know, those companies were acquiring other companies and inheriting their data architecture. So, you know, despite being the industry leader for 40 years, not one of their customers truly had everything in one place. So I think definitely history has proven that to be a lie. >>So Richard, from a practitioner's point of view, you know, what, what are your thoughts? I mean, there, there's a lot of pressure to cut cost, keep things centralized, you know, serve the business as best as possible from that standpoint. What, what is your experience show? 
>> Yeah, I mean, I think I would echo Justin's experience, really. We as a business have grown up through acquisition, through storing data in different places, sometimes to do information governance in different ways, to store data in a platform that's close to the data experts, people who really understand healthcare data, from pharmacies or from doctors. And so, although if you were starting from a greenfield site and you were building something brand new, you might be able to centralize all the data and all of the tooling and teams in one place, the reality is that businesses just don't grow up like that. And it's just really impossible to get that academic perfection of storing everything in one place. >> You know, Teresa, I feel like Sarbanes-Oxley kind of saved the data warehouse, you know. Right? You actually did have to have a single version of the truth for certain financial data. But really, for some of those other use cases I mentioned, I do feel like the industry has kind of let us down. What's your take on this? Where does it make sense to have that sort of centralized approach, versus where does it make sense to decentralize? >> I think you've got to have centralized governance, right? So from the central team, for things like Sarbanes-Oxley, for things like security, for certainly very core data sets, having a centralized set of roles and responsibilities to really QA, right, to serve as a design authority for your entire data estate, just like you might with security. But how it's implemented has to be distributed; otherwise you're not going to be able to scale, right? So being able to have different parts of the business really make the right data investments for their needs. And then ultimately you're going to collaborate with your partners. So, partners that are not within the company, right, external partners: we're going to see a lot more data sharing and model creation. And so you're definitely going to be decentralized.
>> So Justin, you guys had a session on data mesh about a year ago; it was a great program. You invited Zhamak Dehghani, of course, the creator of the data mesh. And one of her fundamental premises is that you've got this hyper-specialized team that you've got to go through if you want anything, and at the same time these individuals actually become a bottleneck, even though they're some of the most talented people in the organization. So a question for you, Richard: how do you deal with that? Do you organize so that there are a few rock stars that build cubes and the like, or have you had any success in decentralizing that data model across your constituencies? >> Yeah, so we absolutely have got rockstar data scientists and data guardians, if you like: people who understand what it means to use this data, particularly as the data that we use at EMIS is very private healthcare information, and some of the rules and regulations around using it are very complex and strict. So we have to have people who understand the usage of the data, and then people who understand how to build models and how to process the data effectively. And you can think of them like consultants to the wider business, because a pharmacist might not understand how to structure a SQL query, but they do understand how they want to process medication information to improve patient lives. And so that becomes a consulting-type experience, from a set of rock stars to a more decentralized business that needs to understand the data and generate some valuable output. >> Justin, what do you say to a customer or prospect that says, look, Justin, I've got a centralized team, and that's the most cost-effective way to serve the business; otherwise I've got duplication. What do you say to that?
>> Well, I would argue it's probably not the most cost-effective, and the reason is really twofold. First, when you deploy an enterprise data warehouse model, the data warehouse itself is generally very expensive, and you're putting all of your most valuable data in the hands of one vendor who now has tremendous leverage over you for many, many years to come. I think that's the story with Oracle or Teradata or other proprietary database systems. But the other aspect is that those central data warehouse teams, as much as they are experts in the technology, don't necessarily understand the data itself. This is one of the core tenets of data mesh that Zhamak writes about: the idea that the domain owners actually know the data best. >> And so by not only acknowledging that data is generally decentralized, and to your earlier point about Sarbanes-Oxley maybe saving the data warehouse, I would argue maybe GDPR and data sovereignty will destroy it, because data has to be decentralized for those laws to be complied with. The data mesh model basically says data is decentralized, and we're going to turn that into an asset rather than a liability. We turn it into an asset by empowering the people that know the data best to participate in the process of curating and creating data products for consumption. So when you think about it that way, you're going to get higher-quality data and faster time to insight, which is ultimately going to drive more revenue for your business and reduce costs. That's how I see the two models comparing and contrasting. >> So do you think the demise of the data warehouse is inevitable? I mean, Teresa, you work with a lot of clients; they're not just going to rip and replace their existing infrastructure.
Maybe they're going to build on top of it, but what does that mean? Does that mean the EDW just becomes less and less valuable over time, or is it maybe just isolated to specific use cases? What's your take on that? >> Listen, I would still love all my data within a data warehouse. I would love it mastered, would love it owned by a central team. I think that's still what I would love to have; that's just not the reality. The investment to actually migrate and keep that up to date, I would say it's a losing battle. We've been trying to do it for a long time. Nobody has the budgets, and then data changes: there's going to be a new technology that emerges that we'll want to tap into, and there's not going to be enough investment to bring all the legacy, but still very useful, systems into that centralized view. So you keep the data warehouse. I think it's a very valuable, very high-performance tool for what it's there for, but you can have this new mesh layer that still takes advantage of those things: the data products in the systems that are meaningful today, and the data products that might span a number of systems, whether that's the source systems for the domains that know them best, or the consumer-facing systems and products that need to be packaged in a way that's really meaningful for that end user. Each of those is useful for a different part of the business, and the mesh makes sure you can actually use all of them. >> So, Richard, take Zhamak's principles back to those: you've got domain ownership and data as product. Okay, great, sounds good. But it creates what I would argue are two challenges. Self-serve infrastructure, let's park that for a second.
And then there's computational governance, in your industry, one of the most regulated and most sensitive. How do you automate and ensure federated governance in that mesh model that Teresa was just talking about? >> Well, it absolutely depends on the tooling and processes you put in place around those tools to centralize the security and the governance of the data. And although a data warehouse makes that very simple, because it's a single tool, it's not impossible with some of the data mesh technologies that are available. What we've done at EMIS is build a single security layer that sits on top of our data mesh, which means that no matter which user is accessing which data source, we go through a well-audited, well-understood security layer. That means we know exactly who's got access to which data fields and which data tables, and then everything they do is audited in a standard way, regardless of the underlying data storage technology. So for me, although storing the data in one place might not be possible, understanding where your source of truth is and securing it in a common way is still a valuable approach, and you can do it without having to bring all that data into a single bucket so that it's all in one place. Having done that, and having invested quite heavily in making it possible, has paid dividends in terms of giving wider access to the platform and ensuring that only data that's available under GDPR and other regulations is being used by the data users. >> Yeah. So Justin, we always talk about data democratization, and up until recently there really hasn't been line of sight as to how to get there. Do you have anything to add to this? You're essentially doing analytic queries with data that's dispersed all over. How are you seeing your customers handle this challenge? >> Yeah.
I mean, I think data products are a really interesting aspect of the answer to that. They allow you to, again, leverage the data domain owners, the people who know the data best, to create data as a product, ultimately to be consumed. And we try to represent that in our product as effectively an almost e-commerce-like experience, where you go and discover and look for the data products that have been created in your organization, and then you can start to consume them as you'd like. So we're really trying to build on that notion of data democratization and self-service, making it very easy to discover and start to use with whatever BI tool you may like, or even just running SQL queries yourself. >> Okay, guys, grab a sip of water. After this short break, we'll be back to debate whether proprietary or open platforms are the best path to the future of data excellence. Keep it right there. >> Your company has more data than ever, and more people trying to understand it, but there's a problem. Your data is stored across multiple systems. It's hard to access, and that delays analytics and, ultimately, decisions. The old method of moving all of your data into a single source of truth is slow and definitely not built for the volume of data we have today, or where we are headed. While your data engineers spend over half their time moving data, your analysts and data scientists are left waiting: frustrated, unproductive, and unable to move the needle for your business. But what if you could spend less time moving or copying data? What if your data consumers could analyze all your data quickly? >> Starburst helps your teams run fast queries on any data source. We help you create a single point of access to your data, no matter where it's stored.
And we support high concurrency. We solve for speed and scale, whether it's fast SQL queries on your data lake or faster queries across multiple data sets. Starburst helps your teams run analytics anywhere. You can't afford to wait for data to be available; your team has questions that need answers now. With Starburst, the wait is over. You'll have faster access to data with enterprise-level security, easy connectivity, and 24/7 support from experts. Organizations like Zalando, Comcast, and FINRA rely on Starburst to move their businesses forward. Contact our Trino experts to get started. >> We're back with Justin Borgman of Starburst and Richard Jarvis of EMIS Health. Okay, we're going to get to lie number two, and that is this: an open source based platform cannot give you the performance and control that you can get with a proprietary system. Is that a lie? Justin, the enterprise data warehouse has been pretty dominant, and its stack has matured over the years. Why is it not the default platform for data? >> Yeah, well, I think that's become a lie over time. If we go back 10 or 12 years, to the advent of the first data lakes built around Hadoop, it probably was true that you couldn't get the performance you needed to run fast, interactive SQL queries in a data lake. Now, a lot has changed in 10 or 12 years. I remember in the very early days, people would say you'll never get performance because you need to store data in a columnar format. And then columnar formats were introduced to data lakes: you have Parquet, ORC, and Avro, file formats that were created to ultimately deliver performance. So, okay, we got largely over the performance hurdle. More recently, people will say, well, you don't have the ability to do updates and deletes like a traditional data warehouse.
>> And now we've got the creation of new data formats, like Iceberg and Delta and Hudi, that do allow for updates and deletes. So the data lake has continued to mature. And I remember a quote from Curt Monash many years ago, where he said it takes six or seven years to build a functional database. I think that's right, and now we've had almost a decade go by. These technologies have matured to deliver very, very close to the same level of performance and functionality as cloud data warehouses. And now we have giant hyperscale internet companies that don't have the traditional data warehouse at all; they do all of their analytics in a data lake. So I think the reality is that's become a lie, and we've proven that it's very much possible today. >> Thank you for that. And so Richard, talk about your perspective as a practitioner in terms of what open brings you versus closed. Granted, open is a moving target; I remember when Unix used to be "open systems," so it's an evolving spectrum. But from your perspective, what does open give you that you can't get from a proprietary system, or what are you fearful of in a proprietary system? >> I suppose for me, open buys us the ability to be unsure about the future, because one thing that's always true about technology is that it evolves in a direction slightly different to what people expect. And what you don't want is to have backed yourself into a corner that then prevents you from innovating. So if you have chosen a technology and stored trillions of records in it, and suddenly a new way of processing or machine learning comes out, you want to be able to take advantage of it, and your competitive edge might depend upon it. And so I suppose for us, we acknowledge that we don't have perfect vision of what the future might be.
And so by backing open storage technologies, we can apply a number of different technologies to the processing of that data, and that gives us the ability to remain relevant and innovate on our data storage. And we have bought our way out of any performance concerns, because we can use cloud-scale infrastructure to scale up and scale down as we need. So we don't have the concern that we don't have enough hardware today to process what we want to achieve; we can just scale up when we need it and scale back down. Open source has really allowed us to stay at the cutting edge. >> So Justin, let me play devil's advocate here a little bit, and I've talked to Zhamak about this. Obviously her vision is that the data mesh is open source: you're going to build it with open source tooling rather than buy it, and vendors like you are going to support it. But to come back to today, you can get to market with a proprietary solution faster. I'm going to make that statement; you tell me if it's a lie. And then a vendor can say, okay, we support Apache Iceberg, we're going to support open source tooling. Take a company like VMware, not really in the data business, but look at the way they embraced Kubernetes and every new open source thing that comes along: they say, we do that too. Why can't proprietary systems do that and be as effective? >> Yeah, well, I think at least within the data landscape, saying that you can access open data formats like Iceberg or others is a bit disingenuous, because really what you're selling to your customer is a certain degree of performance, a certain SLA, and those cloud data warehouses that reach beyond their own proprietary storage drop all the performance that they were able to provide.
It reminds me of going back 10 or 12 years, when everybody had a connector to Hadoop and thought that was the solution. But the reality was, a connector was not the same as running workloads in Hadoop back then. And similarly, if you're just able to connect to an external table that lives in an open data format, you're not going to give it the performance that your customers are accustomed to. At the end of the day, they're always going to be incentivized to get that data ingested into the data warehouse, because that's where they have control. And the bottom line is that the database industry has really been built around vendor lock-in from the start. How many people love Oracle today? But they're customers nonetheless. Lock-in is part of this industry, and I think that's really what we're trying to change with open data formats. >> Well, that's interesting. It reminds me of when I see the gas prices: I drive up and say, oh, that's the cash price; on a credit card I've got to pay 20 cents more. Okay. But so the argument then, let me come back to you, Justin: what's wrong with saying, hey, we support open data formats, but you're going to get better performance if you keep it in our closed system? Are you saying that long term, that's going to come back and bite you? You mentioned Oracle, you mentioned Teradata; by implication, you're saying that's where Snowflake customers are headed. >> Yeah, absolutely. I think this is a movie that we've all seen before, at least those of us who've been in the industry long enough to see it play a couple of times. So I do think that's the future. And I loved what Richard said. I actually wrote it down.
Because I thought it was an amazing quote. He said it buys us the ability to be unsure of the future. That pretty much says it all: the future is unknowable, and the reality is that using open data formats, you remain interoperable with any technology you want to utilize. If you want to use Spark to train a machine learning model and you want to use Starburst to query via SQL, that's totally cool; they can both work off the same exact data sets. By contrast, if you're focused on a proprietary model, then you're locked into that model. I think the same applies to data sharing, to data products, to a wide variety of aspects of the data landscape where a proprietary approach closes you in and locks you in. >> So I would say this, Richard, and I'd love to get your thoughts on it. I talk to a lot of Oracle customers, not as many Teradata customers, but a lot of Oracle customers, and they'll admit, yeah, they're jamming us on price and the license cost, but we do get value out of it. And so my question to you, Richard, is: do the data warehouse systems, the proprietary systems, deliver a greater ROI sooner? Is that the allure that customers are attracted to, or can open platforms deliver ROI as fast? >> I think the answer is that it can depend a bit on your business's skill set. We are lucky that we have a number of proprietary-platform teams that work on the databases providing our operational data capability, and we have teams of analytics and big data experts who can work with open data sets and open data formats. And so those different teams can get to an ROI more quickly with different technologies. For the business, though, we can't do better for our operational data stores than proprietary databases. Today we can back very tight SLAs with them.
We can demonstrate reliability from millions of hours of those databases being run at enterprise scale. But for analytics workloads, and our business is increasingly growing in that direction, we can't do better than open data formats with cloud-based, data mesh type technologies. And so it's not a simple answer; no one option will always be the right answer for our business. We definitely have times when proprietary databases provide a capability that we couldn't easily replicate with open technologies. >> Yeah. Richard, stay with you. The Databricks-Snowflake thing is a lot of fun for analysts like me. You've got Databricks coming at it from a data engineering heritage, and Snowflake coming at it from an analytics heritage, and those two worlds are colliding. People like Sanjeev Mohan have said, you know what, I think it's actually harder to play in data engineering: that is, it's easier for the data engineering world to move into the analytics world than the reverse. So thinking about up-and-coming engineers and developers preparing for this future of data engineering and data analytics, how should they be thinking about the future? What's your advice to those young people? >> I'd probably fall back on general programming skill sets. The advice I saw years ago was that if you have open source technologies, the Pythons and Javas, on your CV, you command a 20% pay hike over people who can only work in proprietary programming languages. And I think that's true of data technologies as well. And from a business point of view, that makes sense: I'd rather spend the money that I save on proprietary licenses on better engineers, because they can provide more value to the business and innovate us beyond our competitors.
So my advice to people who are starting here, or trying to build teams to capitalize on data assets, is to begin with open, license-free capabilities, because they're very cheap to experiment with, they generate a lot of interest from people who want to join you as a business, and you can make those people successful early doors in your analytics journey. >> It's interesting. Analysts like myself do a lot of TCO work, and have over the last 20-plus years. Normally it's the staff that's the biggest nut in total cost of ownership, but not with Oracle: there, the license cost is by far the biggest component of the pie. All right, Justin, help us close out this segment. We've been talking about data mesh, open versus closed, Snowflake, Databricks. Where does Starburst, as this engine for the data lake, the lakehouse, the data warehouse, fit in this world? >> Yeah. Our view on how the future ultimately unfolds is that data lakes will be a natural center of gravity, for a lot of the reasons we described: open data formats, and the lowest total cost of ownership, because you get to choose the cheapest storage available to you. Maybe that's S3 or Azure Data Lake Storage or Google Cloud Storage, or maybe it's on-prem object storage that you bought at a really good price. So storing a lot of data in a data lake makes a lot of sense. But I think what makes our perspective unique is that we still don't think you're going to get everything there either; we think that centralization of all your data assets is just an impossible endeavor. And so you want to be able to access data that lives outside of the lake as well.
So we think of the lake as maybe the biggest place by volume in terms of how much data you have. But to have comprehensive analytics, and to truly understand your business holistically, you need to be able to access other data sources as well. And so the role we want to play is to be a single point of access for our customers, provide the right level of fine-grained access controls so that the right people have access to the right data, and ultimately make it easy to discover and consume, via the creation of data products as well. >> Great. Okay, thanks guys. Right after this quick break, we're going to be back to debate whether the cloud data model we see emerging, the so-called modern data stack, is really modern, or the same wine in a new bottle when it comes to data architectures. You're watching theCUBE, the leader in enterprise and emerging tech coverage. >> Your data is capable of producing incredible results, but data consumers are often left in the dark without fast access to the data they need. Starburst makes your data visible from wherever it lives. Your company is acquiring more data, in more places, more rapidly than ever. Relying solely on a data centralization strategy, whether in a lake or a warehouse, is unrealistic. A single-source-of-truth approach is no longer viable, but disconnected data silos are often left untapped. We need a new approach: one that embraces distributed data.
One that enables fast and secure access to any of your data from anywhere. With Starburst, you'll have the fastest query engine for the data lake, one that allows you to connect and analyze your disparate data sources no matter where they live. Starburst provides the foundational technology required for you to build toward the vision of a decentralized data mesh. Starburst Enterprise and Starburst Galaxy offer enterprise-ready connectivity, interoperability, and security features for multiple regions, multiple clouds, and ever-changing global regulatory requirements. The data is yours, and with Starburst, you can perform analytics anywhere, in light of your world. >> Okay, we're back with Justin Borgman, CEO of Starburst; Richard Jarvis, CTO of EMIS Health; and Teresa Tung, cloud-first technologist at Accenture. We're on to lie number three, and that is the claim that today's modern data stack is actually modern. So I guess the lie is that it's not modern. Justin, what do you say? >> Yeah. I mean, I think new isn't modern, right? It's the new data stack, the cloud data stack, but that doesn't necessarily mean it's modern. A lot of the components are actually exactly the same as what we've had for 40 years: rather than Teradata you have Snowflake, rather than Informatica you have Fivetran. So it's the same general stack, just a cloud version of it, and a lot of the challenges that plagued us for 40 years still remain. >> So let me come back to you, Justin. But okay, there are differences, right? You can scale, you can throw resources at the problem, you can separate compute from storage. There's a lot of money being thrown at that by venture capitalists, at Snowflake and, as you mentioned, its competitors. So that's different, is it not? At least an aspect of modern: dial it up, dial it down. What do you say to that?
>> Well, it is. It's certainly taking what the cloud offers and taking advantage of it, but it's important to note that the cloud data warehouses out there are really just separating their compute from their storage. That allows them to scale up and down, but your data is still stored in a proprietary format. You're still locked in. You still have to ingest the data to even get it prepared for analysis. So a lot of the same structural constraints that existed with the old on-prem enterprise data warehouse model still exist; it's just a little bit more elastic now, because the cloud offers that. >> So Teresa, let me go to you, because you have "cloud first" in your title. What say you to this conversation? >> Well, even the cloud providers are looking toward more of a cloud continuum. The centralized cloud as we know it, maybe a data lake and data warehouse in a central place, is not even how the cloud providers are looking at it. They have new query services, every provider has one, that really expand queries beyond a single location. And if we look at where the future goes, it's very much going to follow the same pattern. There's going to be more edge, and more on-premise because of data sovereignty and data gravity, because you're working with different parts of the business that have already made major cloud investments in different cloud providers. So there are a lot of reasons why the next modern generation of the data stack needs to be much more federated. >> Okay. So Richard, how do you deal with this? You've obviously got the technical debt, the existing infrastructure; it's on the books, and you don't want to just throw it out. There's a lot of conversation about modernizing applications, which a lot of the time means a microservices layer on top of legacy apps. How do you think about the modern data stack?
>> Well, I think probably the first thing to say is that the stack really has to include the processes and people around the data as well. It's all well and good changing the technology, but if you don't modernize how people use that technology, then you're not going to be able to scale, because just being able to scale CPU and storage doesn't mean you can get more people to use your data to generate more value for the business. And so what we've been looking at, very much aligned to data products and data mesh, is how you enable more people to consume the service, and have the stack respond in a way that keeps costs low. That's important for our customers consuming this data, but it also allows people to occasionally run enormous queries and then tick along with smaller ones when required. And it's a good job we did, because during COVID, all of a sudden we had enormous pressure on our data platform to answer really important, life-threatening queries, and if we couldn't scale both our data stack and our teams, we wouldn't have been able to answer those as quickly as we did. So the stack needs to support a scalable business, not just the technology itself. >> Well, thank you for that. So Justin, let's try to break down what the critical aspects of the modern data stack are. Think about the past five to seven years: cloud has obviously given us a different pricing model and de-risked experimentation, and we talked about the ability to scale up and scale down. But I'm taking away that that's not enough, based on what Richard just said. The modern data stack has to serve the business and enable the business to build data products. I buy that; I'm a big fan of the data mesh concepts, even though we're in the early days. So what are the critical aspects, if you had to put some guardrails and definitions around the modern data stack? What does that look like?
What are some of the attributes and principles there? >> Of how it should look, or how... >> Yeah, what it should be. >> Yeah. Well, Teresa mentioned in a previous segment that the data warehouse is not necessarily going to disappear; it just becomes one node, one element, of the overall data mesh, and I certainly agree with that. So by no means are we suggesting that Snowflake or Redshift or whatever cloud data warehouse you may be using is going to disappear, but it's not going to become the end-all be-all. It's not the central single source of truth, and I think that's the paradigm shift that needs to occur. I think it's also worth noting that the early adopters of the modern data stack were primarily digital-native, born-in-the-cloud young companies who had the benefit of idealism, the benefit of starting with a clean slate. That does not reflect the vast majority of enterprises. >> And even those companies, as they grow up and mature out of that ideal state, go buy a business; now they've got something on another cloud provider with a different data stack, and they have to deal with that heterogeneity. That is just change, and change is a part of life. And so I think there is an element here that is almost philosophical: do you believe in an absolute ideal, where I can fit everything into one place, or do you believe in reality? I think the far more pragmatic approach is really what data mesh represents. So to answer your question directly, I think it's adding the ability to access data that lives outside of the data warehouse, maybe living in open data formats in a data lake, or accessing operational systems as well. Maybe you want to directly access data that lives in an Oracle database or a Mongo database, or what have you.
So creating that flexibility really future-proofs you from the inevitable change you will encounter over time. >>So, thank you. Based on what Justin just said, my takeaway is that it's inclusive: whether it's a data mart, a data hub, a data lake, or a data warehouse, it's just a node on the mesh. Okay, I get that. Does that include on-prem data? Obviously it has to. What are you seeing in terms of the ability to take that data mesh concept on-prem? Most data mesh implementations I've seen frankly aren't adhering to the philosophy; maybe it's a data lake, and maybe it's using Glue. You look at what JPMC is doing, or HelloFresh: a lot of stuff happening on the AWS cloud in that closed stack, if you will. What's the answer to that, Teresa? >>I think it's a killer case for data mesh: the fact that you have valuable data sources on-prem, and yet you still want to modernize and take the best of cloud. Cloud is still compelling, as we mentioned, for the economics and the ability to tap into the innovation that the cloud providers are delivering around data and AI architecture; it's an easy button. So the mesh allows you to have the best of both worlds. You can start using the data products on-prem, or in the existing systems that are already working and meaningful for the business. At the same time, you can modernize the ones where it makes business sense: where something needs better performance, needs to be cheaper, or can tap into better analytics for better insights. So you're going to be able to stretch and really have the best of both worlds in a way that, going back to Richard's point, is meaningful to the business. Not everything has to have that one-size-fits-all set of tools. >>Okay. Thank you.
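Justin's "one node of the mesh" point and Teresa's hybrid on-prem/cloud case both come down to the same mechanic: querying data where it lives instead of centralizing it first. As a rough illustration of that idea, here is a toy sketch in plain Python; it is not Starburst's actual engine, and all the source, table, and field names are invented for the example. Two "catalogs" keep data in their native shapes, and a federation function joins them at read time:

```python
# Toy illustration of federated access: each "catalog" keeps its data in its
# native shape, and the federation layer joins at query time instead of
# copying everything into one central warehouse first.

# A relational-style source (e.g. a warehouse or Oracle table): rows as tuples.
warehouse_orders = [
    ("o1", "c1", 120.0),
    ("o2", "c2", 80.0),
    ("o3", "c1", 42.5),
]

# A document-style source (e.g. a Mongo collection): records as dicts.
crm_customers = [
    {"id": "c1", "name": "Acme Ltd", "region": "EU"},
    {"id": "c2", "name": "Globex", "region": "US"},
]

def federated_order_report():
    """Join orders to customers across the two 'catalogs' at read time."""
    customers_by_id = {c["id"]: c for c in crm_customers}
    report = []
    for order_id, customer_id, amount in warehouse_orders:
        customer = customers_by_id[customer_id]
        report.append({"order": order_id, "customer": customer["name"],
                       "region": customer["region"], "amount": amount})
    return report
```

Neither source was moved or re-modeled to produce the report; that locality is the property the panelists keep returning to.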
So Richard, talking about data as product, I wonder if you could give us your perspective here. What are the advantages of treating data as a product? What role do data products have in the modern data stack? We talk about monetizing data; what are your thoughts on data products? >>For us, one of the most important data products we've been creating takes healthcare data from a wide variety of different settings, information about patients' demographics, their treatment, their medications and so on, and puts it into a standard format that can be utilized by a wide variety of researchers. Misinterpreting that data, or having the data presented in a way the user isn't expecting, means you generate the wrong insight. In any business that's clearly not a desirable outcome, but when the insight is as critical as it might be in healthcare or some security settings, you really have to have gone to the trouble of understanding the data, presenting it in a format everyone can clearly agree on, and then letting people consume it in a very structured, managed way, even if that data comes from a variety of different sources in the first place. So our data product journey has really begun by standardizing data across a number of different silos through the data mesh, so we can present it out both internally and, with the right governance, externally to researchers. >>So that data product, through whatever APIs, is accessible and discoverable, but it obviously has to be governed as well. You mentioned you appropriately provide it internally. Yeah. But also to external folks as well.
So you've architected that capability today? >>We have, and because the data is standardized, it can generate value much more quickly, and we can be sure of the security and value it's providing, because the data product isn't just about formatting the data into the correct tables. It's understanding what it means to redact the data, or to remove certain rows from it, or to interpret what a date actually means: is it the start of the contract, the start of the treatment, or the date of birth of a patient? These things can be lost in the data storage without proper product management around the data to say, in a very clear business context, what this data means and what it means to process it for a particular use case. >>Yeah, that makes sense; it's got the context. If the domains own the data, you cut through a lot of the centralized technical teams that are data-agnostic and don't really have that context. All right, Justin, how does Starburst fit into this modern data stack? Bring us home. >>Yeah, for us it's really about providing our customers with the flexibility to operate on and analyze data that lives in a wide variety of different systems, ultimately giving them optionality. Optionality provides the ability to reduce costs, to store more in a data lake rather than a data warehouse, and to get the fastest time to insight by accessing the data directly where it lives. And with this concept of data products, which we've now incorporated into our offering as well, you can really create and curate data as a product to be shared and consumed. So we're trying to help enable the data mesh model and make it an appropriate complement to the modern data stack that people have today. >>Excellent.
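Richard's description of the data product boundary (a standard format, redaction, and explicit date semantics) can be sketched roughly as follows. This is a hypothetical illustration, not EMIS's actual pipeline; the field names and the pseudonymization scheme are invented for the example:

```python
import hashlib
from datetime import date

# Hypothetical raw records as they might arrive from one silo.
RAW_RECORDS = [
    {"nhs_no": "123", "name": "J. Smith", "dob": "1980-03-01",
     "treatment_start": "2021-06-15", "medication": "Atorvastatin"},
]

def publish_patient_product(raw_records):
    """Standardize, redact identifiers, and make each date's meaning explicit."""
    product = []
    for rec in raw_records:
        product.append({
            # Direct identifiers are replaced with a stable pseudonym, not kept.
            "patient_ref": "anon-" + hashlib.sha256(
                rec["nhs_no"].encode()).hexdigest()[:8],
            # Field names state what each date means, so a treatment date can
            # never be misread as a date of birth or a contract start.
            "birth_year": int(rec["dob"][:4]),  # coarsened to year for privacy
            "treatment_start_date": date.fromisoformat(rec["treatment_start"]),
            "medication": rec["medication"].lower(),
        })
    return product
```

The point is that redaction and semantics live inside the product, under the domain team's product management, rather than being re-decided by every downstream consumer.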
Hey, I want to thank Justin, Teresa, and Richard for joining us today. You guys are great; I'm a big believer in the data mesh concept, and I think we're seeing the future of data architecture. So thank you. Now, remember, all these conversations are available on thecube.net for on-demand viewing. You can also go to starburst.io; they have some great content on the website, they host some really thought-provoking interviews, and they have awesome resources, with lots of data mesh conversations and really good material in the resource section. So check that out. Thanks for watching The Data Doesn't Lie... or Does It?, made possible by Starburst Data. This is Dave Vellante for theCUBE, and we'll see you next time. >>The explosion of data sources has forced organizations to modernize their systems and architecture and come to terms with the fact that one size does not fit all for data management today. Your teams are constantly moving and copying data, which requires time and management, and in some cases double paying for compute resources. Instead, what if you could access all your data anywhere, using the BI tools and SQL skills your users already have? And what if this also included enterprise security and fast performance? With Starburst Enterprise, you can provide your data consumers with a single point of secure access to all of your data, no matter where it lives. With features like strict, fine-grained access control, end-to-end data encryption, and data masking, Starburst meets the security standards of the largest companies. Starburst Enterprise can easily be deployed anywhere and managed with Insights, where data teams holistically view their clusters' operation and query execution, so they can reach meaningful business decisions faster. All this with the support of the largest team of Trino experts in the world, delivering fully tested, stable releases, and available to support you 24/7 to unlock the value in all of your data.
You need a solution that easily fits with what you have today and can adapt to your architecture tomorrow. Starburst Enterprise gives you the fastest path from big data to better decisions, because your team can't afford to wait. Trino was created to empower analytics anywhere, and Starburst Enterprise was created to give you the enterprise-grade performance, connectivity, security, management, and support your company needs. Organizations like Zalando, Comcast, and FINRA rely on Starburst to move their businesses forward. Contact us to get started.

Published Date : Aug 22 2022

Starburst The Data Lies FULL V1


 

>>In 2011, early Facebook employee and Cloudera co-founder Jeff Hammerbacher famously said, "The best minds of my generation are thinking about how to get people to click on ads. And that sucks." Let's face it: more than a decade later, organizations continue to be frustrated with how difficult it is to get value from data and build a truly agile, data-driven enterprise. What does that even mean, you ask? Well, it means that everyone in the organization has the data they need when they need it, in a context that's relevant, to advance the mission of the organization. That could mean cutting costs, increasing profits, driving productivity, saving lives, accelerating drug discovery, making better diagnoses, solving supply chain problems, predicting weather disasters, simplifying processes, and thousands of other examples where data can completely transform people's lives, beyond manipulating internet users to behave a certain way. We've heard the prognostications about the possibilities of data before, and in fairness, we've made progress. But the hard truth is that the original promises of master data management, enterprise data warehouses, data marts, data hubs, and yes, even data lakes were broken and left us wanting more. Welcome to The Data Doesn't Lie... or Does It?, a series of conversations produced by theCUBE and made possible by Starburst Data. >>I'm your host, Dave Vellante, and joining me today are three industry experts: Justin Borgman, co-founder and CEO of Starburst; Richard Jarvis, CTO at EMIS Health; and Teresa Tung, cloud-first technologist at Accenture. Today we're going to have a candid discussion that will expose the unfulfilled and, yes, broken promises of a data past. We'll expose data lies: big lies, little lies, white lies, and hidden truths. And we'll challenge age-old data conventions and bust some data myths. We're debating questions like: is the demise of the single source of truth
inevitable? Will the data warehouse ever have feature parity with the data lake, or vice versa? Is the so-called modern data stack simply centralization in the cloud, AKA the old guard's model in new cloud clothes? How can organizations rethink their data architectures and regimes to realize the true promises of data? Can and will an open ecosystem deliver on those promises in our lifetimes? We're spanning much of the Western world today: Richard is in the UK, Teresa is on the west coast, and Justin is in Massachusetts with me; I'm in theCUBE studios, about 30 miles outside of Boston. Folks, welcome to the program. Thanks for coming on. Thanks for having us. You're very welcome. Let's get right into it. Now here's the first lie: the most effective data architecture is one that is centralized, with a team of data specialists serving various lines of business. What do you think, Justin? >>Yeah, definitely a lie. My first startup was a company called Hadapt, an early SQL engine for Hadoop that was acquired by Teradata. And Teradata, of course, is the pioneer of that central enterprise data warehouse model. One of the things I found fascinating when I got there was that not one of their customers had actually lived up to that vision of centralizing all of their data in one place. They all had data silos; they all had data in different systems, data on-prem and data in the cloud. Those companies were acquiring other companies and inheriting their data architectures. So despite Teradata being the industry leader for 40 years, not one of their customers truly had everything in one place. I think history has definitely proven that to be a lie. >>So Richard, from a practitioner's point of view, what are your thoughts? There's a lot of pressure to cut costs, keep things centralized, and serve the business as best as possible from that standpoint. What does your experience show?
>>Yeah, I would echo Justin's experience, really. We as a business have grown up through acquisition, storing data in different places, sometimes doing information governance in different ways, storing data in platforms that are close to the data experts, the people who really understand healthcare data, from pharmacies or from doctors. So although if you were starting from a greenfield site and building something brand new you might be able to centralize all the data and all of the tooling and teams in one place, the reality is that businesses just don't grow up like that, and it's just really impossible to get that academic perfection of storing everything in one place. >>You know, Teresa, I feel like Sarbanes-Oxley kind of saved the data warehouse: you actually did have to have a single version of the truth for certain financial data. But for some of those other use cases I mentioned, I do feel like the industry has kind of let us down. What's your take on this? Where does it make sense to have that sort of centralized approach, and where does it make sense to decentralize? >>I think you've got to have centralized governance, right? From the central team, for things like Sarbanes-Oxley, for things like security, and certainly for very core data sets, you have a centralized set of roles and responsibilities to really QA, to serve as a design authority for your entire data estate, just as you might with security. But how it's implemented has to be distributed; otherwise you're not going to be able to scale. Different parts of the business have to be able to make the right data investments for their needs. And then ultimately you're going to collaborate with your partners, partners that are not within the company, external partners; we're going to see a lot more data sharing and model creation. So you're definitely going to be decentralized.
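Teresa's "centralized governance, distributed implementation" split can be pictured as a single policy definition that every domain enforces locally, with a shared audit trail. Here is a minimal hypothetical sketch of that shape; the roles, field names, and domains are invented for illustration, and a real deployment would use an actual policy engine rather than a dict:

```python
# Central design authority: one policy object mapping each role to the
# fields it may read. Domains enforce it locally; every read is audited.
POLICY = {
    "pharmacist": {"patient_ref", "medication"},
    "researcher": {"birth_year", "medication"},
}

AUDIT_LOG = []  # shared, append-only record of who read what, and where

def read_record(role, record, domain):
    """Return only the fields the role may see, and log the access."""
    allowed = POLICY.get(role, set())
    visible = {k: v for k, v in record.items() if k in allowed}
    AUDIT_LOG.append({"role": role, "domain": domain,
                      "fields": sorted(visible)})
    return visible
```

An unknown role falls through to an empty field set, so the default is deny; that is the "QA and design authority stay central, enforcement happens in the domain" pattern Teresa describes.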
>>So Justin, you guys had a session on data mesh, geez, I think about a year ago. It was a great program. You invited Zhamak Dehghani, of course; she's the creator of the data mesh. And one of her fundamental premises is that today you've got this hyper-specialized team you have to go through if you want anything, and at the same time those individuals become a bottleneck, even though they're some of the most talented people in the organization. So a question for you, Richard: how do you deal with that? Do you organize so that there are a few rock stars who build the cubes and the like, or have you had any success in decentralizing that data model across your constituencies? >>So we absolutely have got rock-star data scientists and data guardians, if you like: people who understand what it means to use this data, particularly as the data we use at EMIS is very private healthcare information. Some of the rules and regulations around using the data are very complex and strict, so we have to have people who understand the usage of the data, and then people who understand how to build models and how to process the data effectively. And you can think of them like consultants to the wider business, because a pharmacist might not understand how to structure a SQL query, but they do understand how they want to process medication information to improve patient lives. So it becomes a consulting-type engagement from a set of rock stars to help a more decentralized business that needs to understand the data and generate some valuable output. >>Justin, what do you say to a customer or prospect who says: look, Justin, I've got a centralized team, and that's the most cost-effective way to serve the business; otherwise I've got duplication. What do you say to that?
>>Well, I would argue it's probably not the most cost-effective, and the reason is really twofold. First, when you deploy an enterprise data warehouse model, the data warehouse itself is very expensive, generally speaking, and you're putting all of your most valuable data in the hands of one vendor who now has tremendous leverage over you for many, many years to come. I think that's the story with Oracle or Teradata or other proprietary database systems. The other aspect is that those central data warehouse teams, as much as they are experts in the technology, don't necessarily understand the data itself. And this is one of the core tenets of data mesh that Zhamak writes about: the idea that the domain owners actually know the data best. >>And so, not only acknowledging that data is generally decentralized, and to your earlier point about Sarbanes-Oxley maybe saving the data warehouse, I would argue GDPR and data sovereignty may destroy it, because data has to be decentralized for those laws to be complied with. But the data mesh model basically says: data is decentralized, and we're going to turn that into an asset rather than a liability, by empowering the people who know the data best to participate in the process of curating and creating data products for consumption. When you think about it that way, you're going to get higher-quality data and faster time to insight, which is ultimately going to drive more revenue for your business and reduce costs. That's how I see the two models comparing and contrasting. >>So do you think the demise of the data warehouse is inevitable? I mean, Teresa, you work with a lot of clients; they're not just going to rip and replace their existing infrastructure.
Maybe they're going to build on top of it, but what does that mean? Does the EDW just become less and less valuable over time, or is it maybe just isolated to specific use cases? What's your take on that? >>Listen, I would still love all my data within a data warehouse; I would love it mastered, owned by a central team. That's still what I would love to have. That's just not the reality, right? The investment to actually migrate and keep it all up to date, I would say it's a losing battle. We've been trying to do it for a long time; nobody has the budget, and then data changes. There's going to be a new technology that emerges that we're going to want to tap into, and there's not going to be enough investment to bring all the legacy, but still very useful, systems into that centralized view. So you keep the data warehouse. It's a very valuable, very high-performance tool for what it's there for, but you can have this new mesh layer that still takes advantage of it: the data products in the systems that are meaningful today, and the data products that might span a number of systems, whether source systems for the domains that know the data best, or consumer-facing systems and products that need to be packaged in a way that's really meaningful for the end user. Each of those is useful for a different part of the business, and the mesh makes sure you can use all of them. >>So Richard, let me ask you: take Zhamak's principles back, domain ownership and data as product. Okay, great; sounds good. But it creates what I would argue are two challenges. Self-serve infrastructure, let's park that for a second.
And then in your industry, one of the most highly regulated and most sensitive: computational governance. How do you automate and ensure federated governance in that mesh model Teresa was just talking about? >>Well, it absolutely depends on the tooling and processes that you put in place around those tools to centralize the security and governance of the data. Although a data warehouse makes that very simple, because it's a single tool, it's not impossible with some of the data mesh technologies that are available. What we've done at EMIS is build a single security layer that sits on top of our data mesh, which means that no matter which user is accessing which data source, we go through a well-audited, well-understood security layer. That means we know exactly who has access to which data fields and which data tables, and then everything they do is audited in a standard way, regardless of the underlying data storage technology. So for me, although storing the data in one place might not be possible, understanding where your source of truth is and securing it in a common way is still a valuable approach, and you can do it without having to bring all that data into a single bucket so that it's all in one place. Having invested quite heavily in making that possible has paid dividends in terms of giving wider access to the platform and ensuring that only data that's available under GDPR and other regulations is being used by the data users. >>Yeah. So Justin, we always talk about data democratization, and up until recently there really hasn't been a line of sight as to how to get there. Do you have anything to add to this? You're essentially running analytic queries on data that's dispersed all over. How are you seeing your customers handle this challenge? >>Yeah.
I think data products are a really interesting part of the answer to that. They allow you to, again, leverage the data domain owners, the people who know the data best, to create data as a product, ultimately to be consumed. And we try to represent that in our product as almost an e-commerce-like experience, where you go and discover and look for the data products that have been created in your organization, and then you can start to consume them as you'd like. So we're really trying to build on that notion of data democratization and self-service, making it very easy to discover and start to use data with whatever BI tool you may like, or even just running SQL queries yourself. >>Okay, guys, grab a sip of water. After this short break, we'll be back to debate whether proprietary or open platforms are the best path to the future of data excellence. Keep it right there. >>Your company has more data than ever, and more people trying to understand it, but there's a problem. Your data is stored across multiple systems. It's hard to access, and that delays analytics and, ultimately, decisions. The old method of moving all of your data into a single source of truth is slow and definitely not built for the volume of data we have today, or where we are headed. While your data engineers spend over half their time moving data, your analysts and data scientists are left waiting: frustrated, unproductive, and unable to move the needle for your business. But what if you could spend less time moving or copying data? What if your data consumers could analyze all your data quickly? >>Starburst helps your teams run fast queries on any data source. We help you create a single point of access to your data, no matter where it's stored.
And we support high concurrency. We solve for speed and scale, whether it's fast SQL queries on your data lake or faster queries across multiple data sets. Starburst helps your teams run analytics anywhere. You can't afford to wait for data to be available; your team has questions that need answers now. With Starburst, the wait is over. You'll have faster access to data, with enterprise-level security, easy connectivity, and 24/7 support from experts. Organizations like Zalando, Comcast, and FINRA rely on Starburst to move their businesses forward. Contact our Trino experts to get started. >>We're back with Justin Borgman of Starburst and Richard Jarvis of EMIS Health. Okay, we're going to get to lie number two, and that is this: an open-source-based platform cannot give you the performance and control that you can get with a proprietary system. Is that a lie? Justin, the enterprise data warehouse has been pretty dominant, and its stack has matured over the years. Why is it not the default platform for data? >>Yeah, well, I think that's become a lie over time. If we go back 10 or 12 years, to the advent of the first data lakes, really around Hadoop, it probably was true that you couldn't get the performance you needed to run fast, interactive SQL queries in a data lake. Now, a lot has changed in 10 or 12 years. I remember in the very early days, people would say you'll never get performance because you need to store data in a columnar format. And then columnar formats were introduced to data lakes: Parquet, ORC, and Avro were created to ultimately deliver that performance. So, okay, we got largely over the performance hurdle. More recently, people will say, well, you don't have the ability to do updates and deletes like a traditional data warehouse.
>>And now we've got the creation of new data formats like Iceberg, Delta, and Hudi that do allow for updates and deletes. So the data lake has continued to mature. I remember a quote from Curt Monash many years ago, where he said it takes six or seven years to build a functional database. I think that's right, and now we've had almost a decade go by, so these technologies have matured to deliver very close to the same level of performance and functionality as cloud data warehouses. So I think the reality is that's become a lie, and now we have giant hyperscale internet companies that don't have a traditional data warehouse at all; they do all of their analytics in a data lake. So I think we've proven that it's very much possible today. >>Thank you for that. So Richard, talk about your perspective as a practitioner in terms of what open brings you versus closed. I mean, open is a moving target; I remember when Unix used to be "open systems," so it's an evolving spectrum. But from your perspective, what does open give you that you can't get from a proprietary system, and what are you fearful of in a proprietary system? >>I suppose for me, open buys us the ability to be unsure about the future, because one thing that's always true about technology is that it evolves in a direction slightly different from what people expect. And what you don't want is to have backed yourself into a corner that then prevents you from innovating. If you've chosen a technology and stored trillions of records in it, and suddenly a new way of processing or machine learning comes out, you want to be able to take advantage of it; your competitive edge might depend upon it. And so I suppose for us, we acknowledge that we don't have perfect vision of what the future might be.
And so by backing open storage technologies, we can apply a number of different technologies to the processing of that data, and that gives us the ability to remain relevant and innovate on our data storage. And we have bought our way out of any performance concerns, because we can use cloud-scale infrastructure to scale up and scale down as we need. So we don't have the concern that we don't have enough hardware today to process what we want to achieve; we can just scale up when we need it and scale back down. So open source has really allowed us to remain at the cutting edge. >>So Justin, let me play devil's advocate here a little bit. I've talked to Zhamak about this, and obviously her vision is that the data mesh is open source: open source tooling, not proprietary. You're not gonna buy a data mesh; you're gonna build it with open source tooling, and vendors like you are gonna support it. But to come back to today: you can get to market with a proprietary solution faster. I'm gonna make that statement; you tell me if it's a lie. And then a vendor can say, okay, we support Apache Iceberg, we're gonna support open source tooling. Take a company like VMware, not really in the data business, but look at the way they embraced Kubernetes; every new open source thing that comes along, they say, we do that too. Why can't proprietary systems do that and be as effective? >>Yeah, well, I think at least within the data landscape, saying that you can access open data formats like Iceberg or others is a bit disingenuous, because really what you're selling to your customer is a certain degree of performance, a certain SLA, and those cloud data warehouses that reach beyond their own proprietary storage drop all the performance that they were able to provide.
So it reminds me, again going back 10 or 12 years, of when everybody had a connector to Hadoop and thought that was the solution, right? But the reality was that a connector was not the same as running workloads in Hadoop back then. And I think similarly, being able to connect to an external table that lives in an open data format, you're not going to get the performance that your customers are accustomed to. And at the end of the day, they're always going to be predisposed, always going to be incentivized, to get that data ingested into the data warehouse, cuz that's where they have control. And, you know, the bottom line is the database industry has really been built around vendor lock-in, from the start. How many people love Oracle today, but are customers nonetheless? I think lock-in is part of this industry, and that's really what we're trying to change with open data formats. >>Well, that's interesting. It reminds me of when I see the gas station price: I drive up and then I say, oh, that's the cash price; on a credit card I gotta pay 20 cents more. But okay. So the argument then, and let me come back to you, Justin: what's wrong with saying, hey, we support open data formats, but you're gonna get better performance if you keep it in our closed system? Are you saying that long term that's gonna come back and bite you? You mentioned Oracle, you mentioned Teradata; by implication, you're saying that's where Snowflake customers are headed. >>Yeah, absolutely. I think this is a movie that we've all seen before, at least those of us who've been in the industry long enough to see this movie play a couple of times. So I do think that's the future. And I loved what Richard said; I actually wrote it down.
Cause I thought it was an amazing quote. He said it buys us the ability to be unsure of the future. That pretty much says it all: the future is unknowable, and the reality is that using open data formats, you remain interoperable with any technology you want to utilize. If you want to use Spark to train a machine learning model and you want to use Starburst to query via SQL, that's totally cool; they can both work off the same exact data sets. By contrast, if you're focused on a proprietary model, then you're locked into that model. I think the same applies to data sharing, to data products, to a wide variety of aspects of the data landscape: a proprietary approach closes you in and locks you in. >>So I would say this, Richard, and I'd love to get your thoughts on it. I talk to a lot of Oracle customers, not as many Teradata customers, but a lot of Oracle customers, and they'll admit, yeah, they're jamming us on price and the license cost, but we do get value out of it. And so my question to you, Richard, is this: do the, let's call them data warehouse systems or proprietary systems, deliver a greater ROI sooner? Is that the allure that customers are attracted to, or can open platforms deliver ROI as fast? >>I think the answer to that is it can depend a bit. It depends on your business's skill set. So we are lucky that we have a number of proprietary teams that work in databases that provide our operational data capability, and we have teams of analytics and big data experts who can work with open data sets and open data formats. And so those different teams can get to an ROI more quickly with different technologies. For the business, though, we can't do better for our operational data stores than proprietary databases today; we can stand behind very tight SLAs with them.
We can demonstrate reliability from millions of hours of those databases being run at enterprise scale. But for analytics workloads, where increasingly our business is growing, we can't do better than open data formats with cloud-based, data mesh type technologies. And so it's not a simple answer; one will not always be the right answer for our business. We definitely have times when proprietary databases provide a capability that we couldn't easily represent or replicate with open technologies. >>Yeah. Richard, stay with you. You mentioned some things before that strike me. The Databricks-versus-Snowflake thing is a lot of fun for analysts like me: you've got Databricks coming at it from a data engineering heritage (and Richard, you mentioned you have a lot of rockstar data engineers), and you get Snowflake coming at it from an analytics heritage. Those two worlds are colliding. People like PJI Mohan have said it's actually harder to play in the data engineering world, i.e., it's easier for the data engineering world to go into the analytics world than the reverse. But thinking about up-and-coming engineers and developers preparing for this future of data engineering and data analytics, how should they be thinking about the future? What's your advice to those young people? >>So I think I'd probably fall back on general programming skill sets. The advice that I saw years ago was that if you have open source technologies, the Pythons and Javas, on your CV, you command a 20% pay hike over people who can only do proprietary programming languages. And I think that's true of data technologies as well. And from a business point of view, that makes sense: I'd rather spend the money that I save on proprietary licenses on better engineers, because they can provide more value to the business and innovate us beyond our competitors.
So my advice to people who are starting out, or trying to build teams to capitalize on data assets, is to begin with open, license-free capabilities, because they're very cheap to experiment with, and they generate a lot of interest from people who want to join you as a business. And you can make them very successful early doors with your analytics journey. >>It's interesting. Analysts like myself do a lot of TCO work, and have over the last 20-plus years. In the world of Oracle, normally it's the staff that's the biggest nut in total cost of ownership; not with Oracle, where the license cost is by far the biggest component in the blame pie. All right, Justin, help us close out this segment. We've been talking about this sort of data mesh, open versus closed, Snowflake, Databricks. Where does Starburst, as this engine for the data lake, the data lakehouse, the data warehouse, fit in this world? >>Yeah. So our view on how the future ultimately unfolds is that data lakes will be a natural center of gravity, for a lot of the reasons that we described: open data formats, and lowest total cost of ownership because you get to choose the cheapest storage available to you. Maybe that's S3 or Azure Data Lake Storage or Google Cloud Storage, or maybe it's on-prem object storage that you bought at a really good price. So ultimately, storing a lot of data in a data lake makes a lot of sense. But I think what makes our perspective unique is we still don't think you're gonna get everything there either. We think that centralization of all your data assets is basically an impossible endeavor, and so you wanna be able to access data that lives outside of the lake as well.
So we kind of think of the lake as maybe the biggest place by volume in terms of how much data you have. But to have comprehensive analytics and to truly understand your business holistically, you need to be able to access other data sources as well. And so the role that we wanna play is to be a single point of access for our customers, provide the right level of fine-grained access controls so that the right people have access to the right data, and ultimately make it easy to discover and consume via the creation of data products as well. >>Great. Okay, thanks guys. Right after this quick break, we're gonna be back to debate whether the cloud data model that we see emerging, the so-called modern data stack, is really modern, or is it the same wine in a new bottle? When it comes to data architectures, you're watching theCUBE, the leader in enterprise and emerging tech coverage. >>Your data is capable of producing incredible results, but data consumers are often left in the dark without fast access to the data they need. Starburst makes your data visible from wherever it lives. Your company is acquiring more data in more places more rapidly than ever. To rely solely on a data centralization strategy, whether it's in a lake or a warehouse, is unrealistic. A single-source-of-truth approach is no longer viable, but disconnected data silos are often left untapped. We need a new approach, one that embraces distributed data.
One that enables fast and secure access to any of your data from anywhere. With Starburst, you'll have the fastest query engine for the data lake, one that allows you to connect and analyze your disparate data sources no matter where they live. Starburst provides the foundational technology required for you to build towards the vision of a decentralized data mesh. Starburst Enterprise and Starburst Galaxy offer enterprise-ready connectivity, interoperability, and security features for multiple regions, multiple clouds, and ever-changing global regulatory requirements. The data is yours, and with Starburst, you can perform analytics anywhere in your world. >>Okay, we're back with Justin Borgman, CEO of Starburst; Richard Jarvis, CTO of EMIS Health; and Teresa Tung, cloud-first technologist from Accenture. We're on to lie number three, and that is the claim that today's modern data stack is actually modern. So I guess the lie is that it's not modern. Justin, what do you say? >>Yeah, I mean, I think new isn't modern, right? It's the new data stack, the cloud data stack, but that doesn't necessarily mean it's modern. A lot of the components are actually exactly the same as what we've had for 40 years: rather than Teradata, you have Snowflake; rather than Informatica, you have Fivetran. So it's the same general stack, just a cloud version of it, and a lot of the challenges that plagued us for 40 years still remain. >>So lemme come back to you, Justin. Okay, but there are differences, right? I mean, you can scale, you can throw resources at the problem, you can separate compute from storage. There's a lot of money being thrown at that by venture capitalists, at Snowflake and, you mentioned, its competitors. So that's different, is it not? At least an aspect of modern: dial it up, dial it down. What do you say to that?
>>Well, it is. It's certainly taking what the cloud offers and taking advantage of that. But it's important to note that the cloud data warehouses out there are really just separating their compute from their storage. So they can scale up and down, but your data is still stored in a proprietary format; you're still locked in. You still have to ingest the data to even get it prepared for analysis. So a lot of the same structural constraints that existed with the old on-prem enterprise data warehouse model still exist, just a little bit more elastic now because the cloud offers that. >>So Teresa, let me go to you, cuz you have "cloud first" in your title. What say you to this conversation? >>Well, even the cloud providers are looking towards more of a cloud continuum, right? The centralized cloud as we know it, maybe data lake and data warehouse in a central place, that's not even how the cloud providers are looking at it. They have new query services; every provider has one that really expands those queries to be beyond a single location. And if we look at where the future goes, that's gonna very much follow the same thing. There's gonna be more edge, there's gonna be more on-premise because of data sovereignty and data gravity, because you're working with different parts of the business that have already made major cloud investments in different cloud providers, right? So there's a lot of reasons why the next modern generation of the data stack needs to be much more federated. >>Okay. So Richard, how do you deal with this? You've obviously got the technical debt, the existing infrastructure; it's on the books, and you don't wanna just throw it out. There's a lot of conversation about modernizing applications, which a lot of times is a microservices layer on top of legacy apps. How do you think about the modern data stack?
Well, I think probably the first thing to say is that the stack really has to include the processes and people around the data as well. It's all well and good changing the technology, but if you don't modernize how people use that technology, then you're not going to be able to scale, because just cuz you can scale CPU and storage doesn't mean you can get more people to use your data to generate more value for the business. And so what we've been looking at is really changing, very much aligned to data products and data mesh: how do you enable more people to consume the service, and have the stack respond in a way that keeps costs low? Because that's important for our customers consuming this data, but it also allows people to occasionally run enormous queries and then tick along with smaller ones when required. And it's a good job we did, because during COVID, all of a sudden we had enormous pressure on our data platform to answer really important, life-threatening queries. And if we couldn't scale both our data stack and our teams, we wouldn't have been able to answer those as quickly as we did. So I think the stack needs to support a scalable business, not just the technology itself. >>Well, thank you for that. So Justin, let's try to break down what the critical aspects are of the modern data stack. Think about the past five to seven years: cloud obviously has given us a different pricing model and de-risked experimentation, and we talked about the ability to scale up and scale down. But I'm taking away that that's not enough, based on what Richard just said: the modern data stack has to serve the business and enable the business to build data products. I buy that; I'm a big fan of the data mesh concepts, even though we're in early days. So what are the critical aspects, if you had to think about putting some guardrails and definitions around the modern data stack? What does that look like?
What are some of the attributes and principles there? >>Of how it should look, or how... >>Yeah, what it should be. >>Yeah. Well, I think, you know, Teresa mentioned this in a previous segment: the data warehouse is not necessarily going to disappear. It just becomes one node, one element of the overall data mesh, and I certainly agree with that. So by no means are we suggesting that Snowflake or Redshift or whatever cloud data warehouse you may be using is going to disappear, but it's not going to become the end-all be-all. It's not the central single source of truth, and I think that's the paradigm shift that needs to occur. And I think it's also worth noting that the early adopters of the modern data stack were primarily digital-native, born-in-the-cloud young companies who had the benefit of idealism, the benefit of starting with a clean slate; that does not reflect the vast majority of enterprises. >>And even those companies, as they grow up and mature out of that ideal state, go buy a business. Now they've got something on another cloud provider that has a different data stack, and they have to deal with that heterogeneity. That is just change, and change is a part of life. And so I think there is an element here that is almost philosophical: do you believe in an absolute ideal, where I can just fit everything into one place, or do you believe in reality? And I think the far more pragmatic approach is really what data mesh represents. So to answer your question directly, I think it's adding the ability to access data that lives outside of the data warehouse, maybe living in open data formats in a data lake, or accessing operational systems as well; maybe you want to directly access data that lives in an Oracle database or a Mongo database, or what have you.
So creating that flexibility to really future-proof yourself against the inevitable change that you will encounter over time. >>So thank you. Based on what Justin just said, my takeaway is that it's inclusive: whether it's a data mart, a data hub, a data lake, or a data warehouse, it's just a node on the mesh. Okay, I get that. Does that include on-prem data? Obviously it has to. What are you seeing in terms of the ability to take that data mesh concept on-prem? I mean, most implementations I've seen of data mesh, frankly, really aren't adhering to the philosophy; maybe it's a data lake and maybe it's using Glue. You look at what JPMC is doing, or HelloFresh: a lot of stuff happening on the AWS cloud in that closed stack, if you will. What's the answer to that, Teresa? >>I mean, I think it's a killer use case for data mesh: the fact that you have valuable data sources on-prem, and yet you still wanna modernize and take the best of cloud. Like we mentioned, there are a lot of great reasons for cloud, around the economics and the ability to tap into the innovation that the cloud providers are delivering around data and AI architecture; it's an easy button. So the mesh allows you to have the best of both worlds. You can start using the data products on-prem, or in the existing systems that are working already and are meaningful for the business. At the same time, you can modernize the ones that make business sense, because they need better performance, or something that is cheaper, or maybe just to tap into better analytics to get better insights, right? So you're gonna be able to stretch and really have the best of both worlds that, again going back to Richard's point, are meaningful to the business. Not everything has to have that one-size-fits-all set of tools. >>Okay. Thank you.
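The federated, single-point-of-access pattern the panel keeps returning to, one engine querying data where it lives rather than ingesting it first, can be sketched in miniature with SQLite's ATTACH, which lets one connection join tables living in separate database files. This is an illustrative stand-in, not Starburst/Trino code; the file and table names are invented for the example.

```python
import sqlite3

# Two independent "data sources", each living in its own database file.
sales = sqlite3.connect("sales.db")
sales.execute("DROP TABLE IF EXISTS orders")
sales.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL)")
sales.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                  [(1, 10, 99.0), (2, 11, 45.0), (3, 10, 12.5)])
sales.commit()
sales.close()

crm = sqlite3.connect("crm.db")
crm.execute("DROP TABLE IF EXISTS customers")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)",
                [(10, "Acme"), (11, "Globex")])
crm.commit()
crm.close()

# One connection acts as the single point of access: ATTACH federates the
# second store, and a single SQL statement joins across both sources.
con = sqlite3.connect("sales.db")
con.execute("ATTACH DATABASE 'crm.db' AS crm")
rows = con.execute("""
    SELECT c.name, SUM(o.amount)
    FROM orders AS o
    JOIN crm.customers AS c ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
con.close()

print(rows)  # [('Acme', 111.5), ('Globex', 45.0)]
```

A real federated engine adds connectors, pushdown, and access controls on top of this idea, but the consumer experience is the same: one SQL dialect, one entry point, data left in place.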
So Richard, talking about data as product, I wonder if you could give us your perspective here. What are the advantages of treating data as a product? What role do data products have in the modern data stack? We talk about monetizing data; what are your thoughts on data products? >>So for us, one of the most important data products that we've been creating is taking healthcare data across a wide variety of different settings, so information about patients' demographics, about their treatment, about their medications, and so on, and taking that into a standard format that can be utilized by a wide variety of different researchers. Because misinterpreting that data, or having the data not presented in the way that the user is expecting, means that you generate the wrong insight. And in any business, that's clearly not a desirable outcome, but when that insight is so critical, as it might be in healthcare or some security settings, you really have to have gone to the trouble of understanding the data, presenting it in a format that everyone can clearly agree on, and then letting people consume it in a very structured, managed way, even if that data comes from a variety of different sources in the first place. And so our data product journey has really begun by standardizing data across a number of different silos through the data mesh, so we can present it out, both internally and, through the right governance, externally to researchers. >>So that data product, through whatever APIs, is accessible and discoverable, but it's obviously gotta be governed as well. You mentioned you appropriately provide it internally. Yeah. But also, you know, to external folks as well.
So you've architected that capability today? >>We have. And because the data is standard, it can generate value much more quickly, and we can be sure of the security and value that it's providing, because the data product isn't just about formatting the data into the correct tables. It's understanding what it means to redact the data, or to remove certain rows from it, or to interpret what a date actually means. Is it the start of the contract, or the start of the treatment, or the date of birth of a patient? These things can be lost in the data storage without proper product management around the data to say, in a very clear business context, what does this data mean, and what does it mean to process this data for a particular use case? >>Yeah, that makes sense; it's got the context. If the domains own the data, you've gotta cut through a lot of the centralized teams, the technical teams, that are data-agnostic and don't really have that context. All right, let's end with you, Justin: how does Starburst fit into this modern data stack? Bring us home. >>Yeah. So I think for us, it's really providing our customers with the flexibility to operate on and analyze data that lives in a wide variety of different systems, ultimately giving them optionality. And optionality provides the ability to reduce costs, to store more in a data lake rather than a data warehouse, and, for the fastest time to insight, to access the data directly where it lives. And ultimately, with this concept of data products that we've now incorporated into our offering as well, you can really create and curate data as a product to be shared and consumed. So we're trying to help enable the data mesh model and make that an appropriate complement to the modern data stack that people have today. >>Excellent.
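As a rough illustration of the product management Richard describes, the sketch below standardizes records from two silos into one agreed schema, labels what each date actually means instead of shipping a bare "date" field, and redacts a direct identifier before publishing. All field names and rules are invented for the example; they are not EMIS's actual schema.

```python
from datetime import date

# Raw records from two silos, each with its own field names and date semantics.
silo_a = [{"patient": "Ann Smith", "dob": "1980-02-01", "rx": "aspirin"}]
silo_b = [{"name": "Bob Jones", "treatment_start": "2021-06-15", "med": "statin"}]

def to_product(record, source):
    """Map a raw record onto one agreed, explicitly labeled schema."""
    if source == "a":
        return {
            "name": record["patient"],
            "medication": record["rx"],
            # Say what each date *means* rather than storing an ambiguous one.
            "date_of_birth": date.fromisoformat(record["dob"]),
            "treatment_start": None,
        }
    return {
        "name": record["name"],
        "medication": record["med"],
        "date_of_birth": None,
        "treatment_start": date.fromisoformat(record["treatment_start"]),
    }

def redact(product_record):
    # Governance step: the published product drops direct identifiers.
    return {k: v for k, v in product_record.items() if k != "name"}

products = [redact(to_product(r, "a")) for r in silo_a] + \
           [redact(to_product(r, "b")) for r in silo_b]

print(sorted(products[0]))  # ['date_of_birth', 'medication', 'treatment_start']
```

The point is that the transformation rules, not the storage, carry the business meaning: every consumer sees the same schema, the same redaction policy, and an unambiguous interpretation of each field.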
Hey, I wanna thank Justin, Teresa, and Richard for joining us today. You guys are great. I'm a big believer in the data mesh concept, and I think we're seeing the future of data architecture. So thank you. Now, remember, all these conversations are gonna be available on theCUBE.net for on-demand viewing. You can also go to starburst.io: they have some great content on the website, they host some really thought-provoking interviews, and they have awesome resources, lots of data mesh conversations over there and really good stuff in the resource section, so check that out. Thanks for watching "The Data Doesn't Lie, or Does It?", made possible by Starburst Data. This is Dave Vellante for theCUBE, and we'll see you next time. >>The explosion of data sources has forced organizations to modernize their systems and architecture and come to terms with the fact that one size does not fit all for data management. Today, your teams are constantly moving and copying data, which requires time and management, and in some cases double-paying for compute resources. Instead, what if you could access all your data anywhere, using the BI tools and SQL skills your users already have? And what if this also included enterprise security and fast performance? With Starburst Enterprise, you can provide your data consumers with a single point of secure access to all of your data, no matter where it lives. With features like strict fine-grained access control, end-to-end data encryption, and data masking, Starburst meets the security standards of the largest companies. Starburst Enterprise can easily be deployed anywhere and managed with Insights, where data teams holistically view their clusters' operation and query execution, so they can reach meaningful business decisions faster. All this with the support of the largest team of Trino experts in the world, delivering fully tested, stable releases, and available to support you 24/7 to unlock the value in all of your data.
You need a solution that easily fits with what you have today and can adapt to your architecture tomorrow. Starburst Enterprise gives you the fastest path from big data to better decisions, cuz your team can't afford to wait. Trino was created to empower analytics anywhere, and Starburst Enterprise was created to give you the enterprise-grade performance, connectivity, security, management, and support your company needs. Organizations like Zalando, Comcast, and FINRA rely on Starburst to move their businesses forward. Contact us to get started.

Published Date : Aug 20 2022


Wayne Duso & Nancy Wang | AWS Storage Day 2022


 

>>Okay, we're back. My name is Dave Vellante and this is theCUBE's coverage of AWS Storage Day. You know, coming off of re:Inforce I wrote that the cloud was a new layer of defense, in fact the first line of defense, in a cybersecurity strategy. And that brings new thinking and models for protecting data. Data protection, traditionally thought of as backup and recovery, has become a critical adjacency to security and a component of a comprehensive cybersecurity strategy. We're here in our studios outside of Boston with two Cube alums, and we're gonna discuss this and other topics. Wayne Duso is the vice president for AWS Storage, Edge and Data Services, and Nancy Wang is general manager of AWS Backup and Data Protection Services. Guys, welcome. Great to see you again. Thanks for coming on. >>Of course, always a pleasure, Dave. >>Good to see you, Dave. >>All right. So Wayne, let's talk about how organizations should be thinking about this term data protection. It's an expanding definition, isn't it? >>It is an expanding definition. Dave, last year we talked about data and the importance of data to companies. Every company is becoming a data company, you know; the amount of data they generate, the amount of data they can use to create models, to do predictive analytics, and frankly to find ways of innovating has grown rapidly. And, you know, there's this tension between access to all that data, right? Getting the value out of that data, and how do you secure that data? And so this is something we think about with customers all the time. So data durability, data protection, data resiliency, and, you know, trust in their data. If you think about running your organization on your data, trust in your data is so important. So, you know, you gotta trust where you're putting your data. You know, people who are putting their data on a platform need to trust that platform will in fact ensure its durability, security, resiliency.
>>And, you know, we see ourselves, AWS, as a partner in securing their data, making their data durable, making their data resilient, right? So some of that responsibility is on us, some of that is on them; it's a shared responsibility around data protection, data resiliency. And, you know, we've thought forever about, you know, the notion of compromise of your infrastructure, but more and more people think about the compromise of their data as data becomes more valuable. And in fact, data is a company's most valuable asset. We've talked about this before. Only second to their people. You know, the people are the most valuable asset, but right next to that is their data. So really important stuff. >>So Nancy, you talk to a lot of customers, but by the way, it always comes back to the data. We've been saying this for years, haven't we? So you've got this expanding definition of data protection; you know, governance is in there, you think about access, et cetera. When you talk to customers, what are you hearing from them? How are they thinking about data protection? >>Yeah. So a lot of the customers that Wayne and I have spoken to often come to us seeking thought leadership about, you know, how do I solve this data challenge, how do I solve this data sprawl challenge, but also, more importantly, tying it back to data protection and data resiliency: how do I make sure that data is secure, that it's protected against, let's say, ransomware events, right, and continuously protected? So there's a lot of mental frameworks that come to mind, and a very popular one that comes up in quite a few conversations is the NIST cybersecurity framework, right? And from a data protection perspective, it's just as important to protect and recover your data as it is to be able to detect different events or be able to respond to those events. Right?
So recently I was just having a conversation with a regulatory body of financial institutions in Europe, where we're designing an architecture that could help them make their data immutable, but also continuously protected. So taking a step back, that's really where I see AWS's role, in that we provide a wide breadth of primitives to help customers build secure platforms and scaffolding, so that they can focus on building the data protection, the data governance controls, and guardrails on top of that platform. >>And that's always been AWS's philosophy, you know, make sure that developers have access to those primitives and APIs so that they can move fast and essentially build their own, if that's in fact what they wanna do. And as you're saying, data protection is now this adjacency to cybersecurity, but there's disaster recovery in there, business continuance, cyber resilience, et cetera. So maybe you could pick up on that and sort of extend how you see AWS helping customers build out those resilient services. >>Yeah. So, you know, two core pillars of a data protection strategy are around data durability, which is really an infrastructure element. You know, it's by and large the responsibility of the provider of that infrastructure to make sure that data's durable, cuz if it's not durable, everything else doesn't matter. And then the second pillar is really about data resiliency. So in terms of security, controls and governance, like, these are really important, but these are a shared responsibility. Like, the customers working with us, with the services that we provide, are there to architect the design; it's really human factors and design factors that get them resiliency. >>Nancy, anything you would add to what Wayne just said? >>Yeah, absolutely. So customers tell us that they want always-on data resiliency and data durability, right?
So oftentimes in those conversations, three common themes come up, which is: they want a centralized solution; they want to be able to transcribe their intent into what they end up doing with their data; and number three, they want something that's policy driven, because once you centralize your policies, it's much better and easier to establish control and governance at an organizational level. So keeping that in mind, with policy as our interface, there's two managed AWS solutions that I recommend you all check out in terms of data resiliency and data durability. Those are AWS Backup, which is our centralized solution for managing protection and recovery, and also provides an audit capability of how you protect your data across 15 different AWS services, as well as on-premises VMware; and, for customers whose mission critical data is contained entirely on disk, we also offer AWS Elastic Disaster Recovery, especially for customers who want to fail over their workloads from on premises to the cloud. >>So you can essentially centralize, as a quick follow up, centralize the policy and, like you said, the intent, but you can support a federated data model cuz you're building out this massive, you know, global system, but you can take that policy and essentially bring it anywhere on the AWS cloud. Is that right? >>Exactly. And actually one powerful integration I want to touch upon is that AWS Backup is natively integrated with AWS Organizations, which is our de facto multi-account federated organization model for how AWS services work with customers, both in the cloud, at the edge, and on premises. >>So that's really important because, as we talk about all the time on theCUBE, this notion of a decentralized data architecture, data mesh; but the problem is, how do you ensure governance in a federated model? So we're clearly moving in that direction. Wayne, I want to ask you about cyber as a board level discussion. Years ago I interviewed Dr.
Robert Gates, you know, former defense secretary, and he sat on a number of boards, and I asked him, you know, how important and prominent is security at the board level? Is it really a board level discussion? He said, absolutely, every time we meet, we talk about cybersecurity. But not every company at the time, this was kind of early last decade, was doing that. That's changed now. Ransomware is front and center; hear about it all the time. What's AWS, what's your thinking on cyber as a board level discussion, and specifically what are you guys doing around ransomware? >>Yeah. So, you know, malware in general, ransomware being a particular type of malware, sure, it's a hot topic and it continues to be a hot topic, whether at the board level or the C-suite level. I had a chance to listen to Dr. Gates a couple months ago and it was super motivational, but we think about ransomware in the same way that our customers do, right? Cause all of us are subject to an incident; nobody is immune to a ransomware incident. So we think very much the same way. And, as Nancy said, along the lines of the NIST framework, we really think about, you know, how do customers identify their critical assets? How do they plan for protecting those assets, right? How do they make sure that they are in fact protected? And if they do detect a ransomware event, and ransomware events come from a lot of different places, like there's not one signature, there's not one thumbprint, if you would, for ransomware. So there's really a lot of vigilance that needs to be put in place, but a lot of planning that needs to be put in place. And once that's detected, we have to recover; you know, we know that we have to take an action and recover, having that plan in place, making sure that your assets are fully protected and can be restored. As you know, ransomware is an insidious type of malware. You know, it sits in your system for a long time. It figures out what's going on, including your backup policies, your protection policies, and figures out how to get around those. With some of the things that Nancy talked about in terms of air gapping your capabilities, being able to, if you would, scan your secondary, your backup storage, for malware, knowing that it's a good copy, and then being able to restore from that known good copy in the event of an incident, is critical. So we think about this for ourselves in the same way that we think about these for our customers. You gotta have a great plan, you gotta have great protection, and you gotta be ready to restore in the case of an incident. And we wanna make sure we provide all the capabilities to do that. >>Yeah. So I'm glad you mentioned air gapping. So at the recent re:Inforce, I think it was Kurt Kufeld was speaking about ransomware, and he didn't specifically mention air gapping. I had to leave, so I might have missed it cause I was doing theCUBE, but that's a key aspect. I'm sure there were things in the deep dives that addressed air gapping. But Nancy, look, AWS has the skills, it has the resources, you know, necessary to apply all these best practices and, you know, share those with customers. But what specific investments is AWS making to make the CISO's life easier? Maybe you could talk about that. >>Sure. So following on to your point about the re:Inforce keynote, Dave, right? CJ Moses talked about how the events of a ransomware incident, for example, can take place right on stage, where you go from detect to respond and to recover. And specifically on the recovery piece, you mentioned AWS Backup, the managed service that protects across 15 different AWS services, as well as on-premises VMware, has automated recovery. And that's in part why we've decided to continue that investment and deliver AWS Backup Audit Manager, which helps customers actually prove their posture against how their protection policies are actually mapping back to their organizational controls, based on, for example, how they tag their data for mission criticality or how sensitive that data is, right? And so turning to best practices, especially for ransomware events, since this is very top of mind for a lot of customers these days: I will always try to encourage customers to go through game day simulations, for example, identifying which are those most critical applications in their environment that they need up and running for their business to function properly, and actually going through the recovery plan and making sure that their staff is well trained, or that they're able to go through, for example, a security orchestration, automation, and recovery solution, to make sure that all of their mission critical applications are back up and running in case of a ransomware event. >>Yeah. So I love the game day thing. I mean, we know, well, just in the history of it, you couldn't even test things like disaster recovery, because it was too dangerous. With the cloud, you can test these things safely and actually plan out, develop a blueprint, test your blueprint. I love the game day analogy. >>Yeah. And actually one thing I'd love to add is, you know, we talked about air gapping, I just wanna kind of tie up that statement: you know, one thing that's really interesting about the way that the AWS cloud is architected is the identity access and management platform actually allows us to create identity constructs that air gap your data perimeter. So that way, when attackers, for example, are able to gain a foothold in your environment, you're still able to air gap your most mission critical and also crown jewels from being infiltrated. >>Mm, that's key. Yeah, we've learned, you know, that paying the ransom is not a good strategy, right? Cuz most of the time, many times, you don't even get your data back. Okay. So we're kind of data geeks here. We love data and we're passionate about it on theCUBE; AWS and you guys specifically are passionate about it. So what excites you? Wayne, you start, and then Nancy, you bring us home. What excites you about data and data protection and why? >>You know, we are data nerds. So at the end of the day, you know, there are these expressions we use all the time, but data is such a rich asset for all of us. And some of the greatest innovations that come out of AWS come out of our analysis of our own data. Like, we collect a lot of data on our operations, and some of our most critical features for our customers come out of our analysis of that data. So we are data nerds, and we understand how businesses view their data cuz we view our data the same way. So, you know, Dave, security really started in the data center. It started with the enterprises. And if we think about security, often we talk about securing compute and securing network. And you know, if you secured your compute, you secured your data, generally. But we've separated data from compute so that people can get the value from their data no matter how they want to use it. And in doing that, we have to make sure that their data is durable and it's resilient to any sort of incident and event. So this is really, really important to us. And what do I get excited about? You know, again, thinking back to this framework, I know that we as thought leaders, alongside our customers who are also thought leaders in their space, can provide them with the capabilities they need to protect their data, to secure their data, to make sure it's compliant and always, always, always durable. >>You know, it's funny, well, it's not funny, it's serious actually. Steven Schmidt at re:Inforce, he's the chief security officer at Amazon, used to be the CISO of AWS. He said that Amazon sees quadrillions of data points a month. That's 15 zeros. Okay. So that's a lot of data. Nancy, bring us home. What excites you about data and data protection? >>Yeah, so specifically, and this is actually drawing from conversations that I had with multiple ISV partners at AWS re:Inforce, is the ability to derive value from secondary data, right? Because traditionally organizations have really seen that as a cost center, right? You're producing secondary data because most likely you're creating backups of your mission critical workloads. But what if you're able to run analytics and derive insights from that secondary data, right? Then you're actually able to let AWS do the undifferentiated heavy lifting of analyzing that secondary data estate, so that way, as customers or ISV partners, you can build value on the security layers above. And that is how we see turning cost into value. >>I love it. You're taking the original premise of the cloud, taking away the undifferentiated heavy lifting for, you know, deploying compute, storage, and networking, and now bringing it up to the data level, the analytics level. So it continues; the cloud continues to expand. Thank you for watching theCUBE's coverage of AWS Storage Day 2022.

Published Date : Aug 10 2022

SUMMARY :

Great to see you again. So Wayne, let's talk about how organizations should be thinking about this term data So data durability, data protection, data resiliency, and, you know, And, you know, we think about forever, you know, the notion of, you know, So Nancy, you talked to a lot of customers, but by the way, it always comes back to the data. about, you know, how do I solve this data challenge? And, and that's always been AWS's philosophy, you know, make sure that developers have access it's, it's, it's by and large the responsibility of the provider of that infrastructure to make sure that data's durable, how you protect your data across 15 different AWS services, as well as on-premises VMware And like I said, the intent, but you can support a federated data model cuz you're building both in the cloud, on the edge, at the edge and on premises. data mesh, but the problem is how do you ensure governance and a federated model? along the lines of the, this framework, we really think about, you know, how do customers identify you know, we know that we have to take an action and recover having that plan in place, you know, necessary to apply all these best practices and, And specifically on the recovery piece, you mentioned AWS backup, you couldn't even test things like disaster recovery, right? I just wanna kind of tie up that statement is, you know, one thing that's really interesting Cuz most of the time, many times you don't even get your data back. So at the end of the day, you know, there's this expressions we use What's what excites you about data and data protection? at AWS reinforc is the ability to derive value from secondary data, you know, D deploying, compute, storage, and networking now bringing up to the data level,

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
------ | -------- | ----------
Nancy | PERSON | 0.99+
Nancy Wong | PERSON | 0.99+
Dave | PERSON | 0.99+
Steven Schmidt | PERSON | 0.99+
AWS | ORGANIZATION | 0.99+
Dave Valante | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Europe | LOCATION | 0.99+
Wayne | PERSON | 0.99+
Boston | LOCATION | 0.99+
15 | QUANTITY | 0.99+
Kurt kufeld | PERSON | 0.99+
CJ Boes | PERSON | 0.99+
Nancy Wang | PERSON | 0.99+
Robert Gates | PERSON | 0.99+
two | QUANTITY | 0.99+
last year | DATE | 0.99+
Gates | PERSON | 0.99+
first line | QUANTITY | 0.99+
second pillar | QUANTITY | 0.99+
one | QUANTITY | 0.99+
Wayne Duso | PERSON | 0.99+
both | QUANTITY | 0.98+
15 zeros | QUANTITY | 0.98+
one thumbprint | QUANTITY | 0.98+
one signature | QUANTITY | 0.97+
two core pillars | QUANTITY | 0.96+
early last decade | DATE | 0.96+
three common themes | QUANTITY | 0.95+
a month | QUANTITY | 0.9+
second | QUANTITY | 0.88+
couple months ago | DATE | 0.85+
Dr. | PERSON | 0.84+
two cube | QUANTITY | 0.77+
VMware | TITLE | 0.71+
Day 2022 | EVENT | 0.71+
three | QUANTITY | 0.66+
years | DATE | 0.65+
game | EVENT | 0.57+
day | EVENT | 0.52+
2022 | DATE | 0.45+
Cube | ORGANIZATION | 0.35+

see turning cost into value i love it you're taking the original premise of the cloud taking away the undifferentiated heavy lifting for you know deploying compute storage and networking now bringing up to the data level the analytics level so it continues the cloud continues to expand thank you for watching thecube's coverage of aws storage day 2022
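As an editorial aside, the tag-driven protection policy described above can be sketched in code. The sketch below builds the request bodies that AWS Backup's `CreateBackupPlan` and `CreateBackupSelection` APIs accept via boto3, selecting resources by tag; the plan name, vault name, schedule, and the `criticality` tag key are illustrative assumptions, not values from the conversation.

```python
# Sketch of a tag-driven AWS Backup policy, in the spirit of the discussion above.
# Names, tag keys, and retention values are illustrative assumptions.

def mission_critical_backup_plan(vault_name: str) -> dict:
    """Request body shaped for AWS Backup's CreateBackupPlan API."""
    return {
        "BackupPlan": {
            "BackupPlanName": "mission-critical-daily",
            "Rules": [{
                "RuleName": "daily-35-day-retention",
                "TargetBackupVaultName": vault_name,
                "ScheduleExpression": "cron(0 5 ? * * *)",  # daily at 05:00 UTC
                "Lifecycle": {"DeleteAfterDays": 35},
            }],
        }
    }

def tag_based_selection(iam_role_arn: str) -> dict:
    """Selection that picks up any resource tagged criticality=mission-critical."""
    return {
        "BackupSelection": {
            "SelectionName": "by-criticality-tag",
            "IamRoleArn": iam_role_arn,
            "ListOfTags": [{
                "ConditionType": "STRINGEQUALS",
                "ConditionKey": "criticality",
                "ConditionValue": "mission-critical",
            }],
        }
    }

if __name__ == "__main__":
    plan = mission_critical_backup_plan("prod-vault")
    sel = tag_based_selection("arn:aws:iam::123456789012:role/backup-role")
    # With boto3 these would be submitted roughly as:
    #   client = boto3.client("backup")
    #   resp = client.create_backup_plan(**plan)
    #   client.create_backup_selection(BackupPlanId=resp["BackupPlanId"], **sel)
    print(plan["BackupPlan"]["Rules"][0]["RuleName"])
```

Tagging data for mission criticality, as described in the interview, is what lets a single selection rule follow resources as they come and go, and what Backup Audit Manager can then audit against.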

Published Date : Aug 5 2022


Starburst Panel Q2


 

>> We're back with Justin Borgman of Starburst and Richard Jarvis of EMIS Health. Okay, we're going to get into lie number two, and that is this: an open source based platform cannot give you the performance and control that you can get with a proprietary system. Is that a lie? Justin, the enterprise data warehouse has been pretty dominant, and its stack has evolved and matured over the years. Why is it not the default platform for data?
>> Yeah, well, I think that's become a lie over time. So, you know, if we go back 10 or 12 years to the advent of the first data lake, really around Hadoop, that probably was true: you couldn't get the performance that you needed to run fast, interactive SQL queries in a data lake. Now, a lot has changed in 10 or 12 years. I remember in the very early days, people would say you'll never get performance because you need to store data in a columnar format. And then columnar formats were introduced to data lakes: you have the Parquet, ORC, and Avro file formats that were created to ultimately deliver performance out of that. So, okay, we got largely over the performance hurdle. More recently, people will say, well, you don't have the ability to do updates and deletes like a traditional data warehouse. And now we've got the creation of new data formats, again like Iceberg and Delta and Hudi, that do allow for updates and deletes. So I think the data lake has continued to mature. And I remember a quote from Curt Monash many years ago where he said it takes six or seven years to build a functional database. I think that's right, and now we've had almost a decade go by. So these technologies have matured to really deliver very, very close to the same level of performance and functionality as cloud data warehouses.
So I think the reality is that it's become a lie, and now we have giant hyperscale internet companies that don't have a traditional data warehouse at all; they do all of their analytics in a data lake. So I think we've proven that it's very much possible today.
>> Thank you for that. And so, Richard, talk about your perspective as a practitioner in terms of what open brings you versus closed. I mean, open is a moving target; I remember when Unix used to be open systems, so it's an evolving spectrum. But from your perspective, what does open give you that you can't get from a proprietary system, or what are you fearful of in a proprietary system?
>> I suppose for me, open buys us the ability to be unsure about the future, because one thing that's always true about technology is that it evolves in a direction slightly different to what people expect, and what you don't want is to have backed yourself into a corner that then prevents you from innovating. So if you have chosen a technology and you've stored trillions of records in that technology, and suddenly a new way of processing or machine learning comes out, you want to be able to take advantage; your competitive edge might depend upon it. And so, for us, we acknowledge that we don't have perfect vision of what the future might be. By backing open storage technologies, we can apply a number of different technologies to the processing of that data, and that gives us the ability to remain relevant and innovate on our data storage. And we have bought our way out of any performance concerns, because we can use cloud-scale infrastructure to scale up and scale down as we need. We don't have the concern that we don't have enough hardware today to process what we want to achieve; we can just scale up when we need it and scale back down. So open source has really allowed us to remain at the cutting edge.
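The point above about columnar formats (Parquet, ORC, Avro) is easy to see in miniature. The toy sketch below is not any real file format; it simply contrasts a row layout with a column layout to show why an analytic aggregate over one field only has to touch that field's array in the columnar case, which is the property the lake formats exploit.

```python
# Toy illustration of row-oriented vs. columnar layout (not a real file format).

rows = [
    {"user": "a", "bytes": 120, "region": "us-east"},
    {"user": "b", "bytes": 300, "region": "eu-west"},
    {"user": "c", "bytes": 80,  "region": "us-east"},
]

# Columnar: one contiguous array per field, analogous to how Parquet or ORC
# lay data out on disk.
columns = {
    "user":   [r["user"] for r in rows],
    "bytes":  [r["bytes"] for r in rows],
    "region": [r["region"] for r in rows],
}

# Row layout: every record must be visited, dragging all fields along.
total_row = sum(r["bytes"] for r in rows)

# Column layout: the scan reads exactly one array and can skip the rest,
# which is also what enables per-column compression and encoding.
total_col = sum(columns["bytes"])

assert total_row == total_col == 500
```

At data-lake scale, skipping the untouched columns is the difference between scanning terabytes and scanning gigabytes for the same query.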
>> So Justin, let me play devil's advocate here a little bit. I've talked to Zhamak about this, and obviously her vision is that data mesh is open source: open source tooling, not proprietary. You're not going to buy a data mesh; you're going to build it with open source tooling, and vendors like you are going to support it. But come back to today: you can get to market with a proprietary solution faster. I'm going to make that statement; you tell me if it's a lie. And then you can say, okay, we support Apache Iceberg, we're going to support open source tooling. Take a company like VMware, not really in the data business, but look at the way they embraced Kubernetes, and every new open source thing that comes along, they say, we do that too. Why can't proprietary systems do that and be as effective?
>> Yeah, well, I think at least within the data landscape, saying that you can access open data formats like Iceberg or others is a bit disingenuous, because really what you're selling to your customer is a certain degree of performance, a certain SLA, and those cloud data warehouses that reach beyond their own proprietary storage drop all the performance that they were able to provide. So it reminds me, again going back 10 or 12 years, of when everybody had a connector to Hadoop and thought that was the solution, right? But the reality was that a connector was not the same as running workloads in Hadoop back then. And I think, similarly, being able to connect to an external table that lives in an open data format is not going to give it the performance that your customers are accustomed to. And at the end of the day, they're always going to be predisposed, always going to be incentivized, to get that data ingested into the data warehouse, because that's where they have control.
And, you know, the bottom line is the database industry has really been built around vendor lock-in, I mean, from the start. How many people love Oracle today? But they're customers nonetheless. Lock-in is part of this industry, and I think that's really what we're trying to change with open data formats.
>> Well, it's interesting. It reminds me of when I see the gas price: I drive up and then I say, oh, that's the cash price; on a credit card I've got to pay 20 cents more. But okay. So the argument then, and let me come back to you, Justin: what's wrong with saying, hey, we support open data formats, but you're going to get better performance if you keep it in our closed system? Are you saying that long term that's going to come back and bite you? You mentioned Oracle, you mentioned Teradata. By implication, you're saying that's where Snowflake customers are headed.
>> Yeah, absolutely. I think this is a movie that we've all seen before, at least those of us who've been in the industry long enough to see it play a couple of times. So I do think that's the future. And I loved what Richard said; I actually wrote it down because I thought it was an amazing quote: it buys us the ability to be unsure of the future. That pretty much says it all. The future is unknowable, and the reality is that using open data formats, you remain interoperable with any technology you want to utilize. If you want to use Spark to train a machine learning model and you want to use Starburst to query via SQL, that's totally cool: they can both work off the same exact data sets. By contrast, if you're focused on a proprietary model, then you're locked in again to that model.
I think the same applies to data sharing, to data products, to a wide variety of aspects of the data landscape where a proprietary approach closes you off and locks you in.
>> So I would say this, Richard, and I'd love to get your thoughts on it, because I talk to a lot of Oracle customers, not as many Teradata customers, but a lot of Oracle customers, and they'll admit, yeah, they jam me on price and the license cost, but we do get value out of it. And so my question to you, Richard, is: do the, let's call them data warehouse systems, or the proprietary systems, deliver a greater ROI sooner? And is that the allure that customers are attracted to, or can open platforms deliver as fast an ROI?
>> I think the answer to that is, it can depend a bit. It depends on your business's skill set. So we are lucky that we have a number of proprietary teams that work in databases that provide our operational data capability, and we have teams of analytics and big data experts who can work with open data sets and open data formats. And so those different teams can get to an ROI more quickly with different technologies. For the business, though, we can't do better for our operational data stores than proprietary databases. Today we can back off very tight SLAs to them; we can demonstrate reliability from millions of hours of those databases being run at enterprise scale. But for an analytics workload, and increasingly our business is growing in that direction, we can't do better than open data formats with cloud-based, data-mesh-type technologies. And so it's not a simple answer; one will not always be the right answer for our business. We definitely have times when proprietary databases provide a capability that we couldn't easily represent or replicate with open technologies.
>> Yeah. Richard, stay with you.
You mentioned some things before that strike me. The Databricks-Snowflake thing is a lot of fun for analysts like me. Richard, you mentioned you have a lot of rockstar data engineers: Databricks is coming at it from a data engineering heritage, and Snowflake is coming at it from an analytics heritage. Those two worlds are colliding. People like Sanjeev Mohan have said, you know what, I think it's actually harder to play in the data engineering world; i.e., it's easier for the data engineering world to go into the analytics world than the reverse. But thinking about up-and-coming engineers and developers preparing for this future of data engineering and data analytics, how should they be thinking about the future? What's your advice to those young people?
>> So I think I'd probably fall back on general programming skill sets. The advice that I saw years ago was that if you have open source technologies, the Pythons and Javas, on your CV, you command a 20% pay hike over people who can only do proprietary programming languages. And I think that's true of data technologies as well. And from a business point of view, that makes sense: I'd rather spend the money that I save on proprietary licenses on better engineers, because they can provide more value to the business and innovate us beyond our competitors. So my advice to people who are starting here, or trying to build teams to capitalize on data assets, is to begin with open, license-free capabilities, because they're very cheap to experiment with and they generate a lot of interest from people who want to join you as a business. And you can make them very successful early doors on your analytics journey.
>> It's interesting.
Again, analysts like myself do a lot of TCO work, and have over the last 20-plus years, and in the world of Oracle, normally it's the staff that's the biggest nut in total cost of ownership; but not with Oracle, where the license cost is by far the biggest component of the pie. All right, Justin, help us close out this segment. We've been talking about this sort of data mesh, open versus closed, Snowflake, Databricks. Where does Starburst, as this engine for the data lake, the data lakehouse, the data warehouse, fit in this world?
>> Yeah. So our view on how the future ultimately unfolds is that we think data lakes will be a natural center of gravity, for a lot of the reasons that we described: open data formats and the lowest total cost of ownership, because you get to choose the cheapest storage available to you. Maybe that's S3 or Azure Data Lake Storage or Google Cloud Storage, or maybe it's on-prem object storage that you bought at a really good price. So ultimately, storing a lot of data in a data lake makes a lot of sense. But I think what makes our perspective unique is that we still don't think you're going to get everything there either. We think that centralization of all your data assets is basically an impossible endeavor, and so you want to be able to access data that lives outside of the lake as well. So we think of the lake as maybe the biggest place by volume in terms of how much data you have, but to have comprehensive analytics and to truly understand your business holistically, you need to be able to go access other data sources as well. And so that's the role that we want to play: to be a single point of access for our customers, to provide the right level of fine-grained access control so that the right people have access to the right data, and ultimately to make it easy to discover and consume via the creation of data products as well.
>> Great. Okay. Thanks guys.
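Mechanically, the "single point of access" described above is a federated query: one SQL statement that joins a table in the lake with a table in an operational store, submitted to one engine. A minimal sketch follows, with hypothetical catalog, schema, and table names (a `hive` catalog over the lake and a `postgresql` catalog for an operational database); it only builds the SQL text and does not assume a running cluster.

```python
# Sketch of a federated query a Trino/Starburst-style engine could run.
# Catalog, schema, and table names here are hypothetical.

LAKE_TABLE = "hive.web.clickstream"            # open-format table in the lake
OPERATIONAL_TABLE = "postgresql.crm.customers"  # table living outside the lake

def federated_join(lake_table: str, op_table: str) -> str:
    """Build one SQL statement that spans two catalogs."""
    return (
        "SELECT c.customer_id, count(*) AS clicks\n"
        f"FROM {lake_table} AS e\n"
        f"JOIN {op_table} AS c ON e.user_id = c.user_id\n"
        "GROUP BY c.customer_id"
    )

sql = federated_join(LAKE_TABLE, OPERATIONAL_TABLE)
print(sql)
# With the Trino Python client, this would be submitted roughly as:
#   conn = trino.dbapi.connect(host="coordinator", port=8080, user="analyst")
#   conn.cursor().execute(sql)
```

Because the engine resolves each `catalog.schema.table` name to its own connector, the fine-grained access control mentioned in the interview can be enforced at that single entry point rather than in every downstream tool.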
Right after this quick break, we're going to be back to debate whether the cloud data model that we see emerging, the so-called modern data stack, is really modern, or is it the same wine in a new bottle? You're watching theCUBE, the leader in enterprise and emerging tech coverage.

Published Date : Aug 2 2022


Breaking Analysis: How the cloud is changing security defenses in the 2020s


 

>> Announcer: From theCUBE studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is "Breaking Analysis" with Dave Vellante.
>> The rapid pace of cloud adoption has changed the way organizations approach cybersecurity. Specifically, the cloud is increasingly becoming the first line of cyber defense. As such, along with communicating to the board and creating a security-aware culture, the chief information security officer must ensure that the shared responsibility model is being applied properly. Meanwhile, the DevSecOps team has emerged as the critical link between strategy and execution, while audit becomes the free safety, if you will, in the equation, i.e., the last line of defense. Hello, and welcome to this week's Wikibon CUBE Insights, powered by ETR. In this "Breaking Analysis", we'll share the latest data on hyperscale IaaS and PaaS market performance, along with some fresh ETR survey data. And we'll share some highlights, and the puts and takes, from the recent AWS re:Inforce event in Boston. But first, the macro. It's earnings season, and that's what many people want to talk about, including us. As we reported last week, the macro spending picture is very mixed and weird. Think back to a week ago when SNAP reported. A player like SNAP misses, and the Nasdaq drops 300 points. Meanwhile, Intel, the great semiconductor hope for America, misses by a mile, cuts its revenue outlook by 15% for the year, and the Nasdaq was up nearly 250 points just ahead of the close. Go figure. Earnings reports from Meta, Google, Microsoft, ServiceNow, and some others underscored cautious outlooks, especially those exposed to the advertising revenue sector. But at the same time, Apple, Microsoft, and Google were, let's say, less bad than expected, and that brought a sigh of relief. And then there's Amazon, which beat on revenue, beat on cloud revenue, and gave positive guidance.
The Nasdaq has seen its best month this month since the isolation economy, which "Breaking Analysis" contributor Chip Symington attributes to what he calls an oversold rally. But many unknowns remain. How bad will inflation be? Will the Fed really stop tightening after September? The Senate just approved a big spending bill along with corporate tax hikes, which generally don't favor the economy. And on Monday, August 1st, the market will likely realize that we are in the summer quarter, and there's some work to be done. Which is why it's not surprising that investors sold the Nasdaq at the close today, on Friday. Are people ready to call the bottom? Hmm, some maybe, but there's still lots of uncertainty. However, the cloud continues its march, despite some very slight deceleration in growth rates from the two leaders. Here's an update of our big four IaaS quarterly revenue data. The big four hyperscalers will account for $165 billion in revenue this year, slightly lower than what we had last quarter. We expect AWS to surpass $83 billion in revenue this year. Azure will be more than two-thirds the size of AWS, a milestone for Microsoft. Both AWS and Azure came in slightly below our expectations, but still posted very solid growth at 33% and 46% respectively. GCP, Google Cloud Platform, is the big concern. By our estimates, GCP's growth rate decelerated from 47% in Q1 to 38% this past quarter. The company is struggling to keep up with the two giants. Remember, both GCP and Azure play a shell game and hide the ball on their IaaS numbers, so we have to use survey data and other means of estimating. But this is how we see the market shaping up in 2022. Now, before we leave the overall cloud discussion, here's some ETR data that shows the net score, or spending momentum, granularity for each of the hyperscalers. These bars show the breakdown for each company, with net score on the right and, in parentheses, net score from last quarter.
Lime green is new adoptions, forest green is spending up 6% or more, gray is flat, pink is spending down 6% or worse, and bright red is replacement, or churn. Subtract the reds from the greens and you get net score. One note: this is for each company's overall portfolio, so it's not just cloud. It's a bit of a mixed bag, but there are a couple of points worth noting. First, anything above 40%, the 40 line shown in the chart, is considered elevated. AWS, as you can see, is well above that 40% mark, as is Microsoft. And if you isolate Microsoft's Azure, only Azure, it jumps above AWS's momentum. Google is just barely hanging on to that 40 line, and Alibaba is well below, with both Google and Alibaba showing much higher replacements, that bright red. But here's the key point: AWS and Azure have virtually no churn, no replacements, in that bright red. And all four companies are experiencing single-digit numbers in terms of decreased spending within customer accounts. People may be moving some workloads back on-prem selectively, but repatriation is definitely not a trend to bet the house on, in our view. Okay, let's get to the main subject of this "Breaking Analysis". TheCUBE was at AWS re:Inforce in Boston this week, and we have some observations to share. First, we had keynotes from Steven Schmidt, who used to be the chief information security officer at Amazon Web Services; now he's the CSO, the chief security officer, of Amazon. Overall, he dropped the I in his title. CJ Moses is the CISO for AWS. Kurt Kufeld of AWS also spoke, as did Lena Smart, who's the MongoDB CISO; she keynoted and also came on theCUBE. We'll go back to her in a moment. The key point Schmidt made, one of them anyway, was that Amazon sees more data points in a day than most organizations see in a lifetime. Actually, it adds up to quadrillions over a fairly short period of time; I think it was within a month. That's quadrillions, numbers with 15 zeros, by the way.
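The net score arithmetic just described, subtract the reds from the greens, can be written down directly. A small sketch with made-up survey percentages, not actual ETR data:

```python
# ETR-style net score: (% new adoptions + % spending up)
# minus (% spending down + % replacing). Flat spend is ignored.

def net_score(new: float, up: float, flat: float, down: float, churn: float) -> float:
    # The five buckets partition the survey respondents, so they must sum to 100%.
    assert abs(new + up + flat + down + churn - 100.0) < 1e-9, "shares must total 100%"
    return float((new + up) - (down + churn))

# Hypothetical vendor: 10% new, 45% up, 38% flat, 5% down, 2% replacing.
score = net_score(10, 45, 38, 5, 2)
print(score)  # 48.0, above the elevated 40% line noted in the chart
```

Note that flat spenders dilute nothing in this formula, which is why a vendor with lots of gray can still sit below the 40% line despite having almost no red.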
Now, there was a drill-down focus on data protection and privacy; governance, risk, and compliance (GRC); identity, a big, big topic both within AWS and the ecosystem; network security; and threat detection. Those are the five highlighted areas. Re:Inforce is really about bringing a lot of best-practice guidance to security practitioners, like how to get the most out of AWS tooling. Schmidt had a very strong statement, saying, "I can assure you with 100% certainty that single controls and binary states will absolutely positively fail." Hence the importance, of course, of layered security. We heard a little bit of chat about getting ready for the future and skating to the security puck, where quantum computing threatens to hack all of the existing cryptographic algorithms, and how AWS is trying to get in front of all that; a new set of algorithms that AWS is testing came out. And, you know, we'll maybe talk about that in the future, but that's a ways off. And by its prominent presence, the ecosystem was there in force to talk about its role in filling the gaps and picking up where AWS leaves off. We heard a little bit about ransomware defense, but surprisingly, at least in the keynotes, there was no discussion about air gaps, which we've talked about in previous "Breaking Analysis" episodes as a key factor. We heard a lot about services to help with threat detection and container security and DevOps, et cetera, but there really wasn't a lot of specific talk about how AWS is simplifying the life of the CISO. Now, maybe it's inherently assumed, as AWS did a good job stressing that security is job number one, very credible and believable on that front. But you have to wonder if the world is getting simpler or more complex with cloud. And, you know, you might say, "Well, Dave, come on, of course it's better with cloud." But look, attacks are up, the threat surface is expanding, and new exfiltration records are being set every day.
I think the hard truth is, the cloud is driving businesses forward and accelerating digital, and those businesses are now exposed more than ever. And that's why security has become such an important topic to boards and throughout the entire organization. Now, the other epiphany that we had at re:Inforce is that there are new layers and a new trust framework emerging in cyber. Roles are shifting, and as a direct result of the cloud, things are changing within organizations. This first hit me in a conversation with Mike Versace, a long-time cyber practitioner, friend, and Wikibon colleague from our early Wikibon days. And I spent two days testing the premise that Michael and I talked about. Here's an attempt to put that conversation into a graphic. The cloud is now the first line of defense. AWS specifically, but hyperscalers generally, provide the services, the talent, the best practices, and the automation tools to secure infrastructure and their physical data centers, and they're really good at it. The security inside of hyperscaler clouds is best of breed; it's world class. And that first line of defense does take some of the responsibility off of CISOs, but they have to understand and apply the shared responsibility model, where the cloud provider leaves it to the customer, of course, to make sure that the infrastructure they're deploying is properly configured. So in addition to creating a cyber-aware culture and communicating up to the board, the CISO has to ensure compliance with and adherence to the model. That includes attracting and retaining the talent necessary to succeed. Now, on the subject of building a security culture, listen to this clip on one of the techniques that Lena Smart, remember, she's the CISO of MongoDB, uses to foster awareness and build a security culture in her organization. Play the clip.
>> Having the Security Champion program, so that's just, it's like one of my babies.
That and helping underrepresented groups in MongoDB kind of get on in the tech world are both really important to me. And so the Security Champion program is purely, purely voluntary. We have over 100 members. And these are people, there's no bar to join, you don't have to be technical. If you're an executive assistant who wants to learn more about security, like my assistant does, you're more than welcome. We actually have people grade themselves when they join us. We give them a little tick box: five is "I walk on security water," one is "I can spell security, but I'd like to learn more." Mixing those groups together has been game-changing for us. >> Now, the next layer is really where it gets interesting. DevSecOps, you know, we hear about it all the time, shifting left. It implies designing security into the code at the dev level. "Shift left and shield right" is the kind of buzz phrase. But it's getting more and more complicated. So there are layers within the development cycle, i.e., securing the container so the app code can't be threatened by backdoors or weaknesses in the containers. Then, securing the runtime to make sure the code is maintained and compliant. Then, the DevOps platform, so that change management doesn't create gaps and exposures and screw things up. And this is just for the application security side of the equation. What about the network, and implementing zero-trust principles, and securing endpoints, and machine-to-machine and human-to-app communication? So there's a lot of burden being placed on the DevOps team, and they have to partner with the SecOps team to succeed. Those guys are not security experts. And finally, there's audit, which is the last line of defense, or what I called at the open the free safety, for you football fans. They have to do more than just tick the box for the board. That doesn't cut it anymore. They really have to know their stuff and make sure that what they sign off on is real.
And then you throw ESG into the mix, which is becoming more important, making sure the supply chain is green and also secure. So you can see, while much of this stuff has been around for a long, long time, the cloud is accelerating innovation and the pace of delivery. And so much is changing as a result. Now, next, I want to share a graphic that we shared last week, but with a little different twist. It's an XY graphic with net score or spending velocity on the vertical axis and overlap or presence in the dataset on the horizontal, with that magic 40% red line as shown. Okay, I won't dig into the data and draw conclusions 'cause we did that last week, but there are two points I want to make. First, look at Microsoft in the upper right-hand corner. They are big in security and they're attracting a lot of dollars in the space. We've reported on this for a while. They're a five-star security company. And every time I've run this chart from a spending standpoint in ETR data, that little methodology we use, I've wondered, where the heck is AWS? Why aren't they showing up there? If security is so important to AWS, which it is, and its customers, why aren't they spending money with Amazon on security? And I asked this very question of Merritt Baer, who resides in the office of the CISO at AWS. Listen to her answer. >> It doesn't mean don't spend on security. There is a lot of goodness that we have to offer in ESS, external security services. But I think one of the unique parts of AWS is that we don't believe that security is something you should buy, it's something that you get from us. It's something that we do for you a lot of the time. I mean, this is the definition of the shared responsibility model, right? >> Now, maybe that's good messaging to the market. Merritt, you know, didn't say it outright, but essentially, Microsoft charges for security. At AWS, it comes with the package. But it does answer my question.
And, of course, the fact is that AWS can subsidize all this with egress charges. Now, on the flip side of that, (chuckles) you've got Microsoft; they're both competing now. Take CrowdStrike, for instance. Microsoft and CrowdStrike compete with each other head to head. So it's an interesting dynamic within the ecosystem. Okay, but I want to turn to a powerful example of how AWS designs in security, and that is the idea of confidential computing. Of course, AWS is not the only one, but we're coming off of re:Inforce, and I really want to dig into something that David Floyer and I have talked about in previous episodes. And we had an opportunity to sit down with Arvind Raghu and J.D. Bean, two security experts from AWS, to talk about this subject. So let's share what we learned and why we think it matters. First, what is confidential computing? That's what this slide is designed to convey. AWS would describe it this way: it's the use of special hardware and the associated firmware that protects customer code and data from any unauthorized access while the data is in use, i.e., while it's being processed. That's oftentimes a security gap. And there are two dimensions here. One is protecting the data and the code from operators on the cloud provider side, i.e., in this case, AWS; the other is protecting the data and code from the customers themselves, in other words, from admin-level users or possible malicious actors on the customer side where the code and data is being processed. And there are three capabilities that enable this. First, the AWS Nitro System, which is the foundation for virtualization. The second is Nitro Enclaves, which isolate environments, and then third, the Nitro Trusted Platform Module, TPM, which enables cryptographic assurances of the integrity of the Nitro instances. Now, we've talked about Nitro in the past, and we think it's a revolutionary innovation, so let's dig into that a bit.
This is an AWS slide that was shared about how they protect and isolate data and code. On the left-hand side is a classical view of a virtualized architecture. You have a single host or a single server, and those white boxes represent processes on the main board, X86, or could be Intel, or AMD, or alternative architectures. And you have the hypervisor at the bottom, which translates instructions to the CPU, allowing direct execution from a virtual machine into the CPU. But notice, you also have blocks for networking, and storage, and security. And the hypervisor emulates or translates I/Os between the physical resources and the virtual machines, and it creates some overhead. Now, companies like VMware, and others, have done a great job of stripping out some of that overhead, but there's still overhead there. That's why people still like to run on bare metal. Now, while it's not shown in the graphic, there's an operating system in there somewhere which is privileged, so it's got access to these resources, and it provides the services to the VMs. Now, on the right-hand side, you have the Nitro system. And you can see immediately the differences between the left and right, because the networking, the storage, the security, the management, et cetera have been separated from the hypervisor and that main board, which has the Intel, AMD, throw in Graviton and Trainium, you know, whatever XPUs are in use in the cloud. And you can see that orange Nitro hypervisor. That is a purpose-built, lightweight component for this system. And all the other functions are separated in isolated domains. So there's very strong isolation between the cloud software and the physical hardware running workloads, i.e., those white boxes on the main board. Now, this will run at practically bare-metal speeds, and there are other benefits as well. One of the biggest is security.
As we've previously reported, this came out of AWS's acquisition of Annapurna Labs, which we've estimated was picked up for a measly $350 million, a drop in the bucket for AWS to get such a strategic asset. And there are three enablers on this side. One is the Nitro cards, which are accelerators to offload the wasted work that's done in traditional architectures, typically by the X86. We've estimated 25% to 30% of core capacity and cycles is wasted on those offloads. The second is the Nitro security chip, which is embedded and extends the root of trust to the main board hardware. And finally, the Nitro hypervisor, which allocates memory and CPU resources. So the Nitro cards communicate directly with the VMs without the hypervisor getting in the way, and they're not in the path. And all that data is encrypted while it's in motion, and of course, encryption at rest has been around for a while. We asked AWS about the architecture; we presumed it was Arm-based and wanted to confirm that, or whether it was some other type, maybe a hybrid using X86 and Arm. They told us the following, quote: "The SoCs, systems on chips, for these hardware components are purpose-built and custom designed in-house by Amazon and Annapurna Labs, the same group responsible for other silicon innovations such as Graviton, Inferentia, Trainium, and AQUA. The Nitro cards are Arm-based and do not use any X86 or X86/64-bit CPUs." Okay, so it confirms what we thought. So you may say, "Why should we even care about all this technical mumbo jumbo, Dave?" Well, a year ago, David Floyer and I published this piece explaining why Nitro and Graviton are secret weapons of Amazon that have been a decade in the making, and why everybody needs some type of Nitro to compete in the future. These Nitro innovations and the custom silicon were enabled by the Annapurna acquisition. And AWS has the volume economics to make custom silicon. Not everybody can do it.
And it's leveraging the Arm ecosystem, the standard software, and the fabrication volume, the manufacturing volume, to revolutionize enterprise computing. Nitro, with alternative processor architectures like Graviton and others, enables AWS to be on a performance, cost, and power consumption curve that blows away anything we've ever seen from Intel. And Intel's disastrous earnings results that we saw this past week are a symptom of this mega trend that we've been talking about for years. In the same way that Intel and X86 destroyed the market for RISC chips, thanks to PC volumes, Arm is blowing away X86 with volume economics that cannot be matched by Intel, thanks, of course, to mobile and edge. Our prediction is that these innovations and the Arm ecosystem are migrating and will migrate further into enterprise computing, which is Intel's stronghold. Now, that stronghold is getting eaten away by the likes of AMD, Nvidia, and of course, Arm in the form of Graviton and other Arm-based alternatives. Apple, Tesla, Amazon, Google, Microsoft, Alibaba, and others are all designing custom silicon, and doing so much faster than Intel can go from design to tape-out, roughly cutting that time in half. And the premise of this piece is that every company needs a Nitro to enable alternatives to the X86 in order to support emergent workloads that are data-rich and AI-based, and to compete from an economic standpoint. So while at re:Inforce we heard that the impetus for Nitro was security, the Arm ecosystem and its ascendancy have enabled, in our view, AWS to create a platform that will set the pace for the enterprise computing market this decade and beyond. Okay, that's it for today. Thanks to Alex Morrison, who is on production. And he does the podcast. And Ken Schiffman, our newest member of our Boston Studio team, is also on production. Kristen Martin and Cheryl Knight help spread the word on social media and in the community.
And Rob Hof is our editor-in-chief over at SiliconANGLE. He does some great, great work for us. Remember, all these episodes are available as podcasts. Wherever you listen, just search "Breaking Analysis" podcast. I publish each week on wikibon.com and siliconangle.com. Or you can email me directly at David.Vellante@siliconangle.com, DM me @dvellante, or comment on my LinkedIn posts. And please do check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights, powered by ETR. Thanks for watching. Be well, and we'll see you next time on "Breaking Analysis." (upbeat theme music)

Published Date : Jul 30 2022


Karl Mattson, Noname Security | AWS re:Inforce 2022


 

>>Hello, and welcome to AWS re:Inforce, here live in Boston, Massachusetts. I'm John Furrier, host of theCUBE. We're here with Karl Mattson, CSO at Noname Security. That's right, Noname Security. Noname Security is also a featured partner at season two, episode four of our upcoming AWS startup showcase, a security-themed event happening at the end of August. Look for that at this URL, AWS startups.com. But we're here at re:Inforce, Karl. Thanks for joining me today. Good to see >>You. Thank you for having us, John. >>So this show, security, it's not as packed as the AWS Summit was in New York. That just happened two weeks ago with 19,000 people; this is a more focused crowd. A lot's at stake, operations are under pressure. The security teams are under a lot of pressure as apps drive more and more cloud-native goodness. As we say, the genie's out of the bottle; people want more cloud-native apps. Absolutely. That's put a lot of pressure on the ops teams and the security teams. That's the core theme here. How do you see it happening? How do you see this unfolding? Do you agree with that? And how would you describe today's event? >>Well, I think you're spot on. I think the future of IT is increasingly becoming the story of developers and APIs becoming the hero: the hero of digital transformation, the hero of public cloud adoption. And so this is really becoming much more of a developer-centric discussion about where we're moving our applications and where they're hosted, but also how they're designed. And so there's a lot of energy around that right now, around focusing security capabilities that really appeal to the sensibilities and the needs of modern applications. >>I want to get to Noname Security in a second and let you explain what you guys do. Then I'll have a few good questions for you to kind of unpack that.
But the thing about the structural change that's happened with cloud computing is kind of in the past now: DevOps, cloud scale, large-scale data, the rise of the supercloud companies like Snowflake and Capital One, examples of companies that don't even have CapEx investments, building on the cloud. And in a way, the success of DevOps has created another sea of problems and opportunities, that is, more complexity. As the benefits of DevOps and open source continue to rise, agile applications that have value can be quantified. There's no doubt, with the pandemic, that the value is there. Yeah. Now you have the collateral damage of success: a new opportunity to abstract away more complexity, to go to the next level. Yep. This is a big industry thing. What are the key opportunities and areas of this new environment, 'cause that's the structural change happening now? Yep. What are the key dynamics right now that are driving this new innovation, and what are some of those problem areas that are gonna be abstracted away that you see? >>Well, the first thing I'd suggest is to lean into those structural changes and take advantage of them where they become an advantage for governance, security, and risk. A perfect example is automation. So what we have in microservices applications and cloud infrastructures and new workloads like Snowflake is workloads that want to talk; they want to be interoperated with. And because of that, we can develop new capabilities that take advantage of those capabilities. And so what we want to have on our security teams in particular is the talent and the tools that are leaning into and capitalizing on exactly those strengths of the underlying capabilities that you're securing, rather than countering that trend. The security professional needs to get ahead of it and be a part of that discussion with the developers and the infrastructure teams.
>>And, again, the structural change could kill you too as well. I mean, there are some benefits; you know, data's the new oil, but at the end of the day it could be a problematic thing. Sure. All right. So let's get to that. Noname: talk about the company, what you guys do. You have an interesting approach, heavily funded, good success, good buzz. What's going on with the company? Give the quick overview. >>Well, we're a company that's just under three years old, and APIs go back, of course, a decade or more. We've all been using APIs for a long time, but what's really shifted over the last couple of years is the transition of applications, and especially business-critical processes, to now riding on top of public-facing APIs, where the API used to be the behind-the-scenes interconnection between systems. Now those APIs are exposed, they're public-facing. And so what we focus on as a company is looking at that API as a software endpoint, just like any other endpoint in our environments that we're historically used to. That's an endpoint that needs full life cycle protection. It needs to be designed well, with secure coding standards for APIs, and tested well. It also has to be deployed into production, configured well, and operated well. And when there's a misuse or an attack in progress, we have to be able to protect and identify the risks to that API in production. So when you add that up, we're looking at a full life cycle view of the API, and it's about time, because the API is not new, yet we're just now starting to apply actual discipline and practices that help keep that API secure. >>Yeah, it's interesting. It's like what I was saying earlier: they're not going anywhere. They're the underpinning, the underlying benefit of cloud. Yes. Cloud native. So it's just more operational stability, scale, growth.
What are some of the challenges that are there, and what do you guys do particularly to solve them? You're protecting it. Are you scaling it? What specifically are you guys addressing? >>But sure. So I think API security, even as a discipline, is not new, but I think the traditional look at API security looks only at the quality of the source code. Certainly, quality of the source code of the API is sort of step one. But what we see in practice is that most of the publicly known API compromises weren't because of bad source code; they were because of network misconfiguration or the misapplication of policy during runtime. So a great example of that would be: a developer designs an API in such a way that it enforces authentication, well designed and strong, and then in production, those authentication policies are not applied at a gateway. So what we add to the conversation on API security is helping fill all those little gaps from design and testing through production, so we can see all of the moving parts in the context of the API, to see how it can be exploited and how we can reduce risk. >>So this is really about hardening the infrastructure, yep, around it, 'cause the developer did their job in that example. Yep. So academically the API is well formed and working, but something didn't happen on the network or gateway box or app, you know, some sort of network configuration or middleware configuration. >>Absolutely. So in our platform, we essentially have sort of three functional areas. There's API code testing, and then what we call next is posture management. Posture management's a real thing. If we're talking about a laptop, we're talking about: is it up to date with patches? Is it configured well? Is its network connectivity secure? The same is true with APIs. They have to be managed and cared for by somebody who's looking at their posture on the network.
And then of course there's threat defense and runtime protection. So that posture management piece, that's really a new entrant into the discussion on API security. And that's really where we started as a company, focusing on that sort of acute gap of information. >>Posture, protection, >>Posture and protection. Absolutely. >>Define that. What does posture protection mean? How would you define that? >>Sure. I think it's identifying the inherent risk exposure of an API. A great example of that would be an API that is addressable by internal systems and external systems at the same time. Almost always, that is an error. It's a mistake that's been made. So by identifying that misconfiguration of posture, we can then protect that API by restricting the internet connectivity externally. That's just a great example of posture. We see almost every organization has that, and it's never intended. >>Great, great, great call out. Thanks for sharing. All right, so I'm a customer. Yep. Okay. Look at, hey, I already got an app firewall, API gateway. Why do I need another tool? >>Well, first of all, web application firewalls are sort of essential parts of a security ecosystem. An API management gateway is usually the brain of an API economy. What we do is we augment those platforms with what they don't do well, and also when they're not used. So for example, in any environment, we aspire to have all of our applications or APIs protected by a web application firewall. First question is, are they even behind the WAF at all? We're gonna find that the WAF doesn't know if it's not protecting something. And then secondarily, there are attack types, of business logic in particular, like authentication policy, that a WAF is not gonna be able to see. So the WAF and the API management plane, those are the key control points, and we can help make those better.
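The posture example above, an API addressable by internal and external systems at the same time, is simple enough to express as a rule over an API inventory. A hypothetical sketch; the inventory format and field names here are invented for illustration and are not Noname's actual data model, which a real platform would derive from gateway configs and network observation:

```python
# Flag APIs whose posture shows both internal and external addressability,
# the "almost always an error" case described in the interview.
def dual_exposure_findings(inventory):
    """Return the names of APIs reachable from both network zones."""
    findings = []
    for api in inventory:
        if api["internal_addressable"] and api["external_addressable"]:
            findings.append(api["name"])
    return findings

# Hypothetical inventory records for illustration.
inventory = [
    {"name": "billing-v2", "internal_addressable": True, "external_addressable": True},
    {"name": "health-check", "internal_addressable": False, "external_addressable": True},
    {"name": "ledger-internal", "internal_addressable": True, "external_addressable": False},
]
print(dual_exposure_findings(inventory))  # ['billing-v2'] -- almost always unintended
```

The remediation in the transcript follows directly: for each finding, restrict the external route and leave the internal one intact.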
>>You know what I think is cool, Karl: you're bringing up a point that we're seeing here and we've seen before, but now it's kind of coming into visibility. And it was mentioned in the keynote by one of the presenters, Kurt, I think it was, who runs the platform. This idea of reasoning is coming into security. So the idea of knowing the topology, knowing that there's dynamic stuff going on. I mean, topologies aren't static anymore. Yep. And now you have more microservices. Yep. More APIs being turned on and off; this runtime is interesting. So you're starting to see this holistic view of, hey, the secret sauce is you gotta be smarter. Yep. And that's either machine learning or AI. So how does that relate to what you guys do? Cuz it sounds like you've got something of that going on with the product. Is that fair, or yeah? >>Yeah, absolutely. So we, yeah, we talked about posture, so that's really the inherent quality or secure posture of an API. And now let's talk about sending traffic through that API, the request and response. When we're talking about organizations that have more APIs than they have people, employees, or tens of thousands, as we're seeing in some customers, the only way to identify anomalous traffic is through machine learning. So we apply a machine learning model to each and every API independently, for itself, because we wanna learn how that API is supposed to behave. Where is it supposed to be talking? What kind of data is it supposed to be trafficking in, in all its facets? So we can model that activity and then identify the anomaly: where there's a misuse, there's an attacker event, there's an insider, an employee doing something with that API that's different. And that's really key with APIs: no two APIs are alike. Yeah. They really do have to be modeled individually. I can't share my threat signatures for my API with your organization, cuz your APIs are different.
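The per-API modeling Mattson describes, learning each endpoint's own baseline rather than sharing signatures across organizations, can be sketched very crudely with a per-endpoint statistical baseline. This toy stand-in tracks one feature (request size) and flags z-score outliers; a production system would model many more facets, but the key structural idea, one model per API, is the same:

```python
import statistics

# A crude per-API baseline: model each endpoint's request sizes separately
# and flag requests far outside that endpoint's own history. This is a toy
# stand-in for the per-API machine-learning models described above.
class ApiBaseline:
    def __init__(self):
        self.history = {}  # API name -> list of observed request sizes

    def observe(self, api, size):
        self.history.setdefault(api, []).append(size)

    def is_anomalous(self, api, size, z_threshold=4.0):
        sizes = self.history.get(api, [])
        if len(sizes) < 10:
            return False  # not enough history on this API to judge yet
        mean = statistics.fmean(sizes)
        stdev = statistics.stdev(sizes) or 1e-9
        return abs(size - mean) / stdev > z_threshold

model = ApiBaseline()
for s in [200, 210, 195, 205, 201, 199, 202, 208, 197, 203]:
    model.observe("orders-api", s)

print(model.is_anomalous("orders-api", 204))     # False: typical traffic
print(model.is_anomalous("orders-api", 50000))   # True: e.g. a bulk-exfiltration read
```

Because the baseline is keyed per API, the anomaly threshold for one endpoint says nothing about another, which is the point Mattson makes about why signatures don't transfer.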
And so we have to have that machine learning approach in order to really identify that >>Anomaly, and watch the credentials, permissions. Absolutely, all those things. All right. Take me through the life cycle of an API. There's pre-production and post-production. What do I need to know about those two areas with respect to what you guys do? >>Sure. So the pre-production activities are really putting in the hands of a developer or an AppSec team the ability to test that API during its development. Source code testing is one piece, but also in pre-production: are we modeling production variables enough to know what's gonna happen when I move it into production? So it's one thing to have secure source code, of course, but then it's also, do we know how that API's gonna interact with the world once it's sort of set free? So the testing capabilities early in the life cycle are really how we de-risk in the long term, but we all have API ecosystems that already exist. And so in production we're applying all of that same testing of posture and configuration issues in runtime. It may sound cliché to say we wanna shift security left, but in APIs that's a hundred percent true. We want to keep moving our issue detection to the earliest possible point in the development of an API. And that gives us the greatest return on the API, which is what we're all looking for: to capitalize on it as an agent of transformation. >>All right, let's take the customer perspective. I'm the customer. Karl, why do I need you? And how are you different from the competition? And if I like it, how do I get started? >>Sure. So the first thing that differentiates us from our competitors is really looking at the API as an entire life cycle of activities.
So whether it's from the documentation and the design and the secure source code testing that we can provide, you know, pre-development or pre-deployment, through production posture, through runtime, the differentiator really for us is being a one-stop shop for an entire API security program. And that's very important. And as that one-stop shop, the great thing about having a conversation with a customer is that not every customer's at the same point in their journey. And so if one customer discussion really focuses on their perhaps lack of confidence in their code testing, and maybe somebody else has a lack of confidence in their runtime detection, we can say yes to those conversations, deliver value, and then consider other things that we can do with that customer along a whole continuum of the life cycle. And so it allows us to have a customer conversation where we don't need to say, "No, we don't do that." If it's an API, the answer is, yes, we do do that. And that's really where we, you know, we have an advantage, I think, in looking at this space, being able to talk with pretty much any customer in any vertical and having a solution that gives them something of value right away. >>And how do I get started? I like it. You sold me on operationalizing it. I like the one-stop shop. My APIs are super important. I know there could be potential exposure, maybe access and then lateral movement to a workload, all kinds of stuff could happen. Sure. How do I get started? What do I do to solve >>This? Well, nonamesecurity.com, of course. We have, you know, most customers do sandboxing POVs as part of a trial period with us. And especially, you know, being here at AWS is wonderful, because these are customers with whom we can integrate in a matter of minutes. We're talking about literally updating an IAM role permission. That's the complexity of implementation, because cloud-friendly workloads really allow us to do proofs of concept and value in a matter of minutes to achieve that value. So whether it's a dedicated sandbox for one customer, whether it's a full-blown POC for a complicated organization, you know, whether it's here at the AWS conference or at nonamesecurity.com, we would love to do a free demo, test drive, and assessment. >>Awesome. And now you guys are part of the elite alumni of our startup showcase, yep, where we feature the hot startups. Obviously it's the security-focused episodes, about security. You guys have been recognized by the industry and AWS as, you know, making it happen. What specifically is your relationship with AWS? Are you guys doing stuff together? Cuz they're clearly integrating with their partners. Yeah. I mean, they're going to companies and saying, hey, you know what, the more we're integrated, the better security everyone gets. What are you doing with Amazon? Can you share any tidbits? You don't have to share any confidential information, but can you give us a little taste of the relationship? >>Well, I think we have the best-case scenario with our relationship with AWS, because as a very, very small company, most of our first customers were AWS customers. And so to develop the initial integrations with AWS, what we were able to do is have our customers, oftentimes large public corporations, go to AWS and say, we need that company to be in your marketplace, we need you to be a partner. And so that partnership with AWS has really grown, you know, gone from zero to 60 miles per hour in a very short period of time.
And now, being part of the startup program, we have a variety of ways that a customer can work with us, from a direct purchase through the AWS Marketplace, through channel partners and VARs. We really have that footprint now in AWS because our customers are there, and they brought our customers to AWS with us. >>Isn't it nice. The customers pull you to AWS. Yes. It pulls you more customers. Yep. You get kind of intermingled there, provide the value. And certainly they've got the hyperscale, so >>Well, that creates depth in the relationship. So for example, as AWS itself is evolving and changing and new services become available, we are a part of that inner circle, so to speak, to know that we can make sure that our technology is sort of calibrated in advance of that service offering going out to the rest of the world. And so it's a really great vantage point to be in as a startup. >>Well, Karl, the CISO for Noname Security, you're here on the ground. You partner with AWS. What do you think of the show this year? What's the theme? What's the top story, one or two stories, that you think are the most important stories that people should know about happening here in the security world? >>Well, I don't think it's any surprise that almost every booth in the exhibit hall has the words "cloud native" associated with it. But I also think that's the best thing about it, which is we're seeing companies, and I think Noname is a part of that trend, who have designed capabilities and technologies to take advantage of and lean into what the cloud has to offer rather than compensating. For example, five years ago, when we were all maybe wondering, will the cloud ever be as secure as my own data center, those days are over. And we now have companies that have built highly sophisticated capabilities here in the exhibit hall that are remarkably better at securing the cloud applications in our environments.
So it's a real win for the cloud. It's something of a victory lap. If you hadn't already been there, you should be there at this point. >> Yeah. And the structural change happening now is clear, and I'd love to get your reaction if you agree with it: the ops and security teams are now being pulled up to the level that the developers are succeeding at, meaning they have to be in the boat together. >> Oh, lines of reporting responsibility are becoming less and less meaningful, and that's a good thing. We're having just as many conversations with developers, API management centers of excellence, and cloud infrastructure teams as we are with security teams. And that's a good thing, because we're finally starting to have some degree of convergence around where our interests lie in securing cloud assets. >> So developers, ops, and security, all in the boat together, in sync, winning together. >> We win together, but we don't win on day one. We have to practice. As organizations, we have to rethink our tech stack, but we also have to rethink our organizational models and our processes to get there. >> That, and keeping that boat in calm waters. Carl, thanks for coming on. Noname Security — why the name? Just curious. I love that name, 'cause of the restaurant here in Boston, for all the people that know it. So why "Noname"? >> Well, it was sort of accidental. In the company's first few weeks, there was an advisory board of CISOs who provide feedback to seed-stage companies on their concept of where they're going to build platforms. And so, in the absence of a name, the founders and the original investor filled out a form putting "no name" as the name of this company that was about to develop an API security solution. 
Well, amongst this board of CISOs, there was basically unanimous feedback that what they needed to do was keep the name. If nothing else, keep the name — "Noname" is a brilliant name. And that was very much accidental, really just a circumstance of not having picked one. But, you know, a few weeks passed, and all of a sudden they were locked in, because by popular vote, Noname - 
>> Was formed. Yeah. And now the legacy, the origin story, is known here on theCUBE. Carl, thanks for coming on. Really appreciate it. >> Thank you, John. >> Okay, we're here live on the show floor of AWS re:Inforce in Boston, Massachusetts. I'm John Furrier with Dave Vellante, who's out and about getting the stories in the trenches in the analyst meeting. He'll be right back with me shortly. Stay tuned for more CUBE coverage after this short break.
Published Date : Jul 26 2022



Ameesh Divatia, Baffle | AWS re:Inforce 2022


 

(upbeat music) >> Okay, welcome back everyone to live coverage here at theCUBE in Boston, Massachusetts, for AWS re:Inforce 22, the security conference for Amazon Web Services. Obviously re:Invent at the end of the year is the big celebration, and "re:Mars" is the new show that we've covered as well; theCUBE is at all the "re:" events. I'm John Furrier, host, with a great guest, Ameesh Divatia, co-founder and CEO of a company called "Baffle." Ameesh, thanks for joining us on theCUBE today, congratulations. >> Thank you. It's good to be here. >> And we got the custom encrypted socks. >> Yup, limited edition. >> 64-bit or 128? >> Base64 encoding. >> Okay. (chuckles) >> Secret message in there. >> Okay. (chuckles) Secret message. We'll have to put a little meme on the internet, figure it out. Well, thanks for comin' on. You guys are a hot startup, and you're in an area that's going to explode, we believe. >> Yeah. >> The SuperCloud is here; we've been covering on theCUBE how people are building on top of the Amazon hyperscalers. And without the capex, they're building platforms. The application tsunami has come and is still coming; it's not stopping. Modern applications are faster, they're better, and they're driving a lot of change under the covers. >> Absolutely. Yeah. >> And you're seeing structural change happening in real time: in ops, the network. You guys got something going on in the encryption area. >> Yes. >> Data. Talk about what you guys do. >> Yeah. So we believe very strongly that the next frontier in security is data. We've had multiple waves in security; the next one is data, because data is really where the threats will persist. If the data shows up in the wrong place, you get into a lot of trouble with compliance. So we believe in protecting the data all the way down at the field, or record, level. That's what we do. >> And are you guys doing all kinds of encryption, or other things? >> Yes. 
So we do data transformation, which encompasses three different things. It can be tokenization, which is format preserving. We do real encryption with counter mode, or we can do masked views. So tokenization, encryption, and masking, all with the same platform. >> So pretty wide ranging capabilities with respect to having that kind of safety. >> Yes. Because it all depends on how the data is used down the road. Data is created all the time. Data flows through pipelines all the time. You want to make sure that you protect the data, but don't lose the utility of the data. That's where we provide all that flexibility. >> So Kurt was on stage today on one of the keynotes. He's the VP of the platform at AWS. >> Yes. >> He was talking about encrypts, everything. He said it needs, we need to rethink encryption. Okay, okay, good job. We like that. But then he said, "We have encryption at rest." >> Yes. >> That's kind of been there, done that. >> Yes. >> And, in-flight? >> Yeah. That's been there. >> But what about in-use? >> So that's exactly what we plug. What happens right now is that data at rest is protected because of discs that are already self-encrypting, or you have transparent data encryption that comes native with the database. You have data in-flight that is protected because of SSL. But when the data is actually being processed, it's in the memory of the database or datastore, it is exposed. So the threat is, if the credentials of the database are compromised, as happened back then with Starwood, or if the cloud infrastructure is compromised with some sort of an insider threat like a Capital One, that data is exposed. That's precisely what we solve by making sure that the data is protected as soon as it's created. We use standard encryption algorithms, AES, and we either do format preserving, or true encryption with counter mode. And that data, it doesn't really matter where it ends up, >> Yeah. >> because it's always protected. >> Well, that's awesome. 
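The three transformation modes Ameesh lists — tokenization, encryption with counter mode, and masked views — can be sketched in miniature. This is a hedged illustration, not Baffle's implementation: the HMAC-based digit token stands in for real format-preserving tokenization, the masking rule is a hypothetical policy, and actual field-level encryption would use AES in counter mode via a crypto library, omitted here to keep the sketch dependency-free.

```python
import hashlib
import hmac

SECRET = b"demo-key-not-for-production"  # hypothetical key, illustration only

def tokenize_digits(value: str) -> str:
    """Format-preserving-style token for a digit string: the output has the
    same length and is all digits, so downstream schemas keep working.
    It is deterministic, so equality lookups and joins still match."""
    digest = hmac.new(SECRET, value.encode(), hashlib.sha256).digest()
    return "".join(str(b % 10) for b in digest[:len(value)])

def mask(value: str, visible: int = 4) -> str:
    """Masked view: only the last `visible` characters survive."""
    return "*" * (len(value) - visible) + value[-visible:]

# A card number keeps its shape when tokenized, and a masked view
# exposes only the tail -- two different answers to "protect the field."
card = "4111111111111111"
token = tokenize_digits(card)
masked = mask(card)
```

The deterministic property is what lets a datastore match protected records without ever seeing plaintext, which is the flavor of "processing without decrypting" discussed below.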
And I think this brings up the point that we've been covering on SiliconANGLE and theCUBE: there's been structural change that's happened, >> Yes. >> called cloud computing, >> Yes. >> and then hybrid. Okay. Scale, the role of data, higher-level abstraction of services, developers in charge, value creation, startups, and big companies. That success is now causing a new structural change. >> Yes. >> This is one of them. What areas do you see that are structurally changing right now, right in front of us? One is more cloud native. So the success has now become the problem to solve - >> Yes. >> to get to the next level. >> Yeah. >> What are some of those? >> What we see is that instead of security being an afterthought, something that you use as a watchdog, where you create ways of monitoring where data is being exposed or exfiltrated, you want to build security into the data pipeline itself. As soon as data is created, you identify what is sensitive data, and you encrypt it or tokenize it as it flows into the pipeline, using things like Kafka plugins, or, what we are very clearly differentiating ourselves with, proxy architectures, so that it's completely transparent. You think you're writing to the datastore, but you're actually writing to the proxy, which in turn encrypts the data before it's stored. >> Do you think that's an efficient way to do it, or the only way to do it? >> It is a much more efficient way of doing it, because you don't need any app-dev resources. There are many other ways of doing it; in fact, the cloud vendors provide development kits where you can just go do it yourself. That is actually something we completely avoid. And what makes it really, really interesting is that once the data is encrypted in the datastore, or database, we can do what is known as "Privacy Enhanced Computation." >> Mm. 
>> So we can actually process that data without decrypting it. >> Yeah. And so proxies then, with cloud computing, can be very fast, not the bottleneck they could be. >> In fact, the cloud makes it so. It's very hard to - >> You believe that? >> do these things in static infrastructure. In the cloud, there's an infinite amount of processing available, and there's containerization. >> And you have good network. >> You have a very good network, you have load balancers, you have ways of creating redundancy. >> Mm. So the cloud is actually enabling solutions like this. >> And in the old way, proxies were seen as an architectural fail, in the old, antiquated, static web. >> And this is where startups don't have the baggage, right? We didn't have that baggage. (John laughs) We looked at the problem and said, of course we're going to use a proxy, because this is the best way to do it efficiently. >> Well, you bring up something that's happening right now that I hear a lot of CISOs and CIOs and executives, CXOs, say all the time: "Our stuff has gotten complicated." >> Yes. >> So now I have tool sprawl, >> Yeah. >> I have skill gaps, and, on the rise, all these new managed services coming at me from vendors who have never experienced my problem. And their reaction is, they don't get my problem, they don't have the right solutions, it's more complexity. They solve the complexity by adding more complexity. >> Yes. And again, the proxy approach is very simple. >> You're solving that with that approach. >> Exactly. It's very simple. And again, we don't get in the way. That's really the biggest differentiator. The forcing function here is really compliance, right? Because compliance is forcing these CISOs to actually adopt these solutions. >> All right, so love the compliance angle, love the proxy as ease of use, taking the heavy lifting away, no operational problems or deviations. Now let's talk about workloads. >> Yeah. 
>> 'Cause this is where the use is. So you've got workloads being run at large scale, a lot of data moving around, compute as well. What's the challenge there? >> I think it's the volume of the data. Traditional solutions relying on legacy tokenization would replicate the entire storage, because they create a token vault, for example. You cannot do that at this scale. You have to do something a lot more efficient, which is where you have to do it with a cryptography approach. So the workloads are diverse: lots of large files in the workloads, as well as structured workloads. What we have is a solution that goes across the board. We can do unstructured data with HTTP proxies; we can do structured data with SQL proxies. And that's how we are able to provide a complete solution for the pipeline. >> So talk about the on-premises versus cloud workload dynamic right now. Hybrid is a steady state right now. >> Yeah. >> Multi-cloud is a consequence of having multiple vendors, not true multi-cloud, but like, okay, they have Azure there, AWS here, I get that. But hybrid really is the steady state. >> Yes. >> Cloud operations. How are the workloads and the analytics, the data, being managed on-prem and in the cloud? What's their relationship? What's the trend? What are you seeing happening there? >> I think the biggest trend we see is pipelining, right? The new ETL is streaming. You have these Kafka and Kinesis capabilities coming into the picture, where data is being ingested all the time. It is not a one-time migration; it's a stream. >> Yeah. >> So plugging into that stream is very important from an ingestion perspective. >> So it's not just a watchdog. >> No. >> It's the pipelining. >> It's built in. It's real time; that's where the streaming gets another diverse access to data. >> Exactly. >> Data lakes. You got data lakes, you have pipelines, you got streaming, you mentioned that. 
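The "plug into the stream" point — identify sensitive fields as records are ingested and protect them before they land anywhere — can be sketched as a producer-side interceptor. This is a hedged sketch: the field list and token format are hypothetical, and a real deployment would hang this off a Kafka client's serializer or interceptor hook rather than a bare function.

```python
import hashlib

SENSITIVE_FIELDS = {"email", "card_number"}  # hypothetical classification policy

def protect_record(record: dict) -> dict:
    """Tokenize sensitive fields before a record enters the pipeline,
    so every downstream consumer sees protected data by default."""
    out = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS:
            # Deterministic token: equal plaintexts yield equal tokens,
            # so joins and dedup keep working downstream.
            out[key] = "tok_" + hashlib.sha256(str(value).encode()).hexdigest()[:12]
        else:
            out[key] = value
    return out

def produce(topic: str, record: dict, sink: list) -> None:
    # Stand-in for a Kafka producer's send(); the interceptor runs first,
    # so plaintext never reaches the stream.
    sink.append((topic, protect_record(record)))
```

Because the interception happens at ingestion, it is "built in" rather than a watchdog bolted on after the data has already spread through the pipeline.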
So talk about the old-school OLTP, the old BI world. I think Power BI's like a $30 billion product. >> Yeah. >> And you got Tableau, built on OLTP, building cubes. Aren't we just building cubes in a new way, or, >> Well. >> is there any relevance to the old school? >> I think there is some relevance, and in fact that's again another place where the proxy architecture really helps, because it doesn't matter when your application was built. You can use Tableau, which nobody has any control over, and still process encrypted data. And the same with Power BI — any SQL application can be used. And that's actually exactly what we like to see. 
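The transparent proxy pattern — Tableau, Power BI, or any SQL application keeps its queries unchanged while an intermediary seals sensitive fields on the way to storage — can be sketched as a thin wrapper around a database connection. A hedged sketch: the base64 "seal" is a deliberately simple stand-in for real encryption, the protected-column policy is hypothetical, and a production proxy would intercept the SQL itself rather than expose helper methods.

```python
import sqlite3
from base64 import b64decode, b64encode

PROTECTED_COLUMNS = {"ssn"}  # hypothetical policy: which fields to protect

def _seal(value: str) -> str:
    # Stand-in for AES-CTR field encryption: reversible, deliberately weak.
    # The point of the sketch is WHERE sealing happens, not how it scrambles.
    return b64encode(value.encode()).decode()

def _unseal(value: str) -> str:
    return b64decode(value.encode()).decode()

class EncryptingProxy:
    """Sits between the app and the datastore; the app's SQL stays unchanged."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn

    def insert_user(self, name: str, ssn: str) -> None:
        sealed = _seal(ssn) if "ssn" in PROTECTED_COLUMNS else ssn
        self.conn.execute(
            "INSERT INTO users (name, ssn) VALUES (?, ?)", (name, sealed)
        )

    def get_user(self, name: str):
        row = self.conn.execute(
            "SELECT name, ssn FROM users WHERE name = ?", (name,)
        ).fetchone()
        return (row[0], _unseal(row[1])) if row else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, ssn TEXT)")
db = EncryptingProxy(conn)
db.insert_user("alice", "123-45-6789")

# What actually sits in the datastore is sealed; the proxy reads it back clear.
stored_ssn = conn.execute("SELECT ssn FROM users").fetchone()[0]
```

A compromise of the datastore's credentials exposes only sealed values, which is the threat model described earlier with the Starwood and Capital One examples.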
>> And the point is, there could be the "Grim Reaper" waiting for you if you don't do it right; the quantified consequence would be being out of business. >> Yes. But here's something we just discovered from a survey that we did. While 93% of respondents said compliance has had a big effect on their budgets, 75% actually thought it makes them better: they can use their security posture as a competitive differentiator. That's very heartening to us. We don't like to sell the fear aspect of this. >> Yeah. >> We like to sell the fact that you look better compared to your neighbor if you have better data hygiene. >> There's the fear of missing out, or, as they say, "keeping up with the Joneses," making sure your yard looks better than the next one. I get the vanity of that, but you're solving real problems. And this is interesting, and I want to get your thoughts on it. I read that you guys protect more than 100 billion records across highly regulated industries: financial services, healthcare, industrial IoT, retail, and government. Is that true? >> Absolutely. Because what we are doing is enabling SaaS vendors to allow their customers to control their data. So we've had a SaaS vendor who has been working with us for over three years now; they store confidential data from 30 different banks in the country. >> That's a lot of records. How many customers do you have? >> Well, I think - >> The next round of funding's (Ameesh laughs) probably lining up to put money into you guys. >> Well, again, this is a very important problem, and people's businesses are dependent on this. We're just happy to provide the best tool out there that can do this. >> Okay, so what's the business model behind it? I love the success, by the way; I wanted to quote that stat to verify it. What's the business model — service, software? 
>> The business model is software. We don't want anybody to send us their confidential data. We embed our software into our customers environments. In case of SaaS, we are not even visible, we are completely embedded. We are doing other relationships like that right now. >> And they pay you how? >> They pay us based on the volume of the data that they're protecting. >> Got it. >> That in that case which is a large customers, large enterprise customers. >> Pay as you go. >> It is pay as you go, everything is annual licenses. Although, multi-year licenses are very common because once you adopt the solution, it is very sticky. And then for smaller customers, we do base our pricing also just on databases. >> Got it. >> The number of databases. >> And the technology just reviewed low-code, no-code implementation kind of thing, right? >> It is by definition, no code when it comes to proxy. >> Yeah. >> When it comes to API integration, it could be low code. Yeah, it's all cloud-friendly, cloud-native. >> No disruption to operations. >> Exactly. >> That's the culprit. >> Well, yeah. >> Well somethin' like non-disruptive operations.(laughs) >> No, actually I'll give an example of a migration, right? We can do live migrations. So while the databases are still alive, as you write your. >> Live secure migrations. >> Exactly. You're securing - >> That's the one that manifests. >> your data as it migrates. >> Awright, so how much funding have you guys raised so far? >> We raised 36 and a half, series A, and B now. We raised that late last year. >> Congratulations. >> Thank you. >> Who's the venture funders? >> True Ventures is our largest investor, followed by Celesta Capital, National Grid Partners is an investor, and so is Engineering Capital and Clear Vision Ventures. >> And the seed and it was from Engineering? >> Seed was from Engineering. >> Engineering Capital. >> And then True came in very early on. >> Okay. 
>> Greenspring is also an investor in us, so is Industrial Ventures. >> Well, privacy has a big concern, big application for you guys. Privacy, secure migrations. >> Very much so. So what we are believe very strongly in the security's personal, security is yours and my data. Privacy is what the data collector is responsible for. (John laughs) So the enterprise better be making sure that they've complied with privacy regulations because they don't tell you how to protect the data. They just fine you. >> Well, you're not, you're technically long, six year old start company. Six, seven years old. >> Yeah. >> Roughly. So yeah, startups can go on long like this, still startup, privately held, you're growing, got big records under management there, congratulations. What's next? >> I think scaling the business. We are seeing lots of applications for this particular solution. It's going beyond just regulated industries. Like I said, it's a differentiating factor now. >> Yeah >> So retail, and a lot of other IOT related industrial customers - >> Yeah. >> are also coming. >> Ameesh, talk about the show here. We're at re:inforce, actually we're live here on the ground, the show floor buzzing. What's your takeaway? What's the vibe this year? What if you had to share what your opinion the top story here at the show, what would be the two top things, or three things? >> I think it's two things. First of all, it feels like we are back. (both laugh) It's amazing to see people on the show floor. >> Yeah. >> People coming in and asking questions and getting to see the product. The second thing that I think is very gratifying is, people come in and say, "Oh, I've heard of you guys." So thanks to digital media, and digital marketing. >> They weren't baffled. They want baffled. >> Exactly. >> They use baffled. >> Looks like, our outreach has helped, >> Yeah. >> and has kept the continuity, which is a big deal. >> Yeah, and now you're a CUBE alumni, welcome to the fold. >> Thank you. 
>> Appreciate you coming on. And we're looking forward to profiling you some day in our startup showcase, and certainly, we'll see you in the Palo Alto studios. Love to have you come in for a deeper dive. >> Sounds great. Looking forward to it. >> Congratulations on all your success, and thanks for coming on theCUBE, here at re:inforce. >> Thank you, John. >> Okay, we're here in, on the ground live coverage, Boston, Massachusetts for AWS re:inforce 22. I'm John Furrier, your host of theCUBE with Dave Vellante, who's in an analyst session, right? He'll be right back with us on the next interview, coming up shortly. Thanks for watching. (gentle music)

Published Date : Jul 26 2022



Denise Hayman, Sonrai Security | AWS re:Inforce 2022


 

(bright music) >> Welcome back everyone to the live Cube coverage here in Boston, Massachusetts for AWS re:Inforce 22, with a great guest here, Denise Hayman, CRO, Chief Revenue of Sonrai Security. Sonrai's a featured partner of Season Two, Episode Four of the upcoming AWS Startup Showcase, coming in late August, early September. Security themed startup focused event, check it out. awsstartups.com is the site. We're on Season Two. A lot of great startups, go check them out. Sonrai's in there, now for the second time. Denise, it's great to see you. Thanks for coming on. >> Ah, thanks for having me. >> So you've been around the industry for a while. You've seen the waves of innovation. We heard encrypt everything today on the keynote. We heard a lot of cloud native. They didn't say shift left but they said don't bolt on security after the fact, be in the CI/CD pipeline or the DevStream. All that's kind of top of line, Amazon's talking cloud native all the time. This is kind of what you guys are in the middle of. I've covered your company, you've been on theCUBE before. Your, not you, but your teammates have. You guys have a unique value proposition. Take a minute to explain for the folks that don't know, we'll dig into it, but what you guys are doing. Why you're winning. What's the value proposition. >> Yeah, absolutely. So, Sonrai is, I mean what we do is it's, we're a total cloud solution, right. Obviously, right, this is what everybody says. But what we're dealing with is really, our superpower has to do with the data and identity pieces within that framework. And we're tying together all the relationships across the cloud, right. And this is a unique thing because customers are really talking to us about being able to protect their sensitive data, protect their identities. And not just people identities but the non-people identity piece is the hardest thing for them to reign in. >> Yeah. >> So, that's really what we specialize in. 
And you guys are doing well — good reports on sales, and good meetings happening here. Here at the show, the big theme to me, listening to the keynotes, is what was and wasn't talked about. >> Mm-hmm. >> Ransomware wasn't talked about much; they mentioned it I think once. They didn't talk about air gaps. You know, the normal stuff: teamwork, encryption everywhere. But identity was sprinkled in everywhere. >> Mm-hmm. >> And one of my favorite quotes, I wrote it down: weave security into the development cycle. They didn't say shift left, they said don't bolt it on. Now, that's not new information. We know "don't bolt on" >> Right. >> has been around for a while. He said — this is Stephen Schmidt, who's the CSO, the top dog on security — lessons learned: who has access to what, and why; overly permissive environments create chaos. >> Absolutely. >> This is what you guys rein in. >> It is. >> Explain that. >> Yeah, I mean, we just did a survey with AWS and Forrester around all the issues in this area that customers are concerned about, in clouds in particular. One of the things that came out of it is that 95% of clouds are what's called over-privileged, which means there's access running amok, right? I mean, it is a crazy thing. And if you think about it, the whole value proposition of security is to protect sensitive data, right? So if it's permissive out there, then sensitive data isn't being protected. That's where we really rein it in. >> You know, it's interesting. I zoom out, I put my historian hat on, going back to the early days of my career in the late eighties, early nineties. When you have these inflection points, there are always these problems that are actually opportunities. And DevOps, infrastructure as code, was all about apps, all about the developer. 
And now open source is booming, open source is the software industry. Open source is it in the world. >> Right. >> That's now the software industry. Cloud scale has hit and now you have the Devs completely in charge. Now, what suffers now is the Ops and the Sec, SecOps. Now Ops, DevOps. Now, DevSecOps is where all the action is. >> Yep. >> So the, the, the next thing to do is build an abstraction layer. That's what everyone's trying to do, build tools and platforms. And so that's where the action is here. This is kind of where the innovation's happening, because the networks aren't the, aren't in charge anymore either. So, you now have this new migration up to higher level services and opportunities to take the complexity away. >> Mm-hmm. >> Because what's happened is customers are getting complexity. >> That's right. >> They're getting it shoved in their face, 'cause they want to do good with DevOps, scale up. But by default their success is also their challenge. >> Right. >> 'Cause of complexity. >> That's exactly right. >> This is, you agree with that. >> I do totally agree with that. >> If you, you believe that, then what's next. What happens next? >> You know, what I hear from customers has to do with two specific areas: they're really trying to understand control frameworks, right. And be able to take these scenarios and build them into something that they, where they can understand where the gaps are, right. And then on top of that, building in automation. So, the automation is a, is a theme that we're hearing from everybody. Like how, how do they take and do things like, you know, it's what we've been hearing for years, right. How do we automatically remediate? How do we automatically prioritize? How do we, how do we build that in so that they're not having to hire people alongside that, but can use software for that. >> The automation has become key. You got to find it first. >> Yes. >> You guys are also part of the DevCycle too. >> Yep.
>> Explain that piece. So, I'm a developer, I'm an organization. You guys are on the front end. You're not bolt-on, right? >> We can do either. We prefer it when customers are willing to use us, right, at the very front end, right. Because anything that's built in the beginning doesn't have the extra cycles that you have to go through after the fact, right. So, if you can build security right in from the beginning and have the ownership where it needs to be, then you're not having to, to deal with it afterwards. >> Okay, so how do you guys, I'm putting my customer hat on for a second. A little hard, hard question, hard problem. I got Active Directory on Azure. I got IAM over here with AWS. I want them to look the same. Now, my on-premises, >> Ah. >> has been booming, now I got cloud operations, >> Right. >> So, DevOps has moved to my premise and edge. So, what do I do? Do I throw everything out, do a redo? How do you, how do you guys talk about, talk to customers that have that chance, 'cause a lot of them are old school. >> Right. >> IT. >> And, and I think there's a, I mean there's an important distinction here, which is there's the Active Directory identities, right, that customers are used to. But then there's this whole other area of non-people identities, which is compute power and privileges and everything that gets going when you get, you know, machines working together. And we're finding that it's about five-to-one in terms of how many identities are non-human identities versus human identities. >> Wow. >> So, so you actually have to look at, >> So, programmable access, basically. >> Yeah. Yes, absolutely. Right. >> Wow. >> And privileges and roles that are, you know, accessed via different ways, right. Because that's how it's assigned, right. And people aren't really paying that close attention to it. So, from that scenario, like the AD thing of, of course that's important, right.
To be able to, to take that and lift it into your cloud, but it's actually even bigger to look at the bigger picture with the non-human identities, right. >> What about the CISOs out there that you talk to. You're in the front lines, >> Yep. >> talking to customers and you see what's coming on the roadmap. >> Yep. >> So, you kind of get the best of both worlds. See what they, what's coming out of engineering. What's the biggest problem CISOs are facing now? Is it the sprawl of the problems, the hacker space? Is it not enough talent? What, I mean, I see the fear, what are, what are they facing? How do you, how do you see that, and then what's your conversations like? >> Yeah. I mean the, the answer to that is unfortunately yes, right. They're dealing with all of those things. And, and here we are at the intersection of, you know, this huge complex thing around cloud that's happening. There's already a gap in terms of resources, never mind skills that are different skills than they used to have. So, I hear that a lot. The, the bigger thing I think I hear is they're trying to take the most advantage out of their current team. So, they're again, worried about how to operationalize things. So, if we bring this on, is it going to mean more headcount? Is it going to be, you know, things that we have to invest in differently? And I was actually just with a CISO this morning, and the whole team was, was talking about the fact that bringing us on means they have, they can do it with less resource. >> Mm-hmm. >> Like this is a resource help for them in this particular area. So, that, that was their value proposition for us, which I loved. >> Let's talk about Adrian Cockcroft who retired from AWS. He was at Netflix before. He was a big DevOps guy. He talks about how agility's been great because from a sales perspective the old model was, he called it the, the big Indian wedding.
You had to get everyone together, do a POC, you know, long sales cycles for big tech investments, proprietary. Now, open source is like speed dating. You can know what's good quickly and, and try things quicker. How is that, how is that impacting your sales motions, your customer engagements? Are they fast? Are they, are they try-before-they-buy? What's the engagement model that you, you see happening that the customers like the best? >> Yeah, hey, you know, because of the fact that we're kind of dealing with this serious part of the problem, right, with the identities and, and dealing with data aspects of it, it's not as fast as I would like it to be, right. >> Yeah, it's pretty important, actually. >> They still need to get in and understand it. And then it's different if you're an AWS environment versus other environments, right. We have to normalize all of that and bring it together. And it's such a new space, >> Yeah. >> that they all want to see it first. >> Yeah. >> Right, so. >> And, and the consequences are pretty big. >> They're huge. >> Yeah. >> Right, so the, I mean, the scenario here is we're still doing, in some cases we'll do workshops instead of a POV or a POC. 90% of the time though we're still doing a POV. >> Yeah, you got to. >> Right. So, they can see what it is. >> They got to get their hands on it. >> Yep. >> This is one of those things they got to see in action. What is the best-of-breed? If you had to say best-of-breed in identity looks like blank, how would you describe that from a customer's perspective? What do they need the most? Is it robustness? What's some of the things that you guys see as differentiators for having a best-of-breed solution like you guys have. >> A best-of-breed solution. I mean, for, for us, >> Or a relevant solution for that matter, for the solution. >> Yeah. I mean, for us, this, again, this identity issue, it, for us, it's depth and it's continuous monitoring, right.
Because the issue in the cloud is that there are new privileges that come out every single day, like to the tune of like 35,000 a year. So, even if at this exact moment, it's fine. It's not going to be in another moment, right. So, having that continuous monitoring in there, and, and it solves this issue that we hear from a lot of customers also around lateral movement, right. Because like a piece of compute can be on and off, >> Yeah, yeah, yeah. >> within a few seconds, right. So, you can't use any of the old traditional things anymore. So to me, it's the continuous monitoring I think that's important. >> I think that, and the lateral movement piece, >> Yep. >> that you guys have is what I hear the most of the biggest fears. >> Mm-hmm. >> Someone gets in here and can move around, >> That's right. >> and that's dangerous. >> Mm-hmm. And, and no traditional tools will see it. >> Yeah. Yeah. >> Right. There's nothing in there unless you're instrumented down to that level, >> Yeah. >> which is what we do. You're not going to see it. >> I mean, when someone has a firewall, a perimeter based system, yeah, I'm in the castle, I'm moving around, but that's not the case here. This is built for full observability, >> That's right. >> Yet there's so many vulnerabilities. >> It's all open. Mm-hmm, yeah. And, and our view too, is, I mean you bring up vulnerabilities, right. It, it is, you know, a little bit of the darling, right. People start there. >> Yep. >> And, and our belief in our view is that, okay, that's nice. But, and you do have to do that. You have to be able to see everything right, >> Yep. >> to be able to operationalize it. But if you're not dealing with the sensitive data pieces right, and the identities and stuff that's at the core of what you're trying to do >> Yeah. >> then you're not going to solve the problem. >> Yeah. Denise, I want to ask you. Because you make what was it, five-to-one was the machine to humans. 
I think that actually might be low, on the low end, if you could imagine. If you believe that's true. >> Yep. >> I believe that's true, by the way, if microservices continues to be the, be the wave. >> Oh, it'll just get bigger. >> Which it will. It's going to be much bigger. >> Yeah. >> Turning on and off, so, the lateral movement opportunities are going to be greater. >> Yep. >> That's going to be a bigger factor. Okay, so how do I protect myself. Now, 'cause developer productivity is also important. >> Mm-hmm. >> 'Cause, I've heard horror stories like, >> Yep. >> Yeah, my Devs are cranking away. Uh-oh, something's out there. We don't know about it. Everyone has to stop, have a meeting. They get pulled off their task. It's kind of not agile. >> Right. Right. >> I mean, >> Yeah. And, and, in that vein, right. We have built the product around what we call swim lanes. So, the whole idea is we're prioritizing based on actual impact and context. So, if it's a sandbox, it probably doesn't matter as much as if it's, like, operational code that's out there where customers are accessing it, right. Or it's accessing sensitive data. So, we look at it from a swim lane perspective. Then we try to get whoever needs to solve it back to the person that is responsible for it. So we can, we can set it up that way. >> Yeah. I think that, that's key insight into operationalizing this. >> Yep. >> And remediation is key. >> Yes. >> How, how much, how important is the timing of that. When you talk to your customer, I mean, timing is obviously going to be longer, but like seeing it's one thing, knowing what to do is another. >> Yep. >> Do you guys provide that? Is that some of the insights you guys provide? >> We do, it's almost like, you know, us. The, and again, there's context that's involved there, right? >> Yeah. >> So, some remediation from a priority perspective doesn't have to be immediate. And some of it is hair on fire, right. So, we provide actually, >> Yeah.
>> a recommendation per each of those situations. And, and in some cases we can auto remediate, right. >> Yeah. >> If, it depends on what the customer's comfortable with, right. But, when I talk to customers about what is their favorite part of what we do, it is the auto remediation. >> You know, one of the things on the keynotes, not to, not to go off tangent, one second here, but, Kurt, who runs platforms at AWS, >> Mm-hmm. >> went on, his little baby project that he loves is this automated reasoning feature. >> Mm-hmm. >> Which essentially is advanced machine learning. >> Right. >> That can connect the dots. >> Yep. >> Not just predict stuff, but like actually say this doesn't belong here. >> Right. >> That's advanced computer science. That's heavy duty coolness. >> Mm-hmm. >> So, operationalizing that way, the way you're saying it, I'm imagining there's some future stuff coming around the corner. Can you share how you guys are working with AWS specifically? Is it with Amazon? You guys have your own secret sauce for the folks watching. 'Cause this remediation, it only gets harder. You got to, you have to be smarter on your end, >> Yep. >> with your engineers. What's coming next. >> Oh gosh, I don't know how much of what's coming next I can share with you, except for tighter and tighter integrations with AWS, right. I've been at three meetings already today where we're talking about different AWS services and how we can be more tightly integrated and what things we want out of their APIs to be able to further enhance what we can offer to our customers. So, there's a lot of those discussions happening right now. >> What, what are some of those conversations like? Without revealing. >> I mean, they have to do with, >> Maybe confidential privilege. >> privileged information. I don't mean like privileged information. >> Yep. I mean like privileges, right, >> Right. >> that are out there. >> Like what you can access, and what you can't.
>> What you can, yes. And who and what can access it and what can't. And passing that information on to us, right. To be able to further remediate it for an AWS customer. That's, that's one. You know, things like other AWS services like CloudTrail and you know some of the other scenarios that they're talking about. Like we're, you know, we're getting deeper and deeper and deeper with the AWS services. >> Yeah, it's almost as if Amazon over the past two years in particular has been really tightly integrating as a strategy to enable their partners like you guys >> Mm-hmm. >> to be successful. Not trying to land grab. Is that true? Do you get that vibe? >> I definitely get that vibe, right. Yesterday, we spent all day in a partnership meeting where they were, you know talking about rolling out new services. I mean, they, they are in it to win it with their ecosystem. Not on, not just themselves. >> All right, Denise it's great to have you on theCUBE here as part of re:Inforce. I'll give you the last minute or so to give a plug for the company. You guys hiring? What are you guys looking for? Potential customers that are watching? Why should they buy you? Why are you winning? Give a, give the pitch. >> Yeah, absolutely. So, so yes we are hiring. We're always hiring. I think, right, in this startup world. We're growing and we're looking for talent, probably in every area right now. I know I'm looking for talent on the sales side. And, and again, the, I think the important thing about us is the, the fullness of our solution but the superpower that we have, like I said before around the identity and the data pieces and this is becoming more and more the reality for customers that they're understanding that that is the most important thing to do. And I mean, if they're that, Gartner says it, Forrester says it, like we are one of the, one of the best choices for that. >> Yeah. And you guys have been doing good. We've been following you. Thanks for coming on. >> Thank you. 
>> And congratulations on your success. And we'll see you at the AWS Startup Showcase in late August. Check out Sonrai Security at the AWS Startup Showcase in late August. Here at theCUBE, live in Boston, getting all the coverage. From the keynotes, to the experts, to the ecosystem, here on theCUBE, I'm John Furrier, your host. Thanks for watching. (bright music)

Published Date : Jul 26 2022



Eric Kostlan, Cisco Secure | AWS re:Inforce 2022


 

>> Okay, welcome back. This is theCUBE's live coverage of AWS re:Inforce 22. I'm John Furrier, with my co-host Dave Vellante. We've got a great guest from Cisco, Eric Kostlan, technical marketing engineer, Cisco Systems. Great to have you on. >> All right, thanks for having me. >> Of course. We've been doing a lot of Cisco coverage, Cisco Live, Cisco events, Barcelona, we know a lot of folks over there. A lot of great momentum, supply chain challenges, but you've got the cloud with a lot of networking there too. A lot of security conversations, DevSecOps, and the trend we're hearing here is operations, security and operations. What are some of the business realities that you guys are looking at right now, focused on from a Cisco perspective and a landscape perspective? >> Well, the transition to the cloud is accelerating and it's really changed the way we're doing business and what we do. Now, this combined with more and more remote work by remote users, and also the consumption of cloud-based tools to perform your business functions, has dramatically changed the contour of the business environment. The traditional trust boundary has evaporated, or at least transformed dramatically, but you still have those requirements for trust, for micro-segmentation. So what we've seen is a dramatic change in how we do business and what we do. And this is essential because the value proposition is enormous, and companies are able to pursue more and more ambitious objectives. But from a security point of view it's quite challenging, because on one hand what we call the attack surface has increased, and the stakes are much higher. So you have more sophisticated malicious actors taking advantage of a broader security target. In order to conduct your business, in order to maintain business continuity and achieve your objectives, you need to protect this environment. And one, one of the... >> Sorry, just to, just to clarify. You said the value proposition is enormous.
You mean the value proposition of the cloud is enormous. >> Exactly. >> So the business is leaning in big time and there are security consequences to that. >> Precisely. And so, and one thing that we've seen happen in the industry is, as these components of the business environment have changed, the industry has sort of bolted on more and more security solutions. But the problem with that is that's led to enormous complexity in administering security for the company, which is very expensive, to find people with those expertise. And also the complexity itself is a vulnerability. >> And, and that traditional trust boundary that you talked about, it hasn't been vaporized, has it, it's still there. So are you connecting into that? Is there an interoperability challenge? Does that create more security issues, or are people kind of redoing? We talk about security as a do-over. How are customers approaching it? >> It is a challenge, because although the concept of a trust boundary still exists, the nature of the hybrid multi-cloud environment makes it very difficult to define. Furthermore, the traditional solutions, such as simply having a, a, a firewall and, and an on-premise network, are now much more complex, because the on-premise network has to connect to the cloud infrastructure, and parts of the cloud infrastructure have to be exposed to the public. Other parts have to be protected. So it's not that the, the concept of trusted versus untrusted has gone away. It's just become fundamentally more complex. >> So Eric, I wanna get your thoughts on this higher level abstraction trend, because you're seeing the complexity being pushed to the customers, and they want to buy cloud or cloud operations from partners, platforms that take the heavy lifting from there, and best-of-breed products that handle the complexity. What's your reaction to that, that statement?
Do you think that's happening, or that will happen, because either the complexity is gonna be solved by the customer, or they're gonna buy a platform or SaaS product. >> Now, it's, it's unreasonable to expect the customers to constantly adapt to this changing environment. From the point of view of, of security, they have to be able to focus on their business objectives, which is to actually sell their products and pursue their ambitions. And it's a distraction that they really can't afford if they have to be focused on security. So the solutions have to take that challenge, that distraction, away from them, and that has to be integral to the solution. >> So you're saying that the, the vendor, the supplier, has to deal with the underlying complexities on behalf of the customer. >> Exactly. The vendor can't do this without a robust partnership with the cloud provider, working together, both at the engineering level to develop the products together and in the implementation, as well as standing side by side with the customer as they expand their business into the >> Cloud. This is super cloud, it's super cloud. Right? >> Exactly. >> So give us the specifics. What are you doing? What's Cisco doing? How are you working with AWS? What solutions are you talking about? >> Well, Cisco has a wide variety, quite an expansive portfolio, because there's a large number of components to the solution. This spans both the, the workload protection as well as the infrastructure protection. And these are, in partnership with AWS, not only integrated together, but integrated into the cloud components. And this is what allows comprehensive protection across the hybrid cloud environment. >> So are we talking about solutions that are embedded into switches? Are we talking about software layers? Maybe describe, add a little color, paint a picture of the portfolio. >> And, and it's really all of those things.
So most of the solutions, historically, you could say evolved from solutions that were utilized in the physical infrastructure, in the firewalls, in the switches, in the routers. And some of these technologies are still basically confined to those, to those form factors. But some of the most important technologies we use, such as Snort 3, which is a best-of-breed intrusion protection system that we have adopted, is, is applicable as well to the virtual environment, so that we, we push into the cloud in a way that's seamless. So that if you're, if you've developed those policies for your on-prem solutions, you can extend them into the cloud effortlessly. Another example of something that adapts quite well to the cloud is security intelligence. Cisco has Talos. Talos is the world's leading security intelligence operation. This is fundamental for addressing threats, day zero attacks, and Talos updates our products approximately once every hour with information about these emerging attacks, as well as informing the community as a whole of this. And now that architecture is very easily extensible into the cloud, because you can inform a virtual device just as easily as you can inform a physical device of an emergent threat. >> But technically, how do you do that integration? Is that just through AWS primitives? How do you, how does Cisco work with AWS at an engineering level to make that happen? >> So, part of it is that we, we, we have taken certain of our products and we virtualized them. So you could say the, the, the simplest or more straightforward approach is to take our firewalls and, and our other products and simply make virtual machines out of them. But that's really not sort of the most exciting thing.
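Eric's point that policies developed for on-prem solutions extend into the cloud effortlessly amounts to treating physical and virtual enforcement points uniformly from one management plane. A minimal sketch of that idea — the class and function names here are hypothetical illustrations, not Cisco Defense Orchestrator's actual API:

```python
class EnforcementPoint:
    """A policy target: a hardware firewall or its virtual counterpart."""

    def __init__(self, name, form_factor):
        self.name = name
        self.form_factor = form_factor  # "physical" or "virtual"
        self.policies = []

    def apply(self, policy):
        self.policies.append(policy)

def push_policy(policy, devices):
    """Fan one intrusion policy out to every enforcement point,
    regardless of whether it runs on hardware or in a cloud VPC."""
    for device in devices:
        device.apply(policy)
    return [d.name for d in devices]

# One policy, written once, lands on both form factors.
branch_fw = EnforcementPoint("branch-firewall", "physical")
cloud_fw = EnforcementPoint("vpc-virtual-firewall", "virtual")
push_policy("block-known-exploit-sigs", [branch_fw, cloud_fw])
```

The design choice worth noticing is that the policy itself carries no knowledge of form factor; only the enforcement point does, which is what makes the on-prem-to-cloud extension "seamless."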
The most exciting thing is that working with them, with integration, with their components, and doing such things as having our management platforms, like our Cisco Defense Orchestrator, be able to discover the virtual environment and utilize that discovery to, to manipulate the security components of that environment. Yeah. >> Eric, this is where I think you're onto something big here. Management is kind of like, oh yeah, we have software, management software, kind of always a thing. When you talk about large scale, multiple data points, billions and billions of things happening a month. Quantum, we mentioned that in the keynote. We heard Kurt, who's VP of Platform, talk about reasoning. This is kind of a whole nother level of technology. Next level reasoning, knowing things. You mentioned micro-segmentation. So we're seeing a new era of not just policies, reasoning around the networks, around the software, stuff that needs to be better than just machine learning doing predictive, you know, analysis. Can you share your reaction to that? Because I see these dots connecting at a whole nother level. >> Yes. Now, as we understand artificial intelligence, machine learning, I think we appreciate that one of the key components there, we think about it as data science, as data management. But when you think about data, you suddenly recognize, where's it coming from? Data requires visibility. And when we talk about the transition to the cloud and the dispersion of the workforce, visibility is one of the great challenges, and visibility even prior to these transitions has been one of the primary focuses of Cisco Systems. So as we transition to the cloud and we recognize the need to be able to interpret what we're seeing, we have expanded our capacity to visualize what's happening. And I think there's a, a significant contribution yeah.
To the >> Dave and I were talking about this in context to our thesis about super cloud, how that's evolving, building on top of the hyperscalers' CapEx investment. Customer data control flows are a huge thing going across multiple geographies. It's global, you got regions, you got networks, some trusted, some not. And you have now applications that are global. So you got data flows. >> Yes. >> I mean, data's gotta move across multiple environments. So that's a challenge. >> And it has to move securely. And furthermore, there's a real challenge here with confidence, with confidence of the company that its data flow is secure in this new environment, which frankly can be a little bit uncomfortable. And also the customer and the partners of that business have to be confident that their intellectual property, that their security and identity, is protected. >> Yeah. Dave and I were talking also, we're kind of old and have seen the movie before. Remember the old days of multi-vendor and OSI models and, you know, interoperability. We're kind of at a new inflection point where teamwork, not just ecosystem partners, companies working together to make sure things are secure. This is a whole nother data problem, opportunity. Amazon sees things that other people don't see, and contributes that back. How does this whole next level of multi-vendor partnership work? Open source is a big part of the software piece of it. You've got custom silicon, you mentioned. How do you view that whole team oriented approach in security? >> Now this is absolutely essential. The community, the industry, has to work together. Fortunately, it's in the DNA of Cisco to interoperate. I've sat next to competitors at customer sites working to solve the customer's problem. It's just how we function. So it's not just our partnerships, but it's our relationship with industry, because industry has common purpose in solving these problems.
We have to be confident in order to pursue our objectives. >> You see, you see this industry at a flash point right now. Everyone has to partner. >> Exactly. >> Okay. How would you summarize that? We, we are out of time, but give us your leadership point of view. >> From the point of view of business leadership: a business needs business continuity, its contributors have to be able to access resources to perform their job, and the customers and partners need confidence to deal with that business. You need the continuity, you demand flexibility to adapt to the changing environment and to take advantage of emerging opportunities, and you expect security. The security has to be resilient. It has to be robust. The security has to be simple to implement. Cisco, in partnership with AWS, provides the security you need to succeed. >> Eric, thanks so much for coming on theCUBE. Really appreciate your insights and your experience and, and candid commentary, and appreciate your time. >> Thank you. Thank you very much for the opportunity. >> Okay. We're here, live on the floor in the expo hall at AWS re:Inforce 22 in Boston, Massachusetts. I'm John Furrier. We'll be right back with more coverage after this short break.

Published Date : Jul 26 2022

SUMMARY :

Great to have you on. The all right. What are some of the business realities and also the consumption of cloud-based tools to So the business is leaning in big time and there are security consequences to administering security for the company, which is very expensive to find people with those expertise. And, and that traditional trust boundary that you talked about, it hasn't been vaporized has it, and parts of the cloud infrastructure have to be exposed to the public. complexity is gonna be solved by the customer, or they're gonna buy a platform or SA product. So the solutions have to take that challenge that on behalf of the customer. the cloud provider, working together, the both at the engineering level to How are you working with AWS? the hybrid cloud environment. layers, maybe give, describe, add a little color, paint, a picture of the portfolio. So the most of the solutions historically But technically, how do you do that integration? But that's really not sort of the most exciting thing. reasoning around the networks, around the software stuff that needs to be better than is one of the great challenges and visibility even prior to these transitions So you got data flows. So that's a challenge the partners of that business have to be confident that their a big part of the software piece of it. the DNA of Cisco to interate, I've sat next to We, we are out of time, but so give us your leadership about the And the customers and partners need confidence to deal with that Eric, thanks coming for so much for coming on the cube. Live on the floor and expo hall at reinforce Avis reinforced 22

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave | PERSON | 0.99+
AWS | ORGANIZATION | 0.99+
David Lon | PERSON | 0.99+
Kurt | PERSON | 0.99+
Eric | PERSON | 0.99+
Eric Costin | PERSON | 0.99+
Cisco | ORGANIZATION | 0.99+
Eric Kostlan | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Boston, Massachusetts | LOCATION | 0.99+
Avis | ORGANIZATION | 0.99+
billions | QUANTITY | 0.99+
John furrier | PERSON | 0.99+
both | QUANTITY | 0.98+
eight | QUANTITY | 0.97+
one | QUANTITY | 0.95+
a month | QUANTITY | 0.95+
one thing | QUANTITY | 0.93+
22 | QUANTITY | 0.88+
Barcelona | LOCATION | 0.88+
approximately once every hour | QUANTITY | 0.87+
Cisco Secure | ORGANIZATION | 0.86+
Talus | ORGANIZATION | 0.85+
2022 | DATE | 0.84+
CapEx | ORGANIZATION | 0.83+
zero | QUANTITY | 0.82+
Taos | TITLE | 0.81+
John ante | PERSON | 0.72+

Keynote Analysis | AWS re:Inforce 2022


 

>>Hello, everyone. Welcome to theCUBE's live coverage here in Boston, Massachusetts for AWS re:Inforce 2022. I'm John Furrier, host of theCUBE, with Dave Vellante, my co-host of the famous Breaking Analysis podcast. Dave, great to see you. Um, back in Boston. 2010, we started >>The cube. It all started right here in this building, John. >>12 years ago we started here, and, you know, just 12 years, it seems like a marathon with theCUBE. Over the years we've seen many waves. You call yourself a historian, which you are; we are both historians now. Security is a do-over, and we said so in 2013, when we asked Pat Gelsinger, now the CEO of Intel; prior to that he was the CEO of VMware. This is the security show for AWS. It's called re:Inforce. They have re:Invent, which is their big show. Now they have these, what they call 're' shows: re:Mars, machine learning, automation, robotics and space, and then they've got re:Inforce, which is security. It's all about security in the cloud. So, great show. Lots of talk. The keynotes were, um, I wouldn't say generic on the one hand, but specific on the other: a clear AWS posture. We were both watching. What's your take? >>Well, John, actually looking back to May of 2010, when we started theCUBE at EMC World: that was the beginning of this massive boom run, which, you know, finally we're starting to see some cracks in the armor of. Of course, there are threats of recession (we're in a recession, most likely), inflationary pressures, interest rate hikes. And so, you know, finally the tech market has chilled out a little bit, and you have this question, before we get into the security piece, of: is the glass half full or half empty? So, budgets coming into this year were expected to grow at a very robust eight and a half percent. CIOs have tuned that down, but it's still pretty strong, at around 6%. And one of the areas where they really have no choice but to focus is security.
They moved everything into the cloud, or a lot of stuff into the cloud. >>They had to deal with remote work, and that created a lot of security vulnerabilities. And they're still trying to figure that out and plug the holes with the lack of talent that they have. So it's interesting: the first re:Inforce that we did, which was also here, in 2019, Steven Schmidt, who at the time was chief information security officer at Amazon Web Services, said the state of cloud security is really strong. All this narrative, like the Pat Gelsinger narrative (security's a do-over, which you just mentioned, security is broken), it doesn't help the industry. The state of cloud security is very strong if you follow the prescription. Well, see, now Steven Schmidt, as you know, is chief security officer at Amazon. So we followed >>Jassy to all of Amazon, not just AWS. So >>He followed Jassy over, and I asked, well, why? And they said, well, he's responsible now for physical security, presumably the warehouses. I'm like, well, wait a minute: what about the data centers? Who's responsible for that? So it's kind of funny. CJ Moses is now the CISO at AWS, and, you know, these events are good. They're growing. And it's all about best practices, how to apply the practices, a lot of recommendations from AWS, a lot of tooling, and really an ecosystem, because let's face it, Amazon doesn't have the breadth and depth of tools to do it alone. >>And also the attendance is interesting, because we were just in New York City for the AWS Summit: 19,000 people, massive numbers, certainly coming out of the pandemic. That's probably one of the top-end shows, and it was a summit. This is a different audience. It's security. It's really nerdy. You've got OT, you've got cloud, you've got on-prem. So now you have cloud operations; we're calling it supercloud. Of course, we're having our inaugural pilot event on August 9th. Check it out, it's called Supercloud; go to thecube.net to check it out.
But this is the supercloud model evolving with security. And what you're hearing today, Dave, I wanna get your reaction to this, is things like: we've got billions of observational points. And certainly there's no perimeter, right? The perimeter's dead. The new perimeter, if you will, is every transaction at scale. So you have to have a new model. So the security posture needs to be rethought; they actually said that directly in the keynote. So security, although the numbers aren't as big as last week's, or two weeks ago in New York, is still relevant. All right, there are sessions here, there's networking, a very interesting demographic: long hair, lots of >>T-shirts. >>And a lot of nerds building things out over there. So I gotta ask you, what's your reaction to this scale as the new advantage? Is that a tailwind or a headwind? What's your read? >>Well, it is amazing. I mean, Steven Schmidt actually talked about quadrillions of events every month. Quadrillions: 15 zeros. What surprised me, John: Amazon talks about five areas (by the way, at the event they've got five tracks and 125 sessions): data protection and privacy; GRC, governance, risk and compliance; identity; network security; and threat detection. I was really surprised, given the focus on developers, that they didn't call out container security. I would've thought that would be a separate area of focus. But to your point about scale, it's true: Amazon has a scale where they'll see events every day, or every month, that you might not see in a generation if you're just running your own data center. So I do think that's a valid statement. Having said that, Amazon's got a limited capability in terms of security; that's why they have to rely on the ecosystem. Now it's all about APIs connecting in, and APIs are one of the biggest security vulnerabilities. So I'm having trouble squaring that circle.
>>Well, they did touch on that, bringing it back to the whole open source and software piece. At the beginning, Schmidt did say that, you know, besides scale being an advantage for Amazon, with a quadrillion, 15 zeros, don't bolt on security. So that's a classic old-school message; we've heard that before, right? But he said specifically: weave security into the dev cycles, and the CI/CD pipeline. That basically means shift left. So Snyk is here, a company we've covered, and their whole thing is shift left. That implies Docker containers, that implies Kubernetes. Um, but this is not a cloud-native show per se. It's much more crypto. You heard, you know, the encrypt-everything message in the keynote. You heard about reasoning, about quantum... >>Skating to where the puck is going. >>Yeah. So, you know, although Log4j only got a little mention. I love the quote from Lewis Hamilton that they put up on stage; CJ Moses said the team behind the scenes makes it happen. So a big emphasis on teamwork, a big emphasis on don't bolt on security, have it in from the beginning. We've heard that before. A lot of threat-modeling discussions, and then really this, you know, the news around the Cloud Audit Academy. So clearly: skills gap, more threats, more use cases happening than ever before. >>Yeah. And you know, to your point about the teamwork, I think the problem that CISOs have is they just don't have the talent that AWS has, so they have real difficulty applying that talent. And so AWS is saying, well, join us at these shows, we'll kind of show you how to do it, how we do it internally. And again, I think when you look out on this ecosystem, there are still thousands and thousands of tools that practitioners have to apply, and every time there's a tool, there's a separate set of skills to really understand that tool, even within AWS's portfolio.
So this notion of a shared responsibility model: Amazon takes care of, you know, securing, for instance, the physical infrastructure of S3; you're responsible for securing your data, making sure the S3 bucket doesn't have public access. So that shared responsibility model is still very important, and I think practitioners are still struggling with all this complexity and this matrix of tools. >>So they had the layered defense. So just to review the opening keynote: Steve Schmidt, the new CSO, talked about weaving security into the dev cycles, shift left, which is the "don't bolt it on, keep it in from the beginning" idea. The lessons learned: he talked a lot about how over-permissiveness creates chaos, um, and that you gotta really look at who has access to what and why. Big learnings there. And he brought up the use cases; more use cases are coming on than ever before. A layered defense strategy was his core theme, Dave, and that was interesting. And he also said specifically: don't rely on a single security control; use multiple layers, stronger together; build it in from the beginning. Basically, that was the whole ethos, the posture he laid down. >>And he had a great quote on that. He said, sorry to interrupt, "single controls and binary states will fail, guaranteed." >>Yeah, that's a guarantee. That was basically not just a best practice; that's a mandate. <laugh> Um, and then CJ Moses, who was his deputy in the past, now takes over as CISO: ownership across teams, ransomware mitigation, air gapping, all that kind of in-the-weeds security stuff you want to check the boxes on. And I thought he did a good job, right? And he did the news; he's the new CISO. Okay, then you had Lena Smart from MongoDB come on. Yeah, um, she was interesting. I liked her talk, obviously. Mongo is one of the ecosystem partners headlining. What do you read into that? >>Well, it's really interesting, right? You didn't see Snowflake up there, right?
You didn't see Databricks up there. You had Mongo up there, and I'm curious (she's coming on theCUBE tomorrow): is her primary role sort of securing Mongo internally? Is it securing the Mongo that's running across clouds? She's obviously here talking about AWS. So what I make of it is, you know, it's a really critical partner that's driving a lot of business for AWS. But at the same time it's data; they talked about data security being one of the key areas that you have to worry about, and that's, you know, what Mongo does. So I'm really excited. I talk to her >>Tomorrow. I did like her mention of BigID, a CUBE alumni company. They were part of season one of our AWS Startup Showcase; check out AWS startups.com if you're watching this. We've been doing that, and now we're in season two, featuring the fastest-growing, hottest startups in the ecosystem. Not the big players; that's ISVs, more of the startups. They were mentioned; they have a great product. So I liked the mention of BigID. Um, she mentioned Security Hub, mentioned Config; they're clearly a big customer, and they have a user base, with a lot of EC2 and storage going on. People are building on Mongo, so I can see why they're in there. The question I want to ask you is: is Mongo's new stuff in line with all the upgrades in the silicon? So you've got Graviton, which has got great stuff, um, great performance. Do you see that being a key part of things? >>Well, specifically Graviton. So I'll tell you this; I'll tell you what I know. When you look at, like, Snowflake, for instance: they're optimizing for Graviton for certain workloads. They actually talked about it on their earnings call, how it's lowered the cost for customers and actually hurt their revenue. You know, they still had great revenue, but it hurt their revenue. My sources indicate to me that Mongo is not getting as much out of Graviton2, but they're waiting for Graviton3.
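One aside on the shared-responsibility point from a moment ago, "make sure the S3 bucket doesn't have public access": the customer side of that split can be sketched as a toy config scan. This is a hedged illustration; the field names below are made-up stand-ins, not the real S3 API.

```python
# Toy model of the customer's half of the shared-responsibility model:
# the provider secures the storage service itself; the customer must still
# verify that no bucket configuration leaves data open to the public.

def publicly_exposed(bucket: dict) -> bool:
    """Return True if any setting leaves the bucket open to the public."""
    block = bucket.get("public_access_block", {})
    # All four guard settings must be enabled...
    guards_ok = all(
        block.get(flag, False)
        for flag in (
            "block_public_acls",
            "ignore_public_acls",
            "block_public_policy",
            "restrict_public_buckets",
        )
    )
    # ...and the ACL must not grant access to everyone.
    acl_public = "AllUsers" in bucket.get("acl_grants", [])
    return acl_public or not guards_ok

buckets = [
    {"name": "audit-logs",
     "public_access_block": {"block_public_acls": True, "ignore_public_acls": True,
                             "block_public_policy": True, "restrict_public_buckets": True},
     "acl_grants": []},
    {"name": "marketing-assets",
     "public_access_block": {"block_public_acls": True},  # three guards missing
     "acl_grants": ["AllUsers"]},
]

exposed = [b["name"] for b in buckets if publicly_exposed(b)]
print(exposed)  # ['marketing-assets']
```

The point of the sketch is that the check is mechanical: the provider cannot make it for you, but it is cheap to automate on your side of the line.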
Now, they don't want to make that widely known, because they don't wanna diss AWS. But it's probably because Mongo's more focused on analytics. But so, to me, Graviton is the future. It's lower cost. >>Yeah. Nobody turns off the database. >>Nobody turns off the database. <laugh> It's always cranking EC2 cycles. >>You know, the other thing I wanted to bring up: I thought we'd hear more about ransomware. We heard a little bit from Kurt Kufeld, and he talked about all these things you could do to mitigate ransomware. He didn't talk about air gaps, and that's all you hear: how you must air gap. David Floyer talks about this all the time: you must have air gaps if you wanna, you know, cover yourself against ransomware. And they didn't even mention that. Now, maybe we'll hear that from the ecosystem. That was kind of surprising. Then I saw you made a note in our shared doc about encryption, because I think all the talk here is encryption at rest. What about data in motion? >>Well, the last guy that came on in the keynote brought up encryption: Kurt Kufeld, who I love, by the way. He's VP of Platform. I like his mojo. He's got the long hair >>And he's >>Geeking out with swagger. But he hit on some really cool stuff: this idea of reasoning, right? Automated reasoning is his little pet project, and that is like killer AI. That's next-generation, next-level >>Stuff. Explain that. >>So machine learning does all kinds of things, you know: pattern recognition, supervised, unsupervised, automating stuff. But true reasoning, like connecting the dots with software, that's like true AI, right? That's really hard. Like in word association: knowing how things are connected, looking at patterns and deducing things. Predictive analytics, we all know, comes from great machine learning. But when you start getting into deduction, when you say, hey, that EC2 cluster never should be on the same VPC as this one:
Why is this packet trying to go there? You can see patterns beyond the normal observation space. So if you have a large observation space like AWS, you can really put some killer computer science technology on this. And that's where this reasoning is. It's next-level stuff; you don't hear about it because nobody does it. Yes, I mean, Google does it with metadata; there's meta-reasoning. We've been, I've been, watching this for over two decades now. It's a part of AI that no one's tapped, and if they get it right, this is gonna be a killer part of the automation. >>So he talked about this basically being advanced math that gets you to provable security, like the example you gave. Another example he gave is: is this S3 bucket open to the public? Is that access restricted or unrestricted? Can anyone access my KMS keys? And you can prove the answer to that question using advanced math and automated reasoning. >>Yeah, exactly. That's a huge leap, because you used to be able to use math, but you didn't have the data, the observation space, and the compute power to be able to do it in near real time, or real time. >>It's like when, in the physical world, in real life, you say, hey, that person doesn't belong here. Or you can look at something and say, that doesn't fit. <laugh> >>Yeah. Yeah. >>So you go, okay: you observe it, and you take measures on it, or you query that person and say, why are you here? Oh, okay, you're here; it doesn't fit, right? Think about whether they're wearing the right clothes, the right look, whatever: you kind of have that data. Deducing that and getting that information, that's what reasoning is. It's really a killer level. And you know, with encrypt-everything it has to be data in movement too; at rest is one thing, but you gotta get data in flight. Dave, this is a huge problem, and making that work is a key
The other thing that Kirk Coel talked about was, was quantum, uh, quantum proof algorithms, because basically he put up a quote, you're a hockey guy, Wayne Greski. He said the greatest hockey player ever. Do you agree? I do agree. Okay, great. >>Bobby or, and Wayne Greski. >>Yeah, but okay, so we'll give the nada Greski, but I always skate to the where the puck is gonna be not to where it's been. And basically his point was where skating to where quantum is going, because quantum, it brings risks to basically blow away all the existing crypto cryptographic algorithms. I, I, my understanding is N just came up with new algorithms. I wasn't clear if those were supposed to be quantum proof, but I think they are, and AWS is testing them. And AWS is coming out with, you know, some test to see if quantum can break these new algos. So that's huge. The question is interoperability. Yeah. How is it gonna interact with all the existing algorithms and all the tools that are out there today? So I think we're a long way off from solving that problem. >>Well, that was one of Kurt's big point. You talking about quantum resistant cryptography and they introduce hybrid post quantum key agreements. That means KMS cert certification, cert manager and manager all can manage the keys. This was something that's gives more flexibility on, on, on that quantum resistance argument. I gotta dig into it. I really don't know how it works, what he meant by that in terms of what does that hybrid actually mean? I think what it means is multi mode and uh, key management, but we'll see. >>So I come back to the ho the macro for a second. We've got consumer spending under pressure. Walmart just announced, not great earning. Shouldn't be a surprise to anybody. We have Amazon meta and alphabet announcing this weekend. I think Microsoft. Yep. So everybody's on edge, you know, is this gonna ripple through now? The flip side of that is BEC because the economy yeah. 
is maybe not in such great shape, people are saying maybe the Fed is not gonna raise after September. Yeah. So that's why we come back to this half-full, half-empty question. How does that relate to cybersecurity? Well, people are prioritizing cybersecurity, but it's not an unlimited budget, so they may have to steal from other places.
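To make the automated-reasoning thread from the keynote discussion concrete: a question like "why is this packet trying to go there?" can be miniaturized as an exhaustive reachability check over a model of the network. This is a toy sketch of the idea under made-up node names; real provers such as the one described in the keynote use formal logic at vastly larger scale.

```python
from collections import deque

# Hypothetical network model: which components may send traffic where.
edges = {
    "internet":   ["public-alb"],
    "public-alb": ["web-tier"],
    "web-tier":   ["app-tier"],
    "app-tier":   ["db-subnet"],
    "db-subnet":  [],
    "mgmt-vpn":   ["db-subnet"],
}

def reachable(src: str, dst: str) -> bool:
    """Breadth-first search that exhaustively checks every possible path."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Because the search is exhaustive, a False answer is a proof over this model,
# and a True answer pinpoints a path to investigate.
print(reachable("internet", "db-subnet"))  # True: a path leaks through the web tiers
print(reachable("db-subnet", "internet"))  # False: no egress path exists
```

The "advanced math" point in the transcript is exactly this shift: instead of sampling traffic and hoping, you check every case the model allows, so the answer is provable rather than probabilistic.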
When you look at the leaders in spending velocity in ETR data, CrowdStrike, Okta, Zscaler, Palo Alto networks, they're all showing a slight deceleration in spending momentum, but still highly elevated. Yeah. Okay. So, so that's a, I think now to your other point, really interesting. What you're saying is cloud spending is discretionary. That's one of the advantages. I can dial it down, but track me if I'm wrong. But most of the cloud spending is with reserved instances. So ultimately you're buying those reserved instances and you have to spend over a period of time. So they're ultimately AWS is gonna see that revenue. They just might not see it for this one quarter. As people pull back a little bit, right. >>It might lag a little bit. So it might, you might not see it for a quarter or two, so it's impact, but it's not as severe. So the dialing up, that's a key indicator get, I think I'm gonna watch that because that's gonna be something that we've never seen before. So what's that reserve now the wild card and all this and the dark horse new services. So there's other services besides the classic AC two, but security and others. There's new things coming out. So to me, this is absolutely why we've been saying super cloud is a thing because what's going on right now in security and cloud native is there's net new functionality that needs to be in place to handle multiple clouds, multiple abstraction layers, and to do all these super cloudlike capabilities like Mike MongoDB, like these vendors, they need to up their gain. And that we're gonna see new cloud native services that haven't exist. Yeah. I'll use some hatchy Corp here. I'll use something over here. I got some VMware, I got this, but there's gaps. Dave, there'll be gaps that are gonna emerge. And I think that's gonna be a huge wild >>Cup. And now I wanna bring something up on the super cloud event. 
So you think about the layers I, as, uh, PAs and, and SAS, and we see super cloud permeating, all those somebody ask you, well, because we have Intuit coming on. Yep. If somebody asks, why Intuit in super cloud, here's why. So we talked about cloud being discretionary. You can dial it down. We saw that with snowflake sort of Mongo, you know, similarly you can, if you want dial it down, although transaction databases are to do, but SAS, the SAS model is you pay for it every month. Okay? So I've, I've contended that the SAS model is not customer friendly. It's not cloudlike and it's broken for customers. And I think it's in this decade, it's gonna get fixed. And people are gonna say, look, we're gonna move SAS into a consumption model. That's more customer friendly. And that's something that we're >>Gonna explore in the super cloud event. Yeah. And one more thing too, on the spend, the other wild card is okay. If we believe super cloud, which we just explained, um, if you don't come to the August 9th event, watch the debate happen. But as the spending gets paused, the only reason why spending will be paused in security is the replatforming of moving from tools to platforms. So one of the indicators that we're seeing with super cloud is a flight to best of breeds on platforms, meaning hyperscale. So on Amazon web services, there's a best of breed set of services from AWS and the ecosystem on Azure. They have a few goodies there and customers are making a choice to use Azure for certain things. If they, if they have teams or whatever or office, and they run all their dev on AWS. So that's kind of what's happened. So that's, multi-cloud by our definition is customers two clouds. That's not multi-cloud, as in things are moving around. Now, if you start getting data planes in there, these customers want platforms. If I'm a cybersecurity CSO, I'm moving to platforms, not just tools. 
So, so maybe CrowdStrike might have it dial down, but a little bit, but they're turning into a platform. Splunk trying to be a platform. Okta is platform. Everybody's scale is a platform. It's a platform war right now, Dave cyber, >>A right paying identity. They're all plat platform, beach products. We've talked about that a lot in the queue. >>Yeah. Well, great stuff, Dave, let's get going. We've got two days alive coverage. Here is a cubes at, in Boston for reinforc 22. I'm Shante. We're back with our guests coming on the queue at the short break.

Published Date : Jul 26 2022

SUMMARY :

I'm John fur, host of the cube with Dave. It all started right here in this building. Now the CEO of Intel prior to that, he was the CEO of VMware. And one of the areas that they really have no choice, but to focus on is security. out and plug the holes with the lack of talent that they have. So And it's all about best practices, how to apply the practices. So you have to have a new No lot of, not a lot of nerds doing to build out things over there. Now it's all about APIs connecting in and APIs are one of the biggest security vulnerability. And the C I C D pipeline that is, that basically means shift left. I love the quote from Lewis Hamilton that they put up on stage CJ, Moses said, I think when you look out on this ecosystem, there's still like thousands and thousands I don't bolt it on keep in the beginning. He said, I'm sorry to interrupt single controls. And he did the news. So what I make of it is, you know, that's, it's a really critical partner. So you got graviton, which has got great stuff. So I I'll tell you this. You and he, and he talked about all these things you could do to mitigate ransomware. He's got the long hair the reasoning, right? Explain that. So machine learning does all kinds of things, you know, goes to sit pattern, supervise, unsupervised automate but you didn't have the data, the observation space and the compute power to be able It's like, it's like when someone, if in the physical world real life in real life, you say, Hey, that person doesn't belong here. the right look, whatever you kind of have that data. He said the greatest hockey player ever. you know, some test to see if quantum can break these new cert manager and manager all can manage the keys. So everybody's on edge, you know, is this gonna ripple through now? We're gonna have a, a recession that's predicted the issue I don't see the spending slowing down. 
But most of the cloud spending is with reserved So it might, you might not see it for a quarter or two, so it's impact, but it's not as severe. So I've, I've contended that the SAS model is not customer friendly. So one of the indicators that we're seeing with super cloud is a We've talked about that a lot in the queue. We're back with our guests coming on the queue at the short break.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Steven Schmidt | PERSON | 0.99+
AWS | ORGANIZATION | 0.99+
Amazon | ORGANIZATION | 0.99+
Wayne Greski | PERSON | 0.99+
Walmart | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
Boston | LOCATION | 0.99+
John | PERSON | 0.99+
Microsoft | ORGANIZATION | 0.99+
2013 | DATE | 0.99+
Moses | PERSON | 0.99+
New York | LOCATION | 0.99+
Mongo | ORGANIZATION | 0.99+
August 9th | DATE | 0.99+
David Flo | PERSON | 0.99+
Bobby | PERSON | 0.99+
2019 | DATE | 0.99+
Steve Schmidt | PERSON | 0.99+
Shante | PERSON | 0.99+
Kurt | PERSON | 0.99+
thousands | QUANTITY | 0.99+
Jesse | PERSON | 0.99+
Lewis Hamilton | PERSON | 0.99+
125 sessions | QUANTITY | 0.99+
two days | QUANTITY | 0.99+
VMware | ORGANIZATION | 0.99+
last week | DATE | 0.99+
Google | ORGANIZATION | 0.99+
eight | QUANTITY | 0.99+
12 years | QUANTITY | 0.99+
2010 | DATE | 0.99+
John fur | PERSON | 0.99+
today | DATE | 0.99+
19,000 people | QUANTITY | 0.99+
Greski | PERSON | 0.99+
Zscaler | ORGANIZATION | 0.99+
Kirk Coel | PERSON | 0.99+
SAS | ORGANIZATION | 0.99+
Goel | PERSON | 0.99+
Intel | ORGANIZATION | 0.99+
two | QUANTITY | 0.99+
12 years ago | DATE | 0.98+
both | QUANTITY | 0.98+
Okta | ORGANIZATION | 0.98+
Tomorrow | DATE | 0.98+
two weeks ago | DATE | 0.98+
15 zeros | QUANTITY | 0.98+
five tracks | QUANTITY | 0.98+
first | QUANTITY | 0.98+
Beck | PERSON | 0.98+

Video Exclusive: Oracle Lures MongoDB Devs With New API for ADB


 

(upbeat music) >> Oracle continues to pursue a multi-model converged database strategy. The premise of this all-in-one approach is to make life easier for practitioners and developers. And the most recent example is the Oracle Database API for MongoDB, which was announced today. Now, Oracle is not the first to come out with a MongoDB-compatible API, but Oracle hopes to use its Autonomous Database as a differentiator and further build a moat around OCI, Oracle Cloud Infrastructure. And with us to talk about Oracle's MongoDB-compatible API is Gerald Venzl, who is a distinguished Product Manager at Oracle. Gerald was a guest, along with Maria Colgan, on theCUBE a while back, and we talked about Oracle's converged database and the kind of Swiss-army-knife strategy, as I called it, of databases. This is dramatically different: an approach at the opposite end of the spectrum from, for instance, AWS, which, for example, goes after the world of developers with a different database for every use case. So, kind of picking up from there, Gerald, I wonder if you could talk about how this new MongoDB API adds to your converged model and the whole strategy there. Where does it fit? >> Yeah, thank you very much, Dave, and, by the way, thanks for having me on theCUBE again. A pleasure to be here. So, essentially the MongoDB API, the compatibility that we introduced with this API, is a continuation of the converged database story, as you said before, which is essentially bringing the many features of the many single-purpose databases that people often like and use together into one technology, so that everybody can benefit from them. As such, this is just a continuation of the many other APIs and standards that we support. We have of course long supported SQL, because we are a relational database from the get-go, and also other standards like GraphQL, SPARQL, et cetera.
And the MongoDB API is now essentially just the next step forward, to give developers this API that they've gotten to love and use. >> I wonder if you could talk about it from the developer angle: what do they get out of it? Obviously you're appealing to the Mongo developers out there, but you've got this Mongo-compatible API, and you're touting the Autonomous Database on OCI. Why aren't they just going to use MongoDB Atlas on whatever cloud: Azure, or AWS, or Google Cloud Platform? >> That's a very good question. We believe that the majority of developers want to just worry about their application, writing the application, and not so much about the database backend that they're using. And especially in the cloud, with cloud services, the reason why developers choose these services is so that they don't have to manage them. Now, Autonomous Database brings many top-notch, advanced capabilities to database cloud services. We firmly believe that Autonomous Database is essentially the next generation of cloud services, with all the self-driving features built in, and MongoDB developers writing applications against the MongoDB API should not have to hold out on these capabilities either. It's like: no developer likes to tune the database. No developer likes to take downtime when they have to rescale their database to accommodate a bigger workload. And this is really where we see the benefit here. So for the developer, ideally nothing will change: you have a MongoDB-compatible API, so they can keep on using their tools and build applications the way that they do, but they benefit from the best cloud database service out there, not having to worry about any of these packaged things anymore, whereas even MongoDB Atlas still has a lot of shortcomings today, as we find. >> Of course, this is always a moving target. The technology business: that's why we love it. So everybody's moving fast, and investing, and shaking and jiving.
But I want to ask you about, well, by the way, you're hiding the underlying complexity; that's really the big takeaway there, so that's huge for developers. But take Amazon's approach, which I was talking about before: right tool for the right job. You've got DocumentDB, you've got Microsoft with Cosmos DB; they compete with Mongo and have been doing so for some time. How does Oracle's API for Mongo differ from those offerings, and how are you going to attract their users to your JSON offering? >> So, first of all, we have to separate them slightly: DocumentDB on AWS and Cosmos DB on Azure take slightly different approaches. DocumentDB is essentially a document store owned and built by AWS, nothing different from MongoDB; it's a head-to-head comparison, use my document store versus the other document store. So you don't get any of the benefits of a converged database. If you ever want to use a different data model, run analytics over the data, et cetera, you still have to use the many other services that AWS provides; you cannot do it all in one database. Now, Cosmos DB is more interesting, because they claim to be a multi-model database. And I say claim because what we understand as a multi-model database is different from what they understand as one, and that's also one of the reasons why we started differentiating with the term converged database. What we mean is that, regardless of what data format you want to store in the database, you should be able to leverage all the functionality of the database over that data format, with no trade-offs. Cosmos DB, when you look at it, essentially gives you modes of operation. When you connect as the application or the user, you have to decide at connection time how this database should be treated. Should it be a document store? Should it be a graph store? Should it be a relational store? Once you make that choice, you are locked into it.
As long as that connection is established, if you said you want a document store, all you get is a document store. There's no way for you to cross-analyze with the relational data sitting in the same service, no way for you to break those boundaries. If you ever want to add some graph data and graph analytics, you essentially have to disconnect and now treat it as a graph store. So you get multiple data models in it, but you still get a one-trick pony the moment you connect, based on the mode you have to choose. And that is where we see a huge differentiation, again, with our converged database, because we essentially say, look: one database cloud service on Oracle Cloud allows you to do anything, if you wish to do so. You can start with a document store. If you want to write some SQL queries on top, you can do so. If you want to add some graph data, you can do so. But at no point do you have to rewrite your application or use different libraries and frameworks to connect, et cetera. >> Got it. Thank you for that. Do you have any data from when you talk to customers? I'm interested in the diversity of deployments: for instance, how many customers are using more than one data model? Do JSON users need support for other data types, or are they happy to stay in their own little sandbox? Do you have any data on that? >> What we see from the majority of our customers is that there is no such thing as one data model that fits everything. And there, again, we have to differentiate between the developer who builds a certain microservice, who may be happy to stay in the JSON world or the relational world, and the company that's trying to derive value from the data. The relational model has not gone away in the 40 years of its existence. It's still kicking strong; it's still really good at what it does. The JSON data model is really good at what it does.
The graph model is really good at what it does. But all these models have been built for different purposes. Try to do graph analytics on relational or JSON data: it's really tricky, and that's why you use a graph model to begin with. Try to shield yourself from how the data is organized and structured: that's really easy in the relational world, not so much in a document store world. And what we see with our customers is that, as they accumulate more data and run many different applications to power their enterprises, the question always comes back, as we have predicted for about six, seven years now: hey, we have all this data in different formats, and we want to bring it all together, analyze it together, get value out of it together. We have seen the whole trend of big data emerge and disappear trying to answer that question, and it didn't quite do the trick. And we are basically now back to where we were in the early 2000's, when XML databases faded away because everybody simply allowed you to store XML in the database. >> Got it. So let's make this real for people. Maybe you could give us some examples. You've got this new API for Mongo, you have your multi-model database. Paint a picture of how customers are going to benefit in real-world use cases. How does it change the customer's world before and after, if you will? >> Yeah, absolutely. So, you know, the API is essentially going to make the lives of developers easier, as we said before, but also, of course, to assist our customers with migrations from MongoDB over to Oracle Autonomous Database. One customer that we have, for example, that would have benefited from the API a couple of years ago, two, three years ago, is one of the largest logistics companies on the planet. They track every package that is being sent in JSON documents.
So every tracked package is represented as a JSON document, and they very early on came to us with the next question: hey, we track all these packages in JSON documents; it would be really nice to know which packages are stuck, or anywhere we have to intervene. Can we analyze how many packages got stuck or didn't get delivered by the end of a day, or whatever? They struggled with this question a lot; it was really tricky to do back then, in that case in MongoDB. So they actually approached Oracle, they migrated over, and they rewrote their applications to accommodate that. And they are happy JSON users in Oracle Database. But if we had already had this API for them, they wouldn't have had to rewrite their applications up front, or, as we often see with migration use cases, they could have worried about rewriting later on. Usually you want to get the migration done, get the data over and running, and then worry about everything else. So this is one case where they would have greatly benefited, where the migration window would have been shortened, if we had already had this MongoDB API, this compatibility layer, back then. >> That's a good use case. I mean, it's one of the most prominent and painful, so anything you can do to help is key. I remember the early days of big data; NoSQL, of course, was the big thing, and there was a lot of confusion: people thought it meant "no SQL" or "not only SQL," which is the more widely accepted interpretation today. But really, it's talking about data that's stored in a non-relational format. So some people thought that SQL was going to fade away; some people probably still believe that. And we saw the rise of NoSQL and document databases. But, if I understand it correctly, a premise for your MongoDB API is that you really see SQL as a main contributor over MongoDB's document collections, for analytics for example.
Can you add some color here? What are you seeing in terms of a resurgence of SQL, or the momentum in SQL? Has it ever really waned? What's your take? >> Yeah, it's a very good point. There as well we see, to some extent, history repeating itself; this has all been tried before with object databases, XML databases, et cetera. But if we stay with the NoSQL databases, I think it speaks volumes that every NoSQL database, which, as you said, first meant "no SQL" and then, actually, we always meant "not only SQL," has introduced a SQL-like engine or interface. The latest to join this family is MongoDB: they have just recently introduced SQL compatibility for the aggregation pipelines, where you can put in a SQL statement and it essentially works with the aggregation pipeline. So they all acknowledge that SQL is powerful. For us this was always clear. SQL is a declarative language; some argue it's the only true 4GL language out there. You don't have to code how to get the data; you just ask the question and the rest is done for you. And has SQL ever diminished, as you asked before? If you look out there, SQL has always been in demand. Look at the various developer surveys, the top skills being asked for: SQL has never gone away. Everybody likes and wants to use SQL. So no, we don't think it has ever been going away. It has maybe just been put in the shadow by some hypes. But again, we had the same discussion in the 2000's with XML databases and the same discussions in the 90's with object databases, and we have frankly all forgotten about it. >> I love when you guys come on and let me do my thing, and I can pretty much ask any question I want, because, I've got to say, when Oracle starts talking about another company, I know that company's doing well.
So I see Mongo in the marketplace, and I love that you guys are calling it out and making some moves there. So here's the thing: you guys have a large install base, and that can be an advantage, but it can also be a weight on your shoulders. These specialized cloud databases don't have that legacy, so they can move freely about with less friction. Now, all the cloud database services are going to have more and more automation; I think that's pretty clear and inevitable. And most, if not all, of the database vendors are going to provide support for these kinds of converged data models, however they choose to do that. They might do it through the ecosystem, like what Snowflake's trying to do, or bring it in house themselves, like a watchmaker that brings in an in-house movement, if you will. But it's like death and taxes: you can't avoid it. It's got to happen. That's what customers want. So with all that being said, how do you see the capabilities that you have today, with automation and converged capabilities, playing out? Do you think it gives you enough of an advantage over the specialized cloud database vendors, where there's clearly a lot of momentum today? >> I mean, honestly, yes, absolutely. With some of these databases we are 20 years ahead, and I'll give you concrete examples. Oracle has had transaction support, ACID transactions, since forever. The NoSQL players all said, oh, we don't need ACID transactions, BASE transactions are fine, yada, yada, yada. MongoDB then started introducing some transaction support. It comes with some limits: a transaction cannot be longer than 60 seconds, cannot touch more than a thousand documents, et cetera. They still have some catching up to do there. I mean, it took us a while to get there too, let's be honest; we have been around for a long time.
The same thing has now happened with version five: they introduced a simple version of multi-version concurrency control, which comes along with ACID transactions. The interesting part is that we introduced this in Oracle version 5, which was somewhere in the 80's, before I even started using Oracle Database. So there's a lot of catching up to do. And then you look at the cloud services as well; there are a lot of things that we Oracle people have taken for granted and keep forgetting. For example, elastic scaling: you want to add one CPU, you add one CPU. Should you take downtime for that? Absolutely not; that would be ridiculous. You cannot take downtime in a 24/7 backend system that runs the world; take any of our customers. But if you look at most of these cloud services, when you want to reshape, to scale your cloud service, it's just a VM under the covers: they shut everything down, give you a VM with more CPUs, and you boot it up again. Downtime, right there. So there are a lot of these things where we go, well, we solved this frankly decades ago, that these cloud vendors will run into. And just to add one more point here, one thing that we see with all these migrations happening is exactly in that field. People essentially started building on MongoDB or one of these other NoSQL databases or cloud databases, and eventually, as these systems grow, as they ask more difficult questions, as their use cases expand, they find shortcomings, whether it's the scalability, the security aspects, or the functionality that we have, and this is essentially what drives them back to Oracle.
And this is why we now see the pendulum swinging back in our direction, where people happily come back over to us to get their workloads enterprise grade, if you like. >> Well, it's true. I mean, I just reported on this recently, the momentum that you guys have in cloud, because you've got the best mission-critical database. I've got to tell you a quick story. I was on stage at a conference one time with Curt Monash. I don't know if you know Curt, but he knows this space really well; he's probably forgotten more about databases than I'll ever know. And I was kind of busting his chops. He was talking about ACID transactions, and I said, well, with NoSQL, who needs ACID transactions, just to poke him. And he said, "Are you out of your mind? Everybody is going to head in this direction." It turned out to be true, so I've got to give him props for that. And so, my last question: if you had a message for, let's say, a skeptical developer out there who's using MongoDB and Atlas, what would you say to them? >> I would say go try it for yourself. If you don't believe us, we have an always free cloud tier out there. You just go to oracle.com/cloud/free, sign up for an always free tier, spin up an autonomous database, and go try it for yourself. See what's actually possible today. Don't just follow the trends on Hacker News and a case study here or there. Go try it for yourself and see what it's capable of. >> All right, Gerald. Hey, thanks for coming into my firing line today. I really appreciate your time. >> Thank you for having me again. >> Good luck with the announcement. You're very welcome, and thank you for watching this CUBE conversation. This is Dave Vellante. We'll see you next time. (gentle music)
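The stuck-package question from the logistics story above can be pictured with a minimal sketch. This is a hypothetical illustration in plain Python, not Oracle's or MongoDB's actual API: an in-memory list stands in for the document store, and the field names and threshold are invented.

```python
# Hypothetical sketch of the logistics use case: each tracked package is a
# JSON document, and the analytics question is "which packages are stuck?".
# A plain in-memory list stands in for the document store.
from datetime import datetime, timedelta

packages = [
    {"id": "PKG-1", "status": "in_transit", "last_scan": "2022-02-01T08:00:00"},
    {"id": "PKG-2", "status": "delivered",  "last_scan": "2022-02-09T17:30:00"},
    {"id": "PKG-3", "status": "in_transit", "last_scan": "2022-02-09T09:15:00"},
]

def stuck_packages(docs, now, max_idle_hours=48):
    """Return ids of undelivered packages with no scan within max_idle_hours."""
    cutoff = now - timedelta(hours=max_idle_hours)
    return [
        d["id"]
        for d in docs
        if d["status"] != "delivered"
        and datetime.fromisoformat(d["last_scan"]) < cutoff
    ]

now = datetime(2022, 2, 10, 12, 0, 0)
print(stuck_packages(packages, now))  # -> ['PKG-1']
```

In the converged model Gerald describes, the same question could equally be posed in SQL over the same JSON documents; the point of the compatibility layer is that the migration no longer forces an application rewrite before such analytics become possible.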

Published Date : Feb 10 2022



Kirk Bresniker, HPE | HPE Discover 2021


 

>> Announcer: From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE conversation. >> Hello and welcome to theCUBE's coverage of HPE Discover 2021 Virtual. I'm John Furrier, your host of theCUBE. We're here with a CUBE alumni, one of the original CUBE guests back in 2011: Kirk Bresniker, president and chief architect of Hewlett Packard Labs. He's also a Hewlett Packard Enterprise fellow and vice president. Great to see you. You're in Vegas, I'm in Palo Alto; we've got a little virtual hybrid going on here. Thanks for spending time. >> Thanks, John, it's great to be back with you. >> So much going on. I love to see you guys having this event, everyone in one spot. Good mojo. Great to see HPE back in the saddle again. I want to get your take; you're in the action right now on the lab side, where disruptive innovation is the theme. It always has been, but this year more than ever, coming out of the pandemic, people are looking for the future, looking to see the signs; they want to connect the dots. There's been some radical rethinking going on that you've been driving in the labs. Take us through what's going on, what you're thinking. What are the big trends? >> Yeah, John, it's been interesting. Over the last 18 months, all of us have gone through about a decade's worth of advancement in decentralization: education, healthcare, our own work, what we're doing right now, suddenly spread apart. And it got us thinking about that distributed mesh, and, as we try and begin to return to normal and certainly think about all that we've lost, we want to move forward; we don't want to regress. We started imagining what that world looks like. We think about the world of 2025: 175 zettabytes of data, 150 billion connected things out there. The shape of the world has changed.
That's where the data is going to be. And so we started thinking about what it's like to thrive in that kind of world. A global defense research institute came to us and asked us that exact question: what's the edge? What do we need to prepare for in this age of insight? And it was kind of like those exam questions; I was one of those kids for whom, on the final exam, if it's a really good question, suddenly everything clicks. I understood all the material because there was that really forcing question. When they asked us that, for me it solidified what I've been thinking about across all the work we've done at Labs over the last 10 years. It's really about what it takes to survive and thrive, and for me it's three things. One, success is going to go to whoever can reason over more information; two, whoever can gain the deepest insights from that information in time that matters; and three, whoever can turn that insight into action at scale. So reason, insight, and action. And it was certainly clear to me that everything we've been trying to push for in Labs, all the boundaries we've been pushing, all the conventions we've been defying, are really about doing that for our customers and partners: bringing in more information for them to understand, allowing them to gain insight across departments and disciplines, and then turning that insight into action at scale, where scale is no longer one cloud or one company or one country, let alone one data center. >> A lot there. I love that: metadata and meta-reasoning, insights have always been part of that. And you mentioned decentralization, again, another big trend. I've got to ask you where the big opportunity is, because a lot of people who are attending Discover, people watching, are trying to figure out what they should be thinking about. So what is that next big opportunity? How would you frame it, and what should attendees look for coming out of HPE Discover?
>> So one thing we're seeing is that this is actually a ubiquitous trend. Whether we're talking about transportation, energy, or communications, they are all trying to understand how they will admit more of that data to make those real-time decisions. Our expectation is that in the middle of this decade, when we have those 175 zettabytes, 30% of that data will need real-time action out at the edge, where the speed of light is now material. We also expect that at that point in time, three out of four of those zettabytes will never make it back to the data center. So understanding how we will allow that computation, that understanding, to reach out to where the data is, and then bringing it in, is important. And if we look at all of those areas, whether it's energy, transportation, or communications, with all that real-time data, they all want to understand. And so I think that as people come to us, virtually now, hopefully in person in the future, and we have those conversations at Labs, it takes a while, and then they realize: wait, that's me, this is my industry too. They see that potential, and suddenly where they see data, they see opportunity, and they just want to know: okay, what does it take for me to turn that raw material into insight, and then turn that insight into action? >> You know, storage and compute never go away; you just need more and more of it. This whole data and edge conversation is really interesting. We're living in that data-centric world; everyone's going to be a data company, okay, we know that, that's obvious. But I've got to ask you: as you start to see machine learning, cloud-scale operations, a new edge, and new architectures emerging, clients start to look at things like AI and they want more explainability behind it. I hear that all the time: can you explain it to me? What is it doing?
Are there biases, good or bad? Explainable, experimental, experiential: these are words I'm hearing more and more. Not so much a speeds-and-feeds game; these are outcomes. So you've got the core data, you've got a new architecture, and you're hearing things like explainable AI and experiential customer support, new things happening. Explain what this all means. >> You know, it's interesting. We have just completed creating an AI ethical framework for all of Hewlett Packard Enterprise, and whether we're talking about something internal that improves a process, something that we sell as a product, or a partnership where someone wants to build an AI system on top of our services and infrastructure, we really wanted to encompass all of those. It was challenging; it actually took us about 18 months from that very first meeting to craft a set of principles to guide our team members and give them that understanding. And what was interesting is that we examined our principles, making sure AI systems are human-centric, reliable, privacy-preserving, and robust, and then you look at where people want to apply today's AI, and you start to realize there's a gap. There are actually areas where we have a great challenge, a human challenge, and as efficacious as today's AIs are, we can't employ them with the confidence and the ethical position that we need to really pull that technology in. And what was interesting is that then became something that we were driving at Labs. It gave us a viewpoint into where there are gaps, where, as you say, explainability is missing. As fantastic as it is to talk into your mobile phone and have it translated into another one of hundreds of languages.
I mean, that is right out of Star Trek, and it's something we can all do. And frankly, we're expecting it now. But as efficacious as that is, as we take on some of these other problems, it's not enough. We actually need AI to be explainable; we need to be able to audit these decisions. And so that's really what has informed our trustworthy AI research and development program at Hewlett Packard Labs. We look at where we want to apply AI, we look at what keeps us from doing it, and then we close the technology gap. And it means some new things, new approaches. Sometimes we're going back to some of the very early AI, things that we sort of left behind when the computational capability allowed us to move into machine learning and deep neural nets: great applications, but not universally applicable. So that's where we are now. We're beginning to construct that second generation of AI systems, with that explainability, that trustworthiness, and, more important, as you said, an understanding of the data flow and the responsibility we have to those who created that data, especially when it represents human information, that long-term responsibility. What are the structures we need to support that ethically? >> That's great insight, Kirk, that's awesome stuff. And it reminds me of "the old is new again," right? The cycles of innovation. You mentioned AI in the eighties; I was smiling because the notion of reasoning over natural language has been around for a while, as have a lot of AI frameworks, but applied differently they become interesting. The notion of meta-reasoning: I remember talking about that in 1998 around ontologies and syntax and data analysis. Again, well-formed, older ways to look at data. And so I've got to ask you: you mentioned reasoning over information, getting the insights, and having actions at scale.
That doesn't sound like an R&D or labs issue, right? That should be in the market today. So I know there's stuff out there. What's different about the Hewlett Packard Labs challenge, since you guys are working on stuff that's next-gen? What's next-gen about reasoning over information and getting insights? Because there are a zillion startups out there that claim to offer insights as a service, taking action on outcomes. >> I'd say a couple of things. One is the technologies and capabilities that got us this far. The twilight of Moore's law is getting a little darker every day. There's been such a tremendous tailwind behind us, and we would have been foolish not to take advantage of it while it lasted, but as it now flattens out, we have to be realistic and say that the ability to expect, anticipate, and plan for a doubling in performance every 18 to 24 months, because there are twice as many transistors in that square of silicon, is something we can't count on anymore. We have to look broader now, and it's not just one technology inflection point; there are many. We already mentioned AI, voraciously devouring all this data, at the same time that the data is now all at the edge, no longer in the data center. We may find ourselves chuckling at the term itself, "data center." Remember when we sent it all the data, because that's where the computers were? Well, that's 2020 thinking, right? That's not even 2025 thinking. Also security, the cyber threat of nation states and criminal enterprises. All these things coming together, that confluence of discontinuities, is what makes it a Labs problem. And the second piece is that we don't just need to do it the way we've been doing it, because that's not necessarily sustainable.
And if something is not sustainable, it is inherently inequitable, because we can't afford to let everyone enjoy those benefits. So I think it's all those things: the confluence of technology disruptions, and this desire to move to truly sustainable, and thus inherently equitable, systems. That's what makes it a Labs problem. >> I really think that's right on the money. And one of the things I want to get your thoughts on, because I know you have a unique historic view of the trajectory arc of cloud computing: everyone's attention went to lift and shift, cloud scale, great cloud native. Now hybrid and multi-cloud are clearly happening. All the cloud players were saying, oh, it's never going to happen, the data center is going to go away. Not really: the data center is just a big edge. So you brought up the data center concept, and you mentioned decentralization. It's a distributed computing architecture; there is no line anymore between what's cloud and what's not. The cloud is just the cloud, the data center is now a big fat edge, and edges are smaller and bigger nodes. Distributed computing is now the context. So this is not a new thing for Hewlett Packard Enterprise; you guys have been doing distributed computing paradigms, supplying software and hardware and solutions, since I can remember, since the company was founded. What's new now? What do you say to folks asking what HPE is doing for this new architecture? Because now an operating model is what they want: DevOps, DevSecOps, all of this is happening. What's the state of the art from HPE, and how do the Labs play into that vision? >> It's so wonderful that you mentioned our heritage, because if you think about it, the first thing that Bill and Dave did was make instruments of unparalleled value and quality for engineers and scientists.
And the second thing they did was computerize that instrument control. Then they networked the instruments together, and then they connected those networked measurement and sensing systems to business computing. And that's really exactly what we're talking about here. Yesterday it was HP-IB cables, but today it is everything from an Aruba wireless gateway, to a GreenLake cloud that comes to you, to now our Cray exascale supercomputing. We wanted to look at that entire gamut and understand exactly what you said: how can today's modern developer, who is steeped in agile development, DevOps, and DevSecOps, be as comfortable and confident deploying to any one of those systems, or all of them in conjunction, as they've been deploying to a cloud? I think that's really part of what we need to understand. And as you move out towards the edge, things become interesting: a tiny amount of resources, while the number of threats, physical and cyber, increases dramatically. It is no longer the healthy, happy environment of that raised-floor data center; it is actually out in the world, but we have to be there because that's where the data is. So that's another piece we're trying to address with the Labs' distributed systems lab: how do we make cloud native access every single byte everywhere, from the tiniest little edge embedded system all the way up through that exascale supercomputer? How do we admit all of that data to this entire generation, and the subsequent generations, who will no longer understand what we were so worried about with things being in one place or another? They want to digest all the world's data regardless of where it is.
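The edge pattern Kirk describes, where most raw data never travels back to the data center because the edge node acts on it locally and ships only a small aggregate upstream, can be sketched minimally. This is an illustrative toy, not HPE's implementation; the readings, threshold, and summary fields are invented.

```python
# Toy sketch of edge-side data reduction: act on raw readings locally
# (real-time action at the edge), send back only a compact aggregate.
def process_at_edge(readings, alert_threshold=90.0):
    """Act locally on raw readings; return only the aggregate to send upstream."""
    alerts = [r for r in readings if r > alert_threshold]  # local real-time action
    summary = {
        "count": len(readings),
        "mean": sum(readings) / len(readings),
        "alerts": len(alerts),
    }
    return summary  # the raw readings themselves stay at the edge

readings = [71.2, 88.9, 93.4, 70.1, 95.0]
print(process_at_edge(readings))
```

The design point matches the statistic cited above: if three out of four bytes never make it back to the data center, the upstream system sees only summaries like this one, and the latency-sensitive decisions (the alerts) happen where the speed of light is no longer material.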
Supply chain used to mean hardware; now you have a software supply chain as well. So everything is happening in these kinds of new use cases. And edge is a great example, where you want to have compute at the edge, not have it pulled back to some central location. So again, advantage HPE, right? You've got solutions there. All these things, like memory-driven computing, something you've worked on and been driving with The Machine project that we talked about when you launched it a few years ago, now look like good R&D investments, because all the discussions I'm hearing, whether it's equipment in space or inside hybrid edges, come down to: I've got to have software running on an embedded system, I need security, I've got to have a memory-driven architecture, I've got to have data-driven value in real time. This is a new kind of shift, but you still need to run it. What's the update on The Machine and memory-driven computing, and how does that connect the dots for this intelligent edge that's now so important in the hybrid equation? >>Yeah, it's fantastic you brought that up. It's gratifying when you've been drawing pictures on your whiteboard for 10 or 15 years and suddenly you see them printed and on the web, and it's like, okay, yeah, you guys were there. Because we always knew it had to be bigger than us, and for a while you wonder, well, is this the right direction? Then you get that gratification when you see it repeated. And I think one of the other elements you mentioned that was so important was that supply chain, especially as we get towards these edge devices and the increasing cyber threat. It's so much more about understanding the provenance of that supply chain, and how we get beyond trust to proof. And in our case, that proof is rooted in the silicon.
Start with the silicon: establish a silicon root of trust, something that can't be forged, a physically unclonable function in the silicon. And then build up that chain, not of trust, but of proof, of measurable confidence. Then let's link that through the hardware and through the data. And I think that's another element: understanding how that data is flowing in, and establishing that provable provenance. That also lets us come back to the equity question. How do we deal with all this data? Well, we want to make sure that everyone wants to buy in, and that's why you need to be able to reward them: being able to trace data into an AI model, and trace it back out to its effect on society. All these are things we're trying to understand in the Labs, so that we can really establish this data economy and admit the data we need to the problems that are just crying out for that solution. Bring in that data; you just need to know, where is the data, where is the answer? Now, I've worked for several years with the German Center for Neurodegenerative Disease Research, and I was teasing their director, Dr. Nakata. I said, you know, in a couple of years, when you're getting that Nobel Prize in medicine because you cracked Alzheimer's, I want you to tell me how long the answer was hiding in plain sight, because it was segregated across disciplines and across geography, and it was there. We just didn't have the ability to view across the breadth of the information, in a time that matters.
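The "chain of proof" described above, starting from an unforgeable silicon measurement and extending it component by component, resembles the measured-boot pattern used by TPMs. Here is a toy sketch in Python (illustrative only: a real root of trust lives in hardware, and the stage names below are made up, not HPE's implementation):

```python
import hashlib

def extend(chain: bytes, component: bytes) -> bytes:
    """Extend the measurement chain: new = SHA-256(old || SHA-256(component))."""
    return hashlib.sha256(chain + hashlib.sha256(component).digest()).digest()

# In real hardware this starting value would be an unforgeable,
# silicon-anchored measurement (e.g. from a PUF or a TPM PCR).
root = hashlib.sha256(b"silicon-root-of-trust").digest()

boot_stages = [b"firmware-v1.2", b"bootloader-v3", b"kernel-v5.10"]
chain = root
for stage in boot_stages:
    chain = extend(chain, stage)

# Tampering with any stage yields a different final digest, so a verifier
# holding the expected value gets measurable confidence, not just trust.
tampered = root
for stage in [b"firmware-v1.2", b"EVIL-bootloader", b"kernel-v5.10"]:
    tampered = extend(tampered, stage)

assert chain != tampered
```

The design point is that each link commits to everything before it, so the final digest is a compact, checkable proof over the whole chain.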
And I think so much of what we're trying to do with the Labs is that reasoning over more and more information, gaining insights in the time that matters, and then it's all about action: driving that insight into the world, regardless of whether it has to land in an exascale supercomputer or a tiny little edge device. We want today's application development teams to feel that degree of freedom to range over all of that infrastructure and all of that data. >>You know, you bring up a great call-out there, and I want to highlight it because I thought that was awesome: future breakthroughs are hiding in plain sight. It's the access to the people and the talent to solve the problems, and the data that's stuck in the silos. You bring those together, you make that seamless and frictionless, and magic happens. That's really what we're talking about in this new world, isn't it? >>Absolutely, yeah. It's one of those things: sometimes my kids ask, you know, why do you come in every day? And for me it is exactly that. I think so many of the challenges we have are actually solvable if the right people knew the right information at the right time, and if we all had, again, not trust, but that proof, that measurable confidence. Back to the instruments that HP was always famous for: it was that precision, and they all had that calibration tag, so you could measure your confidence in an HP instrument. In the same way, we want people to measure their confidence when data is flowing through Hewlett Packard Enterprise infrastructure. >>It's interesting that you bring up the legacy, because instrumentation, networked together, connecting to business systems: hey, that sounds like the cloud. Observability, modern applications, instant action and actionable insights. That's almost exactly the same formula.
>>Yeah. For me, the constant through-line from the garage to right now is that ability to handle information and connect people to the information that they need. >>Great to chat. You're always an inspiration, and we could go for another hour talking about exascale, GreenLake, and all the other cool things going on at HPE. I've got to ask you the final question: what are you most excited about for HPE and its future, how can folks learn more at Discover, and what should they focus on? >>So for me, what I love is that I imagine that world where the data today is out there at the edge. We have our Aruba team, we have our GreenLake team, we have our consistent core enterprise infrastructure business, and now we also have everything all the way up through exascale compute. When I think of that thriving business, that ability to bring in massive data analytics, machine learning and AI, and then simulation and modeling: whether you're a scientist, an engineer, or an artist, you want that intersectionality. And I think we have this incredible, diverse set of resources to bring to bear on those problems, spanning from edge to cloud, back to core, and then to exascale. That's what I find so exciting: all of the great innovators we get to work with and the markets we get to participate in. And then for me there's also the fact that it's all happening at Hewlett Packard Enterprise, which means we have a purpose. When they asked Dave Packard, Dave, why HP? He said, back in 1960: we come together as a company because we can do something we could not do by ourselves, and we make a contribution to society. And I dare anyone to spend more than a couple of minutes with Antonio Neri and not have him remind you.
Whether it is here at Discover or in the halls at Labs, he'll remind you that our purpose at Hewlett Packard Enterprise is to advance the way that people live and work. And for me that's the direct connection. So it's the technology and then the purpose, and that's really what I find so exciting about HPE. >>That's a great call-out; Antonio deserves props. I love talking with him; he has the true Bill Hewlett and Dave Packard spirit. And I'll say, having talked with him, one of the things that resonates with me is the citizenship. It would be interesting, if Bill and Dave were alive today, to see that now it's a global citizenship. This is a huge part of the culture, and I know it's still alive there at HPE. So, great call-out there, and props to Antonio and yourself and the team. Congratulations, and thanks for spending the time; appreciate it. >>Thank you, John. It's great to be with you again. >>Okay: global labs, global opportunities, radical rethinking. This is what's happening within HPE's Hewlett Packard Labs. A great contribution there from Kirk; we have him on theCUBE and it's always fun to talk. So much to digest there. It's awesome. I'm John Furrier with theCUBE. Thanks for watching.

Published Date : Jun 17 2021


Kirk Viktor Fireside Chat Trusted Data | Data Citizens'21


 

>>Kirk focuses on the approach to modern data quality and how it can enable the continuous delivery of trusted data. Take it away, Kirk. >>Trusted data has been a focus of mine for the last several years, most particularly in the area of machine learning. I spent much of my career on Wall Street, writing models and trying to create a healthy data program, sort of run the bank and protect the franchise, and do that at scale for larger organizations. I'm excited to have the opportunity today to sit down for a fireside chat with Viktor. He is an award-winning and best-selling author of Delete, Big Data, and most recently Framers. He's also a professor of governance at Oxford. So Viktor, my question for you today is: in an era of data that is always on and always flowing, how do CDOs get comfortable, that I-can-sleep-at-night factor, when data is coming in from more angles, stored in different formats and varieties, and in larger quantities than ever before? With that much data, just by the law of large numbers, is there really that much more risk of having bad data or inaccuracy in your business? >>Well, thank you, Kirk, for having me on. Yes, you're absolutely right. The real problem, if I were to simplify it down to one statement, is that incorrect data can lead to wrong decisions that are incredibly costly: costly for trust, for the brand, for the franchise, and costly because they can lead to decisions that are fundamentally flawed and take the business in the wrong direction. And so the real question is, how can you avoid incorrect data producing incorrect insights? And that depends on how you view trust, and how you view data and correctness, in the first place.
Yeah, that's interesting. You know, in my background we were constantly writing models, trying to make them smarter all the time. We always wanted to push that accuracy level from 89% to 90%, whatever we could get. But there's this popular theme where, over time, models diminish in accuracy, and the only button we really had at our disposal was to retrain the model. Often I'm focused on whether we should be stress-testing the data instead, almost like a patient health exam, and how we do that, so we can get more comfortable with the quality of the data before we run our models and our analytics. >>Yeah, absolutely. When we look at the machine learning landscape, even the big data landscape, what we see is that a lot of focus is now put on getting the models right, getting the kinks worked out, getting the ethics right, the values that are in the model. What is really not focused on enough is the data. Now, if you're looking at it from a compliance viewpoint, maybe it's okay to just look at the model; maybe not. But if you understand that using the right data with the right model gives you a competitive advantage your competitors don't have, then it is far more than compliance. And if it is far more than compliance, then the aperture for strategy opens up, and you should not just look at the models. You should look at the data, and the quality and correctness of the data, as a huge lever by which you can push forward your competitive advantage. >>Well, I have an even trickier one for you. There's so much coming in, so much that we know we can measure, and so much we can replay, run what-if analysis on, and backtest. But do you see organizations doing things to look around the corner?
And maybe an interesting analogy is what Tesla is doing: whether it's sensors or lidar, they bounce signals off every object they know and make a lot of measurements. But the advancements in computer vision are saying, I might be able to predict what's around the corner; I might be able to get out ahead of the data error I'm about to see tomorrow. Do you see any organizations trying to take that futuristic step, to know the unknown and be predictive rather than reactive? >>Absolutely. Tesla is doing a bit of that, but so are others in the autonomous driving space. Waymo, the Google company that has been doing autonomous driving for a long time, has been collecting training data through its cars and then running machine learning on that training data. Now, they hit a wall a couple of years ago because the training data wasn't diverse enough: even though there was more and more of it, it didn't deliver that Moore's-law-like growth in insight anymore. The delta, the additional learning, was just limited. So what they decided to do was build a virtual world called Carcraft, in which virtual cars drive around and generate predictive training data. What is really interesting about that is that it isn't just a model; it is a model that creates predictive data, and that predictive data is the actual value added to the equation. With this extra predictive data, they were able to improve their autonomous driving quite significantly. Five years ago, their disengagement rate was one every 2,000 miles on average; last year, five years later, it was one every 30,000 miles on average. That's a 15x improvement, and it wasn't driven by a mysterious model; it was driven by predictive data. >>Right, right. You know, that's interesting.
I'm also a fan of trying to use data points that don't exist in the data sets, so it sounds like they were using derived data, data that came from other sources. Maybe the simplest example I usually start with: suppose I was looking at data from Glassdoor and I wanted to know whether it was valid and accurate. Of course there are going to be numbers in the age field, the salary, the years of experience, and so on. But what if someone's years of experience, age, and academic level no longer correlate with the salary? That correlation component is not a piece of data that lives in any column, row, or cell. So I do think there's a huge area for improvement, both in the raw data we see and collect, and in data science metrics like lift and correlation between the data points, which really help me certify and feel comfortable that the data makes sense. Otherwise it could just be numbers in a field. >>Indeed. And this challenge of finding the data, focusing on the right subset of it, and manipulating it in a qualitatively right way has been with us for quite a number of years. There's a fabulous case from a few years back in Japan, when there was the suspicion that massive match fixing was going on in sumo wrestling. Investigators came in, took the data from the championship bouts, analyzed it, and didn't find anything. What was really interesting is that later, researchers came in, read the rules and regulations of sumo wrestling, and understood that it's not just the championship bouts that matter, but sometimes also the relegation matches.
So then they started looking at those secondary matches that nobody had looked at before, and in that subset of data they discovered there was massive match fixing going on. It's just that nobody had looked at it, because nobody had made, as you said, that connection between the various data sources, that causal connectivity. And so it's really crucial to understand that driving insight out of data isn't a black-box thing where you feed the data in and get insight out. It really requires deep thinking about how to wire it up from the very beginning. >>That's an interesting story. I kind of wonder if the model in that case is almost the wrestlers themselves, or the output, but it's definitely the data that goes into it. So, do you see a path where organizations will achieve a hundred percent confidence? We all know there's the I-can't-sleep-at-night factor, but there's also the question of what to do today. I'm probably not living in a perfect world; I might be sailing a boat across an ocean that already has a hole in it. We can't turn everything off; we have to patch the boat and sail it at the same time. What do you think is a good approach for a large organization to improve its posture? >>You know, if you focus on perfection, you never achieve it; a hundred percent perfection is never achievable. And if you want some radical change, that's admirable, but a lot of times it's a very risky proposition. So rather than doing that, there is a lot of low-hanging fruit in an incremental, pragmatic, step-by-step approach. If I can use an analogy from history: we talk a lot about the data revolution and, before that, the industrial revolution. When we think about the industrial revolution, we think about the steam engine, but the reality is that the steam engine wasn't just one radical invention.
In fact, there were a myriad of small, incremental innovations over the course of a century that today we call the industrial revolution. And I think it's very much the same with the data revolution: we don't have one silver bullet that radically puts us into data nirvana, but this incremental, pragmatic, step-by-step change will get us closer and closer to where we want to be, even though there is always more work left for us. >>Yeah, that's interesting. You know, that one hits home for me, because we at Collibra ultimately take an incremental approach. We don't think there's a stop-the-world event; there's a way to learn from the past trends of our data and become incrementally smarter each day. And that keeps us out of a binary project mode, where we have to write something for six months, then reassess it and hope. Instead we ask: if you're at 70% accuracy today, is being at 71% tomorrow better? At least there's a measurable amount of improvement there. It's a sort of philosophical difference. And it reminds me of my banking days, when you say, you know, past performance is no guarantee of future results. It's a nice disclaimer you can put on everything, but in data I actually find the past to be quite predictive.
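Kirk's point about measurable, incremental improvement (is 71% tomorrow better than 70% today?) implies tracking a quality score as a time series rather than as a one-off project gate. A minimal sketch of that idea, with invented scores and an arbitrary window size, purely to illustrate the trend check:

```python
# Daily accuracy (or data-quality) scores from successive validation runs.
daily_scores = [0.70, 0.705, 0.71, 0.708, 0.715, 0.69, 0.72]

def trend(scores, window=3):
    """Compare the latest window's mean score against the prior window's."""
    recent = sum(scores[-window:]) / window
    prior = sum(scores[-2 * window:-window]) / window
    return recent - prior

delta = trend(daily_scores)
if delta >= 0:
    print(f"incrementally improving: +{delta:.3f} over the last window")
else:
    print(f"regression: {delta:.3f}; investigate before the next batch")
```

Averaging over a window smooths out single noisy days, so a one-day dip doesn't trigger a false alarm while a sustained slide does.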
You know, one that's been sitting on the top of my mind recently, especially with COVID and the housing market a long time back, I competed with automation, valuation modeling, which basically means how well can you predict the price of a house? And, you know, that's always a fun one to do. And there's some big name brands out there that do that pretty well. >>Back then when I built those models, I would look at things like the size of the yard, the undulation of the land, uh, you know, whether a pool would award you more or less money for your house. And a lot of those factors were different than they are now. So those models ultimately have already changed. And now that we've seen post COVID people look for different things in housing and the prices have gone up. So we've seen a decline and then a dramatic increase. And then we've also seen things like land and pools become more valuable than they were in the housing model before, you know, what are you seeing here with models and data and how that's going to come together? And it's just, is it always going to change where you're going to have to constantly recalibrate both, you know, our understanding of the data and the models themselves? >>Well, indeed the, the problem of course is almost eternal. Um, oftentimes we have developed beautiful models that work really well. And then we're so wedded to this model or this particular kind of model. And we can fathom to give them up. I mean, if I think of my students, sometimes, you know, they, they, they, they have a model, they collect the data, then they run the analysis and, uh, it basically, uh, tells them that their model was wrong. They go out and they collect more data and more data and more data just to make sure that it isn't there, that, that, that their model is right. But the data tells them what the truth is that the model isn't right anymore that has context and goals and circumstances change the model needs to adapt. 
And we have seen it over and over again, not just in the housing market, but post COVID and in the COVID crisis, you know, a lot of the epidemiologists looked at life expectancy of people, but when you, when you look at people, uh, in the intensive care unit, uh, with long COVID, uh, suffering, uh, and in ICU and so on, you also need to realize, and many have that rather than life expectancy. >>You also need to look at life quality as a mother, uh, kind of dimension. And that means your model needs to change because you can't just have a model that optimizes on life expectancy anymore. And so what we need to do is to understand that the data and the changes in the data that they NAMIC of the data really is a thorn in our thigh of revisiting the model and thinking very critically about what we can do in order to adjust the model to the present situation. >>But with that, Victor, uh, I've really enjoyed our chat today. And, uh, do you have any final thoughts, comments, questions for me? >>Uh, you know, Kirk, I enjoyed it tremendously as well. Uh, I do think that, uh, that what is important, uh, to understand with data is that as there is no, uh, uh, no silver bullet, uh, and there is only incremental steps forward, this is not actually something to despair, but to give and be the source of great hope, because it means that not just tomorrow, but even the day after tomorrow and the day after the day after tomorrow, we still can make headway can make improvement and get better. >>Absolutely. I like the hopeful message I live every day to, uh, to make data a better place. And it is exciting as we see the advancements in what's possible on what's kind of on the forefront. Um, well with that, I really appreciate the chat and I would encourage anyone. 
Who's interested in this topic to attend a session later today on modern data quality, where I go through maybe five key flaws of the past and some of the pitfalls, and explain a little bit more about how we're using unsupervised learning to solve for future problems. Thanks Victor. Thank you, Kurt. >>Thanks, Kirk. And Victor, how incredible was that?

Published Date : Jun 17 2021
