Round table discussion
>> Thank you for joining us for our Accelerate Next event. I hope you're enjoying it so far. I know you've heard about the industry challenges, the IT trends, and HPE strategy from leaders in the industry, and so today what we want to do is focus on going deep on workload solutions. So, the most important workload solutions, the ones we always get asked about. And so today we want to share with you some best practices, some examples of how we've helped other customers, and how we can help you. All right, with that, I'd like to start our panel now and introduce Chris Idler, who's the Vice President and General Manager of the Element. Chris has extensive solution expertise; he's led HPE solution engineering programs in the past. Welcome, Chris. And Mark Nickerson, who is the Director of Product Management, and his team is responsible for solution offerings, making sure we have the right solutions for our customers. Welcome, guys, thanks for joining me.

>> Thanks for having us, Christa.

>> Yeah, so I'd like to start off with one of the big ones, the ones that we get asked about all the time, what we've all been experiencing in the last year: remote work, remote education, and all the challenges that go along with that. So let's talk a little bit about the challenges that customers have had in transitioning to these remote work and remote education environments.

>> Uh, so I really think that there are a couple of things that have stood out for me when we're talking with customers about VDI. Um, first, obviously there was an unexpected and unprecedented level of interest in that area about a year ago, and we all know the reasons why, but what it really uncovered was how little planning had gone into this space around a couple of key dynamics. One is scale. Um, it's one thing to say, I'm going to enable VDI for a part of my workforce in a pre-pandemic environment where the office was still the central hub of activity for work. It's a completely different scale when you think about, okay, I'm going to have 50, 60, 80, maybe 100% of my workforce now distributed around the globe. Um, whether that's in an educational environment where now you're trying to accommodate staff and students in virtual learning, or whether that's in the area of things like Formula One racing, where we had the desire to still have events going on but the need for a lot more social distancing, not as many people able to be trackside but still needing to have that real-time experience. This really manifested in a lot of ways, and scale was something that I think a lot of customers hadn't put as much thought into initially. The other area is around planning for experience. A lot of times the VDI experience was planned out with very specific workloads or very specific applications in mind. And when you take it to a more broad-based environment, if you're going to support multiple functions, multiple lines of business, there hasn't been as much planning or investigation that's gone into the application side. And so thinking about how graphically intense some applications are, uh, one customer that comes to mind would be Tyler ISD, who did a fairly large rollout pre-pandemic, and as part of their big modernization effort, what they uncovered was that even just changes in standard Windows applications had become so much more graphically intense with Windows 10, with the latest updates, with programs like Adobe, that they were really needing to have an accelerated experience for a much larger percentage of their install base than they had counted on.
So, um, in addition to planning for scale, you also need to have that visibility into what are the actual applications that are going to be used by these remote users, how graphically intense those might be, what the login experience is going to be as well as the operating experience. And so really planning through that experience side, as well as the scale and the number of users, those are kind of the two biggest, most important things that I've seen.

>> You know, Mark, I'll just jump in real quick. I think you covered that pretty comprehensively there, and it was well done. A couple of observations I've made: one is just that, um, VDI has suddenly become, like, mission critical. For sales it's the front line, you know; for schools it's the classroom. You know, this isn't a cost-cutting measure or an IT optimization measure anymore. This is about running the business. In a way it's a digital transformation, one aspect of about 1,000 aspects of what it means to completely change how your business does business. And I think what that translates to is that there's no margin for error, right? You know, you really need to deploy this in a way that performs, that understands what you're trying to use it for, that gives that end user the experience that they expect on their screen or on their handheld device or wherever they might be, whether it's a racetrack, a classroom, or on the other end of a conference call or a boardroom, right? So what we do on the engineering side of things when it comes to VDI is really understand what's a task worker, what's a knowledge worker, what's a power worker, what's a GPU really going to look like, what does time of day look like. You know, who's using it in the morning, who's using it in the evening? When do you power up? When do you power down? Does the system behave? Does it just have the "it works" function? And what our clients can get from HPE is, um, you know, a worldwide set of experiences that we can apply to making sure that the solution delivers on its promises. So we're seeing the same thing you are, Christa. We see it all the time on VDI and on the way businesses are changing the way they do business.

>> Yeah. It's funny, because when I talked to customers, you know, one of the things I heard that was a good tip is to roll it out to small groups first so you can really get a good sense of what the experience is before you roll it out to a lot of other people. And then the expertise. Um, it's not like every other workload that people have done before, so if you're new at it, make sure you're getting the right advice and expertise so that you're doing it the right way. Okay. One of the other things we've been talking a lot about today is digital transformation and moving to the edge. So now I'd like to shift gears and talk a little bit about how we've helped customers make that shift, and this time I'll start with Chris.

>> All right. Hey, thanks. Okay, so, you know, it's funny when it comes to edge, because, um, the edge is different for every customer and every client, and every single client that I've ever spoken to of HPE's has an edge somewhere. You know, whether, just like we were talking about, the classroom might be the edge. But I think the industry, when we're talking about edge, is talking about, you know, the Internet of Things, if you remember that term from not too long ago, you know, and the fact that everything is getting connected, and how do we turn that into, um, into telemetry?
And I think Mark is going to be able to talk through a couple of examples of clients that we have in things like racing and automotive. But what we're learning about edge is it's not just how do you make the edge work, it's how do you integrate the edge into what you're already doing. And nobody's just the edge, right? And so if it's, um, AI, ML, DL there, that's one way you want to use the edge. If it's a customer experience point of service, it's another. You know, there's yet another way to use the edge. So it turns out that having a broad set of expertise like HPE does really matters, um, to be able to understand the different workloads that you're trying to tie together, including the ones that are running at the edge. Often it involves really making sure you understand the data pipeline. What information is at the edge? How does it flow to the data center? How does it flow? And then which data center, which private cloud, which public cloud are you using? Um, I think those are the areas where we really sort of shine, in that we understand the interconnectedness of these things. And so, for example, Red Bull, and I know you're going to talk about that in a minute, Mark, um, the racing company, you know, for them the edge is the racetrack, and, you know, milliseconds or partial seconds are winning and losing races. But then there's also an edge of, um, workers that are doing the design for the cars, and how do they get quick access? So, um, we have a broad variety of infrastructure form factors and compute form factors to help with the edge. And this is another real advantage we have: we know how to put the right piece of equipment with the right software. And we also have great containerized software with our Ezmeral Container Platform. So we're really becoming, um, a perfect platform for hosting edge-centric workloads and applications and data processing. Uh, it's, uh, um, all the way down to things like a Superdome Flex in the background if you have some really, really, really big data that needs to be processed, and of course our workhorse ProLiants that can be configured to support almost every combination of workload you have. So I know you started with edge, Christa, and we nail the edge with those different form factors, but let's make sure, you know, if you're listening to this show right now, um, make sure you don't isolate the edge, and make sure it's integrated with, um, with the rest of your operation. Mark, you know, what did I miss?

>> Yeah, to that point, Chris, I mean, and this kind of actually ties the two things together that we've been talking about here: the edge has become more critical as we have seen more work moving to the edge, as where we do work changes and evolves. And the edge has also become that much closer, because it has to be that much more connected. Um, to your point talking about where that edge exists, that edge can be a lot of different places. Um, but the one commonality really is that the edge is an area where work still needs to get accomplished. It can't just be a collection point where everything then gets shipped back to a data center, back to some other area, for the work. It's where the work actually needs to get done, whether that's edge work in a use case like VDI, or whether that's edge work in the case of doing real-time analytics. You mentioned Red Bull Racing; I'll bring that up. I mean, you talk about, uh, an area where time is of the essence. Everything about that sport comes down to time.
You're talking about wins and losses that are measured, as you said, in milliseconds. And that applies not just to how performance is happening on the track, but how you're able to adapt and modify the needs of the car, adapt to the evolving conditions on the track itself. And so when you talk about putting together a solution for an edge like that, you're right, it can't just be, here's a product that's going to allow us to collect data, ship it back someplace else, and wait for it to be processed in a couple of days. You have to have the ability to analyze that in real time. When we pull together a solution involving our compute products, our storage products, our networking products, when we're able to deliver that full-package solution at the edge, what you see is results like a 50% decrease in processing time to make real-time analytic decisions about configurations for the car and adapting to real-time test and track conditions.

>> Yeah, really great point there. Um, and I really love the example of edge and racing, because, I mean, that is where every millisecond counts. Um, and it's so important to process that at the edge. Now, switching gears just a little bit, let's talk a little bit about, um, some examples of how we've helped customers when it comes to business agility and optimizing the workload for maximum outcome. For business agility, let's talk about some things that we've done to help customers with that. Mark, give it a shot.

>> Uh, so when we think about business agility, what you're really talking about is the ability to implement on the fly, to be able to scale up and scale down, the ability to adapt to real-time changing situations. And I think the last year has been an excellent example of exactly how so many businesses have been forced to do that. Um, I think one of the areas where we've probably seen the most ability to help customers in that agility area is around the space of private and hybrid clouds. Um, if you take a look at the need that customers have to be able to migrate workloads and migrate data between public cloud environments, app development environments that may be hosted on site or maybe in the cloud, the ability to move out of development and into production, and having the agility to then scale those application rollouts up, having the ability to have some of that, um, some of that private cloud flexibility in addition to a public cloud environment is something that is becoming increasingly crucial for a lot of our customers.

>> All right, well, we could keep going on and on, but I'll stop it there. Uh, thank you so much, Chris and Mark. This has been a great discussion. Thanks for sharing how we help other customers and some tips and advice for approaching these workloads. I thank you all for joining us, and remind you to look at the on-demand sessions. If you want to double-click a little bit more into what we've been covering all day today, you can learn a lot more in those sessions. And I thank you for your time. Thanks for tuning in today.
Breaking Analysis: CIO/CISO Round Table
>> From theCUBE Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation.

>> Hello everybody, this is Dave Vellante, and welcome to this Breaking Analysis. I'm here with Erik Bradley, who's the managing director of ETR and runs their VENN program. Erik, good to see you.

>> Very nice to see you too, Dave. Hope you're doing well.

>> Yeah, I'm doing okay, hanging in there. You know, you guys in New York are fighting the battle. Looks like we're making some progress here, so, you know, all the best to you and your family and the wider community. I'm really excited to have you on today because I had the pleasure of sitting in on a CIO/CISO panel last week. And we're going to explain sort of what that's all about, but one of the things ETR does that I really like is they go deeper with anecdotal information, and it's almost like in-depth interviews in these round tables. So they complement their quarterly surveys, and their other drill-down surveys, with anecdotal information from people in their community. So it's a tried and true survey practice that adds some color to the dataset. So guys, if you bring up the agenda, I want to share with the audience what we're going to talk about today. So, we'll talk a little bit about, you know, we just did intros. I want to ask Erik what ETR VENN is, and then we'll go through some of the guests. But if we go back to Erik: explain a little bit about VENN and the whole process and how you guys do that.

>> Yeah, sure. We should hire you for marketing; you just did a great job, actually, describing that. But about three years ago, what we decided was, ETR does an amazing job collecting the data. It can tell you what's happening, who it's happening to, and when it's happening. But it can't always tell you why it's happening. So leveraging a lot of my background, twenty-plus years in journalism and institutional Wall Street research, we decided to take the ETR community, the people that actually take the surveys, and start doing interviews with them and start doing events with them. And in doing that, we're basically just trying to complement the survey findings and the data. So what we always say is that ETR will always give you the quantitative answer and VENN will give you the qualitative answer.

>> Now guys, let's bring up the agenda slide again, let's take a look at the folks that participated in the round table. Now, ETR's clients actually know the names and the titles and, well, the companies that these guys work for. We've anonymized it for the public. But you had a CIO of a Global Auto Supplier, a CISO of a Diversified Holdings Firm, who actually had some hospitality exposure but also some government contract manufacturing exposure, a Chief Architect of a Software ISV, and a VP and CISO of a Global Hospitality Resort Chain. So three out of the four, Erik, were really in industries that are getting hit hard. Obviously the software company maybe a little bit better. But maybe you can add some color to that.

>> Well, actually the software company, unfortunately, was getting hit hard as well, because they're a software ISV that actually plays into the manufacturing space as well. So this particular panel of CIOs and CISOs were actually in very hard-hit industries. And we're going to make sure we do two more follow-ups with different industry verticals to make sure we're getting a little bit of a wider berth and collect all of that information in a better way.
But coming back to this particular call, the whole reason we did this, and as you know, you spoke to my colleague and friend Sagar Kadakia, who is the Director of Research for ETR, is that we were nimble enough to actually change our survey while it was in the field to start collecting data on what the real-time impact of the COVID-19 pandemic was. We were able to take that information, extrapolate it, and then say, okay, let's start reaching out to these people and dig deeper. Find out why it's happening, and even more so, is it permanent? And which vendors are going to win and which vendors might lose from it? So that was the whole reason we set up the series of calls. We've only conducted one so far. We have another one this coming Tuesday as well, with four entirely new panelists that are going to be from different industry verticals, because, as you astutely pointed out, these verticals were very hard hit, and not all of them are hit as hard as others. So it's important to get a wider cross-section.

>> So, guys, let's take a look at some of the budget impacts, the anecdotal evidence that we gathered here. So let me just scan through it, and then, Erik, I'll ask you to comment. So, you know, like Erik said, some hard-hit industries. All major projects, anything sort of next-generation, have been essentially shelved. That was the ISV. And then another one: we cut at least 70% of the big projects moving forward. He mentioned ServiceNow, actually called them out, but ServiceNow is a SaaS company; they'll probably, you know, weather the storm here. But he did say we've put that on hold. The best comment, you know, "As-a-Service has saved our SaaS." (Erik laughs) That one's great. And then we're going to get into some of the networking commentary, some really interesting things about how to support the work from home. You know, kind of shifting from a hardened topology to remote workers. And then a lot of commentary on security. So, you know, that's sort of a high-level scan, and there's just so much information here, Erik, but maybe you could sort of summarize some of that commentary.

>> Yeah, we should definitely dig into each of those sectors a little more, but to summarize, what we're seeing here is the real winners and losers are clear. Not everyone was prepared to have a work-from-home strategy. Not everyone was prepared to send their workers out. Their VPN wasn't... they didn't have enough bandwidth. So there was a real quick uptick in spending, but longer term we're starting to see that these changes will become more permanent. So the real winners and losers right now: we're going to see on the loser's side traditional networking. MPLS networking is in a lot of trouble according to all the data and the commentary that we're seeing. It's expensive, it's difficult to ramp up bandwidth as quickly as you need, and it doesn't support remote. So we're seeing that lose out, and the winners there are in the SD-WAN space. It's going to be impossible to ignore that going forward, and some of our CIO and CISO panelists said that change will be permanent. Also, we're seeing at the same time what they were calling "SaaS and Cloud." Now, we know these trends obviously were already happening, but they're being exacerbated. They're happening even more quickly and more strongly, and I don't see that changing any time soon. That, of course, is at the expense of network, I'm sorry, data centers, whether it be your own or hosted, which has huge ramifications on on-prem hardware, even the firewall providers.
So what we're seeing here is, obviously, we know things are going to be impacted by this situation. We didn't necessarily expect all of our community members and IT decision-makers to talk about them being possibly permanent. So that, on a high level, was something that was extremely interesting. And the last one that I would bring up is that as we make this shift towards working from home, towards remote access, you also have to align yourself with the security that can support that. And one of the things that we're seeing on our data side at ETR is a widening bifurcation between the next-generation security vendors and the more traditional security or legacy security players. That bifurcation just keeps getting wider and wider, and this situation could be the last straw.

>> So I want to follow up on a couple of those things. You're talking about sort of the network shift, you know, towards SD-WAN. What people have described to me is that they had, you know, a hardened topology. It's a hierarchical network. It's very well understood and it's safe, right? And now all of a sudden you've got all those remote workers, and so you've got to completely sort of rethink your whole network architecture. The other thing I want to drill into is your Cloud commentary. There's a comment that I saw, Erik, that really stood out. One of the folks said, "I would like to see the data centers be completely deleted, if you will, or closed down." I think we're going to see, you know, a lot more of this, obviously. Not only from the standpoint of, and you heard this a lot, the kind of pay-by-the-drink model, but just generally getting rid of all that sort of so-called non-differentiated heavy lifting, as we often hear about.

>> That is an extreme comment. I don't think everyone feels that way. But, yes, the comment was made, and we've heard the comment from other people. As you and I both know, the larger the enterprise, the harder it is to go completely SaaS. But yeah, when a situation like this happens and they see the inflexibility of their on-prem infrastructure, yes, it becomes something that really has to be addressed, and it can become a permanent change. I was also shocked about that comment. That gentleman also stated that his executives outside of the IT area, the CEO, the CFO, had never, ever, ever wanted to discuss Cloud. They did not want to discuss work from home. They did not want to discuss remote access. He said that conversation has changed immediately, and to the credit of the actual IT companies out there, the technology companies, they're doing everything they can with this opportunity to make that happen.

>> Yeah, and so you're right about the whole work-from-home conversation. To your point earlier, Erik, big chunks of the COVID, the post-COVID world, are going to remain permanent. Guys, bring up the SaaS slide, if you will. The SaaS commentary, "As-a-Service saved our SaaS," the "wittiest quip award" going to the ETR. You know, what's very interesting to hear from folks, in fact I think somebody even called out, "Hey, you know, we expected Oracle to be auditing us, but they're actually being supportive, as is IBM." Salesforce was an interesting comment, Erik. One of the folks said they would share accounts on-prem, but when they all went to work from home, they had to actually buy some more. You also got Cisco with big props. Microsoft was called out. A lot of them are actually allowing organizations to defer payments. So the SaaS vendors actually got very high marks, didn't they?
>> They really did, and even, I wrote that summary, and it was difficult to write that about Oracle, because we all know that they're infamous for auditing their own customers in 2009, right after we came out of the financial crisis. They have notoriously been a... I don't know if they found religion and decided to be nice to their customers, but every single person mentioned them as one of the vendors that was actually helping. That was very shocking. And we all know that when bad situations happen, people become opportunistic. And right now it's really seeming that the SaaS vendors understand that they need a long-term relationship with these customers, and they're being altruistic instead, which is really nice.

>> Yeah, I think that anybody with a Cloud realizes that, hey, we have an opportunity here in the lifetime value of that customer, whereas maybe in 2009, when Oracle didn't have a Cloud, they had to get people in a headlock to try to preserve their, you know, income statement. Let's go to the networking drill-down, guys, that next slide, because Fortinet, some of the things we've been reporting on is the sort of divergence in valuations between Fortinet and Palo Alto. Before this whole thing hit, Fortinet has done a really good job with its Cloud offerings. Palo Alto struggles a little bit with trying to figure out the sales compensation, is maybe a little bit behind. Although both companies got strong props, and I've talked to a number of customers, Palo Alto is going to be in the mix. Fortinet, from a Cloud standpoint, seems to be doing quite well. Obviously networking, Cisco is the big gorilla there. But we also got call-outs from guys like Trend Micro, which was interesting, from some of the folks. So, your thoughts on this, Erik.

>> Yeah, I'll start on the networking side, because this is something that I've really dug into quite a lot, not only in this panel but in a lot of interviews, and it really seems as if, as networking refresh starts to come up, and it's coming up with a lot of large enterprises, when your network refresh comes up, people are going to do an RFP for SD-WAN. They are sick and tired of paying MPLS network vendors, and they really want to look at something else. That was even prior to this situation. Now what we're hearing is this is a permanent change. I particularly had one person say, and I wanted to find this quote real quickly if I can, but basically they were saying that, "From a permanency perspective, the freedom from MPLS will reduce our network spend by over half while more than doubling or tripling our bandwidth." You can't ignore that. You're going to save me money and triple my bandwidth, and hey, by the way, my refresh is due. It's something that's coming, and it's going to happen. And yes, you mentioned a few, right? There's Viptela, there's VeloCloud, there's some big players like Cisco. Palo Alto just acquired CloudGenix in the midst of all of this. They just went and got an SD-WAN player themselves. And they just keep acquiring a portfolio to shift from their on-prem to next-generation. It's going to take some time, because 70% plus of their revenues is still on-prem hardware, but I do believe that the portfolio they're creating is the way the world is moving. And that's just one comment on the traditional networking versus the next-generation SD-WAN.

>> And the customers have indicated, you know, it's not easy just to get off of their MPLS networks. I mean, it takes time; it's like slowly pulling off the bandaid.
But, like many things, COVID-19 is sort of accelerating that. We haven't talked about digital transformation. That came up as maybe a more strategic initiative, but one that very clearly has legs.

>> You know, David, it's very simple. You just said it. People, when things are going well and they're comfortable, they don't change. And that's the same for an enterprise or a company. Hey, everything's great, our revenue's fine. Why would we do this? We'll worry about that next year. Then something like this happens, and you realize, wow, we've been dragging our feet. That digital transformation that we've been talking about, and we've been a little bit slow to accept, we need to accept it, we need to move now. And yes, it was another one of the major themes, and it sounds silly for researchers like you and me, because we know this is a theme. We know Cloud adoption is there, we know digital transformation is there. But there are still a lot of people that haven't moved as quickly as they should, and this is going to be that final catalyst to get them there, without a doubt. Quickly on your point about Fortinet: I was actually very impressed with the commentary that came from that, because Fortinet is sometimes one of those names that you think of that maybe plays in a smaller pool or isn't as big as some of the 800-pound gorillas out there. But in other interviews besides this one I've heard the phrase "Forti-everything" coined. So through R&D and through acquisition, Fortinet has really expanded the portfolio, and right now is their time to shine, because when you have smaller satellite, you know, offices and branches that you need to connect, they're really, really good at it. And you don't always want to call a Palo Alto and pay that price when you have smaller branch offices. And I actually, I was glad you brought up Fortinet, because it's not a name that we get to herald that often, and it was deserving from this panel.

>> Yeah, and, you know, companies that can secure gateways, secure endpoints, are obviously going to have momentum. Zscaler came up, you know. I think that, and I'll tell you, I've done a couple of Breaking Analyses on security, and Fortinet has been strong in two dimensions. You know, ETR, as our audience is I think getting to know, we really look at two key metrics. One is net score, which is a measure of spending momentum, and the other is market share, which is a measure of pervasiveness. And companies like Fortinet, in security, show up on both of those dimensions, so it's notable.

>> Yes, it certainly is. And I'm glad you brought up Zscaler too. Very recently, by client request, we did very in-depth research on Zscaler versus Palo Alto Prisma Access, and they were very interested. This was before all this happened, you know. Does Palo Alto have a chance of catching up, taking share from Zscaler? And I've had the pleasure, myself, of personally hosting Jay, the CEO of Zscaler, at an event in New York City, and I have nothing but incredible respect for the company. But what we found out through this research is Zscaler, at the moment, their technology is still ahead, according to the answers. There's no doubt. However, there doesn't seem to be any real secret sauce that will stop Palo Alto from catching up. So we do believe the parity of feature sets will shrink over time. And then it will come down to, Palo Alto obviously has a wider end user base. Now, what's happening today might change that.
Because if I had to make a decision right now for my company on secure web gateway, I'm still probably going to go to Zscaler. It's the name. If I had to make that choice a year from now, Palo Alto might have a better chance. So in this panel, as you brought up, Zscaler was mentioned numerous times as just the wave of the future, along with CASB brokers, right? Whether you're talking about a Netskope or a Forcepoint, all those people that also play in the CASB space to secure your access. Zero trust is no longer a marketing-hype term. It is real, and it is becoming more real by the week.

>> And so I want to kind of end on one of the other comments that really struck me, because we're constantly talking about, okay, do you go with a portfolio, a suite of services, or do you go with best of breed? What about startups? Are startups more risky in a crisis like this? And one of your panelists, I just love this comment, he said, "One of the things that I've always done," he said, "You always hear about the guy, oh, we're going to go to the Gartner, we're going to check out the Magic Quadrant, we'll pick out three guys in the upper right-hand corner and test them out." He says, "One of the things I always like to do, I'll pick two from the upper right and I'll take one from the lower left," one of the emerging techs, "and I'll give 'em a shot." It won't win every time, but then he called out FireEye as one of the organizations that he found early that gave them competitive advantage.

>> Right.

>> Love that comment.

>> It's a great comment. And honestly, if you're in charge of procurement, you'd be stupid not to do that. Not only just to see what the technology is, but now I can play you off the big guys, because I have negotiating leverage and I can say, oh, well, I could always just take their contract. So it's silly not to do it from a business perspective. But from a technology perspective, here's what we kept hearing from these people about the smaller vendors. My partner and colleague Peter Steube and I did the hosting together, and we asked this question really believing that the financial insecurity of the moment and the times would make smaller vendors not viable. We heard the exact opposite. What our panelists said was, "No, I'd be happy to work with a smaller vendor right now, because they're going to give me pricing flexibility, they're going to work with me right now. I don't need to pay them upfront, because we're seeing a permanent shift from CapEx to OpEx, and the smaller vendors are willing to work with me, and I can pay them later." So we were actually surprised to hear that, and glad to hear it. Because, to connect to your other point, the other person who was talking about security and the platform approach versus best of breed said, "Listen, with platform approaches you're already with the vendor, you can bundle a little bit. But the problem is, if you're just going to acquire a new technology every time there's a new threat, the bad guys are just going to switch the threat. And you can't acquire indefinitely. So therefore, best of breed with security will always beat platform." And that's kind of a message to Palo Alto and Cisco, in my opinion, because they seem to be the ones fighting that out. Even Microsoft now, trying to say they're a platform approach in security.

>> Well, and this says to me the security business, as we predicted, is going to stay fragmented, because you're still going to get that best of breed.
You know, just like Cloud is going to be fragmented, and it's, you know, multiple vendors. Ever since I've been in this business, people have been trying to consolidate the number of vendors, but technology moves so quickly, it gives competitive advantage. Erik, awesome! Thank you so much for joining us. I'm looking forward to next Tuesday with the next panel, and we'd love to have you back to talk about it anytime. You're a great guest, thanks so much.

>> Certainly. I'll do my best to get a better AV connection the next time, guys, I apologize for that. But it was great talking to you tonight.

>> Hey, we're all learning, you know. So, thank you everybody for watching. This is Dave Vellante for theCUBE, and we'll see you next time. (upbeat music)
Sue Barsamian | International Women's Day
(upbeat music)

>> Hi, everyone. Welcome to theCUBE's coverage of International Women's Day. I'm John Furrier, host of theCUBE. As part of International Women's Day, we're featuring some of the leading women in business and technology, from developers to all types of titles up to the executive level. And one topic that's really important is called getting a seat at the table: board makeup, having representation on corporate boards, private and public companies. It's been a big push. And former technology operating executive and corporate board member, she's a board machine, Sue Barsamian, formerly with HPE, Hewlett Packard Enterprise. Sue, great to see you. CUBE alumni, distinguished CUBE alumni. Thank you for coming on.

>> Yes, I'm very proud of my CUBE alumni title.

>> I'm sure it opens a lot of doors for you. (Sue laughing) We're psyched to have you on. This is a really important topic, and I want to get into the whole, as women advance up and they're sitting on the boards, they can implement policy, and there's governance. Obviously public companies have very strict oversight, and not strict, but like formal. Private boards have to operate, be nimble. They don't have to share all their results. But still, boards play an important role in the success of scaled-up companies. So super important that representation there is key.

>> Yes.

>> I want to get into that, but first, before we get started, how did you get into tech? How did it all start for you?

>> Yeah, a long time ago, I was an electrical engineering major. I came out in 1981 when, you know, opportunities for engineering... I went to Kansas State as an undergrad, and basically in those days you went to Texas and did semiconductors, you went to Atlanta and did communication satellites, you went to Boston, or you went to Silicon Valley. And for me, that wasn't too hard a choice. I ended up going west and really, I guess, embarked on a 40-year career in Silicon Valley, and absolutely loved it. Largely software, but some time on the hardware side. Started out in networking, but largely software. And then, you know, four years ago I transitioned to my next chapter, which is the corporate board director. And again, focused on technology software and cybersecurity boards.

>> For the folks watching, we could probably do another segment about your operating career, but you rose through the ranks and became a senior operating executive at the biggest companies in the world, Hewlett Packard Enterprise and others. A very great career, okay. And so now you've kind of put that on pause, and you're moving on to the next chapter, which is being a board director. What inspired you to be a board director for multiple public companies and multiple private companies? Well, how many companies are you on? But what's the inspiration? What's the inspiration? First tell me how many board seats you're on, and then what inspired you to become a board director.

>> Yeah, so I'm on three public, and you are limited in terms of the number of publics that you can do to four. So I'm on three public, and I'm on four private, from a tech perspective. And those range from, you know, a $4 billion in revenue public company down to a 35-person private company. So I've got the whole range.

>> So you're like freelancing, I mean, what is it like? It's a full-time job, obviously. It's a lot of work involved.

>> Yeah, yeah, it's...

>> John: Why are you doing it?
>> Well, you know, so I retired from being an operating executive after 37 years. And, but I loved, I mean, it's tough, right? It's tough these days, particularly with all the pressures out there in the market, not to mention the pandemic, et cetera. But I loved it. I loved working. I loved having a career, and I was ready to back off on, I would say, the stresses of quarterly results and the stresses of international travel. You have so much of it. But I wasn't ready to back off from being involved and engaged and continuing to learn new things. I think this is why you come to tech, and for me, why I went to the Valley to begin with, was really that energy and that excitement, and it's like it's constantly reinventing itself. And I felt like that wasn't over for me. And I thought, because I hadn't done boards before I retired from operating roles, I thought, you know, that would fill the bill. And honestly, it has exceeded expectations.

>> In a good way. You feel good about where you're at and...

>> Yeah.

>> What you went in, what was the expectation going in, and what surprised you? And were there people along the way that kind of gave you some pointers, or don't do this, stay away from that? Take us through your experiences.

>> Yeah, honestly, there is an amazing network of technology board directors, you know, in the US and specifically in the Valley. And we are all incredibly supportive. We have groups where we get together as board directors, and we talk about topics, and we share best practices and stories, and so I underestimated that, right? I thought I was going to enter this chapter where I would be largely giving back after 37 years. You've learned a little bit, right? What I underestimated was just the power of continuing to learn and being surrounded by so many amazing people. When, you know, when you do multiple boards, your learnings are just multiplied, right? Because you see not just one model but many models. You see not just one problem but many problems. Not just one opportunity but many opportunities. And I underestimated how great that would be for me from a learning perspective, and then your ability to share from one board to the other board, because all of my boards are companies that are also quite close to each other; the executives collaborate. So that has turned out to be really exciting for me.

>> So you had the stressful job. You rose to the top of the ranks, quarterly shot-clock earnings, and it's hard charging. It's like, you know, being an athlete, as we say, a tech athlete. You're a tech athlete. Now you're taking that to the next level, which is now you're juggling multiple operational kinds of things, but not with super pressure. But there's still a lot of responsibility. I know there's one board where you've got the compensation committee, I mean, there's work involved. It's not like you're clipping coupons and having pizza.

>> Yeah, no, it's real work. Believe me, it's real work. But I don't know how long it took me to stop waking up and looking at my phone and thinking somebody was going to be dropping their forecast, right? Just that pressure of the number, and as a board member, obviously you are there to support and help guide the company, and you feel, you know, you feel the pressure and the responsibility of what that role entails, but it's not the same as the frontline pressure every quarter. It's different. And so I did the first type. I loved it, you know. I'm loving this second type.
>> You know, the retirement, it's always a cliche these days, but it's not really like what people think it is. It's not like getting a boat, going fishing or whatever. It's doing whatever you want to do; that's what retirement is. And you've chosen to stay active. Your brain's being tested, and you're working it, having fun without all the stress. But it's enough; it's like going to the gym. You're not doing a hardcore workout, but you're working out the brain.

>> Yeah, no, for sure. It's just a different, it's just a different model. But the, you know, the level of conversations, the level of decisions, all of that is quite high. Which again, I like, yeah.

>> Again, you really can't talk about some of the fun questions I want to ask, like what are the valuations like? How's the market, your headwinds? Are there tailwinds?

>> Yes, yes, yes. It's an amazing, it's an amazing market right now with, as you know, counter-indicators everywhere, right? Something's up, something's down, you know. Consumer spending's up, therefore interest rates go up, and, you know, employment's down, and so, or unemployment's down. And so it's hard. Actually, I really empathize with, you know, and have a great deal of respect for, the CEOs and leadership teams of my board companies, because, you know, I kind of retired from the operating role, and then everybody else had to deal with running a company during a pandemic, and then running a company through the great resignation, and then running a company through a downturn. You know, those are all tough things, and I have a ton of respect for any operating executive who's navigating through this and leading a company right now.

>> I'd love to get your take on the board conversations at the end if we have more time, what the mood is, but I want to ask you about one more thing real quick before we go to the next topic: you're a retired operating executive, you have multiple boards, so you've got your hands full. I noticed there's a lot of amazing leaders, other female tech athletes, joining boards, but they also have full-time jobs.

>> Yeah.

>> And so what's your advice? Because I know there's a lot of networking, a lot of sharing going on. There's kind of a balance between how much you can contribute on the board versus doing the day job, but there's a real need for more women on boards, so yet there's a lot going on with boards. What's the current state of the union, if you will, state of the market, relative to people in their careers and the stresses?

>> Yeah.

>> Because you left one and jumped all in there.

>> Yeah.

>> Some can't do that. They can't be on five boards, but they're on a few. What's the...?

>> Well, and you know, if you're an operating executive, you wouldn't be on five boards, right? You would be on one or two. And so I spend a lot of time now bringing along the next wave of women and helping them, both in their careers but also to get a seat at the table on a board. And I'm very vocal about telling people not to do it the way I did it. There's no reason for it to be sequential. You can, you know, I thought I was so busy and was traveling all the time, and yes, all of that was true, but, and maybe I should say, you know, you can still fit in a board. And what I see now is that your learnings are so exponential with outside perspective that I believe I would've been an even better operating executive had I done it earlier. I know I would've been an even better operating executive had I done it earlier. And so my advice is don't do it the way I did it.
You know, it's worked out fine for me, but hindsight's 20/20, I would...

>> If you could go back and do a mulligan or a redo, what would you do?

>> Yeah, I would get on a board earlier, full stop, yeah.

>> Board, singular, plural?

>> Well, I really, I don't think as an operating executive you can do... you could do one, maybe two. I wouldn't go beyond that, and I think that's fine.

>> Yeah, totally makes sense. Okay, I've got to ask you about your career. I know, technical, you came in at that time in the market, I remember when I broke into the business, very male dominated, and now it's much better. When you went through the ranks as a technical person, I know you had some blockers and definitely some, probably some people like, well, you know. We've seen that. How did you handle that? What were some of the key pivot points in your journey? And we've had a lot of women tell their stories here on theCUBE, candidly, like, hey, I was going to tell that professor, I'm going to sit in the front row. I'm going to, I'm getting two degrees, you know, robotics and aerospace. So, but they were challenged, even with the aspiration to do tech. I'm not saying that was something that you had, but have you had experiences like that that you overcame? What were those key points, and how did you handle them, and how does that help people today?

>> Yeah, you know, I have to say, you know, and not discounting that obviously this has been a journey for women, and there are a lot of things to overcome, both in the workforce and also just balancing life, honestly. And they're all real. There's also a story of incredible support, and, you know, I'm the type of person where if somebody blocked me or didn't like me, I tended to just, you know, think it was me and, like, work harder and get around them, and I'm sure that some of that was potentially gender related. I didn't interpret it that way at the time. And I was lucky to have amazing mentors, many, many, many of whom were men, you know, because they were in the positions of power, and they made a huge difference on my career, huge. And I also had amazing female mentors, Meg Whitman, Ann Livermore at HPE, who you know well. So I had both, but, you know, when I look back on the people who made a difference, there are as many men on the list as there are women.

>> Yeah, and that's a learning there. Create those coalitions, not just one or the other.

>> Yeah, yeah, yeah, absolutely.

>> Well, I've got to ask you about the, well, you brought up the pandemic. This has come up in some interviews this year, a little bit last year on International Women's Day, but this year it's resonating, and I would never ask it in an interview. I saw an interview once where a host asked a woman, how do you balance it all? And I was just like, no one asks men that. And so it's like, but with remote work, it's come up now, the word empathy, around people knowing each other's personal situations. In other words, when remote work happened, everybody went home. So we all got a glimpse of the backdrop. You can see what their personal life was on Facebook. We were just commenting before we came on camera about that. So remote work really kind of opened up this personal side of everybody, men and women.

>> Yeah.

>> So I think this brings this new empathy kind of vibe, or authentic self, people call it. Is remote work an opportunity or a threat for the advancement of women in tech?

>> It's a much-debated topic. I look at it as an opportunity for many of the reasons that you just said.
First of all, let me say that when I was an operating executive and would try to create an environment on my team that was family supportive, I would do that equally for young or, you know, early to mid-career women as I did for early to mid-career men. And the reason is I needed those men, you know; chances are they had a working spouse at home, right? I needed them to be able to share the load. It's just as important to the women that companies give, you know, the partner, male or female, the partner support and the ability to share the load, right? So to me it's not just a woman thing. It's women and men, and I always tried to create the environment where it was okay to go to your soccer game. I knew you would be online later in the evening when the kids were in bed, and that was fine. And I think the pandemic has democratized that and made that, you know, made that kind of an everyday occurrence.

>> Yeah, the baby walks in. They're in the Zoom call. The dog comes in. The leaf blower going outside the window. I've seen it all on theCUBE.

>> Yeah, and people don't try to pretend anymore that, like, you know, the house is clean, the dog's behaved. You know, I mean, it's just real, and it's authentic, and I think that's healthy.

>> Yeah.

>> I do, you know, I also love the office, and, you know, I've got a 31-year-old and a soon-to-be 27-year-old daughter, two daughters. And, you know, they love going into the office, and I think about when I was their age, how just charged up I would get from being in the office. I also see how great it is for them to have a couple of days a week at home, because you can get a few things done in between Zoom calls that you don't have to end up piling onto the weekend, and, you know, so I think it's a really healthy mix now. Most tech companies are not mandating five days in. Most tech companies are at two to three days in. I think that's a really good combination.

>> It's interesting how people are changing their culture to get together more as groups and even events. I mean, while I've got you, I might as well ask you, what are the board conversations around, you know, the old conferences? You know, before the pandemic, every company had like a user conference. Right, now it's like, well, do we really need to have that? Maybe we do smaller, and we do digital. Have you seen how companies are handling the in-person? Because that's where the relationships are really formed, face-to-face, but not everyone's going to be going. But now it's certainly clearly back to face-to-face. We're seeing that with theCUBE, as you know.

>> Yeah, yeah.

>> But the numbers aren't coming back, and the numbers aren't that high, but the stakeholders...

>> Yeah.

>> And the numbers are actually higher if you count digital.

>> Yeah, absolutely. But, you know, also on digital there's fatigue from 100% digital, right? It's a hybrid. People don't want to be 100% digital anymore, but they also don't want to go back to the days when everybody got on a plane for every meeting, every call, every sales call. You know, I'm seeing a mix on user conferences. I would say two-thirds of my companies are back, but not at the expense level that they were at on user conferences. We spend a lot of time getting updates on that, because nobody has put, interestingly enough, nobody has put T&E, travel and expense, back to pre-pandemic levels. Nobody. So everybody's pulled back on the number of trips.
You know, marketing events are being very scrutinized, but I think very effective. We're doing a lot of, and, you know, these were part of the old model as well, like some things, some things just recycle, but you know, there's a lot of CIO and customer round tables in regional cities. You know, those are quite effective right now because people want some face-to-face, but they don't necessarily want to get on a plane and go to Las Vegas in order to do it. I mean, some of them are, you know, there are a lot of things back in Las Vegas. >> And think about the meetings that when you were an operating executive. You got to go to the sales kickoff, you got to go to this, go to that. There were mandatory face-to-faces that you had to go to, but there was a lot of travel that you probably could have done on Zoom. >> Oh, a lot, I mean. >> And then the productivity to the family impact too. Again, think about again, we're talking about the family and people's personal lives, right? So, you know, got to meet a customer. All right. Salesperson wants you to get in front of a customer, got to fly to New York, take a red eye, come on back. Like, I mean, that's gone. >> Yeah, and oh, by the way, the customer doesn't necessarily want to be in the office that day, so, you know, they may or may not be happy about that. So again, it's and not or, right? It's a mix. And I think it's great to see people back to some face-to-face. It's great to see marketing and events back to some face-to-face. It's also great to see that it hasn't gone back to the level it was. I think that's a really healthy dynamic. >> Well, I'll tell you that from our experience while we're on the topic, we'll move back to the International Women's Day is that the productivity of digital, this program we're doing is going to be streamed. We couldn't do this face-to-face because we had to have everyone fly to an event. We're going to do hundreds of stories that we couldn't have done. We're doing it remote. Because it's better to get the content than not have it. I mean it's offline, so, but it's not about getting people to the event and watch the screen for seven hours. It's pick your interview, and then engage. >> Yeah. >> So it's self-service. So we're seeing a lot, the new user experience kind of direct to consumer, and so I think there will be an, I think there's going to be a digital first class citizen with events, so that that matches up with the kind of experience, but the offline version. Face-to-face optimized for relationships, and that's where the recruiting gets done. That's where, you know, people can build these relationships with each other. >> Yeah, and it can be asynchronous. I think that's a real value proposition. It's a great point. >> Okay, I want to get, I want to get into the technology side of the education and re-skilling and those things. I remember in the 80s, computer science was software engineering. You learned like nine languages. You took some double E courses, one or two, and all the other kind of gut classes in school. Engineering, you had the four class disciplines and some offshoots of specialization. Now it's incredible the diversity of tracks in all engineering programs and computer science and outside of those departments. >> Yeah. >> Can you speak to the importance of STEM and the diversity in the technology industry and how this brings opportunity to lower the bar to get in and how people can stay in and grow and keep leveling up? 
>> Yeah, well look, we're constantly working on how to, how to help the incoming funnel. But then, you know, at a university level, I'm on the foundation board of Kansas State where I got my engineering degree. I was also Chairman of the National Action Council for Minorities in Engineering, which was all about diversity in STEM and how do you keep that pipeline going because honestly the US needs more tech resources than we have. And if you don't tap into the diversity of our entire workforce, we won't be able to fill that need. And so we focused a lot on both the funnel, right, that starts at the middle school level, particularly for girls, getting them in, you know, the situation of hands-on comfort level with coding, with robot building, you know, whatever gives them that confidence. And then keeping that going all the way into, you know, university program, and making sure that they don't attrit out, right? And so there's a number of initiatives, whether it's mentoring and support groups and financial aid to make sure that underrepresented minorities, women and other minorities, you know, get through the funnel and stay, you know, stay in. >> Got it. Now let me ask you, you said, I have two daughters. You have a family of girls too. Is there a vibe difference between the new generation and what's the trends that you're seeing in this next early wave? I mean, not maybe, I don't know how this is in middle school, but like as people start getting into their adult lives, college and beyond what's the current point of view, posture, makeup of the talent coming in? >> Yeah, yeah. >> Certain orientations, do you see any patterns? What's your observation? >> Yeah, it's interesting. So if I look at electrical engineering, my major, it's, and if I look at Kansas State, which spends a lot of time on this, and I think does a great job, but the diversity of that as a major has not changed dramatically since I was there in the early 80s. Where it has changed very significantly is computer science. There are many, many university and college programs around the country where, you know, it's 50/50 in computer science from a gender mix perspective, which is huge progress. Huge progress. And so, and to me that's, you know, I think CS is a fantastic degree for tech, regardless of what function you actually end up doing in these companies. I mean, I was an electrical engineer. I never did core electrical engineering work. I went right into sales and marketing and general management roles. So I think, I think a bunch of, you know, diverse CS graduates is a really, really good sign. And you know, we need to continue to push on that, but progress has been made. I think the, you know, it kind of goes back to the thing we were just talking about, which is the attrition of those, let's just talk about women, right? The attrition of those women once they got past early career and into mid-career then was a concern, right? And that goes back to, you know, just the inability to, you know, get it all done. And that I am hopeful is going to be better served now. >> Well, Sue, it's great to have you on. I know you're super busy. I appreciate you taking the time and contributing to our program on corporate board membership and some of your story and observations and opinions and analysis. Always great to have you and call you a contributor for theCUBE. You can jump on on one more board, be one of our board contributors for our analysts. (Sue laughing) >> I'm at capacity. (both laughing) >> Final, final word. 
What's the big seat at the table issue that's going well and areas that need to be improved? >> So I'll speak for my boards because they have made great progress in efficiency. You know, obviously with interest rates going up and the mix between growth and profitability changing in terms of what investors are looking for. Many, many companies have had to do a hard pivot from grow at all costs to healthy balance of growth and profit. And I'm very pleased with how my companies have made that pivot. And I think that is going to make much better companies as a result. I think diversity is something that has not been solved at the corporate level, and we need to keep working it. >> Awesome. Thank you for coming on theCUBE. CUBE alumni now contributor, on multiple boards, full-time job. Love the new challenge and chapter you're on, Sue. We'll be following, and we'll check in for more updates. And thank you for being a contributor on this program this year and this episode. We're going to be doing more of these quarterly, so we're going to move beyond once a year. >> That's great. (cross talking) It's always good to see you, John. >> Thank you. >> Thanks very much. >> Okay. >> Sue: Talk to you later. >> This is theCUBE coverage of IWD, International Women's Day 2023. I'm John Furrier, your host. Thanks for watching. (upbeat music)
Manya Rastogi, Dell Technologies & Abdel Bagegni, Telecom Infra Project | MWC Barcelona 2023
>> TheCUBE's live coverage is made possible by funding from Dell Technologies. Creating technologies that drive human progress. (upbeat music) >> Welcome back to Spain, everybody. We're here at the Theater Live and MWC 23. You're watching theCUBE's Continuous Coverage. This is day two. I'm Dave Vellante with my co-host, Dave Nicholson. Lisa Martin is also in the house. John Furrier out of our Palo Alto studio covering all the news. Check out silicon angle.com. Okay, we're going to dig into the core infrastructure here. We're going to talk a little bit about servers. Manya Rastogi is here. She's in technical marketing at Dell Technologies. And Abdel Bagegni is technical program manager at the Telecom Infra Project. Folks, welcome to theCUBE. Good to see you. >> Thank you. >> Abdel, what is the Telecom Infras Project? Explain to our audience. >> Yeah. So the Telecom Infra Project is a US based non-profit organization community that brings together different participants, suppliers, vendors, operators SI's together to accelerate the adoption of open RAN and open interface solutions across the globe. >> Okay. So that's the mission is open RAN adoption. And then how, when was it formed? Give us the background and some of the, some of the milestones so far. >> Yeah. So the telecom infra project was established five years ago from different vendor leaders and operators across the globe. And then the mission was to bring different players in to work together to accelerate the adoption of, of open RAN. Now open RAN has a lot of potential and opportunities, but in the same time there's challenges that we work together as a community to facilitate those challenges and overcome those barriers. >> And we've been covering all week just the disaggregation of the network. And you know, we've seen this movie sort of before playing out now in, in telecom. And Manya, this is obviously a compute intensive environment. We were at the Dell booth earlier this morning poking around, beautiful booth, lots of servers. Tell us what your angle is here in this marketplace. >> Yeah, so I would just like to say that Dell is kind of leading or accelerating the innovation at the telecom edge with all these ruggedized servers that we are offering. So just continuing the mission, like Abdel just mentioned for the open RAN, that's where a lot of focus will be from these servers will be, so XR 8000, it's it's going to be one of the star servers for telecom with, you know, offering various workloads. So it can be rerun, open run, multi access, edge compute. And it has all these different features with itself and the, if we, we can talk more about the performance gains, how it is based on the Intel CPUs and just try to solve the purpose like along with various vendors, the whole ecosystem solve this challenge for the open RAN. >> So Manya mentioned some of those infrastructure parts. Does and do, do you say TIP or T-I-P for short? >> Abdel: We say TIP. >> TIP. >> Abdel: T-I-P is fine as well. >> Does, does, does TIP or T-I-P have a certification process or a, or a set of guidelines that someone like Dell would either adhere to or follow to be sort of TIP certified? What does that look like? >> Yeah, of course. So what TIP does is TIP accredits what solutions that actually work in a real commercial grade environment. So what we do is we bring the different players together to come up with the most efficient optimized solution. And then it goes through a process that the community sets the, the, the criteria for and accepts. 
And then once this is accredited it goes into TIP exchange for other operators and the participants and the industry to adopt. So it's a well structured process and it's everything about how we orchestrate the industry to come together and set those requirements and and guidelines. Everything starts with a use case from the beginning. It's based on operators requirements, use cases and then those use cases will be translated into a solution that the industry will approve. >> So when you say operator, I can think of that sort of traditionally as the customer side of things versus the vendor side of things. Typically when organizations get together like TIP, the operator customer side is seeking a couple of things. They want perfect substitutes in all categories so that they could grind vendors down from a price perspective but they also want amazing innovation. How do you, how do you deliver both? >> Yeah, I mean that's an excellent question. We be pragmatic and we bring all players in one table to discuss. MNO's want this, vendors can provide a certain level and we bring them together and they discuss and come up with something that can be deployed today and future proof for the future. >> So I've been an enterprise technology observer for a long time and, you know, I saw the, the attempt to take network function virtualization which never really made much of an impact, but it was a it was the beginning of the enterprise players really getting into this market. And then I would see companies, whether it was Dell or HPE or Cisco, they'd take an X 86 server, put a cool name on it, edge something, and throw it over the fence and that didn't work so well. Now it's like, Manya. We're starting to get serious. You're building relationships. >> Manya: Totally. >> I mentioned we were at the Dell booth you're actually building purpose built systems now for this, this segment. Tell us what's different about this market and the products that you're developing for this market than say the commercial enterprise. >> So you are absolutely right, like, you know, kind of thinking about the journey, there has been a lot of, it has been going for a long time for all these improvements and towards going more open disaggregated and overall that kind of environment and what Dell brings together with our various partners and particularly if you talk about Intel. So these servers are powered by the players four gen intel beyond processors. And so what Intel is doing right now is providing us with great accelerators like vRAN Boost. So it increases performance like doubles what it was able to do before. And power efficiency, it has been an issue for a long, long time and it still continues but there is some improvement. For example 20% reduction overall with the power savings. So that's a step forward in that direction. And then we have done some of our like own testing as well with these servers and continuing that, you know it's not just telecom but also going towards Edge or inferencing like all these comes together not just X 30,000 but for example XR 56 10, 70, 76 20. So these are three servers which combines together to like form telecom and Edge and covers altogether. So that's what it is. >> Great, thank you. So Abdel, I mean I think generally people agree that in the fullness of time all radio access networks are going to be open, right? It's just a matter of okay, how do we get there? How do we make sure that it has the same, you know, quality of service characteristics. 
So where are we on that journey from your perspective? And maybe you could project what it's going to look like over this decade, 'cause it's going to take, you know, years. >> It's going to take a bit of time to mature and become a kind of plug-and-play of different units together. I think there was a bit of over-promising in the last few years on the acceleration of open RAN deployment. What TIP is trying to do is realize the pragmatic approach to open RAN deployment. Now, we know the innovation cannot happen when you have closed interfaces. When you allow small players to be within the market and bring value to the RAN areas, this is where the innovation happens. I think what will happen on the RAN side of things is that it will be driven by use cases and the operators. The minute that the operators can no longer depend on the closed-interface vendors, because there are use cases that require some open RAN functionality, be it the RIC or the SMO layers and the different configurations of the RUs, getting the servers to the DU side of things. This kind of modular scalability on this layer is when open RAN would boost. This would happen probably, yeah. >> Go ahead. >> Yeah, it would happen in the next few years. Not next year or the year after, but definitely something within the four to five years from now. >> I think it does feel like it's a second half of the decade, and you feel like the RAN intelligent controller is going to be a catalyst to actually sort of force the world into this open environment. >> Let's say that the promises that were given to the SON 10 years ago, the RIC is realizing, and the closed RAN vendors are developing a lot on the RIC side more than the other parts of the open RAN. So it will be a catalyst that would drive the innovation of open RAN, but only time will tell. >> And there are some naysayers. I mean, I've seen some, you know, very, very few, but I've seen some works that say, oh, the economics aren't there, it'll never get there. What do you say to that? That open RAN won't ever be as cost effective as, you know, closed networks. >> Open RAN will open innovations where small players have the opportunity to contribute to the RAN space. This opportunity is not given to small players today. Open RAN provides this kind of opportunity, and given that it's a path for innovation, I would say that, you know, there are different perspectives. Some people are making sure that, you know, the status quo is the way forward. But it would certainly put barriers on innovation, and this is not the way forward. >> Yeah. You can't protect the past in the future. My own personal opinion is that it doesn't have to be comparable from a TCO perspective, it can be close enough. It's the innovation. Same thing with, like, you watch the adoption of cloud. >> Exactly. >> Like cloud was more expensive, it's always more expensive to rent, but people seem to be doing public cloud, you know, because of the innovation capabilities and the developer capabilities. Is that a fair analogy in this space, do you think? >> I mean, this is what happens with all technologies. >> Yeah. >> Right? It starts out quite costly, and then the cost will start dropping down. 
I mean the, the cost of, of a megabyte two decades ago is probably higher than what it costly terabyte. So this is how technology evolves and it's any kind of comparison, either copper or even the old generation, the legacy generations could be a, a valid comparison. However, they need to be at a market demand for something like that. And I think the use cases today with what the industry is is looking for have that kind of opportunity to pull this kind of demand. But, but again, it needs to go work close by the what happens in the technology space, be it, you know we always talk about when we, we used to talk about 5G, there was a lot of hypes going on there. But I think once it realized in, in a pragmatic, in a in a real life situation, the minutes that governments decide to go for autonomous vehicles, then you would have limitations on the current closed RAN infrastructures and you would definitely need something to to top it up on the- >> I mean, 5G needs open RAN, I mean that's, you know not going to happen without it. >> Exactly. >> Yeah, yeah. But, but what is, but what would you say the most significant friction is between here and the open RAN nirvana? What are, what are the real hurdles that need to be overcome? There's obviously just the, I don't want to change we've been doing this the same way forever, but what what are the, what are the real, the legitimate concerns that people have when we start talking about open RAN? >> So I think from a technology perspective it will be solved. All of the tech, I mean there's smart engineers in the world today that will fix, you know these kind of problems and all of the interability, interruptability issues and, and all of that. I think it's about the mindset, the, the interfaces between the legacy core and RAN has been became more fluid today. We don't have that kind of a hard line between these kind of different aspects. We have the, the MEC coming closer to the RAN, we have the RAN coming closer to the Core, and we have the service based architectures in the Core. So these kind of things make it needs a paradigm shift between how operators that would need to tackle the open RAN space. >> Are there specific deployment requirements for open RAN that you can speak to from your perspective? >> For sure and going in this direction, like, you know evolution with the technology and how different players are coming together. Like that's something I wanted to comment from the previous question. And that's where like, you know these servers that Dell is offering right now. Specific functionality requirements, for example, it's it's a small server, it's short depth just 430 millimeters of depth and it can fit anywhere. So things like small form factor, it's it's crucial because if you, it can replace like multiple servers 10 years ago with just one server and you can place it like near a base band unit or to a cell site on top of a roof wherever. Like, you know, if it's a small company and you need this kind of 5G connection it kind of solves that challenge with this server. And then there are various things like, you know increasing thermals for example temperatures. It is classified like, you know kind of compliant with the negative 5 to 55 degree Celsius. And then we are also moving towards, for example negative 20 to 65 degree Celsius. Which is, which is kind of great because in situations where, which are out of our hands and you need specific thermals for those situations that's where it can solve that problem. 
>> Are those statistics and those measurements different than the old NEBS standards, the network equipment building standards? Or are they in line with that? >> It is a next step. So most of our servers that we have right now are negative five to fifty-five degrees Celsius, especially the extremely rugged server series, and this one, XR8000, which is telecom inspired, so it's focused on those customers. So we are trying to go a step ahead and also offer this additional temperature testing and, yeah, compliance. So, so it is. >> Awesome. So we, I said we were at the booth early today. Looks like some good traffic, people poking around at different, you know, innovations you got going. Some of the private network stuff is kind of cool. I'm like, how much does that cost? I think I might like one of those, you know, but- >> Private 5G home network. >> Right? Why not? Guys, great to have you on the show. Thanks so much for sharing. Appreciate it. >> Thank you. >> Thank you so much. >> Okay. For Dave Nicholson and Lisa Martin, this is Dave Vellante, theCUBE's coverage. MWC 23 live from the Fira in Barcelona. We'll be right back. (outro music)
Wen Phan, Ahana & Satyam Krishna, Blinkit & Akshay Agarwal, Blinkit | AWS Startup Showcase S2 E2
(gentle music) >> Welcome everyone to theCUBE's presentation of the AWS Startup Showcase. The theme is Data as Code; The Future of Enterprise Data and Analytics. This is the season two, episode two of the ongoing series of covering the exciting startups in the AWS ecosystem around data analytics and cloud computing. I'm your host, John Furrier. Today we're joined by great guests here. Three guests. Wen Phan, who's a Director of Product Management at Ahana, Satyam Krishna, Engineering Manager at Blinkit, and we have Akshay Agarwal, Senior Engineer at Blinkit as well. We're going to get into the relationship there. Let's get into. We're going to talk about how Blinkit's using open data lake, data house with Presto on AWS. Gentlemen, thanks for joining us. >> Thanks for having us. >> So we're going to get into the deep dive on the open data lake, but I want to just quickly get your thoughts on what it is for the folks out there. Set the table. What is the open data lakehouse? Why it is important? What's in it for the customers? Why are we seeing adoption around this because this is a big story. >> Sure. Yeah, the open data lakehouse is really being able to run a gamut of analytics, whether it be BI, SQL, machine learning, data science, on top of the data lake, which is based on inexpensive, low cost, scalable storage. And more importantly, it's also on top of open formats. And this to the end customer really offers a tremendous range of flexibility. They can run a bunch of use cases on the same storage and great price performance. >> You guys have any other thoughts on what's your reaction to the lakehouse? What is your experience with it? What's going on with Blinkit? >> No, I think for us also, it has been the primary driver of how as a company we have shifted our completely delivery model from us delivering in one day to someone who is delivering in 10 minutes, right? And a lot of this was made possible by having this kind of architecture in place, which helps us to be more open-source, more... where the tools are open-source, we have an open table format which helps us be very modular in nature, meaning we can pick solutions which works best for us, right? And that is the kind of architecture that we want to be in. >> Awesome. Wen, you know last time we chat with Ahana, we had a great conversation around Presto, data. The theme of this episode is Data as Code, which is interesting because in all the conversations in these episodes all around developers, which administrators are turning into developers, there's a developer vibe with data. And with opensource, it's software. Now you've got data taking a similar trajectory as how software development was with code, but the people running data they're not developers, they're administrators, they're operators. Now they're turning into DataOps. So it's kind of a similar vibe going on with branches and taking stuff out of and putting it back in, and testing it. Datasets becoming much more stable, iterating on machine learning algorithm. This is a movement. What's your guys reaction before we get into the relationships here with you guys. But, what's your reaction to this Data as Code movement? >> Yeah, so I think the folks at Blinkit are doing a great job there. I mean, they have a pretty compact data engineering team and they have some pretty stringent SLAs, as well as in terms of time to value and reliability. And what that ultimately translates for them is not only flexibility but reliability. 
So they've done some very fantastic work on a lot of automation, a lot of integration with code, and their data pipelines. And I'm sure they can give the details on that. >> Yes. Satyam and Akshay, you guys are engineers' software, but this is becoming a whole another paradigm where the frontline coding and or work or engineer data engineering is implementing the operations as well. It's kind of like DevOps for data. >> For sure. Right. And I think whenever you're working, even as a software engineer, the understanding of business is equally important. You cannot be working on something and be away from business, right? And that's where, like I mentioned earlier, when we realized that we have to completely move our stack and start giving analytics at 10 minutes, right. Because when you're delivering in 10 minutes, your leaders want to take decisions in your real-time. That means you need to move with them. You need to move with business. And when you do that, the kind of flexibility these softwares give is what enables the businesses at the end of the day. >> Awesome. This is the really kind of like, is there going to be a book called agile data warehouses? I don't think so. >> I think so. (laughing) >> The agile cloud data. This is cool. So let's get into what you guys do. What is Blinkit up to? What do you guys do? Can you take a minute to explain the company and your product? >> Sure. I'll take that. So Blinkit is India's biggest 10 minute delivery platform. It pioneered the delivery model in the country with over 10 million Indian shopping on our platform, ranging from everything: grocery staples, vegetables, emergency services, electronics, and much more, right. It currently delivers over 200,000 orders every day, and is in a hurry to bring the future of farmers to everyone in India. >> What's the relationship with Ahana and Blinkit? Wen, what's the tie in? >> Yeah, so Blinkit had a pretty well formed stack. They needed a little bit more flexibility and control. They thought a managed service was the way to go. And here at Ahana, we provide a SaaS managed service for Presto. So they engaged us and they evaluated our offering. And more importantly, we're able to partner. As a early stage startup, we really rely on very strong partners with great use cases that are willing to collaborate. And the folks at Blinkit have been really great in helping us push our product, develop our product. And we've been very happy about the value that we've been able to deliver to them as well. >> Okay. So let's unpack the open data lakehouse. What is it? What's under the covers? Let's get into it. >> Sure. So if bring up a slide. Like I said before, it's really a paradigm on being able to run a gamut of analytics on top of the open data lake. So what does that mean? How did it come about? So on the left hand side of the slide, we are coming out of this world where for the last several decades, the primary workhorse for SQL based processing and reporting and dashboarding use cases was really the data warehouse. And what we're seeing is a shift due to the trends in inexpensive scalable storage, cloud storage. The proliferation of open formats to facilitate using this storage to get certain amounts of reliability and performance, and the adoption of frameworks that can operate on top of this cloud data lake. So while here at Ahana, we're primarily focused on SQL workloads and Presto, this architecture really allows for other types of frameworks. And you see the ML and AI side. 
And like to Satyam's point earlier, offers a great amount of flexibility modularity for many use cases in the cloud. So really, that's really the lakehouse, and people like it for the performance, the openness, and the price performance. >> How's the open-source open side of it playing in the open-source? It's kind of open formats. What is the open-source angle on this because there's a lot of different approaches. I'm hearing open formats. You know, you have data stores which are a big part of seeing that. You got SQL, you mentioned SQL. There's got a mishmash of opportunities. Is it all coexisting? Is it one tool to rule the world or is it interchangeable? What's the open-source angle? >> There's multiple angles and I'll let definitely Satyam add to what I'm saying. This was definitely a big piece for Blinkit. So on one hand, you have the open formats. And what really the open formats enable is multiple compute engines to work on that data. And that's very huge. 'Cause it's open, you're not locked in. I think the other part of open that is important and I think it was important to Blinkit was the governance around that. So in particular Presto is governed by the Linux Foundation. And so, as a customer of open-source technology, they want some assurances for things like how's it governed? Is the license going to change? So there's that aspect of openness that I think is very important. >> Yeah. Blinkit, what's the data strategy here with lakehouse and you guys? Why are you adopting this type of architecture? >> So adding to what... Yeah, I think adding to Wen said, right. When we are thinking in terms of all these OpenStacks, you have got these open table formats, everything which is deployed over cloud, the primary reason there is modularity. It's as simple as that, right. You can plug and play so many different table formats from one thing to another based on the use case that you're trying to serve, so that you get the most value out of data. Right? I'll give you a very simple example. So for us we use... not even use one single table format. It's not that one thing solves for everything, right? We use both Hudi and Iceberg to solve for different use cases. One is good for when you're working for a certain data site. Icebergs works well when you're in the SQL kind of interface, right. Hudi's still trying to reach there. It's going to go there very soon. So having the ability to plug and play different formats based on the use case helps you to grow faster, helps you to take decisions faster because you now you're not stuck on one thing. They will have to implement it. Right. So I think that's what it is great about this data lake strategy. Keeping yourself cost effective. Yeah, please. >> So the enablement is basically use case driven. You don't have to be rearchitecturing for use cases. You can simply plug can play based on what you need for the use case. >> Yeah. You can... and again, you can focus on your business use case. You can figure out what your business users need and not worry about these things because that's where Presto comes in, helps you stitch that data together with multiple data formats, give you the performance that you need and it works out the best there. And that's something that you don't get to with traditional warehouse these days. Right? The kind of thing that we need, you don't get that. >> I do want to add. This is just to riff on what Satyam said. I think it's pretty interesting. 
So, it really allowed him to take the best-of-breed of what he was seeing in the community, right? So in the case of table formats, you've got Delta, you've got Hudi, you've got Iceberg, and they all have got their own roadmap and it's kind of organic of how these different communities want to evolve, and I think that's great, but you have these end consumers like Blinkit who have different maybe use cases overlapping, and they're not forced to pick one. When you have an open architecture, they can really put together best-of-breed. And as these projects evolve, they can continue to monitor it and then make decisions and continue to remain agile based on the landscape and how it's evolving. >> So the agility is a key point. Flexibility and agility, and time to valuing with your data. >> Yeah. >> All right. Wen, I got to get in to why the Presto is important here. Where does that fit in? Why is Presto important? >> Yeah. For me, it all comes down to the use cases and the needs. And reporting and dashboarding is not going to go away anytime soon. It's a very common use case. Many of our customers like Blinkit come to us for that use case. The difference now is today, people want to do that particular use case on top of the modern data lake, on top of scalable, inexpensive, low cost storage. Right? In addition to that, there's a need for this low latency interactive ability to engage with the data. This is often arises when you need to do things in a ad hoc basis or you're in the developmental phase of building things up. So if that's what your need is. And latency's important and getting your arms around the problems, very important. You have a certain SLA, I need to deliver something. That puts some requirements in the technology. And Presto is a perfect for that ideal use case. It's ideal for that use case. It's distributed, it's scalable, it's in memory. And so it's able to really provide that. I think the other benefit for Presto and why we're bidding on Presto is it works well on the data lakes, but you have to think about how are these organizations maturing with this technology. So it's not necessarily an all or nothing. You have organizations that have maybe the data lake and it's augmented with other analytical data stores like Snowflake or Redshift. So Presto also... a core aspect is its ability to federate or connect and query across different data sources. So this can be a permanent thing. This could also be a transitionary thing. We have some customers that are moving and slowly shifting their data portfolio from maybe all data warehouse into 80% data lake. But it gives that optionality, it gives that ability to transition over a timeframe. But for all those reasons, the latency, the scalability, the federation, is why Presto for this particular use case. >> And you can connect with other databases. It can be purpose built database, could be whatever. Right? >> Sure. Yes, yes. Presto has a very pluggable architecture. >> Okay. Here's the question for the Blinkit team? Why did you choose Presto and what led you to Ahana? >> So I'll take this better, over this what Presto sits well in that reach is, is how it is designed. Like basically, Presto decouples your storage with the compute. Basically like, people can use any storage and Presto just works as a query engine for them. 
So basically, it has a constant connectors where you can connect with a real-time databases like Pinot or a Druid, along with your warehouses like Redshift, along with your data lake that's like based on Hudi or Iceberg. So it's like a very landscape that you can use with the Presto. And consumers like the analytics doesn't need to learn the SQL or different paradigms of the querying for different sources. They just need to learn a single source. And, they get a single place to consume from. They get a single consumer on their single destination to write on also. So, it's a homologous architecture, which allows you to put a central security like which Presto integrates. So it's also based on open architecture, that's Apache engine. And it has also certain innovative features that you can see based on caching, which reduces a lot of the cost. And since you have further decoupled your storage with the compute, you can further reduce your cost, because now the biggest part of our tradition warehouse is a storage. And the cost goes massively upwards with the amount of data that you've added. Like basically, each time that you add more data, you require more storage, and warehouses ask you to write the data in their own format. Over here since we have decoupled that, the storage cost have gone down. It's literally that your cost that you are writing, and you just pay for the compute, and you can scale in scale out based on the requirements. If you have high traffic, you scale out. If you have low traffic, you scale in. So all those. >> So huge cost savings. >> Yeah. >> Yeah. Cost effectiveness, for sure. >> Cost effectiveness and you get a very good price value out of it. Like for each query, you can estimate what's the cost for you based on that tracking and all those things. >> I mean, if you think about the other classic Iceberg and what's under the water you don't know, it's the hidden cost. You think about the tooling, right, and also, time it takes to do stuff. So if you have flexibility on choice, when we were riffing on this last time we chatted with you guys and you brought it up earlier around, you can have the open formats to have different use cases in different tools or different platforms to work on it. Redshift, you can use Redshift here, or use something over there. You don't have to get locking >> Absolutely. >> Satyam & Akshay: Yeah. >> Locking is a huge problem. How do you guys see that 'cause sounds like here there's not a lot of locking. You got the open formats, and you got choice. >> Yeah. So you get best of the both worlds. Like you get with Ahana or with the Presto, you can get the best of the both worlds. Since it's cloud native, you can easily deploy your clusters very easily within like five minutes. Your cluster is up, you can start working on it. You can deploy multiple clusters for multiple teams. You get also flexibility of adding new connectors since it's open and further it's also much more secure since it's based on cloud native. So basically, you can control your security endpoints very well. So all those things comes in together with this architecture. So you can definitely go more on the lakehouse architecture than warehousing when you want to deliver data value faster. And basically, you get the much more high value out of your data in a sorted template. >> So Satyam, it sounds like the old warehousing was like the application person, not a lot of usage, old, a lot of latency. Okay. Here and there. 
But now you got more speed to deploy clusters, scale up scale down. Application developers are as everyone. It's not one person. It's not one group. It's whenever you want. So, you got speed. You got more diversity in the data opportunities, and your coding. >> Yeah. I think data warehouses are a way to start for every organization who is getting into data. I don't think data warehousing is still a solution and will be a solution for a lot of teams which are still getting into data. But as soon as you start scaling, as you start seeing the cost going up, as you start seeing the number of use cases adding up, having an open format definitely helps. So, I would say that's where we are also heading into and that's how our journey as well started with Presto as well, why we even thought about Ahana, right. >> (John chuckles) >> So, like you mentioned, one of the things that happened was as we were moving to the lakehouse and the open table format, I think Ahana is one of the first ones in the market to have Hudi as a first class citizen completely supported with all the things which are not even present at the time of... even with Presto, right. So we see Ahana working behind the scenes, improving even some of the things already over the open-source ecosystem. And that's where we get the most value out of Ahana as well. >> This is the convergence of open-source magic and commercialization. Wen, because you think about Data as Code, reminds me, I hear, "Data warehouse, it's not going to go away." But you got cloud scale or scale. It reminds me of the old, "Oh yeah, I have a data center." Well, here comes the cloud. So, doesn't really kill the data center, although Amazon would say that the data center's going to be eliminated. No, you just use it for whatever you need it for. You use it for specific use cases, but everyone, all the action goes to the cloud for scale. The same things happen with data, and look at the open-source community. It's kind of coming together. Data as Code is coming together. >> Yeah, absolutely. >> Absolutely. >> I do want to again to connect on another dot in terms of cost and that. You know, we've been talking a little bit about price performance, but there's an implicit cost, and I think this was also very important to Blinkit, and also why we're offering a managed service. So one piece of it. And it really revolves around the people, right? So outside of the technology, the performance. One thing that Akshay brought up and it's another important piece that I should have highlighted a little bit more is, Presto exposes the ability to interact your data in a widely adopted way, which is basically ANSI SQL. So the ability for your practitioners to use this technology is huge. That's just regular Presto. In terms of a managed service, the guys at Blinkit are a great high performing team, but they have to be very efficient with their time and what they manage. And what we're trying to do is provide leverage for them. So take a lot of the heavy lifting away, but at the same time, figuring out the right things to expose so that they have that same flexibility. And that's been the balancing point that we've been trying to balance at Ahana, but that goes back to cost. How do I total cost of ownership? And that not doesn't include just the actual querying processing time, but the ability for the organization to go ahead and absorb the solution. And what does it cost in terms of the people involved? >> Yeah. Great conversation. 
I mean, this brings up the question of back in the data center, the cloud days, you had the concept of an SRE, which is now popular, site reliability engineer. One person does all the clusters and manages all the scale. Is the data engineer the new SRE for data? Are we seeing a similar trajectory? Just want to get your reaction. What do you guys think? >> Yes, so I would say, definitely. It depends on the teams and the sizes of that. We are high performing team so each automation takes bits on the pieces of the architecture, like where they want to invest in. And it comes out with the value of the engineer's time and basically like how much they can invest in, how much they need to configure the architecture, and how much time it'll take to time to market. So basically like, this is what I would also highlight as an engineer. I found Ahana like the... I would say as a Presto in a cloud native environment, or I think so there's the one in the market that seamlessly scales and then scales out. And further, with a team of us, I would say our team size like three to four engineers managing cluster day in day out, conferring, tuning and all those things takes a lot of time. And Ahana came in and takes it off our plate and the hands in a solution which works out of box. So that's where this comes in. Ahana it's also based on open-source community. >> So the time of the engineer's time is so valuable. >> Yeah. >> My take on it really in terms of the data engineering being the SRE. I think that can work, it depends on the actual person, and we definitely try to make the process as easy as possible. I think in Blinkit's case, you guys are... There are data platform owners, but they definitely are aware of the pipelines. >> John: Yeah. >> So they have very intimate knowledge of what data engineers do, but I think in their case, you guys, you're managing a ton of systems. So it's not just even Presto. They have a ton of systems and surfacing that interface so they can cater to all the data engineers across their data systems, I think is the big need for them. I know you guys you want to chime in. I mean, we've seen the architecture and things like that. I think you guys did an amazing job there. >> So, and to adding to Wen's point, right. Like I generally think what DevOps is to the tech team. I think, what is data engineer or the data teams are to the data organization, right? Like they play a very similar role that you have to act as a guardrail to ensure that everyone has access to the data so the democratizing and everything is there, but that has to also come with security, right? And when you do that, there are (indistinct) a lot of points where someone can interact with data. We have... And again, there's a mixed match of open-source tools that works well, as well. And there are some paid tools as well. So for us like for visualization, we use Redash for our ad hoc analysis. And we use Tableau as well whenever we want to give a very concise reporting. We have Jupyter notebooks in place and we have EMRs as well. So we always have a mixed batch of things where people can interact with data. And most of our time is spent in acting as that guardrail to ensure that everyone should have access to data, but it shouldn't be exploited, right. And I think that's where we spend most of our time in. >> Yeah. And I think the time is valuable, but that your point about the democratization aspect of it, there seems to be a bigger step function value that you're enabling and needs to be talked out. 
The 10x engineer, it's more like 50x, right? If you get it done right, the enablement downstream at the scale that we're seeing with this new trend is significant. It's not just, oh yeah, visualization and get some data quicker, there's actually real advantages on a multiple with that engineering. So, and we saw that with DevOps, right? Like, you do this right and then magic happens on the edges. So, yeah, it's interesting. You guys, congratulations. Great environment. Thanks for sharing the insight Blinkit. Wen, great to see you. Ahana again with Presto, congratulations. The open-source meets data engineering. Thanks so much. >> Thanks, John. >> Appreciate it. >> Okay. >> Thanks John. >> Thanks. >> Thanks for having us. >> This season two, episode two of our ongoing series. This one is Data as Code. This is theCUBE. I'm John furrier. Thanks for watching. (gentle music)
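To make the lakehouse and federation ideas from this conversation concrete, here is a minimal sketch in Presto's ANSI SQL. All catalog, schema, table, column, and S3 path names are hypothetical, and the exact table properties depend on how the metastore and connectors are configured in a given deployment; the point is only to illustrate querying open-format data sitting in cloud object storage and then joining it, in the same statement, with data held in another system through a separate connector.

```sql
-- Hypothetical setup: a 'hive' catalog points at a metastore over an S3 data
-- lake, and a 'redshift' catalog is configured through its own connector.

-- Expose Parquet files already sitting in S3 as a queryable table.
CREATE TABLE hive.analytics.orders (
    order_id  BIGINT,
    city      VARCHAR,
    order_ts  TIMESTAMP,
    amount    DOUBLE
)
WITH (
    format            = 'PARQUET',
    external_location = 's3a://example-datalake/orders/'
);

-- A standard reporting query runs directly on low-cost object storage,
-- with no separate warehouse load step.
SELECT city,
       date_trunc('day', order_ts) AS order_day,
       count(*)    AS orders,
       sum(amount) AS revenue
FROM hive.analytics.orders
GROUP BY 1, 2;

-- Federated query: one statement joins the lake table with a table living in
-- another system, addressed through its own catalog. Presto reads each source
-- via its connector and performs the join itself.
SELECT o.city,
       sum(o.amount)         AS lake_revenue,
       max(c.campaign_spend) AS warehouse_spend
FROM hive.analytics.orders        AS o
JOIN redshift.marketing.campaigns AS c ON c.city = o.city
GROUP BY o.city;
```

The same lake tables could just as well be Hudi- or Iceberg-backed if those connectors are configured, which is the plug-and-play point made above: the SQL interface stays the same while the table formats and engine mix underneath it evolve.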
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
John Furrier | PERSON | 0.99+ |
Wen Phan | PERSON | 0.99+ |
Akshay Agarwal | PERSON | 0.99+ |
John | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Ahana | PERSON | 0.99+ |
India | LOCATION | 0.99+ |
Blinkit | ORGANIZATION | 0.99+ |
Satyam Krishna | PERSON | 0.99+ |
Linux Foundation | ORGANIZATION | 0.99+ |
Ahana | ORGANIZATION | 0.99+ |
five minutes | QUANTITY | 0.99+ |
Akshay | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
10 minutes | QUANTITY | 0.99+ |
Three guests | QUANTITY | 0.99+ |
Satyam | PERSON | 0.99+ |
Blinkit | PERSON | 0.99+ |
one day | QUANTITY | 0.99+ |
10 minute | QUANTITY | 0.99+ |
Redshift | TITLE | 0.99+ |
both worlds | QUANTITY | 0.99+ |
over 200,000 orders | QUANTITY | 0.99+ |
Presto | PERSON | 0.99+ |
over 10 million | QUANTITY | 0.99+ |
SQL | TITLE | 0.99+ |
10x | QUANTITY | 0.99+ |
Wen | PERSON | 0.98+ |
50x | QUANTITY | 0.98+ |
agile | TITLE | 0.98+ |
one piece | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
three | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
one | QUANTITY | 0.98+ |
single destination | QUANTITY | 0.97+ |
One person | QUANTITY | 0.97+ |
each time | QUANTITY | 0.96+ |
each | QUANTITY | 0.96+ |
Presto | ORGANIZATION | 0.96+ |
one person | QUANTITY | 0.96+ |
single source | QUANTITY | 0.96+ |
Tableau | TITLE | 0.96+ |
one tool | QUANTITY | 0.96+ |
Icebergs | ORGANIZATION | 0.96+ |
Today | DATE | 0.95+ |
One | QUANTITY | 0.95+ |
one thing | QUANTITY | 0.95+ |
Analyst Predictions 2022: The Future of Data Management
[Music] In the 2010s, organizations became keenly aware that data would become the key ingredient in driving competitive advantage, differentiation, and growth. But to this day, putting data to work remains a difficult challenge for many, if not most, organizations. Now, as the cloud matures, it has become a game changer for data practitioners by making cheap storage and massive processing power readily accessible. We've also seen better tooling in the form of data workflows, streaming, machine intelligence, AI developer tools, security, observability, automation, new databases, and the like. These innovations accelerate data proficiency, but at the same time they add complexity for practitioners. Data lakes, data hubs, data warehouses, data marts, data fabrics, data meshes, data catalogs, data oceans are forming, evolving, and exploding onto the scene. So in an effort to bring perspective to the sea of optionality, we've brought together the brightest minds in the data analyst community to discuss how data management is morphing and what practitioners should expect in 2022 and beyond. Hello everyone, my name is Dave Vellante with theCUBE, and I'd like to welcome you to a special CUBE presentation: Analyst Predictions 2022, the Future of Data Management. We've gathered six of the best analysts in data and data management, who are going to present and discuss their top predictions and trends for 2022 and the first half of this decade. Let me introduce our six power panelists: Sanjeev Mohan is a former Gartner analyst and principal at SanjMo; Tony Baer is principal at dbInsight; Carl Olofson is a well-known research vice president with IDC; Dave Menninger is senior vice president and research director at Ventana Research; Brad Shimmin is chief analyst for AI platforms, analytics, and data management at Omdia; and Doug Henschen is vice president and principal analyst at Constellation Research. Gentlemen, welcome to the program and thanks for coming on theCUBE today. >> Great to be here. >> Thank you. >> All right, here's the format we're going to use. I, as moderator, am going to call on each analyst separately, who then will deliver their prediction or megatrend; and then, in the interest of time management and pace, two analysts will have the opportunity to comment. If we have more time, we'll elongate it, but let's get started right away. Sanjeev Mohan, please kick it off — you want to talk about governance, go ahead, sir. >> Thank you, Dave. I believe that data governance, which we've been talking about for many years, is now not only going to be mainstream, it's going to be table stakes. And with all the things that you mentioned — data oceans, data lakes, lakehouses, data fabrics, meshes — the common glue is metadata. If we don't understand what data we have and how we are governing it, there is no way we can manage it. So we saw Informatica go public last year after a hiatus of six years; I'm predicting that this year we see some more companies go public. My bet is on Collibra most likely, and maybe Alation — we'll see them go public this year. I'm also predicting that the scope of data governance is going to expand beyond just data. It's not just data and reports: we are going to see more transformations, like Spark jobs, Python, even Airflow; we're going to see more streaming data, from Kafka Schema Registry, for example; we will see AI models become part of this whole governance suite. So the governance suite is going to be very comprehensive, very detailed — lineage, impact analysis — and then even expand into data quality. We've already seen that happen with some of the tools
where they are buying these smaller companies and bringing in data quality monitoring and integrating it with metadata management data catalogs also data access governance so these so what we are going to see is that once the data governance platforms become the key entry point into these modern architectures i'm predicting that the usage the number of users of a data catalog is going to exceed that of a bi tool that will take time and we already seen that that trajectory right now if you look at bi tools i would say there are 100 users to a bi tool to one data catalog and i i see that evening out over a period of time and at some point data catalogs will really become you know the main way for us to access data data catalog will help us visualize data but if we want to do more in-depth analysis it'll be the jumping-off point into the bi tool the data science tool and and that is that is the journey i see for the data governance products excellent thank you some comments maybe maybe doug a lot a lot of things to weigh in on there maybe you could comment yeah sanjeev i think you're spot on a lot of the trends uh the one disagreement i think it's it's really still far from mainstream as you say we've been talking about this for years it's like god motherhood apple pie everyone agrees it's important but too few organizations are really practicing good governance because it's hard and because the incentives have been lacking i think one thing that deserves uh mention in this context is uh esg mandates and guidelines these are environmental social and governance regs and guidelines we've seen the environmental rags and guidelines imposed in industries particularly the carbon intensive industries we've seen the social mandates particularly diversity imposed on suppliers by companies that are leading on this topic we've seen governance guidelines now being imposed by banks and investors so these esgs are presenting new carrots and sticks and it's going to demand more solid data it's going to demand more detailed reporting and solid reporting tighter governance but we're still far from mainstream adoption we have a lot of uh you know best of breed niche players in the space i think the signs that it's going to be more mainstream are starting with things like azure purview google dataplex the big cloud platform uh players seem to be uh upping the ante and and addressing starting to address governance excellent thank you doug brad i wonder if you could chime in as well yeah i would love to be a believer in data catalogs um but uh to doug's point i think that it's going to take some more pressure for for that to happen i recall metadata being something every enterprise thought they were going to get under control when we were working on service oriented architecture back in the 90s and that didn't happen quite the way we we anticipated and and uh to sanjeev's point it's because it is really complex and really difficult to do my hope is that you know we won't sort of uh how do we put this fade out into this nebulous nebula of uh domain catalogs that are specific to individual use cases like purview for getting data quality right or like data governance and cyber security and instead we have some tooling that can actually be adaptive to gather metadata to create something i know is important to you sanjeev and that is this idea of observability if you can get enough metadata without moving your data around but understanding the entirety of a system that's running on this data you can do a lot to help 
with with the governance that doug is talking about so so i just want to add that you know data governance like many other initiatives did not succeed even ai went into an ai window but that's a different topic but a lot of these things did not succeed because to your point the incentives were not there i i remember when starbucks oxley had come into the scene if if a bank did not do service obviously they were very happy to a million dollar fine that was like you know pocket change for them instead of doing the right thing but i think the stakes are much higher now with gdpr uh the floodgates open now you know california you know has ccpa but even ccpa is being outdated with cpra which is much more gdpr like so we are very rapidly entering a space where every pretty much every major country in the world is coming up with its own uh compliance regulatory requirements data residence is becoming really important and and i i think we are going to reach a stage where uh it won't be optional anymore so whether we like it or not and i think the reason data catalogs were not successful in the past is because we did not have the right focus on adoption we were focused on features and these features were disconnected very hard for business to stop these are built by it people for it departments to to take a look at technical metadata not business metadata today the tables have turned cdo's are driving this uh initiative uh regulatory compliances are beating down hard so i think the time might be right yeah so guys we have to move on here and uh but there's some some real meat on the bone here sanjeev i like the fact that you late you called out calibra and alation so we can look back a year from now and say okay he made the call he stuck it and then the ratio of bi tools the data catalogs that's another sort of measurement that we can we can take even though some skepticism there that's something that we can watch and i wonder if someday if we'll have more metadata than data but i want to move to tony baer you want to talk about data mesh and speaking you know coming off of governance i mean wow you know the whole concept of data mesh is decentralized data and then governance becomes you know a nightmare there but take it away tony we'll put it this way um data mesh you know the the idea at least is proposed by thoughtworks um you know basically was unleashed a couple years ago and the press has been almost uniformly almost uncritical um a good reason for that is for all the problems that basically that sanjeev and doug and brad were just you know we're just speaking about which is that we have all this data out there and we don't know what to do about it um now that's not a new problem that was a problem we had enterprise data warehouses it was a problem when we had our hadoop data clusters it's even more of a problem now the data's out in the cloud where the data is not only your data like is not only s3 it's all over the place and it's also including streaming which i know we'll be talking about later so the data mesh was a response to that the idea of that we need to debate you know who are the folks that really know best about governance is the domain experts so it was basically data mesh was an architectural pattern and a process my prediction for this year is that data mesh is going to hit cold hard reality because if you if you do a google search um basically the the published work the articles and databases have been largely you know pretty uncritical um so far you know that you know 
basically learning is basically being a very revolutionary new idea i don't think it's that revolutionary because we've talked about ideas like this brad and i you and i met years ago when we were talking about so and decentralizing all of us was at the application level now we're talking about at the data level and now we have microservices so there's this thought of oh if we manage if we're apps in cloud native through microservices why don't we think of data in the same way um my sense this year is that you know this and this has been a very active search if you look at google search trends is that now companies are going to you know enterprises are going to look at this seriously and as they look at seriously it's going to attract its first real hard scrutiny it's going to attract its first backlash that's not necessarily a bad thing it means that it's being taken seriously um the reason why i think that that uh that it will you'll start to see basically the cold hard light of day shine on data mesh is that it's still a work in progress you know this idea is basically a couple years old and there's still some pretty major gaps um the biggest gap is in is in the area of federated governance now federated governance itself is not a new issue uh federated governance position we're trying to figure out like how can we basically strike the balance between getting let's say you know between basically consistent enterprise policy consistent enterprise governance but yet the groups that understand the data know how to basically you know that you know how do we basically sort of balance the two there's a huge there's a huge gap there in practice and knowledge um also to a lesser extent there's a technology gap which is basically in the self-service technologies that will help teams essentially govern data you know basically through the full life cycle from developed from selecting the data from you know building the other pipelines from determining your access control determining looking at quality looking at basically whether data is fresh or whether or not it's trending of course so my predictions is that it will really receive the first harsh scrutiny this year you are going to see some organization enterprises declare premature victory when they've uh when they build some federated query implementations you're going to see vendors start to data mesh wash their products anybody in the data management space they're going to say that whether it's basically a pipelining tool whether it's basically elt whether it's a catalog um or confederated query tool they're all going to be like you know basically promoting the fact of how they support this hopefully nobody is going to call themselves a data mesh tool because data mesh is not a technology we're going to see one other thing come out of this and this harks back to the metadata that sanji was talking about and the catalogs that he was talking about which is that there's going to be a new focus on every renewed focus on metadata and i think that's going to spur interest in data fabrics now data fabrics are pretty vaguely defined but if we just take the most elemental definition which is a common metadata back plane i think that if anybody is going to get serious about data mesh they need to look at a data fabric because we all at the end of the day need to speak you know need to read from the same sheet of music so thank you tony dave dave meninger i mean one of the things that people like about data mesh is it pretty crisply articulates some of 
the flaws in today's organizational approaches to data what are your thoughts on this well i think we have to start by defining data mesh right the the term is already getting corrupted right tony said it's going to see the cold hard uh light of day and there's a problem right now that there are a number of overlapping terms that are similar but not identical so we've got data virtualization data fabric excuse me for a second sorry about that data virtualization data fabric uh uh data federation right uh so i i think that it's not really clear what each vendor means by these terms i see data mesh and data fabric becoming quite popular i've i've interpreted data mesh as referring primarily to the governance aspects as originally you know intended and specified but that's not the way i see vendors using i see vendors using it much more to mean data fabric and data virtualization so i'm going to comment on the group of those things i think the group of those things is going to happen they're going to happen they're going to become more robust our research suggests that a quarter of organizations are already using virtualized access to their data lakes and another half so a total of three quarters will eventually be accessing their data lakes using some sort of virtualized access again whether you define it as mesh or fabric or virtualization isn't really the point here but this notion that there are different elements of data metadata and governance within an organization that all need to be managed collectively the interesting thing is when you look at the satisfaction rates of those organizations using virtualization versus those that are not it's almost double 68 of organizations i'm i'm sorry um 79 of organizations that were using virtualized access express satisfaction with their access to the data lake only 39 expressed satisfaction if they weren't using virtualized access so thank you uh dave uh sanjeev we just got about a couple minutes on this topic but i know you're speaking or maybe you've spoken already on a panel with jamal dagani who sort of invented the concept governance obviously is a big sticking point but what are your thoughts on this you are mute so my message to your mark and uh and to the community is uh as opposed to what dave said let's not define it we spent the whole year defining it there are four principles domain product data infrastructure and governance let's take it to the next level i get a lot of questions on what is the difference between data fabric and data mesh and i'm like i can compare the two because data mesh is a business concept data fabric is a data integration pattern how do you define how do you compare the two you have to bring data mesh level down so to tony's point i'm on a warp path in 2022 to take it down to what does a data product look like how do we handle shared data across domains and govern it and i think we are going to see more of that in 2022 is operationalization of data mesh i think we could have a whole hour on this topic couldn't we uh maybe we should do that uh but let's go to let's move to carl said carl your database guy you've been around that that block for a while now you want to talk about graph databases bring it on oh yeah okay thanks so i regard graph database as basically the next truly revolutionary database management technology i'm looking forward to for the graph database market which of course we haven't defined yet so obviously i have a little wiggle room in what i'm about to say but that this market will grow 
by about 600 percent over the next 10 years. Now, 10 years is a long time, but over the next five years we expect to see gradual growth as people start to learn how to use it. The problem is not that it's not useful; it's that people don't know how to use it. So let me explain, before I go any further, what a graph database is, because some of the folks on the call may not know what it is. A graph database organizes data according to a mathematical structure called a graph. A graph has elements called nodes and edges: a data element drops into a node, and the nodes are connected by edges; the edges connect one node to another node. Combinations of edges create structures that you can analyze to determine how things are related. In some cases the nodes and edges can have properties attached to them, which add additional informative material that makes it richer; that's called a property graph. Okay, there are two principal use cases for graph databases. There are semantic graphs, which are used to break down human-language text into semantic structures; then you can search it, organize it, and answer complicated questions — a lot of AI is aimed at semantic graphs. The other kind is the property graph that I just mentioned, which has a dazzling number of use cases. I want to just point out, as I talk about this, that people are probably wondering, well, we have relational databases — isn't that good enough? Okay, so a relational database supports what I call definitional relationships. That means you define the relationships in a fixed structure, and the database drops into that structure: there's a foreign key value that relates one table to another, and that value is fixed. You don't change it; if you change it, the database becomes unstable, and it's not clear what you're looking at. In a graph database, the system is designed to handle change, so that it can reflect the true state of the things it's being used to track. So let me just give you some examples of use cases for this. They include entity resolution, data lineage, social media analysis, customer 360, fraud prevention; there's cybersecurity; supply chain is a big one, actually; there's explainable AI, and this is going to become important too, because a lot of people are adopting AI, but they want a system after the fact to say how the AI system came to that conclusion, how it made that recommendation — right now we don't have really good ways of tracking that. Machine learning in general; social network analysis, I already mentioned that; and then we've got, oh gosh, data governance, data compliance, risk management; we've got recommendation, personalization, anti-money laundering — that's another big one — identity and access management; network and IT operations is already becoming a key one, where you have actually mapped out your operation — whatever it is, your data center — and you can track what's going on as things happen there; root cause analysis; fraud detection is a huge one — a number of major credit card companies use graph databases for fraud detection — risk analysis, tracking and tracing, churn analysis, next best action, what-if analysis, impact analysis, entity resolution. And I would add one other thing, or just a few other things, to this list: metadata management. So, Sanjeev, here you go — this is your engine — because I was in metadata management for quite a while in my past life, and one of the things I found was that none of the
data management technologies that were available to us could efficiently handle metadata, because of the kinds of structures that result from it — but graphs can. Okay, graphs can do things like say, this term in this context means this, but in that context it means that — things like that. And in fact, logistics management, supply chain — it's also because it handles recursive relationships. By recursive relationships I mean objects that own other objects that are of the same type. You can do things like bills of materials, you know, like a parts explosion; you can do an HR analysis — who reports to whom, how many levels up the chain — that kind of thing. You can do that with relational databases, but yes, it takes a lot of programming. In fact, you can do almost any of these things with relational databases, but the problem is you have to program it; it's not supported in the database. And whenever you have to program something, that means you can't trace it, you can't define it, you can't publish it in terms of its functionality, and it's really, really hard to maintain over time. >> So, Carl, thank you. I wonder if we could bring Brad in. I mean, Brad, I'm sitting there wondering, okay, is this incremental to the market, or is it disruptive and a replacement? What are your thoughts on this space? >> It's already disrupted the market. I mean, like Carl said, go to any bank and ask them, are you using graph databases to get fraud detection under control, and they'll say absolutely; that's the only way to solve this problem, and it is, frankly. And it's the only way to solve a lot of the problems that Carl mentioned. And that is, I think, its Achilles' heel in some ways, because, you know, it's like finding the best way to cross the seven bridges of Königsberg: it's always going to kind of be tied to those use cases, because it's really special and it's really unique. And because it's special and unique, it still, unfortunately, kind of stands apart from the rest of the community that's building, let's say, AI outcomes, as the great example here. Graph databases and AI, as Carl mentioned, are like chocolate and peanut butter, but technologically they don't know how to talk to one another; they're completely different. And, you know, you can't just stand up SQL and query them; you've got to learn — um, what is that, Carl, a specialized language? — yeah, thank you — to actually get to the data in there. And if you're going to scale that data, that graph database — especially a property graph — if you're going to do something really complex, like try to understand all of the metadata in your organization, you might just end up with, you know, a graph database winter, like we had the AI winter, simply because you run out of performance to make the thing happen. So I think it's already disrupted, but we need to treat it like a first-class citizen in the data analytics and AI community. We need to bring it into the fold; we need to equip it with the tools it needs to do the magic it does, and to do it not just for specialized use cases but for everything — because I'm with Carl, I think it's absolutely revolutionary. >> So I had also identified the principal Achilles' heel of the technology, which is scaling. Now, when these things get large and complex enough that they spill over what a single server can handle, you start to have difficulties, because the relationships span things that have to be resolved over a network, and then you get network latency, and that slows the system down. So that's still a
problem to be solved. >> Sanjeev, any quick thoughts on this? I mean, I think metadata on the word cloud is going to be the largest font, but what are your thoughts here? >> I want to step away, so people don't, you know, associate me with only metadata; I want to talk about something slightly different. DB-Engines.com has done an amazing job — I think almost everyone knows that they chronicle all the major databases that are in use today. In January of 2022 there are 381 databases on its ranked list of databases. The largest category is RDBMS; the second largest category is actually divided into two — property graphs and RDF graphs. These two together make up the second largest number of databases. So, talking about Achilles' heels here, this is a problem. The problem is that there are so many graph databases to choose from, and they come in different shapes and forms. And to Brad's point, there are so many query languages. In RDBMS, SQL is the end of the story; here we've got Cypher, we've got Gremlin, we've got GQL, and then your proprietary languages. So I think there's a lot of disparity in this space. >> But excellent, all excellent points, Sanjeev, I must say. And that is a problem: the languages need to be sorted and standardized, and people need to have a road map as to what they can do with it, because, as you say, you can do so many things, and so many of those things are unrelated, that you sort of say, well, what do we use this for? I'm reminded of the saying I learned a bunch of years ago, when somebody said that the digital computer is the only tool man has ever devised that has no particular purpose. >> All right, guys, we've got to move on to Dave Menninger. We've heard about streaming; your prediction is in that realm, so please take it away. >> Sure. So I like to say that historical databases are going to become a thing of the past, but I don't mean that they're going to go away; that's not my point. I mean, we need historical databases, but streaming data is going to become the default way in which we operate with data. So in the next, say, three to five years, I would expect that the data platforms — and we're using the term data platforms to represent the evolution of databases and data lakes — will incorporate these streaming capabilities. We're going to process data as it streams into an organization, and then it's going to roll off into historical databases. So historical databases don't go away, but they become a thing of the past: they store the data that occurred previously, and as data is occurring, we're going to be processing it, we're going to be analyzing it, we're going to be acting on it. I mean, we only ever ended up with historical databases because we were limited by the technology that was available to us. Data doesn't occur in batches, but we processed it in batches because that was the best we could do — and it wasn't bad, and we've continued to improve, and improve, and improve — but streaming data today is still the exception; it's not the rule. There are projects within organizations that deal with streaming data, but it's not the default way in which we deal with data yet. And so that's my prediction: this is going to change. We're going to have streaming data be the default way in which we deal with data. And how you label it, what you call it — you know, maybe these databases and data platforms just evolve to be able to handle it — but we're going to deal with data in a different way. And our research shows that already about half of
the participants in our analytics and data benchmark research are using streaming data you know another third are planning to use streaming technologies so that gets us to about eight out of ten organizations need to use this technology that doesn't mean they have to use it throughout the whole organization but but it's pretty widespread in its use today and has continued to grow if you think about the consumerization of i.t we've all been conditioned to expect immediate access to information immediate responsiveness you know we want to know if an uh item is on the shelf at our local retail store and we can go in and pick it up right now you know that's the world we live in and that's spilling over into the enterprise i.t world where we have to provide those same types of capabilities um so that's my prediction historical database has become a thing of the past streaming data becomes the default way in which we we operate with data all right thank you david well so what what say you uh carl a guy who's followed historical databases for a long time well one thing actually every database is historical because as soon as you put data in it it's now history it's no longer it no longer reflects the present state of things but even if that history is only a millisecond old it's still history but um i would say i mean i know you're trying to be a little bit provocative in saying this dave because you know as well as i do that people still need to do their taxes they still need to do accounting they still need to run general ledger programs and things like that that all involves historical data that's not going to go away unless you want to go to jail so you're going to have to deal with that but as far as the leading edge functionality i'm totally with you on that and i'm just you know i'm just kind of wondering um if this chain if this requires a change in the way that we perceive applications in order to truly be manifested and rethinking the way m applications work um saying that uh an application should respond instantly as soon as the state of things changes what do you say about that i i think that's true i think we do have to think about things differently that's you know it's not the way we design systems in the past uh we're seeing more and more systems designed that way but again it's not the default and and agree 100 with you that we do need historical databases you know that that's clear and even some of those historical databases will be used in conjunction with the streaming data right so absolutely i mean you know let's take the data warehouse example where you're using the data warehouse as context and the streaming data as the present you're saying here's a sequence of things that's happening right now have we seen that sequence before and where what what does that pattern look like in past situations and can we learn from that so tony bear i wonder if you could comment i mean if you when you think about you know real-time inferencing at the edge for instance which is something that a lot of people talk about um a lot of what we're discussing here in this segment looks like it's got great potential what are your thoughts yeah well i mean i think you nailed it right you know you hit it right on the head there which is that i think a key what i'm seeing is that essentially and basically i'm going to split this one down the middle is i don't see that basically streaming is the default what i see is streaming and basically and transaction databases um and analytics data you know data 
warehouses data lakes whatever are converging and what allows us technically to converge is cloud native architecture where you can basically distribute things so you could have you can have a note here that's doing the real-time processing that's also doing it and this is what your leads in we're maybe doing some of that real-time predictive analytics to take a look at well look we're looking at this customer journey what's happening with you know you know with with what the customer is doing right now and this is correlated with what other customers are doing so what i so the thing is that in the cloud you can basically partition this and because of basically you know the speed of the infrastructure um that you can basically bring these together and or and so and kind of orchestrate them sort of loosely coupled manner the other part is that the use cases are demanding and this is part that goes back to what dave is saying is that you know when you look at customer 360 when you look at let's say smart you know smart utility grids when you look at any type of operational problem it has a real-time component and it has a historical component and having predictives and so like you know you know my sense here is that there that technically we can bring this together through the cloud and i think the use case is that is that we we can apply some some real-time sort of you know predictive analytics on these streams and feed this into the transactions so that when we make a decision in terms of what to do as a result of a transaction we have this real time you know input sanjeev did you have a comment yeah i was just going to say that to this point you know we have to think of streaming very different because in the historical databases we used to bring the data and store the data and then we used to run rules on top uh aggregations and all but in case of streaming the mindset changes because the rules normally the inference all of that is fixed but the data is constantly changing so it's a completely reverse way of thinking of uh and building applications on top of that so dave menninger there seemed to be some disagreement about the default or now what kind of time frame are you are you thinking about is this end of decade it becomes the default what would you pin i i think around you know between between five to ten years i think this becomes the reality um i think you know it'll be more and more common between now and then but it becomes the default and i also want sanjeev at some point maybe in one of our subsequent conversations we need to talk about governing streaming data because that's a whole other set of challenges we've also talked about it rather in a two dimensions historical and streaming and there's lots of low latency micro batch sub second that's not quite streaming but in many cases it's fast enough and we're seeing a lot of adoption of near real time not quite real time as uh good enough for most for many applications because nobody's really taking the hardware dimension of this information like how do we that'll just happen carl so near real time maybe before you lose the customer however you define that right okay um let's move on to brad brad you want to talk about automation ai uh the the the pipeline people feel like hey we can just automate everything what's your prediction yeah uh i'm i'm an ai fiction auto so apologies in advance for that but uh you know um i i think that um we've been seeing automation at play within ai for some time now and it's helped us do do a 
lot of things for especially for practitioners that are building ai outcomes in the enterprise uh it's it's helped them to fill skills gaps it's helped them to speed development and it's helped them to to actually make ai better uh because it you know in some ways provides some swim lanes and and for example with technologies like ottawa milk and can auto document and create that sort of transparency that that we talked about a little bit earlier um but i i think it's there's an interesting kind of conversion happening with this idea of automation um and and that is that uh we've had the automation that started happening for practitioners it's it's trying to move outside of the traditional bounds of things like i'm just trying to get my features i'm just trying to pick the right algorithm i'm just trying to build the right model uh and it's expanding across that full life cycle of building an ai outcome to start at the very beginning of data and to then continue on to the end which is this continuous delivery and continuous uh automation of of that outcome to make sure it's right and it hasn't drifted and stuff like that and because of that because it's become kind of powerful we're starting to to actually see this weird thing happen where the practitioners are starting to converge with the users and that is to say that okay if i'm in tableau right now i can stand up salesforce einstein discovery and it will automatically create a nice predictive algorithm for me um given the data that i that i pull in um but what's starting to happen and we're seeing this from the the the companies that create business software so salesforce oracle sap and others is that they're starting to actually use these same ideals and a lot of deep learning to to basically stand up these out of the box flip a switch and you've got an ai outcome at the ready for business users and um i i'm very much you know i think that that's that's the way that it's going to go and what it means is that ai is is slowly disappearing uh and i don't think that's a bad thing i think if anything what we're going to see in 2022 and maybe into 2023 is this sort of rush to to put this idea of disappearing ai into practice and have as many of these solutions in the enterprise as possible you can see like for example sap is going to roll out this quarter this thing called adaptive recommendation services which which basically is a cold start ai outcome that can work across a whole bunch of different vertical markets and use cases it's just a recommendation engine for whatever you need it to do in the line of business so basically you're you're an sap user you look up to turn on your software one day and you're a sales professional let's say and suddenly you have a recommendation for customer churn it's going that's great well i i don't know i i think that's terrifying in some ways i think it is the future that ai is going to disappear like that but i am absolutely terrified of it because um i i think that what it what it really does is it calls attention to a lot of the issues that we already see around ai um specific to this idea of what what we like to call it omdia responsible ai which is you know how do you build an ai outcome that is free of bias that is inclusive that is fair that is safe that is secure that it's audible etc etc etc etc that takes some a lot of work to do and so if you imagine a customer that that's just a sales force customer let's say and they're turning on einstein discovery within their sales software you need 
some guidance to make sure that when you flip that switch that the outcome you're going to get is correct and that's that's going to take some work and so i think we're going to see this let's roll this out and suddenly there's going to be a lot of a lot of problems a lot of pushback uh that we're going to see and some of that's going to come from gdpr and others that sam jeeve was mentioning earlier a lot of it's going to come from internal csr requirements within companies that are saying hey hey whoa hold up we can't do this all at once let's take the slow route let's make ai automated in a smart way and that's going to take time yeah so a couple predictions there that i heard i mean ai essentially you disappear it becomes invisible maybe if i can restate that and then if if i understand it correctly brad you're saying there's a backlash in the near term people can say oh slow down let's automate what we can those attributes that you talked about are non trivial to achieve is that why you're a bit of a skeptic yeah i think that we don't have any sort of standards that companies can look to and understand and we certainly within these companies especially those that haven't already stood up in internal data science team they don't have the knowledge to understand what that when they flip that switch for an automated ai outcome that it's it's gonna do what they think it's gonna do and so we need some sort of standard standard methodology and practice best practices that every company that's going to consume this invisible ai can make use of and one of the things that you know is sort of started that google kicked off a few years back that's picking up some momentum and the companies i just mentioned are starting to use it is this idea of model cards where at least you have some transparency about what these things are doing you know so like for the sap example we know for example that it's convolutional neural network with a long short-term memory model that it's using we know that it only works on roman english uh and therefore me as a consumer can say oh well i know that i need to do this internationally so i should not just turn this on today great thank you carl can you add anything any context here yeah we've talked about some of the things brad mentioned here at idc in the our future of intelligence group regarding in particular the moral and legal implications of having a fully automated you know ai uh driven system uh because we already know and we've seen that ai systems are biased by the data that they get right so if if they get data that pushes them in a certain direction i think there was a story last week about an hr system that was uh that was recommending promotions for white people over black people because in the past um you know white people were promoted and and more productive than black people but not it had no context as to why which is you know because they were being historically discriminated black people being historically discriminated against but the system doesn't know that so you know you have to be aware of that and i think that at the very least there should be controls when a decision has either a moral or a legal implication when when you want when you really need a human judgment it could lay out the options for you but a person actually needs to authorize that that action and i also think that we always will have to be vigilant regarding the kind of data we use to train our systems to make sure that it doesn't introduce unintended biases and to some 
extent they always will so we'll always be chasing after them that's that's absolutely carl yeah i think that what you have to bear in mind as a as a consumer of ai is that it is a reflection of us and we are a very flawed species uh and so if you look at all the really fantastic magical looking supermodels we see like gpt three and four that's coming out z they're xenophobic and hateful uh because the people the data that's built upon them and the algorithms and the people that build them are us so ai is a reflection of us we need to keep that in mind yeah we're the ai's by us because humans are biased all right great okay let's move on doug henson you know a lot of people that said that data lake that term's not not going to not going to live on but it appears to be have some legs here uh you want to talk about lake house bring it on yes i do my prediction is that lake house and this idea of a combined data warehouse and data lake platform is going to emerge as the dominant data management offering i say offering that doesn't mean it's going to be the dominant thing that organizations have out there but it's going to be the predominant vendor offering in 2022. now heading into 2021 we already had cloudera data bricks microsoft snowflake as proponents in 2021 sap oracle and several of these fabric virtualization mesh vendors join the bandwagon the promise is that you have one platform that manages your structured unstructured and semi-structured information and it addresses both the beyond analytics needs and the data science needs the real promise there is simplicity and lower cost but i think end users have to answer a few questions the first is does your organization really have a center of data gravity or is it is the data highly distributed multiple data warehouses multiple data lakes on-premises cloud if it if it's very distributed and you you know you have difficulty consolidating and that's not really a goal for you then maybe that single platform is unrealistic and not likely to add value to you um you know also the fabric and virtualization vendors the the mesh idea that's where if you have this highly distributed situation that might be a better path forward the second question if you are looking at one of these lake house offerings you are looking at consolidating simplifying bringing together to a single platform you have to make sure that it meets both the warehouse need and the data lake need so you have vendors like data bricks microsoft with azure synapse new really to the data warehouse space and they're having to prove that these data warehouse capabilities on their platforms can meet the scaling requirements can meet the user and query concurrency requirements meet those tight slas and then on the other hand you have the or the oracle sap snowflake the data warehouse uh folks coming into the data science world and they have to prove that they can manage the unstructured information and meet the needs of the data scientists i'm seeing a lot of the lake house offerings from the warehouse crowd managing that unstructured information in columns and rows and some of these vendors snowflake in particular is really relying on partners for the data science needs so you really got to look at a lake house offering and make sure that it meets both the warehouse and the data lake requirement well thank you doug well tony if those two worlds are going to come together as doug was saying the analytics and the data science world does it need to be some kind of semantic layer in 
between i don't know weigh in on this topic if you would oh didn't we talk about data fabrics before common metadata layer um actually i'm almost tempted to say let's declare victory and go home in that this is actually been going on for a while i actually agree with uh you know much what doug is saying there which is that i mean we i remembered as far back as i think it was like 2014 i was doing a a study you know it was still at ovum predecessor omnia um looking at all these specialized databases that were coming up and seeing that you know there's overlap with the edges but yet there was still going to be a reason at the time that you would have let's say a document database for json you'd have a relational database for tran you know for transactions and for data warehouse and you had you know and you had basically something at that time that that resembles to do for what we're considering a day of life fast fo and the thing is what i was saying at the time is that you're seeing basically blur you know sort of blending at the edges that i was saying like about five or six years ago um that's all and the the lake house is essentially you know the amount of the the current manifestation of that idea there is a dichotomy in terms of you know it's the old argument do we centralize this all you know you know in in in in in a single place or do we or do we virtualize and i think it's always going to be a yin and yang there's never going to be a single single silver silver bullet i do see um that they're also going to be questions and these are things that points that doug raised they're you know what your what do you need of of of your of you know for your performance there or for your you know pre-performance characteristics do you need for instance hiking currency you need the ability to do some very sophisticated joins or is your requirement more to be able to distribute and you know distribute our processing is you know as far as possible to get you know to essentially do a kind of brute force approach all these approaches are valid based on you know based on the used case um i just see that essentially that the lake house is the culmination of it's nothing it's just it's a relatively new term introduced by databricks a couple years ago this is the culmination of basically what's been a long time trend and what we see in the cloud is that as we start seeing data warehouses as a checkbox item say hey we can basically source data in cloud and cloud storage and s3 azure blob store you know whatever um as long as it's in certain formats like you know like you know parquet or csv or something like that you know i see that as becoming kind of you know a check box item so to that extent i think that the lake house depending on how you define it is already reality um and in some in some cases maybe new terminology but not a whole heck of a lot new under the sun yeah and dave menger i mean a lot of this thank you tony but a lot of this is going to come down to you know vendor marketing right some people try to co-opt the term we talked about data mesh washing what are your thoughts on this yeah so um i used the term data platform earlier and and part of the reason i use that term is that it's more vendor neutral uh we've we've tried to uh sort of stay out of the the vendor uh terminology patenting world right whether whether the term lake house is what sticks or not the concept is certainly going to stick and we have some data to back it up about a quarter of organizations that are using data 
Predictions 2022: Top Analysts See the Future of Data
(bright music) >> In the 2010s, organizations became keenly aware that data would become the key ingredient to driving competitive advantage, differentiation, and growth. But to this day, putting data to work remains a difficult challenge for many, if not most organizations. Now, as the cloud matures, it has become a game changer for data practitioners by making cheap storage and massive processing power readily accessible. We've also seen better tooling in the form of data workflows, streaming, machine intelligence, AI, developer tools, security, observability, automation, new databases and the like. These innovations accelerate data proficiency, but at the same time, they add complexity for practitioners. Data lakes, data hubs, data warehouses, data marts, data fabrics, data meshes, data catalogs, data oceans are forming, they're evolving and exploding onto the scene. So in an effort to bring perspective to the sea of optionality, we've brought together the brightest minds in the data analyst community to discuss how data management is morphing and what practitioners should expect in 2022 and beyond. Hello everyone, my name is Dave Vellante with theCUBE, and I'd like to welcome you to a special Cube presentation, Analyst Predictions 2022: The Future of Data Management. We've gathered six of the best analysts in data and data management who are going to present and discuss their top predictions and trends for 2022 and the first half of this decade. Let me introduce our six power panelists. Sanjeev Mohan is a former Gartner Analyst and Principal at SanjMo. Tony Baer is principal at dbInsight, Carl Olofson is a well-known Research Vice President with IDC, Dave Menninger is Senior Vice President and Research Director at Ventana Research, Brad Shimmin is Chief Analyst, AI Platforms, Analytics and Data Management at Omdia, and Doug Henschen is Vice President and Principal Analyst at Constellation Research. Gentlemen, welcome to the program and thanks for coming on theCUBE today. >> Great to be here. >> Thank you. >> All right, here's the format we're going to use. As moderator, I'm going to call on each analyst separately, who then will deliver their prediction or mega trend, and then in the interest of time management and pace, two analysts will have the opportunity to comment. If we have more time, we'll elongate it, but let's get started right away. Sanjeev Mohan, please kick it off. You want to talk about governance, go ahead sir. >> Thank you Dave. I believe that data governance, which we've been talking about for many years, is now not only going to be mainstream, it's going to be table stakes. And all the things that you mentioned, you know, the data ocean, data lake, lake houses, data fabric, meshes, the common glue is metadata. If we don't understand what data we have and how we are governing it, there is no way we can manage it. So we saw Informatica went public last year after a hiatus of six years. I'm predicting that this year we see some more companies go public. My bet is on Collibra most likely, and maybe Alation, we'll see go public this year. I'm also predicting that the scope of data governance is going to expand beyond just data. It's not just data and reports. We are going to see more transformations like Spark jobs, Python, even Airflow. We're going to see more streaming data, so from Kafka Schema Registry, for example. We will see AI models become part of this whole governance suite.
So the governance suite is going to be very comprehensive, very detailed lineage, impact analysis, and then even expand into data quality. We already seen that happen with some of the tools where they are buying these smaller companies and bringing in data quality monitoring and integrating it with metadata management, data catalogs, also data access governance. So what we are going to see is that once the data governance platforms become the key entry point into these modern architectures, I'm predicting that the usage, the number of users of a data catalog is going to exceed that of a BI tool. That will take time and we already seen that trajectory. Right now if you look at BI tools, I would say there a hundred users to BI tool to one data catalog. And I see that evening out over a period of time and at some point data catalogs will really become the main way for us to access data. Data catalog will help us visualize data, but if we want to do more in-depth analysis, it'll be the jumping off point into the BI tool, the data science tool and that is the journey I see for the data governance products. >> Excellent, thank you. Some comments. Maybe Doug, a lot of things to weigh in on there, maybe you can comment. >> Yeah, Sanjeev I think you're spot on, a lot of the trends the one disagreement, I think it's really still far from mainstream. As you say, we've been talking about this for years, it's like God, motherhood, apple pie, everyone agrees it's important, but too few organizations are really practicing good governance because it's hard and because the incentives have been lacking. I think one thing that deserves mention in this context is ESG mandates and guidelines, these are environmental, social and governance, regs and guidelines. We've seen the environmental regs and guidelines and posts in industries, particularly the carbon-intensive industries. We've seen the social mandates, particularly diversity imposed on suppliers by companies that are leading on this topic. We've seen governance guidelines now being imposed by banks on investors. So these ESGs are presenting new carrots and sticks, and it's going to demand more solid data. It's going to demand more detailed reporting and solid reporting, tighter governance. But we're still far from mainstream adoption. We have a lot of, you know, best of breed niche players in the space. I think the signs that it's going to be more mainstream are starting with things like Azure Purview, Google Dataplex, the big cloud platform players seem to be upping the ante and starting to address governance. >> Excellent, thank you Doug. Brad, I wonder if you could chime in as well. >> Yeah, I would love to be a believer in data catalogs. But to Doug's point, I think that it's going to take some more pressure for that to happen. I recall metadata being something every enterprise thought they were going to get under control when we were working on service oriented architecture back in the nineties and that didn't happen quite the way we anticipated. And so to Sanjeev's point it's because it is really complex and really difficult to do. My hope is that, you know, we won't sort of, how do I put this? Fade out into this nebula of domain catalogs that are specific to individual use cases like Purview for getting data quality right or like data governance and cybersecurity. And instead we have some tooling that can actually be adaptive to gather metadata to create something. And I know its important to you, Sanjeev and that is this idea of observability. 
If you can get enough metadata without moving your data around, but understanding the entirety of a system that's running on this data, you can do a lot. So, to help with the governance that Doug is talking about. >> So I just want to add that data governance, like many other initiatives, did not succeed; even AI went into an AI winter, but that's a different topic. But a lot of these things did not succeed because, to your point, the incentives were not there. I remember when Sarbanes-Oxley had come onto the scene, if a bank did not do Sarbanes-Oxley, they were very happy to pay a million dollar fine. That was like, you know, pocket change for them instead of doing the right thing. But I think the stakes are much higher now. With GDPR, the flood gates opened. Now, you know, California, you know, has CCPA, but even CCPA is being outdated with CPRA, which is much more GDPR-like. So we are very rapidly entering a space where pretty much every major country in the world is coming up with its own compliance regulatory requirements, data residency is becoming really important. And I think we are going to reach a stage where it won't be optional anymore. So whether we like it or not, and I think the reason data catalogs were not successful in the past is because we did not have the right focus on adoption. We were focused on features and these features were disconnected, very hard for business to adopt. These were built by IT people for IT departments to take a look at technical metadata, not business metadata. Today the tables have turned. CDOs are driving this initiative, regulatory compliances are beating down hard, so I think the time might be right. >> Yeah so guys, we have to move on here. But there's some real meat on the bone here, Sanjeev. I like the fact that you called out Collibra and Alation, so we can look back a year from now and say, okay, he made the call and he stuck to it. And then the ratio of BI tools to data catalogs, that's another sort of measurement that we can take, even though with some skepticism there, that's something that we can watch. And I wonder if someday we'll have more metadata than data. But I want to move to Tony Baer, you want to talk about data mesh, and speaking, you know, coming off of governance. I mean, wow, you know the whole concept of data mesh is decentralized data, and then governance becomes, you know, a nightmare there, but take it away, Tony. >> Well, put it this way: data mesh, you know, the idea at least as proposed by ThoughtWorks, you know, basically came out at least a couple of years ago and the press has been almost uniformly uncritical. A good reason for that is all the problems that basically Sanjeev and Doug and Brad were just speaking about, which is that we have all this data out there and we don't know what to do about it. Now, that's not a new problem. That was a problem we had in enterprise data warehouses, it was a problem when we had Hadoop data clusters, it's even more of a problem now that data is out in the cloud, where the data is not only in your data lake, it's not only in S3, it's all over the place. And it's also including streaming, which I know we'll be talking about later. So the data mesh was a response to that, the idea being, you know, who are the folks that really know best about governance? It's the domain experts. So data mesh was basically an architectural pattern and a process. My prediction for this year is that data mesh is going to hit cold, hard reality.
Because if you do a Google search, basically the published work, the articles on data mesh have been largely, you know, pretty uncritical so far. Basically loading and is basically being a very revolutionary new idea. I don't think it's that revolutionary because we've talked about ideas like this. Brad now you and I met years ago when we were talking about so and decentralizing all of us, but it was at the application level. Now we're talking about it at the data level. And now we have microservices. So there's this thought of have we managed if we're deconstructing apps in cloud native to microservices, why don't we think of data in the same way? My sense this year is that, you know, this has been a very active search if you look at Google search trends, is that now companies, like enterprise are going to look at this seriously. And as they look at it seriously, it's going to attract its first real hard scrutiny, it's going to attract its first backlash. That's not necessarily a bad thing. It means that it's being taken seriously. The reason why I think that you'll start to see basically the cold hearted light of day shine on data mesh is that it's still a work in progress. You know, this idea is basically a couple of years old and there's still some pretty major gaps. The biggest gap is in the area of federated governance. Now federated governance itself is not a new issue. Federated governance decision, we started figuring out like, how can we basically strike the balance between getting let's say between basically consistent enterprise policy, consistent enterprise governance, but yet the groups that understand the data and know how to basically, you know, that, you know, how do we basically sort of balance the two? There's a huge gap there in practice and knowledge. Also to a lesser extent, there's a technology gap which is basically in the self-service technologies that will help teams essentially govern data. You know, basically through the full life cycle, from develop, from selecting the data from, you know, building the pipelines from, you know, determining your access control, looking at quality, looking at basically whether the data is fresh or whether it's trending off course. So my prediction is that it will receive the first harsh scrutiny this year. You are going to see some organization and enterprises declare premature victory when they build some federated query implementations. You going to see vendors start with data mesh wash their products anybody in the data management space that they are going to say that where this basically a pipelining tool, whether it's basically ELT, whether it's a catalog or federated query tool, they will all going to get like, you know, basically promoting the fact of how they support this. Hopefully nobody's going to call themselves a data mesh tool because data mesh is not a technology. We're going to see one other thing come out of this. And this harks back to the metadata that Sanjeev was talking about and of the catalog just as he was talking about. Which is that there's going to be a new focus, every renewed focus on metadata. And I think that's going to spur interest in data fabrics. Now data fabrics are pretty vaguely defined, but if we just take the most elemental definition, which is a common metadata back plane, I think that if anybody is going to get serious about data mesh, they need to look at the data fabric because we all at the end of the day, need to speak, you know, need to read from the same sheet of music. 
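[Editor's note: to make the "common metadata back plane" and federated-governance ideas discussed above a little more concrete, here is a minimal sketch of what a governed data-product descriptor might carry. The field names, the example domain, and the policy rules are illustrative assumptions, not taken from any particular data mesh or data fabric product.]

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical descriptor for one data product in a mesh: the domain team owns
# the values, while the enterprise dictates which fields must be filled in.
@dataclass
class DataProduct:
    name: str                              # e.g. "sales.daily_revenue"
    domain: str                            # owning domain
    owner: str                             # accountable team or person
    schema: Dict[str, str]                 # column name -> type
    upstream: List[str]                    # lineage: source datasets
    freshness_sla_minutes: int             # how stale the data may get
    pii_columns: List[str] = field(default_factory=list)
    quality_checks: List[str] = field(default_factory=list)

# The "federated" part: a central policy applied to every domain's products.
REQUIRED = ["owner", "upstream", "freshness_sla_minutes"]

def policy_violations(product: DataProduct) -> List[str]:
    """Return a list of enterprise-policy violations; empty means compliant."""
    violations = [f"missing {f}" for f in REQUIRED if not getattr(product, f)]
    if product.pii_columns and "masking_check" not in product.quality_checks:
        violations.append("PII declared but no masking check registered")
    return violations

revenue = DataProduct(
    name="sales.daily_revenue",
    domain="sales",
    owner="sales-data-team@example.com",
    schema={"order_date": "date", "region": "string", "revenue": "decimal"},
    upstream=["raw.orders", "raw.fx_rates"],
    freshness_sla_minutes=60,
)
print(policy_violations(revenue))   # [] when the product meets central policy
```

The only point of the sketch is the split of responsibilities: the required keys are set centrally, while the values, and the data behind them, stay with the domain team.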
>> So thank you Tony. Dave Menninger, I mean, one of the things that people like about data mesh is it pretty crisply articulate some of the flaws in today's organizational approaches to data. What are your thoughts on this? >> Well, I think we have to start by defining data mesh, right? The term is already getting corrupted, right? Tony said it's going to see the cold hard light of day. And there's a problem right now that there are a number of overlapping terms that are similar but not identical. So we've got data virtualization, data fabric, excuse me for a second. (clears throat) Sorry about that. Data virtualization, data fabric, data federation, right? So I think that it's not really clear what each vendor means by these terms. I see data mesh and data fabric becoming quite popular. I've interpreted data mesh as referring primarily to the governance aspects as originally intended and specified. But that's not the way I see vendors using it. I see vendors using it much more to mean data fabric and data virtualization. So I'm going to comment on the group of those things. I think the group of those things is going to happen. They're going to happen, they're going to become more robust. Our research suggests that a quarter of organizations are already using virtualized access to their data lakes and another half, so a total of three quarters will eventually be accessing their data lakes using some sort of virtualized access. Again, whether you define it as mesh or fabric or virtualization isn't really the point here. But this notion that there are different elements of data, metadata and governance within an organization that all need to be managed collectively. The interesting thing is when you look at the satisfaction rates of those organizations using virtualization versus those that are not, it's almost double, 68% of organizations, I'm sorry, 79% of organizations that were using virtualized access express satisfaction with their access to the data lake. Only 39% express satisfaction if they weren't using virtualized access. >> Oh thank you Dave. Sanjeev we just got about a couple of minutes on this topic, but I know you're speaking or maybe you've always spoken already on a panel with (indistinct) who sort of invented the concept. Governance obviously is a big sticking point, but what are your thoughts on this? You're on mute. (panelist chuckling) >> So my message to (indistinct) and to the community is as opposed to what they said, let's not define it. We spent a whole year defining it, there are four principles, domain, product, data infrastructure, and governance. Let's take it to the next level. I get a lot of questions on what is the difference between data fabric and data mesh? And I'm like I can't compare the two because data mesh is a business concept, data fabric is a data integration pattern. How do you compare the two? You have to bring data mesh a level down. So to Tony's point, I'm on a warpath in 2022 to take it down to what does a data product look like? How do we handle shared data across domains and governance? And I think we are going to see more of that in 2022, or is "operationalization" of data mesh. >> I think we could have a whole hour on this topic, couldn't we? Maybe we should do that. But let's corner. Let's move to Carl. So Carl, you're a database guy, you've been around that block for a while now, you want to talk about graph databases, bring it on. >> Oh yeah. Okay thanks. 
So I regard graph database as basically the next truly revolutionary database management technology. I'm looking forward for the graph database market, which of course we haven't defined yet. So obviously I have a little wiggle room in what I'm about to say. But this market will grow by about 600% over the next 10 years. Now, 10 years is a long time. But over the next five years, we expect to see gradual growth as people start to learn how to use it. The problem is not that it's not useful, its that people don't know how to use it. So let me explain before I go any further what a graph database is because some of the folks on the call may not know what it is. A graph database organizes data according to a mathematical structure called a graph. The graph has elements called nodes and edges. So a data element drops into a node, the nodes are connected by edges, the edges connect one node to another node. Combinations of edges create structures that you can analyze to determine how things are related. In some cases, the nodes and edges can have properties attached to them which add additional informative material that makes it richer, that's called a property graph. There are two principle use cases for graph databases. There's semantic property graphs, which are use to break down human language texts into the semantic structures. Then you can search it, organize it and answer complicated questions. A lot of AI is aimed at semantic graphs. Another kind is the property graph that I just mentioned, which has a dazzling number of use cases. I want to just point out as I talk about this, people are probably wondering, well, we have relation databases, isn't that good enough? So a relational database defines... It supports what I call definitional relationships. That means you define the relationships in a fixed structure. The database drops into that structure, there's a value, foreign key value, that relates one table to another and that value is fixed. You don't change it. If you change it, the database becomes unstable, it's not clear what you're looking at. In a graph database, the system is designed to handle change so that it can reflect the true state of the things that it's being used to track. So let me just give you some examples of use cases for this. They include entity resolution, data lineage, social media analysis, Customer 360, fraud prevention. There's cybersecurity, there's strong supply chain is a big one actually. There is explainable AI and this is going to become important too because a lot of people are adopting AI. But they want a system after the fact to say, how do the AI system come to that conclusion? How did it make that recommendation? Right now we don't have really good ways of tracking that. Machine learning in general, social network, I already mentioned that. And then we've got, oh gosh, we've got data governance, data compliance, risk management. We've got recommendation, we've got personalization, anti money laundering, that's another big one, identity and access management, network and IT operations is already becoming a key one where you actually have mapped out your operation, you know, whatever it is, your data center and you can track what's going on as things happen there, root cause analysis, fraud detection is a huge one. 
A number of major credit card companies use graph databases for fraud detection, risk analysis, tracking and tracing turn analysis, next best action, what if analysis, impact analysis, entity resolution and I would add one other thing or just a few other things to this list, metadata management. So Sanjeev, here you go, this is your engine. Because I was in metadata management for quite a while in my past life. And one of the things I found was that none of the data management technologies that were available to us could efficiently handle metadata because of the kinds of structures that result from it, but graphs can, okay? Graphs can do things like say, this term in this context means this, but in that context, it means that, okay? Things like that. And in fact, logistics management, supply chain. And also because it handles recursive relationships, by recursive relationships I mean objects that own other objects that are of the same type. You can do things like build materials, you know, so like parts explosion. Or you can do an HR analysis, who reports to whom, how many levels up the chain and that kind of thing. You can do that with relational databases, but yet it takes a lot of programming. In fact, you can do almost any of these things with relational databases, but the problem is, you have to program it. It's not supported in the database. And whenever you have to program something, that means you can't trace it, you can't define it. You can't publish it in terms of its functionality and it's really, really hard to maintain over time. >> Carl, thank you. I wonder if we could bring Brad in, I mean. Brad, I'm sitting here wondering, okay, is this incremental to the market? Is it disruptive and replacement? What are your thoughts on this phase? >> It's already disrupted the market. I mean, like Carl said, go to any bank and ask them are you using graph databases to get fraud detection under control? And they'll say, absolutely, that's the only way to solve this problem. And it is frankly. And it's the only way to solve a lot of the problems that Carl mentioned. And that is, I think it's Achilles heel in some ways. Because, you know, it's like finding the best way to cross the seven bridges of Koenigsberg. You know, it's always going to kind of be tied to those use cases because it's really special and it's really unique and because it's special and it's unique, it's still unfortunately kind of stands apart from the rest of the community that's building, let's say AI outcomes, as a great example here. Graph databases and AI, as Carl mentioned, are like chocolate and peanut butter. But technologically, you think don't know how to talk to one another, they're completely different. And you know, you can't just stand up SQL and query them. You've got to learn, know what is the Carl? Specter special. Yeah, thank you to, to actually get to the data in there. And if you're going to scale that data, that graph database, especially a property graph, if you're going to do something really complex, like try to understand you know, all of the metadata in your organization, you might just end up with, you know, a graph database winter like we had the AI winter simply because you run out of performance to make the thing happen. So, I think it's already disrupted, but we need to like treat it like a first-class citizen in the data analytics and AI community. We need to bring it into the fold. 
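[Editor's note: a small illustration of the recursive-relationship point Carl makes above, who reports to whom and how many levels up the chain, using Python with the networkx library as a stand-in for a real graph database. The org chart is invented; in a relational schema the same questions typically need a recursive query or repeated self-joins.]

```python
import networkx as nx

# Directed graph: an edge A -> B means "A reports to B". In a table this would
# be a self-referencing foreign key; here it is just a traversal.
org = nx.DiGraph()
org.add_edges_from([
    ("ana", "maria"), ("raj", "maria"),   # Ana and Raj report to Maria
    ("maria", "chen"), ("lee", "chen"),   # Maria and Lee report to Chen
    ("chen", "dana"),                     # Chen reports to Dana
])

# Everyone who rolls up to Chen, at any depth:
print(nx.ancestors(org, "chen"))          # {'ana', 'raj', 'maria', 'lee'}

# Ana's full management chain, and how many levels up Dana sits:
chain = nx.shortest_path(org, "ana", "dana")
print(chain)                              # ['ana', 'maria', 'chen', 'dana']
print(len(chain) - 1)                     # 3 levels
```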
We need to equip it with the tools it needs to do the magic it does, and to do it not just for specialized use cases, but for everything. 'Cause I'm with Carl. I think it's absolutely revolutionary. >> Brad identified the principal Achilles' heel of the technology, which is scaling. When these things get large and complex enough that they spill over what a single server can handle, you start to have difficulties, because the relationships span things that have to be resolved over a network, and then you get network latency and that slows the system down. So that's still a problem to be solved. >> Sanjeev, any quick thoughts on this? I mean, I think metadata on the word cloud is going to be the largest font, but what are your thoughts here? >> I want to (indistinct) so people don't associate me with only metadata, so I want to talk about something slightly different. dbengines.com has done an amazing job. I think almost everyone knows that they chronicle all the major databases that are in use today. In January of 2022, there are 381 databases on a ranked list of databases. The largest category is RDBMS. The second largest category is actually divided into two: property graphs and RDF graphs. These two together make up the second largest number of databases. So talking about the Achilles heel, this is the problem. The problem is that there are so many graph databases to choose from. They come in different shapes and forms. To Brad's point, there are so many query languages. In RDBMS it's SQL, we know the story, but here we've got Cypher, we've got Gremlin, we've got GQL, and then there are proprietary languages. So I think there's a lot of disparity in this space. >> Well, excellent. All excellent points, Sanjeev, if I must say. And that is a problem, that the languages need to be sorted and standardized. People need to have a roadmap as to what they can do with it. Because as you say, you can do so many things. And so many of those things are unrelated that you sort of say, well, what do we use this for? And I'm reminded of a saying I learned a bunch of years ago. And somebody said that the digital computer is the only tool man has ever devised that has no particular purpose. (panelists chuckle) >> All right guys, we got to move on to Dave Menninger. We've heard about streaming. Your prediction is in that realm, so please take it away. >> Sure. So I like to say that historical databases are going to become a thing of the past. By that I don't mean that they're going to go away, that's not my point. I mean, we need historical databases, but streaming data is going to become the default way in which we operate with data. So in the next, say, three to five years, I would expect that data platforms, and we're using the term data platforms to represent the evolution of databases and data lakes, will incorporate these streaming capabilities. We're going to process data as it streams into an organization and then it's going to roll off into a historical database. So historical databases don't go away, but they become a thing of the past. They store the data that occurred previously. And as data is occurring, we're going to be processing it, we're going to be analyzing it, we're going to be acting on it. I mean, we only ever ended up with historical databases because we were limited by the technology that was available to us. Data doesn't occur in batches. But we processed it in batches because that was the best we could do.
And it wasn't bad and we've continued to improve and we've improved and we've improved. But streaming data today is still the exception. It's not the rule, right? There are projects within organizations that deal with streaming data. But it's not the default way in which we deal with data yet. And so that's my prediction is that this is going to change, we're going to have streaming data be the default way in which we deal with data and how you label it and what you call it. You know, maybe these databases and data platforms just evolved to be able to handle it. But we're going to deal with data in a different way. And our research shows that already, about half of the participants in our analytics and data benchmark research, are using streaming data. You know, another third are planning to use streaming technologies. So that gets us to about eight out of 10 organizations need to use this technology. And that doesn't mean they have to use it throughout the whole organization, but it's pretty widespread in its use today and has continued to grow. If you think about the consumerization of IT, we've all been conditioned to expect immediate access to information, immediate responsiveness. You know, we want to know if an item is on the shelf at our local retail store and we can go in and pick it up right now. You know, that's the world we live in and that's spilling over into the enterprise IT world We have to provide those same types of capabilities. So that's my prediction, historical databases become a thing of the past, streaming data becomes the default way in which we operate with data. >> All right thank you David. Well, so what say you, Carl, the guy who has followed historical databases for a long time? >> Well, one thing actually, every database is historical because as soon as you put data in it, it's now history. They'll no longer reflect the present state of things. But even if that history is only a millisecond old, it's still history. But I would say, I mean, I know you're trying to be a little bit provocative in saying this Dave 'cause you know, as well as I do that people still need to do their taxes, they still need to do accounting, they still need to run general ledger programs and things like that. That all involves historical data. That's not going to go away unless you want to go to jail. So you're going to have to deal with that. But as far as the leading edge functionality, I'm totally with you on that. And I'm just, you know, I'm just kind of wondering if this requires a change in the way that we perceive applications in order to truly be manifested and rethinking the way applications work. Saying that an application should respond instantly, as soon as the state of things changes. What do you say about that? >> I think that's true. I think we do have to think about things differently. It's not the way we designed systems in the past. We're seeing more and more systems designed that way. But again, it's not the default. And I agree 100% with you that we do need historical databases you know, that's clear. And even some of those historical databases will be used in conjunction with the streaming data, right? >> Absolutely. I mean, you know, let's take the data warehouse example where you're using the data warehouse as its context and the streaming data as the present and you're saying, here's the sequence of things that's happening right now. Have we seen that sequence before? And where? What does that pattern look like in past situations? And can we learn from that? 
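[Editor's note: a toy sketch of the pattern Carl and Dave describe above, score each event against recent and historical context as it arrives, then let it roll off into the historical store. This is plain in-memory Python; a production system would sit on a stream processor and a database, and the 5x threshold is an arbitrary assumption.]

```python
from collections import deque
from statistics import mean

history = []                  # the "historical database": events roll off here
recent = deque(maxlen=100)    # rolling context of the most recent amounts

def process(event):
    """Handle one event as it streams in, using recent history as context."""
    amount = event["amount"]
    baseline = mean(recent) if recent else amount
    # Real-time decision: flag anything far above what we've been seeing.
    if recent and amount > 5 * baseline:
        print(f"flag for review: {event['id']} "
              f"(amount {amount}, baseline {baseline:.2f})")
    # Then the event becomes history.
    recent.append(amount)
    history.append(event)

# Simulated stream; in practice this loop would read from Kafka, Kinesis, etc.
for e in ({"id": i, "amount": 20.0} for i in range(50)):
    process(e)
process({"id": 999, "amount": 500.0})      # trips the real-time flag
print(len(history), "events now sit in the historical store")
```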
>> So Tony Baer, I wonder if you could comment? I mean, when you think about, you know, real time inferencing at the edge, for instance, which is something that a lot of people talk about, a lot of what we're discussing here in this segment, it looks like it's got a great potential. What are your thoughts? >> Yeah, I mean, I think you nailed it right. You know, you hit it right on the head there. Which is that, what I'm seeing is that essentially. Then based on I'm going to split this one down the middle is that I don't see that basically streaming is the default. What I see is streaming and basically and transaction databases and analytics data, you know, data warehouses, data lakes whatever are converging. And what allows us technically to converge is cloud native architecture, where you can basically distribute things. So you can have a node here that's doing the real-time processing, that's also doing... And this is where it leads in or maybe doing some of that real time predictive analytics to take a look at, well look, we're looking at this customer journey what's happening with what the customer is doing right now and this is correlated with what other customers are doing. So the thing is that in the cloud, you can basically partition this and because of basically the speed of the infrastructure then you can basically bring these together and kind of orchestrate them sort of a loosely coupled manner. The other parts that the use cases are demanding, and this is part of it goes back to what Dave is saying. Is that, you know, when you look at Customer 360, when you look at let's say Smart Utility products, when you look at any type of operational problem, it has a real time component and it has an historical component. And having predictive and so like, you know, my sense here is that technically we can bring this together through the cloud. And I think the use case is that we can apply some real time sort of predictive analytics on these streams and feed this into the transactions so that when we make a decision in terms of what to do as a result of a transaction, we have this real-time input. >> Sanjeev, did you have a comment? >> Yeah, I was just going to say that to Dave's point, you know, we have to think of streaming very different because in the historical databases, we used to bring the data and store the data and then we used to run rules on top, aggregations and all. But in case of streaming, the mindset changes because the rules are normally the inference, all of that is fixed, but the data is constantly changing. So it's a completely reversed way of thinking and building applications on top of that. >> So Dave Menninger, there seem to be some disagreement about the default. What kind of timeframe are you thinking about? Is this end of decade it becomes the default? What would you pin? >> I think around, you know, between five to 10 years, I think this becomes the reality. >> I think its... >> It'll be more and more common between now and then, but it becomes the default. And I also want Sanjeev at some point, maybe in one of our subsequent conversations, we need to talk about governing streaming data. 'Cause that's a whole nother set of challenges. >> We've also talked about it rather in two dimensions, historical and streaming, and there's lots of low latency, micro batch, sub-second, that's not quite streaming, but in many cases its fast enough and we're seeing a lot of adoption of near real time, not quite real-time as good enough for many applications. 
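[Editor's note: to illustrate the "not quite streaming, but fast enough" point made just above, here is a minimal tumbling-window micro-batch aggregation in plain Python. The one-minute window and the sample events are made up; real engines such as Spark Structured Streaming or Flink do the same bookkeeping with watermarks and fault tolerance.]

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # one-minute tumbling windows

def window_start(ts: float) -> int:
    return int(ts // WINDOW_SECONDS) * WINDOW_SECONDS

def micro_batches(events):
    """events: iterable of (timestamp, value) in arrival order.
    Emits (window_start, total) each time a window closes."""
    open_windows = defaultdict(list)
    current = None
    for ts, value in events:
        w = window_start(ts)
        if current is not None and w > current:
            yield current, sum(open_windows.pop(current))  # close old window
        current = w if current is None else max(current, w)
        open_windows[w].append(value)
    for w in sorted(open_windows):                          # flush at the end
        yield w, sum(open_windows[w])

events = [(0, 5), (12, 3), (59, 2), (61, 7), (130, 1)]
for start, total in micro_batches(events):
    print(f"window starting at t={start}s -> total {total}")
```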
(indistinct cross talk from panelists) >> Because nobody's really taking the hardware dimension (mumbles). >> That'll just happened, Carl. (panelists laughing) >> So near real time. But maybe before you lose the customer, however we define that, right? Okay, let's move on to Brad. Brad, you want to talk about automation, AI, the pipeline people feel like, hey, we can just automate everything. What's your prediction? >> Yeah I'm an AI aficionados so apologies in advance for that. But, you know, I think that we've been seeing automation play within AI for some time now. And it's helped us do a lot of things especially for practitioners that are building AI outcomes in the enterprise. It's helped them to fill skills gaps, it's helped them to speed development and it's helped them to actually make AI better. 'Cause it, you know, in some ways provide some swim lanes and for example, with technologies like AutoML can auto document and create that sort of transparency that we talked about a little bit earlier. But I think there's an interesting kind of conversion happening with this idea of automation. And that is that we've had the automation that started happening for practitioners, it's trying to move out side of the traditional bounds of things like I'm just trying to get my features, I'm just trying to pick the right algorithm, I'm just trying to build the right model and it's expanding across that full life cycle, building an AI outcome, to start at the very beginning of data and to then continue on to the end, which is this continuous delivery and continuous automation of that outcome to make sure it's right and it hasn't drifted and stuff like that. And because of that, because it's become kind of powerful, we're starting to actually see this weird thing happen where the practitioners are starting to converge with the users. And that is to say that, okay, if I'm in Tableau right now, I can stand up Salesforce Einstein Discovery, and it will automatically create a nice predictive algorithm for me given the data that I pull in. But what's starting to happen and we're seeing this from the companies that create business software, so Salesforce, Oracle, SAP, and others is that they're starting to actually use these same ideals and a lot of deep learning (chuckles) to basically stand up these out of the box flip-a-switch, and you've got an AI outcome at the ready for business users. And I am very much, you know, I think that's the way that it's going to go and what it means is that AI is slowly disappearing. And I don't think that's a bad thing. I think if anything, what we're going to see in 2022 and maybe into 2023 is this sort of rush to put this idea of disappearing AI into practice and have as many of these solutions in the enterprise as possible. You can see, like for example, SAP is going to roll out this quarter, this thing called adaptive recommendation services, which basically is a cold start AI outcome that can work across a whole bunch of different vertical markets and use cases. It's just a recommendation engine for whatever you needed to do in the line of business. So basically, you're an SAP user, you look up to turn on your software one day, you're a sales professional let's say, and suddenly you have a recommendation for customer churn. Boom! It's going, that's great. Well, I don't know, I think that's terrifying. 
In some ways I think it is the future that AI is going to disappear like that, but I'm absolutely terrified of it because I think that what it really does is it calls attention to a lot of the issues that we already see around AI, specific to this idea of what we like to call at Omdia, responsible AI. Which is, you know, how do you build an AI outcome that is free of bias, that is inclusive, that is fair, that is safe, that is secure, that its audible, et cetera, et cetera, et cetera, et cetera. I'd take a lot of work to do. And so if you imagine a customer that's just a Salesforce customer let's say, and they're turning on Einstein Discovery within their sales software, you need some guidance to make sure that when you flip that switch, that the outcome you're going to get is correct. And that's going to take some work. And so, I think we're going to see this move, let's roll this out and suddenly there's going to be a lot of problems, a lot of pushback that we're going to see. And some of that's going to come from GDPR and others that Sanjeev was mentioning earlier. A lot of it is going to come from internal CSR requirements within companies that are saying, "Hey, hey, whoa, hold up, we can't do this all at once. "Let's take the slow route, "let's make AI automated in a smart way." And that's going to take time. >> Yeah, so a couple of predictions there that I heard. AI simply disappear, it becomes invisible. Maybe if I can restate that. And then if I understand it correctly, Brad you're saying there's a backlash in the near term. You'd be able to say, oh, slow down. Let's automate what we can. Those attributes that you talked about are non trivial to achieve, is that why you're a bit of a skeptic? >> Yeah. I think that we don't have any sort of standards that companies can look to and understand. And we certainly, within these companies, especially those that haven't already stood up an internal data science team, they don't have the knowledge to understand when they flip that switch for an automated AI outcome that it's going to do what they think it's going to do. And so we need some sort of standard methodology and practice, best practices that every company that's going to consume this invisible AI can make use of them. And one of the things that you know, is sort of started that Google kicked off a few years back that's picking up some momentum and the companies I just mentioned are starting to use it is this idea of model cards where at least you have some transparency about what these things are doing. You know, so like for the SAP example, we know, for example, if it's convolutional neural network with a long, short term memory model that it's using, we know that it only works on Roman English and therefore me as a consumer can say, "Oh, well I know that I need to do this internationally. "So I should not just turn this on today." >> Thank you. Carl could you add anything, any context here? >> Yeah, we've talked about some of the things Brad mentioned here at IDC and our future of intelligence group regarding in particular, the moral and legal implications of having a fully automated, you know, AI driven system. Because we already know, and we've seen that AI systems are biased by the data that they get, right? 
So if they get data that pushes them in a certain direction, I think there was a story last week about an HR system that was recommending promotions for White people over Black people, because in the past, you know, White people were promoted and more productive than Black people, but it had no context as to why which is, you know, because they were being historically discriminated, Black people were being historically discriminated against, but the system doesn't know that. So, you know, you have to be aware of that. And I think that at the very least, there should be controls when a decision has either a moral or legal implication. When you really need a human judgment, it could lay out the options for you. But a person actually needs to authorize that action. And I also think that we always will have to be vigilant regarding the kind of data we use to train our systems to make sure that it doesn't introduce unintended biases. In some extent, they always will. So we'll always be chasing after them. But that's (indistinct). >> Absolutely Carl, yeah. I think that what you have to bear in mind as a consumer of AI is that it is a reflection of us and we are a very flawed species. And so if you look at all of the really fantastic, magical looking supermodels we see like GPT-3 and four, that's coming out, they're xenophobic and hateful because the people that the data that's built upon them and the algorithms and the people that build them are us. So AI is a reflection of us. We need to keep that in mind. >> Yeah, where the AI is biased 'cause humans are biased. All right, great. All right let's move on. Doug you mentioned mentioned, you know, lot of people that said that data lake, that term is not going to live on but here's to be, have some lakes here. You want to talk about lake house, bring it on. >> Yes, I do. My prediction is that lake house and this idea of a combined data warehouse and data lake platform is going to emerge as the dominant data management offering. I say offering that doesn't mean it's going to be the dominant thing that organizations have out there, but it's going to be the pro dominant vendor offering in 2022. Now heading into 2021, we already had Cloudera, Databricks, Microsoft, Snowflake as proponents, in 2021, SAP, Oracle, and several of all of these fabric virtualization/mesh vendors joined the bandwagon. The promise is that you have one platform that manages your structured, unstructured and semi-structured information. And it addresses both the BI analytics needs and the data science needs. The real promise there is simplicity and lower cost. But I think end users have to answer a few questions. The first is, does your organization really have a center of data gravity or is the data highly distributed? Multiple data warehouses, multiple data lakes, on premises, cloud. If it's very distributed and you'd have difficulty consolidating and that's not really a goal for you, then maybe that single platform is unrealistic and not likely to add value to you. You know, also the fabric and virtualization vendors, the mesh idea, that's where if you have this highly distributed situation, that might be a better path forward. The second question, if you are looking at one of these lake house offerings, you are looking at consolidating, simplifying, bringing together to a single platform. You have to make sure that it meets both the warehouse need and the data lake need. So you have vendors like Databricks, Microsoft with Azure Synapse. 
They're new to the data warehouse space and they're having to prove that the data warehouse capabilities on their platforms can meet the scaling requirements, can meet the user and query concurrency requirements. Meet those tight SLAs. And then on the other hand, you have the Oracle, SAP, Snowflake, the data warehouse folks coming into the data science world, and they have to prove that they can manage the unstructured information and meet the needs of the data scientists. I'm seeing a lot of the lake house offerings from the warehouse crowd managing that unstructured information in columns and rows. And some of these vendors, Snowflake in particular, are really relying on partners for the data science needs. So you really got to look at a lake house offering and make sure that it meets both the warehouse and the data lake requirement. >> Thank you Doug. Well Tony, if those two worlds are going to come together, as Doug was saying, the analytics and the data science world, does it need to be some kind of semantic layer in between? I don't know. Where are you on this topic? >> (chuckles) Oh, didn't we talk about data fabrics before? Common metadata layer. (chuckles) Actually, I'm almost tempted to say let's declare victory and go home, in that this has actually been going on for a while. I actually agree with, you know, much of what Doug is saying there. Which is that, I mean, I remember as far back as, I think it was like 2014, I was doing a study. I was still at Ovum, (indistinct) Omdia, looking at all these specialized databases that were coming up and seeing that, you know, there's overlap at the edges. But yet, there was still going to be a reason at the time that you would have, let's say, a document database for JSON, you'd have a relational database for transactions and for data warehouse, and you had basically something at that time that resembled Hadoop for what we now consider the data lake. Fast forward, and the thing is, what I was seeing at the time is that they were sort of blending at the edges. That was, say, about five to six years ago. And the lake house is essentially the current manifestation of that idea. There is a dichotomy in terms of, you know, it's the old argument, do we centralize this all, you know, in a single place or do we virtualize? And I think it's always going to be a union, and there's never going to be a single silver bullet. I do see that there are also going to be questions, and these are points that Doug raised. That is, you know, what do you need for your performance characteristics? Do you need, for instance, high concurrency? Do you need the ability to do some very sophisticated joins, or is your requirement more to be able to distribute the processing, you know, as far as possible, to essentially do a kind of brute force approach? All these approaches are valid based on the use case. I just see that essentially the lake house is the culmination of this. It's a relatively new term, introduced by Databricks a couple of years ago, but this is the culmination of basically what's been a long time trend. And what we see in the cloud is that we start seeing data warehouses offer as a checkbox item, "Hey, we can basically source data in cloud storage, in S3, Azure Blob Store, you know, whatever, as long as it's in certain formats, like, you know, Parquet or CSV or something like that." I see that as becoming kind of a checkbox item.
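[Editor's note: the "checkbox item" Tony describes above, a warehouse-style SQL engine querying open-format files where they sit, can be shown in a few lines. This sketch uses DuckDB over a local Parquet file and assumes recent duckdb and pyarrow packages are installed; the file and column names are invented, and a cloud deployment would point the same query at S3 or Azure Blob paths instead.]

```python
import duckdb
import pandas as pd

# Pretend this Parquet file is the "data lake": open format, no load step.
pd.DataFrame({
    "region": ["east", "west", "east", "south"],
    "revenue": [120.0, 95.5, 210.0, 80.0],
}).to_parquet("orders.parquet")            # needs pyarrow or fastparquet

# A warehouse-style engine queries the file in place, lake house style.
result = duckdb.sql("""
    SELECT region, SUM(revenue) AS total_revenue
    FROM read_parquet('orders.parquet')
    GROUP BY region
    ORDER BY total_revenue DESC
""").df()
print(result)
```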
So to that extent, I think that the lake house, depending on how you define is already reality. And in some cases, maybe new terminology, but not a whole heck of a lot new under the sun. >> Yeah. And Dave Menninger, I mean a lot of these, thank you Tony, but a lot of this is going to come down to, you know, vendor marketing, right? Some people just kind of co-op the term, we talked about you know, data mesh washing, what are your thoughts on this? (laughing) >> Yeah, so I used the term data platform earlier. And part of the reason I use that term is that it's more vendor neutral. We've tried to sort of stay out of the vendor terminology patenting world, right? Whether the term lake houses, what sticks or not, the concept is certainly going to stick. And we have some data to back it up. About a quarter of organizations that are using data lakes today, already incorporate data warehouse functionality into it. So they consider their data lake house and data warehouse one in the same, about a quarter of organizations, a little less, but about a quarter of organizations feed the data lake from the data warehouse and about a quarter of organizations feed the data warehouse from the data lake. So it's pretty obvious that three quarters of organizations need to bring this stuff together, right? The need is there, the need is apparent. The technology is going to continue to converge. I like to talk about it, you know, you've got data lakes over here at one end, and I'm not going to talk about why people thought data lakes were a bad idea because they thought you just throw stuff in a server and you ignore it, right? That's not what a data lake is. So you've got data lake people over here and you've got database people over here, data warehouse people over here, database vendors are adding data lake capabilities and data lake vendors are adding data warehouse capabilities. So it's obvious that they're going to meet in the middle. I mean, I think it's like Tony says, I think we should declare victory and go home. >> As hell. So just a follow-up on that, so are you saying the specialized lake and the specialized warehouse, do they go away? I mean, Tony data mesh practitioners would say or advocates would say, well, they could all live. It's just a node on the mesh. But based on what Dave just said, are we gona see those all morphed together? >> Well, number one, as I was saying before, there's always going to be this sort of, you know, centrifugal force or this tug of war between do we centralize the data, do we virtualize? And the fact is I don't think that there's ever going to be any single answer. I think in terms of data mesh, data mesh has nothing to do with how you're physically implement the data. You could have a data mesh basically on a data warehouse. It's just that, you know, the difference being is that if we use the same physical data store, but everybody's logically you know, basically governing it differently, you know? Data mesh in space, it's not a technology, it's processes, it's governance process. So essentially, you know, I basically see that, you know, as I was saying before that this is basically the culmination of a long time trend we're essentially seeing a lot of blurring, but there are going to be cases where, for instance, if I need, let's say like, Upserve, I need like high concurrency or something like that. There are certain things that I'm not going to be able to get efficiently get out of a data lake. 
Or, you know, if I'm doing a system where I'm just doing really brute force, very fast file scanning and that type of thing. So I think there always will be some delineations, but I would agree with Dave and with Doug that we are seeing basically a confluence of requirements, that we need to essentially have, you know, the abilities of a data lake and a data warehouse, these need to come together, so I think. >> I think what we're likely to see is organizations look for a converged platform that can handle both sides for their center of data gravity. The mesh and the fabric virtualization vendors, they're all on board with the idea of this converged platform and they're saying, "Hey, we'll handle all the edge cases of the stuff that isn't in that center of data gravity but that is distributed off in a cloud or at a remote location." So you can have that single platform for the center of your data and then bring in virtualization, mesh, what have you, for reaching out to the distributed data. >> As Dave basically said, people are happy when they virtualize data. >> I think yes, at this point, but to Dave Menninger's point, they are converging. Snowflake has introduced support for unstructured data, so now we're literally splitting hairs. Now what Databricks is saying is that, "Aha, but it's easier to go from data lake to data warehouse than it is from data warehouse to data lake." So I think we're getting into semantics, but we're already seeing these two converge. >> So take somebody like AWS, who's got what, 15 data stores. Are they going to have 15 converged data stores? This is going to be interesting to watch. All right, guys, I'm going to go down the list and do like a one word each, and you guys, each of the analysts, if you would just add a very brief sort of course correction for me. So Sanjeev, I mean, governance is going to be... Maybe it's the dog that wags the tail now. I mean, it's coming to the fore, all this ransomware stuff, we really didn't talk much about security, but what's the one word in your prediction that you would leave us with on governance? >> It's going to be mainstream. >> Mainstream. Okay. Tony Baer, mesh washing is what I wrote down. That's what we're going to see in 2022, a little reality check, you want to add to that? >> Reality check, 'cause I hope that no vendor jumps the shark and calls their offering a data mesh product. >> Yeah, let's hope that doesn't happen. If they do, we're going to call them out. Carl, I mean, graph databases, thank you for sharing some high growth metrics. I know it's early days, but magic is what I took away from that, so magic database. >> Yeah, I would actually, I've said this to people too. I kind of look at it as a Swiss Army knife of data, because you can pretty much do anything you want with it. That doesn't mean you should. I mean, there's definitely the case that if you're managing things that are in a fixed schematic relationship, probably a relational database is a better choice. There are times when a document database is a better choice. It can handle those things, but maybe not. It may not be the best choice for that use case. But for a great many, especially the new emerging use cases I listed, it's the best choice. >> Thank you. And Dave Menninger, thank you by the way, for bringing the data in, I like how you supported all your comments with some data points. But streaming data becomes the sort of default paradigm, if you will, what would you add?
>> Yeah, I would say think fast, right? That's the world we live in, you got to think fast. >> Think fast, love it. And Brad Shimmin, love it. I mean, on the one hand I was saying, okay, great, I'm afraid I might get disrupted by one of these internet giants who are AI experts. I'm going to be able to buy instead of build AI. But then again, you know, I've got some real issues. There's a potential backlash there. So give us your bumper sticker. >> I would say, going with Dave, think fast and also think slow, to borrow from the book that everyone talks about. I would say really that this is all about trust, trust in the idea of automation and a transparent and visible AI across the enterprise. And verify, verify before you do anything. >> And then Doug Henschen, I mean, I think the trend is your friend here on this prediction, with lake house really becoming dominant. I liked the way you set up that notion of, you know, the data warehouse folks coming at it from the analytics perspective and then you get the data science worlds coming together. I still feel as though there's this piece in the middle that we're missing, but for your final thoughts, we'll give you the (indistinct). >> I think the idea of consolidation and simplification always prevails. That's why the appeal of a single platform is going to be there. We've already seen that with, you know, Hadoop platforms and moving toward cloud, moving toward object storage, and object storage becoming really the common storage point, whether it's a lake or a warehouse. And that second point, I think ESG mandates are going to come in alongside GDPR and things like that to up the ante for good governance. >> Yeah, thank you for calling that out. Okay folks, hey, that's all the time that we have here. Your experience and depth of understanding on these key issues on data and data management were really on point, and they were on display today. I want to thank you for your contributions. Really appreciate your time. >> Enjoyed it. >> Thank you. >> Thanks for having me. >> In addition to this video, we're going to be making available transcripts of the discussion. We're going to do clips of this as well, and we're going to put them out on social media. I'll write this up and publish the discussion on wikibon.com and siliconangle.com. No doubt, several of the analysts on the panel will take the opportunity to publish written content, social commentary or both. I want to thank the power panelists and thanks for watching this special CUBE presentation. This is Dave Vellante, be well and we'll see you next time. (bright music)
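As a concrete illustration of the convergence the panel keeps coming back to, the sketch below shows the basic lake house idea in one engine: a governed, warehouse-style table and a raw Parquet file from the lake answered by a single SQL query. It uses the open-source DuckDB engine purely as a neutral stand-in, since none of the panelists name it, and the table, columns, and file path are invented.

```python
# Minimal "lake house" sketch: one engine querying warehouse-style tables
# plus raw lake files. DuckDB, the table, and the file path are illustrative only.
import duckdb

con = duckdb.connect()  # in-memory database standing in for the "warehouse"

# Warehouse side: a governed, modeled dimension table.
con.execute("CREATE TABLE dim_customer (customer_id INTEGER, segment TEXT)")
con.execute("INSERT INTO dim_customer VALUES (1, 'enterprise'), (2, 'consumer')")

# Lake side: raw event data landed as a Parquet file on storage.
con.execute("""
    COPY (SELECT * FROM (VALUES (1, 'login'), (2, 'purchase'), (1, 'purchase'))
          AS t(customer_id, event_type))
    TO 'events.parquet' (FORMAT PARQUET)
""")

# One query spans both: the modeled table joined directly against the raw lake file.
rows = con.execute("""
    SELECT c.segment, e.event_type, COUNT(*) AS events
    FROM read_parquet('events.parquet') AS e
    JOIN dim_customer AS c USING (customer_id)
    GROUP BY 1, 2
    ORDER BY 1, 2
""").fetchall()
print(rows)
```

The point is not the particular engine; it is that the modeled data and the raw data no longer need two separate systems in order to be queried together.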
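Carl's "Swiss Army knife" comment about graph databases is easiest to see with a relationship-heavy question. The snippet below uses the open-source networkx library as a stand-in for a property graph, not any product mentioned in the discussion, and the people and edges are made up.

```python
# Relationship-first queries: multi-hop traversals that are awkward as SQL
# self-joins but natural in a graph model. networkx stands in for a graph
# database here; the people and "works_with" edges are invented.
import networkx as nx

g = nx.Graph()
g.add_edges_from([
    ("Ana", "Bo"), ("Bo", "Cy"), ("Cy", "Dee"),
    ("Ana", "Eli"), ("Eli", "Dee"), ("Dee", "Fay"),
])

# "How is Ana connected to Fay?" -- a variable-length path question.
print(nx.shortest_path(g, "Ana", "Fay"))

# "Who sits within two hops of Ana?" -- neighborhood expansion.
within_two = nx.single_source_shortest_path_length(g, "Ana", cutoff=2)
print(sorted(n for n, dist in within_two.items() if 0 < dist <= 2))
```

A fixed schematic relationship, as Carl notes, is usually still better served by a relational table; the graph shines when the question itself is about paths and neighborhoods.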
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Menninger | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Doug Henschen | PERSON | 0.99+ |
David | PERSON | 0.99+ |
Brad Shimmin | PERSON | 0.99+ |
Doug | PERSON | 0.99+ |
Tony Baer | PERSON | 0.99+ |
Dave Velannte | PERSON | 0.99+ |
Tony | PERSON | 0.99+ |
Carl | PERSON | 0.99+ |
Brad | PERSON | 0.99+ |
Carl Olofson | PERSON | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
2014 | DATE | 0.99+ |
Sanjeev Mohan | PERSON | 0.99+ |
Ventana Research | ORGANIZATION | 0.99+ |
2022 | DATE | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
last year | DATE | 0.99+ |
January of 2022 | DATE | 0.99+ |
three | QUANTITY | 0.99+ |
381 databases | QUANTITY | 0.99+ |
IDC | ORGANIZATION | 0.99+ |
Informatica | ORGANIZATION | 0.99+ |
Snowflake | ORGANIZATION | 0.99+ |
Databricks | ORGANIZATION | 0.99+ |
two | QUANTITY | 0.99+ |
Sanjeev | PERSON | 0.99+ |
2021 | DATE | 0.99+ |
ORGANIZATION | 0.99+ | |
Omdia | ORGANIZATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
SanjMo | ORGANIZATION | 0.99+ |
79% | QUANTITY | 0.99+ |
second question | QUANTITY | 0.99+ |
last week | DATE | 0.99+ |
15 data stores | QUANTITY | 0.99+ |
100% | QUANTITY | 0.99+ |
SAP | ORGANIZATION | 0.99+ |
Danielle Royston & Robin Langdon, Totogi | Cloud City Live 2021
(gentle music) >> Okay, thank you Adam. Thank you everyone for joining us on the main stage here, folks watching, appreciate it. I'm John Furrier, Dave Vellante co-hosts of theCube. We're here in the main stage to talk to the two main players of Totogi, Danielle Royston, CEO as of today, the big news. Congratulations. >> Danielle: Yeah. Thank you. >> And Robin Langdon the CTO, Totogi. >> Robin: Thanks. So big news, CEO news today and $100 million investment. Every wants to know where's all the action? Why is this so popular right now? (Danielle chuckles) What's going on? Give us the quick update. >> Yeah, I met the Totogi guys and they have this great product I was really excited about. They're focused purely on telco software and bringing, coupling that with the Public Cloud, which is everything that I talk about, what I've been about for so long. And I really wanted to give them enough funding so they could focus on building great products. A lot of times, telcos, startups, you know they try to get a quick win. They kind of chase the big guys and I really wanted to make sure they were focused on building a great product. #2, I really wanted to show the industry, they had the funding they needed to be a real player. This wasn't like $5 million or a couple million dollars, so that was really important. And then #3, I want to make sure that we could hire great talent and you need money for compensation. And so $100 million it is. >> $100 million is a lot of fresh fat financing as they say. I got to ask you, what's different? Because I've been researching on the refactoring aspect of with the Cloud, obviously public cloud with AWS, a big deal. What's different about the charging aspect of this? >> Yeah I mean, charging hasn't been exciting, maybe ever. I mean, it's kind of like this really sort of sleepy area, but I think what the Totogi guys are doing is they're really coupling the idea of charging and network data to bring hyper-personalization to subscribers. And I think that's where it changes from being a charging engine to become an engagement engine. Telcos know more about us than Google, which is kind of crazy to think about it. They know when we wake up, they know what apps we use. If we call or text, if we game or stream and it's time to start using that data to drive a better experience to us. And I think to Totogi is enabling that. I'm super excited to do that. >> So Robin, I wonder if you could talk about that a little bit. I mean, maybe we get into the plumbing and I don't want to go too deep in it, but I think it's important because we've seen this movie before where people take their on-prem stacks, they wrap it in containers and they shove it into the Public Cloud and they say, "Hey, we're cloud too." If reading a press release, you guys are taking advantage of things like Amazon Nitro of course, but also Graviton and Graviton2 and eventually 3, which is the underlying capabilities, give you a cloud native advantage. Can you explain that a little bit? >> Yeah, absolutely. I mean, we wanted to build this in the Cloud using all of those great cloud innovations. So Graviton2, DynamoDB and using their infrastructure, just allowing us to be able to scale out. These all available to us to use and essentially free for us to use. And it's great, so as you say, we're not shoehorning something in that's decade's old technology, wrapping it in some kind of container and pushing it in. Which is just then, you just can't use any of those great innovations. 
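As a rough sketch of the cloud-native building blocks Robin refers to, here is what a single-table DynamoDB access pattern can look like with boto3. The table name, key design, and items are hypothetical and are not Totogi's actual schema; the example assumes a table already exists with a partition key `pk` and sort key `sk`, and that AWS credentials are configured.

```python
# Minimal single-table DynamoDB sketch (boto3). Table name, keys and attributes
# are invented illustrations of the pattern, not any vendor's real data model.
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("charging-events")  # assumed table keyed on pk (hash) + sk (range)

subscriber = "SUB#46708912345"

# The subscriber profile and its usage events share one partition key, so a
# single Query returns everything needed to rate and charge that subscriber.
table.put_item(Item={"pk": subscriber, "sk": "PROFILE", "plan": "PAYG-5GB"})
table.put_item(Item={"pk": subscriber, "sk": "EVENT#2021-06-28T10:15:00Z",
                     "type": "data", "mb_used": 42})
table.put_item(Item={"pk": subscriber, "sk": "EVENT#2021-06-28T10:20:00Z",
                     "type": "data", "mb_used": 17})

resp = table.query(KeyConditionExpression=Key("pk").eq(subscriber)
                   & Key("sk").begins_with("EVENT#"))
total_mb = sum(item["mb_used"] for item in resp["Items"])
print(f"{subscriber}: {total_mb} MB consumed")
```

Packing related items into one partition is what lets reads scale horizontally: the charging path is a single key-addressed query rather than a set of joins.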
>> And you've selected DynamoDB as the database. Okay, that's fine. We don't have to get so much into why, but maybe you could explain the advantage because I saw some benchmark numbers which were, like an order of magnitude greater than the competition, like share with us, why? How you were able to get there? And maybe share those numbers. >> Yeah, no, we do. So we just launched our benchmark. So, a million transactions per second. So we just blew away everyone else out there. And that's really because we could take advantage of all that great AWS technology in there and the database side we're using DynamoDB, where we had a huge debate about using what kind of database to go and use? There's a lot of people out there probably get very religious about the kind of database technology that you should be using. And whether it should it be SQL in-memory object database type technology, but really a single table design, gives you that true scalability. You can just horizontally scale that easily, across the whole planet. >> You know, Danielle. Again, I said that we've seen this movie before. There are a lot of parallels in telco with the enterprise. And if you look at enterprise SAS pricing, a lot of it is very static, kind of lock you in, per seat pricing, kind of an old model. And you're seeing a lot of the modern SAS companies who are emerging with a consumption pricing models. How are you guys thinking about pricing? >> Yeah, I don't know of any other company in telco that's starting to price by usage. And that is a very standard offering with the cloud providers, right? Google we know, Amazon, all those guys have a price by the API, price by the transaction. So we're really excited to offer that to telcos. They've been asking for it for awhile, right? Pay for what you need, when you need it, by the use. And so we're really excited to offer that, but I think what's really cool is the idea of a free tier, right? And so I think it's smaller telcos have a trade-off to make, whether, am I going to buy the best technology and pay through the nose and maybe at an unaffordable level, or do I compromise and buy something more affordable, but not as great. And what's so great about Totogi, it's the same product just priced for what you need. And so I think a CSP it'll, below 250,000 subscribers should be able to use the Totogi absolutely for free. And that is, and it's the same product that the big guy would get. So it's not a junior version or scaled back. And so I think that's really exciting. I think we're the only ones that do it. So here we go. >> Love the freemium model. So Robin, maybe you could explain why that's so much, so important in the charging space, because you've got a lot of different options that you want to configure for the consumer. >> Yeah. >> Maybe you could talk about sort of how the old world does that, the old guard and how long it takes and how you're approaching this. >> Yeah so it's, I mean traditionally, charging design, there's as you say, there's lots of different pricing leavers you want to be able to move and change to charge different people. And these systems, even if they say they're configurable, if they normally turn into an IT project where it takes weeks, months, even years to build out the system, you know, marketing can't just go in there and configure the dials and push out your new plans and tariffs. They have to go and create a requirement specification. They hand it down to IT. Those guys go and create a big change project. 
And by the time they're finished, the market's moved on. They're on to their next plan, their next tariff to go and build. So we wanted to create something that was truly configurable from a marketing standpoint. You know, user-friendly, they can go in there, configure it and be live in minutes, not even days or weeks. >> No, IT necessary. >> Robin: No IT necessary. >> So you know, I've been thinking about, John and I talk about this all the time, It's that there's a data play here. And what I think you're doing is actually building a data product. I think there's a new metric emerging in the industry, which is how long does it take me to go from idea to monetization with a data product. And that's what this is. This is a data product >> Yeah. >> for your customers. >> Absolutely, what Robin was talking about is totally the way the industry works. It's weeks before you have an idea and get it out to the market. And like Robin was mentioning, the market's changed by the time you get it out there, the data's stale. And so we researched every single plan in the world from every single CSP. There is about 30,000 plans in the world, right? The bigger you are, the more plans you have. On average, a tier one telco has 40 to 50 plans. And so how many offers, I mean think about, that's how many phones to buy, plans to buy. And so we're like, let's get some insight on the plans. Let's drive it into a standardization, right? Let's make them, which ones work, which ones don't. And that's, I think you're right. I think it's a data play and putting the power back into the marketer's hands and not with IT. >> So there's a lot of data on-prem. Explain why I can't do this with my on-prem data. >> Oh, well today that, I mean, sorry if you want to jump in. Feel free to jump in, right. But today, the products are designed in a way where they're, perpetually licensed, by the subscriber, rigid systems, not API based. I mean, there might be an API, but you got to pay through the nose to use it or you got to use the provider's people to code against it. They're inflexible. They were written when voice was the primary revenue driver, not data, right? And so they've been shoehorned, right? Like Robin was saying, shoehorned to be able to move into the world that we are now. I mean, when the iPhone came about that introduced apps and data went through the roof and the systems were written for voice, not written for data. >> And that's a good point, if you think about the telco industry, it seems like it could be a glacier that just needs to just break and just like, just get modern because we all have phones. We have apps. We can delete them. And the billing plans, like either nonexistent or it has to be all free. >> Well I mean, I'll ask you. Do you know what your billing plan is? Do you know how much data you use on a monthly basis? No one knows. >> I have no clue. >> A lot. >> No one. And so what you do is you buy unlimited. >> Dave: Right. >> You overpay. And so what we're seeing in the plans is that if you actually knew how much you used, you would be able to maybe pay less, which I remember the telcos are not excited to hear that message, but it's a win for the subscriber. And if you could >> I mean it's only >> accurately predict that. >> get lower and lower. I having a conversation last night at dinner with industry analysts, we're talking about a vehicle e-commerce, commerce in your car as you're driving. You can get that kind of with a 5G. The trend is transactions everywhere, ad-hoc, ephemeral... >> Yeah. 
>> The new apps are going to demand this kind of subscriber billing. >> Yeah >> Do people get this? Are you guys the only ones kind of like on this? >> No I think people have been talking about it for years. I think there's vendors out there that have been trying to offer this idea of like, build your own plan and all that other stuff but I think it's more than just minutes, text and data. It's starting to really understand what subscribers are using, right? Are you a football fan? Are you a golf fan? Are you a shopper? Are you a concert goer? And couple that with how you use your phone and putting out offers that are really exciting to subscribers so that we love our telco. Like we should be loving our telco. And I don't... I don't know that people talk >> They saved us >> about loving their telco. >> from the pandemic >> They saved us during the pandemic. The internet didn't crash, we got our zoom meetings. We got everything going on. What's the technical issue on solving these problems? Is it just legacy? Is it just mindset? Robin, what's your take on that? >> I'll keep talking as long as Robin will let me. (Daniel laughing) >> So the big technical issues, you're trying to build in this flexibility so that you can have, we don't know what people are going to configure in the future. It's minutes and text messages are given away for free. They're unlimited. Data is where it's at, about charging for apps and about using all that data in the network the telcos have, which is extremely valuable and there's a wealth of information in there that can be used to be monetized and push that out. And they need a charging system on top that can manage that and we have the flexibility that you don't have to go off and then start creating programs and IT projects that are going to do that. >> Well it's funny Danielle, you say that the telcos might not like that, right? 'Cause you might pay less. But in fact, that is the kind of on-prem mindset because when you have a fixed resource, you say, okay, don't use too much because we have to buy more. Or you overbuy to your point. The cloud mindset is, I'll try it. I'll try some more, I'll try some more. I'm aligning it with business value. Oh, I'm making money. Oh, great. I'm going to keep buying more. And it's very clear. It's transparent where the business value is. So my question is when you think about your charging engine and all this data conversation, is there more than just a charging engine in this platform? >> Well, I think just going back to what Robin was talking about. I think what Totogi is doing differently is by building it on the Public Cloud gives you virtually unlimited resources, right? In a couple of different directions, certainly hardware and capacity and scalability and all those other things, right? But also as Amazon is putting out more and more product, when you build it in this new way, you can take advantage of these new services very, very easily. And that is a different mindset. It's a different way to deploy applications. And I think that's what makes Totogi really different. You couldn't build Totogi on-premise because you need the infinite scalability. You need the machine learning, you need the AI of Amazon, which they have been investing in for decades, if they now charge you by the API call. And you get to use it like you were saying. Just give it a try, don't like it, stop. And it's just a completely different way of thinking, yeah. 
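To show mechanically what the pay-for-what-you-use model with a free tier that Danielle describes can look like, here is a small, self-contained sketch. The tier boundaries and unit prices are invented for illustration and are not Totogi's actual pricing.

```python
# Toy usage-based rating with a free tier. All tier boundaries and unit prices
# are invented; in practice these would be configurable plan parameters.
FREE_TIER_MB = 1_000          # first 1,000 MB in a period are free (assumption)
TIERS = [                     # (upper bound of billable MB, price per MB)
    (10_000, 0.010),
    (100_000, 0.005),
    (float("inf"), 0.002),
]

def charge_for_usage(total_mb: float) -> float:
    """Rate one period's data usage: free tier first, then graduated paid tiers."""
    billable = max(0.0, total_mb - FREE_TIER_MB)
    amount, lower = 0.0, 0.0
    for upper, price in TIERS:
        band = max(0.0, min(billable, upper) - lower)
        amount += band * price
        lower = upper
        if billable <= upper:
            break
    return round(amount, 2)

print(charge_for_usage(800))     # inside the free tier -> 0.0
print(charge_for_usage(25_000))  # spans the first two paid tiers
```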
>> If I have to ask you a question about the Public Cloud, because the theme here in Cloud City is the Public Cloud is driving innovation, which is also includes disruption. And the new brands are coming in, old brands are either reinventing themselves or falling away. What is the Public Cloud truly enabling? Is it the scale? Is it the data? Is it the refactoring capability? What is the real driver of the Public Cloud innovation? >> I think the insight that CSPs are going to have is what Jamie Dimon had in banking. Like I think he was pretty famously saying, "I'm never going to use the Public Cloud. Our data is too precious, you know, regulations and all that stuff." But I think the insight they're going to have, and I hopefully, I do a keynote and I mentioned this, which is feature velocity. The ability to put out features in a day or two. Our feature velocity in telco is months. Months, months. >> Seriously? >> Yeah, sometimes years. It's just so slow between big iterations of new capability and to be able to put out new features in minutes or days and be able to outmaneuver your competition is unheard of. So the CSPs that starts to get this, it's going to be a real big get, and then they're going to start to.. (Danielle makes swishing sound) >> We just interviewed (Dave speaking indistinctly) a venture capitalist, Dave and I last month. And he's a big investor in Snowflake, on the big deals. He said that the new strategy that's working is you go to be agile with feature acceleration. We just talked about this at lunch and you get data. And you can dismantle the bad features quickly and double down >> Yup. >> on the winners. >> Ones that are working. So what used to be feature creep now is a benefit if you play it right? >> Danielle: It's feature experimentation. >> That's essentially what you- >> It's experimentation, right? And you're like, that one worked, this one didn't, kill that one, double down on this one, go faster and faster and so feature experimentation, which you can't do in telco, because every time we ask for a feature from your current vendor, it's hundreds of thousands, if not millions of dollars. So you don't experiment. And so yeah- >> You can make features disposable. >> Correct. And I think that we just discovered that on this stage just now. (group chuckling) >> Hey look at this. Digital revolution, DR. Telco DR. >> Yeah. >> Great to have you guys. >> This is super awesome. Thanks so much. >> You guys are amazing. Congratulations. And we're looking forward to the more innovation stories again, get out there, get the momentum. Great stuff. >> Danielle: It's going to be great. >> And awesome. >> Feature experimentation. >> Yeah. >> Hashtag. >> And Dave and I are going to head back over to our Cube set here, here on the main stage. We'll toss it back to the Adam in the studio. Adam, back to you and take it from here.
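The feature experimentation idea that comes up above, ship a feature to a slice of subscribers, measure, then kill it or double down, is commonly implemented with percentage-based flags. The sketch below shows that mechanic in a generic way; the flag names and rollout percentages are made up and this is not any particular vendor's flag system.

```python
# Tiny percentage-rollout feature flag: deterministic hashing buckets each
# subscriber, so the same user always sees the same variant while an
# experiment runs. Flag names and rollout percentages are invented.
import hashlib

ROLLOUTS = {"new-family-plan-offer": 10, "gamer-bundle-banner": 50}  # percent enabled

def is_enabled(flag: str, subscriber_id: str) -> bool:
    digest = hashlib.sha256(f"{flag}:{subscriber_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100          # stable bucket in [0, 100)
    return bucket < ROLLOUTS.get(flag, 0)

# Killing an experiment is a config change, not a code change: set the
# percentage to 0; doubling down means raising it toward 100.
enabled = sum(is_enabled("new-family-plan-offer", f"sub-{i}") for i in range(10_000))
print(f"{enabled / 100:.1f}% of the sample sees the offer")
```

Because the bucketing is deterministic, turning an experiment off or expanding it is configuration rather than a redeploy, which is what makes day-scale feature velocity practical.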
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave | PERSON | 0.99+ |
Danielle | PERSON | 0.99+ |
40 | QUANTITY | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
John | PERSON | 0.99+ |
Jamie Dimon | PERSON | 0.99+ |
Robin | PERSON | 0.99+ |
Adam | PERSON | 0.99+ |
Danielle Royston | PERSON | 0.99+ |
Robin Langdon | PERSON | 0.99+ |
ORGANIZATION | 0.99+ | |
$5 million | QUANTITY | 0.99+ |
Daniel | PERSON | 0.99+ |
John Furrier | PERSON | 0.99+ |
$100 million | QUANTITY | 0.99+ |
telco | ORGANIZATION | 0.99+ |
Telcos | ORGANIZATION | 0.99+ |
iPhone | COMMERCIAL_ITEM | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Totogi | ORGANIZATION | 0.99+ |
today | DATE | 0.99+ |
millions of dollars | QUANTITY | 0.99+ |
a day | QUANTITY | 0.99+ |
50 plans | QUANTITY | 0.99+ |
hundreds of thousands | QUANTITY | 0.99+ |
last month | DATE | 0.99+ |
two main players | QUANTITY | 0.98+ |
Totogi | PERSON | 0.98+ |
telcos | ORGANIZATION | 0.98+ |
about 30,000 plans | QUANTITY | 0.98+ |
single table | QUANTITY | 0.97+ |
two | QUANTITY | 0.96+ |
last night | DATE | 0.96+ |
Telco | ORGANIZATION | 0.95+ |
pandemic | EVENT | 0.95+ |
below 250,000 subscribers | QUANTITY | 0.95+ |
decades | QUANTITY | 0.94+ |
DynamoDB | TITLE | 0.92+ |
Cloud City | TITLE | 0.91+ |
Snowflake | TITLE | 0.9+ |
2021 | DATE | 0.86+ |
couple million dollars | QUANTITY | 0.86+ |
Cube | COMMERCIAL_ITEM | 0.84+ |
a million transactions per second | QUANTITY | 0.82+ |
theCube | ORGANIZATION | 0.81+ |
Andrew Rafla & Ravi Dhaval, Deloitte & Touche LLP | AWS re:Invent 2020
>>From around the globe. It's theCUBE with digital coverage of AWS re:Invent 2020, sponsored by Intel, AWS and our community partners. Hey, welcome back everybody, Jeffrey here with theCUBE, coming to you from Palo Alto studios today for our ongoing coverage of AWS re:Invent 2020. It's a digital event like everything else in 2020. We're excited for our next segment, so let's jump into it. We're joined in our next segment by Andrew Rafla. He is the principal and zero trust offering lead at Deloitte & Touche LLP. Andrew, great to see you. >>Thanks for having me. >>Absolutely. And joining him is Ravi Dhaval. He is the AWS cyber risk lead for Deloitte & Touche LLP. Ravi, good to see you as well. >>Hey, Jeff, good to see you as well. >>Absolutely. So let's jump into it. You guys are all about zero trust, and I know a little bit about zero trust, I've been going to RSA for a number of years, and I think one of the people that you like to quote is analyst Chase Cunningham from Forrester, who's been doing a lot of work around zero trust. But for folks that aren't really familiar with it, Andrew, why don't you give us kind of the 101 about zero trust. What is it? What's it all about? And why is it important? >>Sure thing. So zero trust is, um, it's a conceptual framework that helps organizations deal with kind of the ubiquitous nature of modern enterprise environments. Um, and at its core, zero trust commits to a risk-based approach to enforcing the concept of least privilege across five key pillars, those being users, workloads, data, networks and devices. And the reason we're seeing zero trust really come to the forefront is because modern enterprise environments have shifted dramatically, right? There is no longer a clearly defined perimeter where everything on the outside is inherently considered untrusted and everything on the inside could be considered inherently trusted. There's a couple of what I call macro level drivers that are, you know, changing the need for organizations to think about securing their enterprises in a more modern way. Um, the first macro level driver is really the evolving business models. So as organizations are pushing to the cloud, um, maybe expanding into what were considered high-risk geographies, dealing with M&A transactions, and further relying on third and fourth parties to maintain some of their critical business operations, um, the data and the assets by which the organization, um, transacts are no longer within the walls of the data center. Right? So, again, the perimeter is very much dissolved. The second, you know, macro level driver is really the shifting and evolving workforce. Um, especially given the pandemic and the need for organizations to support almost an entirely remote workforce nowadays, um, organizations are trying to think about how they revamp their traditional VPN technologies in order to provide connectivity to their employees and to other third parties that need to get access to, uh, the enterprise. So how do we do so in a secure, scalable and reliable way? And then the last kind of macro level driver is really the complexity of the IT landscape. So, you know, in legacy environments, organizations only had to support managed devices, and today you're seeing the proliferation of unmanaged devices, whether it be, you know, BYOD devices, um, Internet of Things devices or other smart connected devices.
So organizations now, you know, have the need to provide connectivity to some of these other types of devices, but how do you do so in a way that, you know, limits the risk of the expanding threat surface that you might be exposing your organization to by supporting these connected devices? So those are the three kind of macro level drivers that are really, you know, constituting the need to think about security in a different way. >>Right? Well, I love it, I downloaded... you guys have, ah, zero trust point of view document that I downloaded. And I like the way that you put real specificity around those five pillars, again users, workloads, data, networks and devices. And as you said, you have to take this kind of approach that it's kind of on a need to know basis, the least, you know, kind of the minimum they need to know. But then, to do that across all of those five pillars, how hard is that to put in place? I mean, there's a lot of pieces to this puzzle. Um, and I'm sure you know, we talk all the time about baking security in throughout the entire stack. How hard is it to go into a large enterprise and get them started, or get them down the road on this zero trust journey? >>Yeah. So you mentioned the five pillars, and one thing that we do in our framework is we put data at the center of our framework, and we do that on purpose, because at the end of the day, you know, data is the center of all things. It's important for an organization to understand, you know, what data it has, what the criticality of that data is, how that data should be classified, and the governance around who and what should access it from a users, workloads, uh, networks and devices perspective. Um, I think one misconception is that if an organization wants to go down the path of zero trust, there's a misconception that they have to rip out and replace everything that they have today. Um, it's likely that most organizations are already doing something that fundamentally aligns to the concept of least privilege as it relates to zero trust. So it's important to kind of step back, you know, set a vision and strategy as far as what it is you're trying to protect, why you're trying to protect it, and what capability you have in place today, and take more of an incremental and iterative approach towards adoption, starting with some of your kind of lower risk use cases or lower risk parts of your environment, and then implementing lessons learned along the way along the journey, um, before enforcing, you know, more of those robust controls around your critical assets or your crown jewels, if you will. >>Right. So, Ravi, I want to follow up with you, you know. And you just talked about a lot of the kind of macro trends that are driving this, and clearly COVID and work from anywhere is a big one. But one of the ones that you didn't mention that's coming right down the pike is 5G and IoT. Right, so 5G and IoT, we're going to see, you know, the scale and the volume and the mass of machine generated data, which is really what 5G is all about, grow again exponentially. We've seen enough curves up and to the right on the data growth, but we've barely scratched the surface on what's coming with 5G and IoT. How does that work into your plans? And how should people be thinking about security around this kind of new paradigm? >>Yeah, I think that's a great question, Jeff.
And as you said, you know, IoT continues to accelerate, especially with the recent investments in 5G that are, you know, pushing more and more industries and companies to adopt IoT. Deloitte has been, you know, helping our customers leverage a combination of these technologies, cloud, IoT, ML and AI, to solve their problems in the industry. For instance, uh, we've been helping restaurants automate their operations. Uh, we've helped automate some of the food safety audit processes they have, especially given the COVID situation, that's been helping them a lot. We are currently working with companies to connect smart wearable devices that send the patient's vital information back to the cloud, and once it's in the cloud, it goes through further processing upstream through applications and data lakes, etcetera. The way we've been implementing these solutions is largely leveraging a lot of the native services that AWS provides, like Device Management, that helps you onboard hundreds of devices and group them into different categories. Uh, we leverage Device Defender, that's a monitoring service for making sure that the devices are adhering to a particular security baseline. We also have implemented AWS Greengrass on the edge, where the device actually resides, so that it acts as a central gateway and a secure gateway, so that all the devices are able to connect to this gateway and then ultimately connect to the cloud. One common problem we run into is a lot of the legacy IoT devices, they tend to communicate using insecure protocols and in clear text, so we actually had to leverage an AWS Lambda function on the edge to convert these legacy protocols into, think, the secure MQTT protocol that ultimately, you know, sends data encrypted to the cloud. So the key thing to recognize, and the transformational shift here, is, um, the cloud has the ability today to impact security of the device and the edge from the cloud, using cloud native services, and that continues to grow. And that's one of the key reasons we're seeing accelerated growth and adoption of IoT devices. And you brought up a point about 5G, and that's really interesting, and a recent set of investments that AWS, for example, has been making. They launched their AWS Wavelength Zones that allow you to deploy compute and storage infrastructure at the 5G edge. So millions of devices can connect securely to the compute infrastructure without ever having to leave the 5G network or go over the Internet insecurely to talk to the cloud infrastructure. Uh, that allows us to actually enable our customers to process large volumes of data in near real time, and also it increases the security of the architectures. And I think truly, uh, this 5G combination with IoT and cloud, AI and ML, these are the technologies of the future that are collectively pushing us towards a future where we're going to see more smart cities come into play, driverless connected cars, etcetera. >> That's great. Now I want to unpack that a little bit more, because we are here at AWS re:Invent, and I was just looking up, we had Glenn Goran in 2015 introducing the AWS IoT Cloud. And it was a funny little demo. They had a little greenhouse, and you could turn on the water and open up the windows. But it's a huge suite of services that you guys have at your disposal, leveraging AWS.
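As a rough sketch of the pattern Ravi describes, an edge gateway translating a legacy clear-text reading into an encrypted MQTT publish, here is a minimal example using the generic paho-mqtt client, assuming its 1.x API. The endpoint, certificate paths, and topic are placeholders; a real deployment against AWS IoT Core would use the account-specific endpoint, registered device certificates, and quite possibly the AWS IoT Device SDK or Greengrass components instead.

```python
# Minimal "publish telemetry over secure MQTT" sketch (assumes paho-mqtt 1.x).
# Endpoint, certificate paths, and topic are placeholders, not real resources.
import json
import ssl
import paho.mqtt.client as mqtt

ENDPOINT = "example-ats.iot.us-east-1.amazonaws.com"  # placeholder broker endpoint
TOPIC = "clinic/wearables/patient-007/vitals"          # hypothetical topic

client = mqtt.Client(client_id="edge-gateway-01")
# Mutual TLS with X.509 device credentials (file paths are placeholders).
client.tls_set(ca_certs="AmazonRootCA1.pem",
               certfile="device.pem.crt",
               keyfile="device-private.pem.key",
               tls_version=ssl.PROTOCOL_TLSv1_2)

client.connect(ENDPOINT, port=8883)  # 8883 = MQTT over TLS
client.loop_start()

# The gateway translates a legacy, clear-text reading into an encrypted
# MQTT message before it ever leaves the site.
reading = {"heart_rate": 72, "spo2": 98, "ts": "2020-12-01T14:03:00Z"}
client.publish(TOPIC, json.dumps(reading), qos=1)

client.loop_stop()
client.disconnect()
```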
I wonder, I guess, Andrew, if you could speak a little bit more to the suite of tools that you can now bring to bear when you're helping your customers go down the zero trust journey. >>Yeah, sure thing. So, um, obviously there's a significant partnership in place, and, uh, we work together, uh, pretty tremendously in the market. One of the services, or one of the solution offerings, that we've built out, which we dub Deloitte Fortress, um, is a concept that plays very nicely into our zero trust framework, more along the kind of horizontal components of our framework, which is really the fabric that ties it all together. Um, so the two horizontals in our framework are around telemetry and analytics, as well as automation and orchestration. If I peel back the automation and orchestration capability just a little bit, um, we built this Deloitte Fortress capability in order for organizations to kind of streamline, um, some of the vulnerability management aspects of the enterprise. And so we're able, through integration with AWS Lambda and other functions, um, to quickly identify cloud configuration issues and drift, so that, um, organizations can not only, uh, quickly identify some of those issues that open up risk to the enterprise, but also, in real time, um, take some action to close down those vulnerabilities and ultimately remediate them. Right? So it's a way for them to have, um, more of a kind of proactive approach to security rather than a reactive approach. Everyone knows that cloud configuration issues are likely the number one kind of threat vector for attackers. And so we're able to not only help organizations identify those, but then close them down in real time.
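To make the "identify drift, then close it down in real time" idea more concrete, here is a heavily simplified sketch of a remediation Lambda handler for one common finding, a publicly accessible S3 bucket. The event shape is reduced to the bare minimum and the bucket name is hypothetical; this illustrates the general pattern, not the Deloitte Fortress implementation, which, as Ravi describes next, is driven by AWS Config, CloudWatch Events and EventBridge and consults a policy engine before acting.

```python
# Heavily simplified auto-remediation Lambda: close down a publicly accessible
# S3 bucket when a (simplified) non-compliance event arrives. The event shape
# is reduced for illustration and the bucket name is hypothetical.
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Assume an upstream rule (e.g. AWS Config findings routed via EventBridge)
    # put the offending resource name here; real events are considerably richer.
    bucket = event["detail"]["resourceId"]

    # Corrective action: block all public access on the bucket.
    s3.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    return {"remediated": bucket}
```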
So, ah, a good way of doing this is leveraging automation and orchestration, which is just a capability that enhances your operational efficiency by streamlining some manual and repetitive tasks. There's numerous examples of what automation and orchestration could do, but from a security context, some of the key examples are automated security operations, automated identity provisioning, automated incident response, etcetera. One particular use case that Deloitte identified and built a solution around is the identification and also the automated remediation of cloud security misconfigurations. This is a common occurrence and use case we see across all our customers. So in the context of AWS, the way we did this is we built an event-driven architecture that's leveraging the AWS Config service, which monitors the baselines of these different services. As and when it detects a drift from the baseline, it fires off an alert. That's picked up by the CloudWatch Events service, which is ultimately feeding it upstream into our workflow that leverages the EventBridge service. From there, the workflow goes into our policy engine, which is a database that has a collection of hundreds of rules that we put together, uh, for compliance activities. It also maps back to, ah, a large set of controls frameworks, so that this is applicable to any industry and customer. And then, based on the violation that has occurred, or based on the misconfiguration and the service, the appropriate Lambda function is deployed, and that Lambda is actually, uh, performing the corrective actions or the remediation actions. While, you know, it might seem like a lot, all this is happening in near real time because it is leveraging native services. And some of the key benefits that our customers see is truly the ease of implementation, because it's all native services on AWS, and then it can scale and, uh, cover any additional AWS accounts as the organization continues to scale. One key benefit is we also provide a dashboard that provides visibility into some of the top violations that are occurring in your ecosystem, how many times a particular Lambda function was set off to go correct that situation. Ultimately, that kind of view is informing the upfront processes of developing secure infrastructure as code, and then also, you know, correcting the security guardrails that might have drifted over time. So that's how we've been helping our customers with this particular solution that we developed. It's called the Deloitte Fortress, and it provides coverage across all the major cloud service providers. >> Yeah, that's a great summary. And I'm sure you have huge demand for that, because these misconfiguration things, we hear about them all the time. And I want to give you the last word before we sign off. You know, it's easy to sit on the side of the desk and say, yeah, we've got to bake security into everything, and you've got to be thinking about security from the time you're in development all the way through, obviously, deployment and production. I wonder if you could share, you know, you're on that side of the glass and you're out there doing this every day, just a couple of, you know, kind of high level thoughts about how people need to make sure they're thinking about security, not only in 2020 but really looking down the road. >> Yeah, yeah, sure thing. So, you know, first and foremost, it's important to align
uh, any transformation initiative, including zero trust, to business objectives. Right? Don't let this come off as another IT security project, right? Make sure that, um, you're aligning to business priorities, whether it be, you know, pushing to the cloud, uh, for scalability and efficiency, whether it's a digital transformation initiative, whether it be a new consumer identity, uh, and authorization, um, capability you're trying to build. Make sure that you're aligning to those business objectives and baking in and aligning to those guiding principles of zero trust from the start. Right? Because that will ultimately help drive consensus across the various stakeholder groups within the organization, uh, and build trust, if you will, in the zero trust journey. Um, one other thing I would say is focus on the fundamentals. Very often, organizations struggle with some of, you know, what we call general cyber hygiene capabilities, that being, you know, IT asset management and data classification, data governance. Um, to really fully appreciate the benefits of zero trust, it's important to kind of get some of those table stakes right, right? So you have to understand, you know, what assets you have, what the criticality of those assets are, what business processes are driven by those assets, um, what your data criticality is, how it should be classified and tagged throughout the ecosystem, so that you could really enforce, you know, tag-based policy, uh, decisions within the control stack. Right. And then finally, in order to really push the needle on automation and orchestration, make sure that you're using technologies that integrate with each other, right? So take an API-driven approach, so that you have the ability to integrate some of these heterogeneous, um, security controls and drive some level of automation and orchestration in order to enhance your efficiency along the journey. Right. So those were just some kind of lessons learned, some of the things that we would, uh, you know, tell our clients to keep in mind as they go down the adoption journey. >>That's a great summary. So we're gonna have to leave it there. But Andrew, Ravi, thank you very much for sharing your insight, and again, you know, supporting this move to zero trust, because that's really the way it's got to be as we continue to go forward. So thanks again, and enjoy the rest of your re:Invent. >>Yeah, absolutely. Thanks for your time. >>All right. He's Andrew. He's Ravi. I'm Jeff. You're watching theCUBE from AWS re:Invent 2020. Thanks for watching. See you next time.
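Andrew's point about classifying and tagging data so you can enforce tag-based policy decisions maps to attribute-based access control. The snippet below builds one such IAM-style policy document as a Python dict; the tag keys, tag values, and bucket ARN are invented, and this is a generic illustration rather than a Deloitte or AWS recommended baseline.

```python
# Sketch of a tag-driven (ABAC-style) access policy: a principal may read an
# object only when the object's classification tag matches the clearance tag
# on the principal. Tag keys, values, and the ARN are invented.
import json

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadOnlyWhenClassificationMatchesClearance",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-claims-data/*",
            "Condition": {
                "StringEquals": {
                    "s3:ExistingObjectTag/classification":
                        "${aws:PrincipalTag/clearance}"
                }
            },
        }
    ],
}

print(json.dumps(policy, indent=2))
```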
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jeff | PERSON | 0.99+ |
Jeffrey | PERSON | 0.99+ |
Andrew | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Robbie Deval | PERSON | 0.99+ |
Andrew Rafa | PERSON | 0.99+ |
Robbie | PERSON | 0.99+ |
2020 | DATE | 0.99+ |
Andrew Rafla | PERSON | 0.99+ |
Andrew Robbie | PERSON | 0.99+ |
Deloitte | ORGANIZATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
Ravi | PERSON | 0.99+ |
five key pillars | QUANTITY | 0.99+ |
3rd | QUANTITY | 0.99+ |
second | QUANTITY | 0.99+ |
chase Cunningham | PERSON | 0.98+ |
five pillars | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
Ravi Dhaval | PERSON | 0.98+ |
Lloyd Fortress | ORGANIZATION | 0.98+ |
one | QUANTITY | 0.98+ |
one thing | QUANTITY | 0.98+ |
eight | QUANTITY | 0.98+ |
Intel | ORGANIZATION | 0.98+ |
Emmanuel | PERSON | 0.98+ |
One key benefit | QUANTITY | 0.97+ |
two | QUANTITY | 0.97+ |
zero trust | QUANTITY | 0.97+ |
three | QUANTITY | 0.97+ |
One | QUANTITY | 0.97+ |
2015 | DATE | 0.97+ |
aws | ORGANIZATION | 0.96+ |
Iot | TITLE | 0.96+ |
one misconception | QUANTITY | 0.96+ |
4th parties | QUANTITY | 0.96+ |
pandemic | EVENT | 0.95+ |
Light and Touche LLP | ORGANIZATION | 0.95+ |
Glenn Goran | PERSON | 0.95+ |
Deloitte & Touche LLP | ORGANIZATION | 0.95+ |
hundreds of devices | QUANTITY | 0.94+ |
hundreds of accounts | QUANTITY | 0.94+ |
table six | QUANTITY | 0.94+ |
millions of devices | QUANTITY | 0.94+ |
Deloitte and Touche LLP | ORGANIZATION | 0.91+ |
Cube | COMMERCIAL_ITEM | 0.91+ |
Cloudwatch | TITLE | 0.9+ |
Lambda | TITLE | 0.9+ |
hundreds of rules | QUANTITY | 0.9+ |
101 | QUANTITY | 0.9+ |
china | LOCATION | 0.89+ |
Delight Fortress | TITLE | 0.88+ |
first | QUANTITY | 0.86+ |
double | QUANTITY | 0.85+ |
zero | QUANTITY | 0.83+ |
One particular use case | QUANTITY | 0.78+ |
Seymour | ORGANIZATION | 0.77+ |
Eso | ORGANIZATION | 0.77+ |
five G | TITLE | 0.77+ |
Data Cloud Catalysts - Women in Tech | Snowflake Data Cloud Summit
>> Hi and welcome to Data Cloud catalyst Women in Tech Round Table Panel discussion. I am so excited to have three fantastic female executives with me today, who have been driving transformations through data throughout their entire career. With me today is Lisa Davis, SVP and CIO OF Blue shield of California. We also have Nishita Henry who is the Chief Innovation Officer at Deloitte and Teresa Briggs who is on a variety of board of directors including our very own Snowflake. Welcome ladies. >> Thank you. >> So I am just going to dive right in, you all have really amazing careers and resumes behind you, am really curious throughout your career, how have you seen the use of data evolve throughout your career and Lisa am going to start with you. >> Thank you, having been in technology my entire career, technology and data has really evolved from being the province of a few in an organization to frankly being critical to everyone's business outcomes. Now every business leader really needs to embrace data analytics and technology. We've been talking about digital transformation, probably the last five, seven years, we've all talked about, disrupt or be disrupted, At the core of that digital transformation is the use of data. Data and analytics that we derive insights from and actually improve our decision making by driving a differentiated experience and capability into market. So data has involved as being I would say almost tactical, in some sense over my technology career to really being a strategic asset of what we leverage personally in our own careers, but also what we must leverage as companies to drive a differentiated capability to experience and remain relative in the market today. >> Nishita curious your take on, how you have seen data evolve? >> Yeah, I agree with Lisa, it has definitely become a the lifeblood of every business, right? It used to be that there were a few companies in the business of technology, every business is now a technology business. Every business is a data business, it is the way that they go to market, shape the market and serve their clients. Whether you're in construction, whether you're in retail, whether you're in healthcare doesn't matter, right? Data is necessary for every business to survive and thrive. And I remember at the beginning of my career, data was always important, but it was about storing data, it was about giving people individual reports, it was about supplying that data to one person or one business unit in silos. And it then evolved right over the course of time into integrating data into saying, alright, how does one piece of data correlate to the other and how can I get insights out of that data? Now, its gone to the point of how do I use that data to predict the future? How do I use that data to automate the future? How do I use that data not just for humans to make decisions, but for other machines to make decisions, right? Which is a big leap and a big change in how we use data, how we analyze data and how we use it for insights and involving our businesses. >> Yeah its really changed so tremendously just in the past five years, its amazing. So Teresa we've talked a lot about the Data Cloud, where do you think we are heading with that and also how can future leaders really guide their careers in data especially in those jobs where we don't traditionally think of them in the data science space? Teresa your thoughts on that. 
>> Yeah, well since I'm on the Snowflake board, I'll talk a little bit about the Snowflake Data Cloud. We're getting your company's data out of the silos that exist all over your organization, we're bringing third party data in to combine with your own data, and we're wrapping a governance structure around it and feeding it out to your employees so they can get their jobs done, as simple as that. I think we've all seen the pandemic accelerate the digitization of our work. And if you ever doubted that the future of work is here, it is here, and companies are scrambling to catch up by providing the right amount of data, collaboration tools, workflow tools for their workers to get their jobs done. Now, it used to be, as prior people have mentioned, that in order to work with data you had to be a data scientist, but I was an auditor back in the day and we used to work on 16 column spreadsheets. And now if you're an accounting major coming out of college joining an auditing firm, you have to be tech and data savvy because you're going to be extracting, manipulating, analyzing and auditing data, the massive amounts of data that sit in your clients' IT systems. I'm on the board of Warby Parker, and you might think that their most valuable asset is their amazing frame collection, but it's actually their data, their 360 degree view of the customer. And so if you're a merchant, or you're in strategy, or marketing or talent or the Co-CEO, you're using data every day in your work. And so I think it's going to become a ubiquitous skill, that anyone who's a knowledge worker has to be able to work with data. >> Yeah, I think it's just going to be organic to every role going forward in the industry. So, Lisa, curious about your thoughts about Data Cloud, the future of it and how people can really leverage it in their jobs as future leaders. >> Yeah, absolutely. Most enterprises today are, I would say, hybrid multicloud enterprises. What does that mean? That means that we have data sitting on-prem, we have data sitting in public clouds through software as a service applications. We have data everywhere. Most enterprises have data everywhere, certainly those that have owned infrastructure or weren't born on the web. One of the areas that I love that Data Cloud is addressing is the area around data portability and mobility. Because I have data sitting in various locations through my enterprise, how do I aggregate that data to really drive meaningful insights out of that data to drive better business outcomes? And at Blue Shield of California, one of our key initiatives is what we call an Experience Cube. What does that mean? That means how do I drive transparency of data between providers, members and payers? So that not only do I reduce overhead on providers and provide them a better experience, our hospital systems, our doctors, but ultimately, how do we have the member have the value of their data, holistically, at their fingertips, so that we're making better decisions about their health care. One of the things Teresa was talking about was the use of this data, and how it drives data democratization. We've got to put the power of data into the hands of everyone, not just data scientists. Yes, we need those data scientists to help us build AI models to really drive and tackle these tougher challenges and business problems that we may have in our environments.
But everybody in the company, both on the IT side, both on the business side, really needs to understand how we become a data insights driven enterprise, put the power of the data into everyone's hands so that we can accelerate capabilities, right? And leverage that data to ultimately drive better business results. So as a leader, as a technology leader, part of our responsibility, our leadership, is to help our companies do that. And that's really one of the exciting things that I'm doing in my role now at Blue Shield of California. >> Yeah, it's a really, really exciting time. I want to shift gears a little bit and focus on women in tech. So I think in the past five to ten years there has been a lot of headway in this space, but the truth is women are still underrepresented in the tech space. So what can we do to attract more women into technology, quite honestly? So Nishita, curious what your thoughts are on that? >> Great question, and I am so passionate about this for a lot of reasons, not the least of which is I have two daughters of my own, and I know how important it is for women and young girls to actually start early in their love for technology and data and all things digital, right? So I think it's one, very important to start early, start early education, building confidence of young girls that they can do this, showing them role models. We at Deloitte just partnered with Ella the Engineer to actually make comic books centered around young girls and boys in the early elementary age to talk about how heroes in tech solve everyday problems. And so really helping to get people's minds around tech is not just in the back office coding on a computer, tech is about solving problems together that help us as citizens, as customers, right? And as humanity, so I think that's important. I also think we have to expand that definition of tech, as we just said. It's not just about, right, database design, it's not just about Java and Python coding, it's about design, it's about the human machine interfaces, it's about how do you use it to solve real problems, and getting people to think in that kind of mindset makes it more attractive and exciting. And lastly, I'd say look, we have an absolute imperative to get a diverse population of people, not just women, but minorities, those with other types of backgrounds, disabilities, et cetera, involved, because this data is being used to drive decision making, and if we are not all involved, right, in how that data makes decisions, it can lead to unnatural biases that no one intended but can happen just 'cause we haven't involved a diverse enough group of people around it. >> Absolutely. Lisa, curious about your thoughts on this. >> I agree with everything Nishita said. I've been passionate about this area. I think it starts with, first, we need more role models, we need more role models as women in these leadership roles throughout various sectors. And it really is, it starts with us and helping to pull other women forward. So I think certainly it's part of my responsibility, I think all of us as female executives, that if you have a seat at the table, to leverage that seat at the table to drive change, to bring more women forward, more diversity forward, into the boardroom and into our executive suites. I also want to touch on a point Nishita made about women, we're the largest consumer group in the company, yet we're consumers but we're not builders.
This is why it's so important that we start changing that perception of what tech is, and I agree that it starts with our young girls. We know the data shows that we lose our young girls by middle school, very heavy peer pressure, it's not so cool to be smart, or do robotics, or be good at math and science, we start losing our girls in middle school. So they're not prepared when they go to high school, and they're not taking those classes in order to major in these STEM fields in college. So we have to start the pipeline early with our girls. And then I also think it's a measure of what your boards are doing, what is the executive leadership and what are your goals around diversity and inclusion? How do we invite a more diverse population to the decision making table? So it's really a combination of efforts. One of the things that certainly is concerning to me is during this pandemic, I think we're losing one in four women in the workforce now because of all the demands that our families are having to navigate through this pandemic. The last statistic I saw in the last four months is we've lost 850,000 women in the workforce. This pipeline is critical to making that change in these leadership positions. >> Yeah, it's really a critical time. And now we are coming to the end of this conversation, I want to ask you, Teresa, what would be a call to action to everyone listening, both men and women, since it needs to be solved by everyone, to address the gender gap in the industry? >> I'd encourage each of you to become an active sponsor. Research shows that women and minorities are less likely to be sponsored than white men. Sponsorship is a much more active form than mentorship. Sponsorship involves helping someone identify career opportunities and actively advocating for them in those roles, opening your network, giving very candid feedback. And we need men to participate too, there are not enough women in tech to pull forward and sponsor the high potential women that are in our pipelines. And so we need you to be part of the solution. >> Nishita, real quickly, what would be your call to action to everyone? >> I'd say look around your teams, see who's on them and make deliberate decisions about diversifying those teams. As positions open up, make sure that you have a diverse set of candidates, make sure that there are women that are part of that team, and make sure that you are actually hiring and putting people into positions based on potential, not just experience. >> And real quickly Lisa, we'll close it out with you, what would your call to action be? >> Wow, it's hard to top what Nishita and what Teresa shared, I think those were very powerful actions. I think it starts with us. Taking action at our own table, making sure you're driving diverse panels in hiring, setting goals for the company, having your board engaged and holding us accountable, and driving to those goals will help us all see a better outcome with more women at the executive table and diverse populations. >> Great advice and great action for all of us to take. Thank you all so much for spending time with me today and talking about this really important issue, I really appreciate it. Stay with us.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Tricia | PERSON | 0.99+ |
Lisa | PERSON | 0.99+ |
Nishita | PERSON | 0.99+ |
Deloitte | ORGANIZATION | 0.99+ |
Lisa Davis | PERSON | 0.99+ |
Teresa | PERSON | 0.99+ |
Teresa Briggs | PERSON | 0.99+ |
Nishita Henry | PERSON | 0.99+ |
360 degree | QUANTITY | 0.99+ |
one person | QUANTITY | 0.99+ |
Java | TITLE | 0.99+ |
two daughters | QUANTITY | 0.99+ |
Snowflake Board | ORGANIZATION | 0.99+ |
today | DATE | 0.99+ |
One | QUANTITY | 0.99+ |
Python | TITLE | 0.99+ |
one | QUANTITY | 0.99+ |
Blue shield | ORGANIZATION | 0.99+ |
one piece | QUANTITY | 0.99+ |
both | QUANTITY | 0.98+ |
850,000 women | QUANTITY | 0.98+ |
Blue Shield | ORGANIZATION | 0.98+ |
California | LOCATION | 0.98+ |
Snowflake Data Cloud Summit | EVENT | 0.98+ |
Warby Parker | ORGANIZATION | 0.97+ |
pandemic | EVENT | 0.97+ |
each | QUANTITY | 0.96+ |
one business unit | QUANTITY | 0.95+ |
first | QUANTITY | 0.93+ |
four women | QUANTITY | 0.93+ |
ten years | QUANTITY | 0.91+ |
seven years | QUANTITY | 0.91+ |
LV Engineer | ORGANIZATION | 0.89+ |
last four months | DATE | 0.88+ |
past five years | DATE | 0.83+ |
Women in Tech Round Table Panel | EVENT | 0.81+ |
16 column spreadsheets | QUANTITY | 0.8+ |
Data Cloud | EVENT | 0.78+ |
Data Cloud | ORGANIZATION | 0.77+ |
three fantastic female executives | QUANTITY | 0.77+ |
Experienced Cube | ORGANIZATION | 0.74+ |
SVP | PERSON | 0.67+ |
five | QUANTITY | 0.64+ |
past | DATE | 0.61+ |
Snowflake Data Cloud | ORGANIZATION | 0.57+ |
Data | TITLE | 0.53+ |
lisa | PERSON | 0.51+ |
last five | DATE | 0.51+ |
Snowflake | ORGANIZATION | 0.5+ |
Cloud | ORGANIZATION | 0.49+ |
Colin Blair & David Smith, Tech Data | HPE Discover 2020
>> From around the globe, it's theCUBE, covering HPE Discover Virtual Experience, brought to you by HPE. >> Welcome to theCUBE's coverage of HPE Discover 2020 Virtual Experience. I'm Lisa Martin, and I'm pleased to be joined by two guests from HPE's longtime partner Tech Data. We have Colin Blair, the Vice President of Sales and Marketing of IoT and Data Solutions, and David Smith, HPE Presales Field Solutions Architect. Colin and David, welcome to theCUBE. Thanks, Lisa. Great to see you. So let's start with you, Colin. HPE and Tech Data have been partners for over 40 years, but tell our audience a little bit about Tech Data before we get into the specifics of what you're doing and some of the cool IoT stuff with HPE. >> I think that Tech Data is a Fortune 100 distributor. We've continued to evolve to be a solutions aggregator in these next generation technology businesses. As you've mentioned, we've been serving the IT distribution markets globally for 40 plus years, and we're now moving into next generation technologies like cloud, analytics, IoT and security, full lifecycle management services, to be able to position ourselves with our customer base and the needs that their clients have. So I'm excited to be here today to talk a little bit about what we're doing in IoT and analytics with David on the HPE side. >> And in addition to the 40 plus years of partnership, Colin, that you mentioned, that Tech Data and HPE have, you've got over 200 plus HPE resources. David, you're one of those guys in the field. Talk to us about some of the things that you're working on with channel partners, David, to enable them, especially during such crazy times as we're living in now. >> Absolutely, absolutely. So what we can do is we can provide strong sales and technical enablement. If your team, for example, wants to better understand how to position the HPE portfolio, if they require assistance in architecting a secure, performant IoT solution, we can help ensure that your technical team is fully capable of having that conversation, and it's one that they're able to have with confidence. We can validate the proposed HPE solutions with the customer's technical requirements and proposed use case. We can even assist on customer calls, if it would benefit our partner, to kind of extend out to that. We also have a deep technical bench that Colin can speak to in the OT space to lean on as well, for solutions that kind of span into the space beyond where HPE typically operates, which would be edge computing and network security. >> Excellent. Colin, tell me a little bit about Tech Data's investments in IoT. When did this start? What are you guys doing today? >> Sure, we started in the cloud space. We first tackled this opportunity in data center modernization and hybrid cloud. That was about seven years ago. Shortly thereafter we started investing very materially in the cyber security space, and then we followed that with data analytics and then the Internet of Things. Now we've been in those spaces with our long term partners for some time, but now that we're seeing this movement to the intelligent edge and a real focus on business outcomes and specialization, we've kind of tracked with the market, and we feel like we've invested a little bit ahead of where the channel is in terms of supporting our ecosystem of partners in this space.
>> So the intelligent edge has been growing for quite some time. Colin, in the very unique times that we're living in in 2020, how are you seeing that intelligent edge expand even more? And what are some of the pressing opportunities that Tech Data and HPE IoT solutions together can address? >> So a couple. So the first is, as I mentioned earlier, just data center modernization. And so, in the middle of COVID-19 and perhaps post COVID-19, we're going to see a lot of clients that are really focused on monetizing the things that they've got, but doing so to drive business outcomes. We believe that increasingly, the predominance of use cases in compute and analytics is going to move to the edge. And HPE has got a great portfolio for not just on premise high performance computing but also hybrid cloud computing. And then when we get into the edge with Edgeline and networking with Aruba and devices that need to be digitized and sensorized, it's a really great partnership. And then what we're able to do also, Lisa, is we've been investing in vertical markets since 2007, and I've been along for the ride with that team most all of that way. So we've got deep specialization in healthcare and industrial manufacturing, retail and then public sector. And then the last thing we've kind of turned on here recently, just last month, is a strategic partnership in the smarter cities space. So we're able to leverage a lot of those vertical market capabilities, couple that with our HPE organization and really drive specialized, repeatable solutions in these vertical markets, where we believe increasingly, customers are going to be more interested in repeatable solutions that can drive quick proof of value, proof of concepts with minimum viable kinds of products. And that's, that's kind of the partnership today with our HPE organization and the HPE Corporation. >> David, let's double click into some of those vertical markets that Colin mentioned. Some of the things that pop into mind are healthcare, manufacturing. As we know, supply chains have been very challenged during COVID. Give us an insight into what you're hearing from channel partners now virtually, but what are some of the things that are of pressing importance? >> So from a pressing and important standpoint, to Colin's exact point, and your exact point as well, it's really all about the edge computing space. Now from a product perspective, as Colin had mentioned earlier, HPE has their Edgeline converged systems, which is kind of taking the functionality of OT and IT and combining it into a single edge processing compute solution. You kind of couple that with the ability to configure components such as Tesla GPUs in specific Edgeline offerings to aid in things like real-time video processing and analytics. And a perfect example of this is social distancing in the COVID space. If I need to be able to analyze a group of people to ensure they're staying as far apart as possible or, you know, within social distancing guidelines, that is where kind of the real-time analytics aspect of things can be taken advantage of. Same thing with leveraging cameras, where you could actually do temperature detection as well. So it's really kind of best to think of Edgeline solutions as data center computing at the edge. Kind of transitioning into the Aruba space, Aruba's offerings aid in the IoT security space, such as ClearPass Device Insight, which allows for device discovery on the network and monitoring of wired and wireless devices.
There's also Aruba asset tracking and real time location solutions, and that's particularly important in the healthcare space as well. If I have a lot of high value assets, things like wheelchairs, things like ventilation devices, where are these things located within my facilities and how can I keep track of them? They also, and by that I mean HPE, they also kind of leverage an expansive ecosystem of partners. As an example, they leverage ThingWorx for their IoT solutions as well. When you kind of tie it all together with HPE Pointnext, the end customer is provided with a comprehensive IoT solution. >> So, Colin, how ready are channel partners and the end user customers to rapidly pivot and start deploying more technologies at the edge to be able to deliver some of the capabilities that David talked about in terms of analytics and sensors for social distancing? How ready are the channel partners and customers to be able to understand, adopt and execute this technology? >> So I think on the understanding side, I think the partners are there. We've been talking about digital transformation in the channel for a couple of years now, and I think what's happened through the COVID-19 pandemic is that it's been a real spotlight on the need for those business outcomes, to solve for very specific problems. And that's one of the values that we serve in the channel. So we've got a solution offering that we call our solution factory. And what we do really, Lisa, is we leverage a process to look outside the industry, at Gartner Magic Quadrant solutions, Forrester Wave, G2 Crowd, you know, top leaders, visionaries, and understand what are those solutions that are in demand in these vertical markets that we talked about. And then we do a lot of work with David and his team internally in the HPE organization to be able to do that, and then build out the reference architectures so that we know that there's a solution that drives a bill of materials and a reference architecture that's going to work, that clients are going to need, and then we can do it quickly. You know, at Tech Data, everything's about being bold, acting now, getting scale. And we've got a large ecosystem of partners that already have great relationships. So we pride ourselves on being able to identify what are those solutions that we can take to our partners that they can quickly take to their end users, where, you know, we've kind of developed out what we think the 70 or 80% of that solution is going to look like. And then we drive Pointnext and other services capabilities to be able to complete that last mile, if you will, of some of the customization. So we're helping them. For those who aren't ready, we're helping them. For those who already have very specific use cases and a practice that they drive with repeatable solutions, we're coming alongside them and understanding, what can we do? Using a practice builder approach, which is our consultative approach, to understand where our partners are going in the market, who their clients are, what skill sets do they have, what supplier affinities do they want to drive, what brand marketing or demand generation support do they need? And that's where we can take some of these solutions, bring them to bear and engage in that consultative engagement to accelerate being ready, as you rightly say.
>> So Tech Data has a lot of partners in general. You also have a lot of partners in the IoT space, Colin. How do you, from a marketing hat perspective, how do you describe the differentiation that Tech Data and HPE's IoT solutions deliver to the channel, to the end user? >> A couple of different things. I think that's, that's differentiation, and that's one of the things that we strive for in the channel, is to be specialized and to be competitively differentiated. And so the first part, I say to all of my team, Lisa, is, you know, whether it's our solution consultants or our technical consultants, our solution developers or the software development team that works in my organization, our goal is to be specialized in such a way that we're having relevant, value added conversations, not only with our channel partners, but also so that end users of our partners want to bring us into those conversations, and many do. The next is really education and enablement, as you would expect. And so there's a lot of things that are specialized in our technical enablement. We drive education certification programs, roadshows, seminars. One of the things that we're seeing a lot of interest in now, Lisa, is digital marketing, and we're driving some really neat offerings around digital marketing platforms that not only educate our partners but also allow our partners to bring their end users in and tour some of these technologies. So whether it's at our Clearwater office, where we've got an IoT solution center that we take our partners and their clients through, or we're using our facilities to do executive briefings and ideation as a service, that, you know, kind of understanding the art of the possible, with both our resellers and their clients. We're using our solution catalogs that we've built, an interactive PDF that allows our partners to understand over 50 solutions that we've got and then be able to identify where would they like to bring in David and his team and then my consultants to do that deep planning on business development that we talked about a little bit earlier. >> So the engagement right now is maybe even more important than it has been in a while because it's all hands off and virtual. David, talk to me about some of the engagement and the enablement piece that Colin talked about. How are you able to really keep a channel partner and their end user customers engaged and interested in what you're able to deliver through this new virtual world? >> That's a great, great question. And we work in conjunction with our marketing teams to make sure that as new technologies come out in the IoT space as well as within the HPE space, our channel partners are educated and aware that these solutions exist. I know for a fact that for the majority of them you kind of get this consistent bombardment of new technology, but being able to actually have someone go out and explain it, and then being able to correspondingly position its use case and its functionality and why it would provide value for your end customer, is one of the benefits that Tech Data adds. To kind of build upon that previous statement, the fact that we have such a huge portfolio of partners, so you kind of have HPE in the edge compute space, but we have so many different partners in the OT space, where it's really just a phone call, an email, a Skype message away to have that conversation around interoperability and then provide those responses back to our partners. >> Excellent. One more question before we go, Colin, for you. A lot of partners. Why HPE for IoT?
>> So a couple of reasons. One of the biggest reasons is HPE is just a great partner. And so when you look at evaluating IoT solutions that tend to be pretty comprehensive, in many cases, Lisa, it takes 10 or 12 partners to complete a real IoT solution and address that use case that's in the field. And so when you have a partner like HPE who's investing in these programs, investing in demand generation, investing in the spectrum of technology, whether it's hybrid cloud, data center compute, storage, or your edge devices and IoT gateways, then to be able to contextualize those into what we call market ready solutions in each one of these vertical markets where there's references and there's use cases, and then we're coupling education to those specific sets of solutions. You know, HPE can do all of those things, and that's very important, because in this new world, no one can go it alone anymore. It takes, it takes partnerships, and we're all better together. And HPE really does embrace that philosophy, and they've been a great partner for us in the IoT space. >> Excellent. Well, Colin and David, thank you so much for joining me today on theCUBE. Tech Data, HPE, IoT, better together. Thank you so much. It's been a pleasure talking with you. >> Thank you. >> Thank you, Lisa. >> And for Colin and David, I am Lisa Martin. You're watching theCUBE's virtual coverage of HPE Discover 2020. Thanks for watching.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
David | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
10 | QUANTITY | 0.99+ |
Colin | PERSON | 0.99+ |
Blair | PERSON | 0.99+ |
David Smith | PERSON | 0.99+ |
Lisa | PERSON | 0.99+ |
HP | ORGANIZATION | 0.99+ |
70 | QUANTITY | 0.99+ |
2020 | DATE | 0.99+ |
Azaz Colin | PERSON | 0.99+ |
12 partners | QUANTITY | 0.99+ |
Colin Blair | PERSON | 0.99+ |
Gartner | ORGANIZATION | 0.99+ |
2000 | DATE | 0.99+ |
40 plus years | QUANTITY | 0.99+ |
Aruba | LOCATION | 0.99+ |
two guests | QUANTITY | 0.99+ |
First | QUANTITY | 0.99+ |
80% | QUANTITY | 0.99+ |
Magic Quadrant Solutions | ORGANIZATION | 0.99+ |
One | QUANTITY | 0.99+ |
Collet | PERSON | 0.99+ |
code 19 | OTHER | 0.99+ |
first part | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
over 40 years | QUANTITY | 0.99+ |
Skype | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
One more question | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
Aruba | ORGANIZATION | 0.98+ |
Poland | LOCATION | 0.97+ |
over 50 solutions | QUANTITY | 0.97+ |
Rubio | PERSON | 0.97+ |
last month | DATE | 0.97+ |
Wild Analytics | ORGANIZATION | 0.97+ |
HP Corporation | ORGANIZATION | 0.96+ |
postcode 19 | OTHER | 0.96+ |
I O. T | ORGANIZATION | 0.95+ |
I. O. T. And Data Solutions | ORGANIZATION | 0.94+ |
Collins | PERSON | 0.94+ |
single | QUANTITY | 0.93+ |
Cube Tech | ORGANIZATION | 0.91+ |
about seven years ago | DATE | 0.91+ |
RHB Organization | ORGANIZATION | 0.9+ |
Darren Murph, GitLab | CUBE Conversation, April 2020
>> Narrator: From theCUBE Studios and Palo Alto in Boston. Connecting with thought leaders all around the world. This is a CUBE conversation. >> Hey, welcome back everybody. Jeff Frick here with theCUBE. We're at our Palo Alto studio as kind of our on going leadership coverage of what's happening with the COVID crisis, and really looking out into our community to find experts who can provide tips and tricks, and some guidance as everyone is kind of charting these uncharted waters if you will. And we've got a great cube alarm in our database. He's a fantastic resourcer. We're excited to get him on. Share the information with you. We'd like to welcome once again, Darren Murph. He is the Head of Remote for GitLab. Darren, great to see you. >> Absolutely, great to be here. Thanks for having me. >> Absolutely, so thank you and. First off, we had you on earlier this year, back when things were normal, in kind of a regular review. Who knew that you would be at the center of the work-from-home universe just a few short months later. I mean, you've been doing this for ever. So it's kind of a wile old veteran of the work-from-home, or not even from home, just work from some place else. What are some top level things that you can share for people that have never experienced this before? >> Yeah, on the working front. If you're one of the people that are working from home, I think there's a couple of things you can do to help acclimate, make your world a little bit better. The first is to try to create some sort of separation between your work life and your personal life. Now if you have a home big enough that you can dedicate a workspace to being your office, that's going to help a lot. Help from a focus standpoint and just. You don't want those lines between work and life to blur too much, that's where isolation kicks in. That's where burnout kicks in. You want to do whatever you can to avoid. You got to remember, when you're not physically walking out of a office and disconnecting from work. You have to replicate that and recreate that. I actually recommend for people that used to have a commute and now they don't. I would actually black something in your calendar, whether that's cooking, cleaning, spending time with your family. Resting more, anything so that you ramp into your day very deliberately and ramp out of very deliberately. Now on the team leading front. I'm going to say it may feel a little counter intuitive, but the further your team is from you, the more distributed they are. The more you really need to let go and allow them to have mechanism for feeding back to you. Managers job in a remote setting switches from just being a pure director, you're actually being an unblocker. A really active listener. And for people who have gotten to a certain point in their career through command and control, this is going to feel very strange, jarring and counter intuitive, but we've seen it time and time again. You need to trust that your workers are in a new environment. You have to give them a mechanism feeding back to you to help them unblock whatever it is. >> You know that's funny, we had someone on as part of this the other day, talking about leaders need to change their objectives that they're managing to, from kind of activity based, to deliverals based. And it actually floored me that someone is still writing in a blog in 2020, that people have to change their management deliverables from activity to deliverables. And it was so funny, you know, you had Martin Mikos on, we had him on too. 
My favorite comment was, "It's so easy to fake it in the office and look busy, "but when you're at home all you have is your deliverable." so it really, it seems like there's kind of a forcing function to get people to pay attention to the things they should be managing to anyway. >> You said it, forcing functions. I talk about this all the time, but there are so many forcing functions in remote that help you do remote well. But not only just do remote well, just run your business well. Even if you plan on going back to office. On some level there's a lot of things you can do now to help pave the infrastructure to creating a better and more effective team. And as a manager, if you have it in a writing down. The metrics or expectations for your direct reports in the office, now's the time to do it. Subjectivity is allowed to flourish in the office. You can praise or promote people just kind of how much you like them or how easy they are to work with. That really has nothing to do with metrics and results. I've often been asked, "How do you know if "someone's been working remotely?" And my response is, how do you know if they were working in the office. If you can't clearly answer that in the office, then you're not going to be able to answer it remotely. So frankly, in these times a lot of the burden falls more on the manager to actually take a hard look at what they're clarifying to their team. And if the metrics aren't laid out. It's on the manager to lay that out. It's not the responsibility of the direct report to figure out how to prove their worth. The manager has to be very articulant about what that value looks like. >> Right, and not only do they have to be articulate about what the deliverables are and what their expectations are, but. You guys have a remote play book GitLab has published, which is terrific. People should go online, it's 38 pages of dense, dense, dense material. It's a terrific resource, it's a open source, you got to love the open source, eat those. But one of the slides that jumped out to me, and it's consistent with a lot of these conversations that we're having, is that your frequency of communications when people are not in the same room together. Has to go up dramatically, which is a little counter intuitive, but what I found even more interesting was the variety of types of communication. Not just you kind of standard meeting, or you standard status on a project, or maybe a little bit of a look forward to some strategic stuff. But you outlined a whole variety of types of communication. Objectives or methods, or feel if you will, to help people stay connected and to help kind keep this team building going forward. >> So here's the thing about communication. You've got to be intentional about it in a remote setting. And in fact, you need to have more intentionality across the board in a remote setting. And communications is just a very obvious. So for a lot of companies, they leave a lot of things to spontaneity Inter-personal relationships and communications are two of the biggest ones. Where you may not actually lay out a plan for how work is communicated about, or what opportunities you give people to chat about their weekends, or sports, or anything like that. You just kind of put them in the same building and then people just kind of figure it out. In a remote setting that's unwise. You're going to get a lot of chaos and disfunction when people don't know how to communicate and on what channel. 
So at GitLab we're very prescriptive that work communication happens in a GitLab issue or a merge request. And then informal communication happens through Zoom calls or Slack. We actually expired our Slack messages after 90 days, specifically to force people not to do work in Slack. We want the work to begin where it needs to end up, and in that case it's a very, it's a tool, GitLab, that's built for asynchronous communication. We want to continue to encourage that bias towards asynchronous communication. So yeah, we write down everything about how we want people to communicate and through what channels. And that may sound like a lot of rules, but actually it's very much appreciated by our global team. We have over 1200 people, in more that 65 countries. And they all just need to know where communication is going to happen. And our team is really cohesive and on the same page because we're articulant about that. >> So I want to double down on that. On 'A secret is peace', 'cause you brought this up, or you and Stu brought it up in your conversation with Stu, and Stu raised an interesting point, right. Unfortunately in the day of email and connected phones, and this and that, there has grown an expectation that used to be business, okay was, "I'll get back to you within 24 hours "if you leave me a voicemail." And lord knows what it was when we were still typing letters and memos, and sticking stuff in the yellow envelop with the string, right, as multiple days. But somehow that all got changed to, "I need to hear back form you now." And often it feels like, if your trying to have just some uninterrupted work time, to get something done. It's like, why is your lack of planning suddenly my emergency. And you talked about, you can't operate that on a global, asynchronous team because everyone's in different timezones. And just by rule, there are going to be a lot of people that are not awake when you need the answer to that question. But that you've developed a culture that that's okay, and that that is kind of the flow and the pacing which A, forces people to ask in advance, not immediately when you need it. But also gives people unfettered time to actually plan to do work versus plan to answer communications. I wonder if you can dig into how did that evolve and how do you enforce that when somebody comes in from the outside world. >> The real key to that is something that might not be immediately apparent to everyone. Which is, at GitLab we try to shift as much burden as we possibly can humans to documentation. And this even starts at onboarding, where to get onboarded at GitLab, you get an onboarding issue within GitLab, with over 200 check boxes of things to read and knowledge assessments to take. And humans are a part of it, but very minimal compared to what most companies would do. And the thing that you just outlined was, we're talking about asking questions. Or tapping someone on the shoulder to fill in a knowledge gap. But at GitLab we want to write everything down in a very formalized structured way. We try to work handbook first. So we need to document all of our processes, protocols and solutions. Basically everything that we've ever seen or done, needs to be documented in the handbook. So it's not that GitLab team members just magically need less information, it's just that instead of having to ask someone on our team, we go ask the handbook. We go consult the documentation. 
And the more rich that your documentation is, the less you have to bother other people, and the less you need to rely on synchronicity. So for us it all starts with operating handbook first. That allows our humans to reserve their cycles for doing truly creative things, not just answering your question for the thousandth time. >> Right, another thing you covered, which I really enjoyed was getting senior executives to work from home for an extended period of time. Now obviously, before COVID that would probably be a lot harder to do. Well now COVID has forced that. And I think to your point about that is, it really forces the empathy for someone who had no interest in working from home. Didn't like to work from home. Loves going to the office, has their routine. Been doing it for decades, to kind of wake up to A, you need to have more empathy for what this is all about. And B, what's it all about by actually doing it. So I wonder, kind of your take in the movement to more of a work from anywhere future. Now that all the senior executives have been thrown into this work from home situation. >> Look Jeff, you never want to waste a crisis. We can't wish away the crisis that's in front of us, but we can choose how we respond to it. And this does present an opportunity to lay ground work, to lay infrastructure, to build a more remote organization. And I have absolutely advocated for companies to get their leadership teams out of the office for a meaningful amount of time. A month, ideally a quarter. So that they actually understand what the remote life is. They actually have some of those communication gaps and challenges so they can document what's happening. And then help fix it. But to your point, executives love going to the office because they're on a different playing field to begin with. They usually have an executive assistant. Things are just. There's less friction in general. So it behooves them to just kind of keep charging in that direction, but now what we have is a situation where all of those executives are remote. And I'm seeing a lot of them say, "You know what, I'm seeing the myths that I've perpetrated "break down in front of me." And this is even in the most suboptimal time ever to go remote. This isn't remote work, this is crisis induced work from home. We're all dealing with social isolation. Our parents are also doubling as homeschool teachers. We have a lot going on. And even on top of all of that, I'm amazed at how adaptable the human society has been. In just adjusting to this and figuring it out on the fly. And I think the companies that take this opportunity, to ask themselves the right question, and build this into their ongoing talent and operational strategy, will actually come out stronger on the other side. >> Yeah, as you said. This is as challenging as it's ever been. There was no planning ahead, you're spouse or significant other's also working from home. And has the same Zoom schedule as you do, for some strange reason, right. The kids are home as you said, and your homeschooling them. And they also have to get on Zoom to do their classes. So it's really suboptimal. But as you said, it's a forcing function and people are going to learn. One of the other things in your handbook is the kind of definitions. It's not just work from home or work at the office, but there's actually a continuum and a spectrum. And as people are doing this for weeks and months. And behaviors turn into habits. 
People are not going to want to go back to sitting on 101 for two hours every morning to go work on a laptop in the office. It just doesn't make sense. So as you kind of look forward. How do you see the evolution. How are people taking baby steps, if you will. To incorporate more of this learning as we go forward. And incorporate into more of their regular, everyday procedures. >> I'm really optimistic about the future because what I see happening here is people are unlocking their imaginations. So once they've kind of stabilized, they're starting to realize, "Hey, I'm getting a lot more time with my family. "I'm spending a lot less on gas. "I just feel better as a person because I don't show up "to work everyday with road rage. "So how can I keep this going." And I genuinely think what's going to happen in four or five months, we're going to have millions of people collectively look at each other and they say, "The boss just called me back into the office "but I just did my job from home. "Even in suboptimal conditions. "I saw my family more, I exercised more. "I had more time to cook and clean. "How about no, I'm not going to go back to the office "as my default location." And I think what's going to happen is the 80, 20 rule is going to flip. Right now people work from home only for a special occasion, like the cable company's coming or something like that. Going forward it think the offices are going to be the special occasion. You're only going to commute to the office, or fly to the office when you have a large contingent of people coming in and you need to wine and dine them, or something like that. And the second order of this is, people that are only living in expensive cities because of their location. When their lease comes up for renewal, they're going to cast a glance at places like Wyoming and Idaho, and Ohio. Maybe even Vietnam and Cambodia, or foreign places. Because now you have them thinking of, "What could life look like if I decouple geography at work. "I still want to work really hard "and contribute this knowledge. "But I can go to a place with better air quality, "better schools, better opportunity to actually "invest in a smaller community, "where I can see real impact." And I think that's just going to have massive, massive societal impacts. People are really taking this time to consider how tightly their identity has been woven into work. Now that they're home and they've become something more than just whatever the office life has defined them as. I think that's really healthy. I think a lot of people may have intertwined those two things too tightly in the past. And now it's a forcing function to really ask yourself, you aren't just your work, you're more than your work. And what can that look like when you can do that job from anywhere. >> Right, right. And as you said, there's so many kind of secondary benefits in terms of traffic and infrastructure, and the environment and all kinds of things. And the other thing I think that's interesting what you said, 80, 20 I think that was pretty generous. I wouldn't give it a 20 percent. But if people, even in this hybrid steps, do more once a week, twice a week. Once every two weeks, right. The impact on the infrastructure and peoples lives is going to be huge. But I wanted to drill on something as we go into kind of this hybrid mode at some point in time. And you talked about, and I thought it was fascinating, about the norms and really coming out from a work from home first, or a work from anywhere first. 
Your very good at specifying anywhere doesn't mean home. Could be the library, could be the coffee shop. Could be an office, could be a WeWork. Could be wherever. Because if you talked about the new norms and the one I thought was really interesting, which probably impacts a lot of teams, is when some of the team's in the office and some of the team isn't. The typical move, right, is to have everyone in the office go into the conference room. We sit around one big screen. So you get like five people sitting around one table and you got a bunch of heads on Zoom. And you said, "You know, no. "Let's all be remote. So if we just be happen to be sitting at our desk. If we happen to be in the office, that's okay. But really normalize. And like we saw the movement from Cloud got to Cloud-e to Cloud first, why not Cloud. And then you know, kind of mobile and does it work in a mobile. No, no, no it has to. It's mobile first. Really the shift to not, can it be done at home, but tell me why it shouldn't be done at home, a really different kind of opening position as to how people deploy resources and think about staffing and assigning teams. It's like turning the whole thing upside down. >> Completely upside down. I think remote first to your point, is going to be the default going forward. I think we're just one or two quarters away from major CEO's sitting on the hot seat on CNBC, when it's their turn for quarterly earnings. And they're going to have to justify why they're spending what they spend on real estate. Is if your spending a billion dollars a year on real estate, you could easily deploy that to more people, more R&D. Once that question is asked in mass, that is when you're going to see the next phase of this. Where you really have to justify, even from a cost stand point, why are you spending so much? Why are you tying so much of your business results to geography. The thing about remote first is that it's not a us versus them. A lot of what we've learned at GitLab, and how we operate so efficiently. They work really well for remote teams, and they are remote first. But they would work just as well in an office. We attach a Google doc agenda to every single business meeting that we have, so that there's always an artifact. There's always a documented thread on what happened in a meeting. Now this would work just as well in a co-located meeting. Who wouldn't want to have a meeting where it's not just in one ear and out the other. You're going to give the time to the meeting, you might as well get something out of it. And so a lot of these remote forced. Remote first forcing functions, they do help remote teams work well. But I think it's especially important for hybrid teams. Offices aren't going to vanish overnight. A lot of these companies are going to have some part of their company return to the office, when travel restrictions are lifted. It think the key here is that its going to be a lot more fluent. You're never going to know on a day to day basis, who is coming into the office, and who is not. So you need to optimize for everyone being out of the office. And if they just so happen to be there, they just so happen to be there. >> Right, right. So before we. I want to get into one little nitty gritty subject, in terms of investment into the home office. You know, we're doing 100% remote interviews now on theCUBE, we used to go to pretty much. Probably 80% of our business was at events, or at peoples offices, or facilities. Now it's all dial-in. 
You talked a lot about companies needing to flex a little bit on enabling people to invest in the little bits and pieces of infrastructure for their home office, because they just don't have the same setups. You're talking about multiple monitors, a comfortable chair, a good light. There are a few things you can invest in, not tremendous amounts of money, but a couple of hundred bucks here and there, that make a big difference in the home work environment. And how should people think about making that investment in something like a big monitor that the company doesn't see, because it's not sitting at a desk in the office? >> 100%. Look, if you're coming from a co-located space, you're probably sitting in a cube that costs five, 10, maybe 20 thousand dollars to put together. You might not notice that, but it's not cheap to build cubicles in a high rise. And if you go to your home and you have nothing set up, I would say it's on the people group to think really hard about being more lax and more lenient about spending policy. People need multiple monitors. You need a decent webcam, you need a decent microphone. You need a chair that isn't going to kill your back. You want to help people create healthy ergonomics, sustainable workspaces in their homes. This is the kind of thing that will inevitably impact productivity. If you force someone to just be hunched over on their couch, in front of a 13 inch laptop, I mean, what kind of productivity do you really expect from that? That's not a great long term solution. I think the people group actually has a higher burden to bear all the way around, you know, when it comes to making sure teams feel like teams and they have the atmosphere to connect on a meaningful level. It comes down to the people group not letting that just go to spontaneity. If you want to have a happy hour virtually, you're going to have to put a calendar invite on people's calendars. You're going to create topical channels in Slack for people to talk about things other than work. Someone's going to have to do that. They don't just happen by default. So, from hardware all the way to communication, the people group really needs to use this opportunity to think about, "Okay, what can we unlock in this new world?" >> Right, I'm glad you said the people group and not the resources group, because they're not coal, or steel, or a factory. >> No, if anything COVID has humanized this in a way, and I think it's actually a really big silver lining, where we're all now peering into each other's homes. And it is glaringly obvious that we're all humans first, colleagues second. And of course that's always been the case, but there's something about a sterile or co-located work environment: you check a piece of yourself at the door and you just kind of get down to business. Why is that? We have technology at our fingertips. We can be humans with each other. And that's going to actually encourage more empathy. As we've seen at GitLab, more empathy leads to better business results. It leads to more meaningful connections. I mean, I have people, friends, located all over the world that I feel I have a closer bond with, a closer, more intimate connection with, than a lot of people I've met in an office. To some degree you don't know who they really are. You don't know what they really love and what makes them tick. >> Right, right. All right Darren, so before I let you go, and again thank you for the time and the conversation. I'm sure everyone is calling you up, and I just love the open source ethos and the sharing.
It's made such a huge impact on the technology world, with second order impacts that a lot of people take advantage of. Again, give us the place that people can go for the playbook, so they can come and leverage some of the resources. And again, thank you guys for publishing them. >> Absolutely. So we're open source; we try to open source all of our learnings on remote. So go to allremote.info, that will redirect you right into the All Remote section of the GitLab handbook, all of which is open source. Right at the top you can download the remote playbook, which is the PDF that we talked about. Download that; it takes you through all of our best information on getting started and thriving as a remote team. Just under that there are a lot of comprehensive guides on how we think about everything, and how we operate asynchronously, how we handle meetings, and even hiring and compensation. allremote.info, and of course you're welcome to reach out to me on Twitter, I'm @darrenmurph. >> All right, well thanks a lot Darren. And I find it somewhat ironic that you have a jetliner over your shoulder, waiting for the lockdown and the quarantine to end so you can get back on an airplane. And we're looking forward to that day. >> Can't wait, man. I miss, I miss the airplanes. I told someone the other day, I never thought I'd say I miss having a middle seat at the very back of the airplane, with someone reclined into my nose. But honestly, I can't wait. Take me anywhere. >> I think you'll be fighting people for that seat in another month or so. All right, thanks a lot, Darren. >> Absolutely, take care all. >> All right, he's Darren, I'm Jeff, you're watching theCUBE, from our Palo Alto Studios. Thanks for watching, we'll see you next time. (upbeat music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jeff | PERSON | 0.99+ |
Idaho | LOCATION | 0.99+ |
Jeff Frick | PERSON | 0.99+ |
Darren Murph | PERSON | 0.99+ |
Vietnam | LOCATION | 0.99+ |
Wyoming | LOCATION | 0.99+ |
10 | QUANTITY | 0.99+ |
20 percent | QUANTITY | 0.99+ |
Ohio | LOCATION | 0.99+ |
Cambodia | LOCATION | 0.99+ |
Darren | PERSON | 0.99+ |
38 pages | QUANTITY | 0.99+ |
April 2020 | DATE | 0.99+ |
2020 | DATE | 0.99+ |
Stu | PERSON | 0.99+ |
five people | QUANTITY | 0.99+ |
80 | QUANTITY | 0.99+ |
two hours | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
80% | QUANTITY | 0.99+ |
@darrenmurph | PERSON | 0.99+ |
one | QUANTITY | 0.99+ |
20 thousand dollars | QUANTITY | 0.99+ |
100% | QUANTITY | 0.99+ |
GitLab | ORGANIZATION | 0.99+ |
Boston | LOCATION | 0.99+ |
20 | QUANTITY | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
one table | QUANTITY | 0.99+ |
five | QUANTITY | 0.99+ |
13 inch | QUANTITY | 0.99+ |
CNBC | ORGANIZATION | 0.99+ |
four | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
two things | QUANTITY | 0.99+ |
over 1200 people | QUANTITY | 0.98+ |
First | QUANTITY | 0.98+ |
allremote.info | OTHER | 0.98+ |
theCUBE Studios | ORGANIZATION | 0.98+ |
thousandth time | QUANTITY | 0.98+ |
five months | QUANTITY | 0.97+ |
second order | QUANTITY | 0.97+ |
Cloud | TITLE | 0.97+ |
earlier this year | DATE | 0.97+ |
A month | QUANTITY | 0.97+ |
One | QUANTITY | 0.96+ |
twice a week | QUANTITY | 0.96+ |
24 hours | QUANTITY | 0.96+ |
once a week | QUANTITY | 0.95+ |
Slack | TITLE | 0.95+ |
theCUBE | ORGANIZATION | 0.95+ |
over 200 check boxes | QUANTITY | 0.95+ |
millions of people | QUANTITY | 0.94+ |
90 days | QUANTITY | 0.93+ |
COVID | ORGANIZATION | 0.93+ |
Martin Mikos | PERSON | 0.93+ |
ORGANIZATION | 0.91+ | |
second | QUANTITY | 0.9+ |
GitLab | TITLE | 0.89+ |
e | TITLE | 0.87+ |
a quarter | QUANTITY | 0.87+ |
WeWork | ORGANIZATION | 0.87+ |
Vertica Database Designer - Today and Tomorrow
>> Jeff: Hello everybody and thank you for joining us today for the Virtual VERTICA BDC 2020. Today's breakout session is titled "VERTICA Database Designer Today and Tomorrow." I'm Jeff Healey, VERTICA Product Marketing, and I'll be your host for this breakout session. Joining me today is Yuanzhe Bei, Senior Technical Manager from VERTICA Engineering. But before we begin, I encourage you to submit questions or comments during the virtual session. You don't have to wait, just type your question or comment in the question box below the slides and click Submit. As always, there will be a Q&A session at the end of the presentation. We'll answer as many questions as we're able to during that time; any questions we don't address, we'll do our best to answer offline. Alternatively, visit the VERTICA forums at forum.vertica.com to post your questions there after the session. Our engineering team is planning to join the forums to keep the conversation going. Also, a reminder that you can maximize your screen by clicking the double arrow button at the lower right corner of the slides. And yes, this virtual session is being recorded and will be available to view on demand this week. We will send you a notification as soon as it's ready. Now let's get started. Over to you, Yuanzhe. >> Yuanzhe: Thanks Jeff. Hi everyone, my name is Yuanzhe Bei, I'm a Senior Technical Manager in the VERTICA Server R&D Group. I run the query optimizer, catalog and disaggregated engine teams. Very glad to be here today to talk about "VERTICA Database Designer Today and Tomorrow". This presentation will be organized as follows: I will first refresh some knowledge about VERTICA fundamentals such as tables and projections, which will bring us to the questions "What is Database Designer?" and "Why do we need this tool?". Then I will take you through a deep dive into Database Designer, or as we call it, DBD, and see how DBD's internals work. After that I'll show you some exciting DBD improvements we have planned for the 10.0 release, and lastly I will share with you the DBD future roadmap we have planned next. As most of you should already know, VERTICA is built on a columnar architecture. That means data is stored column-wise. Here we can see a very simple example of a table with four columns, and as many of you may also know, a table in VERTICA is a virtual concept. It's just a logical representation of data, which means users can write SQL queries that reference the table name and columns, just like in any other relational database management system, but the actual physical storage of data is called a projection. A projection can reference a subset, or all, of the columns of its anchor table, and must be sorted by at least one column. Each table needs at least one superprojection, which references all the columns of the table. If you load data into a table with no projection, an automatic superprojection, an auto-projection, will be created, which will be arbitrarily sorted by the first couple of columns in the table. As you can imagine, even though such an auto-projection can be used to answer any query, the performance is not optimized in most cases. A common practice in VERTICA is to create multiple projections on the same table, containing different sets of columns and sorted in different ways. When a query is sent to the server, the optimizer will pick the projection that can answer the query in the most efficient way.
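To make that concrete, here is a minimal sketch of how such a set of projections might be declared. The table, column and projection names are hypothetical stand-ins for the slide being described (which is not reproduced in this transcript), and clauses such as segmentation should be checked against the documentation for your VERTICA version:

```sql
-- Hypothetical anchor table with four columns
CREATE TABLE public.t (a INT, b INT, c INT, d INT);

-- Projection 1: a superprojection covering every column, sorted by a
CREATE PROJECTION public.t_super
AS SELECT a, b, c, d FROM public.t
ORDER BY a
SEGMENTED BY HASH(a) ALL NODES;

-- Projection 2: only c and d, sorted by c
CREATE PROJECTION public.t_cd
AS SELECT c, d FROM public.t
ORDER BY c
SEGMENTED BY HASH(c) ALL NODES;

-- Projection 3: b, d and c, sorted by b then d
CREATE PROJECTION public.t_bdc
AS SELECT b, d, c FROM public.t
ORDER BY b, d
SEGMENTED BY HASH(b) ALL NODES;

-- Populate the new projections from any data already loaded
SELECT START_REFRESH();
```

Running EXPLAIN on a query such as SELECT b, d, c FROM public.t ORDER BY b, d should then show the optimizer choosing the projection whose sort order already matches the query, which is exactly the behavior the next example walks through.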
For example, let's say you have a query that selects columns B, D, C, sorted by B and D. The third projection will be ideal, because the data is already sorted, so you save the sorting cost while executing the query. Basically, when you choose the design of a projection, you need to consider four things. First and foremost, of course, the sort order. Data already sorted in the right way can benefit quite a lot of query operations: Order By, Group By, analytics, merge, join, predicates and so on. The selected column set is also important, because the projection must contain all the columns referenced by your workload queries. If even one column is missing from the projection, that projection cannot be used for a particular query. In addition, VERTICA is a distributed database and allows a projection to be segmented based on the hash of a set of columns, which is beneficial if the segmentation matches the join keys or group keys. And finally, the encoding of each column is also part of the design, because data sorted in a different way may have a completely different optimal encoding for each column. This example only shows the benefit of the first two, but you can imagine the other two are also important. Even so, it doesn't sound that hard, right? Well, I hope you change your mind when you see this; at least I do. These machine-generated queries really beat me. It would probably take an experienced DBA hours to figure out which projections could benefit these queries, not to mention that there could be hundreds of such queries in a regular workload in the real world. So what can we do? That's why we need DBD. DBD is a tool integrated in the VERTICA server that can help the DBA analyze their workload queries, table schemas and data, and then automatically figure out the most optimized projection design for their workload. In addition, DBD is also a sophisticated tool that can be customized by the user through many parameters, objectives and so on. And lastly, DBD has access to the optimizer, so DBD knows what kind of attributes the projections need to have in order for the optimizer to benefit from them. DBD has been there for years, and I'm sure there are plenty of materials available online to show you how DBD can be used in different scenarios: whether to achieve a query-optimized or load-optimized design, whether it's a comprehensive design or an incremental design, whether you dump the deployment script and deploy manually later or let DBD do the auto-deployment for you, and many other options. I'm not planning to talk about this today. Instead, I will take the opportunity today to open this black box, DBD, and show you what exactly hides inside. DBD is a complex tool and I have tried my best to summarize the DBD design process into seven steps: Extract, Permute, Prune, Build, Score, Identify and Encode. What do they mean? Don't worry, I will show you step by step. The first step is Extract: extract interesting columns. In this step, DBD parses the design queries and figures out the operations that can be benefited by a potential projection design, and extracts the corresponding columns as interesting columns. So predicates, Group By, Order By, join conditions and analytics are all interesting columns to DBD. As you can see with these three simple sample queries, DBD extracts the interesting column sets on the right. Some of these column sets are unordered.
For example, the green one for Group By a1 and b1, the DBD extracts the interesting column set, and put them in the own orders set, because either data sorted by a1 first or b1 first, can benefit from this Group By operation. Some of the other sets are ordered, and the best example is here, order by clause a2 and b2, and obviously you cannot sort it by b2 and then a2. These interesting columns set will be used as if, to extend to actual projection sort order candidates. The next step is Permute, once DBD extract all the C's, it will enumerate sort order using C, and how does DBD do that? I'm starting with a very simple example. So here you can see DBD can enumerate two sort orders, by extending d1 with the unordered set a1, b1, and the derived at two sort order candidates, d1, a1, b1, and d1, b1, a1. This sort order can benefit queries with predicate on d1, and also benefit queries by Group By a1, b1, when a1, sorry when d1 is constant. So with the same idea, DBD will try to extend other States with each other, and populate more sort order permutations. You can imagine that how many of them, there could be many of them, these candidates, based on how many queries you have in the design and that can be handled of the sort order candidates. That comes to the third step, which is Pruning. This step is to limit the candidates sort order, so that the design won't be running forever. DBD uses very simple capping mechanism. It sorts all the, sort all the candidates, are ranked by length, and only a certain number of the sort order, with longest length, will be moved forward to the next step. And now we have all the sort orders candidate, that we want to try, but whether this sort order candidate, will be actually be benefit from the optimizer, DBD need to ask the optiizer. So this step before that happens, this step has to build those projection candidate, in the catalog. So this step will build, will generates the projection DBL's, surround the sort order, and create this projection in the catalog. These projections won't be loaded with real data, because that takes a lot of time, instead, DBD will copy over the statistic, on existing projections, to this projection candidates, so that the optimizer can use them. The next step is Score. Scoring with optimizer. Now projection candidates are built in the catalog. DBD can send a work log queries to optimizer, to generate a query plan. And then optimizer will return the query plan, DBD will go through the query plan, and investigate whether, there are certain benefits being achieved. The benefits list have been growing over time, when optimizer add more optimizations. Let's say in this case because the projection candidates, can be sorted by the b1 and a1, it is eligible for Group By Pipe benefit. Each benefit has a preset score. The overall benefit score of all design queries, will be aggregated and then recorded, for each projection candidate. We are almost there. Now we have all the total benefit score, for the projection candidates, we derived on the work log queries. Now the job is easy. You can just pick the sort order with the highest score as the winner. Here we have the winner d1, b1 and a1. Sometimes you need to find more winners, because the chosen winner may only benefit a subset, of the work log query you provided to the DBD. So in order to have the rest of the queries, to be also benefit, you need more projections. 
So in this case, DBD will go to the next iteration, and let's say in this case find to another winner, d1, c1, to benefit the work log queries, that cannot be benefit by d1, b1 and a1. The number of iterations and thus the winner outcome, DBD really depends on the design objective that uses that. It can be load optimized, which means that only one, super projection winner will be selected, or query optimized, where DBD try to create as many projections, to cover most of the work log queries, or somewhat balance an objective in the middle. The last step is to decide encoding, for each projection columns, for the projection winners. Because the data are sorted differently, the encoding benefits, can be very different from the existing projection. So choose the right projection encoding design, will save the disk footprint a significant factor. So it's worth the effort, to find out the best thing encoding. DBD picks the encoding, based on the actual sampling the data, and measure the storage footprint. For example, in this case, the projection winner has three columns, and say each column has a few encoding options. DBD will write the sample data in the way this projection is sorted, and then you can see with different encoding, the disk footprint is different. DBD will then compare the disk footprint of each, of different options for each column, and pick the best encoding options, based on the one that has the smallest storage footprint. Nothing magical here, but it just works pretty well. And basic that how DBD internal works, of course, I think we've heard it quite a lot. For example, I didn't mention how the DBD handles segmentation, but the idea is similar to analyze the sort order. But I hope this section gave you some basic idea, about DBD for today. So now let's talk about tomorrow. And here comes the exciting part. In version 10.0, we significantly improve the DBD in many ways. In this talk I will highlight four issues in old DBD and describe how the 10.0 version new DBD, will address those issues. The first issue is that a DBD API is too complex. In most situations, what user really want is very simple. My queries were slow yesterday, with the new or different projection can help speed it up? However, to answer a simple question like this using DBD, user will be very likely to have the documentation open on the side, because they have to go through it's whole complex flow, from creating a projection, run the design, get outputs and then create a design in the end. And that's not there yet, for each step, there are several functions user need to call in order. So adding these up, user need to write the quite long script with dozens of functions, it's just too complicated, and most of you may find it annoying. They either manually tune the projection to themselves, or simply live with the performance and come back, when it gets really slow again, and of course in most situations, they never come back to use the DBD. In 10.0 VERTICA support the new simplified API, to run DBD easily. There will be just one function designer_single_run and one argument, the interval that you think, your query was slow. In this case, user complained about it yesterday. So what does this user to need to do, is just specify one day, as argument and run it. The user don't need to provide anything else, because the DBD will look up his query or history, within that time window and automatically populate design, run design and export the projection design, and the clean up, no user intervention needed. 
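As a rough sketch, and assuming the function name and interval-style argument described above (the exact spelling and options should be confirmed against the 10.0 documentation), that single call might look like this:

```sql
-- One call: look at the queries that ran in the last day, design
-- better projections for them, export the design, and clean up
SELECT DESIGNER_SINGLE_RUN('1 day');
```

The same pattern would apply to other windows, for example '4 hours' for a complaint about this morning's dashboards.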
No need to have the documentation on the side and carefully write a script, and a debug, just one function call. That's it. Very simple. So that must be pretty impressive, right? So now here comes to another issue. To fully utilize this single round function, users are encouraged to run DBD on the production cluster. However, in fact, VERTICA used to not recommend, to run a design on a production cluster. One of the reasons issue, is that DBD picks massive locks, both table locks and catalog locks, which will badly interfere the running workload, on a production cluster. As of 10.0, we eliminated all the table and ten catalog locks from DBD. Yes, we eliminate 100% of them, simple improvement, clear win. The third issue, which user may not be aware of, is that DBD writes intermediate result. into real VERTICA tables, the real DBD have to do that is, DBD is the background task. So the intermediate results, some user needs to monitor it, the progress of the DBD in concurrent session. For complex design, the intermediate result can be quite massive, and as a result, many lost files will be created, and written to the disk, and we should both stress, the catalog, and that the disk can slow down the design. For ER mode, it's even worse because, the table are shared on communal storage. So writing to the regular table, means that it has to upload the data, to the communal storage, which is even more expensive and disruptive. In 10.0, we significantly restructure the intermediate results buffer, and make this shared in memory data structure. Monitoring queries will go directly look up, in memory data structure, and go through the system table, and return the results. No Intermediate Results files will be written anymore. Another expensive lubidge of local disk for DBD is encoding design, as I mentioned earlier in the deep dive, to determine which encoding works the best for the new projection design, there's no magic way, but the DBD need to actually write down, the sample data to the disk, using the different encoding options, and to find out which ones have the smallest footprint, or pick it as the best choice. These written sample data will be useless after this, and it will be wiped out right away, and you can imagine this is a huge waste of the system resource. In 10.0 we improve this process. So instead of writing, the different encoded data on the disk, and then read the file size, DBD aggregate the data block size on-the-fly. The data block will not be written to the disk, so the overall encoding and design is more efficient and non-disruptive. Of course, this is just about the start. The reason why we put a significant amount of the resource on the improving the DBD in 10.0, is because the VERTICA DBD, as essential component of the out of box performance design campaign. To simply illustrate the timeline, we are now on the second step, where we significantly reduced, the running overhead of the DBD, so that user will no longer fear, to run DBD on their production cluster. Please be noted that as of 10.0, we haven't really started changing, how DBD design algorithm works, so that what we have discussed in the deep dive today, still holds. For the next phase of DBD, we will briefly make the design process smarter, and this will include better enumeration mechanism, so that the pruning is more intelligence rather than brutal, then that will result in better design quality, and also faster design. The longer term is to make DBD to achieve the automation. 
What entail automation and what I really mean is that, instead of having user to decide when to use DBD, until their query is slow, VERTICA have to know, detect this event, and have have DBD run automatically for users, and suggest the better projections design, if the existing projection is not good enough. Of course, there will be a lot of work that need to be done, before we can actually fully achieve the automation. But we are working on that. At the end of day, what the user really wants, is the fast database, right? And thank you for listening to my presentation. so I hope you find it useful. Now let's get ready for the Q&A.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jeff | PERSON | 0.99+ |
Yuanzhe Bei | PERSON | 0.99+ |
Jeff Healey | PERSON | 0.99+ |
100% | QUANTITY | 0.99+ |
forum.vertica.com | OTHER | 0.99+ |
one day | QUANTITY | 0.99+ |
second step | QUANTITY | 0.99+ |
third step | QUANTITY | 0.99+ |
tomorrow | DATE | 0.99+ |
third issue | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
First | QUANTITY | 0.99+ |
yesterday | DATE | 0.99+ |
Each benefit | QUANTITY | 0.99+ |
Today | DATE | 0.99+ |
third projection | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
b2 | OTHER | 0.99+ |
each column | QUANTITY | 0.99+ |
first issue | QUANTITY | 0.99+ |
one column | QUANTITY | 0.99+ |
three columns | QUANTITY | 0.99+ |
VERTICA Engineering | ORGANIZATION | 0.99+ |
Yuanzhe | PERSON | 0.99+ |
each step | QUANTITY | 0.98+ |
Each table | QUANTITY | 0.98+ |
first step | QUANTITY | 0.98+ |
DBD | TITLE | 0.98+ |
DBD | ORGANIZATION | 0.98+ |
seven steps | QUANTITY | 0.98+ |
DBL | ORGANIZATION | 0.98+ |
each | QUANTITY | 0.98+ |
one argument | QUANTITY | 0.98+ |
VERTICA | TITLE | 0.98+ |
each projection | QUANTITY | 0.97+ |
first two | QUANTITY | 0.97+ |
first | QUANTITY | 0.97+ |
this week | DATE | 0.97+ |
hundreds | QUANTITY | 0.97+ |
one function | QUANTITY | 0.97+ |
clause a2 | OTHER | 0.97+ |
one | QUANTITY | 0.97+ |
each per columns | QUANTITY | 0.96+ |
Tomorrow | DATE | 0.96+ |
both | QUANTITY | 0.96+ |
four issues | QUANTITY | 0.95+ |
VERTICA | ORGANIZATION | 0.95+ |
b1 | OTHER | 0.95+ |
single round | QUANTITY | 0.94+ |
4/2 | DATE | 0.94+ |
first couple of columns | QUANTITY | 0.92+ |
VERTICA Database Designer Today and Tomorrow | TITLE | 0.91+ |
Vertica | ORGANIZATION | 0.91+ |
10.0 | QUANTITY | 0.89+ |
one function call | QUANTITY | 0.89+ |
a1 | OTHER | 0.89+ |
four things | QUANTITY | 0.88+ |
c1 | OTHER | 0.87+ |
two sort order | QUANTITY | 0.85+ |
The Shortest Path to Vertica – Best Practices for Data Warehouse Migration and ETL
hello everybody and thank you for joining us today for the virtual verdict of BBC 2020 today's breakout session is entitled the shortest path to Vertica best practices for data warehouse migration ETL I'm Jeff Healey I'll leave verdict and marketing I'll be your host for this breakout session joining me today are Marco guesser and Mauricio lychee vertical product engineer is joining us from yume region but before we begin I encourage you to submit questions or comments or in the virtual session don't have to wait just type question in a comment in the question box below the slides that click Submit as always there will be a Q&A session the end of the presentation will answer as many questions were able to during that time any questions we don't address we'll do our best to answer them offline alternatively visit Vertica forums that formed at vertical comm to post your questions there after the session our engineering team is planning to join the forums to keep the conversation going also reminder that you can maximize your screen by clicking the double arrow button and lower right corner of the sides and yes this virtual session is being recorded be available to view on demand this week send you a notification as soon as it's ready now let's get started over to you mark marco andretti oh hello everybody this is Marco speaking a sales engineer from Amir said I'll just get going ah this is the agenda part one will be done by me part two will be done by Mauricio the agenda is as you can see big bang or piece by piece and the migration of the DTL migration of the physical data model migration of et I saw VTL + bi functionality what to do with store procedures what to do with any possible existing user defined functions and migration of the data doctor will be by Maurice it you want to talk about emeritus Rider yeah hello everybody my name is Mauricio Felicia and I'm a birth record pre-sales like Marco I'm going to talk about how to optimize that were always using some specific vertical techniques like table flattening live aggregated projections so let me start with be a quick overview of the data browser migration process we are going to talk about today and normally we often suggest to start migrating the current that allows the older disease with limited or minimal changes in the overall architecture and yeah clearly we will have to port the DDL or to redirect the data access tool and we will platform but we should minimizing the initial phase the amount of changes in order to go go live as soon as possible this is something that we also suggest in the second phase we can start optimizing Bill arouse and which again with no or minimal changes in the architecture as such and during this optimization phase we can create for example dog projections or for some specific query or optimize encoding or change some of the visual spools this is something that we normally do if and when needed and finally and again if and when needed we go through the architectural design for these operations using full vertical techniques in order to take advantage of all the features we have in vertical and this is normally an iterative approach so we go back to name some of the specific feature before moving back to the architecture and science we are going through this process in the next few slides ok instead in order to encourage everyone to keep using their common sense when migrating to a new database management system people are you often afraid of it it's just often useful to use the analogy of how smooth 
in your old home you might have developed solutions for your everyday life that make perfect sense there for example if your old cent burner dog can't walk anymore you might be using a fork lifter to heap in through your window in the old home well in the new home consider the elevator and don't complain that the window is too small to fit the dog through this is very much in the same way as Narita but starting to make the transition gentle again I love to remain in my analogy with the house move picture your new house as your new holiday home begin to install everything you miss and everything you like from your old home once you have everything you need in your new house you can shut down themselves the old one so move each by feet and go for quick wins to make your audience happy you do bigbang only if they are going to retire the platform you are sitting on where you're really on a sinking ship otherwise again identify quick wings implement published and quickly in Vertica reap the benefits enjoy the applause use the gained reputation for further funding and if you find that nobody's using the old platform anymore you can shut it down if you really have to migrate you can still go to really go to big battle in one go only if you absolutely have to otherwise migrate by subject area use the group all similar clear divisions right having said that ah you start off by migrating objects objects in the database that's one of the very first steps it consists of migrating verbs the places where you can put the other objects into that is owners locations which is usually schemers then what do you have that you extract tables news then you convert the object definition deploy them to Vertica and think that you shouldn't do it manually never type what you can generate ultimate whatever you can use it enrolls usually there is a system tables in the old database that contains all the roads you can export those to a file reformat them and then you have a create role and create user scripts that you can apply to Vertica if LDAP Active Directory was used for the authentication the old database vertical supports anything within the l dubs standard catalogued schemas should be relatively straightforward with maybe sometimes the difference Vertica does not restrict you by defining a schema as a collection of all objects owned by a user but it supports it emulates it for old times sake Vertica does not need the catalog or if you absolutely need the catalog from the old tools that you use it it usually said it is always set to the name of the database in case of vertical having had now the schemas the catalogs the users and roles in place move the take the definition language of Jesus thought if you are allowed to it's best to use a tool that translates to date types in the PTL generated you might see as a mention of old idea to listen by memory to by the way several times in this presentation we are very happy to have it it actually can export the old database table definition because they got it works with the odbc it gets what the old database ODBC driver translates to ODBC and then it has internal translation tables to several target schema to several target DBMS flavors the most important which is obviously vertical if they force you to use something else there are always tubes like sequel plots in Oracle the show table command in Tara data etc H each DBMS should have a set of tools to extract the object definitions to be deployed in the other instance of the same DBMS ah if I talk about youth views 
usually a very new definition also in the old database catalog one thing that you might you you use special a bit of special care synonyms is something that were to get emulated different ways depending on the specific needs I said I stop you on the view or table to be referred to or something that is really neat but other databases don't have the search path in particular that works that works very much like the path environment variable in Windows or Linux where you specify in a table an object name without the schema name and then it searched it first in the first entry of the search path then in a second then in third which makes synonym hugely completely unneeded when you generate uvl we remained in the analogy of moving house dust and clean your stuff before placing it in the new house if you see a table like the one here at the bottom this is usually corpse of a bad migration in the past already an ID is usually an integer and not an almost floating-point data type a first name hardly ever has 256 characters and that if it's called higher DT it's not necessarily needed to store the second when somebody was hired so take good care in using while you are moving dust off your stuff and use better data types the same applies especially could string how many bytes does a string container contains for eurozone's it's not for it's actually 12 euros in utf-8 in the way that Vertica encodes strings and ASCII characters one died but the Euro sign thinks three that means that you have to very often you have when you have a single byte character set up a source you have to pay attention oversize it first because otherwise it gets rejected or truncated and then you you will have to very carefully check what their best science is the best promising is the most promising approach is to initially dimension strings in multiples of very initial length and again ODP with the command you see there would be - I you 2 comma 4 will double the lengths of what otherwise will single byte character and multiply that for the length of characters that are wide characters in traditional databases and then load the representative sample of your cells data and profile using the tools that we personally use to find the actually longest datatype and then make them shorter notice you might be talking about the issues of having too long and too big data types on projection design are we live and die with our projects you might know remember the rules on how default projects has come to exist the way that we do initially would be just like for the profiling load a representative sample of the data collector representative set of already known queries from the Vertica database designer and you don't have to decide immediately you can always amend things and otherwise follow the laws of physics avoid moving data back and forth across nodes avoid heavy iOS if you can design your your projections initially by hand encoding matters you know that the database designer is a very tight fisted thing it would optimize to use as little space as possible you will have to think of the fact that if you compress very well you might end up using more time in reading it this is the testimony to run once using several encoding types and you see that they are l e is the wrong length encoded if sorted is not even visible while the others are considerably slower you can get those nights and look it in look at them in detail I will go in detail you now hear about it VI migrations move usually you can expect 80% of everything to work to be 
able to live to be lifted and shifted you don't need most of the pre aggregated tables because we have live like regain projections many BI tools have specialized query objects for the dimensions and the facts and we have the possibility to use flatten tables that are going to be talked about later you might have to ride those by hand you will be able to switch off casting because vertical speeds of everything with laps Lyle aggregate projections and you have worked with molap cubes before you very probably won't meet them at all ETL tools what you will have to do is if you do it row by row in the old database consider changing everything to very big transactions and if you use in search statements with parameter markers consider writing to make pipes and using verticals copy command mouse inserts yeah copy c'mon that's what I have here ask you custom functionality you can see on this slide the verticals the biggest number of functions in the database we compare them regularly by far compared to any other database you might find that many of them that you have written won't be needed on the new database so look at the vertical catalog instead of trying to look to migrate a function that you don't need stored procedures are very often used in the old database to overcome their shortcomings that Vertica doesn't have very rarely you will have to actually write a procedure that involves a loop but it's really in our experience very very rarely usually you can just switch to standard scripting and this is basically repeating what Mauricio said in the interest of time I will skip this look at this one here the most of the database data warehouse migration talks should be automatic you can use you can automate GDL migration using ODB which is crucial data profiling it's not crucial but game-changing the encoding is the same thing you can automate at you using our database designer the physical data model optimization in general is game-changing you have the database designer use the provisioning use the old platforms tools to generate the SQL you have no objects without their onus is crucial and asking functions and procedures they are only crucial if they depict the company's intellectual property otherwise you can almost always replace them with something else that's it from me for now Thank You Marco Thank You Marco so we will now point our presentation talking about some of the Vertica that overall the presentation techniques that we can implement in order to improve the general efficiency of the dot arouse and let me start with a few simple messages well the first one is that you are supposed to optimize only if and when this is needed in most of the cases just a little shift from the old that allows to birth will provide you exhaust the person as if you were looking for or even better so in this case probably is not really needed to to optimize anything in case you want optimize or you need to optimize then keep in mind some of the vertical peculiarities for example implement delete and updates in the vertical way use live aggregate projections in order to avoid or better in order to limit the goodbye executions at one time used for flattening in order to avoid or limit joint and and then you can also implement invert have some specific birth extensions life for example time series analysis or machine learning on top of your data we will now start by reviewing the first of these ballots optimize if and when needed well if this is okay I mean if you get when you migrate from the old data 
where else to birth without any optimization if the first four month level is okay then probably you only took my jacketing but this is not the case one very easier to dispute in session technique that you can ask is to ask basket cells to optimize the physical data model using the birth ticket of a designer how well DB deal which is the vertical database designer has several interfaces here I'm going to use what we call the DB DB programmatic API so basically sequel functions and using other databases you might need to hire experts looking at your data your data browser your table definition creating indexes or whatever in vertical all you need is to run something like these are simple as six single sequel statement to get a very well optimized physical base model you see that we start creating a new design then we had to be redesigned tables and queries the queries that we want to optimize we set our target in this case we are tuning the physical data model in order to maximize query performances this is why we are using my design query and in our statement another possible journal tip would be to tune in order to reduce storage or a mix between during storage and cheering queries and finally we asked Vertica to produce and deploy these optimized design in a matter of literally it's a matter of minutes and in a few minutes what you can get is a fully optimized fiscal data model okay this is something very very easy to implement keep in mind some of the vertical peculiarities Vaska is very well tuned for load and query operations aunt Berta bright rose container to biscuits hi the Pharos container is a group of files we will never ever change the content of this file the fact that the Rose containers files are never modified is one of the political peculiarities and these approach led us to use minimal locks we can add multiple load operations in parallel against the very same table assuming we don't have a primary or unique constraint on the target table in parallel as a sage because they will end up in two different growth containers salad in read committed requires in not rocket fuel and can run concurrently with insert selected because the Select will work on a snapshot of the catalog when the transaction start this is what we call snapshot isolation the kappa recovery because we never change our rows files are very simple and robust so we have a huge amount of bandages due to the fact that we never change the content of B rows files contain indiarose containers but on the other side believes and updates require a little attention so what about delete first when you believe in the ethica you basically create a new object able it back so it appeared a bit later in the Rose or in memory and this vector will point to the data being deleted so that when the feed is executed Vertica will just ignore the rules listed in B delete records and it's not just about the leak and updating vertical consists of two operations delete and insert merge consists of either insert or update which interim is made of the little insert so basically if we tuned how the delete work we will also have tune the update in the merge so what should we do in order to optimize delete well remember what we said that every time we please actually we create a new object a delete vector so avoid committing believe and update too often we reduce work the work for the merge out for the removal method out activities that are run afterwards and be sure that all the interested projections will contain the column views in the 
dedicate this will let workers directly after access the projection without having to go through the super projection in order to create the vector and the delete will be much much faster and finally another very interesting optimization technique is trying to segregate the update and delete operation from Pyrenean third workload in order to reduce lock contention beliefs something we are going to discuss and these contain using partition partition operation this is exactly what I want to talk about now here you have a typical that arouse architecture so we have data arriving in a landing zone where the data is loaded that is from the data sources then we have a transformation a year writing into a staging area that in turn will feed the partitions block of data in the green data structure we have at the end those green data structure we have at the end are the ones used by the data access tools when they run their queries sometimes we might need to change old data for example because we have late records or maybe because we want to fix some errors that have been originated in the facilities so what we do in this case is we just copied back the partition we want to change or we want to adjust from the green interior a the end to the stage in the area we have a very fast operation which is Tokyo Station then we run our updates or our adjustment procedure or whatever we need in order to fix the errors in the data in the staging area and at the very same time people continues to you with green data structures that are at the end so we will never have contention between the two operations when we updating the staging area is completed what we have to do is just to run a swap partition between tables in order to swap the data that we just finished to adjust in be staging zone to the query area that is the green one at the end this swap partition is very fast is an atomic operation and basically what will happens is just that well exchange the pointer to the data this is a very very effective techniques and lot of customer useless so why flops on table and live aggregate for injections well basically we use slot in table and live aggregate objection to minimize or avoid joint this is what flatten table are used for or goodbye and this is what live aggregate projections are used for now compared to traditional data warehouses better can store and process and aggregate and join order of magnitudes more data that is a true columnar database joint and goodbye normally are not a problem at all they run faster than any traditional data browse that page there are still scenarios were deficits are so big and we are talking about petabytes of data and so quickly going that would mean be something in order to boost drop by and join performances and this is why you can't reduce live aggregate projections to perform aggregations hard loading time and limit the need for global appear on time and flux and tables to combine information from different entity uploading time and again avoid running joint has query undefined okay so live aggregate projections at this point in time we can use live aggregate projections using for built in aggregate functions which are some min Max and count okay let's see how this works suppose that you have a normal table in this case we have a table unit sold with three columns PIB their time and quantity which has been segmented in a given way and on top of this base table we call it uncle table we create a projection you see that we create the projection using the salad that 
will aggregate the data we get the PID we get the date portion of the time and we get the sum of quantity from from the base table grouping on the first two columns so PID and the date portion of day time okay what happens in this case when we load data into the base table all we have to do with load data into the base table when we load data into the base table we will feel of course big injections that assuming we are running with k61 we will have to projection to projections and we will know the data in those two projection with all the detail in data we are going to load into the table so PAB playtime and quantity but at the very same time at the very same time and without having to do nothing any any particular operation or without having to run any any ETL procedure we will also get automatically in the live aggregate projection for the data pre aggregated with be a big day portion of day time and the sum of quantity into the table name total quantity you see is something that we get for free without having to run any specific procedure and this is very very efficient so the key concept is that during the loading operation from VDL point of view is executed again the base table we do not explicitly aggregate data or we don't have any any plc do the aggregation is automatic and we'll bring the pizza to be live aggregate projection every time we go into the base table you see the two selection that we have we have on in this line on the left side and you see that those two selects will produce exactly the same result so running select PA did they trying some quantity from the base table or running the select star from the live aggregate projection will result exactly in the same data you know this is of course very useful but is much more useful result that if we and we can observe this if we run an explained if we run the select against the base table asking for this group data what happens behind the scene is that basically vertical itself that is a live aggregate projection with the data that has been already aggregating loading phase and rewrite your query using polite aggregate projection this happens automatically you see this is a query that ran a group by against unit sold and vertical decided to rewrite this clearly as something that has to be collected against the light aggregates projection because if I decrease this will save a huge amount of time and effort during the ETL cycle okay and is not just limited to be information you want to aggregate for example another query like select count this thing you might note that can't be seen better basically our goodbyes will also take advantage of the live aggregate injection and again this is something that happens automatically you don't have to do anything to get this okay one thing that we have to keep very very clear in mind Brassica what what we store in the live aggregate for injection are basically partially aggregated beta so in this example we have two inserts okay you see that we have the first insert that is entered in four volts and the second insert which is inserting five rules well in for each of these insert we will have a partial aggregation you will never know that after the first insert you will have a second one so better will calculate the aggregation of the data every time irin be insert it is a key concept and be also means that you can imagine lies the effectiveness of bees technique by inserting large chunk of data ok if you insert data row by row this technique live aggregate rejection is not very useful 
because for every goal that you insert you will have an aggregation so basically they'll live aggregate injection will end up containing the same number of rows that you have in the base table but if you everytime insert a large chunk of data the number of the aggregations that you will have in the library get from structure is much less than B base data so this is this is a key concept you can see how these works by counting the number of rows that you have in alive aggregate injection you see that if you run the select count star from the solved live aggregate rejection the query on the left side you will get four rules but actually if you explain this query you will see that he was reading six rows so this was because every of those two inserts that we're actively interested a few rows in three rows in India in the live aggregate projection so this is a key concept live aggregate projection keep partially aggregated data this final aggregation will always happen at runtime okay another which is very similar to be live aggregate projection or what we call top K projection we actually do not aggregate anything in the top case injection we just keep the last or limit the amount of rows that we collect using the limit over partition by all the by clothes and this again in this case we create on top of the base stable to top gay projection want to keep the last quantity that has been sold and the other one to keep the max quantity in both cases is just a matter of ordering the data in the first case using the B time column in the second page using quantity in both cases we fill projection with just the last roof and again this is something that we do when we insert data into the base table and this is something that happens automatically okay if we now run after the insert our select against either the max quantity okay or be lost wanted it okay we will get the very last you see that we have much less rows in the top k projections okay we told at the beginning that basically we can use for built-in function you might remember me max sum and count what if I want to create my own specific aggregation on top of the lid and customer sum up because our customers have very specific needs in terms of live aggregate projections well in this case you can code your own live aggregate production user-defined functions so you can create the user-defined transport function to implement any sort of complex aggregation while loading data basically after you implemented miss VPS you can deploy using a be pre pass approach that basically means the data is aggregated as loading time during the data ingestion or the batch approach that means that the data is when that woman is running on top which things to remember on live a granade projections they are limited to be built in function again some max min and count but you can call your own you DTF so you can do whatever you want they can reference only one table and for bass cab version before 9.3 it was impossible to update or delete on the uncle table this limit has been removed in 9.3 so you now can update and delete data from the uncle table okay live aggregate projection will follow the segmentation of the group by expression and in some cases the best optimizer can decide to pick the live aggregates objection or not depending on if depending on the fact that the aggregation is a consistent or not remember that if we insert and commit every single role to be uncoachable then we will end up with a live aggregate indirection that contains exactly the same 
number of rows in this case living block or using the base table it would be the same okay so this is one of the two fantastic techniques that we can implement in Burtka this live aggregate projection is basically to avoid or limit goodbyes the other which we are going to talk about is cutting table and be reused in order to avoid the means for joins remember that K is very fast running joints but when we scale up to petabytes of beta we need to boost and this is what we have in order to have is problem fixed regardless the amount of data we are dealing with so how what about suction table let me start with normalized schemas everybody knows what is a normalized scheme under is no but related stuff in this slide the main scope of an normalized schema is to reduce data redundancies so and the fact that we reduce data analysis is a good thing because we will obtain fast and more brides we will have to write into a database small chunks of data into the right table the problem with these normalized schemas is that when you run your queries you have to put together the information that arrives from different table and be required to run joint again jointly that again normally is very good to run joint but sometimes the amount of data makes not easy to deal with joints and joints sometimes are not easy to tune what happens in in the normal let's say traditional data browser is that we D normalize the schemas normally either manually or using an ETL so basically we have on one side in this light on the left side the normalized schemas where we can get very fast right on the other side on the left we have the wider table where we run all the three joints and pre aggregation in order to prepare the data for the queries and so we will have fast bribes on the left fast reads on the Left sorry fast bra on the right and fast read on the left side of these slides the probability in the middle because we will push all the complexity in the middle in the ETL that will have to transform be normalized schema into the water table and the way we normally implement these either manually using procedures that we call the door using ETL this is what happens in traditional data warehouse is that we will have to coach in ETL layer in order to round the insert select that will feed from the normalized schema and right into the widest table at the end the one that is used by the data access tools we we are going to to view store to run our theories so this approach is costly because of course someone will have to code this ETL and is slow because someone will have to execute those batches normally overnight after loading the data and maybe someone will have to check the following morning that everything was ok with the batch and is resource intensive of course and is also human being intensive because of the people that will have to code and check the results it ever thrown because it can fail and introduce a latency because there is a get in the time axis between the time t0 when you load the data into be normalized schema and the time t1 when we get the data finally ready to be to be queried so what would be inverter to facilitate this process is to create this flatten table with the flattened T work first you avoid data redundancy because you don't need the wide table on the normalized schema on the left side second is fully automatic you don't have to do anything you just have to insert the data into the water table and the ETL that you have coded is transformed into an insert select by vatika automatically you 
So what we do in Vertica to facilitate this process is to create a flattened table. With flattened tables, first, you avoid data redundancy, because you don't need both the wide table and the normalized schema on the left side. Second, it is fully automatic: you don't have to do anything, you just have to insert the data into the flattened table, and the ETL that you would have coded is transformed into an insert-select by Vertica automatically. You don't have to do anything, it's robust, and the latency is zero: as soon as you load the data into the flattened table, you will get all the joins executed for you. So let's have a look at how it works. In this case we have the table we are going to flatten, and basically we have to focus on two different clauses. The first thing you see is that there is one column here, the dimension value, which can be defined either as DEFAULT and then the SELECT, or as SET USING. The difference between DEFAULT and SET USING is when the data is populated: if we use DEFAULT, the data is populated as soon as we load the data into the base table; if we use SET USING, you will have to refresh. But everything is there, I mean, you don't need an ETL, you don't need to code any transformation, because everything is in the table definition itself, and it's for free, and of course the latency is zero: as soon as you load the other columns, you will have the dimension value populated as well. Okay, let's see an example. Here, suppose we have a dimension table, the customer dimension, on the left side, and we have a fact table on the right. You see that the fact table uses columns like o_name or o_city, which are basically the result of a select on top of the customer dimension. So basically the join is executed as soon as we load data into the fact table, directly into the fact table, without, of course, loading the data that arrives from the dimension; all the data from the dimension will be populated automatically. So let's have an example here. Suppose that we are running this insert: as you can see, we are running the insert directly into the fact table, and we are loading o_id, customer_id and total; we are not loading name or city. Those, name and city, will be automatically populated by Vertica for you, because of the definition of the flattened table. Okay, you see the behavior: all you need in order to have your wide table built for you is your flattened table, and this means that at runtime you won't need any join between the base fact table and the customer dimension that we have used in order to calculate name and city, because the data is already there. This was using DEFAULT; the other option is using SET USING. The concept is absolutely the same. You see that in this case, on the right side, we have basically replaced this o_name DEFAULT with o_name SET USING, and the same is true for city. The concept, as I said, is the same, but in this case, with SET USING, we will have to refresh: you see that we have to run this SELECT REFRESH_COLUMNS and then the name of the table. In this case all columns will be refreshed, or you can specify only certain columns, and this will bring in the values for name and city, reading from the customer dimension. So this technique is extremely useful. The difference between DEFAULT and SET USING, just to summarize the most important differences: you just have to remember that DEFAULT will populate your target when you load, SET USING when you refresh, and in some cases you might need to use them both. So in some cases you might want to use both DEFAULT and SET USING. In this example here, you see that we define o_name using both DEFAULT and SET USING, and this means that we will have the data populated either when we load the data into the base table or when we run the refresh.
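For reference, here is a minimal sketch of how the DEFAULT and SET USING clauses described above might look in a flattened table definition, together with the refresh call; the table and column names are assumptions rather than the exact ones on the slides.

-- Dimension table (assumed schema).
CREATE TABLE customer_dimension (
    customer_id INT PRIMARY KEY,
    name        VARCHAR(64),
    city        VARCHAR(64)
);

-- Flattened fact table: o_name is derived at load time (DEFAULT),
-- o_city is derived at refresh time (SET USING).
CREATE TABLE orders_fact (
    order_id    INT,
    customer_id INT,
    total       NUMERIC(12,2),
    o_name      VARCHAR(64) DEFAULT (SELECT name FROM customer_dimension c
                                     WHERE c.customer_id = orders_fact.customer_id),
    o_city      VARCHAR(64) SET USING (SELECT city FROM customer_dimension c
                                       WHERE c.customer_id = orders_fact.customer_id)
);

-- Load only the real fact columns; o_name is filled in automatically at load.
INSERT INTO orders_fact (order_id, customer_id, total) VALUES (1, 42, 99.90);

-- Populate the SET USING column on demand.
SELECT REFRESH_COLUMNS('orders_fact', 'o_city');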
This is a summary of the techniques that we can implement in Vertica in order to make our data warehouse even more efficient. And, well, basically, this is the end of our presentation. Thank you for listening, and now we are ready for the Q&A session.
A Technical Overview of Vertica Architecture
>> Paige: Hello, everybody and thank you for joining us today on the Virtual Vertica BDC 2020. Today's breakout session is entitled A Technical Overview of the Vertica Architecture. I'm Paige Roberts, Open Source Relations Manager at Vertica and I'll be your host for this webinar. Now joining me is Ryan Role-kuh? Did I say that right? (laughs) He's a Vertica Senior Software Engineer. >> Ryan: So it's Roelke. (laughs) >> Paige: Roelke, okay, I got it, all right. Ryan Roelke. And before we begin, I want to be sure and encourage you guys to submit your questions or your comments during the virtual session while Ryan is talking as you think of them as you go along. You don't have to wait to the end, just type in your question or your comment in the question box below the slides and click submit. There'll be a Q and A at the end of the presentation and we'll answer as many questions as we're able to during that time. Any questions that we don't address, we'll do our best to get back to you offline. Now, alternatively, you can visit the Vertica forums to post your question there after the session as well. Our engineering team is planning to join the forums to keep the conversation going, so you can have a chat afterwards with the engineer, just like any other conference. Now also, you can maximize your screen by clicking the double arrow button in the lower right corner of the slides and before you ask, yes, this virtual session is being recorded and it will be available to view on demand this week. We'll send you a notification as soon as it's ready. Now, let's get started. Over to you, Ryan. >> Ryan: Thanks, Paige. Good afternoon, everybody. My name is Ryan and I'm a Senior Software Engineer on Vertica's Development Team. I primarily work on improving Vertica's query execution engine, so usually in the space of making things faster. Today, I'm here to talk about something that's more general than that, so we're going to go through a technical overview of the Vertica architecture. So the intent of this talk, essentially, is to just explain some of the basic aspects of how Vertica works and what makes it such a great database software and to explain what makes a query execute so fast in Vertica, we'll provide some background to explain why other databases don't keep up. And we'll use that as a starting point to discuss an academic database that paved the way for Vertica. And then we'll explain how Vertica design builds upon that academic database to be the great software that it is today. I want to start by sharing somebody's approximation of an internet minute at some point in 2019. All of the data on this slide is generated by thousands or even millions of users and that's a huge amount of activity. Most of the applications depicted here are backed by one or more databases. Most of this activity will eventually result in changes to those databases. For the most part, we can categorize the way these databases are used into one of two paradigms. First up, we have online transaction processing or OLTP. OLTP workloads usually operate on single entries in a database, so an update to a retail inventory or a change in a bank account balance are both great examples of OLTP operations. Updates to these data sets must be visible immediately and there could be many transactions occurring concurrently from many different users. OLTP queries are usually key value queries. The key uniquely identifies the single entry in a database for reading or writing. 
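As an illustration, here is a hedged sketch of the kind of key-value OLTP query meant here; the accounts table anticipates the bank example described next, and the schema is an assumption.

-- Point lookup and point update, each touching a single row by its key.
SELECT balance FROM accounts WHERE account_id = 42;
UPDATE accounts SET balance = balance + 10 WHERE account_id = 42;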
Early databases and applications were probably designed for OLTP workloads. This example on the slide is typical of an OLTP workload. We have a table, accounts, such as for a bank, which tracks information for each of the bank's clients. An update query, like the one depicted here, might be run whenever a user deposits $10 into their bank account. Our second category is online analytical processing or OLAP which is more about using your data for decision making. If you have a hardware device which periodically records how it's doing, you could analyze trends of all your devices over time to observe what data patterns are likely to lead to failure or if you're Google, you might log user search activity to identify which links helped your users find the answer. Analytical processing has always been around but with the advent of the internet, it happened at scales that were unimaginable, even just 20 years ago. This SQL example is something you might see in an OLAP workload. We have a table, searches, logging user activity. We will eventually see one row in this table for each query submitted by users. If we want to find out what time of day our users are most active, then we could write a query like this one on the slide which counts the number of unique users running searches for each hour of the day. So now let's rewind to 2005. We don't have a picture of an internet minute in 2005, we don't have the data for that. We also don't have the data for a lot of other things. The term Big Data is not quite yet on anyone's radar and The Cloud is also not quite there or it's just starting to be. So if you have a database serving your application, it's probably optimized for OLTP workloads. OLAP workloads just aren't mainstream yet and database engineers probably don't have them in mind. So let's innovate. It's still 2005 and we want to try something new with our database. Let's take a look at what happens when we do run an analytic workload in 2005. Let's use as a motivating example a table of stock prices over time. In our table, the symbol column identifies the stock that was traded, the price column identifies the new price and the timestamp column indicates when the price changed. We have several other columns which, we should know that they're there, but we're not going to use them in any example queries. This table is designed for analytic queries. We're probably not going to make any updates or look at individual rows since we're logging historical data and want to analyze changes in stock price over time. Our database system is built to serve OLTP use cases, so it's probably going to store the table on disk in a single file like this one. Notice that each row contains all of the columns of our data in row major order. There's probably an index somewhere in the memory of the system which will help us to point lookups. Maybe our system expects that we will use the stock symbol and the trade time as lookup keys. So an index will provide quick lookups for those columns to the position of the whole row in the file. If we did have an update to a single row, then this representation would work great. We would seek to the row that we're interested in, finding it would probably be very fast using the in-memory index. And then we would update the file in place with our new value. On the other hand, if we ran an analytic query like we want to, the data access pattern is very different. The index is not helpful because we're looking up a whole range of rows, not just a single row. 
As a result, the only way to find the rows that we actually need for this query is to scan the entire file. We're going to end up scanning a lot of data that we don't need and that won't just be the rows that we don't need, there's many other columns in this table. Many information about who made the transaction, and we'll also be scanning through those columns for every single row in this table. That could be a very serious problem once we consider the scale of this file. Stocks change a lot, we probably have thousands or millions or maybe even billions of rows that are going to be stored in this file and we're going to scan all of these extra columns for every single row. If we tried out our stocks use case behind the desk for the Fortune 500 company, then we're probably going to be pretty disappointed. Our queries will eventually finish, but it might take so long that we don't even care about the answer anymore by the time that they do. Our database is not built for the task we want to use it for. Around the same time, a team of researchers in the North East have become aware of this problem and they decided to dedicate their time and research to it. These researchers weren't just anybody. The fruits of their labor, which we now like to call the C-Store Paper, was published by eventual Turing Award winner, Mike Stonebraker, along with several other researchers from elite universities. This paper presents the design of a read-optimized relational DBMS that contrasts sharply with most current systems, which are write-optimized. That sounds exactly like what we want for our stocks use case. Reasoning about what makes our queries executions so slow brought our researchers to the Memory Hierarchy, which essentially is a visualization of the relative speeds of different parts of a computer. At the top of the hierarchy, we have the fastest data units, which are, of course, also the most expensive to produce. As we move down the hierarchy, components get slower but also much cheaper and thus you can have more of them. Our OLTP databases data is stored in a file on the hard disk. We scanned the entirety of this file, even though we didn't need most of the data and now it turns out, that is just about the slowest thing that our query could possibly be doing by over two orders of magnitude. It should be clear, based on that, that the best thing we can do to optimize our query's execution is to avoid reading unnecessary data from the disk and that's what the C-Store researchers decided to look at. The key innovation of the C-Store paper does exactly that. Instead of storing data in a row major order, in a large file on disk, they transposed the data and stored each column in its own file. Now, if we run the same select query, we read only the relevant columns. The unnamed columns don't factor into the table scan at all since we don't even open the files. Zooming out to an internet scale sized data set, we can appreciate the savings here a lot more. But we still have to read a lot of data that we don't need to answer this particular query. Remember, we had two predicates, one on the symbol column and one on the timestamp column. Our query is only interested in AAPL stock, but we're still reading rows for all of the other stocks. So what can we do to optimize our disk read even more? Let's first partition our data set into different files based on the timestamp date. This means that we will keep separate files for each date. 
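As a reminder of the query being optimized, here is a hedged reconstruction of the sample analytic query with its two predicates; the column names are assumptions, since the actual SQL lives on the slide.

-- Average AAPL price for a single day: one predicate on the symbol
-- column, one on the timestamp column.
SELECT AVG(price)
FROM   stocks
WHERE  symbol = 'AAPL'
  AND  trade_ts >= '2005-05-01'
  AND  trade_ts <  '2005-05-02';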
When we query the stocks table, the database knows all of the files we have to open. If we have a simple predicate on the timestamp column, as our sample query does, then the database can use it to figure out which files we don't have to look at at all. So now all of our disk reads that we have to do to answer our query will produce rows that pass the timestamp predicate. This eliminates a lot of wasteful disk reads. But not all of them. We do have another predicate on the symbol column where symbol equals AAPL. We'd like to avoid disk reads of rows that don't satisfy that predicate either. And we can avoid those disk reads by clustering all the rows that match the symbol predicate together. If all of the AAPL rows are adjacent, then as soon as we see something different, we can stop reading the file. We won't see any more rows that can pass the predicate. Then we can use the positions of the rows we did find to identify which pieces of the other columns we need to read. One technique that we can use to cluster the rows is sorting. So we'll use the symbol column as a sort key for all of the columns. And that way we can reconstruct a whole row by seeking to the same row position in each file. It turns out, having sorted all of the rows, we can do a bit more. We don't have any more wasted disk reads but we can still be more efficient with how we're using the disk. We've clustered all of the rows with the same symbol together so we don't really need to bother repeating the symbol so many times in the same file. Let's just write the value once and say how many rows we have. This run length encoding technique can compress large numbers of rows into a small amount of space. In this example, we de-duplicate just a few rows but you can imagine de-duplicating many thousands of rows instead. This encoding is great for reducing the amount of disk we need to read at query time, but it also has the additional benefit of reducing the total size of our stored data. Now our query requires substantially fewer disk reads than it did when we started. Let's recap what the C-Store paper did to achieve that. First, we transposed our data to store each column in its own file. Now, queries only have to read the columns used in the query. Second, we partitioned the data into multiple file sets so that all rows in a file have the same value for the partition column. Now, a predicate on the partition column can skip non-matching file sets entirely. Third, we selected a column of our data to use as a sort key. Now rows with the same value for that column are clustered together, which allows our query to stop reading data once it finds non-matching rows. Finally, sorting the data this way enables high compression ratios, using run length encoding, which minimizes the size of the data stored on the disk. The C-Store system combined each of these innovative ideas to produce an academically significant result. And if you used it behind the desk of a Fortune 500 company in 2005, you probably would've been pretty pleased. But it's not 2005 anymore and the requirements of a modern database system are much stricter. So let's take a look at how C-Store fares in 2020. First of all, we have designed the storage layer of our database to optimize a single query in a single application. Our design optimizes the heck out of that query and probably some similar ones but if we want to do anything else with our data, we might be in a bit of trouble. What if we just decide we want to ask a different question? 
For example, in our stock example, what if we want to plot all the trades made by a single user over a large window of time? How do our optimizations for the previous query measure up here? Well, our data's partitioned on the trade date, that could still be useful, depending on our new query. If we want to look at a trader's activity over a long period of time, we would have to open a lot of files. But if we're still interested in just a day's worth of data, then this optimization is still an optimization. Within each file, our data is ordered on the stock symbol. That's probably not too useful anymore, the rows for a single trader aren't going to be clustered together so we will have to scan all of the rows in order to figure out which ones match. You could imagine a worse design but as it becomes crucial to optimize this new type of query, then we might have to go as far as reconfiguring the whole database. The next problem is one of scale. One server is probably not good enough to serve a database in 2020. C-Store, as described, runs on a single server and stores lots of files. What if the data overwhelms this small system? We could imagine exhausting the file system's inodes limit with lots of small files due to our partitioning scheme. Or we could imagine something simpler, just filling up the disk with huge volumes of data. But there's an even simpler problem than that. What if something goes wrong and C-Store crashes? Then our data is no longer available to us until the single server is brought back up. A third concern, another one of scalability, is that one deployment does not really suit all possible things and use cases we could imagine. We haven't really said anything about being flexible. A contemporary database system has to integrate with many other applications, which might themselves have pretty restricted deployment options. Or the demands imposed by our workloads have changed and the setup you had before doesn't suit what you need now. C-Store doesn't do anything to address these concerns. What the C-Store paper did do was lead very quickly to the founding of Vertica. Vertica's architecture and design are essentially all about bringing the C-Store designs into an enterprise software system. The C-Store paper was just an academic exercise so it didn't really need to address any of the hard problems that we just talked about. But Vertica, the first commercial database built upon the ideas of the C-Store paper, would definitely have to. This brings us back to the present to look at how an analytic query runs in 2020 on the Vertica Analytic Database. Vertica takes the key idea from the paper, that we can significantly improve query performance by changing the way our data is stored, and gives its users the tools to customize their storage layer in order to heavily optimize really important or commonly run queries. On top of that, Vertica is a distributed system which allows it to scale up to internet-sized data sets, as well as have better reliability and uptime. We'll now take a brief look at what Vertica does to address the three inadequacies of the C-Store system that we mentioned. To avoid locking into a single database design, Vertica provides tools for the database user to customize the way their data is stored. To address the shortcomings of a single node system, Vertica coordinates processing among multiple nodes. 
To acknowledge the large variety of desirable deployments, Vertica does not require any specialized hardware and has many features which smoothly integrate it with a Cloud computing environment. First, we'll look at the database design problem. We're a SQL database, so our users are writing SQL and describing their data in SQL way, the Create Table statement. Create Table is a logical description of what your data looks like but it doesn't specify the way that it has to be stored, For a single Create Table, we could imagine a lot of different storage layouts. Vertica adds some extensions to SQL so that users can go even further than Create Table and describe the way that they want the data to be stored. Using terminology from the C-Store paper, we provide the Create Projection statement. Create Projection specifies how table data should be laid out, including column encoding and sort order. A table can have multiple projections, each of which could be ordered on different columns. When you query a table, Vertica will answer the query using the projection which it determines to be the best match. Referring back to our stock example, here's a sample Create Table and Create Projection statement. Let's focus on our heavily optimized example query, which had predicates on the stock symbol and date. We specify that the table data is to be partitioned by date. The Create Projection Statement here is excellent for this query. We specify using the order by clause that the data should be ordered according to our predicates. We'll use the timestamp as a secondary sort key. Each projection stores a copy of the table data. If you don't expect to need a particular column in a projection, then you can leave it out. Our average price query didn't care about who did the trading, so maybe our projection design for this query can leave the trader column out entirely. If the question we want to ask ever does change, maybe we already have a suitable projection, but if we don't, then we can create another one. This example shows another projection which would be much better at identifying trends of traders, rather than identifying trends for a particular stock. Next, let's take a look at our second problem, that one, or excuse me, so how should you decide what design is best for your queries? Well, you could spend a lot of time figuring it out on your own, or you could use Vertica's Database Designer tool which will help you by automatically analyzing your queries and spitting out a design which it thinks is going to work really well. If you want to learn more about the Database Designer Tool, then you should attend the session Vertica Database Designer- Today and Tomorrow which will tell you a lot about what the Database Designer does and some recent improvements that we have made. Okay, now we'll move to our next problem. (laughs) The challenge that one server does not fit all. In 2020, we have several orders of magnitude more data than we had in 2005. And you need a lot more hardware to crunch it. It's not tractable to keep multiple petabytes of data in a system with a single server. So Vertica doesn't try. Vertica is a distributed system so will deploy multiple severs which work together to maintain such a high data volume. In a traditional Vertica deployment, each node keeps some of the data in its own locally-attached storage. Data is replicated so that there is a redundant copy somewhere else in the system. If any one node goes down, then the data that it served is still available on a different node. 
We'll also have it so that in the system, there's no special node with extra duties. All nodes are created equal. This ensures that there is no single point of failure. Rather than replicate all of your data, Vertica divvies it up amongst all of the nodes in your system. We call this segmentation. The way data is segmented is another parameter of storage customization and it can definitely have an impact upon query performance. A common way to segment data is by using a hash expression, which essentially randomizes the node that a row of data belongs to. But with a guarantee that the same data will always end up in the same place. Describing the way data is segmented is another part of the Create Projection Statement, as seen in this example. Here we segment on the hash of the symbol column so all rows with the same symbol will end up on the same node. For each row that we load into the system, we'll apply our segmentation expression. The result determines which segment the row belongs to and then we'll send the row to each node which holds the copy of that segment. In this example, our projection is marked KSAFE 1, so we will keep one redundant copy of each segment. When we load a row, we might find that its segment had copied on Node One and Node Three, so we'll send a copy of the row to each of those nodes. If Node One is temporarily disconnected from the network, then Node Three can serve the other copy of the segment so that the whole system remains available. The last challenge we brought up from the C-Store design was that one deployment does not fit all. Vertica's cluster design neatly addressed many of our concerns here. Our use of segmentation to distribute data means that a Vertica system can scale to any size of deployment. And since we lack any special hardware or nodes with special purposes, Vertica servers can run anywhere, on premise or in the Cloud. But let's suppose you need to scale out your cluster to rise to the demands of a higher workload. Suppose you want to add another node. This changes the division of the segmentation space. We'll have to re-segment every row in the database to find its new home and then we'll have to move around any data that belongs to a different segment. This is a very expensive operation, not something you want to be doing all that often. Traditional Vertica doesn't solve that problem especially well, but Vertica Eon Mode definitely does. Vertica's Eon Mode is a large set of features which are designed with a Cloud computing environment in mind. One feature of this design is elastic throughput scaling, which is the idea that you can smoothly change your cluster size without having to pay the expenses of shuffling your entire database. Vertica Eon Mode had an entire session dedicated to it this morning. I won't say any more about it here, but maybe you already attended that session or if you haven't, then I definitely encourage you to listen to the recording. If you'd like to learn more about the Vertica architecture, then you'll find on this slide links to several of the academic conference publications. These four papers here, as well as Vertica Seven Years Later paper which describes some of the Vertica designs seven years after the founding and also a paper about the innovations of Eon Mode and of course, the Vertica documentation is an excellent resource for learning more about what's going on in a Vertica system. I hope you enjoyed learning about the Vertica architecture. I would be very happy to take all of your questions now. 
Thank you for attending this session.
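Pulling the design discussion together, here is a hedged sketch of what the stock example's Create Table and Create Projection statements might look like, with partitioning on the trade date, a sort order matching the query predicates, RLE encoding on the sort key, and segmentation with one redundant copy; the names and types are assumptions, not the exact statements from the slides.

-- Logical table, partitioned by trade date.
CREATE TABLE stocks (
    symbol   VARCHAR(10),
    price    NUMERIC(10,2),
    trade_ts TIMESTAMP,
    trader   VARCHAR(50)
)
PARTITION BY trade_ts::DATE;

-- Projection tuned for the symbol/date query; the trader column is left out.
CREATE PROJECTION stocks_by_symbol (
    symbol ENCODING RLE,
    price,
    trade_ts
)
AS SELECT symbol, price, trade_ts
   FROM stocks
   ORDER BY symbol, trade_ts
   SEGMENTED BY HASH(symbol) ALL NODES KSAFE 1;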
Vertica Big Data Conference Keynote
>> Joy: Welcome to the Virtual Big Data Conference. Vertica is so excited to host this event. I'm Joy King, and I'll be your host for today's Big Data Conference Keynote Session. It's my honor and my genuine pleasure to lead Vertica's product and go-to-market strategy. And I'm so lucky to have a passionate and committed team who turned our Vertica BDC event, into a virtual event in a very short amount of time. I want to thank the thousands of people, and yes, that's our true number who have registered to attend this virtual event. We were determined to balance your health, safety and your peace of mind with the excitement of the Vertica BDC. This is a very unique event. Because as I hope you all know, we focus on engineering and architecture, best practice sharing and customer stories that will educate and inspire everyone. I also want to thank our top sponsors for the virtual BDC, Arrow, and Pure Storage. Our partnerships are so important to us and to everyone in the audience. Because together, we get things done faster and better. Now for today's keynote, you'll hear from three very important and energizing speakers. First, Colin Mahony, our SVP and General Manager for Vertica, will talk about the market trends that Vertica is betting on to win for our customers. And he'll share the exciting news about our Vertica 10 announcement and how this will benefit our customers. Then you'll hear from Amy Fowler, VP of strategy and solutions for FlashBlade at Pure Storage. Our partnership with Pure Storage is truly unique in the industry, because together modern infrastructure from Pure powers modern analytics from Vertica. And then you'll hear from John Yovanovich, Director of IT at AT&T, who will tell you about the Pure Vertica Symphony that plays live every day at AT&T. Here we go, Colin, over to you. >> Colin: Well, thanks a lot joy. And, I want to echo Joy's thanks to our sponsors, and so many of you who have helped make this happen. This is not an easy time for anyone. We were certainly looking forward to getting together in person in Boston during the Vertica Big Data Conference and Winning with Data. But I think all of you and our team have done a great job, scrambling and putting together a terrific virtual event. So really appreciate your time. I also want to remind people that we will make both the slides and the full recording available after this. So for any of those who weren't able to join live, that is still going to be available. Well, things have been pretty exciting here. And in the analytic space in general, certainly for Vertica, there's a lot happening. There are a lot of problems to solve, a lot of opportunities to make things better, and a lot of data that can really make every business stronger, more efficient, and frankly, more differentiated. For Vertica, though, we know that focusing on the challenges that we can directly address with our platform, and our people, and where we can actually make the biggest difference is where we ought to be putting our energy and our resources. I think one of the things that has made Vertica so strong over the years is our ability to focus on those areas where we can make a great difference. So for us as we look at the market, and we look at where we play, there are really three recent and some not so recent, but certainly picking up a lot of the market trends that have become critical for every industry that wants to Win Big With Data. We've heard this loud and clear from our customers and from the analysts that cover the market. 
If I were to summarize these three areas, this really is the core focus for us right now. We know that there's massive data growth. And if we can unify the data silos so that people can really take advantage of that data, we can make a huge difference. We know that public clouds offer tremendous advantages, but we also know that balance and flexibility is critical. And we all need the benefit that machine learning for all the types up to the end data science. We all need the benefits that they can bring to every single use case, but only if it can really be operationalized at scale, accurate and in real time. And the power of Vertica is, of course, how we're able to bring so many of these things together. Let me talk a little bit more about some of these trends. So one of the first industry trends that we've all been following probably now for over the last decade, is Hadoop and specifically HDFS. So many companies have invested, time, money, more importantly, people in leveraging the opportunity that HDFS brought to the market. HDFS is really part of a much broader storage disruption that we'll talk a little bit more about, more broadly than HDFS. But HDFS itself was really designed for petabytes of data, leveraging low cost commodity hardware and the ability to capture a wide variety of data formats, from a wide variety of data sources and applications. And I think what people really wanted, was to store that data before having to define exactly what structures they should go into. So over the last decade or so, the focus for most organizations is figuring out how to capture, store and frankly manage that data. And as a platform to do that, I think, Hadoop was pretty good. It certainly changed the way that a lot of enterprises think about their data and where it's locked up. In parallel with Hadoop, particularly over the last five years, Cloud Object Storage has also given every organization another option for collecting, storing and managing even more data. That has led to a huge growth in data storage, obviously, up on public clouds like Amazon and their S3, Google Cloud Storage and Azure Blob Storage just to name a few. And then when you consider regional and local object storage offered by cloud vendors all over the world, the explosion of that data, in leveraging this type of object storage is very real. And I think, as I mentioned, it's just part of this broader storage disruption that's been going on. But with all this growth in the data, in all these new places to put this data, every organization we talk to is facing even more challenges now around the data silo. Sure the data silos certainly getting bigger. And hopefully they're getting cheaper per bit. But as I said, the focus has really been on collecting, storing and managing the data. But between the new data lakes and many different cloud object storage combined with all sorts of data types from the complexity of managing all this, getting that business value has been very limited. This actually takes me to big bet number one for Team Vertica, which is to unify the data. Our goal, and some of the announcements we have made today plus roadmap announcements I'll share with you throughout this presentation. Our goal is to ensure that all the time, money and effort that has gone into storing that data, all the data turns into business value. So how are we going to do that? 
With a unified analytics platform that analyzes the data wherever it is HDFS, Cloud Object Storage, External tables in an any format ORC, Parquet, JSON, and of course, our own Native Roth Vertica format. Analyze the data in the right place in the right format, using a single unified tool. This is something that Vertica has always been committed to, and you'll see in some of our announcements today, we're just doubling down on that commitment. Let's talk a little bit more about the public cloud. This is certainly the second trend. It's the second wave maybe of data disruption with object storage. And there's a lot of advantages when it comes to public cloud. There's no question that the public clouds give rapid access to compute storage with the added benefit of eliminating data center maintenance that so many companies, want to get out of themselves. But maybe the biggest advantage that I see is the architectural innovation. The public clouds have introduced so many methodologies around how to provision quickly, separating compute and storage and really dialing-in the exact needs on demand, as you change workloads. When public clouds began, it made a lot of sense for the cloud providers and their customers to charge and pay for compute and storage in the ratio that each use case demanded. And I think you're seeing that trend, proliferate all over the place, not just up in public cloud. That architecture itself is really becoming the next generation architecture for on-premise data centers, as well. But there are a lot of concerns. I think we're all aware of them. They're out there many times for different workloads, there are higher costs. Especially if some of the workloads that are being run through analytics, which tend to run all the time. Just like some of the silo challenges that companies are facing with HDFS, data lakes and cloud storage, the public clouds have similar types of siloed challenges as well. Initially, there was a belief that they were cheaper than data centers, and when you added in all the costs, it looked that way. And again, for certain elastic workloads, that is the case. I don't think that's true across the board overall. Even to the point where a lot of the cloud vendors aren't just charging lower costs anymore. We hear from a lot of customers that they don't really want to tether themselves to any one cloud because of some of those uncertainties. Of course, security and privacy are a concern. We hear a lot of concerns with regards to cloud and even some SaaS vendors around shared data catalogs, across all the customers and not enough separation. But security concerns are out there, you can read about them. I'm not going to jump into that bandwagon. But we hear about them. And then, of course, I think one of the things we hear the most from our customers, is that each cloud stack is starting to feel even a lot more locked in than the traditional data warehouse appliance. And as everybody knows, the industry has been running away from appliances as fast as it can. And so they're not eager to get locked into another, quote, unquote, virtual appliance, if you will, up in the cloud. They really want to make sure they have flexibility in which clouds, they're going to today, tomorrow and in the future. And frankly, we hear from a lot of our customers that they're very interested in eventually mixing and matching, compute from one cloud with, say storage from another cloud, which I think is something that we'll hear a lot more about. 
And so for us, that's why we've got our big bet number two. we love the cloud. We love the public cloud. We love the private clouds on-premise, and other hosting providers. But our passion and commitment is for Vertica to be able to run in any of the clouds that our customers choose, and make it portable across those clouds. We have supported on-premises and all public clouds for years. And today, we have announced even more support for Vertica in Eon Mode, the deployment option that leverages the separation of compute from storage, with even more deployment choices, which I'm going to also touch more on as we go. So super excited about our big bet number two. And finally as I mentioned, for all the hype that there is around machine learning, I actually think that most importantly, this third trend that team Vertica is determined to address is the need to bring business critical, analytics, machine learning, data science projects into production. For so many years, there just wasn't enough data available to justify the investment in machine learning. Also, processing power was expensive, and storage was prohibitively expensive. But to train and score and evaluate all the different models to unlock the full power of predictive analytics was tough. Today you have those massive data volumes. You have the relatively cheap processing power and storage to make that dream a reality. And if you think about this, I mean with all the data that's available to every company, the real need is to operationalize the speed and the scale of machine learning so that these organizations can actually take advantage of it where they need to. I mean, we've seen this for years with Vertica, going back to some of the most advanced gaming companies in the early days, they were incorporating this with live data directly into their gaming experiences. Well, every organization wants to do that now. And the accuracy for clickability and real time actions are all key to separating the leaders from the rest of the pack in every industry when it comes to machine learning. But if you look at a lot of these projects, the reality is that there's a ton of buzz, there's a ton of hype spanning every acronym that you can imagine. But most companies are struggling, do the separate teams, different tools, silos and the limitation that many platforms are facing, driving, down sampling to get a small subset of the data, to try to create a model that then doesn't apply, or compromising accuracy and making it virtually impossible to replicate models, and understand decisions. And if there's one thing that we've learned when it comes to data, prescriptive data at the atomic level, being able to show end of one as we refer to it, meaning individually tailored data. No matter what it is healthcare, entertainment experiences, like gaming or other, being able to get at the granular data and make these decisions, make that scoring applies to machine learning just as much as it applies to giving somebody a next-best-offer. But the opportunity has never been greater. The need to integrate this end-to-end workflow and support the right tools without compromising on that accuracy. Think about it as no downsampling, using all the data, it really is key to machine learning success. Which should be no surprise then why the third big bet from Vertica is one that we've actually been working on for years. And we're so proud to be where we are today, helping the data disruptors across the world operationalize machine learning. 
This big bet has the potential to truly unlock, really the potential of machine learning. And today, we're announcing some very important new capabilities specifically focused on unifying the work being done by the data science community, with their preferred tools and platforms, and the volume of data and performance at scale, available in Vertica. Our strategy has been very consistent over the last several years. As I said in the beginning, we haven't deviated from our strategy. Of course, there's always things that we add. Most of the time, it's customer driven, it's based on what our customers are asking us to do. But I think we've also done a great job, not trying to be all things to all people. Especially as these hype cycles flare up around us, we absolutely love participating in these different areas without getting completely distracted. I mean, there's a variety of query tools and data warehouses and analytics platforms in the market. We all know that. There are tools and platforms that are offered by the public cloud vendors, by other vendors that support one or two specific clouds. There are appliance vendors, who I was referring to earlier who can deliver package data warehouse offerings for private data centers. And there's a ton of popular machine learning tools, languages and other kits. But Vertica is the only advanced analytic platform that can do all this, that can bring it together. We can analyze the data wherever it is, in HDFS, S3 Object Storage, or Vertica itself. Natively we support multiple clouds on-premise deployments, And maybe most importantly, we offer that choice of deployment modes to allow our customers to choose the architecture that works for them right now. It still also gives them the option to change move, evolve over time. And Vertica is the only analytics database with end-to-end machine learning that can truly operationalize ML at scale. And I know it's a mouthful. But it is not easy to do all these things. It is one of the things that highly differentiates Vertica from the rest of the pack. It is also why our customers, all of you continue to bet on us and see the value that we are delivering and we will continue to deliver. Here's a couple of examples of some of our customers who are powered by Vertica. It's the scale of data. It's the millisecond response times. Performance and scale have always been a huge part of what we have been about, not the only thing. I think the functionality all the capabilities that we add to the platform, the ease of use, the flexibility, obviously with the deployment. But if you look at some of the numbers they are under these customers on this slide. And I've shared a lot of different stories about these customers. Which, by the way, it still amaze me every time I talk to one and I get the updates, you can see the power and the difference that Vertica is making. Equally important, if you look at a lot of these customers, they are the epitome of being able to deploy Vertica in a lot of different environments. Many of the customers on this slide are not using Vertica just on-premise or just in the cloud. They're using it in a hybrid way. They're using it in multiple different clouds. And again, we've been with them on that journey throughout, which is what has made this product and frankly, our roadmap and our vision exactly what it is. It's been quite a journey. And that journey continues now with the Vertica 10 release. The Vertica 10 release is obviously a massive release for us. 
But if you look back, you can see that building on that native columnar architecture that started a long time ago, obviously, with the C-Store paper. We built it to leverage that commodity hardware, because it was an architecture that was never tightly integrated with any specific underlying infrastructure. I still remember hearing the initial pitch from Mike Stonebreaker, about the vision of Vertica as a software only solution and the importance of separating the company from hardware innovation. And at the time, Mike basically said to me, "there's so much R&D in innovation that's going to happen in hardware, we shouldn't bake hardware into our solution. We should do it in software, and we'll be able to take advantage of that hardware." And that is exactly what has happened. But one of the most recent innovations that we embraced with hardware is certainly that separation of compute and storage. As I said previously, the public cloud providers offered this next generation architecture, really to ensure that they can provide the customers exactly what they needed, more compute or more storage and charge for each, respectively. The separation of compute and storage, compute from storage is a major milestone in data center architectures. If you think about it, it's really not only a public cloud innovation, though. It fundamentally redefines the next generation data architecture for on-premise and for pretty much every way people are thinking about computing today. And that goes for software too. Object storage is an example of the cost effective means for storing data. And even more importantly, separating compute from storage for analytic workloads has a lot of advantages. Including the opportunity to manage much more dynamic, flexible workloads. And more importantly, truly isolate those workloads from others. And by the way, once you start having something that can truly isolate workloads, then you can have the conversations around autonomic computing, around setting up some nodes, some compute resources on the data that won't affect any of the other data to do some things on their own, maybe some self analytics, by the system, etc. A lot of things that many of you know we've already been exploring in terms of our own system data in the product. But it was May 2018, believe it or not, it seems like a long time ago where we first announced Eon Mode and I want to make something very clear, actually about Eon mode. It's a mode, it's a deployment option for Vertica customers. And I think this is another huge benefit that we don't talk about enough. But unlike a lot of vendors in the market who will dig you and charge you for every single add-on like hit-buy, you name it. You get this with the Vertica product. If you continue to pay support and maintenance, this comes with the upgrade. This comes as part of the new release. So any customer who owns or buys Vertica has the ability to set up either an Enterprise Mode or Eon Mode, which is a question I know that comes up sometimes. Our first announcement of Eon was obviously AWS customers, including the trade desk, AT&T. Most of whom will be speaking here later at the Virtual Big Data Conference. They saw a huge opportunity. Eon Mode, not only allowed Vertica to scale elastically with that specific compute and storage that was needed, but it really dramatically simplified database operations including things like workload balancing, node recovery, compute provisioning, etc. 
So one of the most popular functions is that ability to isolate the workloads and really allocate those resources without negatively affecting others. And even though traditional data warehouses, including Vertica Enterprise Mode have been able to do lots of different workload isolation, it's never been as strong as Eon Mode. Well, it certainly didn't take long for our customers to see that value across the board with Eon Mode. Not just up in the cloud, in partnership with one of our most valued partners and a platinum sponsor here. Joy mentioned at the beginning. We announced Vertica Eon Mode for Pure Storage FlashBlade in September 2019. And again, just to be clear, this is not a new product, it's one Vertica with yet more deployment options. With Pure Storage, Vertica in Eon mode is not limited in any way by variable cloud, network latency. The performance is actually amazing when you take the benefits of separate and compute from storage and you run it with a Pure environment on-premise. Vertica in Eon Mode has a super smart cache layer that we call the depot. It's a big part of our secret sauce around Eon mode. And combined with the power and performance of Pure's FlashBlade, Vertica became the industry's first advanced analytics platform that actually separates compute and storage for on-premises data centers. Something that a lot of our customers are already benefiting from, and we're super excited about it. But as I said, this is a journey. We don't stop, we're not going to stop. Our customers need the flexibility of multiple public clouds. So today with Vertica 10, we're super proud and excited to announce support for Vertica in Eon Mode on Google Cloud. This gives our customers the ability to use their Vertica licenses on Amazon AWS, on-premise with Pure Storage and on Google Cloud. Now, we were talking about HDFS and a lot of our customers who have invested quite a bit in HDFS as a place, especially to store data have been pushing us to support Eon Mode with HDFS. So as part of Vertica 10, we are also announcing support for Vertica in Eon Mode using HDFS as the communal storage. Vertica's own Roth format data can be stored in HDFS, and actually the full functionality of Vertica is complete analytics, geospatial pattern matching, time series, machine learning, everything that we have in there can be applied to this data. And on the same HDFS nodes, Vertica can actually also analyze data in ORC or Parquet format, using External tables. We can also execute joins between the Roth data the External table holds, which powers a much more comprehensive view. So again, it's that flexibility to be able to support our customers, wherever they need us to support them on whatever platform, they have. Vertica 10 gives us a lot more ways that we can deploy Eon Mode in various environments for our customers. It allows them to take advantage of Vertica in Eon Mode and the power that it brings with that separation, with that workload isolation, to whichever platform they are most comfortable with. Now, there's a lot that has come in Vertica 10. I'm definitely not going to be able to cover everything. But we also introduced complex types as an example. And complex data types fit very well into Eon as well in this separation. They significantly reduce the data pipeline, the cost of moving data between those, a much better support for unstructured data, which a lot of our customers have mixed with structured data, of course, and they leverage a lot of columnar execution that Vertica provides. 
So you get complex data types in Vertica now, a lot more data, stronger performance. It goes great with the announcement that we made with the broader Eon Mode. Let's talk a little bit more about machine learning. We've been actually doing work in and around machine learning with various extra regressions and a whole bunch of other algorithms for several years. We saw the huge advantage that MPP offered, not just as a sequel engine as a database, but for ML as well. Didn't take as long to realize that there's a lot more to operationalizing machine learning than just those algorithms. It's data preparation, it's that model trade training. It's the scoring, the shaping, the evaluation. That is so much of what machine learning and frankly, data science is about. You do know, everybody always wants to jump to the sexy algorithm and we handle those tasks very, very well. It makes Vertica a terrific platform to do that. A lot of work in data science and machine learning is done in other tools. I had mentioned that there's just so many tools out there. We want people to be able to take advantage of all that. We never believed we were going to be the best algorithm company or come up with the best models for people to use. So with Vertica 10, we support PMML. We can import now and export PMML models. It's a huge step for us around that operationalizing machine learning projects for our customers. Allowing the models to get built outside of Vertica yet be imported in and then applying to that full scale of data with all the performance that you would expect from Vertica. We also are more tightly integrating with Python. As many of you know, we've been doing a lot of open source projects with the community driven by many of our customers, like Uber. And so now with Python we've integrated with TensorFlow, allowing data scientists to build models in their preferred language, to take advantage of TensorFlow. But again, to store and deploy those models at scale with Vertica. I think both these announcements are proof of our big bet number three, and really our commitment to supporting innovation throughout the community by operationalizing ML with that accuracy, performance and scale of Vertica for our customers. Again, there's a lot of steps when it comes to the workflow of machine learning. These are some of them that you can see on the slide, and it's definitely not linear either. We see this as a circle. And companies that do it, well just continue to learn, they continue to rescore, they continue to redeploy and they want to operationalize all that within a single platform that can take advantage of all those capabilities. And that is the platform, with a very robust ecosystem that Vertica has always been committed to as an organization and will continue to be. This graphic, many of you have seen it evolve over the years. Frankly, if we put everything and everyone on here wouldn't fit on a slide. But it will absolutely continue to evolve and grow as we support our customers, where they need the support most. So, again, being able to deploy everywhere, being able to take advantage of Vertica, not just as a business analyst or a business user, but as a data scientists or as an operational or BI person. We want Vertica to be leveraged and used by the broader organization. So I think it's fair to say and I encourage everybody to learn more about Vertica 10, because I'm just highlighting some of the bigger aspects of it. But we talked about those three market trends. 
The need to unify the silos, the need for hybrid multiple cloud deployment options, the need to operationalize business critical machine learning projects. Vertica 10 has absolutely delivered on those. But again, we are not going to stop. It is our job not to, and this is how Team Vertica thrives. I always joke that the next release is the best release. And, of course, even after Vertica 10, that is also true, although Vertica 10 is pretty awesome. But, you know, from the first line of code, we've always been focused on performance and scale, right. And like any really strong data platform, the execution engine, the optimizer and the execution engine are the two core pieces of that. Beyond Vertica 10, some of the big things that we're already working on, next generation execution engine. We're already actually seeing incredible early performance from this. And this is just one example, of how important it is for an organization like Vertica to constantly go back and re-innovate. Every single release, we do the sit ups and crunches, our performance and scale. How do we improve? And there's so many parts of the core server, there's so many parts of our broader ecosystem. We are constantly looking at coverages of how we can go back to all the code lines that we have, and make them better in the current environment. And it's not an easy thing to do when you're doing that, and you're also expanding in the environment that we are expanding into to take advantage of the different deployments, which is a great segue to this slide. Because if you think about today, we're obviously already available with Eon Mode and Amazon, AWS and Pure and actually MinIO as well. As I talked about in Vertica 10 we're adding Google and HDFS. And coming next, obviously, Microsoft Azure, Alibaba cloud. So being able to expand into more of these environments is really important for the Vertica team and how we go forward. And it's not just running in these clouds, for us, we want it to be a SaaS like experience in all these clouds. We want you to be able to deploy Vertica in 15 minutes or less on these clouds. You can also consume Vertica, in a lot of different ways, on these clouds. As an example, in Amazon Vertica by the Hour. So for us, it's not just about running, it's about taking advantage of the ecosystems that all these cloud providers offer, and really optimizing the Vertica experience as part of them. Optimization, around automation, around self service capabilities, extending our management console, we now have products that like the Vertica Advisor Tool that our Customer Success Team has created to actually use our own smarts in Vertica. To take data from customers that give it to us and help them tune automatically their environment. You can imagine that we're taking that to the next level, in a lot of different endeavors that we're doing around how Vertica as a product can actually be smarter because we all know that simplicity is key. There just aren't enough people in the world who are good at managing data and taking it to the next level. And of course, other things that we all hear about, whether it's Kubernetes and containerization. You can imagine that that probably works very well with the Eon Mode and separating compute and storage. But innovation happens everywhere. We innovate around our community documentation. Many of you have taken advantage of the Vertica Academy. The numbers there are through the roof in terms of the number of people coming in and certifying on it. 
So there's a lot of things that are within the core products. There's a lot of activity and action beyond the core products that we're taking advantage of. And let's not forget why we're here, right? It's easy to talk about a platform, a data platform, it's easy to jump into all the functionality, the analytics, the flexibility, how we can offer it. But at the end of the day, somebody, a person, she's got to take advantage of this data, she's got to be able to take this data and use this information to make a critical business decision. And that doesn't happen unless we explore lots of different and frankly, new ways to get that predictive analytics UI and interface beyond just the standard BI tools in front of her at the right time. And so there's a lot of activity, I'll tease you with that going on in this organization right now about how we can do that and deliver that for our customers. We're in a great position to be able to see exactly how this data is consumed and used and start with this core platform that we have to go out. Look, I know, the plan wasn't to do this as a virtual BDC. But I really appreciate you tuning in. Really appreciate your support. I think if there's any silver lining to us, maybe not being able to do this in person, it's the fact that the reach has actually gone significantly higher than what we would have been able to do in person in Boston. We're certainly looking forward to doing a Big Data Conference in the future. But if I could leave you with anything, know this, since that first release for Vertica, and our very first customers, we have been very consistent. We respect all the innovation around us, whether it's open source or not. We understand the market trends. We embrace those new ideas and technologies and for us true north, and the most important thing is what does our customer need to do? What problem are they trying to solve? And how do we use the advantages that we have without disrupting our customers? But knowing that you depend on us to deliver that unified analytics strategy, it will deliver that performance of scale, not only today, but tomorrow and for years to come. We've added a lot of great features to Vertica. I think we've said no to a lot of things, frankly, that we just knew we wouldn't be the best company to deliver. When we say we're going to do things we do them. Vertica 10 is a perfect example of so many of those things that we from you, our customers have heard loud and clear, and we have delivered. I am incredibly proud of this team across the board. I think the culture of Vertica, a customer first culture, jumping in to help our customers win no matter what is also something that sets us massively apart. I hear horror stories about support experiences with other organizations. And people always seem to be amazed at Team Vertica's willingness to jump in or their aptitude for certain technical capabilities or understanding the business. And I think sometimes we take that for granted. But that is the team that we have as Team Vertica. We are incredibly excited about Vertica 10. I think you're going to love the Virtual Big Data Conference this year. I encourage you to tune in. Maybe one other benefit is I know some people were worried about not being able to see different sessions because they were going to overlap with each other well now, even if you can't do it live, you'll be able to do those sessions on demand. Please enjoy the Vertica Big Data Conference here in 2020. 
Please, you and your families and your co-workers, be safe during these times. I know we will get through it. And analytics is probably going to help with a lot of that, and we already know it is helping in many different ways. So believe in the data, believe in data's ability to change the world for the better. And thank you for your time. And with that, I am delighted to now introduce Micro Focus CEO Stephen Murdoch to the Vertica Big Data Virtual Conference. Thank you, Stephen. >> Stephen: Hi, everyone, my name is Stephen Murdoch. I have the pleasure and privilege of being the Chief Executive Officer here at Micro Focus. Please let me add my welcome to the Big Data Conference, and also my thanks for your support as we've had to pivot to this being a virtual rather than a physical conference. It's amazing how quickly we all reset to a new normal. I certainly didn't expect to be addressing you from my study. Vertica is an incredibly important part of the Micro Focus family. It's key to our goal of trying to enable and help customers become much more data driven across all of their IT operations. Vertica 10 is a huge step forward, we believe. It allows for multi-cloud innovation and genuinely hybrid deployments, it lets you begin to leverage machine learning properly in the enterprise, and it also allows the opportunity to unify currently siloed lakes of information. We operate in a very noisy, very competitive market, and there are people in that market who can do some of those things. The reason we are so excited about Vertica is we genuinely believe that we are the best at doing all of those things. And that's why we've announced publicly, and are executing internally, incremental investment into Vertica. That investment is targeted at accelerating the roadmaps that already exist, and getting that innovation into your hands faster. The idea is that speed is key. It's not a question of if companies have to become data driven organizations, it's a question of when. So that speed now is really important. And that's why we believe that the Big Data Conference gives a great opportunity for you to accelerate your own plans. You will have the opportunity to talk to some of our best architects, some of the best development brains that we have. But more importantly, you'll also get to hear from some of our phenomenal Vertica customers. You'll hear from Uber, from the Trade Desk, from Philips, and from AT&T, as well as many, many others. And just hearing how those customers are using the power of Vertica to accelerate their own plans, I think, is the highlight. And I encourage you to use this opportunity to its full. Let me close by again saying thank you. We genuinely hope that you get as much from this virtual conference as you could have from a physical conference. And we look forward to your engagement, and we look forward to hearing your feedback. With that, thank you very much. >> Joy: Thank you so much, Stephen, for joining us for the Vertica Big Data Conference. Your support and enthusiasm for Vertica is so clear, and it makes a big difference. Now, I'm delighted to introduce Amy Fowler, the VP of Strategy and Solutions for FlashBlade at Pure Storage, who is one of our BDC Platinum Sponsors and one of our most valued partners. It was a proud moment for me when we announced Vertica in Eon Mode for Pure Storage FlashBlade, and we became the first analytics data warehouse that separates compute from storage for on-premise data centers. Thank you so much, Amy, for joining us. Let's get started.
>> Amy: Well, thank you, Joy so much for having us. And thank you all for joining us today, virtually, as we may all be. So, as we just heard from Colin Mahony, there are some really interesting trends that are happening right now in the big data analytics market. From the end of the Hadoop hype cycle, to the new cloud reality, and even the opportunity to help the many data science and machine learning projects move from labs to production. So let's talk about these trends in the context of infrastructure. And in particular, look at why a modern storage platform is relevant as organizations take on the challenges and opportunities associated with these trends. The answer is the Hadoop hype cycles left a lot of data in HDFS data lakes, or reservoirs or swamps depending upon the level of the data hygiene. But without the ability to get the value that was promised from Hadoop as a platform rather than a distributed file store. And when we combine that data with the massive volume of data in Cloud Object Storage, we find ourselves with a lot of data and a lot of silos, but without a way to unify that data and find value in it. Now when you look at the infrastructure data lakes are traditionally built on, it is often direct attached storage or data. The approach that Hadoop took when it entered the market was primarily bound by the limits of networking and storage technologies. One gig ethernet and slower spinning disk. But today, those barriers do not exist. And all FlashStorage has fundamentally transformed how data is accessed, managed and leveraged. The need for local data storage for significant volumes of data has been largely mitigated by the performance increases afforded by all Flash. At the same time, organizations can achieve superior economies of scale with that segregation of compute and storage. With compute and storage, you don't always scale in lockstep. Would you want to add an engine to the train every time you add another boxcar? Probably not. But from a Pure Storage perspective, FlashBlade is uniquely architected to allow customers to achieve better resource utilization for compute and storage, while at the same time, reducing complexity that has arisen from the siloed nature of the original big data solutions. The second and equally important recent trend we see is something I'll call cloud reality. The public clouds made a lot of promises and some of those promises were delivered. But cloud economics, especially usage based and elastic scaling, without the control that many companies need to manage the financial impact is causing a lot of issues. In addition, the risk of vendor lock-in from data egress, charges, to integrated software stacks that can't be moved or deployed on-premise is causing a lot of organizations to back off the all the way non-cloud strategy, and move toward hybrid deployments. Which is kind of funny in a way because it wasn't that long ago that there was a lot of talk about no more data centers. And for example, one large retailer, I won't name them, but I'll admit they are my favorites. They several years ago told us they were completely done with on-prem storage infrastructure, because they were going 100% to the cloud. But they just deployed FlashBlade for their data pipelines, because they need predictable performance at scale. And the all cloud TCO just didn't add up. Now, that being said, well, there are certainly challenges with the public cloud. It has also brought some things to the table that we see most organizations wanting. 
First of all, in a lot of cases applications have been built to leverage object storage platforms like S3. So they need that object protocol, but they may also need it to be fast. And the said object may be oxymoron only a few years ago, and this is an area of the market where Pure and FlashBlade have really taken a leadership position. Second, regardless of where the data is physically stored, organizations want the best elements of a cloud experience. And for us, that means two main things. Number one is simplicity and ease of use. If you need a bunch of storage experts to run the system, that should be considered a bug. The other big one is the consumption model. The ability to pay for what you need when you need it, and seamlessly grow your environment over time totally nondestructively. This is actually pretty huge and something that a lot of vendors try to solve for with finance programs. But no finance program can address the pain of a forklift upgrade, when you need to move to next gen hardware. To scale nondestructively over long periods of time, five to 10 years plus is a crucial architectural decisions need to be made at the outset. Plus, you need the ability to pay as you use it. And we offer something for FlashBlade called Pure as a Service, which delivers exactly that. The third cloud characteristic that many organizations want is the option for hybrid. Even if that is just a DR site in the cloud. In our case, that means supporting appplication of S3, at the AWS. And the final trend, which to me represents the biggest opportunity for all of us, is the need to help the many data science and machine learning projects move from labs to production. This means bringing all the machine learning functions and model training to the data, rather than moving samples or segments of data to separate platforms. As we all know, machine learning needs a ton of data for accuracy. And there is just too much data to retrieve from the cloud for every training job. At the same time, predictive analytics without accuracy is not going to deliver the business advantage that everyone is seeking. You can kind of visualize data analytics as it is traditionally deployed as being on a continuum. With that thing, we've been doing the longest, data warehousing on one end, and AI on the other end. But the way this manifests in most environments is a series of silos that get built up. So data is duplicated across all kinds of bespoke analytics and AI, environments and infrastructure. This creates an expensive and complex environment. So historically, there was no other way to do it because some level of performance is always table stakes. And each of these parts of the data pipeline has a different workload profile. A single platform to deliver on the multi dimensional performances, diverse set of applications required, that didn't exist three years ago. And that's why the application vendors pointed you towards bespoke things like DAS environments that we talked about earlier. And the fact that better options exists today is why we're seeing them move towards supporting this disaggregation of compute and storage. And when it comes to a platform that is a better option, one with a modern architecture that can address the diverse performance requirements of this continuum, and allow organizations to bring a model to the data instead of creating separate silos. That's exactly what FlashBlade is built for. Small files, large files, high throughput, low latency and scale to petabytes in a single namespace. 
And this is importantly a single rapid space is what we're focused on delivering for our customers. At Pure, we talk about it in the context of modern data experience because at the end of the day, that's what it's really all about. The experience for your teams in your organization. And together Pure Storage and Vertica have delivered that experience to a wide range of customers. From a SaaS analytics company, which uses Vertica on FlashBlade to authenticate the quality of digital media in real time, to a multinational car company, which uses Vertica on FlashBlade to make thousands of decisions per second for autonomous cars, or a healthcare organization, which uses Vertica on FlashBlade to enable healthcare providers to make real time decisions that impact lives. And I'm sure you're all looking forward to hearing from John Yavanovich from AT&T. To hear how he's been doing this with Vertica and FlashBlade as well. He's coming up soon. We have been really excited to build this partnership with Vertica. And we're proud to provide the only on-premise storage platform validated with Vertica Eon Mode. And deliver this modern data experience to our customers together. Thank you all so much for joining us today. >> Joy: Amy, thank you so much for your time and your insights. Modern infrastructure is key to modern analytics, especially as organizations leverage next generation data center architectures, and object storage for their on-premise data centers. Now, I'm delighted to introduce our last speaker in our Vertica Big Data Conference Keynote, John Yovanovich, Director of IT for AT&T. Vertica is so proud to serve AT&T, and especially proud of the harmonious impact we are having in partnership with Pure Storage. John, welcome to the Virtual Vertica BDC. >> John: Thank you joy. It's a pleasure to be here. And I'm excited to go through this presentation today. And in a unique fashion today 'cause as I was thinking through how I wanted to present the partnership that we have formed together between Pure Storage, Vertica and AT&T, I want to emphasize how well we all work together and how these three components have really driven home, my desire for a harmonious to use your word relationship. So, I'm going to move forward here and with. So here, what I'm going to do the theme of today's presentation is the Pure Vertica Symphony live at AT&T. And if anybody is a Westworld fan, you can appreciate the sheet music on the right hand side. What we're going to what I'm going to highlight here is in a musical fashion, is how we at AT&T leverage these technologies to save money to deliver a more efficient platform, and to actually just to make our customers happier overall. So as we look back, and back as early as just maybe a few years ago here at AT&T, I realized that we had many musicians to help the company. Or maybe you might want to call them data scientists, or data analysts. For the theme we'll stay with musicians. None of them were singing or playing from the same hymn book or sheet music. And so what we had was many organizations chasing a similar dream, but not exactly the same dream. And, best way to describe that is and I think with a lot of people this might resonate in your organizations. How many organizations are chasing a customer 360 view in your company? Well, I can tell you that I have at least four in my company. And I'm sure there are many that I don't know of. That is our problem because what we see is a repetitive sourcing of data. We see a repetitive copying of data. 
And there's just so much money being spent. This is where I asked Pure Storage and Vertica to help me solve that problem with their technologies. What I also noticed was that there was no coordination between these departments. In fact, if you look here, nobody really wants to play with finance. Sales, marketing and care, sure, they all copied each other's data. But they actually didn't communicate with each other as they were copying the data. So the data became replicated and out of sync. This is a challenge throughout, not just my company, but all companies across the world. And that is, the more we replicate the data, the more problems we have at chasing or conquering the goal of a single version of truth. In fact, I kid that at AT&T we have actually adopted the multiple-versions-of-truth theory, which is not where we want to be, but this is where we are. But we are conquering that with the synergies between Pure Storage and Vertica. This is what it leaves us with, and this is where we are challenged: each one of our siloed business units had their own storage, their own dedicated storage, and some of them had more money than others, so they bought more storage. Some of them anticipated storing more data than they really did. Others are running out of space but can't add any more, because their budgets aren't being replenished. So if you look at it from this side view here, we have a limited amount of compute, or fixed compute, dedicated to each one of these silos. And that's because of the desire to own your own. And the other part is that you are limited, or wasting space, depending on where you are in the organization. So the synergies aren't just about the data, but actually the compute and the storage. And I wanted to tackle that challenge as well. So I was tackling the data, I was tackling the storage, and I was tackling the compute, all at the same time. So my ask across the company was: can we just please play together, okay? And to do that, I knew that I wasn't going to tackle this by getting everybody in the same room and getting them to agree that we needed one account table, because they will argue about whose account table is the best account table. But I knew that if I brought the account tables together, they would soon see that they had so much redundancy that I can now start retiring data sources. I also knew that if I brought all the compute together, they would all be happy. But I didn't want them to trample on each other. And in fact, that was one of the things that all business units really enjoy: they enjoy the silo of having their own compute and, more or less, being able to control their own destiny. Well, Vertica's subclustering allows just that. And this is exactly what I was hoping for, and I'm glad they came through. And finally, how did I solve the problem of the single account table? Well, when you don't have dedicated storage, and you can separate compute and storage as Vertica in Eon Mode does, you store the data on FlashBlades, which you see on the left and right hand side of our container, which I can describe in a moment. Okay, so what we have here is a container full of compute, with all the Vertica nodes sitting in the middle. Two loader subclusters, we'll call them, sitting on the sides, which are dedicated to just putting data onto the FlashBlades, which are sitting on both ends of the container.
Now today, I have two dedicated storage or common dedicated might not be the right word, but two storage racks one on the left one on the right. And I treat them as separate storage racks. They could be one, but i created them separately for disaster recovery purposes, lashing work in case that rack were to go down. But that being said, there's no reason why I'm probably going to add a couple of them here in the future. So I can just have a, say five to 10, petabyte storage, setup, and I'll have my DR in another 'cause the DR shouldn't be in the same container. Okay, but I'll DR outside of this container. So I got them all together, I leveraged subclustering, I leveraged separate and compute. I was able to convince many of my clients that they didn't need their own account table, that they were better off having one. I eliminated, I reduced latency, I reduced our ticketing I reduce our data quality issues AKA ticketing okay. I was able to expand. What is this? As work. I was able to leverage elasticity within this cluster. As you can see, there are racks and racks of compute. We set up what we'll call the fixed capacity that each of the business units needed. And then I'm able to ramp up and release the compute that's necessary for each one of my clients based on their workloads throughout the day. And so while they compute to the right before you see that the instruments have already like, more or less, dedicated themselves towards all those are free for anybody to use. So in essence, what I have, is I have a concert hall with a lot of seats available. So if I want to run a 10 chair Symphony or 80, chairs, Symphony, I'm able to do that. And all the while, I can also do the same with my loader nodes. I can expand my loader nodes, to actually have their own Symphony or write all to themselves and not compete with any other workloads of the other clusters. What does that change for our organization? Well, it really changes the way our database administrators actually do their jobs. This has been a big transformation for them. They have actually become data conductors. Maybe you might even call them composers, which is interesting, because what I've asked them to do is morph into less technology and more workload analysis. And in doing so we're able to write auto-detect scripts, that watch the queues, watch the workloads so that we can help ramp up and trim down the cluster and subclusters as necessary. There has been an exciting transformation for our DBAs, who I need to now classify as something maybe like DCAs. I don't know, I have to work with HR on that. But I think it's an exciting future for their careers. And if we bring it all together, If we bring it all together, and then our clusters, start looking like this. Where everything is moving in harmonious, we have lots of seats open for extra musicians. And we are able to emulate a cloud experience on-prem. And so, I want you to sit back and enjoy the Pure Vertica Symphony live at AT&T. (soft music) >> Joy: Thank you so much, John, for an informative and very creative look at the benefits that AT&T is getting from its Pure Vertica symphony. I do really like the idea of engaging HR to change the title to Data Conductor. That's fantastic. I've always believed that music brings people together. And now it's clear that analytics at AT&T is part of that musical advantage. So, now it's time for a short break. And we'll be back for our breakout sessions, beginning at 12 pm Eastern Daylight Time. 
We have some really exciting sessions planned later today. And then again, as you can see on Wednesday. Now because all of you are already logged in and listening to this keynote, you already know the steps to continue to participate in the sessions that are listed here and on the previous slide. In addition, everyone received an email yesterday, today, and you'll get another one tomorrow, outlining the simple steps to register, login and choose your session. If you have any questions, check out the emails or go to www.vertica.com/bdc2020 for the logistics information. There are a lot of choices and that's always a good thing. Don't worry if you want to attend one or more or can't listen to these live sessions due to your timezone. All the sessions, including the Q&A sections will be available on demand and everyone will have access to the recordings as well as even more pre-recorded sessions that we'll post to the BDC website. Now I do want to leave you with two other important sites. First, our Vertica Academy. Vertica Academy is available to everyone. And there's a variety of very technical, self-paced, on-demand training, virtual instructor-led workshops, and Vertica Essentials Certification. And it's all free. Because we believe that Vertica expertise, helps everyone accelerate their Vertica projects and the advantage that those projects deliver. Now, if you have questions or want to engage with our Vertica engineering team now, we're waiting for you on the Vertica forum. We'll answer any questions or discuss any ideas that you might have. Thank you again for joining the Vertica Big Data Conference Keynote Session. Enjoy the rest of the BDC because there's a lot more to come
Putting Complex Data Types to Work
Hello everybody, thank you for joining us today for the Virtual Vertica BDC 2020. Today's breakout session is entitled Putting Complex Data Types to Work. I'm Jeff Healey, I lead Vertica marketing, and I'll be your host for this breakout session. Joining me is Deepak Majeti, technical lead from Vertica engineering. Before we begin, I encourage you to submit questions and comments during the virtual session. You don't have to wait, just type your question or comment in the question box below the slides and click Submit. There will be a Q&A session at the end of the presentation, and we'll answer as many questions as we're able to during that time. Any questions we don't address, we'll do our best to answer offline. Alternatively, visit the Vertica forums at forum.vertica.com to post your questions there after the session; the engineering team is planning to keep the forum conversation going. Also, as a reminder, you can maximize your screen by clicking the double-arrow button in the lower right corner of the slides. And yes, this virtual session is being recorded and will be available to view on demand this week; we'll send a notification as soon as it's ready. Now let's get started, over to you, Deepak. >> Deepak: Thanks, Jeff. I'm going to talk about complex data types and the work we've been doing in Vertica R&D, so without further delay, let's see why and how we should put complex data types to work in your data analytics. This is the outline of my talk today. First I'm going to talk about what complex data types are and some use cases. I will then quickly cover some file formats that support these complex data types. I will then deep dive into the current support for complex data types in Vertica. Finally, I'll conclude with some usage considerations, what is coming in our 10.0 release, and our future roadmap and directions for this project. So what are complex data types? Complex data types are nested data structures composed of primitive types. Primitive types are nothing but your int, float, string, binary and so on, the basic types. Some examples of complex data types include struct, also called row, array, list, set, map, and union. Composite types can also be built by composing other complex types. Complex types are very useful for handling sparse data, and we'll see some examples of that use case in this presentation; they also help simplify analysis. So let's look at some examples of complex data types. In the first example on the left, you can see a simple customer type, which is a struct with two fields: a field called name of type string and a field called id of type integer. Structs are nothing but a group of fields, and each field has a type of its own; the type can be primitive or another complex type. On the right we have some example data for this simple customer type, basically two fields of type string and integer. In this case you have two rows, where the first row has name Alex with one ID, and the second row has name Mary with another ID. The second complex type on the left is phone_numbers, which is an array whose element type is string. An array is nothing but a collection of elements, and the elements could again be a primitive type or another complex type. In this example the collection is of type string, which is a primitive type, and on the right you have some example data for this array called phone_numbers: each row has a collection of phone numbers, where the first row has two phone numbers and the second has a single phone number in that array.
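To ground those two examples, here is how such types might be written down in SQL-style notation. This is purely illustrative, with invented table and column names; whether definitions like these go into native or external tables, and the exact Vertica syntax and releases that support them, are covered later in the session:

```sql
-- A struct (row) type with two fields, name and id.
CREATE TABLE customers_demo (
    customer ROW(name VARCHAR(64), id INT)
);

-- An array (collection) whose elements are strings.
CREATE TABLE contacts_demo (
    phone_numbers ARRAY[VARCHAR(20)]
);
```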
And the third type on the slide is the map data type. A map is nothing but a collection of key-value pairs: each element is a key and a value, and you have a collection of such elements. The key is usually a primitive type; the value, however, can be a primitive or a complex type. In this example both the key and the value are of type string, and on the right side of the slide you have some sample data. Here we have HTTP requests, where the key is the header name and the value is the header value. For instance, on the first row we have a key pragma with value no-cache, a key host with some hostname, and similarly on the second row you have a key called accept with a value such as text/html. Arrays and maps are commonly referred to as collections in many documents. So those were examples of one-level complex types; on this slide we have nested complex data types. On the right we have the root complex type, called web_events, of type struct. This struct has four fields: a session_id of type integer, a session_duration of type timestamp, and then the third and fourth fields, customer and http_requests, which are themselves further complex types. Customer is again a struct with three fields, where the first two fields, name and id, are primitive types, but the third field is another complex type, phone_numbers, which we just saw on the previous slide. Similarly, http_requests is the same map type that we just saw. In this example, each complex type is independent, and you can reuse a complex type inside other complex types. For example, you could build another type called orders and simply reuse the customer type. However, in a practical implementation you have to deal with complexities involving security, ownership, and lifecycle dependencies. So keeping complex types independent has the advantage of reuse, but the complication is that you have to deal with security, ownership, and lifecycle dependencies. On this slide we have another style of declaring a nested complex type, called an inlined complex data type. We have the same web_events struct type, but the complex types are embedded into the parent type definition, so the customer and http_requests definitions are inlined into the parent struct. The advantage of this is that you won't have to deal with the security and other lifecycle dependency issues, but the downside is that you can't reuse them; it's a trade-off between the two. So now let's see some use cases for these complex types. The first use case, or benefit, of using complex data types is that you can express analysis more naturally. Complex types simplify the expression of analysis logic, thereby simplifying the data pipelines. In SQL, it feels as if you have tables inside tables. Let's look at an example: say you want to list all the customers with more than one thousand website events. If you have complex types, you can simply create a table called web_events with one column of the web_events complex type we just saw, covering session_id, session_duration, customer, and http_requests, so you can have the entire schema in one type, roughly as sketched just below. If you don't have complex types, you'll have to create four tables, essentially one for each complex type, and then establish primary key and foreign key dependencies across these tables.
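Here is a rough, hedged sketch of what that single-table-with-nested-types schema could look like. The names and paths are invented, the session fields are simplified, and at the time of this talk a fully nested definition like this was primarily supported for external tables over Parquet, with map data handled through the flex/V-map format, so treat it as a sketch rather than exact shipped syntax:

```sql
-- One external table whose columns carry the whole nested schema.
CREATE EXTERNAL TABLE web_events (
    event ROW(
        session_id       INT,
        session_duration TIMESTAMP,
        customer         ROW(
            name          VARCHAR(64),
            id            INT,
            phone_numbers ARRAY[VARCHAR(20)]
        )
    ),
    -- The http_requests map is sketched as a V-map column here, since
    -- map-style data was typically handled through flex/V-map at the time.
    http_requests LONG VARBINARY(65000)
) AS COPY FROM 'hdfs:///data/web_events/*.parquet' PARQUET;
```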
Now, say your goal is to list all the customers with more than a thousand web requests. If you have complex types, you can simply use dot notation to extract the name and contact fields, and use some special functions for maps that will give you the count of all the HTTP requests, filtering for counts greater than a thousand. If you don't have complex types, you now have to join each table individually, extract the results in a subquery, join again in the outer query, and finally apply a predicate of total requests greater than a thousand to get your final result. So complex types simplify the query writing, and the execution itself is also simplified: you don't need joins. With complex types you can simply have a load step to load the map type and then apply the function on top of it directly, whereas with separate tables you have to join all this data, apply the filter step, and then do another join to get your results. The other advantage of complex types is that you can process semi-structured data very efficiently. For example, if you have data from clickstreams or page views, the data is often sparse, and maps are very well suited for such data. Maps are semi-structured by nature, and with this support you can have semi-structured data represented alongside structured columns in the database. Maps have a nice ability to encapsulate sparse data. As an example, the common fields of a clickstream or page view are pragma, host, and accept. If you don't have map types, you will end up creating a column for each of these header fields; with a map, you can instead embed them as key-value pairs. On the left here on the slide you can see an example where you have a separate column for each field, and you end up with a lot of nulls because the data is sparse. If you embed the fields in a map, you can put them into a single column, which gives you a more efficient representation of sparse data. Imagine if you have thousands of fields in a clickstream or page view: you would need thousands of columns to represent the data if you don't have a map type.
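Going back to the web-events example above, here is a hedged sketch of the two query styles. The schema, table, and column names are invented, the nested-field access uses dot notation as described in the talk, and the map count is approximated with MAPSIZE, one of Vertica's flex map helpers, rather than whatever specific function the slide showed:

```sql
-- With nested types: one table, dot notation on the row fields,
-- and a map helper on the (assumed) http_requests V-map column.
SELECT w.event.customer.name
FROM web_events w
WHERE MAPSIZE(w.http_requests) > 1000;

-- Without nested types: normalized tables joined back together.
SELECT c.name, t.total_requests
FROM customers c
JOIN (
    SELECT customer_id, COUNT(*) AS total_requests
    FROM http_requests
    GROUP BY customer_id
) t ON t.customer_id = c.id
WHERE t.total_requests > 1000;
```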
So, given that these are the most commonly used complex types, let's see which file formats actually support them. Most popular file formats support complex data types, but with different variations. For instance, JSON supports arrays and objects, which are complex data types; however, JSON data is schema-less, it is row-oriented, and it is text, and because it is schema-less it has to store the keys on every row. The second file format is Avro, which has records, enums, arrays, maps, unions, and a fixed type. Avro has a schema, it is row-oriented, and it is binary compressed. The third category is the Parquet and ORC style of file formats, which are columnar. Parquet and ORC have support for arrays, maps, and structs; they have a schema; they are column-oriented, unlike Avro which is row-oriented; they are binary compressed; and they additionally support very nice compression and encoding types. The main difference between Parquet and ORC is in how they represent complex types: Parquet encodes the complex type hierarchy using repetition and definition levels, while ORC uses a separate column at every parent of the complex type to represent null values. Apart from that difference in how they represent complex types, Parquet and ORC have similar capabilities in terms of optimizations and compression techniques. So to summarize: JSON has no schema, no binary format, and is not columnar; Avro has a schema and a binary format, but is not columnar; and Parquet and ORC have a schema, have a binary format, and are columnar. So let's see how we can query these different kinds of complex types, in the different file formats they can appear in, with Vertica. In Vertica we have a feature called flex tables, where you can load complex data types and analyze them. Flex tables use a binary format called V-map to store data as key-value pairs. Flex tables are schema-less, they are weakly typed, and they trade flexibility for performance. What I mean by schema-less is that the keys provide the field names, and each row can potentially have different keys. They are weakly typed because there is no type information at the column level; the data is stored in text form. We will see some examples of this weak typing in the following slides. Because of the weak typing and schema-less nature of flex tables, you can trivially implement needs like schema evolution, or keep the complex types fluid if that is your use case; flex tables give you that flexibility. However, because of the weak typing, the downside is that you don't get the best possible performance. So if your use case calls for the best possible performance, you can use the strongly typed complex types that we have started to introduce in Vertica. These complex types have a schema, and they give you the best possible performance because the optimizer now has enough information from the schema and the types to apply optimizations like column selection and all the nice techniques that Vertica employs for columnar performance, even for complex types. We'll see examples of both approaches in the following slides. Let's use a simple data set called restaurants; I'll use this restaurant data throughout these slides to show the different variations of flex and complex types. On this slide you have some sample data with four fields and essentially two rows. The four fields are name, cuisine, locations, and menu. Name and cuisine are of type varchar, locations is an array, and menu is an array of rows with two fields, item and price. If the data is in JSON, there is no schema and no type information, so how do we process it in Vertica? You can simply create a flex table called restaurants, copy the restaurants JSON file into Vertica, and start analyzing the data. If you do a select star from restaurants, you will see that all the data is actually in one column called __raw__, along with another column called __identity__ which gives you a unique row ID. The __raw__ column encapsulates all the data that was in the restaurants JSON file, and this column is nothing but the V-map format.
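A minimal sketch of that flow, assuming a local restaurants.json file (the path is made up; CREATE FLEX TABLE and the fjsonparser are standard flex-table constructs):

```sql
-- Flex table: no schema declared up front.
CREATE FLEX TABLE restaurants();

-- Load the JSON; each record becomes key-value pairs in the __raw__ V-map column.
COPY restaurants FROM '/data/restaurants.json' PARSER fjsonparser();

-- Field names resolve as virtual columns against __raw__ at query time.
SELECT name, cuisine FROM restaurants;
```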
The V-map format is a binary format that encodes the data as key-value pairs, and the __raw__ column is backed by the long varbinary column type in Vertica. Each key gives you the field name and each value gives the field value; the values, however, are stored in a text representation. Now, say you want to get better performance out of this JSON data. Flex tables have some nice functions to analyze your data and try to extract schema and type information from it. If you execute compute_flextable_keys on the restaurants table, you will see a new table called public.restaurants_keys that gives you information about your JSON data. It was able to automatically infer that the data has four fields, namely name, cuisine, locations, and menu, and that name and cuisine are varchar. Since locations and menu are complex types themselves, one an array and one an array of rows, it uses the same V-map format to handle them, so you end up with four columns: two primitive columns of type varchar and two V-map columns. You can now materialize these columns by altering the table definition and adding columns of the inferred types, and then you get better performance from the materialized columns; the data is no longer in a single column, you have four columns for your restaurant data, and you get column selection and the other optimizations that Vertica provides. So flex tables are helpful if you don't have a schema or any type information. However, we saw earlier that some file formats like Parquet and Avro do have a schema and type information. In those cases you don't have to do the first step of inferring types: you can directly create an external table definition with the types, target it at the Parquet file, and query it as an external table in Vertica. If you convert the same restaurants JSON to Parquet format, you get the primitive fields with their types; however, locations and menu are still in the V-map format. The V-map format also allows you to explode the data, and it has some nice functions, such as mapitems, to extract the fields from the V-map. With the same restaurant data, if you want to explode it and apply a predicate on the fields of the arrays or the rows, you can use mapitems to explode your data and then apply predicates on a particular field of the complex type. This slide shows how you can explode the entire data set, the menu items as well as the locations, and get the elements of each of these complex types out. So, as I mentioned, if you go back to the previous slide, locations and menu items are still in the long varbinary, or V-map, format. The question is: what if you want better performance on the V-map data? For primitive types you could materialize them into primitive columns; however, for an array or an array of rows, we need first-class complex type constructs, and that is what has been added in Vertica now.
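Continuing the restaurants example, here is a hedged sketch of the key-inference and explode steps described above. The helper names follow Vertica's flex functions, though the exact OVER clause options can vary by release:

```sql
-- Infer field names and candidate types from the loaded JSON.
SELECT COMPUTE_FLEXTABLE_KEYS('restaurants');
SELECT * FROM restaurants_keys;

-- Explode a V-map into (key, value) rows so predicates can target
-- individual elements of the nested data.
SELECT MAPITEMS(__raw__) OVER (PARTITION BY __identity__)
FROM restaurants;
```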
So Vertica has started to introduce strongly typed complex types. On this slide you have an example of a row complex type: we create an external table called customers with a row type of two fields, name and id. The complex type is inlined into the column definition in the table. In the second example, you can see a create external table items statement with a nested row type: it has an item of type row with two fields, name and properties, where properties is again another nested row type with two fields, quantity and label. These are strongly typed complex types, and the optimizer can now give you better performance compared to the V-map, using the strongly typed information in your queries. We have support for pure rows and nested rows in external tables for Parquet, and we have support for arrays and nested arrays as well for Parquet external tables. So you can declare an external table called contacts with a field phone_number that is an array of integers, and similarly you can have a nested array of items of type integer and declare a column with that strongly typed complex type. The other complex type support we are adding in the 10.0 release is support for optimized one-dimensional arrays and sets, for both ROS and Parquet external tables. You can create an internal table called phone_numbers with a one-dimensional array, so here you have phone_numbers as an array of type int. You can also have sets, which are likewise one-dimensional collections, but sets are optimized for fast lookups: they have unique elements and they are ordered. So if your use case is fast lookups, sets will give you very quick lookups for elements. We also implemented some functions to support arrays and sets, such as apply_min and apply_max, which are scalar functions that you can apply on top of an array to get the minimum element, and so on; there is support for additional functions as well. The other feature coming in 10.0 is the explode arrays functionality. We have implemented a UDx that, similar to the mapitems example you saw earlier, allows you to extract elements from these arrays and apply different predicates or analysis on the elements. For example, if you have a restaurant table with a column name of type varchar, locations which is an array of varchar, and menu which is again an array of varchar, you can insert values into these columns using the array constructor. Here we insert a couple of rows: one restaurant with locations Cambridge and Pittsburgh and menu items like cheese and pepperoni, and another, Bob's Tacos, with location Houston and menu items like tortillas and salsa. Now you can explode both arrays and extract the elements out of them: you can explode the locations array and extract the location elements, which are Houston, Cambridge, Pittsburgh, and New Jersey, and you can also explode the menu items and extract the individual elements, and then apply other predicates on the exploded data.
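As a rough sketch of the one-dimensional array support and the element-wise helpers just described: the names and literals here are invented, and the EXPLODE call is shown schematically, since the exact UDx signature and OVER clause may differ from what shipped:

```sql
-- One-dimensional array columns in a regular (ROS) table.
CREATE TABLE restaurants_typed (
    name      VARCHAR(64),
    locations ARRAY[VARCHAR(32)],
    prices    ARRAY[NUMERIC(6,2)]
);

-- Insert a row using the array constructor.
INSERT INTO restaurants_typed
SELECT 'Bobs Tacos', ARRAY['Houston'], ARRAY[3.50, 4.25];

-- Element-wise helpers over an array column.
SELECT name, APPLY_MIN(prices), APPLY_MAX(prices) FROM restaurants_typed;

-- Explode the array so each element becomes its own row (schematic).
SELECT name, EXPLODE(locations) OVER (PARTITION BY name)
FROM restaurants_typed;
```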
So let's look at some usage considerations for these complex data types. Complex data types, as we saw earlier, are nice if you have sparse data: if your data comes from clickstreams or page views, maps are a very good way to represent it, and map types let you represent sparse data efficiently, space-wise. And, as we saw earlier with the web request count query, they help simplify the analysis as well: you don't need joins, and your query logic gets simpler. As I just mentioned, if your use case needs fast lookups, you can use the set type. Arrays are nice, but they maintain ordering; if your primary use case is just to look up certain elements, the set type will give you very fast lookups. You can also use the V-map, or flex, functionality in Vertica if you want flexibility in your complex data type schema. Like I mentioned earlier, you can trivially implement needs like schema evolution, or keep the complex types fluid. If you go through multiple iterations of analysis, and in each iteration you are changing the fields because you're still exploring the data, then V-map and flex make it easy to change the fields within the complex type, or across files, and you can load fluid complex types, that is, different fields in different rows, into V-map and flex tables easily. However, once you have iterated over your data and figured out which fields and complex types you really need, you can use the strongly typed complex data types that we have started to introduce in Vertica: the array type, the struct type, and the map type. So that's the high-level guidance for complex types in Vertica; it depends a lot on where you are in your data analysis. Early on, your data is usually still fluid and you might want to use V-maps and flex to explore it; once you finalize your schema, you can use the strongly typed complex data types to get the best possible performance. So what's coming in the following releases of Vertica? In 10.0, the next release of Vertica, which is coming out soon, we are adding support for loading Parquet complex data types into the V-map format. Parquet is a strongly typed file format: it has the schema and the type information for each complex type. However, if you are exploring your data, you might have different Parquet files with different schemas, so you can load them into the V-map format first, analyze your data, and then switch to the strongly typed complex types. We are also adding one-dimensional optimized arrays and sets, in ROS and for Parquet, so the complex types are not limited to Parquet; you can also store them in ROS, although right now we only support one-dimensional arrays and sets in ROS. We are also adding the explode UDx for one-dimensional arrays in this release, so as you saw in the previous example, you can explode array data and apply predicates on the individual elements. It applies to sets as well, since you can cast sets to arrays and explode those. As for the plans past the 10.0 release, we are going to continue building out strongly typed complex types. Right now we don't have support for the full set of combinations of complex types; we only have support for nested arrays or nested rows, and some of it is limited to the Parquet file format, so we will continue to add more support for subqueries and nested complex types in the following releases. We are also planning to add a V-map data type.
You saw in the examples that the V-map data format is currently backed by the long varbinary column type. Because of this, the optimizer cannot distinguish which data is actually a plain long varbinary and which is data in V-map format. The idea is to add a type called V-map, and then the optimizer can implement optimizations, or even syntax such as dot notation, on top of it. And if your data is columnar, such as Parquet, we can implement optimizations like key push-down, where the keys you are actually querying in your analysis are pushed down so that only those keys are loaded from Parquet and built into the V-map format. That way you get the column selection optimization for complex types as well; that's something you can achieve once you have a distinct type for the V-map format, so it's on the roadmap too. Unnest join is another nice-to-have feature. Right now, if you want to explode and join the array elements, you have to explode in a subquery and then join the data in the outer query; with unnest join, you would be able to explode as well as join the data in the same query, on the fly, doing both at once. And finally, we are also adding support for a new feature, UDx vector support; that's on the plan too. Our work for complex types is essentially changing the fundamental way Vertica executes functions and expressions. Right now, all expressions in Vertica can return only a single column, with exceptions in some cases like UDx transforms and so on; scalar functions, for instance a UDx scalar, can return only one column. However, there are use cases where you want to perform multiple computations on the same input data. Say you have input data of two integers and you want to compute both the addition and the multiplication of those two columns; this is a toy example, but many machine learning use cases have similar patterns. If you want to do both computations on the data at the same time, in the current approach you have to have one function for addition and one function for multiplication, and both of them have to load the data, so you load the data twice to get both results. With the vector support, you can perform both computations in the same function and return two columns out, essentially saving you from loading these columns twice; you load them only once and get both results out. That's what we are trying to enable with the changes we are making to support complex data types in Vertica. And you won't have to use an OVER clause like a UDx transform; just like UDx scalars today, you can have your vector function and have multiple columns returned from your computations. So that concludes my talk. Thank you for listening to my presentation; now we are ready for Q&A.
**Summary and Sentiment Analysis are not shown because of an improper transcript**
ENTITIES
Entity | Category | Confidence |
---|---|---|
America | LOCATION | 0.99+ |
Jeff Healey | PERSON | 0.99+ |
second row | QUANTITY | 0.99+ |
Mary | PERSON | 0.99+ |
two rows | QUANTITY | 0.99+ |
two fields | QUANTITY | 0.99+ |
first row | QUANTITY | 0.99+ |
two rows | QUANTITY | 0.99+ |
two types | QUANTITY | 0.99+ |
each row | QUANTITY | 0.99+ |
two integers | QUANTITY | 0.99+ |
Deepak | PERSON | 0.99+ |
one function | QUANTITY | 0.99+ |
three fields | QUANTITY | 0.99+ |
fourth fields | QUANTITY | 0.99+ |
each element | QUANTITY | 0.99+ |
each field | QUANTITY | 0.99+ |
third | QUANTITY | 0.99+ |
more than thousand web requests | QUANTITY | 0.99+ |
second example | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
each key | QUANTITY | 0.99+ |
each table | QUANTITY | 0.99+ |
four fields | QUANTITY | 0.99+ |
third field | QUANTITY | 0.99+ |
first example | QUANTITY | 0.99+ |
Deepak Magette II | PERSON | 0.99+ |
two columns | QUANTITY | 0.99+ |
third category | QUANTITY | 0.99+ |
two columns | QUANTITY | 0.99+ |
two fields | QUANTITY | 0.99+ |
Houston | LOCATION | 0.99+ |
first step | QUANTITY | 0.99+ |
twice | QUANTITY | 0.99+ |
thousands of columns | QUANTITY | 0.98+ |
three values | QUANTITY | 0.98+ |
this week | DATE | 0.98+ |
more than one thousand website events | QUANTITY | 0.98+ |
third type | QUANTITY | 0.98+ |
each iteration | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
greater than thousand | QUANTITY | 0.98+ |
cambridge | LOCATION | 0.98+ |
JSON | TITLE | 0.98+ |
both arrays | QUANTITY | 0.97+ |
one column | QUANTITY | 0.97+ |
thousands of fields | QUANTITY | 0.97+ |
second | QUANTITY | 0.97+ |
third example | QUANTITY | 0.97+ |
two | QUANTITY | 0.97+ |
single column | QUANTITY | 0.96+ |
thousand | QUANTITY | 0.96+ |
Alex | PERSON | 0.96+ |
first | QUANTITY | 0.96+ |
BBC 2020 | ORGANIZATION | 0.96+ |
Vertica | TITLE | 0.96+ |
four columns | QUANTITY | 0.95+ |
once | QUANTITY | 0.95+ |
one type | QUANTITY | 0.95+ |
V Maps | TITLE | 0.94+ |
one color | QUANTITY | 0.94+ |
second type | QUANTITY | 0.94+ |
one dimension | QUANTITY | 0.94+ |
first two fields | QUANTITY | 0.93+ |
four tables | QUANTITY | 0.91+ |
each | QUANTITY | 0.91+ |
Chris Fox, Oracle | Empowering the Autonomous Enterprise of the Future
(upbeat music) >> Welcome back to theCUBE everybody. This is Dave Vellante. We've been covering the transformation of Oracle Consulting and really its rebirth. And I'm here with Chris Fox, who's the Group Vice President for Enterprise Cloud Architects and Chief Technologist for the North America Tech Cloud at Oracle. Chris, thanks so much for coming on theCUBE. >> Thanks Dave, glad to be here. >> So I love this title. I mean years ago there was no such thing as a Cloud Architect, certainly there were Chief Technologists but so you are really-- Those are your peeps, is that right? >> That's right. That's right. That's really, my team and I, that's all we do. So our focus is really helping our customers take this journey from when they were on premise to really transforming with cloud. And when we think about cloud, really for us, it's a combination. It's our hybrid cloud which happens to be on premise and then of course the true public cloud like most people are familiar with. So, very exciting journey and frankly I've seen just a lot of success for our customers. >> interesting that you hear conversations like, "Oh every company is a software company" which by the way we believe. Everybody's got a some kind of SaaS offering, but it really used to be the application, heads within organizations that had a lot of the power, still do, but of course you have cloud native developers etc. And now you have this new role of Cloud Architects, they've got to align, essentially have to provide infrastructure and capabilities so that you can be agile from a development standpoint. I wonder if you can talk about that dynamic of how the roles have evolved in the last several years. >> Yeah, you know it's very interesting now because as Oracle we spend a lot of our time with those applications owners. As a leader in SaaS right now, SaaS ERP, HCM. You just start walking through the list, they're transforming their organizations. They're trying to make their lives, much more efficient, better for their employees or customers etc. On the other side of the spectrum, we have the cloud native development teams and they're looking at better ways to deploy, develop applications, roll out new features at scale, roll out new pipelines. But Dave, what I think we're seeing at Oracle though, because we're so connected with SaaS and then we're also connected with the traditional applications that have run the business for years, the legacy applications that have been servicing us for 20 years and then the cloud native developers. So what my team and I are constantly focused on now is things like digital transformation and really wiring up all three of these across. So if we think of like a customer outcome, like I want to have a package delivered to me from a retailer, that actual process flow could touch a brand new cloud native site from e-commerce. It could touch essentially, maybe a traditional application that used to be on prem that's now on the cloud and then it might even use some new SaaS application maybe for maybe a procurement process or delivery vehicle and scheduling. So what my team does, we actually connect all three. So, what I always mention to my team and all of our customers, we have to be able to service all three of those constituents and really think about process flows. So I take the cloud native developer, we help them become efficient. We take the person who's been running that traditional application and we help them become more efficient. 
And then we have the SaaS applications which are now rolling out new features on a quarterly basis and the whole new delivery model. But the real key is connecting all three of these into a business process flow that makes the customer's life much more efficient. >> So what you're saying is that these Cloud Architects and the sort of modern day Chief Technologists, they're multi tool players. It's not just about cloud, it's about connecting that cloud to, whether the system's on prem or other clouds. Is that right? >> It is. You know and one thing that we're seeing too Dave, is that we know it's multi cloud. So it could be Oracle's cloud, hopefully it's always Oracle's cloud, but we don't expect that. So as architects, we certainly have to take a look at what is it that we're trying to optimize? What's the outcome we're looking for? And then be able to work across these teams, and I think what makes it probably most fun and exciting, on one day in one morning, let's say, you could be talking to the cloud native developer team. Talking about Kubernetes, CI/CD pipelines, all the great technologies that help us roll out applications and features faster. Then you'll go to a traditional, maybe Oracle E-Business suite job. This is something that's been running on prem maybe for 20 years, and it's really still servicing the business. And then you have another team that maybe is rolling out a SaaS application from Oracle. And literally all three teams are connected by a process flow. So the question is, how do we optimize all three on behalf of either the customer, the employee, the supplier? And that's really the job for the Oracle Cloud Architect. Which I think, really good, that's different than the other cloud because for the most part, we actually do offer SaaS, we offer platform, we offer infrastructure and we offer the hybrid cloud on prem. So it's a common conversation. How do we optimize all these? >> So I want to get into this cloud conversation a little bit. You guys are used to this term last mover advantage. I got to ask you about it. How is being last an advantage? But let me start there. >> Yeah, that's a great question. I mean, so frankly speaking I think that-- So Oracle has been developing, what's interesting is our SaaS applications for many, many, many years, and where we began this journey is looking at SaaS. And then we started with platform. Right after that we started saying how do we augment SaaS? This OCI for us or Oracle Cloud Infrastructure Gen 2 could be considered a last mover advantage. What does that mean? We join this cloud journey later than the others but because of our heritage, of the workloads we've been running, right? We've been running enterprise scale workloads for years, the cloud itself has been phenomenal, right? It's easier to use, pay for what you use, elastic etc. These are all phenomenal features, fell. And based on our enterprise heritage it wasn't delivering resilience at scale, even for like the traditional applications we've known on prem forever. People always say, "Chris we want to get out of the data center. "We're going zero data center." And I always say, "Well, how are you going to handle that back office stuff?" Right? The stuff that's really big, it's cranky, doesn't handle just, instances dying or things going away too easily. It needs predictable performance. It needs scale. It absolutely needs security and ultimately a lot of these applications truly have relied on an Oracle database. 
The Oracle database has it's own specific characteristics that it needs to run really well. So we actually looked at the cloud and we said, let's take the first generation clouds, which are doing great, but let's add the features that specifically, a lot of times, the Oracle workload needed in order to run very well and in a cost effective manner. So that's what we mean when we say, last mover advantage. We said, let's take the best of the clouds that are out there today. Let's look at the workloads that, frankly Oracle runs and has been running for years, what our customers needed and then let's build those features right into this next version of the cloud, we can service the enterprise. So our goal, honestly what's interesting is, even that first discussion we had about cloud native, and legacy applications, and also the new SaaS applications, we built a cloud that handles all three use cases, at scale resiliently in a very secure manner, and I don't know of any other cloud that's handling those three use cases, all in, we'll call it the same tendency for us at Oracle. >> Let's unpack that a little bit and get into, sort of, trying to understand the strategy and I want to frame it. So you were the last really to enter the cloud market, let's sort of agree on that. >> Chris: Yup. >> And you kind of built it from the ground up. And it's just too expensive now. The CapEx required to get into cloud is just astronomical. Now, even for a SaaS company, there's no sense. If you're a new SaaS company, you're going to run it in the cloud. Somebody else's cloud. There are some SaaS companies that of course run their own data centers but they're fewer and further between. But so, and I've also said that your advantage relative to the hyper scalers is that you've got this big SaaS estate and it somewhat insulates you, actually more than somewhat. Largely insulates you from the race to the bottom. On compute and storage, cost per bit kind of thing. But my question is, why was it was it important for Oracle, and is it important for Oracle and it's customers, that it had to participate in IaaS and PaaS and SaaS? Why not just the last two layers of that? What does that give you from a strategic advantage standpoint and what does that do for your customer? >> Yeah, great question. So the number one reason why we needed to have all three was that we have so many customers to today that are in a data center. They're running a lot of our workloads on premise and they absolutely are trying to find a better way to deliver a lower cost services to their customers. And, so, we couldn't just say let's just-- everyone needs to just become net new. Everyone just needs to ditch the old and go just to brand new alone. Too hard, too expensive at times. So we said, let's give us customers the ultimate amount of choice. So, let's even go back again to that developer conversation in SaaS. If you didn't have IaaS, we couldn't help customers achieve a zero data center strategy with their traditional application. We'll call it Peoplesoft, or JD Edwards or E-Business suite or even-- there's some massive applications that are running on the Oracle cloud right now that are custom applications built on the Oracle database. What they want is they said, "Give me the lowest ASP to get predictable performance IaaS" I'll run my app's tier on this. 
Number two, give me a platform service for database 'cause frankly, I don't really want to run your database, like, with all the manual effort, I want someone to automate, patching, scale up and down, and all these types of features like the pilot should have given us. And then number three, I do want SaaS over time. So we spend a lot of time with our customers, really saying, "how do I take this traditional application, run it on IaaS and PaaS?" And then number two, "let's modernize it at scale." Maybe I want to start peeling off functionality and running them as cloud native services right alongside, right? That's something again, that we're doing at scale, and other people are having a hard time running these traditional workloads on prem in the cloud. The second part is they say, "You know, I've got this legacy traditional ERP. Been servicing we well or maybe a supply chain system. Ultimately I want to get out of this. How do I get to SaaS?" And we say, "Okay, here's the way to do this. First, bring into the cloud, run it on IaaS and PaaS. And then selectively, I call it cloud slicing. Take a piece of functionality and put it into SaaS." For ERP, it might be something like start with GL, a new chart of accounts in ERP SaaS. And then slowly over a number of your journey as needed, adopt the next module. So this way, I mean, I'll just say this is the fun part of as an architect, our jobs, we're helping customers move to the cloud at scale, we're helping them do it at their rate, with whatever level of change they want. And when they're ready for SaaS, we're ready for them. And I would just say the other IaaS providers, here's the challenge we're seeing Dave, is that they're getting to the cloud, they're doing a little bit of modernization, but they want PaaS, they also want to ultimately get to SaaS, and frankly, those other clouds don't offer them. So they're kind of in this we're stuck on this lift and shift. But then we want to really move and modernize and go to SaaS. And I would say that's what Oracle is doing right now for enterprises. We're really helping them move these traditional workloads to the cloud IaaS and PaaS. And then number two, they're moving to SaaS when they're ready. And even when you get to SaaS, everyone says, "You know what, leave it as as vanilla as possible, but I want to make myself differentiated." In that case, again, IaaS and PaaS, coupled alongside a SaaS environment, you can build your specific differentiation. And then you leave the ERP pristine, so it can be upgraded constantly with no impact to your specific sidebar applications. So, I would say that the best clouds in the world, I mean, I think you're going to see a lot of the others are trying to, either SaaS providers trying to grow a PaaS, or maybe some of the IaaS players are trying to add SaaS. So, I think you're going to see this blending more and more because customers are asking for the flexibility For either or all three. But I will say that-- >> How can I get PaaS and SaaS-minus. >> Absolutely, I mean, what are you doing there? You're offering choice. There's not a question in my mind that Cisco is a huge customer of ours, they have a product that is one of their SaaS applications running Tetration on the Oracle Cloud. It actually doesn't run any Oracle. It's all cloud native applications. Natively built with a number of open source components. They run just IaaS. That's it, the Tetration product, and it runs fast. 
The Gen 2 cloud has a great architecture underneath it, a flattened, fast network. By far, for us, we feel like we've really gotten into the guts of IaaS and made it run more efficiently. Other customers say, "I've got a huge Oracle footprint in the data center, help me get it out." So up to the cloud they go, and they say I don't want just IaaS, because that means I'm writing all the automation, like I have to manage all the patching. And this is where, for us, platform services really help, because we give them the automation at scale, which allows their people to do other things that may be more impactful for the business. >> I want to ask you about the automation piece. And you guys have made the statement that your Gen 2 cloud is fundamentally different than how other clouds work, Gen 1 clouds. And the Gen 1 clouds which are evolving, the hyperscalers are evolving, but how is Oracle's Gen 2 cloud fundamentally different? >> Yeah. I think that one of the most basic elements of the cloud itself was that for us, we had to start with the security and the network. So if you imagine that those two components really, A, could dictate speed and performance, plus doing it in a secure fashion. The two things that you'll see an awful lot about for us are that we've embedded security at every level, and that we've separated off the control plane. In every cloud, you have a number of compute instances and then you have storage, right? In the middle, you have a network. However, to become a cloud, and to offer the elastic scale and the multiple sharing of resources, you have to have something called a control plane. What we've done is we've actually extracted the control plane out into its own separate instance of a running machine. Other clouds actually have the control plane inside of their running compute cores. Now, what does that do? Well, the fact of the matter is, we assume that the control plane and the network should be completely separate from what you run on your cloud. So if you run a virtual machine, or if you run a bare metal instance, there's no Oracle software running on it. We actually don't trust customers, and we actually tell the customers, don't trust us, either. So by separating out the control plane, and all the code that runs that environment, off of the running machine, you get more cores, meaning there's no Oracle tax for running this environment. It's a separate computer for each one, the control plane. Number two, it's more secure. We actually don't have any running code on that machine, if you had a bare metal instance. So therefore, there's no way for one machine in the cloud to infect another machine if the control plane was compromised. The second part is the network. The guys who have been building this cloud, Don Johnson, a lot of the guys came from other clouds before, and they said, "You know, the one thing we have to do is make what we call a flattened, fast Clos network that really is never oversubscribed." So you'll constantly see it, and people always ask me the same question, "Dave, why is the performance faster if it's the same VM shape? "Like I don't understand why it's going faster, like high performance computing." And the reason, again, a lot of times, is the network itself: it's just not oversubscribed. It's constantly flowing all the data; there's no such thing as congestion on the network, which can happen. The last part, we actually added 52 terabytes of local storage to every one of those compute nodes.
So therefore, there's a possibility you don't even have to traverse the network to do some really serious work on the local machine. So you add these together, the idea is make the network incredibly fast, separate out the control plane and run the software and security layer separate from the entire node where all the customers work is being done. Number three, give the customers more compute, by obviously having us offload it to a separate machine. And the last thing is put local storage and everything is what's called NVMe storage. Whether it's local or remote, everything's NVMe, though the IOPS we get are really off the charts. And again, it shows up in our benchmarks. >> Yeah, so you're getting, atomic access to memory. But in your control plane, you describe that control plane that's running. Sorry to geek out everybody. But I'm kind of curious, you know. You got me started, Chris. So that's control-- >> Yeah, that's good. >> the Oracle cloud or runs. Where's it live? >> It's essentially separated from the compute node. We actually have it in between, there's a compute node that all the work is done from the customer, could be on like a Kubernetes container or VM, whatever it might be. The control plane literally is separate. And it lives right next to the actual compute node the customer is using. So it's actually embedded on a SmartNIC, it's a completely different cores. It's a different chipset, different memory structure, everything. And it does two things. It helps us control what happens up in the customers compute nodes in VMs. And it also helps us virtualize the network down as well. So it literally, the control plane is separate and distinct. It's essentially a couple SmartNICS. >> And then how does Autonomous fit into this whole architecture? I'm speaking by the way for that description, I mean, it's nuanced, but it's important. I'm sure you having this conversation with a lot of cloud architects and chief technologists, they want to know this stuff, and they want to know how it works. And then, obviously, we'll talk about what the business impact is. But talk about Autonomous and where that fit. >> Yeah, so as Larry says that there are two products that really dictate the future of Oracle and our success with our customers. Number one is ERP-SaaS. The second one is Autonomous Database. So the Autonomous Database, what we've done is really taken a look at all the runtime operations of an Oracle database. So tuning, patching, securing all these different features, and what we've done is taken the best of the Oracle database, the best of something called Exadata which we run on the cloud, which really helps a lot of our customers. And then we've wrapped it with a set of automation and security tools to help it really manage itself, tune itself, patch itself, scale up and down, independent between compute and storage. So, why that's important though, is that really our goal is to help people run the Oracle database as they have for years but with far less effort, and then even not only far less effort, hopefully, a machine plus man, out of the equation we always talk about is man plus machine is greater than man alone. So being assisted by artificial intelligence and machine learning to perform those database operations, we should provide a better service to our customers with far less costs. >> Yeah, the greatest chess player in the world is a combination of man and machine, you know that? >> You know what? It makes sense. 
It makes sense because, there's a number of things that we can do as humans that are just too difficult to program. And then there are other things where machines are just phenomenal, right? I mean, there's no-- Think of Google Maps, you ask it wherever you want to go. And it'll tell you in a fraction of a second, not only the best route, but based on traffic from maybe the last couple of years. right now, we don't have autonomous cars, right, that are allowed to at least drive fully autonomous yet, it's coming. But in the meantime, a human could really work through a lot of different scenarios it was hard to find a way to do that in autonomous driving. So I do believe that it's going to be a great combination. Our hope and goal is that the people who have been running Oracle databases, how can we help them do it with far less effort and maybe spend more time on what the data can do for the organization, right? Improve customer experience, etc. Versus maybe like, how do I spin up a table? One of our customers is a huge consumer. They said, "our goal is how do we reduce the time to first table?" Meaning someone in the business just came up with an idea? How do I reduce the time to first table. For some of our customers, it can take months. I mean, if you were going to put in a new server, find a place in the data center, stand up a database, make the security controls, right and etc. With the autonomous database, I could spin one up right here, for us and, and we could start using it and it would be secure, which is utmost and paramount. It would scale up and down, meaning like just based on workload, as I load data into it, it would tune itself, it would help us with the idea of running more efficiently, which means less cores, which means also less cost. And then the constant security patches that may come up because of different threats or new features. It would do that potentially on its own if you allow it. Obviously some people want to watch you know what exactly it's going to do first. Do regression testing. But it's an exciting product because I've been working with the Oracle database for about 20 years now. And to see it run in this manner, it's just phenomenal. And I think that's the thing, a lot of the database teams have seen. Pretty amazing work. >> So I love this conversation. It's hardcore computer science, architecture, engineering. But now let's end with by up leveling this. We've been talking, a lot about Oracle Consulting. So let's talk about the business impact. So you go into customers, you talk to the cloud architects, the chief technologist, you pass that test. Now you got to deliver the business impact. Where does Oracle consulting fit with regard to that, and maybe you could talk about sort of where you guys want to take this thing. >> Yeah, absolutely. I mean, so, the cloud is great set of technologies, but where Oracle consulting is really helping us deliver is in the outcome. One of the things I think that's been fantastic working with the Oracle consulting team is that cloud is new. For a lot of customers who've been running these environments for a number of years, there's always some fear and a little bit of trepidation saying, "How do I learn this new cloud?" I mean, the workloads, we're talking about deeper, like tier zero, tier one, tier two, and all the way up to Dev and Test and DR, Oracle Consulting does really, a couple of things in particular, number one, they start with the end in mind. 
And number two, that they start to do is they really help implement these systems. And, there's a lot of different assurances that we have that we're going to get it done on time, and better be under budget, 'cause ultimately, again, that's something that's really paramount for us. And then the third part of it a lot of it a lot of times is run books, right? We actually don't want to just live at our customers environments. We want to help them understand how to run this new system. So training and change management. A lot of times Oracle Consulting is helping with run books. We usually will, after doing it the first time, we'll sit back and let the customer do it the next few times, and essentially help them through the process. And our goal at that point is to leave, only if the customer wants us to but ultimately, our goal is to implement it, get it to go live on time, and then help the customer learn this journey to the cloud. And without them, frankly, I think these systems are sometimes too complex and difficult to do on your own, maybe the first time especially because like I say, they're closing the books, they might be running your entire supply chain. They run your entire HR system or whatever they might be. Too important to leave to chance. So they really help us with helping the customer become live and become very competent and skilled, because they can do it themselves. >> But Chris, we've covered the gamut. We're talking about, architecture, went to NVMe. We're talking about the business impact, all of your automation, run books, loved it. Loved the conversation, but to leave it right there but thanks so much for coming on theCUBE and sharing your insights, great stuff. >> Absolutely, thanks Dave, and thank you for having me on. >> All right, you're welcome. And thank you for watching everybody. This is Dave Vellante for theCUBE. We are covering the Oracle North America Consulting transformation and its rebirth in this digital event. Keep it right there. We'll be right back. (upbeat music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Chris | PERSON | 0.99+ |
Larry | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
Dave | PERSON | 0.99+ |
Chris Fox | PERSON | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
two products | QUANTITY | 0.99+ |
20 years | QUANTITY | 0.99+ |
52 terabytes | QUANTITY | 0.99+ |
Don Johnson | PERSON | 0.99+ |
one day | QUANTITY | 0.99+ |
Oracle Consulting | ORGANIZATION | 0.99+ |
second part | QUANTITY | 0.99+ |
first table | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
First | QUANTITY | 0.99+ |
one machine | QUANTITY | 0.99+ |
two things | QUANTITY | 0.99+ |
second one | QUANTITY | 0.99+ |
three use cases | QUANTITY | 0.99+ |
one | QUANTITY | 0.98+ |
first time | QUANTITY | 0.98+ |
three | QUANTITY | 0.98+ |
Peoplesoft | ORGANIZATION | 0.98+ |
two components | QUANTITY | 0.98+ |
each one | QUANTITY | 0.98+ |
one morning | QUANTITY | 0.98+ |
three teams | QUANTITY | 0.97+ |
about 20 years | QUANTITY | 0.97+ |
Oracle North America Consulting | ORGANIZATION | 0.97+ |
today | DATE | 0.97+ |
IaaS | TITLE | 0.97+ |
Google Maps | TITLE | 0.96+ |
first generation | QUANTITY | 0.96+ |
one thing | QUANTITY | 0.96+ |
first | QUANTITY | 0.95+ |
Oracle consulting | ORGANIZATION | 0.94+ |
NVMe | ORGANIZATION | 0.93+ |
two layers | QUANTITY | 0.93+ |
JD Edwards | ORGANIZATION | 0.92+ |
Number one | QUANTITY | 0.92+ |
Number three | QUANTITY | 0.9+ |
Jeremy Daly, Serverless Chats | CUBEConversation January 2020
(upbeat music) >> From the Silicon Angle Media office in Boston, Massachusetts, it's theCube. Now, here's your host, Stu Miniman. >> Hi, I'm Stu Miniman, and welcome to the first interview of theCube in our Boston area studio for 2020. And to help me kick it off, Jeremy Daly, who is the host of Serverless Chats as well as runs Serverless Days Boston. Jeremy, saw you at reInvent, way back in 2019, and we'd actually had some of the people in the community that were like hey, "I think you guys actually live and work right near each other." >> Right. >> And you're only about 20 minutes away from our office here, so thanks so much for making the long journey here, and not having to get on a plane to join us here. >> Well, thank you for having me. >> All right, so as Calvin from Calvin and Hobbes says, "It's a new decade, but we don't have any base on the moon, "we don't have flying cars that general people can use, "but we do have serverless." >> And our robot vacuum cleaners. >> We do have robot vacuum cleaners. >> Which are run by serverless, as a matter of fact. >> A CUBE alum on the program would be happy that we do get to mention that here. So yeah, you know with serverless there are things like the iRobot, as well as Alexa, or some of the things that people, you know usually when I'm explaining to people what this is, and they don't understand it, it's like, oh, you've used Alexa, well those are the functions underneath, and you think about how these things turn on and off, a little bit like that. But maybe we don't need to get into the long ontological discussion or everything, but you know you're a serverless hero, so give us a little bit of what you're hearing from people, what are some of the exciting use cases out there, and where serverless is being used in that maturity today. >> Yeah, I mean well, so the funny thing about serverless and the term serverless itself, and I do not want to get into a long discussion about this, obviously. I actually wrote a post last year that was called "Stop Calling Everything Serverless," because basically people are calling everything serverless. So really, what I look at it as, is something where it just makes it really easy for developers to abstract away that back end infrastructure, and not have to worry about setting up Kubernetes, or going through the process of setting up virtual machines and installing software; a lot of that stuff is kind of handled for you. And I think that has enabled a lot of companies, especially start-ups, which are a huge market for serverless, but also enterprises. It has enabled them to give more power to their developers, and be able to look at new products that they want to build, new services they want to tackle, or even old services that may have some stability issues or things like long running ETL tasks, and other things like that, where they found a way to sort of find the peripheral edges of these monolithic applications or these mainframes that they are using, and find ways to run very small jobs, you know using functions as a service, something like that. And so, I see a lot of that, I think that is a big use case. You see a lot of large companies doing that. Obviously, people are building full fledged applications. So, yes, the web facing user application, certainly a thing. People are building APIs, you've got API Gateway, they just released the new HTTP API which makes it even faster.
To run those sorts of things, this idea of cold starts, you know AWS is trying to get rid of all that stuff, with the new VPC networking, and some of the things they are doing there. So you have a lot of those types of applications that people are building as well. But it really runs the gamut, there are things all across the board that you can do, and pretty much anything you can do with a traditional computing environment, you can do with a serverless computing environment. And obviously that's focusing quite a bit on the functions as a service side of things, which is a very tiny part of serverless, if you want to look at it, you know, sort of the broader picture, this service-full or managed services type approach. And so, that's another thing that you see, where you used to have companies setting up, you know, MySQL databases and clusters trying to run these things, or even worse, Cassandra rings, right. Trying to do these things and manage this massive amount of infrastructure, just so that they could write a few records to a database and read them back for their application. And that would take months sometimes, for them to get it set up and even more time to try to keep running them. So this sort of revolution of managed services and all these things we get now, whether that's things like managed Elasticsearch or an Elasticsearch cloud doing that stuff for you, or Bigtable and DynamoDB, and managed Cassandra, whatever those things are, just makes it a lot easier for developers to just say hey, I need a database, and okay, here it is, and I don't have to worry about the infrastructure at all. So, I think you see a lot of people, and a lot of companies, that are utilizing all of these different services now, and essentially are no longer trying to re-invent the wheel. >> So, a couple of years ago, I was talking to Andy Jassy, at an interview with theCube, and he said, "If I was to build AWS today, "I would've built it on serverless." And from what I've seen over the last two or three years or so, Amazon is rebuilding a lot of their services underneath. It's very interesting to watch that platform changing. I think it's had some ripple effect dynamics inside the company, 'cause Amazon is very well known for their two pizza teams and for how all of their products are built, but I think it was actually in a conversation with you, we were talking about how in some ways this new way of building things is, you know, a connecting fabric between the various groups inside of Amazon. So, I love your viewpoint that we shouldn't just call everything serverless, but in many ways, this is a revolution and a new way of thinking about building things, and therefore, you know, there are some organizational and dynamical changes that happen, for an Amazon, but for other people that start using it. >> Yeah, well I mean I actually was having a conversation with Ajay Nair, who's one of the product owners for Lambda, and he was saying to me, well how do we sell serverless. How do we tell people you know this is the next way to do things. I said, just, it's the way, right. And Amazon has realized this, and part of the great thing about dogfooding your own product is that you say, okay I don't like the taste of this bit, so we're going to change it to make it work. And that's what Amazon has continued to do, so they run into limitations with serverless, just like us early adopters run into limitations, and they say, well, how do we make it better, how do we fix it.
And they have always been really great to listening to customers. I complain all the time, there's other people that complain all the time, that say, "Hey, I can't do this." And they say, "Well what if we did it this way, and out of that you get things like Lambda Destinations and all different types of ways, you get Event Bridge, you get different ways that you can solve those problems and that comes out of them using their own services. So I think that's a huge piece of it, but that helps enable other teams to get past those barriers as well. >> Jeremy, I'm going to be really disappointed if in 2020, I don't see a T-shirt from one of the Serverless Days, with the Mandalorian on it, saying, "Serverless, this is the way." Great, great, great marketing opportunity, and I do love that, because some of the other spaces, you know we're not talking about a point product, or a simple thing we do, it is more the way of doing things, it's just like I think about Cybersecurity. Yes, there are lots of products involved here but, you know this is more of you know it's a methodology, it needs to be fully thought of across the board. You know, as to how you do things, so, let's dig in a little bit. At reInvent, there was, when I went to the serverless gathering, it was serverless for everyone. >> Serverless for everyone, yes. >> And there was you know, hey, serverless isn't getting talked, you know serverless isn't as front and center as some people might think. They're some people on the outside look at this and they say, "Oh, serverless, you know those people "they have a religion, and they go so deep on this." But I thought Tim Wagner had a really good blog post, that came out right after reInvent, and what we saw is not only Amazon changing underneath the way things are done, but it feel that there's a bridging between what's happening in Kubernetes, you see where Fargate is, Firecracker, and serverless and you know. Help us squint through that, and understand a little bit, what your seeing, what your take was at reInvent, what you like, what you were hoping to see and how does that whole containerization, and Kubernetes wave intersect with what we're doing with serverless? >> Yeah, well I mean for some reason people like Kubernetes. And I honestly, I don't think there is anything wrong with it, I think it's a great container orchestration system, I think containers are still a very important part of the workloads that we are putting into a cloud, I don't know if I would call them cloud native, exactly, but I think what we're seeing or at least what I'm seeing that I think Amazon is seeing, is they're saying people are embracing Kubernetes, and they are embracing containers. And whether or not containers are ephemeral or long running, which I read a statistic at some point, that was 63% of containers, so even running on Kubernetes, or whatever, run for less than 10 minutes. So basically, most computing that's happening now, is fairly ephemeral. And as you go up, I think it's 15 minutes or something like that, I think it's 70% or 90% or whatever that number is, I totally got that wrong. But I think what Amazon is doing is they're trying to basically say, look we were trying to sell serverless to everyone. 
We're trying to sell this idea of look managed services, managed compute, the idea that we can run even containers as close to the metal as possible with something like Fargate which is what Firecracker is all about, being able to run virtual machines basically, almost you know right on the metal, right. I mean it's so close that there's no level of abstraction that get in the way and slow things down, and even though we're talking about milliseconds or microseconds, it's still something and there's efficiencies there. But I think what they looked at is, they said look at we are not Apple, we can't kill Flash, just because we say we're not going to support it anymore, and I think you mention this to me in the past where the majority of Kubernetes clusters that were running in the Public Cloud, we're running in Amazon anyways. And so, you had using virtual machines, which are great technology, but are 15 years old at this point. Even containerization, there's more problems to solve there, getting to the point where we say, look you want to take this container, this little bit of code, or this small service and you want to just run this somewhere. Why are we spinning up virtual containers. Why are we using 15 or 10 year old technology to do that. And Amazon is just getting smarter about it. So Amazon says hay, if we can run a Lambda function on Firecracker, and we can run a Fargate container on Firecracker, why can't we run, you know can we create some pods and run some pods for Kubernetes on it. They can do that. And so, I think for me, I was disappointed in the keynotes, because I don't think there was enough serverless talk. But I think what they're trying to do, is there trying to and this is if I put my analyst hat on for a minute. I think they're trying to say, the world is at Kubernetes right now. And we need to embrace that in a way, that says we can run your Kubernetes for you, a lot more efficiently and without you having to worry about it than if you use Google or if you use some other cloud provider, or if you run on-prem. Which I think is the biggest competitor to Amazon is still on-prem, especially in the enterprise world. So I see them as saying, look we're going to focus on Kubernetes, but as a way that we can run it our way. And I think that's why, Fargate and Kubernetes, or the Kubernetes for Fargate, or whatever that new product is. Too many product names at AWS. But I think that's what they are trying to do and I think that was the point of this, is to say, "Listen you can run your Kubernetes." And Claire Legore who showed that piece at the keynote, Vernor's keynote that was you know basically how quickly Fargate can scale up Kubernetes, you know individual containers, Kubernetes, as opposed to you know launching new VM's or EC2 instances. So I thought that was really interesting. But that was my overall take is just that they're embracing that, because they think that's where the market is right now, and they just haven't yet been able to sell this idea of serverless even though you are probably using it with a bunch of things anyways, at least what they would consider serverless. >> Yeah, to part a little bit from the serverless for a second. Talk about multi-cloud, it was one of the biggest discussions, we had in 2019. When I talk to customers that are using Kubernetes, one of the reasons that they tell me they're doing it, "Well, I love Amazon, I really like what I'm doing, "but if I needed to move something, it makes it easier." 
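As a brief aside to the functions-as-a-service discussion earlier in the conversation, here is a minimal sketch of what one of those small units of compute can look like in practice. This is a generic AWS Lambda-style handler written in Python, with made-up field names; it is only meant to illustrate the "just write the business logic" idea, not any specific production setup described by the speakers.

```python
import json

def handler(event, context):
    # A tiny piece of business logic, invoked on demand (for example behind an
    # API Gateway HTTP API); the platform handles servers, scaling, and patching.
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```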
Yes, there are some underlying services I would have to re-write, and I'm looking at all those. I've talked to customers that started with Kubernetes, somewhere other than Amazon, and moved it to Amazon, and they said it did make my life easier to be able to do that fundamental, you know the container piece was easy move that piece of it, but you know the discussion of multi-cloud gets very convoluted, very easily. Most customers run it when I talk to them, it's I have an application that I run, in a cloud, sometimes, there's certain, you know large financials will choose two of everything, because that's the way they've always done things for regulation. And therefore they might be running the same application, mirrored in two different clouds. But it is not follow the sun, it is not I wake up and I look at the price of things, and deploy it to that. And that environment it is a little bit tougher, there's data gravity, there's all these other concerns. But multi-cloud is just lots of pieces today, more than a comprehensive strategy. The vision that I saw, is if multi-cloud is to be a successful strategy, it should be more valuable than the sum of its pieces. And I don't see many examples of that yet. What do you see when it comes to multi-cloud and how does that serverless discussion fit in there? >> I think your point about data gravity is the most important thing. I mean honestly compute is commoditized, so whether your running it in a container, and that container runs in Fargate or orchestrated by Kubernetes, or runs on its own somewhere, or something's happening there, or it's a fast product and it's running on top of K-native or it's running in a Lambda function or in an Azure function or something like that. Compute itself is fairly commoditized, and yes there's wiring that's required for each individual cloud, but even if you were going to move your Kubernetes cluster, like you said, there's re-writes, you have to change the way you do things underneath. So I look at multi-cloud and I think for a large enterprise that has a massive amount of compliance, regulations and things like that they have to deal with, yeah maybe that's a strategy they have to embrace, and hopefully they have the money and tech staff to do that. I think the vast majority of companies are going to find that multi-cloud is going to be a completely wasteful and useless exercise that is essentially going to waste time and money. It's so hard right now, keeping up with everything new that comes out of one cloud right, try keeping up with everything that comes out of three clouds, or more. And I think that's something that doesn't make a lot of sense, and I don't think you're going to see this price gauging like we would see with something. Probably the wrong term to use, but something that we would see, sort of lock-in that you would see with Oracle or with Microsoft SQL, some of those things where the licensing became an issue. I don't think you're going to see that with cloud. And so, what I'm interested in though in terms of the term multi-cloud, is the fact that for me, multi-cloud really where it would be beneficial, or is beneficial is we're talking about SaaS vendors. And I look at it and I say, look it you know Oracle has it's own cloud, and Google has it's own cloud, and all these other companies have their own cloud, but so does Salesforce, when you think about it. 
So does Twilio, even though Twilio runs inside AWS, really its I'm using that service and the AWS piece of it is abstracted, that to me is a third party service. Stripe is a third-party service. These are multi-cloud structure or SaaS products that I'm using, and I'm going to be integrating with all those different things via API's like we've done for quite some time now. So, to me, this idea of multi-cloud is simply going to be, you know it's about interacting with other products, using the right service for the right job. And if your duplicating your compute or you're trying to write database services or something like that that you can somehow share with multiple clouds, again, I don't see there being a huge value, except for a very specific group of customers. >> Yeah, you mentioned the term cloud-native earlier, and you need to understand are you truly being cloud-native or are you kind of cloud adjacent, are you leveraging a couple of things, but you're really, you haven't taken advantage of the services and the promise of what these cloud options can offer. All right, Jeremy, 2020 we've turned the calendar. What are you looking at, you know you're planning, you got serverless conference, Serverless Days-- >> Serverless Days Boston. >> Boston, coming up-- >> April 6th in Cambridge. >> So give us a little views to kind of your view point for the year, the event itself, you got your podcast, you got a lot going on. >> Yeah, so my podcast, Serverless Chats. You know I talk to people that are in the space, and we usually get really really technical. So if you're a serverless geek or you like that kind of stuff definitely listen to that. But yeah, but 2020 for me though, this is where I see what is happened to serverless, and this goes back to my "Stop calling everything serverless" post, was this idea that we keep making serverless harder. And so, as a someone whose a serverless purist, I think at this point. I recognize and it frustrates me that it is so difficult now to even though we're abstracting away running that infrastructure, we still have to be very aware of what pieces of the infrastructure we are using. Still have setup the SQS Queue, still have to setup Event Bridge. We still have to setup the Lambda function and API gateways and there's services that make it easier for us, right like we can use a serverless framework, or the SAM framework, or ARCH code or architect framework. There's a bunch of these different ones that we can use. But the problem is that it's still very very tough, to understand how to stitch all this stuff together. So for me, what I think we're going to see in 2020, and I know there is hints for this serverless framework just launched their components. There's other companies that are doing similar things in the space, and that's basically creating, I guess what I would call an abstraction as a service, where essentially it's another layer of abstraction, on top of the DSL's like Terraform or Cloud Formation, and essentially what it's doing is it's saying, "I want to launch an API that does X-Y-Z." And that's the outcome that I want. Understanding all the best practices, am I supposed to use Lambda Destinations, do I use DLQ's, what should I throttle it at? All these different settings and configurations and knobs, even though they say that there's not a lot of knobs, there's a lot of knobs that you can turn. Encapsulating that and being able to share that so that other people can use it. 
That in and of itself would be very powerful, but where it becomes even more important and I think definitely from an enterprise standpoint, is to say, listen we have a team that is working on these serverless components or abstractions or whatever they are, and I want Team X to be able to use, I want them to be able to launch an API. Well you've got security concerns, you've got all kinds of things around compliance, you have what are the vetting process for third-party libraries, all that kind of stuff. If you could say to Team X, hey listen we've got this component, or this piece of, this abstracted piece of code for you, that you can take and now you can just launch an API, serverless API, and you don't have to worry about any of the regulations, you don't have to go to the attorneys, you don't have to do any of that stuff. That is going to be an extremely powerful vehicle for companies to adopt things quickly. So, I think that you have teams now that are experimenting with all of these little knobs. That gets very confusing, it gets very frustrating, I read articles all the time, that come out and I read through it, and this is all out of date, because things have changed so quickly and so if you have a way that your teams, you know and somebody who stays on top of the learning this can keep these things up to date, follow the most, you know leading practices or the best practices, whatever you want to call them. I think that's going to be hugely important step from making it to the teams that can adopt serverless more quickly. And I don't think the major cloud vendors are doing anything in this space. And I think SAM is a good idea, but basically SAM is just a re-write of the serverless framework. Whereas, I think that there's a couple of companies who are looking at it now, how do we take this, you know whatever, this 1500 line Cloud Formation template, how do we boil that down into two or three lines of configuration, and then a little bit of business logic. Because that's where we really want to get to. It's just we're writing business logic, we're no where near there right now. There's still a lot of stuff that has to be done, around configuration and so even though it's nice to say, hey we can just write some business logic and all the infrastructure is handled for us. The infrastructure is handled for us, if we configure it correctly. >> Yeah, really remind me some of the general thread we've been talking about, Cloud for a number of years is, remember back in the early days, is cloud is supposed to be inexpensive and easy to use, and of course in today's world, it isn't either of those things. So serverless needs to follow those threads, you know love some of those view points Jeremy. I want to give you the final word, you've got your Serverless Day Boston, you got your podcast, best way to get in touch with you, and keep up with all you're doing in 2020. >> Yeah, so @Jeremy_daly on Twitter. I'm pretty active on Twitter, and I put all my stuff out there. Serverless Chats podcast, you can just find, serverlesschats.com or any of the Pod catchers that you use. I also publish a newsletter that basically talks about what I'm talking about now, every week called Off by None, which is, collects a bunch of serverless links and gives them some IoPine on some of them, so you can go to offbynone.io and find that. My website is jeremydaly.com and I blog and keep up to date on all the kind of stuff that I do with serverless there. 
>> Jeremy, great content, thanks so much for joining us on theCube. Really glad and always love to shine a spotlight here in the Boston area too. >> Appreciate it. >> I'm Stu Miniman. You can find me on the Twitter's, I'm just @Stu thecube.net is of course where all our videos will be, we'll be at some of the events for 2020. Look for me, look for our co-hosts, reach out to us if there's an event that we should be at, and as always, thank you for watching theCube. (upbeat music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Claire Legore | PERSON | 0.99+ |
15 | QUANTITY | 0.99+ |
Tim Wagner | PERSON | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Jeremy | PERSON | 0.99+ |
2019 | DATE | 0.99+ |
Andy Jassy | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Jeremy Daly | PERSON | 0.99+ |
Boston | LOCATION | 0.99+ |
70% | QUANTITY | 0.99+ |
ORGANIZATION | 0.99+ | |
two | QUANTITY | 0.99+ |
2020 | DATE | 0.99+ |
90% | QUANTITY | 0.99+ |
63% | QUANTITY | 0.99+ |
Cambridge | LOCATION | 0.99+ |
15 minutes | QUANTITY | 0.99+ |
10 year | QUANTITY | 0.99+ |
less than 10 minutes | QUANTITY | 0.99+ |
jeremydaly.com | OTHER | 0.99+ |
Jay Anear | PERSON | 0.99+ |
January 2020 | DATE | 0.99+ |
Calvin | PERSON | 0.99+ |
April 6th | DATE | 0.99+ |
Apple | ORGANIZATION | 0.99+ |
last year | DATE | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
offbynone.io | OTHER | 0.99+ |
three lines | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
serverlesschats.com | OTHER | 0.99+ |
Boston, Massachusetts | LOCATION | 0.99+ |
Lambda | ORGANIZATION | 0.98+ |
two different clouds | QUANTITY | 0.98+ |
@Jeremy_daly | PERSON | 0.98+ |
Twilio | ORGANIZATION | 0.98+ |
three clouds | QUANTITY | 0.98+ |
Kubernetes | TITLE | 0.98+ |
today | DATE | 0.97+ |
about 20 minutes | QUANTITY | 0.97+ |
1500 line | QUANTITY | 0.97+ |
first interview | QUANTITY | 0.96+ |
two pizza teams | QUANTITY | 0.96+ |
Lambda | TITLE | 0.96+ |
one cloud | QUANTITY | 0.96+ |
Alexa | TITLE | 0.96+ |
theCube | ORGANIZATION | 0.95+ |
Azure | TITLE | 0.94+ |
each individual cloud | QUANTITY | 0.94+ |
Serverless Days | EVENT | 0.93+ |
Big Table | ORGANIZATION | 0.93+ |
Stephanie McReynolds, Alation | CUBEConversation, November 2019
>> Announcer: From our studios, in the heart of Silicon Valley, Palo Alto, California, this is a CUBE conversation. >> Hello, and welcome to theCUBE studios, in Palo Alto, California for another CUBE conversation where we go in depth with thought leaders driving innovation across the tech industry. I'm your host, Peter Burris. The whole concept of self-service analytics has been with us for decades in the tech industry. Sometimes it's been successful, most times it hasn't been. But we're making great progress, and have over the last few years, as the technology matures, as the software becomes more potent, but very importantly as the users of analytics become that much more familiar with what's possible and that much more wanting of what they could be doing. But this notion of self-service analytics requires some new invention, some new innovation. What are they? How's that going to play out? Well, we're going to have a great conversation today with Stephanie McReynolds, she's Senior Vice President of Marketing at Alation. Stephanie, thanks again for being on theCUBE. >> Thanks for inviting me, it's great to be back. >> So, tell us a little, give us an update on Alation. >> So as you know, Alation was one of the first companies to bring a data catalog to the market. And that market category has now been cemented and defined, depending on the industry analyst you talk to. There could be 40 or 50 vendors now who are providing data catalogs to the market. So this has become one of the hot technologies to include in a modern analytics stack. Particularly, we're seeing a lot of demand as companies move from on-premise deployments into the cloud. Not only are they thinking about how do we migrate our systems, our infrastructure into the cloud, but with data cataloging, more importantly, how do we migrate our users to the cloud? How do we get self-service users to understand where to go to find data, how to understand it, how to trust it, what re-use can we make of its existing assets so we're not just exploding the amount of processing we're doing in the cloud. So that's been very exciting, it's helped us grow our business. We've now seen four straight years of triple-digit revenue growth, which is amazing for a high-growth company like us. >> Sure. >> We also have over 150 different organizations in production with a data catalog as part of their modern analytics stack. And many of those organizations are moving into the thousands of users. So eBay was probably our first customer to move into, you know, over a thousand weekly logins; they're now up to about 4,000 weekly logins through Alation. But now we have customers like Boeing and General Electric and Pfizer, and we just closed a deal with the US Air Force. So we're starting to see all sorts of different industries and all sorts of different users, from the analytics specialist in your organization, like a data scientist or a data engineer, all the way out to maybe a product manager or someone who doesn't really think of themselves as an analytics expert, using Alation either directly or sometimes through one of our partnerships with folks like Tableau or Microstrategy or Power BI. >> So, if we think about this notion of self-service analytics, Stephanie, and again, Alation has been a leader in defining this overall category, we think in terms of an individual who has some need for data but, most importantly, has questions they think data can answer, and now they're out looking for data. Take us through that process.
They need to know where the data is, they need to know what it is, they need to know how to use it, and they need to know what to do if they make a mistake. How is that, how are the data catalogs, like Alation, serving that, and what's new? >> Yeah, so as consumers, this world of data cataloging is very similar if you go back to the introduction of the internet. >> Sure. >> How did you find a webpage in the '90s? Pretty difficult, you had to know the exact URL to go to, in most cases, to find a webpage. And then Yahoo was introduced, and Yahoo did a whole bunch of manual curation of those pages so that you could search for a page and find it. >> So Yahoo was like a big catalog. >> It was like a big catalog, an inventory of what was out there. So the original data catalogs, you could argue, were what we would call, from a technical perspective, a metadata repository. No business user wants to use a metadata repository, but it created an inventory of what are all the data assets that we have in the organization and what's the description of those data assets. The metadata. So metadata repositories were kind of the original catalogs. The big breakthrough for data catalogs was: how do we become the Google of finding data in the organization? So rather than manually curating everything that's out there and providing an end user with an answer, how could we use machine learning and AI to look at patterns of usage (what people are clicking on, in terms of data assets) and surface those as data recommendations to any end user, whether they're an analytics specialist or they're just a self-service analytics user. And so that has been the real breakthrough of this new category called data cataloging. And so most folks are accessing a data catalog through a search interface, or maybe they're writing a SQL query and there are SQL recommendations that are being provided by the catalog-- >> Or using a tool that utilizes SQL >> Or using a tool that utilizes SQL, and for most people, most employees in a large enterprise, when you get to those thousands of users, they're using some other tool like Tableau or Microstrategy or, you know, a variety of different data visualization providers or data science tools to actually access that data. So a big part of our strategy at Alation has been, how do we surface this data recommendation engine in those third-party products. And then if you think about it, once you're surfacing that information and providing some value to those end users, the next thing you want to do is make sure that they're using that data accurately. And that's a non-trivial problem to solve, because analytics and data is complicated. >> Right >> And metadata is extremely complicated-- >> And metadata is-- because often it's written in a language that's arcane and done to be precise from a data standpoint, that's not easily consumable or easily accessible by your average human being. >> Right, so a label, for example, on a table in a database might be cust_seg_257, what does that mean? >> It means we can process it really quickly in the system. >> Yeah, but as-- >> But it's useless to a human being-- >> As a marketing manager, right?
I'm like, hey, I want to do some customer segmentation analysis, and I want to find out if people who live in California might behave differently if I provide them an offer than people who live in Massachusetts. It's not intuitive to say, oh yeah, that's in customer_seg_. So what data catalogs are doing is they're thinking about that marketing manager, they're thinking about that pure business user, and helping make that translation between the business terminology ("Hey, I want to run some customer segmentation analysis for the West") and the technical, physical model that underlies the data in that database, which is: customer_seg_257 is the table you need to access to get the answer to that question. So as organizations start to adopt more self-service analytics, it's important that we're managing not just the data itself and this translation from technical metadata to business metadata, but there's another layer that's becoming even more important as organizations embrace self-service analytics. And that's: how is this data actually being processed? What is the logic that is being used to traverse different data sets that end users now have access to? So if I take gender information in one table, and I have information on income in another table, and I have some private information that identifies those two customers as the same in those two tables, in some use cases I can join that data; if I'm doing marketing campaigns, I likely can join that data. >> Sure. >> If I'm running a loan approval process here in the United States, I cannot join that data. >> That's a legal limitation, that's not a technical issue-- >> That's a legal, federal, government issue. Right? And so here's where there's a discussion, in folks that are knowledgeable about data and data management, there's a discussion of how do we govern this data? But I think by saying how we govern this data, we're kind of covering up what's actually going on, because you don't have to govern that data so much as you have to govern the analysis. How is this joined, how are we combining these two data sets? If I just govern the data for accuracy, I might not know the usage scenario, which is someone wants to combine these two things, which makes it illegal. Separately, it's fine; combined, it's illegal. So now we need to think about, how do we govern the analytics themselves, the logic that is being used. And that gets kind of complicated, right? For a marketing manager, the difference between those things on the surface doesn't really make sense. It only makes sense when the context of that government regulation is shared and explained, and in the course of your workflow, dragging and dropping in a Tableau report, you might not remember that, right? >> That's right, and the derivative output that you create, that other people might then be able to use because it's back in the data catalog, doesn't explicitly note, often, that this data was generated as a combination of a join that might not be in compliance with any number of different rules. >> Right, so about a year and a half ago, we introduced a new feature in our data catalog called Trust Check. >> Yeah, I really like this. This is a really interesting thing. >> And that was meant to be a way where we could alert end users to these issues: hey, you're trying to run this analytic and that's not allowed. We're going to give you a warning, we're not going to let you run that query, we're going to stop you in your place.
So that was a way, in the workflow of someone while they're typing a SQL statement or while they're dragging and dropping in Tableau, to surface that up. Now, some of the vendors we work with, like Tableau, have doubled down on this concept of how do they integrate with an enterprise data catalog to make this even easier. So at the Tableau conference last week, they introduced a new metadata API, they introduced a Tableau catalog, and the opportunity for these types of alerts to be pushed into the Tableau catalog as well as directly into reports and worksheets and dashboards that end users are using. >> Let me make sure I got this. So it means that you can put a lot of the compliance rules inside Alation and have a metadata API so that Alation effectively is governing the utilization of data inside the Tableau catalog. >> That's right. So think about the integration with Tableau as this communication mechanism to surface up these policies that are stored centrally in your data catalog. And so this is important, this notion of a central place of reference. We used to talk about data catalogs just as a central place of reference for where all your data assets lie in the organization, and we have some automated ways to crawl those sources and create a centralized inventory. What we've added in our new release, which is coming out here shortly, is the ability to centralize all your policies in that catalog as well as the pointers to your data in that catalog. So you have a single source of reference for how this data needs to be governed, as well as a single source of reference for how this data is used in the organization. >> So does that mean, ultimately, that someone could try to do something, Trust Check says, no you can't, but this new capability will say, and here's why, or here's what you do. >> Exactly. >> A descriptive step that says, let me explain why you can't do it. >> That's right. Let me not just stop your query and tell you no, let me give you the details as to why this query isn't a good query and what you might be able to do to modify that query should you still want to run it. And so all of that context is available for any end user to be able to become more aware of what the system is doing and why it is making its recommendations. And on the flip side, in the world before we had something like Trust Check, the only opportunity for an IT team to stop those queries was just to stop them without explanation, or to try to publish manuals and ask people to take tests, like the DMV, so that they memorized all those rules of governance. >> Yeah, self-service, but if there's a problem you have to call us. >> That's right. That's right. So what we're trying to do is trying to make the work of those governance teams, those IT teams, much easier by scaling them. Because we all know the volume of data that's being created, the volume of analysis that's being created, is far greater than any individual can come up with, so we're trying to scale those precious data expert resources-- >> Digitize them-- >> Yeah, exactly. >> It's a digital transformation of how we acquire data necessary-- >> And then-- >> for data transformation. >> make it super transparent for the end user as to why they're being told yes or no, so that we remove this friction that's existed between business and IT when trying to perform analytics.
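As an illustration of the query-gating idea described above (a Trust Check-style warning that explains why an analysis is blocked rather than silently failing), here is a small, self-contained Python sketch. The policy structure, function names, and rule text are hypothetical examples built around this conversation's gender/income/loan-approval scenario; they are not Alation's actual implementation or API.

```python
# Illustrative sketch: usage-aware governance, where the same join is allowed
# for one use case (marketing) but blocked for another (loan approval), and the
# analyst gets an explanation instead of a bare "no".
from typing import Set, Tuple

# Each policy names the columns that may not be combined, the use cases for
# which that combination is blocked, and the reason surfaced to the analyst.
POLICIES = [
    {
        "restricted_columns": {"gender", "income"},
        "blocked_use_cases": {"loan_approval"},
        "reason": "Combining gender and income attributes is not permitted "
                  "in loan-approval analytics under US lending regulation.",
    },
]

def trust_check(use_case: str, columns_in_query: Set[str]) -> Tuple[bool, str]:
    """Return (allowed, message). A real catalog would resolve the columns from
    the parsed SQL or the BI tool's metadata; here they are passed in directly."""
    for policy in POLICIES:
        if (policy["restricted_columns"] <= columns_in_query
                and use_case in policy["blocked_use_cases"]):
            return False, "Blocked: " + policy["reason"]
    return True, "OK: no governance policy violated."

if __name__ == "__main__":
    print(trust_check("marketing_campaign", {"gender", "income", "state"}))
    print(trust_check("loan_approval", {"gender", "income", "state"}))
```

The design point mirrors the conversation: governance attaches to the analysis (the combination of columns for a given use case), not only to the data itself, and the denial carries its explanation with it.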
>> But I want to build a little bit on one of the things I thought I heard you say, and that is the idea that this new feature, this new capability, will actually prescribe an alternative, logical way for you to get your information that might be in compliance. Have I got that right? >> Yeah, that's right. Because what we also have in the catalog is a workflow that allows individuals called Stewards, analytics Stewards, to be able to make recommendations and certifications. So if there's a policy that says thou shalt not use the data in this way, the Stewards can then say, but here's an alternative mechanism, here's an alternative method, and by the way, not only are we making this recommendation, but this is certified for success. We know that our best analysts have already tried this out, or we know that this complies with government regulation. And so this is a more active way, then, for the two parties to collaborate together in a distributed way that's asynchronous, and so it's easy for everyone no matter what hour of the day they're working or where they're globally located. And it helps progress analytics throughout the organization. >> Oh, and more importantly, it increases the likelihood that someone who is told, you now have self-service capability, doesn't find themselves abandoning it the first time that somebody says no, because we've seen that over and over with a lot of these query tools, right? That somebody says, oh wow, look at this new capability, until the screen, you know, metaphorically, goes dark. >> Right, until it becomes too complicated-- >> That's right-- >> and then you're like, oh, I guess I wasn't really trained on this. >> And then they walk away. And it doesn't get adopted. >> Right. >> And this is a way, it's a very human-centered way, to bring that self-service analyst into the system and be a full participant in how you generate value out of it. >> And help them along. So you know, the ultimate goal that we have as an organization is to help organizations, our customers, become data-literate populations. And you can only become data literate if you get comfortable working with the data and it's not a black box to you. So the more transparency that we can create through our policy center, through documenting the data for end users, and making it easier for them to access, the better. And so, in the next version of the Alation product, not only have we implemented features for analytics Stewards to use, to certify these different assets, to log their policies, to ensure that they can document those policies fully with examples and use cases, but we're also bringing to market a professional services offering from our own team that says, look, given that we've now worked with about 20% of our installed base, and observed how they roll out Stewardship initiatives and how they assign Stewards and how they manage this process, and how they manage incentives, we've done a lot of thinking about what are some of the best practices for having a strong analytics Stewardship practice if you're a self-service analytics oriented organization. And so our professional services team is now available to help organizations roll out this type of initiative, make it successful, and have that be supported with product. So the psychological incentives of how you get one of these programs really healthy is important. >> Look, you guys have always been very focused on ensuring that your customers were able to adopt the value proposition, not just buy the value proposition.
>> Right. >> Stephanie McReynolds, Senior Vice President of Marketing at Alation, once again, thanks for being on theCUBE. >> Thanks for having me. >> And thank you for joining us for another CUBE conversation. I'm Peter Burris. See you next time.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Boeing | ORGANIZATION | 0.99+ |
Pfizer | ORGANIZATION | 0.99+ |
General Electric | ORGANIZATION | 0.99+ |
Stephanie McReynolds | PERSON | 0.99+ |
Stephanie | PERSON | 0.99+ |
Peter Burris | PERSON | 0.99+ |
40 | QUANTITY | 0.99+ |
California | LOCATION | 0.99+ |
Massachusetts | LOCATION | 0.99+ |
Yahoo | ORGANIZATION | 0.99+ |
November 2019 | DATE | 0.99+ |
Alation | ORGANIZATION | 0.99+ |
eBay | ORGANIZATION | 0.99+ |
two parties | QUANTITY | 0.99+ |
two things | QUANTITY | 0.99+ |
two tables | QUANTITY | 0.99+ |
two customers | QUANTITY | 0.99+ |
one table | QUANTITY | 0.99+ |
United States | LOCATION | 0.99+ |
50 vendors | QUANTITY | 0.99+ |
ORGANIZATION | 0.99+ | |
Palo Alto, California | LOCATION | 0.99+ |
SQL | TITLE | 0.99+ |
last week | DATE | 0.99+ |
US Air Force | ORGANIZATION | 0.99+ |
Microstrategy | ORGANIZATION | 0.99+ |
first customer | QUANTITY | 0.99+ |
Tableau | ORGANIZATION | 0.98+ |
Tableau | TITLE | 0.98+ |
Stewards | ORGANIZATION | 0.98+ |
Power BI | ORGANIZATION | 0.98+ |
over 150 different organizations | QUANTITY | 0.98+ |
90's | DATE | 0.97+ |
today | DATE | 0.97+ |
single | QUANTITY | 0.97+ |
one | QUANTITY | 0.97+ |
about 20% | QUANTITY | 0.97+ |
four straight years | QUANTITY | 0.97+ |
first time | QUANTITY | 0.97+ |
CUBE | ORGANIZATION | 0.96+ |
over a thousand weekly logins | QUANTITY | 0.96+ |
thousands of users | QUANTITY | 0.96+ |
two data | QUANTITY | 0.94+ |
Microstrategy | TITLE | 0.94+ |
first companies | QUANTITY | 0.92+ |
Tableau | EVENT | 0.9+ |
about | DATE | 0.9+ |
Silicon Valley, Palo Alto, California | LOCATION | 0.89+ |
a year and a half ago | DATE | 0.88+ |
about 4,000 weekly logins | QUANTITY | 0.86+ |
Trust Check | ORGANIZATION | 0.82+ |
single source | QUANTITY | 0.79+ |
Trust Check | TITLE | 0.75+ |
theCUBE | ORGANIZATION | 0.75+ |
customer_seg_257 | OTHER | 0.74+ |
up | QUANTITY | 0.73+ |
Alation | PERSON | 0.72+ |
decades | QUANTITY | 0.7+ |
cust_seg_257 | OTHER | 0.66+ |
Senior Vice President | PERSON | 0.65+ |
years | DATE | 0.58+ |
CUBEConversation | EVENT | 0.51+ |
David Nuti, Open Systems | CUBEConversation, August 2019
(upbeat music) >> From our studios, in the heart of Silicon Valley, Palo Alto, California, this is a CUBE conversation. >> Hello everyone, welcome to this CUBE conversation here in the Palo Alto CUBE Studios. I'm John Furrier, host of theCUBE. We here have Dave Nuti, who is the Head of Channels for Open Systems. Open Systems just recently launched their partner network in 2019. Dave, welcome to theCUBE conversation. >> Thank you John, good to be here. >> So, security obviously is the hottest area we've been covering it like a blanket these days. It's only getting better and stronger in terms of number of players and options for customers. But that's also a double-edged sword. There's more options, more for customers. And security problems aren't going away. They're just getting more compounded. It's complicated global marketplace, global scale, regional clouds on-premise, no surface area. We've had these conversations with you guys a lot and it's super important, but opportunity to deliver solutions with channel partners has become a huge thing at Amazon re:Inforce, we had a big conversation what that even looks like. It's a new market opportunity for security players. You guys are forging there. Tell us about your partner's channel, just launched, give us a quick overview. >> Yeah I have a growing smile as you talk about the complexity of the space and how difficult it can be because we're the ones that eliminate that complexity, make it very simple. And for our partners that we've been engaging with, I joined the company just over a year ago and we began laying the groundwork of transitioning from a direct sales model to a partner only model and you fast forward to where we are today, we've already made that 180 degree turn and are working exclusively through partners throughout North America and executing around the world in that way. What's exciting for the partners is that they have a new supplier in the portfolio in the form of Open Systems that while it is a new name to them, is anything but new in experience and execution. It might arguably be one of the more seasoned suppliers in their entire portfolio they have today and it is opening doors and breaking down barriers to entry in a number of security categories that for years they've been on the outside looking in trying to figure out, how can I participate in these areas and how can I really unify a conversation around value for my customers that I am the trusted advisor to? And those are the exciting networks of hundreds and thousands of trusted advisors out there that we're engaging with today. >> You know, the security space is interesting. It's changing a lot, it's not just the one supplier, multiple suppliers, there are now hundreds and thousands of suppliers of something, the security market. There's a lot of venture capital being funded for startups, you got customers spending money so there's a lot of spend and activity flow and money flow and huge value creation opportunity. Yet customers are also looking at the cloud technologies as a disruptive enabler of how to deal with new things but also they're looking at their supplier relationships right now, they're evaluating you know, who do I want to do business with, they don't want to get another tool, they don't want to new thing. They don't want to get more and more sprawl. You guys have been Open System and been very successful with word of mouth customer growth. The CEO talked about that in the last interview, it's like you guys have been getting a lot of wins. 
Classic word of mouth, good product offerings. So you have success on the product side. As you go into the channel and enable the people in front of the customers every day to bring a solution to the table, what's the value proposition to the partners? Because they're fighting to be relevant, they want to be in front of the customers. The customers want their partners as well. So the opportunity for the people in front of your customers for the channel is big. What's the value proposition? >> Well establishing trust with the channel is critical. For years they've had solutions that roll into the portfolio that were written in a conference room a year and a half ago and they're only selling off of PowerPoint slides and now you're coming in with Open Systems and you have 20 years of experience accumulated, maturity and automation into a platform that they rarely see that type of door opened up for them. So when they lean in and they really start asking questions about Open Systems, we really check off boxes in a fantastic way for our partners. You talk about vendor sprawl and complexity and it all boils back, you're exactly correct, to the embracing of the cloud and that diversity of application origin, the diversity of the users trying to access those corporate resources, wherever they happen to be hosted and how do I unify a strategy and it's resulted in what is not uncommon having to engage 30, 40, 50, different vendors and then trying to unify that environment, let alone the problem that you can't hire the people to go and do it anyway. There's a negative unemployment issue in IT security categories today. So you know, there's a very, very fortune few that have the ability, the bench, the depth, the resource to do that and then an even fewer number of people who can lead an enterprise down that path and then you turn the corner and where usually there's this tug of war between agility and security. If I'm really agile, it means I'm compromising security. Or if I'm super secure, I'm going to be as slow as a sloth in doing anything. And then you have Open Systems sitting in the middle who says, that's not necessarily the case. You can have world class deployment in an agile platform where all that complexity and service chaining unification is handled for you and that really, that is mind boggling and I'll tell you, it's a whole lot of fun to demonstrate it. >> You know, Dave, we talked a lot of customers and user customers through our media business, CIOs, and now CISOs and they're all kind of working together. They have partners, they have partners they've worked with for many, many years from the old days of buying servers and rack and stacking 'em to software to applications but now the touch points for services are those traditional suppliers, application developers, but security's being bolted in everywhere, so almost all services need security, that's essentially what the main message with cloud is. So that gives the service opportunities for you guys but partners to enable you guys in there. As a partner, if I'm a partner of Open Systems, what do I get? 'Cause I want to make my, I want to keep my customer. I want to deliver security. What do I talk to my customer, what's the pitch that I can give as a partner to customer to ensure that they're going to get what they need from Open Systems? >> What I tell our partners is that we should be the services conversation that you lead with. 
There are a lot of other options out there, and even if you don't mention it by name, if you approach the conversation in an open way with a customer, with the mindfulness of the wide net of capabilities and value that you're able to execute on with Open Systems, it gives you your strongest footing. One of the big problems, and you mentioned it, is that so often, for years, these technology conversations have been siloed and isolated, and that always creates problems. I talked to a partner who works their way downstream on an SD-WAN conversation and at the very end they say, "This looks great, we just have to get it passed by our security team." And the wind falls out of everybody's sails, because that should've been part of the conversation all along, or vice versa, starting from a security conversation and now I've got to get the network team to sign off on it. Open Systems really comes with a model that says all those viewpoints need to be in the room at the same time. That's how you execute and that's how you unify an environment so that you're not running into those bottlenecks later on. It's just madness, it needs to be simpler. >> We were talking before we came on camera about what it means to be disruptive and valuable to partners and to customers, and you mentioned convergence of capabilities and managed services. What do you mean by that? I get convergence of services, we talk about that all the time, from industrial IoT, we've been doing some segments on that, to managed services, people get what that means. What do you mean by convergence of services and managed services with respect to security and Open Systems? >> Absolutely. I mean, convergence, we all carry one in our pocket, so how many people carry a separate GPS device with a separate digital camera with a separate phone and a separate... Converging technologies just simplifies my environment, and oftentimes there's a viewpoint of, I'm compromising in certain areas, that if I break everything out myself I can probably do it better myself. And in some cases that's absolutely true. When you look at how Open Systems has taken a very diverse set of services and network and security categories and unified it into a single platform, we've taken, if you will, we've taken that stack of boxes and turned it into one by building a main services platform that's delivered as a service, but what we've layered on top of it is the ability to manage it for our customers, and I talk about modern managed services. It's very different. Before, managed services was, I'm just too incapable to do something myself, so I need somebody else to do it. When I talk to a partner, I like pointing out that I don't try to find somebody too dumb to do the things we do and they have to rely upon us. No, our best customers are very forward-leaning, 'cause they realize that the automation that we've accumulated over 20 years, where 85 to 90% of our detected incidents are handled by AI automation and machine learning and that type of monitoring automation that we have at the edge and the engine, and the team of 115 level three plus engineers that are executing on our customers' behalf, is a force multiplier for our end customers, to an ability that they will never achieve on their own, they'll never build that on their own. Those are the two, I think two of the biggest pillars in disruption are convergence and managed services, and they are two enormous check boxes for Open Systems, where it's hard to find someone more experienced in that than the team at Open Systems.
>> And those are realities that the customers are dealing with but also the other reality on top of that to make it even more complicated and better for you guys and partners is you have more surface area to deal with. So the AI and the automation really play into the hands of, on the delivery side, so if I'm a partner, I'm standing up Open Systems, it's working. >> So you can't just develop that in a conference room. That's something that's accumulated over time, that's what comes with experience. And I usually really lean heavily into our maturity and our experience. We're in 183 countries with customers today. We have a 98% retention rate, a 58 NPS score. When I show the monitoring portals, the visibility tools, the maturity, and what has been developed isn't just Open Systems, you know, stubbornly telling the world what they need and should be doing. It's actually a very aggressive two way conversation with our existing customers and their guidance telling us, this is what we want, what we need to see, what we need to be able to pull and what we need your help in enforcing. I met with a customer in Pacific Northwest and he dropped a line on me that was terrific. He said, "I'm looking for a partner "that can tell us the questions we should be asking "that we haven't and the technologies "we should be evaluating that we haven't looked at yet." And I told him I was going to steal that line and I'm using it here today. Because that is an absolutely brilliant description of exactly the type of customer experience that we expect to deliver from Open Systems to our customers. >> So if I'm rep, I'm a person who's got a portfolio of customers and I want to bring Open Systems to the table, take me through that. I mean, am I asking the questions, what are some of those questions I should be asking, what's my engagement posture look like to my customer? >> That's a great question. I've been to a number of events and sat through kind of advanced training seminars and at the beginning of a seminar, you have somebody on stage saying, talk about security categories. If you talk about security, then you have a pathway to sell anything else on there. And then at the end of the event, all the SD-WAN guys were sitting on the stage saying, "Talk about SD-WAN, it's the glue "that holds everything together and if you can sell SD-WAN, "it'll give you pathway to everything else." And meanwhile I'm in the back of the room smiling just wondering, what if you didn't have to pick? What if you could just have a wide open conversation with your customer around application origins and remote users and how you're unifying security and application performance and routing intelligence for any application origin to any type of user trying to access it, how are you addressing that? And that's really at the core of what Open Systems has developed for its clients is that type of agility and flexibility where you're never trapped and opening up considerations around new and emerging threats and capabilities that you should be looking at where if it's not the time for you today, we've still already designed it in for you, so when you're ready it's there for you. >> Now the real question on the rep's mind, while he's asking those basic questions. How do I make money from this? Which is essentially, money making certainly is a great channel formula. It's indirect sales for you guys but also you have to have a couple table stakes. One, it's got to be a product that can be sold. 
The delivery has to be elegant enough where there's margin for the partner. And benefit the customer. So the money making is certainly the big part of not only trust as the supplier to the channel, but also as an engine of innovation and wealth creation. What's your pitch there, how am I making money? >> Well as a managed services model, that's always the beauty is you get to configure to the requirement of the individual customer so no one's force fed capability they don't need or an over subscription for what they might need in a year so just in case they want to, we're able to right size and deliver the capability that's specifically configured to the individual customer level but then also show them that they have a pathway to capability laid out for them and integrated and modern, we never go end of life, we never get shelved, this is something that is living, breathing, you're never buying boxes, again and service chaining and handling the complexity so we make that very simple for our partners in categories around security and SOC and manage services, and SIM, and CASB, these are things that they hear about but they don't know how to address them with their customers. And now Open Systems makes that very simple because we fully integrated the capabilities around those categories and many more into the same service-- >> So one of the psychology, I was just reading from that as a rep, if I was a rep I would be like, oh, I don't have to overplay my hand. I can get an engagement with my customer, they can get a feel for the service, grow into it because it's a managed service and go from there, it's not a big ask. >> Right. >> It's instant alignment. >> Yeah, often times what we do is a timing issue. Somebody just bought boxes in one category so fine, we'll coexist with that. We sit in parallel and in framework with current investments and subscriptions that happen to be in place but we give them a pathway that allows them to integrate it into fully unified and I like to really point this out is that, we don't go to a customer and say, "What do you need? "We'll build it for you." It's, what do you need? We've already built it, we just want to know how we configure it for you to match up to what your requirements are and maybe suggest some areas that should be a part of that consideration as well based upon 20 plus years of doing this with customers that we already have under our belt. >> Yeah, it gives them confidence that the operating model of say cloud, it's been around, it's proven and now you have a model there. Final question for you Dave is okay, my fear might be, are you going to be around tomorrow 'cause people want to know, are you going to be there for the long haul? What's your answer to that? >> We're a 30 year old security company founded out of Zurich and started in 1990 and transitioned as a service in 1999 and have grown on the backs, we're customer funded. So this is as battle-tested and bulletproof as anything that they may have in their portfolio and it shows extremely well in front of a customer. I spend more time talking to partners saying be the first one in the door to talk about Open Systems with your customer, don't let somebody else do it. Or certainly use the mindfulness of the net of capabilities of Open Systems and don't go in narrow-viewed because if somebody comes in behind you with our conversation, I don't think you're going to like what happens. 
>> One more question just jumped in my head, you reminded me of, we were talking before we came on camera around how channels are great leverage, great win-win, but we're in a modern era of computing, delivery of services, cloud has certainly shown that, whole nother wave coming behind it, security obviously the biggest challenge. You've been in the channel business for a while, what's your take on what's happening in the channel business, because it is changing, there's opportunities there, what's your take? >> Yeah, this is the second company I've had the opportunity to introduce into the channel, and this one is a lot of fun, I'll say that. But the channel's traditionally thought of in more of a telecom space, and for many of our partners, that's where they've been, literally for decades in some cases, selling not technology but connectivity, rather, networks. But what has happened is that technology has found its way into the network layer, and because of cloud and SaaS app origins and remote users from coffee shops or theCUBE or our customer site accessing those applications, it's created a massive set of diversity in requirements on the IT team at the enterprise, and how do you accommodate for all that? How do you keep up with it and maintain it? And now these things transition from the CapEx buying of boxes and maintenance agreements and rotating those out, and that model is constantly being assaulted in the same way that we've seen with so many services that come to our house. Nobody digs a well for water anymore, I've got a water company. Or makes their own electric power plant in the backyard, I've got the electric company. >> Everything's as a service. >> Absolutely. >> Dave Nuti, head of channels at Open Systems. Thanks for sharing the insight on your partner network, congratulations. Thanks for coming in. >> Pleasure, thank you. >> I'm John Furrier, here at a CUBE Conversation in Palo Alto, thanks for watching. (upbeat music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave | PERSON | 0.99+ |
Dave Nuti | PERSON | 0.99+ |
John | PERSON | 0.99+ |
1990 | DATE | 0.99+ |
David Nuti | PERSON | 0.99+ |
1999 | DATE | 0.99+ |
John Furrier | PERSON | 0.99+ |
2019 | DATE | 0.99+ |
85 | QUANTITY | 0.99+ |
20 years | QUANTITY | 0.99+ |
Pacific Northwest | LOCATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
20 plus years | QUANTITY | 0.99+ |
August 2019 | DATE | 0.99+ |
hundreds | QUANTITY | 0.99+ |
Zurich | LOCATION | 0.99+ |
98% | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
Open Systems | ORGANIZATION | 0.99+ |
30 | QUANTITY | 0.99+ |
180 degree | QUANTITY | 0.99+ |
North America | LOCATION | 0.99+ |
50 | QUANTITY | 0.99+ |
40 | QUANTITY | 0.99+ |
PowerPoint | TITLE | 0.99+ |
today | DATE | 0.99+ |
58 NPS | QUANTITY | 0.99+ |
over 20 years | QUANTITY | 0.99+ |
first one | QUANTITY | 0.99+ |
115 | QUANTITY | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
One more question | QUANTITY | 0.99+ |
183 countries | QUANTITY | 0.98+ |
One | QUANTITY | 0.98+ |
Alto | LOCATION | 0.98+ |
theCUBE | ORGANIZATION | 0.97+ |
a year and a half ago | DATE | 0.97+ |
CUBE | ORGANIZATION | 0.97+ |
tomorrow | DATE | 0.96+ |
30 year old | QUANTITY | 0.96+ |
one category | QUANTITY | 0.96+ |
one | QUANTITY | 0.96+ |
single platform | QUANTITY | 0.96+ |
90% | QUANTITY | 0.95+ |
Capex | ORGANIZATION | 0.95+ |
one supplier | QUANTITY | 0.94+ |
decades | QUANTITY | 0.94+ |
second company | QUANTITY | 0.91+ |
Palo Alto, California | LOCATION | 0.91+ |
two way | QUANTITY | 0.88+ |
over | DATE | 0.87+ |
two enormous check boxes | QUANTITY | 0.87+ |
a year ago | DATE | 0.83+ |
Silicon Valley, | LOCATION | 0.83+ |
Inforce | ORGANIZATION | 0.82+ |
a year | QUANTITY | 0.74+ |
thousands of trusted advisors | QUANTITY | 0.69+ |
thousands of suppliers | QUANTITY | 0.69+ |
level three | QUANTITY | 0.63+ |
CUBE Studios | ORGANIZATION | 0.61+ |
CUBEConversation | EVENT | 0.61+ |
couple table stakes | QUANTITY | 0.6+ |
agile | TITLE | 0.6+ |
Palo | ORGANIZATION | 0.53+ |
double | QUANTITY | 0.51+ |
Dell Technologies World 2019 Analysis
>> Live from Las Vegas, it's theCUBE, covering Dell Technologies World 2019, brought to you by Dell Technologies and its ecosystem partners.
Throw on massive computer power with Cloud and Moore's Law and Data and A. I U have a changing of the the architecture. But the end of the day the cloud is operating model of distributed computing. If you look at all the theories and pieces of computer science do and networking, all those paradigms are actually playing out in in the clouds. Everything from a IIE. In the eighties and nineties you got distributed networking and computing, but it's all one big computer. And Michael Dell, who was the master of the computer industry building PCs, looks at this. Probably leg. It's one big computer. You got a processor and subsystems. So you know this is what's interesting. Amazon has done that, and if they try to be like the enterprise, like the old way, they could fall into that trap. So if the enterprise stays in the enterprise, they know they're not going out. So I think it's interesting that I see the enterprise trying to like Amazon Amazon trying to get a price. So at the end of the day, whoever could build that system that's scalable the way I think Dell's doing, it's great. I was only scaleable using data for special. So it's a distributed computer. That's all that's going on in the world right now, and it's changing everything. Open source software is there. All that makes it completely different, and it's a huge opportunity. Whoever can crack the code on this, it's in the trillions and trillions of dollars. Total adjustable market >> well, in twenty ten we said that way, noted the gap. There's still a gap between what Amazon could do and what the on Prem guys Khun Dio, we'd argue, is a five years is seven years, maybe ten years, whatever it is. But at the time we said, if you recall, lookit, they got to close the gap. It's got to be good enough for I t to buy into it like we're starting to see that. But my view, it's still not cloud. It doesn't have to scale a cloud, doesn't have the economics cloud. When you peel the onion, it doesn't certainly doesn't have the SAS model and the consumption model of cloud nowhere close yet. Well, and you know, >> here's the drumbeat of innovation that we see from the public cloud. You know where we hit the shot to show this week, the public have allowed providers how many announcements that they probably had. Sure, there was a mega launch of announcements here, but the public lives just that regular cadence of their, you know, Public Cloud. See a CD. We're not quite there yet in this kind of environment, it's still what Amazon would say is. You put this in an environment and it's kind of frozen. Well, it's thought some, and it's now we can get data set. A service consumption model is something we can go. We're shifting in that model. It's easier to update things, but you know, how do I get access to the new features? But we're seeing that blurring of the line. I could start moving services that hybrid nature of the environment. We've talked a few times. We've been digging into that hybrid cloud taxonomy and some of the services to span because it's not public or private. It's now truly that hybrid and multi environment and customers are going to live in. And all of >> the questions Jonah's is good enough to hold serve >> well. I think the reality is is that you go back to twenty ten, the jury in the private cloud and it's enterprises almost ten years to figure out that it's real. And I think in that time frame Amazon is absolutely leveled. Everybody, we call that the tsunami. Microsoft quickly figures out that they got to get Cloud. 
They come in there, got a fast followers. Second, Google's trying to retool Oracle. I think Mr Bo completely get Ali Baba and IBM in there, so you got the whole cloud game happening. The problem of the enterprises is that there's no growth in terms of old school enterprise other than re consolidate in position for Cloud. My question to you guys is, Is there going to be true? True growth in the classic enterprise business or, well, all this SAS run on clouds. So, yes, if it's multi cloud or even hybrid for the reasons they talk about, that's not a lot of growth compared to what the cloud can offer. So again, I still haven't seen Dave the visibility in my mind that on premises growth is going to be massive compared to cloud. I mean, I think cloud is where Sassen lives. I think that's where the scale lives we have. How much scale can you do with consolidation? We >> are in a prolonged bull market that that started in twenty ten, and it's kind of hunger. In the tenth year of a of a decade of bull market, the enterprise market is cyclical, and it's, you know, at some point you're going to start to see a slowdown cloud. I mean, it's just a tiny little portion of the market is going to continue to gain share cloud can grow in a downturn. The no >> tell Motel pointed out on this, Michael Dell pointed out on the Cubans, as as those lieutenants, the is the consolidation of it is just that is a retooling to be cloud ready operationally. That's where hybrid comes in. So I think that realization has kicked in. But as enterprises aren't like, they're not like Google and Facebook. They're not really that fast, so So they've got to kind of get their act together on premises. That's why I think In the short term, this consolidation and new revitalisation is happening because they're retooling to be cloud ready. That is absolutely happen. But to say that's the massive growth studio >> now looked. It is. Dave pointed out that the way that there is more than the market growth is by gaining market share Share share are areas where Dell and Emcee didn't have large environment. You know, I spent ten years of DMC. I was a networking. I was mostly storage networking, some land connectivity for replication like srd Evan, like today at this show, I talked a lot of the telco people talk to the service of idle talk where the sd whan deny sirrah some of these pieces, they're really starting to do networking. That's the area where that software defined not s the end, but the only in partnership with cos like Big Switch. They're getting into that market, and they have such small market share their that there's huge up uplift to be able to dig into the giant. >> Okay, couple questions. What percent of Dell's ninety one billion today is multi cloud revenue. Great question. Okay, one percent. I mean, very small. Okay. Very small hero. Okay? And is that multi cloud revenue all incremental growth isat going to cannibalize the existing base? These? Well, these are the fundamentals weighs six local market that I'm talking to >> get into this. You led the defense of conversations. We had Tom Speed on the CFO and he nailed us. He said There's multiple levers to shareholder growth. Pay down the debt check. He's got to do that. You love that conversation. Margin expansion. Get the margins up. Use the client business to cover costs. As you said, increased go to market efficiency and leverage. The supply chain that's like their core >> fetrow of cash. And that all >> these. 
The one thing he said that was mind blowing to me is that no one gets the valuation of how valuable Del Technologies is. They're throwing off close to seven billion dollars in free cash flow free cash flow. Okay, so you can talk margin expansion all you want. That's great, but there got this huge cash flow coming in. You can't go out of business worth winning if you don't run out of cash >> in the market. When the market is good, these guys are it is good a position is anybody, and I would argue better position than anybody. The question on the table that I'm asking is, how long can it last? And if and when the market turns down and markets always cyclical we like again. We're in the tenth year of a bull market. I mean, it's someone >> unprecedented gel can use the war chest of the free cash flow check on these levers that they're talking about here, they're gonna have the leverage to go in during the downturn and then be the cost optimizer for great for customers. So right now, they're gonna be taking their medicine, creating this one common operating environment, which they have an advantage because they have all the puzzle pieces. You A Packer Enterprises doesn't have the gaping holes in the end to end. They can't address us, >> So that is a really good point that you're making now. So then the next question is okay. If and when the downturn turn comes, who's going to take advantage of it, who's going to come out stronger? >> I think Amazon is going to be continued to dominate, and as long as they don't fall into the enterprise trap of trying to be too enterprising, continue to operate their way for enterprises. I think jazz. He's got that covered. I think DEL Technologies is perfectly positioned toe leverage, the cash flow and the thing to do that. I think Cisco's got a great opportunity, and I think that's something that you know. You don't hear a lot of talk about the M where Cisco war happening. But Cisco has a network. They have a developer ecosystem just starting to get revitalized. That's an opportunity. So >> I got thoughts on Cisco, too. But one of things I want to say about Del being able to come out of that stronger. I keep saying I've said this a number of times and asked a lot of questions this week is the PC business is vital for Del. It's almost half the company's revenue. Maybe not quite, but it it's where the company started it. It sucks up a lot of corporate overhead. >> If Hewlett Packard did not spin out HP HP, they would be in the game. I think spinning that out was a huge mistake. I wrote about a publicly took a lot of heat for it, but you know I try to go along with the HPD focus. Del has proven bigger is better. HP has proven that smaller is not as leverage. And if it had the PC that bee have the mojo in gaming had the mojo in the edge, and Dale's got all the leverage to cross pollinate the front end and edge into the back and common cloud operate environment that is going to be an advantage. And that's going to something that will see Well, let me let me >> let me counter what you just said. I agree. You know this this minute. But the autonomy was the big mistake. Once hp autonomy, you know what Meg did was almost a fatal complete. They never should've bought autonomy >> makers. Levi Protector he was. So he was there. >> But she inherited that bag of rocks. And then what you gonna do with it? Okay, so that's why they had to spend out and did create shareholder value. If they had not purchased autonomy, then he would return much better shape, not to split it up. 
And they would be a much stronger competitor. >> And shareholders got a pop. They had a pop in value, people made some cash. But on the long game, I think that >> to be fair, HP Inc. has actually done pretty well for its first years as a standalone PC company. So, but again, I think Dell, with that leverage across all the pieces, is going to be really interesting. I don't know much about that market. You were loving that PC conversation, but the whole, you know, the new gamer markets and the new way to work, throwing the edge in there, I don't know, is the PC an edge, and is that >> so, the peanut butter. And the big thing, the big thing Michael Dell said on theCUBE was, we're not a conglomerate, we're an integrated company. And when you have an integrated company like this, with the tech landscape shifting to their advantage, you have the ability to cross subsidize. It's a strategy game. Matt Baker was here, and we were talking about, okay, I can cross subsidize margin. You brought it up on the client side: smaller margins, but it pays a lot of the corporate overhead. Absolutely. Then you've got the higher margin EMC business, and you know those margins are contributing. So when you have this new configuration, you can cross subsidize and move and shift, so I think that's a great advantage. I think that's undervalued in the marketplace, and I think, you know, Dell's stock price is, well, undervalued. Look at the numbers: they've got VMware. And the question is, at what point does VMware blink and go all in on Dell Technologies, Stu? Because remember, they were going to partner. You don't think the phone was ringing off the hook in Palo Alto from their partners? What, what's this Azure deal? So VMware has got to be the neutral party. Big problem, big opportunity. >> Well, look, if I'm a traditional, historical partner of VMware, it's not the Azure announcement that has me a little bit concerned, because all of them partner with Microsoft too. It is how tightly combined Dell and VMware are. EMC always kept them at arm's length; now they're in the same family. It's like, Dave, they're blending it. It's like, you know, Dell, from a market cap standpoint, gets fifty cents on the dollar. VMware is a software company, and they get those multiples. Dell is not a software company, but VMware, well, if we can weave that in a little bit, maybe we can get some of that. >> So does it stay split, then? No, no, I think the strategy is absolutely right on. You have to go hard with VMware and use it as a competitive weapon. But, Stuart, your point about fifty cents on the dollar, it's actually much worse than that. I mean, the numbers: if you take out the VMware ownership, you take out the core debt, and you look at the market value, you're left with, like, a billion dollars. Core Dell is undervalued; core Dell is worth more than a billion or two billion dollars. Okay, so it's a really cheap way to buy VMware, right? And Tom Sweet nailed this. He said, you know, basically, the Street's not used to tech companies having such big debt. But to your point, John, they're throwing off cash. So this company is undervalued, in my view. Now there are some risks associated with that, and that's why the investors are penalizing them for that debt, and penalizing them for Michael's ownership structure. You know, that's what this is, but >> it's a lack of understanding, in my opinion. I think you're right. I just think they don't understand.
They look at Dell and they think GE. You've got to look at Dell and think distributed computing system with software; fill in those gaps, and all that extra TAM expansion, it's legit. I think they can go after new market opportunities, adjacencies to the client business. I mean, just the trade-ins and refresh alone, that's massive, trillions of dollars. I think that is huge. But I'm >> a bull. I'm a bull on the value of the company. I know >> guys, most important developments, Dell Technologies World. What's the big story that you think is coming out of the show here? >> Well, it's definitely, you know, VMware on Dell. I mean, that is the big story, and it's to your point: it's Dell basically saying, we're going to integrate this, we're going to go hard, and you know, VMware on Dell is a preferred solution. No doubt that is top for Dell, and Pat Gelsinger said it: VMware on AWS is the first and preferred solution. Those are the two primary vectors they're going to drive hard, and then, oh yeah, we'll listen to customers, whatever else you want, Google, Azure, fine, we're there. But those two vectors, they're going to drive, David. >> I'll build on that, because we saw VMware building out the multi cloud strategy, and what we have today is Dell now putting themselves in there as a first class citizen. Before it was like, oh, we're doing VxRail and NSX and, you know, we'll integrate all these pieces, but it was infrastructure, infrastructure, infrastructure. Now it is multi cloud. We want a seat at the big table. >> Right, Jeff Clarke said, why are you doing both? Let's have just one strategy, one company; it's all one cash register. >> Seems like we've heard that before. I think the biggest story to me is something that we've been seeing on theCUBE a lot, you know. This horizontally scalable operating environment is the land grab, and then you vertically integrate with data into applications that allow each vertical industry to leverage data for the kind of intimate, personalized user experiences in each industry. Oil and gas, public sector, each one has got their own experiences that are unique. Data drives that, but the horizontal, end-to-end operating model, whether it's on premises, hybrid, or multi cloud, is a huge land grab. And I think that is a major strategic win for Dell, if no one challenges them on this. Dave, if HPE doesn't go make an M&A change, if HPE does not do M&A and a complete changeover of strategy and fill in their end to end, I think they're going to be really hurting. I think there's going to be a tell sign, and we'll see who reacts and challenges Dell on this in the end. And I think they can pull it off if they're not contested. >> The only thing I would say, John, is you know HP very well. I mean, they've got a lot of loyal customers, and it's a huge market out there. So it's >> look at the economics. The economics are shifting in the new world. New use cases, a new step function of user experiences. This is going to be new user experiences at new economic price points; that's business model innovation. Loyal customers, that's hard to sustain. They'll keep some through clutching and grabbing, but everyone will move to the better mousetrap in that scenario. So the combination of that stability with software, it's just, this is a big market. >> So John, 2010, a little table in the back corner, you know, at EMC World, the blogger set.
Now a beautiful set, a theater of the present, and a lot has changed in the industry. But the partnership and support of this ecosystem is something that helped us along the way. >> You know, when we started doing this, Jeff came on board, the team has been amazing. We have been growing up and getting better every show, small incremental improvements here and there. It has been an amazing production, an amazing team all around us. But the support of the community to do this, it has been a co-creation project from day one. We love having these conversations with smart people, tech athletes; it makes it unique, makes it organic. Let the paid stuff and the other literature pieces go elsewhere; here it's about conversations for and with the community, and I think the community sponsorship has been part of funding more of it. You're seeing more CUBEs: soon it will be four sets at AWS, four sets at VMworld, four sets here, global partner sets. Stu, what have we missed? >> Yeah, it's phenomenal. You know, we're at a unique time in the industry, and I'm honored to be able to help document it with the two of you and the whole team. >> Dave, Howard Elias was sitting there; give him some kind of a victory lap, because we've been doing this for ten years. He's been one of the co-captains of the integration. He deserves a lot of credit. >> Yeah, Howard has had an amazing career. I met him literally decades ago, and he has always taken on the really hard jobs. I mean, that's, I think, part of the secret of his success. He took on the integration, he took on the services business at EMC. You remember, Stu, when Joe said, we're a product company, not a services company? He was like, give me services, I'll take it. >> He's been on theCUBE ten years, Dave. He was on fire this week. I thought Pat Gelsinger was phenomenal too. >> Yeah, he's an amazing guest. Tom Sweet, you know, very strong moments. >> What's your favorite CUBE moment? I'll never forget, I had my little camera out filming Joe Tucci at one of the sessions, and some commentary in the hallway. >> Well, that was 2010, or maybe 2011. I think one of my favorite 2010 moments, going back to the first time we did theCUBE, was when you asked Joe Tucci, you know, why is storage sexy? Remember that? >> And he never came on >> again. Ah, but that was a meme. You're right, that was a CUBE meme for the next couple of years. Remember, Tom Georgens, we had, because I'm not touching that one. That was >> so, remember when we were critical of hybrid cloud, like 2012, 2013? I go, Pat, is hybrid cloud a halfway house to the final destination of public cloud? He got the halfway house question in three interviews. The whole crowd was like, what just happened? Stu, favorite moment? >> Oh gosh, there are so many here, John. As you said, it's just such a community. Love, you know, the people that we've had on for ten years, and then, you know, it took us three or four years before we had Michael Dell on. Now he's a regular on our program, with the luminaries we've had on, you know. But yeah, I mean, 2010, you know, that was actually my last week working for EMC. So, Dave, thanks for popping me out. It's been a fun ride, and yeah, I mean, it's amazing to be able to talk to this whole community. >> Favorite moment was when we were at AWS, our first show. We were hustling like hell on that one. James Hamilton, Andy Jassy, come on up. Very small show.
Now it's a monster. David, theCUBE has had some good luck. Well, we've been on the right waves, and a lot of companies have sold their companies, been CUBE alumni, gone public, unicorns; some came on early on when no one understood the company. >> What I'm thrilled about too, John, is we're now a decade in, and we're documenting a lot of the big waves. One of the most memorable moments for me was when you called me up and said, hey, we're doing Hadoop World in New York. I got on a plane and went out. I landed at, like, two thirty in the morning, you met me, and we did Hadoop World. Nobody knew what Hadoop was back then; it became, like, the hottest thing going. Now nobody talks about Hadoop. So we're seeing these waves, and theCUBE was able to document them. It's really >> a pleasure. TheCUBE keeps going, and we've got theCUBE Studios coming, with CUBE Stories, and the CUBE Network too. Cube all the time, guys. Thanks, it's been a pleasure doing business with you here. Dell Technologies, shout out to Chuck and the team, Sonia, Gabe, everyone else. Guys, great job, excellent set, good show. That closes down Dell Technologies World and two CUBE sets of coverage. Thanks for watching
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
David | PERSON | 0.99+ |
Jeff | PERSON | 0.99+ |
John | PERSON | 0.99+ |
Jeff Clarke | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Michael Dell | PERSON | 0.99+ |
Stuart | PERSON | 0.99+ |
Sonia | PERSON | 0.99+ |
Tom Speed | PERSON | 0.99+ |
Joe Tucci | PERSON | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
Matt Baker | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Tom Sweet | PERSON | 0.99+ |
Del Technologies | ORGANIZATION | 0.99+ |
Michael | PERSON | 0.99+ |
Howard | PERSON | 0.99+ |
Joe | PERSON | 0.99+ |
Steve | PERSON | 0.99+ |
Marius Haas | PERSON | 0.99+ |
Tom Georges | PERSON | 0.99+ |
three | QUANTITY | 0.99+ |
New York | LOCATION | 0.99+ |
ORGANIZATION | 0.99+ | |
Dell | ORGANIZATION | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
James Hamilton | PERSON | 0.99+ |
Gabe | PERSON | 0.99+ |
ORGANIZATION | 0.99+ | |
Palo Alto | LOCATION | 0.99+ |
Pat Kelsey | PERSON | 0.99+ |
tenth year | QUANTITY | 0.99+ |
fifty cents | QUANTITY | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
one percent | QUANTITY | 0.99+ |
seven years | QUANTITY | 0.99+ |
ten years | QUANTITY | 0.99+ |
HP | ORGANIZATION | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
five years | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
Boston | LOCATION | 0.99+ |
ninety one billion | QUANTITY | 0.99+ |
del Technologies | ORGANIZATION | 0.99+ |
Meg | PERSON | 0.99+ |
Hewlett Packard | ORGANIZATION | 0.99+ |
Kelsey | PERSON | 0.99+ |
Dale | PERSON | 0.99+ |
White | PERSON | 0.99+ |
David The Cube | PERSON | 0.99+ |
Dave Cloud | PERSON | 0.99+ |
more than a billion | QUANTITY | 0.99+ |
Andy Jazzy | PERSON | 0.99+ |
Stew | PERSON | 0.99+ |
Scott McNealy | PERSON | 0.99+ |
GMC | ORGANIZATION | 0.99+ |
Shuyi Chen, Uber | Flink Forward 2018
>> Announcer: Live from San Francisco, it's theCUBE covering Flink Forward, brought to you by data Artisans. (upbeat music) >> This is George Gilbert. We are at Flink Forward, the user conference for the Apache Flink community, sponsored by data Artisans, the company behind Flink. And we are here with Shuyi Chen from Uber. Shuyi works on a very important project, the Calcite query optimizer, a SQL query optimizer that's used in Apache Flink as well as several other projects. Why don't we start with, Shuyi, tell us where Calcite's used and its role. >> Calcite is basically used in the Flink Table and SQL API, as the SQL parser and query optimizer and planner for Flink. >> OK. >> Yeah. >> So now let's go to Uber and talk about the pipeline or pipelines you guys have been building, and then how you've been using Flink and Calcite to enable the SQL API and the Table API. What workloads are you putting on that platform, or on that pipeline? >> Yeah, so basically I'm the technical lead of the stream processing platform at Uber, and we use Apache Flink as the stream processing engine for Uber. Basically we build two different platforms. One is called AthenaX, which uses Flink SQL, so it basically enables users to use SQL to compose their stream processing logic. And we have a UI, and with one click, they can just deploy the stream processing job in production. >> When you say UI, did you build a custom UI to essentially turn it into a business intelligence tool, so you have a visual way of constructing your queries? Is that what you're describing, or? >> Yeah, it's similar to how you write a SQL query to query a database. We have a UI for you to write your SQL query, with all the syntax highlighting and all the hints, so that even the data scientists and non-engineers in general can actually use that UI to compose stream processing jobs. >> Okay, give us an example of some applications, 'cause this sounds like it's a high-level API, so it makes it more accessible to a wider audience. So what are some of the things they build? >> So for example, our Uber Eats team uses the SQL API as the stream processing tool to build their Restaurant Manager dashboard. >> Okay. >> So basically, the data log lives in Kafka and gets streamed in real time into the Flink job, which is composed using the SQL API, and then that gets stored in our OLAP database, Pinot. Then when the restaurant owners open Restaurant Manager, they will see the dashboard of their real-time earnings and everything. And with the SQL API, they no longer need to write the Flink job; they don't need to use Java or Scala code, or do any testing or debugging. It's all SQL. >> And then what's the SQL coverage, the SQL semantics that are implemented in the current Calcite engine? >> So it's about basic transformation, projection, hopping and tumbling windows, and also joins, group by, and having, not to mention the event time and processing time support. >> And you can shuffle from anywhere? You don't have to have two partitions with the same join key on one node; the data placement can be arbitrary for the partitions? >> Well, the SQL is declarative, right? And so once the user composes the logic, the underlying planner will actually take care of the key by and group by, everything.
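Editor's note: to make the discussion above concrete, here is a minimal sketch of the kind of windowed query a Flink SQL platform like the one Shuyi describes could run. It is only an illustration: it uses the Table/SQL API of a later Flink release than the one discussed in this 2018 interview, and the table name, fields, and Kafka connector options are hypothetical assumptions, not Uber's actual schema or setup.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class RestaurantEarningsSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Hypothetical Kafka-backed source table; a platform UI would normally
        // generate this definition from a catalog rather than the user typing it.
        tEnv.executeSql(
            "CREATE TABLE order_events (" +
            "  restaurant_id STRING," +
            "  amount        DOUBLE," +
            "  event_time    TIMESTAMP(3)," +
            "  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'order_events'," +
            "  'properties.bootstrap.servers' = 'kafka:9092'," +
            "  'format' = 'json')");

        // Per-restaurant earnings over 1-minute tumbling event-time windows:
        // projection + group by + window, the kind of SQL coverage described above.
        Table earnings = tEnv.sqlQuery(
            "SELECT restaurant_id, " +
            "       TUMBLE_START(event_time, INTERVAL '1' MINUTE) AS window_start, " +
            "       SUM(amount) AS earnings " +
            "FROM order_events " +
            "GROUP BY restaurant_id, TUMBLE(event_time, INTERVAL '1' MINUTE)");

        // The planner (Calcite underneath) decides how to key, shuffle, and place
        // the aggregation; the query author only states what result they want.
        tEnv.toDataStream(earnings).print();
        env.execute("restaurant-earnings-sketch");
    }
}
```

Written this way, the query author never touches keyBy or partitioning, which matches the point above that the SQL is declarative and the planner handles placement and co-location.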
>> Okay, 'cause the reason I ask is many of the early Hadoop-based MPP SQL engines had the limitation where you had to co-locate the partitions that you were going to join. >> That's the same thing for Flink. >> Oh. >> But the SQL part just takes care of that. >> Okay. >> So you just describe what you want, but underneath it gets translated into a Flink program that actually will do all the co-location. >> Oh, it does it for you, okay. >> Yeah, yeah. So now they don't even need to learn Flink, they just need to learn the SQL. >> Now, you said there's a second platform that Uber is building on top of Flink. >> Yeah, the second platform is what we call the Flink as a service platform. The motivation is, we found that SQL actually cannot satisfy all the advanced needs at Uber for building stream processing. The reason is, for example, they may need to call RPC services within their stream processing application, or even chain the RPC calls, which is hard to express in SQL. And also, when they have a complicated DAG, like a workflow, it's very difficult to debug individual stages, so they want the control to actually use the native Flink DataStream API or DataSet API to build their stream or batch job. >> Is the DataSet API the lowest level one? >> No, it's on the same level as the DataStream API; it's one for streaming, one for batch. >> Okay, DataStream, and then the other was Table? >> DataSet. >> Oh, DataSet, DataStream, DataSet. >> Yeah. >> And there's one lower than that, right? >> Yeah, there's one lower API, but usually most people don't use that API. >> So that's system programmers? >> Yeah, yeah. >> So then tell me, who is using, like what type of programmer uses the DataStream or the DataSet API, and what do they build at Uber? >> So for example, in one of the talks later, there's a marketplace dynamics team that's actually using the platform to do online model updates, machine learning model updates, using Flink. Basically they need to take in the model that is trained offline, do a few group bys by time and location, then apply the model, and then incrementally update the model. >> And so are they taking a window of updates and then updating the model and then somehow promoting it as the candidate, or? >> Yeah, yeah, yeah. Something similar, yeah. >> Okay, that's interesting. And what type of, so are these the data scientists who are using this API? >> Well, data scientists, not really; it's not designed for data scientists. >> Oh, so they're training the models offline, they're preparing the models offline, and then the models are being updated inline on the stream processing platform. >> Yes. >> And so it's maybe data engineers who are essentially updating the features that get fed in and are continually training, or updating, the models. >> Basically it's an online model update. So as Kafka events come in, it continues to refine the model. >> Okay, and so as Uber looks out a couple of years, what sorts of things do you see adding to either of these pipelines, and do you see a shift away from the batch and request-response type workloads towards more continuous processing? >> Yes, actually we do see that trend. Before joining the stream processing platform team at Uber, I was in marketplace as well, and at that point we always saw there's a shift, like people would love to use stream processing technology to actually replace some of the normal backend service applications. >> Tell me some examples.
>> Yeah, for example... So in our dispatch platform, we have the need to actually shard the workload, by riders for example, to different hosts to process. For example, to compute, say, ETA, or compute some time averages. Before, this was done in backend services, using our internal distributed systems to do the sharding. But actually, with Flink, this can just be done very easily, right? And so there's a shift: those people also want to adopt stream processing technology, as long as it's not a request-response style application. >> So the key thing, just to make sure I understand, is that Flink can take care of the distributed joins, whereas when it was a database-based workload, a DBA had to set up the sharding, and now it's sort of more transparent, more automated? >> I think it's more about the support. Before, people writing backend services had to write everything: the state management, the sharding, everything, they need to-- >> George: Oh, it's not even database-based-- >> Yeah, it's not a database, it's real time. >> So they have to do the physical data management, and Flink takes care of that now? >> Yeah, yeah. >> Oh, got it, got it. >> For some of the applications it's real time, so we don't really need to store the data in the database all the time. It's usually kept in memory and somehow gets snapshotted. But for a normal backend service, the writer has to do everything. With Flink, there is already built-in support for state management and all the sharding, partitioning, and the time window and aggregation primitives. It's all built in, and they don't need to worry about re-implementing the logic and re-architecting the system again and again. >> So it's a new platform for real time; it gives you a whole lot of services, a higher abstraction for real-time applications. >> Yeah, yeah. >> Okay. All right, with that, Shuyi, we're going to have to call it a day. This was Shuyi Chen from Uber talking about how they're building more and more of their real-time platforms on Apache Flink and using a whole bunch of services to complement it. We are at Flink Forward, the user conference of data Artisans for the Apache Flink community, we're in San Francisco, this is the second Flink Forward conference, and we'll be back in a couple of minutes, thanks. (upbeat music)
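Editor's note: the keyed sharding and built-in state management described in this interview map onto Flink's DataStream API roughly as in the toy sketch below. This is illustrative only and assumes a Flink 1.x-style API: the TripEvent and EtaModel classes, field names, and the trivial update rule are hypothetical stand-ins, not Uber's dispatch or marketplace code, and a real job would read events from Kafka rather than a static element.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Hypothetical event and model types: illustrative stand-ins only.
class TripEvent {
    public String zoneId = "zone-1";
    public double observedEta = 300.0;
}

class EtaModel {
    public double weight = 1.0;
    public double predict(TripEvent e) { return weight * e.observedEta; }
    public void update(TripEvent e)    { weight = 0.9 * weight + 0.1; } // toy incremental update
}

public class OnlineModelUpdateSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Stand-in source; a real job would consume trip events from Kafka.
        DataStream<TripEvent> trips = env.fromElements(new TripEvent(), new TripEvent());

        trips
            // Shard the work per key (a zone here; riders or drivers in the dispatch example above).
            .keyBy(e -> e.zoneId)
            .process(new KeyedProcessFunction<String, TripEvent, Double>() {
                private transient ValueState<EtaModel> modelState;

                @Override
                public void open(Configuration parameters) {
                    // Per-key state that Flink partitions, checkpoints, and restores for us.
                    modelState = getRuntimeContext().getState(
                        new ValueStateDescriptor<>("eta-model", EtaModel.class));
                }

                @Override
                public void processElement(TripEvent event, Context ctx, Collector<Double> out)
                        throws Exception {
                    EtaModel model = modelState.value();
                    if (model == null) {
                        model = new EtaModel(); // in practice, bootstrapped from an offline-trained model
                    }
                    out.collect(model.predict(event)); // score with the current model
                    model.update(event);               // then refine it incrementally
                    modelState.update(model);
                }
            })
            .print();

        env.execute("online-model-update-sketch");
    }
}
```

The point of the sketch is the division of labor Shuyi describes: the job author keys the stream and reads or writes per-key state, while Flink handles the partitioning, checkpointing, and recovery that a hand-rolled backend service would otherwise have to implement.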
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Uber | ORGANIZATION | 0.99+ |
Shuyi Chen | PERSON | 0.99+ |
George Gilbert | PERSON | 0.99+ |
San Francisco | LOCATION | 0.99+ |
George | PERSON | 0.99+ |
Flink | ORGANIZATION | 0.99+ |
second platform | QUANTITY | 0.99+ |
Shuyi | PERSON | 0.99+ |
Java | TITLE | 0.99+ |
SQL | TITLE | 0.99+ |
Kafka | TITLE | 0.99+ |
Uber Eats | ORGANIZATION | 0.99+ |
one click | QUANTITY | 0.99+ |
SQL Query Optimizer | TITLE | 0.99+ |
SQL POSSTR | TITLE | 0.98+ |
second | QUANTITY | 0.98+ |
Calcite | TITLE | 0.98+ |
two partitions | QUANTITY | 0.97+ |
SQL API | TITLE | 0.97+ |
Calcite Query Optimizer | TITLE | 0.97+ |
Flink Forward | EVENT | 0.96+ |
a day | QUANTITY | 0.95+ |
one | QUANTITY | 0.95+ |
Flink Table | TITLE | 0.94+ |
Apache Flink | ORGANIZATION | 0.94+ |
one node | QUANTITY | 0.88+ |
Flink | TITLE | 0.83+ |
two different platforms | QUANTITY | 0.82+ |
couple years | QUANTITY | 0.82+ |
Table | TITLE | 0.82+ |
Apache | ORGANIZATION | 0.8+ |
Artisans | ORGANIZATION | 0.78+ |
2018 | DATE | 0.77+ |
Hadoop | TITLE | 0.73+ |
one for | QUANTITY | 0.69+ |
couple minutes | QUANTITY | 0.65+ |
AthenaX | ORGANIZATION | 0.64+ |
Flink Forward | TITLE | 0.56+ |
Forward | EVENT | 0.52+ |
DBA | ORGANIZATION | 0.5+ |
MPP | TITLE | 0.47+ |
Stefan Renner, Veeam & Darren Williams, Cisco | Cisco Live EU 2018
>> Announcer: From Barcelona, Spain, it's theCUBE covering Cisco Live 2018. Brought to you by Cisco, Veeam and theCUBE's ecosystem partners. >> Here in Barcelona, Spain, it's theCUBE's exclusive coverage of Cisco Live 2018 in Europe. I'm John Furrier, co-host of theCUBE, with my partner in crime this week, Stu Miniman, Senior Analyst at Wikibon, also co-host of many events across the world covering networking, storage, Cloud, you name it. Stu is on the set with me. Stu, thanks, nice seeing you. Stefan Renner, Technical Director, Global Alliances at Veeam Software, is with us, along with Darren Williams, @MrHyperFlex, that's his Twitter handle, go check him out, from the HyperFlex team at Cisco. Guys, welcome to theCUBE. >> Thank you. >> Also love the Twitter handle. >> Darren: I live the brand. >> You live the brand. I mean, that's got some longevity to it, it's evergreen, so congratulations on that. You guys are here together with Cisco and Veeam, what's the story? What's going on in Europe with Cisco and Veeam? >> I would say there is a lot of stuff going on between Cisco and Veeam, especially around the HyperFlex story, which is obviously the topic of this session, right? So having integration with HyperFlex, having a good go-to-market, having a good relationship between the two companies. We just joked about how often we've been in front of cameras talking about this exact same topic, so that shows that the relationship between the two of us is really moving forward and in good shape. >> I think we're in good shape in terms of, you think about not just my product, HyperFlex, but you look at what Veeam can do for the rest of the Cisco data center products, and be that backup, safe pair of hands around what we need in terms of that data protection layer. But also then, what we can add in terms of that target, to be the server of choice for backups, so you get the benefits of the speed and performance, and more importantly, you get quicker restores. Because that's the important bit: you need to be able to do the quick restore. >> Yeah, we usually talk about availability, right? We don't talk about backups or recovery. Even if recovery is maybe the most important part of availability, still we talk more about availability than maybe anything else. The good thing about Cisco is that they actually can deliver what we need in terms of performance, in terms of capacity, in terms of compute resources. So yeah, that's a real benefit. >> It's such an interesting time. I mean, we look back at history, go back 10 years ago, maybe more; backup and recovery, that's like, "Oh, we forgot to talk about that in our RFP." Kind of bolted on, kind of retrofitted in. But now we've seen it come front and center. More importantly, with AI and Cloud, and all the action happening with DevOps on premises, you hear CIOs and CXOs and developers saying, "We're data driven." >> Yeah. >> Okay, so if you're data driven, you have to be data protection driven too. Those things go hand in hand. So the question for you guys is, how does a data driven organization, whether it's in the data center, all the way up to the business units or the business processes, get data protection built in? How do they design in, from day one, a data protection system up and down the stack? >> Yeah, so maybe I'll start to answer that question. I think when I go to customers, and I fully agree with what you just said, most customers 10 years ago were focused on getting new platforms and getting new storage systems. It used to be an isolated project, right?
Now, these days when I go to customers, I try to convince them to include data protection in every project they do in the data center, because at the end, data protection is one of the core elements. >> So designing it in early, at the front end? >> I'd say whenever you're getting a new HyperFlex system, or whenever you talk about replacing your existing environment, whatever you do, right, just look into data protection, look into your availability story. Because right now, and you mentioned that, it's about data services, right? We don't really talk about restoring a VM, we don't talk about restoring a single file. The customer wants data availability in terms of service availability, and that includes more than just the VM, it includes more than just the single thing, right? So they need to include data protection, and the design of that, in the whole architecture, from the beginning. >> And your point? >> Yeah, we look at it from a similar angle, in terms of the changes happening in the way people are looking at how they want to design their applications and where they want their data to live. And that's the whole messaging around 3.0: it's that multi-Cloud readiness platform. Being able to think about an application and go, "Do I want to design it in public and house it privately, or vice versa? Do I want to house the data of the application in a private location and the actual application in public?" Having that be transparent to a user in terms of the way they design it and then position it. But also, as we look at other applications, not all people on this journey are going to go, "We're going to put everything in the Cloud." They're going to look at maybe having a little bit in the Cloud, and a little bit of the traditional apps we need to manage and protect. And it's all about that 3.0: we delivered the pre-multi-Cloud offering around Hyperconvergence, and we've now brought the multi-Cloud element. It's giving you the choice of where you want to position things, where you want to house things, how you want to design things, and keeping it nice and simple for customers, with the agility and performance. >> Darren, some really interesting points that you just had there. When I think back to a few years ago, Hyperconverged was pretty strong in North America, but it was project based; it was like, let's take VDI, some virtualized environment. It wasn't a Cloud discussion. >> Darren: Correct. >> Take us inside what you're seeing in Europe here, because today Hyperconverged is a lot about Cloud, that kind of hybrid or multi-Cloud environment, so what are you hearing from your customers? >> Absolutely, and I think if you look at what's happened with Hyperconvergence up to this point, it's the initial building block of this multi-Cloud. And we're seeing more and more customers now; I think the latest IDC survey showed that 87% of all customers have a multi-Cloud strategy. And we're seeing more of the ability to think of Hyperconvergence as part of that multi-Cloud strategy, to take the simplicity people have had with the initial thinking around a single application, how they can collapse the layers, and now carry that experience into the multi-Cloud experience. And we're seeing more and more of that. We've now got 2,500 customers around the world on HyperFlex, about 700 to 800 in EMEA, and the majority of those are using it as a private Cloud experience.
They're getting the benefits of what they've had in the Cloud, and getting away from the sovereignty issues and the shadow IT issues that they all face. They can now bring it back into their own data center. They can start small. They can spin out applications very quickly. They're getting the benefit of that Cloud message, but locally now. >> And I think that perfectly aligns with the Veeam story, because as you know, we are also focusing on the Cloud. We recently changed and also did some acquisitions around the Cloud, so we're also moving forward in the Cloud story and the hybrid Cloud area. And that's more or less what Cisco's multi-Cloud story is also about, right? And I think one thing we should also mention here, coming back a bit to how to implement and how to design such solutions, is having more of a broad view on the whole project. I think one important thing for customers is the CVDs Cisco has, right? And we do have a CVD available between Veeam and Cisco on the data protection layer. So we try to make it really easy for customers and for partners to design, implement, and actually make the right decisions for those projects. >> Stefan, at VeeamON, of course, a lot of partners, a lot of talk about multi-Cloud, and of course Veeam has a long history with VMware, but why don't you talk about Microsoft? I believe there are some things you've been doing lately with Hyper-V and the like, what's the update? >> Yeah, so obviously with HyperFlex there is Hyper-V coming, right? That's one of the bigger things coming to HyperFlex. Now for us, when we started to talk with Cisco, Cisco actually told us that Hyper-V is next, in 3.0. We said that's fine for us, because as I said, we have been dealing with Hyper-V like we have with VMware for a couple of years now. So there is no big difference in terms of features and what we can do with Hyper-V. On the Microsoft side, obviously, it's around Azure Stack, which also is a big story with Cisco and Veeam, because there is an Azure Stack solution, and so we try to get Azure Stack fully integrated into the Veeam portfolio. And it's about all of that, right? As we just talked about, making this Cloud journey even easier for the customer, making sure we have data protection covered, and making sure we can actually use our Cloud solutions to provide the full experience in the cloud. >> So the question from the European audience, I was just looking at some Twitter tweets, getting in some feedback, is, "Ask the GDPR question," which is basically code words for the sophistication needed around data protection; you know, we say you get bitten in the butt if you don't prepare. And this is one of those things where, I mean literally, there's so much data out there, people can't understand their own tables. I mean, if you have accounts, how do I know a user uses a certain name in this one, and I've got a certain name in this database? It's just a nightmare to even understand what data you have, never mind taking someone out of a database. >> Yeah. >> So the challenges are massive. >> Yep. >> This is coming down, and it really highlights the bigger trend: what do I do with the data, what is my protection, what's my recovery, how do I engage in real time with the GDPR issue? Talk about the GDPR issue, and then what it really is going to mean for customers going forward. >> Well, I think if you think about GDPR, people have got the understanding that it's just an EMEA thing; it's not. It's a worldwide thing. Any data that relates to a European citizen, anywhere in the world, is covered under the GDPR.
So you've got to think about the multinationals we work with; they have to have these GDPR thoughts even if they're not based in EMEA, because they may house data about a European citizen. So it's a massive thing. Now, not one person or one organization can fix GDPR; we're all part of a bigger framework. So if you look at the HyperFlex offering, having self-encrypting drives, having good data protection and replication of the data so it's protected, that protects the actual content of a record, but it doesn't solve everything around GDPR. There's no one organization that can do that. It's about having that framework: if you make the right decisions around the architecture and the data protection, you'll get there in terms of the protection. >> Well, I mean, I'm just going to rant here and say whoever came up with GDPR doesn't know anything about databases, okay. >> Darren: Yeah. >> I mean, I get the concept, but just think about how hard it is to deal with unstructured data, and structured data in and of itself, within a company. Never mind inside a company, what's happening externally; it is a technical nightmare. And so, yeah, just hand waving, "Hey, someone came to your website." Well, did they come in anonymously, did they log in, which identity did they log in with? I mean, it's a nightmare. This is a huge problem. What do customers do? >> I think if you talk about GDPR, it's first of all not about a single solution, right? It's not an issue of just one company, or one vendor, one solution. It goes across different databases, different applications, different software. So as you said, with database solutions you may need to delete a single table entry, which is almost impossible right now, especially if that's in a backup, right? How are you going to do that? I think between Cisco and us, and he mentioned that, one important part of GDPR is data protection itself. So the customers need to make sure they can actually promise, and can show to the government, that they have proper data protection in place, so they can show: what does my DR plan look like? How do I recover? What is my RPO? Those issues we can already solve. >> It changes your game because it turns you from an insurance policy into something proactive; in order to do data protection, you actually have to know what the data is. So it kind of creates an opportunity to say, hey, we're going to start thinking about kind of a new e-discovery model. >> If you look at 3.0, the multi-Cloud platform, we were discussing how Hyperconvergence started very small, in certain apps. But when you actually expand that out into the multi-Cloud, security is a major pillar. You've got to have the security elements, and Cisco has some great security offerings in the data center and outside of the data center. They all form part of that GDPR message, and it's been baked into multi-Cloud 3.0 as a key component to give customers that confidence. >> It's going to be a Hyperconvergence of databases. So this is coming. >> Darren: Yeah. >> So this is going to force, I think, the compliance issue; it's going to be more of a shot across the bow, if you will. I don't know how hardcore they're going to be about enforcing it. >> It's going to be interesting with the first one, because at the moment I think a lot of customers are thinking, "Well, we'll wait till we see how big the fines are, and then we'll decide." >> They're going to create shell corporations in the Cayman Islands.
(laughter) >> Alright, so we've talked a little bit about some of the headwinds we're facing in IT. Talk about the tailwinds. A lot of things are in HyperFlex 3.0, you've got 700-800 customers; what's going to drive adoption and get that into thousands of customers here in 2018? >> So I think it's the simplicity message. Customers want ease of use of technology. They want to get away from what they've had before, where they've had tough times standing up applications, where they've had to invest time in different skill sets for the infrastructure, be it networking, be it storage, be it compute, having three teams leaning on each other, and change windows. So the simplicity message of HyperFlex is you can have a three-node cluster up and running in 34 minutes, including the network. We're the only ones that incorporate the network into the solution, and we do it for good reason: because then we can get predictability in performance, and we can grow the solution very, very easily. And that's the whole point of what they're doing: they want to be able to start small and add more nodes when required, around what applications they're going to deploy. Our tagline is "any application, anywhere" now, in either a private location or into that multi-Cloud location. It gives customers choice, and I think as we start seeing more and more customers, 700 in just under two years is a phenomenal amount in EMEA, and 2,500 worldwide; we've had some great traction. And it's just going to get faster and faster. >> Yeah, I think a lot of customers are obviously talking about moving to the Cloud completely, or at least the majority of the data. For the customers that stay, and I talked with some customers today, they told me, "For us right now, we can't focus anymore on the data center itself. We have much more difficult and more important topics to talk about and to cover in our IT business than the basic data center itself." That includes compute, that includes digitalization. So it's great to hear you can actually set up a HyperFlex system, no matter if that's Hyper-V or VMware or whatever, in less than an hour, right? And if I tell you now that adding Veeam on top of that, to provide the availability for the HyperFlex environment, is also less than an hour, then if you know how to configure it, you can be done in a couple of hours, and you have more or less the whole data center set up. >> You bring up a really good point. What are customers concerned about? I have to worry about my application portfolio, I have my security issue, my whole Cloud strategy piece. So if the infrastructure piece is just invisible and I don't have to touch it, tweak it, and do that, I'm going to have time to actually grow my business. >> The more integrated it is, the easier it is to set up, and to maintain and troubleshoot, by the way; that's also an important thing, right? What if it doesn't work? If there is a consistent layer, a consistent way to get all this information together to get the troubleshooting done, the better it is for our customers. Because again, they don't want to care anymore about what's happening in the back end. >> And that's the next challenge we're addressing with our product Intersight: taking that management solution into the Cloud to make things easier for customers, and being able to take a lot of the things we have in point products into a Cloud model, so the likes of analytics, the likes of smart TAC support. Customers get fed up when they have an issue and they have to go and roll the logs up to TAC, and then go and FTP them. They get away from that; they don't need to do that with Intersight. And we're talking about the deployment of technology; well, one of the first benefits of Intersight is HyperFlex. We can roll out sites without even visiting them. You just do a Cloud deployment and Cloud management, and it's job done. >> And this is the whole point we were kind of getting at earlier; connect it back to the compliance issue: these agile-like things are happening, and they're throwing off data too. So now you've got to organize the data; you can't protect what you don't understand. >> Correct. >> I mean, that is ultimately the bottom line for what's happening here. >> Yeah, you can't protect what you don't understand, I think that's a good conclusion of the whole thing. And I think for us >> By the way, when you guys use that tagline, I want royalties. But it's true. (laughter) We'll get back to you on that. No, but this is a big problem. Protection inherently assumes you know what the data is. >> Stefan: Yeah. >> Darren: Yeah. >> There it is. >> That's for sure the case, and one thing we worked on, and, you know, announced a couple of months ago, is the Veeam Availability Orchestrator, which is another layer on top of it. So he just talked about how they can deploy HyperFlex within a site, or across multiple sites, very easily. And for us it's about, you know, giving the customer an easy solution, with all the successful recoveries and failovers across the data centers, with the Availability Orchestrator. >> Data is the competitive advantage; data is messy if you don't control it and rein it in. Of course, theCUBE is doing its part and bringing the data to you guys here on theCUBE, with the Veeam and Cisco partnership. I'm John Furrier with Stu Miniman, breaking it down here at Cisco Live in Europe 2018. Live coverage with theCUBE. Be back with more after this short break. (techno music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Darren Williams | PERSON | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
Stefan | PERSON | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
John Furrier | PERSON | 0.99+ |
Stefan Renner | PERSON | 0.99+ |
Darren | PERSON | 0.99+ |
Cayman Islands | LOCATION | 0.99+ |
2018 | DATE | 0.99+ |
Veeam | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Europe | LOCATION | 0.99+ |
34 minutes | QUANTITY | 0.99+ |
two companies | QUANTITY | 0.99+ |
GDPR | TITLE | 0.99+ |
87% | QUANTITY | 0.99+ |
less than an hour | QUANTITY | 0.99+ |
one solution | QUANTITY | 0.99+ |
Barcelona, Spain | LOCATION | 0.99+ |
North America | LOCATION | 0.99+ |
Veeam Software | ORGANIZATION | 0.99+ |
2500 users | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
3 teams | QUANTITY | 0.99+ |
theCUBE | ORGANIZATION | 0.99+ |
one vendor | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
EMEA | LOCATION | 0.99+ |
Stu | PERSON | 0.99+ |
one company | QUANTITY | 0.99+ |
one person | QUANTITY | 0.99+ |
Hyperflex | ORGANIZATION | 0.99+ |
700-800 customers | QUANTITY | 0.98+ |
single file | QUANTITY | 0.98+ |
single solution | QUANTITY | 0.98+ |
700 | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
IDC | ORGANIZATION | 0.98+ |
10 years ago | DATE | 0.98+ |
Wikibon | ORGANIZATION | 0.98+ |
2500 | QUANTITY | 0.98+ |
Veeam | PERSON | 0.97+ |
ORGANIZATION | 0.97+ | |
first one | QUANTITY | 0.96+ |
this week | DATE | 0.96+ |
Hyper-V | TITLE | 0.96+ |
one organization | QUANTITY | 0.94+ |
under 2 years | QUANTITY | 0.94+ |