Christopher Voss, Microsoft | Kubecon + Cloudnativecon Europe 2022

>> theCUBE presents KubeCon and CloudNativeCon, Europe, 2022. Brought to you by Red Hat, the cloud-native computing foundation and its ecosystem partners. >> Welcome to Valencia, Spain in KubeCon, CloudNativeCon, Europe, 2022. I'm Keith Townsend with my cohosts, Enrico Signoretti, Senior IT Analyst at GigaOm. >> Exactly. >> 7,500 people I'm told, Enrico. What's the flavor of the show so far? >> It's a fantastic mood, I mean, I found a lot of people wanting to track, talk about what they're doing with Kubernetes, sharing their you know, stories, some war stories that bit tough. And you know, this is where you learn actually. Because we had a lot of Zoom calls, webinar and stuff. But it is when you talk a video, "Oh, I did it this way, and it didn't work out very well." So, and, you start a conversation like this that is really different from learning from Zoom, when, you know, everybody talks about things that work it well, they did it right. No, it's here that you learn from other experiences. >> So we're talking to amazing people the whole week, talking about those experiences here on theCUBE. Fresh on the theCUBE for the first time, Chris Voss, senior software engineer at Microsoft Xbox. Chris, welcome to the theCUBE. >> Thank you so much for having me. >> So first off, give us a high level picture of the environment that you're running at Microsoft. >> Yeah. So, you know, we've got 20 well probably close to 30 clusters at this point around the globe, you know 700 to 1,000 pods per cluster, roughly. So about 22,000 pods total. So yeah, it's pretty, pretty sizable footprint and yeah. So we've been running on Kubernetes since 2018 and well actually might be 2017, but anyways, so yeah, that's kind of our footprint. Yeah. >> So all of that, let's talk about the basics which is security across multiple I'm assuming containers, microservices, etcetera. Why did you and the team settle on Linkerd? >> Yeah, so previously we had our own kind of solution for managing TLS certs and things like that. And we found it to be pretty painful, pretty quickly. And so we knew, you know we wanted something that was a little bit more abstracted away from the developers and things like that, that allowed us to move quickly. And so we began investigating, you know, solutions to that. And a few of our colleagues went to Kubecon in San Diego in 2019, Cloudnativecon as well. And basically they just, you know, sponged it all up. And actually funny enough, my old manager was one of the people who was there and he went to the Linkerd booth and they had a thing going that was like, "Hey, get set up with MTLS in five minutes." And he was like, "This is something we want to do, why not check this out?" And he was able to do it. And so that put it on our radar. And so yeah, we investigated several others and Linkerd just perfectly fit exactly what we needed. >> So, in general we are talking about, you know, security at scale. So how you manage security scale and also flexibility. Right? So, but you know, what is the... You told us about the five minutes to start using there but you know, again, we are talking about war stories. We're talking about, you know, all these. So what kind of challenges you found at the beginning when you started adopting this technology? >> So the biggest ones were around getting up and running with like a new service, especially in the beginning, right, we were, you know, adding a new service almost every day. It felt like. And so, you know, basically it took someone going through a whole bunch of different repos, getting approvals from everyone to get the certs minted, all that fun stuff getting them put into the right environments and in the right clusters, to make sure that, you know, everybody is talking appropriately. And just the amount of work that that took alone was just a huge headache and a huge barrier to entry for us to, quickly move up the number of services we have. >> So, I'm trying to wrap my head around the scale of the challenge. When I think about certification or certificate management, I have to do it on a small scale. And every now and again, when a certificate expires it is just a troubleshooting pain. >> Yes. >> So as I think about that, it costs it's not just certificates across 22,000 pods, or it's certificates across 22,000 pods in multiple applications. How were you doing that before Linkerd? Like, what was the... And what were the pain points? Like what happens when a certificate either fails? Or expired up? Not updated? >> So, I mean, to be completely honest, the biggest thing is we're just unable to make the calls, you know, out or in, based on yeah, what is failing basically. But, you know, we saw essentially an uptick in failures around a certain service and pretty quickly, pretty quickly, we got used to the fact that it was like, oh, it's probably a cert expiration issue. And so we tried, you know, a few things in order to make that a little bit more automated and things like that. But we never came to a solution that like didn't require every engineer on the team to know essentially quite a bit about this, just to get into it, which was a huge issue. >> So talk about day two, after you've deployed Linkerd, how did this alleviate software engineers? And what was like the benefits of now having this automated way of managing certs? >> So the biggest thing is like, there is no touch from developers, everyone on our team... Well, I mean, there are a lot of people who are familiar with security and certs and all of that stuff. But no one has to know it. Like it's not a requirement. Like for instance, I knew nothing about it when I joined the team. And even when I was setting up our newer clusters, I knew very little about it. And I was still able to really quickly set up Linkerd, which was really nice. And it's been, you know, essentially we've been able to just kind of set it, and not think about it too much. Obviously, you know, there're parts of it that you have to think about, we monitor it and all that fun stuff, but yeah, it's been pretty painless almost day one. It took a long time to trust it for developers. You know, anytime there was a failure, it's like, "Oh, could this be Linkerd?" you know. But after a while, like now we don't have that immediate assumption because people have built up that trust, but. >> Also you have this massive infrastructure I mean, 30 clusters. So, I guess, that it's quite different to manage a single cluster in 30. So what are the, you know, consideration that you have to do to install this software on, you know, 30 different cluster, manage different, you know versions probably, et cetera, et cetera, et cetera. >> So, I mean, you know, as far as like... I guess, just to clarify, are you asking specifically with Linkerd? Or are you just asking in more in general? >> Well, I mean, you can take that the question in two ways. >> Okay. >> Sure, yeah, so Linkerd in particular but the 30 cluster also quite interesting. >> Yeah. So, I mean, you know, more generally, you know how we manage our clusters and things like that. We have, you know, a CLI tool that we use in order to like change context very quickly, and switch and communicate with whatever cluster we're trying to connect to and you know, are we debugging or getting logs, whatever. And then, you know, with Linkerd it's nice because again, you know, we aren't having to worry about like, oh, how is this cert being inserted in the right node? Or not the right node, but in the right cluster or things like that. Whereas with Linkerd, we don't really have that concern. When we spin up our clusters, essentially we get the route certificate and everything like that packaged up, passed along to Linkerd on installation. And then essentially, there's not much we have to do after that. >> So talk to me about your upcoming section here at Kubecon. what's the high level talking points? Like what attendees learn? >> Yeah. So it's a journey. Those are the sorts of talks that I find useful. Having not been, you know, I'm not a deep Kubernetes expert from, you know decades or whatever of experience, but-- >> I think nobody is. >> (indistinct). >> True, yes. >> That's also true. >> That's another story >> That's a job posting decades of requirements for-- >> Of course, yeah. But so, you know, it's a journey. It's really just like, hey, what made us decide on a service mesh in the first place? What made us choose Linkerd? And then what are the ways in which, you know, we use Linkerd? So what are those, you know we use some of the extra plugins and things like that. And then finally, a little bit about more what we're going to do in the future. >> Let's talk about not just necessarily the future as in two or three days from now, or two or three years from now. Well, the future after you immediately solve the low level problems with Linkerd, what were some of the surprises? Because Linkerd in service mesh and in general have side benefits. Do you experience any of those side benefits as well? >> Yeah, it's funny, you know, writing the blog post, you know, I hadn't really looked at a lot of the data in years on, you know when we did our investigations and things like that. And we had seen that we like had very low latency and low CPU utilization and things like that. And looking at some of that, I found that we were actually saving time off of requests. And I couldn't really think of why that was and I was talking with someone else and the biggest, unfortunately all that data's gone now, like the source data. So I can't go back and verify this but it makes sense, you know, there's the availability zone routing that Linkerd supports. And so I think that's actually doing it where, you know essentially, if a node is closer to another node, it's essentially, you know, routing to those ones. So when one service is talking to another service and maybe they're on the same node, you know, it short circuits that and allows us to gain some time there. It's not huge, but it adds up after, you know, 10, 20 calls down the line. >> Right. In general, so you are saying that it's smooth operations at this very, you know, simplifying your life. >> And again, we didn't have to really do anything for that. It handled that for us. >> It was there? >> Yep. Yeah, exactly. >> So we know one thing when I do it on my laptop it works fine. When I do it with across 22,000 pods, that's a different experience. What were some of the lessons learned coming out of Kubecon 2018 in San Diego? I was there. I wish I would've ran into the Microsoft folks, but what were some of the hard lessons learned scaling Linkerd across the 22,000 nodes? >> So, you know, the first one and this seems pretty obvious, but was just not something I knew about was the high availability mode of Linkerd. So obviously makes sense. You would want that in, you know a large scale environment. So like, that's one of the big lessons that like, we didn't ride away. No. Like one of the mistakes we made in one of our pre-production clusters was not turning that on. And we were kind of surprised. We were like, whoa, like all of these pods are spinning up but they're having issues, like actually getting injected and things like that. And we found, oh, okay. Yeah, you need to actually give it some more resources. But it's still very lightweight considering, you know, they have high availability mode but it's just a few instances still. >> So from, even from, you know, binary perspective and running Linkerd how much overhead is it? >> That is a great question. So I don't remember off the top of my head, the numbers but it's very lightweight. We evaluated a few different service missions and it was the lightest weight that we encountered at that point. >> And then from a resource perspective, is it a team of Linkerd people? Is it a couple of people? Like how? >> To be completely honest for a long time, it was one person Abraham, who actually is the person who proposed this talk. He couldn't make it to Valencia, but he essentially did probably 95% of the work to get into production. And then this was before, we even had a team dedicated to our infrastructure. And so we have, now we have a team dedicated, we're all kind of Linkerd folks, if not Linkerd experts, we at least can troubleshoot basically. And things like that. So it's, I think a group of six people on our team and then, you know various people who've had experience with it on other teams. >> But others, dedicated just to that. >> No one is dedicated just to it. No, it's pretty like pretty light touch once it's up and running. It took a very long time for us to really understand it and to, you know, get like not getting started, but like getting to where we really felt comfortable letting it go in production. But once it was there, like, it is very, very light touch. >> Well, I really appreciate you stopping by Chris. It's been an amazing conversation to hear how Microsoft is using a open source project. >> Exactly. >> At scale, it's just a few years ago when you would've heard the concept of Microsoft and open source together and like OS, just, you know-- >> They have changed a lot in the last few years. Now, there are huge contributors. And, you know, if you go to Azure, it's full of open source stuff, everywhere so. >> Yeah. >> Wow. The Kubecon 2022, how the world has changed in so many ways. From Valencia Spain, I'm Keith Townsend, along with Enrico Signoretti. You're watching theCUBE, the leader in high tech coverage. (upbeat music)

Published Date : May 19 2022

SUMMARY :

Brought to you by Red Hat, Welcome to Valencia, Spain What's the flavor of the show so far? And you know, this is Fresh on the theCUBE for the first time, of the environment that at this point around the globe, you know Why did you and the And so we knew, you know So, but you know, what is the... right, we were, you know, I have to do it on a small scale. How were you doing that before Linkerd? And so we tried, you know, And it's been, you know, So what are the, you know, So, I mean, you know, as far as like... Well, I mean, you can take that but the 30 cluster also quite interesting. And then, you know, with Linkerd So talk to me about Having not been, you know, But so, you know, you immediately solve but it makes sense, you know, you know, simplifying your life. And again, we didn't have So we know one thing So, you know, the first one and it was the lightest and then, you know dedicated just to that. and to, you know, get you stopping by Chris. And, you know, if you go to Azure, how the world has changed in so many ways.

ENTITIES

Entity	Category	Confidence
Enrico	PERSON	0.99+
Chris	PERSON	0.99+
Enrico Signoretti	PERSON	0.99+
Christopher Voss	PERSON	0.99+
Chris Voss	PERSON	0.99+
Keith Townsend	PERSON	0.99+
95%	QUANTITY	0.99+
700	QUANTITY	0.99+
2017	DATE	0.99+
Linkerd	ORGANIZATION	0.99+
San Diego	LOCATION	0.99+
30 clusters	QUANTITY	0.99+
Red Hat	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Abraham	PERSON	0.99+
10	QUANTITY	0.99+
2019	DATE	0.99+
20	QUANTITY	0.99+
Valencia	LOCATION	0.99+
six people	QUANTITY	0.99+
22,000 pods	QUANTITY	0.99+
30	QUANTITY	0.99+
Valencia, Spain	LOCATION	0.99+
Valencia Spain	LOCATION	0.99+
KubeCon	EVENT	0.99+
7,500 people	QUANTITY	0.99+
2018	DATE	0.99+
1,000 pods	QUANTITY	0.99+
two ways	QUANTITY	0.99+
five minutes	QUANTITY	0.99+
Europe	LOCATION	0.99+
CloudNativeCon	EVENT	0.98+
Enrico Signore	PERSON	0.98+
three days	QUANTITY	0.98+
GigaOm	ORGANIZATION	0.98+
two	QUANTITY	0.98+
first time	QUANTITY	0.98+
first	QUANTITY	0.98+
Cloudnativecon	ORGANIZATION	0.97+
one service	QUANTITY	0.97+
Kubecon	ORGANIZATION	0.97+
three years	QUANTITY	0.97+
30 different cluster	QUANTITY	0.96+
first one	QUANTITY	0.96+
22,000 nodes	QUANTITY	0.96+
one	QUANTITY	0.96+
30 cluster	QUANTITY	0.95+
one thing	QUANTITY	0.94+
Xbox	COMMERCIAL_ITEM	0.93+
about 22,000 pods	QUANTITY	0.92+
single cluster	QUANTITY	0.92+
20 calls	QUANTITY	0.91+
day two	QUANTITY	0.91+
one person	QUANTITY	0.89+
few years ago	DATE	0.88+
decades	QUANTITY	0.87+
2022	DATE	0.85+
Azure	TITLE	0.79+
Kubernetes	TITLE	0.77+

Day 1 Wrap | Kubecon + Cloudnativecon Europe 2022

>> Narrator: theCUBE presents KubeCon and Cloud NativeCon Europe, 2022 brought to you by Red Hat, the Cloud Native Computing Foundation and its ecosystem partners. >> Welcome to Valencia, Spain. A coverage of KubeCon, Cloud NativeCon, Europe, 2022. I'm Keith Townsend. Your host of theCUBE, along with Paul Gillum, Senior Editor Enterprise Architecture for Silicon Angle, Enrico, Senior IT Analyst for GigaOm . This has been a full day, 7,500 attendees. I might have seen them run out of food, this is just unexpected. I mean, it escalated from what I understand, it went from capping it off at 4,000 gold, 5,000 gold in it off finally at 7,500 people. I'm super excited for... Today's been a great dead coverage. I'm super excited for tomorrow's coverage from theCUBE, but first off, we'll let the the new person on stage take the first question of the wrap up of the day of coverage, Enrico, what's different about this year versus other KubeCons or Cloud Native conversations. >> I think in general, it's the maturity. So we talk a lot about day two operations, observability, monitoring, going deeper and deeper in the security aspects of the application. So this means that for many enterprises, Kubernetes is becoming real critical. They want to get more control of it. And of course you have the discussion around FinOps, around cost control, because we are deploying Kubernetes everywhere. And if you don't have everything optimized, control, monitored, costs go to the roof and think about deploying the Public Cloud . If your application is not optimized, you're paying more. But also in that, on-premises if you are not optimized, you don't have any clear idea what is going to happen. So capacity planning become the nightmare, that we know from the past. So there is a lot of going on around these topics, really exciting actually, less infrastructure, more application. That is what Kubernetes is in here. >> Paul help me separate some of the signal from the noise. There is a lot going on a lot of overlap. What are some of the big themes of takeaways for day one that Enterprise Architects, Executives, need to take home and really chew on? >> Well, the Kubernetes was a turning point. Docker was introduced nine years ago, and for the first three or four years it was an interesting technology that was not very widely adopted. Kubernetes came along and gave developers a reason to use containers. What strikes me about this conference is that this is a developer event, ordinarily you go to conferences and it's geared toward IT Managers, towards CIOs, this is very much geared toward developers. When you have the hearts and minds of developers the rest of the industry is sort of pulled along with it. So this is ground zero for the hottest area of the entire computing industry right now, is in this area building Distributed services, Microservices based, Cloud Native applications. And it's the developers who are leading the way. I think that's a significant shift. I don't see the Managers here, the CIOs here. These are the people who are pulling this industry into the next generation. >> One of the interesting things that I've seen when we've always said, Kubernetes is for the developers, but we talk with an icon from MoneyGram, who's a end user, he's an enterprise architect, and he brought Kubernetes to his front end developers, and they rejected it. They said, what is this? I just want to develop code. So when we say Kubernetes is for developers or the developers are here, how do we reconcile that mismatch of experience? We have Enterprise Architect here. I hear constantly that the Kubernetes is for developers, but is it a certain kind of developer that Kubernetes is for? >> Well, yes and no. I mean, so the paradigm is changing. Okay. So, and maybe a few years back, it was tough to understand how make your application different. So microservices, everything was new for everybody, but actually, everything has changed to a point and now the developer understands, is neural. So, going through the application, APIs, automation, because the complexity of this application is huge, and you have, 724 kind of development sort of deployment. So you have to stay always on, et cetera, et cetera. And actually, to the point of developers bringing this new generation of decision makers in there. So they are actually decision, they are adopting technology. Maybe it's a sort of shadow IT at the very beginning. So they're adopting it, they're using it. And they're starting to use a lot of open source stuff. And then somebody upper in the stack, the Executive, says what are... They discover that the technology is already in place is a critical component, and then it's transformed in something enterprise, meaning paying enterprise services on top of it to be sure support contract and so on. So it's a real journey. And these guys are the real decision makers, or they are at the base of the decision making process, at least >> Cloud Native is something we're going to learn to take for granted. When you remember back, remember the Fail Whale in the early days of Twitter, when periodically the service would just crash from traffic, or Amazon went through the same thing. Facebook went through the same thing. We don't see that anymore because we are now learning to take Cloud Native for granted. We assume applications are going to be available. They're going to be performant. They're going to scale. They're going to handle anything we throw at them. That is Cloud Native at work. And I think we forget sometimes how refreshing it is to have an internet that really works for you. >> Yeah, I think we're much earlier in the journey. We had Microsoft on, the Xbox team talked about 22,000 pods running Linkerd some of the initial problems and pain points around those challenges. Much of my hallway track conversation has been centered around as we talk about the decision makers, the platform teams. And this is what I'm getting excited to talk about in tomorrow's coverage. Who's on the ground doing this stuff. Is it developers as we see or hear or told? Or is it what we're seeing from the Microsoft example, the MoneyGram example, where central IT is getting it. And not only are they getting it, they're enabling developers to simply write code, build it, and Kubernetes is invisible. It seems like that's become the Holy Grail to make Kubernetes invisible and Cloud Native invisible, and the experience is much closer to Cloud. >> So I think that, it's an interesting, I mean, I had a lot of conversation in the past year is that it's not that the original traditional IT operations are disappearing. So it's just that traditional IT operation are giving resources to these new developers. Okay, so it's a sort of walled garden, you don't see the wall, but it's a walled garden. So they are giving you resources and you use these resources like an internal Cloud. So a few years back, we were talking about private Cloud, the private Cloud as let's say the same identical paradigm of the Public Cloud is not possible, because there are no infinite resources or well, whatever we think are infinite resources. So what you're doing today is giving these developers enough resources to think that they are unlimited and they can do automatic operationing and do all these kind of things. So they don't think about infrastructure at all, but actually it's there. So IT operation are still there providing resources to let developers be more free and agile and everything. So we are still in a, I think an interesting time for all of it. >> Kubernetes and Cloud Native in general, I think are blurring the lines, traditional lines development and operations always were separate entities. Obviously with DevOps, those two are emerging. But now we're moving when you add in shift left testing, shift right testing, DevSecOps, you see the developers become much more involved in the infrastructure and they want to be involved in infrastructure because that's what makes their applications perform. So this is going to cause, I think IT organizations to have to do some rethinking about what those traditional lines are, maybe break down those walls and have these teams work much closer together. And that should be a good thing because the people who are developing applications should also have intimate knowledge of the infrastructure they're going to run on. >> So Paul, another recurring theme that we've heard here is the impact of funding on resources. What have your discussions been around founders and creators when it comes to sourcing talent and the impact of the markets on just their day to day? >> Well, the sourcing talent has been a huge issue for the last year, of course, really, ever since the pandemic started. Interestingly, one of our guests earlier today said that with the meltdown in the tech stock market, actually talent has become more available, because people who were tied to their companies because of their stock options are now seeing those options are underwater and suddenly they're not as loyal to the companies they joined. So that's certainly for the startups, there are many small startups here, they're seeing a bit of a windfall now from the tech stock bust. Nevertheless, skills are a long term problem. The US educational system is turning out about 10% of the skilled people that the industry needs every year. And no one I know, sees an end to that issue anytime soon. >> So Enrico, last question to you. Let's talk about what that means to the practitioner. There's a lot of opportunity out there. 200 plus sponsors I hear, I think is worth the projects is 200 plus, where are the big opportunities as a practitioner, as I'm thinking about the next thing that I'm going to learn to help me survive the next 10 or 15 years of my career? Where you think the focus should be? Should it be that low level Cloud builder? Or should it be at those levels of extraction that we're seeing and reading about? >> I think that it's a good question. The answer is not that easy. I mean, being a developer today, for sure, grants you a salary at the end of the month. I mean, there is high demand, but actually there are a lot of other technical figures in the data center, in the Cloud, that could really find easily a job today. So, developers is the first in my mind also because they are more, they can serve multiple roles. It means you can be a developer, but actually you can be also with the new roles that we have, especially now with the DevOps, you can be somebody that supports operation because you know automation, you know a few other things. So you can be a sysadmin of the next generation even if you are a developer, even if when you start as a developer. >> KubeCon 2022, is exciting. I don't care if you're a developer, practitioner, a investor, IT decision maker, CIO, CXO, there's so much to learn and absorb here and we're going to be covering it for the next two days. Me and Paul will be shoulder to shoulder, I'm not going to say you're going to get sick of this because it's just, it's all great information, we'll help sort all of this. From Valencia, Spain. I'm Keith Townsend, along with my host Enrico Signoretti, Paul Gillum, and you're watching theCUBE, the leader in high tech coverage. (upbeat music)

Published Date : May 19 2022

SUMMARY :

the Cloud Native Computing Foundation of the wrap up of the day of coverage, of the application. of the signal from the noise. and for the first three or four years I hear constantly that the and now the developer understands, the early days of Twitter, and the experience is is that it's not that the of the infrastructure and the impact of the markets So that's certainly for the startups, So Enrico, last question to you. of the next generation it for the next two days.

ENTITIES

Entity	Category	Confidence
Paul Gillum	PERSON	0.99+
Enrico Signoretti	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Keith Townsend	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Red Hat	ORGANIZATION	0.99+
Cloud Native Computing Foundation	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
Paul	PERSON	0.99+
Valencia, Spain	LOCATION	0.99+
last year	DATE	0.99+
7,500 attendees	QUANTITY	0.99+
Enrico	PERSON	0.99+
Silicon Angle	ORGANIZATION	0.99+
4,000 gold	QUANTITY	0.99+
two	QUANTITY	0.99+
first	QUANTITY	0.99+
5,000 gold	QUANTITY	0.99+
KubeCon	EVENT	0.99+
nine years ago	DATE	0.99+
GigaOm	ORGANIZATION	0.99+
7,500 people	QUANTITY	0.99+
tomorrow	DATE	0.99+
one	QUANTITY	0.99+
today	DATE	0.98+
Cloud NativeCon	EVENT	0.98+
Today	DATE	0.98+
four years	QUANTITY	0.98+
first question	QUANTITY	0.97+
this year	DATE	0.96+
200 plus	QUANTITY	0.96+
Kubernetes	TITLE	0.96+
DevSecOps	TITLE	0.95+
Cloud Native	TITLE	0.95+
DevOps	TITLE	0.95+
about 10%	QUANTITY	0.94+
first three	QUANTITY	0.94+
15 years	QUANTITY	0.94+
Kubecon	ORGANIZATION	0.93+
KubeCon 2022	EVENT	0.93+
day one	QUANTITY	0.93+
One	QUANTITY	0.92+
Twitter	ORGANIZATION	0.92+
past year	DATE	0.92+
Kubernetes	PERSON	0.92+
724	QUANTITY	0.91+
pandemic	EVENT	0.91+
MoneyGram	ORGANIZATION	0.89+
Xbox	COMMERCIAL_ITEM	0.89+
earlier today	DATE	0.89+
about 22,000 pods	QUANTITY	0.89+
Docker	TITLE	0.89+
Day	QUANTITY	0.84+
Linkerd	ORGANIZATION	0.84+
2022	DATE	0.83+
Cloud	TITLE	0.82+
Europe	LOCATION	0.81+
10	QUANTITY	0.81+
200 plus sponsors	QUANTITY	0.8+
few years back	DATE	0.78+
Cloud NativeCon Europe	EVENT	0.78+
Enrico	ORGANIZATION	0.77+
FinOps	TITLE	0.76+
US	LOCATION	0.76+
a few years back	DATE	0.74+
next two days	DATE	0.73+
Kubernetes	ORGANIZATION	0.69+
theCUBE	ORGANIZATION	0.68+
day two	QUANTITY	0.67+
Cloudnativecon	ORGANIZATION	0.58+
Public Cloud	TITLE	0.54+
2022	EVENT	0.53+
Fail Whale	TITLE	0.52+

Matt Provo & Patrick Bergstrom, StormForge | Kubecon + Cloudnativecon Europe 2022

>> Instructor: "theCUBE" presents KubeCon and CloudNativeCon Europe 2022, brought to you by Red Hat, the Cloud Native Computing Foundation and its ecosystem partners. >> Welcome to Valencia, Spain and we're at KubeCon, CloudNativeCon Europe 2022. I'm Keith Townsend, and my co-host, Enrico Signoretti. Enrico's really proud of me. I've called him Enrico instead of Enrique every session. >> Every day. >> Senior IT analyst at GigaOm. We're talking to fantastic builders at KubeCon, CloudNativeCon Europe 2022 about the projects and their efforts. Enrico, up to this point, it's been all about provisioning, insecurity, what conversation have we been missing? >> Well, I mean, I think that we passed the point of having the conversation of deployment, of provisioning. Everybody's very skilled, actually everything is done at day two. They are discovering that, well, there is a security problem. There is an observability problem a and in fact, we are meeting with a lot of people and there are a lot of conversation with people really needing to understand what is happening. I mean, in their cluster work, why it is happening and all the questions that come with it. And the more I talk with people in the show floor here or even in the various sessions is about, we are growing so that our clusters are becoming bigger and bigger, applications are becoming bigger as well. So we need to now understand better what is happening. As it's not only about cost, it's about everything at the end. >> So I think that's a great set up for our guests, Matt Provo, founder and CEO of StormForge and Patrick Brixton? >> Bergstrom. >> Bergstrom. >> Yeah. >> I spelled it right, I didn't say it right, Bergstrom, CTO. We're at KubeCon, CloudNativeCon where projects are discussed, built and StormForge, I've heard the pitch before, so forgive me. And I'm kind of torn. I have service mesh. What do I need more, like what problem is StormForge solving? >> You want to take it? >> Sure, absolutely. So it's interesting because, my background is in the enterprise, right? I was an executive at UnitedHealth Group before that I worked at Best Buy and one of the issues that we always had was, especially as you migrate to the cloud, it seems like the CPU dial or the memory dial is your reliability dial. So it's like, oh, I just turned that all the way to the right and everything's hunky-dory, right? But then we run into the issue like you and I were just talking about, where it gets very very expensive very quickly. And so my first conversations with Matt and the StormForge group, and they were telling me about the product and what we're dealing with. I said, that is the problem statement that I have always struggled with and I wish this existed 10 years ago when I was dealing with EC2 costs, right? And now with Kubernetes, it's the same thing. It's so easy to provision. So realistically what it is, is we take your raw telemetry data and we essentially monitor the performance of your application, and then we can tell you using our machine learning algorithms, the exact configuration that you should be using for your application to achieve the results that you're looking for without over-provisioning. So we reduce your consumption of CPU, of memory and production which ultimately nine times out of 10, actually I would say 10 out of 10, reduces your cost significantly without sacrificing reliability. >> So can your solution also help to optimize the application in the long run? Because, yes, of course-- >> Yep. >> The lowering fluid as you know optimize the deployment. >> Yeah. >> But actually the long-term is optimizing the application. >> Yes. >> Which is the real problem. >> Yep. >> So, we're fine with the former of what you just said, but we exist to do the latter. And so, we're squarely and completely focused at the application layer. As long as you can track or understand the metrics you care about for your application, we can optimize against it. We love that we don't know your application, we don't know what the SLA and SLO requirements are for your app, you do, and so, in our world it's about empowering the developer into the process, not automating them out of it and I think sometimes AI and machine learning sort of gets a bad rap from that standpoint. And so, at this point the company's been around since 2016, kind of from the very early days of Kubernetes, we've always been, squarely focused on Kubernetes, using our core machine learning engine to optimize metrics at the application layer that people care about and need to go after. And the truth of the matter is today and over time, setting a cluster up on Kubernetes has largely been solved. And yet the promise of Kubernetes around portability and flexibility, downstream when you operationalize, the complexity smacks you in the face and that's where StormForge comes in. And so we're a vertical, kind of vertically oriented solution, that's absolutely focused on solving that problem. >> Well, I don't want to play, actually. I want to play the devils advocate here and-- >> You wouldn't be a good analyst if you didn't. >> So the problem is when you talk with clients, users, there are many of them still working with Java, something that is really tough. I mean, all of us loved Java. >> Yeah, absolutely. >> Maybe 20 years ago. Yeah, but not anymore, but still they have developers, they have porting applications, microservices. Yes, but not very optimized, et cetera, cetera, et cetera. So it's becoming tough. So how you can interact with this kind of old hybrid or anyway, not well engineered applications. >> Yeah. >> We do that today. We actually, part of our platform is we offer performance testing in a lower environment and stage and we, like Matt was saying, we can use any metric that you care about and we can work with any configuration for that application. So perfect example is Java, you have to worry about your heap size, your garbage collection tuning and one of the things that really struck me very early on about the StormForge product is because it is true machine learning. You remove the human bias from that. So like a lot of what I did in the past, especially around SRE and performance tuning, we were only as good as our humans were because of what they knew. And so, we kind of got stuck in these paths of making the same configuration adjustments, making the same changes to the application, hoping for different results. But then when you apply machine learning capability to that the machine will recommend things you never would've dreamed of. And you get amazing results out of that. >> So both me and Enrico have been doing this for a long time. Like, I have battled to my last breath the argument when it's a bare metal or a VM, look, I cannot give you any more memory. >> Yeah. >> And the argument going all the way up to the CIO and the CIO basically saying, you know what, Keith you're cheap, my developer resources are expensive, buy bigger box. >> Yeah. >> Yap. >> Buying a bigger box in the cloud to your point is no longer a option because it's just expensive. >> Yeah. >> Talk to me about the carrot or the stick as developers are realizing that they have to be more responsible. Where's the culture change coming from? Is it the shift in responsibility? >> I think the center of the bullseye for us is within those sets of decisions, not in a static way, but in an ongoing way, especially as the development of applications becomes more and more rapid and the management of them. Our charge and our belief wholeheartedly is that you shouldn't have to choose. You should not have to choose between costs or performance. You should not have to choose where your applications live, in a public private or hybrid cloud environment. And so, we want to empower people to be able to sit in the middle of all of that chaos and for those trade offs and those difficult interactions to no longer be a thing. We're at a place now where we've done hundreds of deployments and never once have we met a developer who said, "I'm really excited to get out of bed and come to work every day and manually tune my application." One side, secondly, we've never met, a manager or someone with budget that said, please don't increase the value of my investment that I've made to lift and shift us over to the cloud or to Kubernetes or some combination of both. And so what we're seeing is the converging of these groups, their happy place is the lack of needing to be able to make those trade offs, and that's been exciting for us. >> So, I'm listening and looks like that your solution is right in the middle in application performance, management, observability. >> Yeah. >> And, monitoring. >> Yeah. >> So it's a little bit of all of this. >> Yeah, so we want to be, the intel inside of all of that, we often get lumped into one of those categories, it used to be APM a lot, we sometimes get, are you observability or and we're really not any of those things, in and of themselves, but we instead we've invested in deep integrations and partnerships with a lot of that tooling 'cause in a lot of ways, the tool chain is hardening in a cloud native and in Kubernetes world. And so, integrating in intelligently, staying focused and great at what we solve for, but then seamlessly partnering and not requiring switching for our users who have already invested likely, in a APM or observability. >> So to go a little bit deeper. What does it mean integration? I mean, do you provide data to this, other applications in the environment or are they supporting you in the work that you do. >> Yeah, we're a data consumer for the most part. In fact, one of our big taglines is take your observability and turn it into action ability, right? Like how do you take that, it's one thing to collect all of the data, but then how do you know what to do with it, right? So to Matt's point, we integrate with folks like Datadog, we integrate with Prometheus today. So we want to collect that telemetry data and then do something useful with it for you. >> But also we want Datadog customers, for example, we have a very close partnership with Datadog so that in your existing Datadog dashboard, now you have-- >> Yeah. >> The StormForge capability showing up in the same location. >> Yep. >> And so you don't have to switch out. >> So I was just going to ask, is it a push pull? What is the developer experience when you say you provide developer this resolve ML learnings about performance, how do they receive it? Like, what's the developer experience. >> They can receive it, for a while we were CLI only, like any good developer tool. >> Right. >> And, we have our own UI. And so it is a push in a lot of cases where I can come to one spot, I've got my applications and every time I'm going to release or plan for a release or I have released and I want to pull in observability data from a production standpoint, I can visualize all of that within the StormForge UI and platform, make decisions, we allow you to set your, kind of comfort level of automation that you're okay with. You can be completely set and forget or you can be somewhere along that spectrum and you can say, as long as it's within, these thresholds, go ahead and release the application or go ahead and apply the configuration. But we also allow you to experience the same, a lot of the same functionality right now, in Grafana, in Datadog and a bunch of others that are coming. >> So I've talked to Tim Crawford who talks to a lot of CIOs and he's saying one of the biggest challenges or if not, one of the biggest challenges CIOs are facing are resource constraints. >> Yeah. >> They cannot find the developers to begin with to get this feedback. How are you hoping to address this biggest pain point for CIOs-- >> Yeah.6 >> And developers? >> You should take that one. >> Yeah, absolutely. So like my background, like I said at UnitedHealth Group, right. It's not always just about cost savings. In fact, the way that I look about at some of these tech challenges, especially when we talk about scalability there's kind of three pillars that I consider, right? There's the tech scalability, how am I solving those challenges? There's the financial piece 'cause you can only throw money at a problem for so long and it's the same thing with the human piece. I can only find so many bodies and right now that pool is very small, and so, we are absolutely squarely in that footprint of we enable your team to focus on the things that they matter, not manual tuning like Matt said. And then there are other resource constraints that I think that a lot of folks don't talk about too. Like, you were talking about private cloud for instance and so having a physical data center, I've worked with physical data centers that companies I've worked for have owned where it is literally full, wall to wall. You can't rack any more servers in it, and so their biggest option is, well, I could spend $1.2 billion to build a new one if I wanted to, or if you had a capability to truly optimize your compute to what you needed and free up 30% of your capacity of that data center. So you can deploy additional name spaces into your cluster, like that's a huge opportunity. >> So I have another question. I mean, maybe it doesn't sound very intelligent at this point, but, so is it an ongoing process or is it something that you do at the very beginning, I mean you start deploying this. >> Yeah. >> And maybe as a service. >> Yep. >> Once in a year I say, okay, let's do it again and see if something change it. >> Sure. >> So one spot, one single.. >> Yeah, would you recommend somebody performance test just once a year? Like, so that's my thing is, at previous roles, my role was to do performance test every single release, and that was at a minimum once a week and if your thing did not get faster, you had to have an executive exception to get it into production and that's the space that we want to live in as well as part of your CICD process, like this should be continuous verification, every time you deploy, we want to make sure that we're recommending the perfect configuration for your application in the name space that you're deploying into. >> And I would be as bold as to say that we believe that we can be a part of adding, actually adding a step in the CICD process that's connected to optimization and that no application should be released, monitored, and sort of analyzed on an ongoing basis without optimization being a part of that. And again, not just from a cost perspective, but for cost and performance. >> Almost a couple of hundred vendors on this floor. You mentioned some of the big ones Datadog, et cetera, but what happens when one of the up and comings out of nowhere, completely new data structure, some imaginative way to click to telemetry data. >> Yeah. >> How do, how do you react to that? >> Yeah, to us it's zeros and ones. >> Yeah. >> And, we really are data agnostic from the standpoint of, we're fortunate enough from the design of our algorithm standpoint, it doesn't get caught up on data structure issues, as long as you can capture it and make it available through one of a series of inputs, one would be load or performance tests, could be telemetry, could be observability, if we have access to it. Honestly, the messier the better from time to time from a machine learning standpoint, it's pretty powerful to see. We've never had a deployment where we saved less than 30%, while also improving performance by at least 10%. But the typical results for us are 40 to 60% savings and 30 to 40% improvement in performance. >> And what happens if the application is, I mean, yes Kubernetes is the best thing of the world but sometimes we have to, external data sources or, we have to connect with external services anyway. >> Yeah. >> So, can you provide an indication also on this particular application, like, where the problem could be? >> Yeah. >> Yeah, and that's absolutely one of the things that we look at too, 'cause it's, especially when you talk about resource consumption it's never a flat line, right? Like depending on your application, depending on the workloads that you're running it varies from sometimes minute to minute, day to day, or it could be week to week even. And so, especially with some of the products that we have coming out with what we want to do, integrating heavily with the HPA and being able to handle some of those bumps and not necessarily bumps, but bursts and being able to do it in a way that's intelligent so that we can make sure that, like I said, it's the perfect configuration for the application regardless of the time of day that you're operating in or what your traffic patterns look like, or, what your disc looks like, right. Like 'cause with our low environment testing, any metric you throw at us, we can optimize for. >> So Matt and Patrick, thank you for stopping by. >> Yeah. >> Yes. >> We can go all day because day two is I think the biggest challenge right now, not just in Kubernetes but application re-platforming and transformation, very, very difficult. Most CTOs and EASs that I talked to, this is the challenge space. From Valencia, Spain, I'm Keith Townsend, along with my host Enrico Signoretti and you're watching "theCube" the leader in high-tech coverage. (whimsical music)

Published Date : May 19 2022

SUMMARY :

brought to you by Red Hat, and we're at KubeCon, about the projects and their efforts. And the more I talk with I've heard the pitch and then we can tell you know optimize the deployment. is optimizing the application. the complexity smacks you in the face I want to play the devils analyst if you didn't. So the problem is when So how you can interact and one of the things that last breath the argument and the CIO basically saying, Buying a bigger box in the cloud Is it the shift in responsibility? and the management of them. that your solution is right in the middle we sometimes get, are you observability or in the work that you do. consumer for the most part. showing up in the same location. What is the developer experience for a while we were CLI only, and release the application and he's saying one of the They cannot find the developers and it's the same thing or is it something that you do Once in a year I say, okay, and that's the space and that no application You mentioned some of the and 30 to 40% improvement in performance. Kubernetes is the best thing of the world so that we can make So Matt and Patrick, Most CTOs and EASs that I talked to,

ENTITIES

Entity	Category	Confidence
Keith Townsend	PERSON	0.99+
Enrico	PERSON	0.99+
Enrico Signoretti	PERSON	0.99+
Matt	PERSON	0.99+
Jeff	PERSON	0.99+
Tim Crawford	PERSON	0.99+
Patrick	PERSON	0.99+
2003	DATE	0.99+
Keith Townsend	PERSON	0.99+
UnitedHealth Group	ORGANIZATION	0.99+
40	QUANTITY	0.99+
Alex	PERSON	0.99+
Jeff Frick	PERSON	0.99+
Santa Clara	LOCATION	0.99+
30	QUANTITY	0.99+
$1.2 billion	QUANTITY	0.99+
Alex Wolf	PERSON	0.99+
Enrique	PERSON	0.99+
StormForge	ORGANIZATION	0.99+
Alexander Wolf	PERSON	0.99+
Silicon Valley	LOCATION	0.99+
ACG	ORGANIZATION	0.99+
January	DATE	0.99+
Matt Provo	PERSON	0.99+
Red Hat	ORGANIZATION	0.99+
Santa Cruz	LOCATION	0.99+
Cloud Native Computing Foundation	ORGANIZATION	0.99+
Patrick Bergstrom	PERSON	0.99+
Best Buy	ORGANIZATION	0.99+
30%	QUANTITY	0.99+
first time	QUANTITY	0.99+
Bergstrom	ORGANIZATION	0.99+
nine times	QUANTITY	0.99+
10	QUANTITY	0.99+
Valencia, Spain	LOCATION	0.99+
300 people	QUANTITY	0.99+
millions	QUANTITY	0.99+
Datadog	ORGANIZATION	0.99+
Java	TITLE	0.99+
GigaOm	ORGANIZATION	0.99+
Baskin School of Engineering	ORGANIZATION	0.99+
two things	QUANTITY	0.99+
third year	QUANTITY	0.99+
Mountain View, California	LOCATION	0.99+
KubeCon	EVENT	0.99+
ACGSV	ORGANIZATION	0.99+
both	QUANTITY	0.99+
once a week	QUANTITY	0.99+
less than 30%	QUANTITY	0.99+
ACGSV GROW! Awards	EVENT	0.98+
2016	DATE	0.98+
one	QUANTITY	0.98+
Kubernetes	TITLE	0.98+
40%	QUANTITY	0.98+
Santa Cruz UC Santa Cruz School of Engineering	ORGANIZATION	0.98+
today	DATE	0.98+
ACG Silicon Valley	ORGANIZATION	0.98+
60%	QUANTITY	0.98+
once a year	QUANTITY	0.98+
one spot	QUANTITY	0.98+
10 years ago	DATE	0.97+
Patrick Brixton	PERSON	0.97+
Prometheus	TITLE	0.97+
20 years ago	DATE	0.97+
CloudNativeCon Europe 2022	EVENT	0.97+
secondly	QUANTITY	0.97+
one single	QUANTITY	0.96+
first conversations	QUANTITY	0.96+
millions of dollars	QUANTITY	0.96+
ACGSV GROW! Awards 2018	EVENT	0.96+

Recommend Videos

Sentiment Analysis

AWS Comprehend

Search Results for Enrico Signoretti: