Patrick Lin, Splunk | Leading with Observability | January 2021

(upbeat music)
>> Announcer: From theCube studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a Cube conversation.
>> Welcome to this Cube conversation here in Palo Alto, California. I'm John Furrier, host of theCube, with a special content series called Leading with Observability, and this topic is Keeping Watch over Microservices and Containers, with a great guest, Patrick Lin, VP of Product Management for the observability product at Splunk. Patrick, great to see you. Thanks for coming on remotely. We're still in the pandemic, but thanks for coming on.
>> Yeah, John, great to see you as well. Thanks for having me.
>> So, leading with observability is a big theme of our content series. Managing end to end and user experience is a great topic around how data can be used for user experience. But now underneath that layer, you have this whole craziness of the rise of the container generation, where containers are actually going mainstream. And Gartner will forecast that anywhere from 30 to 40 percent of enterprises still haven't really adopted them at full scale, and you've got to keep watch over these. So, what is the topic about keeping watch over microservices and containers? Because, yeah, we know they're being deployed. Is it just watching them for watching's sake or is there a specific reason? What's the theme here? Why this topic?
>> Yeah, well, I think containers are part of the entire kind of stack of technology that's being deployed in order to develop and ship software more quickly. And the fundamental reasons for that haven't changed, but they've been greatly accelerated by the impact of the pandemic. And so I think for the past few years we've been talking about how software's eating the world, how it's become more and more important that companies go through the transformation to be more digital. And I think now that is so patently obvious to everybody.
When your only way of accessing your customer, and for the customer to access your services, is through a digital medium, the ability for your IT and DevOps teams to be able to deliver against those requirements, to deliver that flawless customer experience, to sort of keep pace with the digital transformation and the cloud initiatives, all of that is kind of coming as one big wave. And so, we see a lot of organizations migrating workloads to the cloud, refactoring applications, building new applications natively. And so, when they do that, oftentimes the infrastructure of choice is containers, because it's the thing that keeps up with the pace of the development. It's a much more efficient use of underlying resources. So it's all kind of part of the overall movement that we see.
>> What is the main driver for this use case of microservices, and where's the progress bar in your mind on the adoption and deployment of microservices? And what are the critical things that you guys are looking at that are important to monitor and observe and keep track of? Is it the status of the microservices? Is it the fact that they're being turned on and off, the state, non-state? I mean, take us through some of the main drivers for why you guys are keeping an eye on the microservices component.
>> Sure, well, I think that if we take a step back, the reason that people have moved towards microservices and containers fundamentally has to do with the desire to be able to, number one, develop and ship more quickly. And so if you can parallelize the development and have APIs as the interface between these services, rather than having sort of one monolithic code base, you can evolve more quickly. And on top of that, the goal is to be able to deliver software that is able to scale as needed. And so, that is a part of the equation as well.
So when you sort of look at this, the desire to be able to iterate on your software and services more quickly, to be able to scale infinitely, staying up and so on, that's all a great reason to do it. But what comes with it is a few additional layers of complexity, because now rather than have, let's say, an n-tier app that you're watching over on some hosts that you could reboot when there's a problem, now you have 10's, maybe 100's of services running on top of maybe 100,000's, maybe 10,000's of containers. And so the complexity of that environment has grown quite quickly. And the fact that those containers may go away as you scale the service up and down to meet demand also adds to that complexity. And so from an observability perspective, what you need to be able to do is a few things. One is you need to actually be tracking this in enough detail and at a high enough resolution in realtime, so that you know when things are coming in and out. And that's been one of the more critical things that we've built towards at Splunk, that ability to watch over it in realtime. But more important, or just as important as that, is understanding the dependencies and the relationships between these different services. And so, that's one of the main things that we worked on here: to make sure that you can understand the dependencies, so that when there's an issue you have a shot at actually figuring out where the problem is coming from, because there are so many different services and so many things that could be affecting the overall user experience when something goes wrong.
>> I think that's one of the most exciting areas right now in observability, this whole microservices and container equation, because a lot of action's being done there. There's a lot of complexity, but the upside, if you do it right, is significant.
I think people generally are bought into that concept, Patrick, but I want to get your thoughts. I get this question a lot from executives and leaders, whether it's a cloud architect or a CXO. And the question is, what should I consider? What do I need to consider when deploying an observability solution?
>> Yeah, that's a great question, 'cause I think there are obviously a lot of considerations here. So, I think one of the main ones, and this is a pattern that we are pretty familiar with in this sort of monitoring and management tool world, is that over time most enterprises have gotten themselves a very large number of tools, one for each part of their infrastructure or their application stack and so on. And so, what you end up with is sprawl in the monitoring toolset that you have, which creates not just a certain amount of overhead in terms of the cost, but also complexity that gets in the way of actually figuring out where the problem is. I've been looking at some of the toolsets that some of our customers have pulled together, and they have the ability to get information about everything, but it's not kind of woven together in a useful way. And it sort of gets in the way, actually, having so many tools when you are in the heat of the moment trying to figure something out. It sort of hearkens back to the time when you have an outage and you have a con call with like a cast of 1000's on it trying to figure out what's going on. And each person comes to that with their own tool, with their own view, without anything that ties that to what the others are seeing. And so, there's that need to be able to provide sort of an integrated toolset, with a consistent interface across infrastructure, across the application, across what the user experience is, and across the different data types: the metrics, the traces, the logs. Fundamentally, I think that ability to kind of easily correlate the data across it and get to the right insight,
we think that's a super important thing.
>> Yeah, and I think what that points out, I mean, I always say, don't be a fool with a tool. And if you have too many tools, you have a tool shed, and there are too many tools everywhere. And that's kind of a trend, and tools are great when you need tools to do things. But when you have too many, when you have a data model where essentially what you're saying is, a platform is the trend, because weaving stuff together, you need to have a data control plane, you need to have data visualization. You need to have these things for understanding the success there. So, really it's a platform, but platforms also have tools as well. So tools are features of a platform, if I get what you're saying, right? Is that correct?
>> Yeah, so I think that there's one part of this which is, if I start from the user point of view, what you want is a consistent and coherent set of workflows for the people who are trying to actually do the work. You don't want them to have to deal with the impedance mismatches across different tools that exist based on, whatever, even the language that they use, or how they bring the data in and how it's being processed. You go down one layer from that, and you sort of want to make sure that what they're working with is actually consistent as well. And that's the sort of capabilities that you're looking at, whether you're, whatever, trying to chart something to be able to look at the details, or go from a view of logs to the related traces. You sort of want to make sure that the information that's being served up there is consistent. And that in turn relies on data coming in, in a way that is processed to be correlated well. So that if you say, hey, I'm looking at a particular service and I want to understand what infrastructure it's sitting on, or I'm looking at a log and I see that it relates to a particular service and I want to look at traces for that service,
those things need to be kind of related from the data on in, and that needs to be exposed to the user so that they can navigate it properly and make use of it, whether that's during kind of wartime, during an incident, or peacetime.
>> Yeah, I love that, wartime consigliere versus peacetime. I saw a blog post from a VC, I think, that said, don't be a Tom Hagen, which is the guy in The Godfather where the famous line said, you're not a wartime consigliere. Which means things are uncertain in these times and you've got to get them to be certain. This is a mindset, this is part of the pandemic we're living in. Great point, I love that. Maybe we could follow up on that at the end, but I want to get to some of these topics that I want to get your reactions to. So, I want you to react to the following, Patrick. It's an issue and a topic, and here it is: missing data results in limited analytics and misguided troubleshooting. What's your reaction to that? What's your take on that? What's Splunk's take on that?
>> Yeah, I mean, I think Splunk has sort of been a proponent of that view for a very long time. I think that whether that's from the log data or from, let's say, the metric data that we capture at high resolution, or from tracing, the goal here is to have the data that you need in order to actually properly diagnose what's going on. And I think that older approaches, especially on the application side, tend to sample data right at the source and provide hopefully useful samples of it for when you have that problem. That doesn't work very well in the microservice world, because you need to actually be able to see the entirety of a transaction, the full trace across many services, before you could possibly make a decision as to what's useful to keep.
And so, the approach that I think we believe is the right one is to be able to capture at full fidelity all of those bits of information, partly because of what I just said, you want to be able to find the right sample, but also because it's important to be able to tie it to something that may be being pulled in by a different system. So, an example of that might be a case where you are trying to do real user monitoring alongside of APM, and you want to see the end to end trace from what the user sees all the way through to all the backend services. And so, what's typical in this world today is that that information is being captured by two different systems with independent sampling decisions. And therefore the ability to draw a straight line from what the end user sees all the way to what is affecting it on the backend is pretty hard, or it gets really expensive. And I think the approach that we've taken is to make it so that that's easy and cost-effective. And it's tremendously helpful then to tie it back to what we were talking about at the outset here, where you are trying to provide services that make sense and are easy to access and so on for your end user. To be able to have that end to end view because you're not missing data, it's tremendously valuable.
>> You know what I love about Splunk is, 'cause I'm a data geek going back to when it wasn't fashionable, back in the 80's, Splunk has always been about ingesting all the data. So bring all the data, we'll take it all. Now, at the beginning it was pretty straightforward, complex, but still it had great utility. But even now, today, it's the same thing you just mentioned, ingest all the data, because there are now benefits. And I want to just ask you a quick question on this distributed computing trend, because I mean, pretty much everyone in computer science or in the industry and in technology is in agreement and says, okay, cloud is distributed computing with the edge.
It's essentially distributed computing in a new way, a new architecture with great new benefits, new things, but science is still key: apply some science there. You mentioned distributed tracing, because at the end of the day that's also a major new thing that you guys are focused on. And it's not just about "get me all the data," which is also good, but distributed tracing is a lot harder than that because of the environment, and it's changing so fast. What's your take on it?
>> Yeah, well, fundamentally I think this goes back to, ironically, one of the principles in observability, which is that oftentimes you need participation from the developers in sort of making sure that you have the right visibility. And it has to do with the fact that there are many services that are being kind of strung together, as it were, to be able to deliver on some end user transaction or some experience. And so, the fact that you have many services that are part of this means that you need to make sure that each of those components is actually providing some view into what it's doing. And distributed tracing is about taking that and kind of weaving it together so that you get that coherent view of the business workflow within the overall kind of web of services that make up your application.
>> So the next topic I want to get into, we've got limited time, but I'm going to squeeze it through, and I'm going to read it to you real quick. Slow alerts and insights are difficult to scale. If they're difficult to scale, it holds back the mean time to resolve. And so, it's difficult to detect in cloud. It was easier maybe on premise, but with cloud this is another complexity thing. How are you seeing the inability to scale quickly across the environments to manage the performance issues and delays that are coming out of having that kind of slow insight, or managing that? What's your reaction to that?
>> Yeah, well, I think there are a lot of tools out there that will take in events or issues from cloud environments, but they're not designed from the very beginning to be able to handle the sort of scale of what you're looking at. So, as I mentioned, it's not uncommon for a company to have 10's or maybe even 100's of services and 1000's of containers or hosts. And so, there's the sheer amount of data you have to be looking at on an ongoing basis, and the fact that things can change very quickly. Containers can pop in and go away within seconds. And so, the ability to track that in realtime implies that you need to have an architectural approach that is built for that from the very beginning. It's hard to retrofit a system to be able to handle orders of magnitude more complexity and pace of change. You need to start from the very beginning. And the belief we have is that you need some form of a realtime streaming architecture, something that's capable of providing that realtime detection and alerting across a very wide range of things, in order to handle the scale and the ephemeral nature of cloud environments.
>> Let me ask a question then, because I've heard some people say, well, it doesn't matter, 10, 15 minutes to log an event is good enough. How would you react to that? (chuckles) What's a great example of where it's not good enough? I mean, is it minutes, is it seconds, what are we talking about here? What's the good enough bar right now?
>> Yeah, I mean, I think for anybody who has tried to deliver an experience digitally to an end user, if you think you can wait minutes to solve a problem, you clearly haven't been paying enough attention. And I think it almost goes without saying that the faster you know that you have a problem, the better off you are. And so, when you think about what the objectives are that you have for your service levels or your performance or availability,
I think you run out of minutes pretty quickly if you get to anything like, say, three nines. So, waiting 15 minutes maybe would have been acceptable before people were really trying to use your service at scale, but definitely not any more.
>> And the latest app requires it. It's super important. I brought that up tongue in cheek to kind of tee that up for you, because these streaming analytics, streaming engines are super valuable, and knowing when to use realtime and when not to also matters. This is where the platforms come in.
>> Yes, absolutely. The platform is the thing that enables that. And I think you have to sort of build it from the very beginning with that streaming approach, with the ability to do analytics against the streams coming in, in order for you to deliver on this sort of promise of alerts and insights at scale and in realtime.
>> All right, final point. I'll give you the last word here. Give a plug for the Splunk observability suite. What is it? Why is it important? Why should people buy it? Why should people adopt it? Why should they upgrade to it? Give the perspective, give the plug.
>> Yeah, sure. I appreciate the opportunity. So, I think as we've been out there speaking to customers over the last year as part of Splunk, and before that, I think they've spoken to us a lot about the need for better visibility into their environments, which are increasingly complex and where they're trying to deliver the best possible user experience. And to sort of add to that, they're trying to actually consolidate the tools. We spoke about the sprawl at the beginning. And so, with what we're putting together here with the Splunk observability suite, I'd say we have the industry's most comprehensive and powerful combination of solutions that will help both IT and DevOps teams tackle these new challenges for monitoring and observability that other tools simply can't address.
So you're able to eliminate the management complexity by having a single consistent user experience across the metrics and logs and traces, so that you can have seamless monitoring and troubleshooting and investigation. You can create better user experiences by having that true end to end visibility, all the way from the front end to the backend services, so that you can actually see what kind of impact you're having on users and figure it out within seconds. I think we're also able to help increase developer productivity, with these high-performance tools that help the DevOps teams get to better quality code faster, because they can get immediate feedback on how their code changes are doing with each release, and they're able to operate more efficiently. So, I think there's a very large number of benefits from this approach of providing a single unified toolset that relies on a source of data that's consistent across it, but then has the particular tools that different users need for what they care about, whether you're the front end developer needing to understand the user experience, whether you're a backend service owner wanting to see how your service relates to others, or whether you're owning the infrastructure and need to see whether it's actually providing what the services running on it need.
>> Well, Patrick, great to see you. And I just want to say, congratulations. I've been following your work going back in the industry, specifically with SignalFx. You guys were really early in seeing the value of observability before it was a category. And now it's more relevant than ever, just as you guys saw it. So, congratulations and keep up the great work. We'll keep the conversation open. Thanks for coming on.
>> Great, thanks so much, John. Great talking to you.
>> All right, this is theCube, Leading with Observability. It's a series, check it out. We have multiple talk tracks. Check out the Splunk series, Leading with Observability.
I'm John Furrier with theCube. Thanks for watching. (upbeat music)
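
(Editor's note) The remark above about running "out of minutes pretty quickly" at three nines can be made concrete with a little arithmetic. The sketch below is only an illustration, not anything from Splunk's product: the 30-day period, the 15-minute detection delay taken from the question, and the function names are all assumptions chosen for the example.

```python
# Illustrative sketch: how much downtime an availability target allows.
# Assumptions (not from the interview): a 30-day period, and that each
# incident costs at least as long as it takes to detect it.

def downtime_budget_minutes(availability: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted in the period at the given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability)


def incidents_within_budget(availability: float,
                            detection_delay_minutes: float,
                            period_days: int = 30) -> int:
    """Whole incidents that fit in the budget if each lasts at least the detection delay."""
    budget = downtime_budget_minutes(availability, period_days)
    return int(budget // detection_delay_minutes)


if __name__ == "__main__":
    budget = downtime_budget_minutes(0.999)  # "three nines"
    print(f"Three nines over 30 days allows {budget:.1f} minutes of downtime")
    print(f"Incidents that fit with a 15-minute detection delay: "
          f"{incidents_within_budget(0.999, 15)}")
```

Under these assumptions, three nines leaves roughly 43 minutes a month, so two incidents that each take 15 minutes just to notice consume most of the budget before any fix has started.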

Published Date : Feb 22 2021
