Sean Scott, PagerDuty | PagerDuty Summit 2022
>> Welcome back to theCUBE's coverage of PagerDuty Summit 22. Lisa Martin with you here on the ground. I've got one of our alumni back with me. Sean Scott joins me, the Chief Product Officer at PagerDuty. It's great to have you here in person. >> Super great to be here in person. >> Isn't it nice? >> Quite a change, quite a change. >> It is a change. We were talking before we went live about it. That's that readjustment to actually being with another human, but it's a good readjustment to have. >> Awesome readjustment. I've been traveling more and more in the past few weeks, and just visiting the offices, seeing the people, the energy you get, the smiles, it's amazing. So it's so much better than just sitting at your home. >> Oh, I couldn't agree more. For me it's the energy, and the CEO of DocuSign talked about that with Jennifer during her fireside chat this morning, but yes, finally, for someone like me who doesn't like working from home. One of the things that you talked about in your keynote this morning was that the ways we've traditionally been working are no longer working. Talk to me about the future of work. What does it look like from PagerDuty's lens? >> Sure. So there's a few things. If we just take a step back and think about what your day looks like, all the different Slacks, chats, emails, you have your dashboards, you have more Slacks coming in, you have more emails coming in, more chat, and so just when you start the day off, you think you know what you're doing, and then it kind of blows up out of the gate. And so what we're all about is really trying to revolutionize operations. So how do you help make sense of all the chaos that's happening, and how do you make it simpler so you can get back to doing the more meaningful work and leave the tedium to the machines and just automate? >> That would be critical. One of the things, it's been such an interesting dynamic, these two years that we've had, and obviously here we are in San Francisco after the virtual events, but there are so many problems out there that the customer landscape is dealing with: the great resignation, the data deluge, there's just data coming in everywhere. And we have this expectation on the consumer side that a business will know us and have enough context to make us the next best offer that actually makes sense. But now what we're seeing is the great resignation and the data overload are really creating, for many organizations, this operational complexity that's now a problem spread amorphously across the organization. It's no longer something that just the back office has to deal with, or just the front office, it's really across. >> Yeah, that's right. So you think about just the customer's experience, their expectations are higher than ever. I think there's been a lot of great consumer products that have taught the world what good looks like, and I came from a consumer background and we measured the customer experience in milliseconds. And so while companies are talking about minutes or hours of outages, customers are thinking in milliseconds. That's the disconnect, and so you have to be focused at that level and have everybody in your organization focused, thinking about milliseconds of customer experience, not seconds, minutes, hours. If that's where you're at, then you're losing customers. And then you think about, you mentioned the great resignation. Well, what does that mean for a given team or organization? That means lost institutional knowledge.
So if you have the experts and they leave, now who are the experts? And do you have the processes and the tools and the runbooks to make sure that nothing falls on the ground? Probably not. Most of the people that we talk to, they're trying to figure it out as they go, and they're getting better, but there's a lot of institutional knowledge that goes out the door when people leave. And so part of our solution is also around our runbook automation and our process automation, and some of our announcements today really help address that problem, to keep the business running, keep the operations running, keep everything moving and the customers happy ultimately, and keep your business going where it needs to go. >> That customer experience is critical for organizations in every industry these days, because, to your point, we'll tolerate milliseconds, but that's about it. Talk to me about, you did this great keynote this morning that I had a chance to watch, and you talked about how PagerDuty is revolutionizing operations, and I want you to break that down for this audience who may not have heard it. What are those four tenets of revolutionizing operations that PagerDuty is delivering to orgs? >> Sure, so it starts with the data. You mentioned the data deluge that's happening to everybody, right? We integrate with over 650 systems to bring all that data in, so if you have an API or webhook, you can actually integrate with PagerDuty and push this data into PagerDuty. And so that's where it starts, all these integrations, and it's everything from a developer perspective, your CI/CD pipelines, your code repositories; from IT, those systems are instrumented as well; even marketing tech stacks we can instrument and pull data in from. The next step is, now we have all this data, how do we make sense of it? So we have machine learning algorithms that really help you focus your attention and point you to the really relevant work. Part of that is also noise suppression. Our algorithms can suppress noise, about 98% of the noise can just be eliminated, and that helps you really focus where you need to spend your time, 'cause if you think about human time and attention, it's pretty expensive. It's probably one of your company's most precious resources, that human time, and so you want the humans doing the really meaningful work. The next step is automation, which is: okay, we want the humans doing the special work, so what's the tedium? What's the toil that we can get rid of and push to the machines? 'Cause machines are really good at doing very easy, repetitive tasks, and there's a lot of them that we do day in, day out.
The next step is orchestrating the work and getting everybody in the organization on the same page, and that's where this morning I talked about our customer service operations product. Customer service is on the front lines, and they're often getting signals from actual customers that nobody else in the organization may even be aware of yet. So, I was running a system before where all our metrics were good, and you get customer feedback saying, "This isn't working for me," and you go look at the metrics and your dashboards and all looks good, and then you go back and talk to the customer some more and they're like, "No, it's still not working," and you go back to your data, back to your dashboards, back to your metrics, and sure enough, we had an instrumentation issue. But the customer was giving us that feedback, and so customer service is really on the front lines. They're often kind of the unsung heroes for your customers, and they're really helping make sure the right signals are coming to the dev team, to the owners that own it. Even in the case when you think you have everything instrumented, you may be missing something, and that's where they can really help. Our customer service operations product really helps bring everybody onto the same page, and then as the development teams and the IT teams and the SREs push information back to customer service, then they're equipped, empowered, to go tell the customer, "Okay, we know about the issue. Thank you. We should have it up in the next 30 minutes," or whatever it is, five minutes, hopefully it's faster rather than longer. But they can inform the customer to help that customer experience, as opposed to the customer saying, "Oh, I'm just going to go shop somewhere else," or "I'm going to go buy somewhere else or do something else." And the last part is really around, how do we enable our customers with the best practices? Those million users, the 21,000 companies and organizations we're working with, we've learned a lot about what good looks like. And we've embedded that back into our product in terms of our service standards, which really helps SREs and developers set quality standards for how services should be implemented at their company. And then they can actually monitor and track, across all their teams, what's the quality of the services, measure this team against different teams in their organization, and really raise the quality of the overall system. >> So for businesses, and like I mentioned, DocuSign was on this morning, I know some great brand customers that you guys have. I've seen on the website Peloton, Slack, a couple that popped out to me. When you're able to work with a customer to help them revolutionize operations, what are some of the business impacts? 'Cause some of the things that jump out to me would be reduction in churn, retention rate, some of those things that are really impactful to the revenue of a business. >> Absolutely. And so there's a couple different parts of it. One is the work PagerDuty is known for: orchestrating the work for a service outage or a website outage. That's actually easy to measure, 'cause you can measure your revenue that's coming in, or missed revenue, and how much we've shortened that. So I guess that's kind of our history and our legacy, but now we've moved into a lot of the cost side as well.
So, helping customers really understand, from an outage perspective, where to focus their time, as opposed to just orchestrating the work. Well now we can say, we think it's here. We have a new feature we launched last year called Probable Origin. So when you have an outage, we can actually narrow in on where we think the outage is and give you a few clues: this looks anomalous, for example, so let's start here. So that's still focused on the top line. And then from an automation perspective, there's lots and lots of just toil and noise that people are dealing with on a day in, day out basis, and some of it's easy work, some of it's harder work. One of the ones I really like is our automated diagnostics. So, if you have an incident, one of the first things you have to do is go gather telemetry of what's actually happening on the servers: does the CPU look good? Does the memory look good? Does the disk look good? Does the network look good? And that's all perfect work for automation. So we can run our automated diagnostics and have all that data pumped directly into the incident, so when the responder engages, it's all right there waiting for them, and they don't have to go do all those basic tasks of getting data and cutting and pasting into the incident, or if you're using one of those old ticketing systems, cutting and pasting into a ticketing system. It's all right there waiting for you. And that's on average 15 minutes of time saved during an outage. And the nice thing is it can all be kicked off at time zero, so you can actually call, from our event orchestration product, directly into automation actions right there when that event first comes in. So you think about, there's a warning for a CPU and instantly it kicks off the diagnostics, and then within seconds or even minutes, it's in the incident waiting for you to take action. >> One of the things that you also shared this morning that I loved was one of the stats around customer SailPoint, that they had 60 different alerts coming in, and PagerDuty was able to reduce that to one alert. So, a 60x reduction in alerts, getting rid of a lot of noise, allowing them to focus on those key, high-priority escalations that are going to make the biggest impact to their customers and to their business. >> That's right. You think about, you have a high severity incident, like they actually had a database failure, and so when you're in the heat of the moment and you start getting these alerts, you're trying to figure out, is that one incident? Is it 10 incidents? Is it a hundred incidents that I'm having to deal with? And you probably have a good feeling, like, I know it's probably this thing, but you're not quite sure. And so with our machine learning, we're able to eliminate a lot of the noise, and in this case it was going from 60 alerts down to one, just to let you know, this is the actual incident, but then also to focus your attention on where we think the cause may be. And you think about all the different teams that historically had to be pulled in for a large scale incident. We can quickly narrow in on the root cause and just get the right people involved, so we don't have these conference bridges with a hundred people on, which you hear about when these large outages happen, where everyone's on a call across the entire company, and it's not just the dev teams and IT teams, you have PR, you have legal, you have everybody involved in these. And so the more that we can orchestrate that work and get smarter about using machine learning and some of these other technologies, the more efficient it is for our customers, and ultimately the better it is for their customers.
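As a concrete illustration of that time-zero flow, here is a minimal sketch of pushing a CPU warning into PagerDuty through the public Events API v2, the kind of signal an event orchestration rule can match and route straight into automation actions. The endpoint and payload shape follow the documented Events API v2 format; the routing key, hostname, and metric values are placeholders.

```python
import requests

# PagerDuty Events API v2 endpoint (documented public API).
EVENTS_API = "https://events.pagerduty.com/v2/enqueue"

# Placeholder: the integration (routing) key from a PagerDuty
# service or event orchestration integration.
ROUTING_KEY = "YOUR_32_CHAR_INTEGRATION_KEY"

def send_cpu_warning(host: str, cpu_pct: float) -> str:
    """Trigger a warning-severity event; returns the dedup key."""
    event = {
        "routing_key": ROUTING_KEY,
        "event_action": "trigger",
        "payload": {
            "summary": f"CPU utilization at {cpu_pct:.0f}% on {host}",
            "source": host,
            "severity": "warning",
            "component": "cpu",
            "custom_details": {"cpu_pct": cpu_pct},
        },
    }
    resp = requests.post(EVENTS_API, json=event, timeout=10)
    resp.raise_for_status()
    return resp.json()["dedup_key"]

if __name__ == "__main__":
    print(send_cpu_warning("web-01.example.com", 92.5))
```

An orchestration rule keyed on the "cpu" component could then fire a diagnostics runbook immediately, so the telemetry is already attached to the incident by the time a responder opens it.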
>> Right, and hopefully PR, HR, legal don't have to be some of those incident response leaders that right now we're seeing across the organization. >> Exactly. Exactly. >> So when you're talking with customers, and some of the things that you announced, you mentioned automated actions, incident workflows. What are you hearing from the voice of the customer as the chief product officer, and what influence did that have on this year's vision for the PagerDuty Summit? >> Sure. We listen to our customers all the time. It's one of our leadership principles, really trying to hear their feedback. And it was interesting, I got sent some of the chat threads from the keynote afterwards, and there's a lot of excitement about the products we announced. So the first one is incident workflows, and this is really a no-code workflow based on our recent acquisition of a company called Catalytic. You can think of it as kind of our next generation of response plays. So you can actually go in and build a workflow using no-code tooling to say, when this incident happens, or this type of incident happens, here's what that process looks like. And so back to your original comment around the great resignation and that lost institutional knowledge, well now you're building all of this into your processes through your incident response. With incident workflows, if you want to create an incident-specific Slack channel or an incident-specific Zoom bridge, or even just send status updates, all of that is right there for you, and you can use our out of the box orchestrations or you can define your own. 'Cause back to our customer list, we have some of the biggest companies in the world as customers, and we have a very opinionated product, so if you're new to the whole DevOps and full service ownership world, we help you through that. But then a lot of our companies are evolving along that continuum, the operational maturity model continuum, and at the other end, we have customers that say, "This is great, but we want to extend it. We want to call this person or send this or update this system here." And so that's where incident workflows is really powerful. It lets our customers tailor it to their processes and really extend it for them. >> And that's GA later this year? >> Later this year, yes. We'll start early access probably in the next few months and then GA later this year. >> Got it. Last question, as we're almost out of time here. As you talk to customers day in and day out, as you saw the chats from this morning's live keynote, the excitement, the trust that PagerDuty is building with its customers, its partners, et cetera, what excites you about the future? >> So it's really why I came to PagerDuty. I've been here about a year and a half now, but revolutionizing operations, that's a big statement, and I think we need it. I think Jennifer said in her keynote today, work is broken, and I think our data backs that up. We surveyed our customers earlier this year, and 42% of the respondents were working more hours in 2021 compared to 2020.
And I don't think anyone goes home and says, if I could only work more hours, if I could only do more of this tedium, more toil, life would be so good. We don't hear that. We don't hear that a lot. We hear that there's a lot of noise, that there's massive attrition, like every company has. That's the type of feedback that we get, and so that's what gets me excited: the tools that we're building. And especially just seeing the chat this morning about some of the announcements, it shows we've been listening, and it shows the excitement in our customers when, instead of, I'm going to use this tool, that tool, they can just use PagerDuty, which is awesome. >> The momentum is clear and it's palpable, and I love being a part of that. Thank you so much, Sean, for joining me on theCUBE this afternoon, talking about what's new, what's exciting, and how you guys are fixing work that's broken. It validated my thinking that work was broken, so thank you. >> Happy to be here, and thanks for having me. >> My pleasure. For Sean Scott, I'm Lisa Martin. You're watching theCUBE's coverage of PagerDuty Summit 22, on the ground from San Francisco. (soft music)
Ed Bailey, Cribl | AWS Startup Showcase S2 E2
(upbeat music) >> Welcome everyone to theCUBE presentation of the AWS Startup Showcase. The theme here is Data as Code. This is season two, episode two of our ongoing series covering the exciting startups from the AWS ecosystem, talking about the future of data, the future of analytics, the future of development, and all kinds of cool stuff in multicloud. I'm your host, John Furrier. Today we're joined by Ed Bailey, Senior Technical Evangelist at Cribl. Thanks for coming on theCUBE here. >> I thank you for the invitation, thrilled to be here. >> The theme of this session is the observability lake, which I love, by the way, I'm getting into that in a second. A breach investigation's best friend, which is a great topic. Couple of things: one, I like the breach investigation angle, but I also like this observability lake positioning, because I think this is a teaser of what's coming, more and more data usage where it's actually being applied to specific things. Here, it's the observability lake. So first, what is an observability lake? Why is it important? >> Why it's important is that technology professionals, especially security professionals, need data to make decisions. They need data to drive better decisions. They need data to achieve understanding. And that means they need everything. They don't need just what they can afford to store. They don't need just what a vendor is going to let them store. They need everything. And that's the point of the observability lake: you couple an observability pipeline with the lake to bring in your enterprise data, to make it accessible for analytics, to be able to use it, to be able to get value from it. And I think that's one of the things that's missing right now in the enterprises. Admins are being forced to make decisions about, okay, we can't afford to keep this, we can afford to keep this. They're missing things. They're missing parts of the picture. And by being able to bring it together, to have your cake and eat it too, where I can get what I need and I can do it affordably, I think that's the future, and it just drives value for everyone. >> And it just makes a lot of sense. Data lake, or the earlier concept, throw everything into the lake and you can figure it out, you can query it, you can take action on it in real time, you can stream it. You can do all kinds of things with it. And observability is important because it's the most critical thing people are doing right now, for all kinds of things, from QA, administration, security. So this is where the breach piece comes in. I like that part of the talk, the breach investigation's best friend, because it implies that you've got the secret sauce behind it, right? So, what is the state of the breach investigation today? What's going on with that? Because we know breaches, we see 'em out there, but why is this the best friend of a breach investigator? >> Well, and this is unfortunate, but typically there's an enormous delay between breach and detection. There's an IBM study, I think it's 287 days from the actual breach to detection and containment. It's an enormous amount of time. And the key is, when you do detect a breach, you're bringing in your incident response team, and typically, without an observability lake, without Cribl's solutions around the observability pipeline, you're going to have an incomplete picture. The incident response team first has to understand the scope of the breach.
Is it one server? Is it three servers? Is it all the servers? You've got to understand what's been compromised, what the impact is, how the breach occurred in the first place. And they need all the data to stitch that together, and they need it quickly. The more time it takes to get that data, the more time it takes for them to finish their analysis and contain the breach. Hence the, I think it's about 87 to 90 days, to contain a breach. And so by being able to remove the friction, by being able to make it easier to achieve these goals, what shouldn't be hard but is, you speed up the containment and resolution time. Not to mention, many system administrators simply don't have the data, because they can't afford to store it in their SIEM, or they have to go to their backup team to get a restore, which can take days. And so it's just so many obstacles to getting resolution right now. >> I mean, you're crawling through glass there, right? Because you think about just the timing aspect. Where is the data? Where is it stored, and is it relevant and-- >> And do you have it at all? >> And do you have it at all? And then, you know, that person doesn't work there anymore, they changed jobs. I mean, who is keeping track of all this? You guys now have this capability where you can come in and do the instrumentation with the observability lake without a lot of change to the environment, which is not the way it used to be. It used to be: buy a tool, build a platform. Cribl has a solution that eases the struggles within the enterprise. What specifically is that pain point? And what do you guys do specifically? >> Well, I'll start out with kind of an example of what drew me to Cribl, back in 2018. I'm running the Splunk team for a very large multinational, and the complexity of the data, the demands we were getting from security and operations, were just an enormous issue to overcome. I had vendors come to me all the time saying, we'll solve your problems, but that meant you had to move to their platform, where you'd have to get rid of Splunk or you'd have to do this, and I'm losing something. And what Cribl Stream brought in was I could put it between my sources and my destinations and manage my data. I would have flow control over the data. I don't have to lose anything. I could keep continuing to use our existing analytics tools, and that sense of power and control, and I don't have to lose anything, I was like, there's something wrong here. This is too good to be true. And so what we're talking about now, in terms of breach investigation, is that with Cribl Stream, I can create a clone of my data to an object store. And this is almost any object store, so it can be AWS, it could be the other vendors' object stores, it could be on-prem object stores. And then I can house my data, all my data, at the cheapest possible price. So instead of eating up my most expensive storage, I put all my data in my object store, and I only put the data I need for the detections in my SIEM. So if, and hopefully never, but if you do have a breach, LogStream has a wonderful UI that makes it trivial to then pick my data out of my object store and restore it back into my SIEM, so that my IR team can develop a complete picture of how the breach happened. What's the scope? What was their lateral movement? And answer those questions. And it just takes the friction away. Just like you said, no more crawling over glass. You're running to your solution.
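As a rough sketch of the pattern Ed describes, everything cloned to cheap object storage, only detection-relevant events forwarded to the SIEM, here's what that routing might look like in Python. This is not Cribl code; the bucket name, the severity filter, and the send_to_siem forwarder are all hypothetical stand-ins.

```python
import json
import gzip
import time
import boto3

s3 = boto3.client("s3")
BUCKET = "observability-lake"  # hypothetical lake bucket

def ship(events: list[dict]) -> None:
    """Clone everything to the lake; forward only what detections need."""
    # 1) Full-fidelity copy to cheap object storage, partitioned by time
    #    so a later replay can ask for "this window" efficiently.
    key = time.strftime("events/%Y/%m/%d/%H/") + f"{int(time.time())}.json.gz"
    body = gzip.compress("\n".join(json.dumps(e) for e in events).encode())
    s3.put_object(Bucket=BUCKET, Key=key, Body=body)

    # 2) Reduced stream to the (expensive) SIEM: only events a detection
    #    rule actually needs, e.g. warnings and above.
    for event in events:
        if event.get("severity") in ("warning", "error", "critical"):
            send_to_siem(event)  # hypothetical forwarder

def send_to_siem(event: dict) -> None:
    print("SIEM <-", event["severity"], event.get("message", ""))

if __name__ == "__main__":
    ship([
        {"severity": "info", "message": "heartbeat", "src_ip": "10.0.0.5"},
        {"severity": "critical", "message": "auth failure burst",
         "src_ip": "203.0.113.7"},
    ])
```

The time-partitioned key layout is what makes the later replay step cheap: "give me this window, this IP" maps onto a small set of objects instead of a full restore.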
>> You mentioned object store, and you're streaming that in. You talk about the Cribl Stream tool. I'm assuming, when you're streaming the pipeline stuff, is there a schema involved? Are there database challenges? How do you guys look at that? I know you're vendor agnostic. I like that piece, you plug in and you leverage all the tools that are out there, Splunk, Datadog, whatever. But how about on the database side, what's the impact there? >> Well, so I'm assuming you're talking about the object store itself. We don't have to apply a schema; we can fit the data to whichever object store it is. We structure the data so it makes it easier to understand. For example, if I want to see communications from one IP to another IP, we structure it to make it easier to see that and query that. But yeah, it's completely vendor neutral, and this makes it so simple, so simple to enable. I think-- >> So no pre-defined schema needed. >> No, not at all. And it made it so much easier. When we enabled this for the enterprise, I think it took us three hours to do, and we were able to then start cutting our retention costs dramatically. >> Yeah, it's great when you get that kind of value. Time to value is critical, and all the skeptics fall to the side pretty quickly. (chuckles) I got to ask you, well, go ahead. >> I was going to say, previously I would have to go to our backup team. We'd have to open up a ticket, we'd have to have a bridge, then we'd have to go through the process of pulling tape, and it could take hours, if not days, to restore the amount of data we needed. Now we're able to run to our goals and solve business problems instead of focusing on the process steps of getting things done. >> Right, so take me through the architecture here and some customer examples, 'cause you have Cribl Stream there, the observability pipeline. That's key, you mentioned that. >> Yes. >> And then they build out these observability lakes from that. So what is the impact of that? Can you share the customers that are using that solution? What are they seeing for benefits? What are some of the impacts? Can you give us some specifics? >> I mean, I can't share all the exact customer names, but I can definitely give you some examples. A referenceable customer would be TransUnion; I came from TransUnion. I was one of the first customers, and it solved an enormous number of problems for us. Autodesk is another great example. The idea is that we're able to automate data practices. I mean, just for example, what we were talking about with backups: you have to put a lot of time into managing your backups and your analytics platforms, and then you're locked into custom database schemas, you're locked into vendors, and it's still expensive. So being able to spend a few hours, dramatically cut your costs, but still have the data available, that's the key. I didn't have to make compromises, 'cause before, I was having to say, okay, we're going to keep this, we're going to just drop this and hope for the best. And we just didn't have to do that anymore.
I think it's the same thing for TransUnion and Autodesk: the idea that we're going to lower our cost, we're going to make it easier for our administrators to do their job, so they can spend more time on business value fundamentals, like responding to a breach. You're going to spend time working with your teams, getting value out of observability solutions, and stop spending time writing custom solutions using open source tools, 'cause your engineering time is the most precious asset for any enterprise, and you've got to focus your engineering time where it's needed the most. >> Yeah, and you can't underestimate the hassle and cost of ownership of swapping out pre-existing stuff just for the sake of having a functionality. I mean, that's a big-- >> It's pain, and that's a big thing about LogStream: being vendor neutral is so important. If you want to use the Splunk universal forwarder, that's great. If you want to use Beats, that's awesome. If you want to use Fluentd, even better. If you want to use all three, you can do that too. It's the customer's choice, and we're saying to people, use what suits your needs. And if you want to write some of your data to Elastic, that's great. Some of your data to Splunk, that's even better. Some of it to, pick your pick, Exabeam, fine as well. You have the choice to put your own solutions together and put your data where you need it to be. We're not asking you to work only in our ecosystem with only our partners. We're letting you pick and choose what suits your business. >> Yeah, you know, that's the direction I was just talking about with the Amazon folks around their serverless. You can use any tool; they have that core architecture for everything, the S3, and then you pick whatever you want to use, SageMaker, all those other things. This is the new way. That's the way it has to be to be effective. How do you guys handle that? What's been the reaction from customers? Do they roll their eyes and doubt you guys, or can you do it? Are they skeptical? How fast can you convert 'em over? (chuckles) >> Right, and that's always the challenge. And, I mean, the best part of my day is talking to customers. I love hearing their feedback, what they like, what they don't, and what they need. And of course I was skeptical. I didn't believe it when I first saw it, because I was used to being locked in. I was used to having to put in a lot of effort, a lot of custom code. Like, what do you mean, it's this easy? I believe, this was 2018, I did our first demo, and like 30 minutes in, I cut about half a million dollars out of our license, in the first 30 minutes of our first demo. And I was stunned, because, I mean, it's like, this is easy. >> Yeah, I mean-- >> Yeah, exactly. And this is the future. And then, for example, the security team wanted to bring in a UBA solution that wasn't part of the vendor ecosystem that we were in. And I was like, not a problem. We're going to use LogStream. We're going to clone a copy of our data to the UBA solution. We were able to get value from this UBA solution in weeks, where typically it's a six month cycle to start getting value. It was just too easy. And the thing that just struck me was, my engineers can now spend their time on delivering value instead of integrations and moving data around.
>> Yeah, and also you can spend more time preventing breaches. But what's interesting and counterintuitive here is that as you add more flexibility and choice, you'd think it'd be harder to handle a breach, right? So now let's go back to the scenario. Say an organization has a breach, and they have the observability pipeline, they've got the lake in place, your observability lake. Take me through the investigation. How easy is it? What happens? How do they start it? What goes on? >> So, once your SOC detects a breach, typically you're going to bring in your incident response team. So what we did, and this is one more way that we removed that friction, cleaned up the glass, is we delegate to the incident response team the ability to restore. Cribl calls it Replay: we replay data out of your object store back into your SIEM. There's a very nice UI that gives you the ability to say, "I want data from this time period to this time period, and I want it to be all the data," or the ability to filter and say, "I want just this IP." For example, if I detected, okay, this IP has been breached, then I'm going to pull all the data that mentions this IP in this timeframe, hit a button, and it just starts. And then it restores as fast as the IOPS of your solution allow, and then it's back in your tool, it's back in your tool. One of the things I also want to mention is we have an amazing enrichment capability. One of the things that we would do is we would have pipelines, so as the data comes out of the object store, it hits the pipeline and then we enrich it. We use GeoIP information, reverse DNS, and it gets processed through a threat intel feed. So the data's already enriched and ready for the incident response people to do their job. And so it just removes the friction of getting to the point where I can start doing my job. >> You know, the theme of this episode, this showcase, is Data as Code, and I've been saying this on theCUBE since it started around 13 years ago: developers are going to be dealing with data like they deal with software code, and you're starting to see it. You mentioned enrichment. Where do you see Data as Code going? How relevant is it now? Because we're really talking about, when you add machine learning in here, that has to be enriched and iterated on too. We're talking about taking things off a branch and putting them back into the core. This is a data discussion, this isn't software, but it sounds the same. >> Right, and the irony is, I remember first saying it to an auditor. I was constantly working with auditors, and that's what I described: I'm going to show you the code that manages the data. This is the data's code, that's going to show you how we transform it, how we secure it, where the data goes, how it's enriched. You can see the whole story, the data life cycle, in one place. And that's how we handled our auditors. And I think that is enormously positive, because it's so easy to be confused, so easy for complexity to get in the way of progress. And by being able to represent your Data as Code, it's a step forward, 'cause the amount of data and the complexity of data, it's not getting simpler, it's getting more complex. So we need to come up with better ways to handle it.
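Ed's "code that manages the data" point is easy to picture against the replay flow he just described. Purely as an illustration, not Cribl's actual implementation (Stream pipelines are configured in the product itself), here is a Python sketch of what that enrich-on-replay step does to each event; the GeoIP table and threat intel list are placeholder stand-ins for real lookup databases and feeds.

```python
import socket

# Placeholder stand-ins for a real GeoIP database and threat intel feed.
GEOIP = {"203.0.113.7": {"country": "NL", "asn": "AS64501"}}
THREAT_INTEL = {"203.0.113.7"}

def enrich(event: dict) -> dict:
    """Enrich a replayed event before it lands back in the SIEM."""
    ip = event.get("src_ip", "")

    # GeoIP lookup (a real pipeline would query e.g. a MaxMind database).
    event["geo"] = GEOIP.get(ip, {})

    # Reverse DNS, best effort.
    try:
        event["rdns"] = socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror, OSError):
        event["rdns"] = None

    # Flag anything on the threat intel list so responders see it instantly.
    event["threat_match"] = ip in THREAT_INTEL
    return event

if __name__ == "__main__":
    print(enrich({"src_ip": "203.0.113.7", "message": "auth failure burst"}))
```

The payoff is exactly what Ed describes: by the time the replayed data lands in the SIEM, the responder isn't doing lookups by hand, the context is already on the event.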
>> Now you've been on both sides of the fence. You've been in the trenches as a customer, now you're a supplier with a great solution. What are people doing with these data engineering roles? Because there's not enough data engineering. I mean, if you say Data as Code, if you believe that to be true, and many people do, we do, and you look at the history of infrastructure as code that enabled DevOps, AIOps, MLOps, DataOps, it's happening, right? So data stack ops is coming. Obviously security is huge in this. How does that data engineering role evolve? Because it just seems more and more that there's going to be a big push towards an SRE version of data, right? >> I completely agree. I was working with a customer yesterday, and I spent a large part of our conversation talking about implementing development practices for administrators. It's a new role, a new way to think of things, 'cause traditionally your Splunk or Elastic administrators are talking about operating systems and memory and how to use the vendor's proprietary tools, and that's just not quite the same. And so we started talking about, you need to start getting used to code reviews, the idea of making sure everything has a comment. One thing I told him was, if you have a function, it has to have a comment, just by default. The standards of how you write things, how you name things, all really start to matter. And you've also got to start considering your skillset. Probably one of the best hires I ever made was a guy with a math degree, because I needed his help to understand how machine learning works, how to pick the best type of algorithm. And I think this is going to evolve, that you're going to move from the gray-bearded administrator to some other gray-bearded administrator with a math degree. >> It's interesting, it's a step function. You have a data engineer who's got that kind of capability, like what the SRE did with infrastructure. The step function of enablement, the value creation from really good data engineering, puts the democratization playbook on the table, and changes, >> Thank you very much, John. >> And changes that entire landscape. What's your reaction to that? >> I completely agree, 'cause operational data, operational security data, is the most volatile data in the enterprise. It changes on a whim. You have developers who change things, and they don't tell you what happened, the vendor doesn't tell you what happened. And so that idea, that life cycle of managing data, the same types of standards and disciplines that database administrators have applied for years, has to filter down into the operational areas, and you need tooling that's going to give you the ability to manage that data, manage it in flight in real time, in order to drive detections, in order to drive response, all those business value things we've been talking about. >> So I've got to ask you about the larger role that you see with observability lakes. We were talking before we came on camera live here about how exciting this kind of concept is, and you were attracted to the company because of it. I love the observability lake concept because it puts all that data in one spot, you can manage it. But you've got machine learning and AI around the corner that also can help. How has all this changed the landscape of data security and things? Because it makes a lot of sense, and I can only see it getting better with machine learning.
>> Yeah, it definitely does. >> Totally. And so the core issue, when you talk about observability, is that most people assume observability is only an operational or application support process. It's also a security process. The idea that you're looking for your unknown unknowns, this is what keeps security administrators up at night: I'm being attacked by something I don't know about. How do you find those unknowns? And that's where your machine learning comes in. And you have to understand there are so many different types of machine learning algorithms; the guy that I hired started educating me about the umpteen number of algorithms, how they apply to different data, how you get different value, how you have to test your data constantly. There's no such thing as a magical black box of machine learning that gives you value. You have to keep testing, over and over again, just like developer practices; data scientists, for example. >> The best friend of a machine learning algorithm is data, right? You've got to keep feeding it that data, and when the data sets are baked and secure and vetted, even better. All cool. Great stuff, great insight. Congratulations, Cribl, great solution. Love the architecture, love the pipelining of the observability data and streaming that into a lake. Great stuff. Give a plug for the company, where you guys are at, where people can get information. I know you guys have a bunch of live feeds on YouTube, Twitch, here on theCUBE. Where else can people find you? Give the plug. >> Oh, please, please join our Slack community, go to cribl.io/community. We have an amazing community. This was another thing that drew me to the company, a large group of people who are genuinely excited about data, about managing data. If you want to try Cribl out, we have some great tools. We have a cloud platform with one terabyte of free data. So go to cribl.io/cloud or cribl.cloud and sign up, and it never times out. It's not a 30-day trial; it's forever, up to one terabyte. Try out our new products as well, like Cribl Edge. And then finally, come watch Nick Decker and me, every Thursday, 2:00 PM Eastern. We have live streams on Twitter, LinkedIn and YouTube Live, and my Twitter handle is EBA 1367. Love to chat, love to have these conversations. And also, we are hiring. >> All right, good stuff. Great team, great concepts, right? Of course, we're theCUBE here. We've got our video lake coming soon. I love this idea of having these videos. Hey, video's data too, right? I mean, we've got to keep coming to you. >> I love it, I love videos, it's awesome. It's a great way to communicate, a great way to have a conversation. That's the best thing about us, having conversations. I appreciate your time. >> Thank you so much, Ed, for representing Cribl here on Data as Code. This is season two, episode two of the ongoing series covering the hottest, most exciting startups from the AWS ecosystem, talking about the future of data. I'm John Furrier, your host. Thanks for watching. >> Ed: All right, thank you. (slow upbeat music)
Brett McMillen, AWS | AWS re:Invent 2020
>> From around the globe, it's theCUBE, with digital coverage of AWS re:Invent 2020, sponsored by Intel and AWS. >> Welcome back to theCUBE's coverage of AWS re:Invent 2020. I'm Lisa Martin. Joining me next is one of our CUBE alumni. Brett McMillen is back, the director of U.S. federal for AWS. Brett, it's great to see you, glad that you're safe and well. >> Great. It's great to be back. I think last year when we did theCUBE, we were on the convention floor. It feels very different this year here at re:Invent. It's gone virtual, and yet it's still true to how re:Invent has always been. It's a learning conference, and we're releasing a lot of new products and services for our customers. >> Yes, a lot of content, as you say. The one thing I would say about this re:Invent, one of the things that's different, is it's so quiet around us. Normally we're talking loudly over tens of thousands of people on the showroom floor, but it's great that AWS is still able to connect in an even bigger way with its customers. So during Teresa Carlson's keynote, and I want to get your opinion on this, she talked about the AWS open data sponsorship program, and that you guys are going to be hosting the National Institutes of Health (NIH) Sequence Read Archive data. The former biologist in me gets really excited about that. Talk to us about that, because especially during the global health crisis that we're in, that sounds really promising. >> It very much is, and I am so happy that we're working with NIH on this and multiple other initiatives. So the Sequence Read Archive, or SRA, essentially is a very large data set of sequenced genomic data, and it's a wide variety of genomic data. It's not only human genetic data; all life forms, all branches of life, are in SRA, to include viruses, and that's really important here during the pandemic. It's one of the largest and oldest genomic data sets out there, and yet it's very modern. It has been designed for next generation sequencing, so it's growing, it's modern, and it's well used. It's one of the more important ones that's out there. One of the reasons this is so important is that we want to find cures for human ailments and disease and death, and by studying the genomic code, the scientists can come up with the answers. And that's what Amazon is doing: we're putting in the hands of the scientists the tools so that they can help cure heart disease and diabetes and cancer and depression, and yes, even viruses that can cause pandemics.
When we start talking about it, there's multiple things. The scientists needs. One is access to these data sets and SRA. >>It's a very large data set. It's 45 petabytes and it's growing. I personally believe that it's going to double every year, year and a half. So it's a very large data set and it's hard to move that data around. It's so much easier if you just go into the cloud, compute against it and do your research there in the cloud. And so it's super important. 45 petabytes, give you an idea if it were all human data, that's equivalent to have a seven and a half million people or put another way 90% of everybody living in New York city. So that's how big this is. But then also what AWS is doing is we're bringing compute. So in the cloud, you can scale up your compute, scale it down, and then kind of the third they're. The third leg of the tool of the stool is giving the scientists easy access to the specialized tool sets they need. >>And we're doing that in a few different ways. One that the people would design these toolsets design a lot of them on AWS, but then we also make them available through something called AWS marketplace. So they can just go into marketplace, get a catalog, go in there and say, I want to launch this resolve work and launches the infrastructure underneath. And it speeds the ability for those scientists to come up with the cures that they need. So SRA is stored in Amazon S3, which is a very popular object store, not just in the scientific community, but virtually every industry uses S3. And by making this available on these public data sets, we're giving the scientists the ability to speed up their research. >>One of the things that Springs jumps out to me too, is it's in addition to enabling them to speed up research, it's also facilitating collaboration globally because now you've got the cloud to drive all of this, which allows researchers and completely different parts of the world to be working together almost in real time. So I can imagine the incredible power that this is going to, to provide to that community. So I have to ask you though, you talked about this being all life forms, including viruses COVID-19, what are some of the things that you think we can see? I expect this to facilitate. Yeah. >>So earlier in the year we took the, um, uh, genetic code or NIH took the genetic code and they, um, put it in an SRA like format and that's now available on AWS and, and here's, what's great about it is that you can now make it so anybody in the world can go to this open data set and start doing their research. One of our goals here is build back to a democratization of research. So it used to be that, um, get, for example, the very first, um, vaccine that came out was a small part. It's a vaccine that was done by our rural country doctor using essentially test tubes in a microscope. It's gotten hard to do that because data sets are so large, you need so much computer by using the power of the cloud. We've really democratized it and now anybody can do it. So for example, um, with the SRE data set that was done by NIH, um, organizations like the university of British Columbia, their, um, cloud innovation center is, um, doing research. And so what they've done is they've scanned, they, um, SRA database think about it. They scanned out 11 million entries for, uh, coronavirus sequencing. And that's really hard to do in a typical on-premise data center. Who's relatively easy to do on AWS. 
So by making this available, we can have a larger number of scientists working on the problems that we need to have solved. >>Well, and as the, as we all know in the U S operation warp speed, that warp speed alone term really signifies how quickly we all need this to be progressing forward. But this is not the first partnership that AWS has had with the NIH. Talk to me about what you guys, what some of the other things are that you're doing together. >>We've been working with NIH for a very long time. Um, back in 2012, we worked with NIH on, um, which was called the a thousand genome data set. This is another really important, um, data set and it's a large number of, uh, against sequence human genomes. And we moved that into, again, an open dataset on AWS and what's happened in the last eight years is many scientists have been able to compute about on it. And the other, the wonderful power of the cloud is over time. We continue to bring out tools to make it easier for people to work. So what they're not they're computing using our, um, our instance types. We call it elastic cloud computing. whether they're doing that, or they were doing some high performance computing using, um, uh, EMR elastic MapReduce, they can do that. And then we've brought up new things that really take it to the next layer, like level like, uh, Amazon SageMaker. >>And this is a, um, uh, makes it really easy for, um, the scientists to launch machine learning algorithms on AWS. So we've done the thousand genome, uh, dataset. Um, there's a number of other areas within NIH that we've been working on. So for example, um, over at national cancer Institute, we've been providing some expert guidance on best practices to how, how you can architect and work on these COVID related workloads. Um, NIH does things with, um, collaboration with many different universities, um, over 2,500, um, academic institutions. And, um, and they do that through grants. And so we've been working with doc office of director and they run their grant management applications in the RFA on AWS, and that allows it to scale up and to work very efficiently. Um, and then we entered in with, um, uh, NIH into this program called strides strides as a program for knowing NIH, but also all these other institutions that work within NIH to use the power of the cloud use commercial cloud for scientific discovery. And when we started that back in July of 2018, long before COVID happened, it was so great that we had that up and running because now we're able to help them out through the strides program. >>Right. Can you imagine if, uh, let's not even go there? I was going to say, um, but so, okay. So the SRA data is available through the AWS open data sponsorship program. You talked about strides. What are some of the other ways that AWS system? >>Yeah, no. So strides, uh, is, uh, you know, wide ranging through multiple different institutes. So, um, for example, over at, uh, the national heart lung and blood Institute, uh, do di NHL BI. I said, there's a lot of acronyms and I gel BI. Um, they've been working on, um, harmonizing, uh, genomic data. And so working with the university of Michigan, they've been analyzing through a program that they call top of med. Um, we've also been working with a NIH on, um, establishing best practices, making sure everything's secure. So we've been providing, um, AWS professional services that are showing them how to do this. 
So one portion of STRIDES is getting the right data sets, the right compute, and the right tools into the hands of the scientists. The other area we've been working on is making sure the scientists know how to use it all. We've been developing cloud learning pathways; we started this quite a while back, and it's been so helpful during the pandemic. Scientists can now take self-paced online courses and learn how to maximize their use of cloud technologies through the pathways we've developed for them. >> Well, education is imperative. You think about all of the knowledge they have within their scientific disciplines, and being able to leverage technology in a way that's easy is absolutely imperative to the timing. So let's talk about other data sets. You've got SRA; what other data sets are available through this program? >> We have a wide range of open data sets, and in general these are data sets that improve the human condition or improve the world we live in. I've talked about a few things; there are a few more. For example, there's The Cancer Genome Atlas, which we've been working on with the National Cancer Institute as well as the National Human Genome Research Institute. That's a very important data set being computed against throughout the world; within the scientific community it's commonly called TCGA. Then we also have data sets focused on certain groups. For example, Kids First is a data set looking at many of the challenges and diseases that kids get, everything from very rare pediatric cancers to heart defects, et cetera. So we're working with them, but it's not just on the medical side. We have open data sets with, for example, NOAA, the National Oceanic and Atmospheric Administration, to better understand what's happening with climate change and to slow its rate. Within the Department of the Interior, there's a Landsat database of satellite pictures of the Earth, so we can better understand the world we live in. Similarly, NASA has a lot of data that we've put out there, and over in the Department of Energy there are data sets that scientists are researching against to make sure we have better clean, renewable energy sources. But it's not just government agencies we work with. When we find a data set that's important, we also work with nonprofit organizations. Nonprofits are not flush with cash and are trying to make every dollar work, so we've worked with organizations like the Child Mind Institute and the Allen Institute for Brain Science, which have largely neuroimaging data, and made that available via our open data set program. So there's a wide range of things we're doing, and what's great about it is that when we do it, you democratize science and you allow many, many more scientists to work on these problems that are so critical for us.
>> The availability is incredible, but so are the breadth and depth of what you just described; it's not just government, for example. We've got about 30 seconds left, so I'm going to ask you to summarize some of the announcements from re:Invent 2020 that you think are really critical for federal customers to pay attention to. >> Yeah. One of the things these federal government customers have been coming to us about is that they've had to find new ways to communicate with their customer, the public. We have a product we've had for a while called Amazon Connect, and it's been used very extensively by government customers; it's used in industry too. We've had a number of announcements this week: Andy Jassy made multiple announcements about enhancements to Amazon Connect and additional services, everything from helping verify that it's the right person, with Amazon Connect Voice ID, to making sure the customer gets a good experience, with Amazon Connect Wisdom, to making sure the managers of these call centers can manage them better. So I'm really excited that we're putting a cloud-based solution in the hands of both government and industry to make their connections to the public better. >> It's all about connections these days. I wish we had more time, because I know we could unpack so much more with you, but thank you for joining me on theCUBE today and sharing some of the insights, the impacts, and the availability that AWS is enabling for the scientific and other federal communities. It's incredibly important, and we appreciate your time. >> Thank you, Lisa. >> For Brett McMillen, I'm Lisa Martin. You're watching theCUBE's coverage of AWS re:Invent 2020.
Tammy Butow & Alberto Farronato, Gremlin | CUBE Conversation, April 2020
From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation. >> Hello everyone, welcome to this CUBE Conversation here in the Palo Alto studios of theCUBE. I'm John Furrier, your host. We're here during the COVID-19 crisis doing remote interviews. I come into the studio, where we've got a quarantine crew getting the interviews and getting the stories out there, and of course the story we continue to talk about is the impact of COVID-19 and how we're all getting back to work, either working at home or working remotely and virtually. As things start to change, we're starting to see events go mostly digital, and we're here to talk about an event coming up called the Failover Conference, from Gremlin, which has now gone digital; it's April 21st. But what's important about this conversation is not only the event that's coming up, but the at-scale problems being highlighted by this change in work environment. We've been talking about at-scale problems, whether it's a surge of traffic or the chaos that's ensuing across the world with this pandemic. So I'm excited to have two great guests: Alberto Farronato, senior vice president of marketing at Gremlin, and Tammy Butow, principal site reliability engineer, or SRE. Guys, thanks for coming on, appreciate it. >> Thank you. >> Thank you. >> Alberto, I want to get to you first. We've known each other for a while, and we've all been talking about cloud native and cloud scale for some time; it's kind of inside the ropes, inside baseball. Tammy, you're a site reliability engineer; everyone knows Google, knows how well cloud works, and this is large-scale stuff. Now with COVID-19 we're starting to see the average person, my brother, my sister, our family members, people around the world, go: oh my god, this is really high impact. This change of behavior, the surge of traffic on the internet, work-at-home tools that are inadequate; you start to see the things that were planned for not working well, and this actually maps to the things we've been talking about in our industry. Alberto, you've been on this. How are you guys doing, and what's your take on the situation we're in right now? >> Yeah, we're doing pretty well as a company. We were born as a distributed organization to begin with, so for us, working in a distributed environment from all over the world is common practice day to day. Personally, I'm originally from Italy; my parents and my family are in Milan and Bergamo, of all places, so I have to follow the news with extra care. It becomes so much clearer nowadays that technology is not just a powerful tool to enable our businesses; it's also critical for our day-to-day life. Thanks to video calls I can easily talk to my family back there every day. So yes, we've been talking for a long time, as you mentioned, about complex systems at scale and reliability, often in the context of mission-critical applications, but more and more these systems need to be reliable also when it comes to the back-office systems that enable people to continue to work on a daily basis. >> Well, our hearts go out to your family and your friends in Italy; we hope everyone stays safe there. That was a tough situation, and it continues to be a challenge.
>> Tammy, I want to get your thoughts. How is life going for you? You're a site reliability engineer; what you deal with on the tech side is now happening in the real world. It's almost mind-blowing to me that we're seeing these things happen; it's a paradigm that needs attention. You look at it as an SRE, dealing with it mostly from the tech side, and now you're seeing it play out in real life. >> It's such an interesting situation, really terrible. One of the things I specialize in as a site reliability engineer is incident management. For example, I previously worked at Dropbox, where I was the incident manager on call for 500 million customers, 24/7. In these large-scale incidents you really need to be able to act fast. There are two very important metrics that we track and care about as site reliability engineers. The first is mean time to detection: how fast can you detect that something is happening? Obviously, if you detect an issue faster, you've got a better chance of keeping the impact low, so you can contain the blast radius. I like to explain it to people this way: if you have a fire in a saucepan in your kitchen and you put it out, that's way better than waiting until your entire house is on fire. The other metric is mean time to resolution: how long does it take you to recover from the situation? So yeah, this is a large-scale global incident that we're in right now.
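The two metrics Tammy names are simple to compute once incidents carry start, detection, and resolution timestamps. A minimal sketch in Python, with invented incident records purely for illustration:

```python
from datetime import datetime

# Hypothetical incident records; real ones would come from an incident tracker.
incidents = [
    {"started": datetime(2020, 4, 1, 10, 0),
     "detected": datetime(2020, 4, 1, 10, 4),
     "resolved": datetime(2020, 4, 1, 10, 45)},
    {"started": datetime(2020, 4, 3, 22, 10),
     "detected": datetime(2020, 4, 3, 22, 11),
     "resolved": datetime(2020, 4, 3, 23, 0)},
]

# Mean time to detection: how long incidents burn before anyone notices.
mttd = sum((i["detected"] - i["started"]).total_seconds()
           for i in incidents) / len(incidents)
# Mean time to resolution: how long until service is fully recovered.
mttr = sum((i["resolved"] - i["started"]).total_seconds()
           for i in incidents) / len(incidents)

print(f"MTTD: {mttd / 60:.1f} min, MTTR: {mttr / 60:.1f} min")
```

Driving MTTD down is the saucepan analogy in code: the smaller the detection gap, the smaller the blast radius you have to contain.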
>> Yeah. I know you guys talk a lot about chaos theory, and there's a lot of math involved, we all know that, but when you look at the real world, this is going to be table stakes, and there's now a line in the sand: pre-pandemic and post-pandemic. I think you have an interesting company in Gremlin, in the sense that this is a complex system. If you think about the world we're going to be living in, whether it's digital events like the one you have coming up or work-at-home tools, humans are going to be working with systems. This new paradigm is going to be upon us pretty quickly, and it's not just buying software; it's a complex system, it's distributed computing and operating. So can you talk about how Gremlin is attacking these new problems and the new opportunities that are emerging? >> One of the things I've always specialized in over the last 10 years is chaos engineering. The idea of chaos engineering is that you're injecting failure on purpose to uncover weaknesses. That's really important in distributed systems, with distributed cloud computing and all these different services that you're putting together. The idea is that if you can inject a small failure, you can figure out what happens when you inject it, and then you can go ahead and fix what breaks. One of the things I like to say to people is: focus on your top five critical systems and fix those first. Don't go for low-hanging fruit; fix the biggest problems first and get rid of the biggest amount of pain you have as a company. If you think about the Pareto principle, the 80/20 rule: if you fix 20% of your biggest problems, you actually solve 80% of your issues. That always works. It's something I did while working at National Australia Bank doing chaos engineering, and also at Gremlin and at Dropbox, and I help a lot of our customers do it too. >> Alberto, talk about the mindset involved. It's almost counterintuitive: whoa, risk my biggest systems? I don't want to touch those, they're working fine right now. And then these problems just gestate; they hang around, like the pan in the kitchen about to catch fire: I don't want to touch it, the house is still standing. So this is a new mindset. Is the industry there? It used to be kind of a corner case; you had Netflix, you had Chaos Monkey in those days, and now it's a DevOps practice for a lot of folks, and you guys are involved in that. What's the appetite, what's the progress of chaos engineering in the mainstream? >> Yeah, it's interesting that you mention DevOps. Recently Gartner came out with a new, revisited DevOps framework that has chaos engineering in the middle of the lifecycle of your application. The reality is that systems have become so complex: so many layers of abstraction in the infrastructure, hundreds of services if you're doing microservices, and even if you're not doing microservices, you have so many applications connected to each other, building really complex workflows and automation flows. It's impossible for traditional QA to really understand where the vulnerabilities are in terms of resiliency and quality. Too often the production environment is also too different from the staging environment. So you need a fundamentally different approach to find where your weaknesses are, and to find them before they happen, before you end up in a situation like the one we're in today, unprepared. So much of what we talk about is giving people the tooling and the methodology to go and find these vulnerabilities. It's not so much about creating chaos; it's about managing the chaos that is built into our current systems and exposing those vulnerabilities before they create problems. It's a very scientific methodology, and that's the tooling we bring to market and help customers with.
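To make the "inject failure on purpose, with a bounded blast radius" idea concrete, here is a toy chaos experiment in Python. It is a generic sketch, not Gremlin's actual API: it adds latency to a stand-in dependency for a small fraction of requests and counts how many breach a latency objective.

```python
import random
import time

def dependency(inject_latency_s: float = 0.0) -> str:
    """Stand-in for a downstream service call; added latency is the 'attack'."""
    time.sleep(inject_latency_s)
    return "ok"

def run_experiment(latency_s: float, slo_s: float, blast_radius: float = 0.1) -> None:
    breaches = 0
    for _ in range(100):
        # Bound the blast radius: only ~10% of requests receive the fault.
        fault = latency_s if random.random() < blast_radius else 0.0
        start = time.monotonic()
        dependency(fault)
        if time.monotonic() - start > slo_s:
            breaches += 1
    print(f"{breaches}/100 requests breached the {slo_s}s objective")

# Inject 500ms of latency against a 300ms objective and observe the impact.
run_experiment(latency_s=0.5, slo_s=0.3)
```

The experiment either passes, meaning the system tolerates the fault, or surfaces a weakness to fix, which is the scientific loop Alberto describes: hypothesis, controlled fault, observation.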
>> Tammy, I want to get your thoughts on something. When the surge of Amazon Web Services came out, it was pretty obvious the cloud was amazing; look at the startups that were born. You mentioned Dropbox, where you worked: all these born-in-the-cloud, hyper-scale companies built from scratch, a great way to scale up. And we used to joke about Google: people would say, I'd like a cloud like Google, but no one has Google's use cases. Google really pioneered the SRE concept, and you've got to give them a lot of props for that. But now we're getting to a world that's becoming Google-like: there's more scale now than ever before, it's not a corner case, it's becoming a preferred architecture. What's your assessment of mainstream enterprises? How far along are they, in your mind? How does a company that's transforming its IT develop an SRE practice and get to Google-like scale? Google has an amazing network, large-scale cloud, SREs, and they've been doing it for years. How does an enterprise build that expertise? >> It's a great question, and I get asked this a lot. One of our goals at Gremlin is to help make the Internet more reliable for everybody: everyone using the Internet, and all of the engineers trying to build reliable services. So I'm often asked by companies all over the world: how do we create an SRE practice, and how do we practice chaos engineering? I can speak to how you get started rolling out an SRE program based on my experience, because I've done it. When I worked at Dropbox, I worked with a lot of people who had been at Google and YouTube; they were there when SRE was rolled out across those companies, and they brought those learnings to Dropbox, and I learned from them. But the interesting thing is, if you look at enterprise companies, say large banks: I worked at National Australia Bank for six years, and we did a lot of work there that I would consider chaos engineering and SRE practice. For example, we would do large-scale disaster recovery, where you fail over an entire data center to a secret data center in an unknown location. The reason is that you're checking to make sure everything still operates if there's a nuclear blast; that's actually the scenario, and you have to run that practice every quarter. But if you think about it, it's not very good to only do it once a quarter. You really want to be practicing chaos engineering and injecting failure regularly; I actually prefer to do it three times a week. I do it a lot, but I'm also someone who likes to work out a lot and stay fit, so I know that if you do something regularly, you get great results. That's what I always tell people.
>> Yeah, get the reps in, as we say; you get stronger, you build the muscle memory. Guys, talk about the event that's coming up. You had a physical event scheduled, you were right in planning mode, and then the crisis hit, so you're going digital, virtual, on the internet. How are you thinking about this? It's out there, it's April 21st. Can you share some specifics: who should be attending, and how do they get involved online? >> Yeah, the event really came about a month ago, when we started to see all the cancellations happening across the industry because of COVID-19. We're extremely engaged with the community, we give a lot of talks, and we were seeing a lot of conferences just dropping, with speakers losing their opportunity to share their knowledge about reliability and the topics we focus on. So we quickly pivoted as a company and created a new online event, to give everyone in the community the opportunity to, as the conference name says, fail over to a new event, and to give the speakers who had lost their speaking slots a new opportunity to share their knowledge. It came together really quickly: we shared the idea with a dozen of our partners, everyone liked it, and all of a sudden this thing took off like crazy. In just a month we're approaching four thousand registrations, we have over 30 partners signed up and supporting the initiative, and a lot of press partners covering the event. It was impressive to see the amount of interest we were able to generate in such a short amount of time. Really, this is a conference for anybody who's interested in resilience. If you want to learn from the best how to build business continuity across systems, people, and processes, this is a great opportunity at no cost; it's a free conference. >> And the target persona, the audience you want to attend: is it SREs, folks doing architectural work? Is that the target? >> Yes. Attendees are SREs, developers, and business leaders who care about the quality and reliability of their applications, and who need to create a framework and a mindset for their organization. That speaks to what Tammy was saying a minute ago about having that constant practice, on a daily basis, of finding how to improve things. >> You know, Tammy, we've been going to physical events with theCUBE, extracting the signal from the noise and distributing it digitally, for ten years, and I've got to ask you, because those physical events have now gone away. You talk about chaos and injecting failure; doing these digital events is not easy. It's not just live streaming. It's hard to replicate the value of a physical event, the years of experience, standards, roles and responsibilities, in digital, with different consumption environments and asynchronous audiences. It's its own complex system, and I think a lot of people are experimenting and learning from these events, because it's pretty chaotic. So I'd love to get your thoughts: as a chaos engineer, how should people look at these digital events? How are you looking at it? You want to get the program going, get people out there, get the content, but you have to iterate on this. How do you view it? >> It is really different. I like to compare it to fire drills in SRE. Often what you do there is create a fake incident or a fake issue; you're saying, let's have a fire drill, similar to when you're in a building and the fire alarm goes off, you have wardens and everything, and you all have to go outside. We can do that in this new world we're all in. A lot of people have never run an online event, and now all of a sudden they have to. So what I would say is: do a fire drill. Run a fake event before you do the actual one, to make sure everything works. My other tip is to make sure you have backup plans: backup plans on backup plans on backup plans. In SRE I always have at least three to five backup plans, not just plan A and plan B, but also C, D, and E. I think that's very important. And even when you're considering technology, one of the things we say with chaos engineering is: if you're using one service, inject failure and make sure you can fail over to a different, alternative service in case something goes wrong. >> Hence the Failover Conference, which is the name of the conference. >> Yeah, exactly.
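Tammy's "plans A through E" advice maps directly to a failover chain: try the primary, and on failure walk down an ordered list of alternatives. A small sketch, with placeholder services standing in for real providers:

```python
from typing import Callable, List

def call_with_failover(plans: List[Callable[[], str]]) -> str:
    """Try each plan in order; the first one that succeeds wins."""
    last_error = None
    for plan in plans:
        try:
            return plan()
        except Exception as err:  # in production, catch narrower exception types
            last_error = err      # record the failure, move to the next plan
    raise RuntimeError("all backup plans exhausted") from last_error

# Hypothetical providers: a failing primary and a working backup.
def plan_a() -> str:
    raise ConnectionError("primary streaming provider is down")

def plan_b() -> str:
    return "streaming via backup provider"

print(call_with_failover([plan_a, plan_b]))
```

The chaos-engineering tie-in is the last point Tammy makes: injecting failure into plan A on a schedule is how you prove plan B actually works before you need it.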
>> Well, we certainly are going to be sending a digital reporter there, virtually, and if you need any backup plans, obviously we have the remote interviews here, so if you need any help, let us know. Great to see you guys. Any final thoughts on the conference, and what happens when we get through the other side of this? I'll give you each a final word; Alberto, we'll start with you. >> Yeah, I think when we're on the other side of this, we'll understand even more the importance of effective resilience architecting and testing. As a provider of tools and methodologies for that, we think we'll be able to help customers make a significant leap forward. And the conference is just super exciting; I encourage everyone to participate. It's going to be great: we have a tremendous lineup of speakers with incredible reputations in their fields, so I'm really happy and excited about the work the team has been able to do with our partners to put together this type of event. >> Okay, Tammy? >> I'm actually doing the opening keynote for the conference, and the topic I'm speaking about is that reliability matters more now than ever. I'll be sharing some bizarre, weird incidents that I've worked on myself, some really critical, strange issues that have come up. I'm really looking forward to sharing that with everybody, so please come along. It's free, you can join from your own home, and we can all be there together to support each other. >> You've got great community support, and there are a lot of partners, press, media, ecosystem, and customers there. So congratulations to Gremlin on the Failover Conference, coming April 21st; theCUBE and SiliconANGLE will have a digital reporter there covering the news. Thanks for coming on and sharing; we appreciate the time. I'm John Furrier. We've been here in the Palo Alto studios with a remote interview with Gremlin around their Failover Conference on April 21st. It really demonstrates, in my opinion, the at-scale problems we've been working on in this industry, now more applicable than ever as we get post-pandemic with COVID-19. Thanks for watching; we'll be back.