Tammy Bryant | PagerDuty Summit 2020


 

>> Presenter: From around the globe, it's theCUBE, with digital coverage of PagerDuty Summit 2020. Brought to you by PagerDuty.

>> Welcome to this CUBE Conversation. I'm Lisa Martin, today talking with Tammy Bryant, a CUBE alumna, the principal site reliability engineer at Gremlin and the co-founder and CTO of the Girl Geek Academy. Tammy, it's great to have you on the program again.

>> Hi Lisa, thanks so much for having me again. It's great to be here.

>> So one of the things I saw in your background is 10-plus years of technical expertise in SRE and chaos engineering, and I thought, chaos engineering, I feel like I'm living in chaos right now. What is chaos engineering, and why do you break things on purpose?

>> Yep. So the idea of chaos engineering is that we're breaking systems, but in a thoughtful, controlled way, to identify weaknesses in systems. So that's really what it's all about. The idea there is, you know, when you're doing really complicated work with technical systems, so like, for example, distributed systems, and say, for example, you're working at a bank, it's tough to be able to pinpoint the exact failure mode that could cause a really large outage for your customers. And that's what chaos engineering is all about. You inject the failure proactively to identify the issues, and then you fix them before they actually cause really big problems for customers. And you do it during the middle of the day, you know, when you're feeling great, instead of being paged in the middle of the night for an incident that's actually causing your customers pain and making you lose a lot of money. So that's what chaos engineering really is.

>> In the last six months, since the world is so different, are you seeing an increase in customers? Now with, for example, brick-and-mortars shut down and everything having to convert to digital if it wasn't already, is there an increase in demand for chaos engineering services?

>> Yeah, definitely.
So a lot of people are asking, what is chaos engineering, how can I use it, will it help me reduce my incidents? And definitely, because there are a lot of new services that have been rolled out recently, say, for example, curbside pickup. That's a whole new thing that had to be created really recently to be able to handle a large amount of load. And you know, people show up, they want to get their product really fast, 'cause they want to be able to just get back home quickly. And that's something that we've been working on with our customers, to make sure that curbside pickup experience is really great. The other interesting thing that we've been working on because of the pandemic is making sure that banks are really reliable, and that customers are able to get access to their money when they need it, and are able to see that information too. And you can imagine that when you're in lockdown, and you can only leave your house for maybe an hour a day, you need to be able to quickly get access to your money to buy food. And we've seen some big incidents recently where that hasn't been the case. Yeah.

>> And I can imagine, I mean, just thinking of what happened with everything six months ago, and how people were, we are, just demanding, right? Consumers were demanding. We expect to get whatever we want, whether it's something we buy on Amazon, something that we stream on Netflix, or whatnot. We have this expectation that we can almost get it in real time. But there was, you know, a delay a few months ago, and there still is to some degree. But companies like Amazon and Netflix, I can imagine, really must have a big focus on chaos engineering to test these things regularly. And now they have proved, I would imagine, to some degree, that with chaos engineering they're built to withstand that.

>> Yes, exactly.
So our founders at Gremlin came from Netflix and Amazon. Our CEO had worked at both, where he had done chaos engineering, and that's actually why he decided to create Gremlin. It's the first company in the world to offer chaos engineering as a service. And you know, obviously, when you're working somewhere like Netflix, you know the whole product: you have to be able to get access to that movie, that TV show, right in that moment. And also customers expect to be able to see that on, for example, their PlayStation in their living room, and it should work. And they're paying for a subscription, so to be able to keep them on that subscription, you need to offer a great service. Same thing with Amazon, you know, Amazon.com. They've done a lot of chaos engineering work over many years now to be able to make sure that everything is available. And it's not just that the entire amazon.com is up and running. It's also, for example, that when you go and look at a page, the recommendation service works too, and they're able to show you, hey, here's some other things that you might like to buy at this time. And as a consumer, I love that, 'cause it helps me save time and effort and even money as well, 'cause it's giving you some good advice. So that's the type of thing we do.

>> Exactly. So when you're working with customers, I'd love to understand just a little bit, from the conversational standpoint, is chaos engineering now kind of at the C-level, or is it still sort of within the engineering folks? 'Cause looking at this as a make or break, knowing that, for example, Netflix, there's Hulu, there's Disney Plus, there's Apple TV Plus. If we don't get something that we're looking for right away, there's Prime, we're going to go to another streaming service.
So are you starting to see an increase in demand from companies that know, we have competition right behind us, we've got to be able to set up the infrastructure and ensure that it is reliable, now more than ever?

>> Yeah, exactly. That's really, really important. I'm seeing a lot of executives. I mean, I've seen that since the beginning, really, since I first started working at Gremlin. I would often be invited by executives to come and give talks within their company, to help the teams learn about chaos engineering, and I love doing that. It's really great. So I'd be invited by C-levels or VPs from different departments. And I often get people adding me on LinkedIn from all over the world who are in leadership roles, because really, you know, they're responsible for making sure that their companies can hit those critical metrics and achieve their really demanding business goals, and they're trying to help their teams be able to achieve that too. So I've actually been so pleased to see that as well. It is really cool to have an executive reach out and say, hey, I'm thinking of helping my team, I'd like to get them introduced to you, can you come and just teach them about this topic? And I love being able to do that. It's really positive, and it's the right way to improve.

>> It is, and I think nowadays, with reliability being more important than ever, you know, we talk to leaders from every industry. And there are certain things right now that are going to be shaping the winners and the losers of tomorrow. And it sounds to me like chaos engineering is one of those things that's going to be fundamental for any type of business to not just survive these times, but to thrive going forward.

>> Yes, I definitely think so. I mean, obviously, people can easily just go to a different URL and try and use a different service.
And you know, we're now seeing failure across so many different industries. We didn't see that before. But for example, you know, I'm sure you've seen in the news or heard from friends and family about schools now being completely online. And then kids can't actually access their calls, their resources, what they need to learn every day. So that really just shows you how much it's impacting us as a society. We really know that the internet is critical. It's amazing that we have the internet, like how lucky we are to have this, but it needs to work for us to actually be able to get value out of it. And that's what chaos engineering is all about. You know, we're able to make sure that everything is reliable, so it's up and running. And we do that by looking at things like redundancy. So we'll do failover work, where we completely shut down an application or service and make sure it gracefully fails over. We also do a lot of dependency failure work, where you're actually looking to say, this is the critical path of this service. And a lot of people don't think about this, but the critical path really starts at sign-in. So you need to make sure that login and sign-in work really well. It's not just about the experience once you've signed in; it has to work well all the way through. So actually, if you have a good understanding of user experience, it helps you create a much better pathway and understand those critical pieces that the customer needs to be able to do to have a great experience. And I care a lot about that. Whenever I go and work somewhere, I always read customer tickets, I always try and understand what the customer pain points are. And I love listening to customers and then just solving their problems. The last thing I want them to do is, you know, be complaining or be really annoyed on Twitter because something just isn't working when they need it to be working. And it is really critical these days.
The internet is a really serious part of our day-to-day life.

>> Oh, it's a lifeline. I mean, for some folks, it's the only way that they're connecting with the outside world, through the internet. So when things aren't... I had a friend whose son, first day of college a couple weeks ago, freshman year, first class, couldn't get into Zoom. And that's a stressful situation. But I imagine too, and I know you're going to be speaking at the PagerDuty Summit, that more folks need to understand what this is. And I can tell that you have a real, authentic passion for it. Talk to us about what you're going to be talking about at the PagerDuty Summit.

>> Sure thing. I'm really excited to be speaking at PagerDuty Summit very soon. My talk is called Building and Scaling SRE Teams, so site reliability engineering teams. And this is something that I've done previously. I've built out the SRE teams at Dropbox for both databases as well as storage, so block storage, and then I also led the code workflows team. And that's for, you know, over 500 million users, people accessing the critical data that they store on Dropbox all the time. You know, folks use Dropbox in so many different ways. Maybe it's really famous musicians who are trying to create an amazing new album, or maybe it's a lawyer preparing for a court case, and they need to be able to access their documents. So those are a lot of customer stories that would come up over time. And prior to that, I worked at the National Australia Bank as well, leading teams too. And obviously people care about their money: if they can't access their money, if there are incorrect transactions, if there are missing transactions, you know, duplicate transactions. Maybe people don't mind so much if they get a double deposit, but it's still not good from the bank's perspective. So there's all types of different chaos that can happen.
And I found it to be really interesting to be able to dive into that and make sure that you can make improvements. And I love that it makes customers happier. And also, it helps you improve your company as a whole. So it's a really good thing to be able to do. And with my talk, I'm going to talk to folks about, you know, not only why it's important to build out a reliability practice at your organization. You know, back in the day, people used to go, why would you need a security team? Why would we need that? Now everybody has a security team, everyone has a chief security officer as well. But why don't we focus on reliability? We see incidents out in the news all the time, but for some reason, we don't have the chief reliability officer. I think that's definitely going to be something that will appear in the future, just like the chief security officer role. But that's what I'm going to talk about there. How you can find site reliability engineers, I'll share a few of my secrets. I won't give any spoilers out. But there's actually quite a few places that you can find amazing people. There's even a school that you can hire them from, which I've done in the past. And then I'll talk to you about how you can interview them to make sure that you get the best people on your team. There are a number of things that I think are very important to interview for. And then once you've got those folks on your team, I'll talk to you about how you can make sure that they're successful: how to set them up for success and make sure that they're aligned to not only your business goals, but also your core values as a company, which is really important too.

>> Yeah, that's fantastic. It's very well rounded. I'm curious, what are some of the characteristics that you think are really critical for someone to become a successful SRE?

>> Yeah, so there's a few key things that I look for.
One thing is somebody who is really good at troubleshooting. So they need to be comfortable with complexity, ambiguity, and open-ended challenges and problems, and also thrive in those types of environments, because often you're seeing something that you've never seen happen before. And also you're working with really complicated systems. So you just need to be able to feel good in that moment. And you can test for that during an interview question on troubleshooting and debugging. So that's something that I'll go into in more detail. But that's definitely the first characteristic. The other thing, of course, is you want to have someone who is good at being able to build solutions. So they can code, they understand automation, they can figure out, how can I take this pain point, this problem, and how can I automate it and then scale this out and make it available for everyone across my organization? So someone who has that mindset of building tools for others. And often they are internal tools, because maybe you're building a tool that helps everybody know who's on call for every single critical service at the company, and also every non-critical service, and they can identify that in a minute or less, maybe even just in a few seconds, and then they can quickly get that person involved if anything needs to escalate to them, via, for example, a tool like PagerDuty. That's really what you want. You want them to be able to think, how can I just make this efficient? How can I make sure that we can get really great results? And yeah, I think they also just need to be really personable too, and work well in a really complicated organizational structure, because usually they have to work with the engineering team, and with the finance team to understand the revenue impact.
They need to be able to work with the PR team and the social media team if there are incidents, and then they need to provide information about when this incident is going to be resolved, and how they can update VIP customers. They need to talk to the sales team, because what happens if you're giving a demonstration, and then somehow there's an issue or failure that happens, an incident, and then in the middle of your very important sales demo, you're not able to actually deliver it? That can happen a lot too. So there are a lot of very important key skills.

>> Sounds like it's a really cross-functional role, pivotal to an organization, that needs to understand how these different functions not only operate, but also operate together. Is that somebody that you think has certain types of previous work experience? Is this something that you talk to the Girl Geek Academy girls about, how they get into it? I'm curious what the career path is.

>> Yeah, it's interesting. I find a lot of SREs often come from a few different backgrounds. One is they came through the world of Linux and understanding systems, and just being really interested in that, like deep diving into the kernel, understanding how to improve performance of systems. The other side is maybe they came from a coding background, where they were actually building applications and features. I started off actually on that side, but I also had a passion for Linux. And then I sort of spread over into the other side and was able to learn both. And then often, you know, it's someone who's comfortable with being on call and handling incidents. But it is a lot of skills. That's actually something that I often talk to folks about, and they ask me, how can I become a great SRE? There's so many things I need to learn. And I just say, you know, take it slow, try and gradually increase your number of skills.
People often say that there's some curve for SREs, where you have the operations side on one side and the coding side on the other. And often the best person sits right in the middle, where they have both ops and engineering skills. But it's really hard to find those people. It's okay if you have someone that's really deep, has amazing knowledge of Linux and scaling systems and internet management, and then you can pair them up with a really amazing programmer who's great at software engineering and software architecture. That's okay too.

>> We've been hearing for a long time about this sort of negative unemployment with respect to cybersecurity professionals. Are you guys falling into that same category as well with SRE? Or is it somehow different, or you just know this is exactly what we're looking for? We want to go out there, and even in the Girl Geek Academy, maybe help girls learn how to be able to find what I imagine are a lot of opportunities.

>> Yeah, there are so many opportunities for this. So it's definitely an opportunity, because what I see is there's not enough SREs. So tons of companies all over the world will actually ping me and say, hey, Tammy, how do I hire SREs? That's why I decided to give this talk, because I wanted to package that up and just share that information as to how you can do it. And also, maybe you can't find the SREs because they don't exist. But you can help retrain your team. So you can have an engineer learn the skills that are required to be an SRE, that's totally possible too, and maybe move them over to become an SRE. With Girl Geek Academy, one of the things that I've done is run hackathons and workshops and just online training sessions to help girls learn these new skills. So that's exactly what our mission is: to teach 1 million girls technical skills by 2025.
And I love to do mentoring at scale, which is why it's been really cool to be able to do it online and through these workshops and remote hackathons. And I definitely love to do something where I also work with some of our customers, actually, and run an event. I did one a while back. It was really cool. We were able to have all of the girls come in and be at the customer's office and actually learn skills with the customer, which was really fun. And it helps them actually think, hey, I could work here one day, that would be really amazing. And I'm going to do that again in November. And it's kind of fun too. We can do things like have, you know, a dad and mom and daughter day, where you actually bring your daughter to work and help her learn technical skills. That's really fun, because they get to see what you do, and they understand it more and see how cool chaos engineering really is. Then they think, oh, wow, you're so awesome, this is great.

>> I love it, that's fantastic. Well, it sounds like, like I said before, your passion for it is really there. What I think is really interesting is how you're talking about chaos engineering, and just the word in and of itself, chaos. But you painted it in such a positive light: critical, business critical, but also all the opportunities there that businesses have to learn and fine-tune. So, such an interesting conversation. Yeah, Tammy, we'll have you back on the program. But I thank you so much for joining me today. And for those folks that are lucky enough to be attending the PagerDuty Summit, they're going to get to learn a lot from you. Thank you.

>> Thanks so much for having me, Lisa.

>> For Tammy Bryant, I'm Lisa Martin. You're watching this CUBE Conversation. (upbeat music)

Published Date : Sep 10 2020

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Tammy Bryant | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Tammy | PERSON | 0.99+
Lisa | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Tammy Bryant | PERSON | 0.99+
Netflix | ORGANIZATION | 0.99+
Amazon.com | ORGANIZATION | 0.99+
National Australia Bank | ORGANIZATION | 0.99+
November | DATE | 0.99+
Dropbox | ORGANIZATION | 0.99+
10 plus years | QUANTITY | 0.99+
Hulu | ORGANIZATION | 0.99+
2025 | DATE | 0.99+
Gremlin | ORGANIZATION | 0.99+
Girl Geek Academy | ORGANIZATION | 0.99+
Brick and Mortars | ORGANIZATION | 0.99+
amazon.com | ORGANIZATION | 0.99+
today | DATE | 0.99+
both | QUANTITY | 0.99+
LinkedIn | ORGANIZATION | 0.99+
PlayStation | COMMERCIAL_ITEM | 0.99+
Pager Duty Summit | EVENT | 0.98+
Linux | TITLE | 0.98+
six months ago | DATE | 0.98+
One thing | QUANTITY | 0.98+
Apple TV | COMMERCIAL_ITEM | 0.98+
over 500 million users | QUANTITY | 0.98+
Tommy | PERSON | 0.98+
Twitter | ORGANIZATION | 0.98+
Girl, Greek Academy | ORGANIZATION | 0.98+
tomorrow | DATE | 0.97+
first day | QUANTITY | 0.97+
pager duty summit | EVENT | 0.97+
an hour a day | QUANTITY | 0.96+
1 million girls | QUANTITY | 0.96+
couple weeks ago | DATE | 0.96+
one | QUANTITY | 0.96+
one day | QUANTITY | 0.95+
first | QUANTITY | 0.95+
both databases | QUANTITY | 0.93+
pandemic | EVENT | 0.93+
first company | QUANTITY | 0.93+
few months ago | DATE | 0.92+
One | QUANTITY | 0.91+
first class | QUANTITY | 0.9+
last six months | DATE | 0.9+
prime | COMMERCIAL_ITEM | 0.89+
first characteristic | QUANTITY | 0.88+
single | QUANTITY | 0.86+
pager duty summit 2020 | EVENT | 0.84+
double deposit | QUANTITY | 0.83+
PagerDuty Summit 2020 | EVENT | 0.82+
Disney | ORGANIZATION | 0.74+
one side | QUANTITY | 0.73+
lot of money | QUANTITY | 0.71+
SRE | ORGANIZATION | 0.66+
girl | ORGANIZATION | 0.58+
tons | QUANTITY | 0.56+
people | QUANTITY | 0.54+
Academy | ORGANIZATION | 0.52+
few seconds | QUANTITY | 0.49+
SRE | TITLE | 0.44+
Plus | COMMERCIAL_ITEM | 0.32+