Why Use IaaS When You Can Make Bare Metal Cloud-Native?

>>Hi, Oleg. So great of you to join us today. I'm really looking forward to our session. So let's get started. If I can get you to give a quick intro to yourself, and then if you can share with us what you're going to be discussing today. >>Hi, Jake. My name is Oleg Gelbukh. I'm a product architect on the Docker Enterprise Container Cloud team. Today I'm going to talk about running Kubernetes on bare metal with Container Cloud. My goal is to tell you about this exciting feature, why we think it's important, and what we actually did to make it possible. >>Brilliant. Thank you very much. So let's get started. From my understanding, Kubernetes clusters are typically run in virtual machines in clouds. So, for example, public cloud such as AWS, or private cloud, maybe OpenStack-based or VMware vSphere. So why would you go off and run it on bare metal? >>Well, Docker Enterprise Container Cloud can already run Kubernetes in the cloud, as you know, and the idea behind Container Cloud is to enable us to manage multiple Docker Enterprise clusters. But we want to bring innovation to Kubernetes. And instead of spending a lot of resources on the hypervisor and virtual machines, we just go all in for Kubernetes directly on bare metal. >>Fantastic. So it sounds like you're suggesting then to run Kubernetes directly on bare metal. >>That's correct. >>Fantastic, and without a hypervisor layer. >>Yes. We all know the reasons to run Kubernetes in virtual machines; in the first place, it's mutual isolation of workloads. But virtualization comes with a performance hit and additional complexity. When you run Kubernetes directly on the hardware, it's a perfect opportunity for developers. They can see a performance boost of up to 30% for certain container workloads. 
This is because the virtualization layer adds a lot of overhead, and even with enhanced placement awareness technologies like NUMA or processor pinning, there's still overhead. By skipping the virtualization, we just remove this overhead and gain this boost. >>Excellent. So it sounds like a 30% performance boost is very appealing. Are there any other value points or positive points that you can pull out? >>Yes. Besides the hypervisor overhead, virtual machines also have some static resource footprint. They take up memory and CPU cycles, and overall that reduces the density of containers per host. Without virtual machines, you can run up to 60% more containers on the same host. >>Excellent. Really great numbers there. >>One more thing to point out: using bare metal directly makes it easier to use special-purpose hardware like graphics processors, or virtual functions for network interfaces, or field-programmable gate arrays for custom circuits, and you can share them between containers more efficiently. >>Excellent. I mean, there are some really great value points you pulled out there. So a 30% performance boost, a 60% density boost, and it can support specialized hardware a lot more easily. But let's talk now about the applications. What sort of applications do you think would benefit from this the most? >>Well, I'm thinking primarily high-performance computing and deep learning will benefit, which is more common than you might think now that artificial intelligence is creeping into a lot of different applications. It really depends on memory capacity and performance, and these workloads also use special devices like FPGAs for custom circuits widely. So all of it is applicable to machine learning, really. >>And I mean, that whole AI piece is really exciting. And we're seeing this become more commonplace across a whole host of sectors. 
So your telcos, pharma, banking, etcetera, and not just IT today. >>Yeah, that's indeed very exciting. But creating Kubernetes clusters on bare metal, unfortunately, is not very easy. >>So it sounds like there may be some challenges or complexities around it, and this, I guess, is the reason why there are not many products out there today for Kubernetes on bare metal. Could you talk to us then about some of the challenges that this might entail? >>Well, there are quite a few challenges. First and foremost, there is no one way to manage bare metal infrastructure nowadays. Many vendors have their own solutions that are not always compatible with each other and do not necessarily cover all aspects of this. So we've worked with an open source project called Metal3 (Metal Kubed) and integrated it into Docker Enterprise Container Cloud to do this unified bare metal management for us. >>And you mentioned it, did I hear you say that it's open source? >>Yes, the project is open source. We added a lot of our special sauce to it. What it does, basically, is enable us to manage hardware servers just like cloud server instances. >>I mean, that's very interesting, but could you go into a bit more detail, and specifically, what do you mean by "as cloud instances"? >>Of course. Generally, it means managing them through some sort of API, or programming interface. This interface has to cover all aspects of the server life cycle, like hardware configuration, operating system management, network configuration, and storage configuration. With the help of Metal3, we extend the Kubernetes API to enable it to manage bare metal hosts and all these aspects of their life cycle. The Metal3 project uses OpenStack Ironic and wraps it in the Kubernetes API, and Ironic does all the heavy lifting of provisioning. It does it in a very cloud native way. 
It configures servers using cloud-init, which is very familiar to anyone who deals with the cloud, and the power is managed transparently through the IPMI protocol. It does a lot to hide the differences between different hardware hosts from the user, and in Docker Enterprise Container Cloud we made everything so that the user doesn't really feel the difference between a bare metal server and a cloud VM. >>So, Oleg, are you saying that you can actually take a machine that's turned off and turn it on using commands? >>That's correct. That's IPMI, or Intelligent Platform Management Interface. It gives you the ability to interact directly with the hardware. You can manage or monitor things like power consumption, temperature, voltage and so on. But what we use it for is to manage the boot source and the actual power state of the server. So we have a pool of servers that are available, and we can turn them on when we need them, just as if we were spinning up a VM. >>Excellent. So that's how you get around the fact that while all cloud VMs are the same, the hardware is all different. But I would assume you would have different server configurations in one environment. So how would you get around that? >>Yeah, that's an excellent question. Some elements of the bare metal management API that we developed are specifically there to enable operators to handle a wider range of hardware configurations. For example, we make it possible to configure multiple network interfaces on the host. We support flexible partitioning of hard disks and other storage devices. We also make it possible to boot remotely using the Unified Extensible Firmware Interface (UEFI) for modern systems, or just good old BIOS for the legacy ones. >>Excellent. So yeah, thanks for sharing that. Now let's take a look at the rest of the infrastructure. 
So what about things like networking and storage? How is that managed? >>Oh, Jake, those are some important details. From the networking standpoint, the most important thing for Kubernetes is load balancing. We use some proven open source technologies such as NGINX and MetalLB to handle that for us. And for the storage, that's a bit more tricky. There are a lot of different storage solutions out there, so we decided to go with Ceph and the Rook operator for Ceph. Ceph is a very mature and stable distributed storage system. It has incredible scalability. We actually run pretty big clusters in production with Ceph, and Rook makes the life cycle management for Ceph very robust and cloud native, with health checking and self-correction, that kind of stuff. So in any Kubernetes cluster that Docker Enterprise Container Cloud provisions on bare metal, you can potentially have a Ceph cluster installed, providing storage that is accessible from any node in the cluster to any pod in the cluster. So that's cloud native storage. >>Wonderful. But would that then mean that you'd have to have additional hardware, so more hardware for the storage cluster? >>Not at all. Actually, we use a converged storage architecture in Docker Enterprise Container Cloud: the workloads and Ceph share the same machines and are actually managed by the same Kubernetes cluster. At some point in the future, we plan to add even more flexibility to this Ceph configuration and enable shared Ceph, where all Kubernetes clusters will use a single Ceph back end, and that's another way for us to optimize our overhead, basically. >>Excellent. So thanks for covering the infrastructure part. What would be good is if we can get an understanding of the look and feel for the operators and the users of the system. So what can they see? >>Yeah, Jake. 
As we know, Docker Enterprise Container Cloud provides a web-based user interface that enables you to manage clusters. And the bare metal management is actually integrated into this interface and provides a very smooth user experience. As an operator, you need to add, or enroll, bare metal hosts pretty much the same way you add cloud credentials for any other provider or platform. >>Excellent. I mean, Oleg, it sounds really interesting. Would you be able to share some kind of demo with us? It'd be great to see this in action. >>Of course. Let's see what we have here. >>Thank you. >>So, first of all, you take a bunch of bare metal servers and you prepare them, connect them to the network as described in the docs, and bootstrap Container Cloud on top of three of these bare metal servers. Once you're through, you have Container Cloud up and running. You log into the UI. Let's start here. I'm using the generic operator user for now. It's possible to integrate it with the customer's identity system and get real users there. So first of all, let's create a project. It will hold all of our clusters. Once we've created it, we just switch to it. And the first step for an operator is to add some bare metal hosts to the project. As you see, it's empty. To add a bare metal host, you just need a few parameters. First, a name that will allow you to identify the server later. Then a user name and password to access the IPMI controls of the server. Next, and it's very important, the hardware address of the first Ethernet port. It will be used to remotely boot the server over the network. Then the IP address of the IPMI endpoint. And last but not least, the bucket to assign the bare metal host to. It's a label that is assigned to it. 
Right now we offer just three default labels, or buckets: manager hosts, worker hosts and storage hosts. Depending on the hardware configuration of the server, you assign it to one of these three groups. You will see how it's used later in the form. Note that at least six servers are required to deploy a managed Kubernetes cluster, just as for the cloud providers. There is some information available now about the servers, as the result of inspection. By the way, you can look it up. Now we move on to create a cluster. You need to provide the name for the cluster and select the release of Docker Enterprise Engine. The next step is for provider-specific information. You need to specify the address of the cluster API endpoint here, and the range of addresses for services that will be installed in the cluster for the user workloads. Kubernetes network parameters can be changed as well, but the defaults are usually okay. Now you can enable or disable StackLight, the monitoring system for the Kubernetes cluster, and provide some custom parameters to it. Finally, you click Create to create the cluster. It's an empty cluster that we need to add some machines to. We need at least three manager nodes. The form is very simple. You just select the role of the Kubernetes node, either manager or worker, and you need to select the label, or bucket, from which the bare metal host will be picked. We go with the manager label for manager nodes and the worker label for the workers. While the cluster is deploying, let's check out some machine information. The storage data here, the names of the disks, are taken from the bare metal host hardware inspection data that we checked before. Now we wait for the servers to be deployed. That includes the operating system and Kubernetes itself. So yeah, that's our user interface. 
If operators need to, they can actually use the Docker Enterprise Container Cloud API for more sophisticated configurations, or to integrate with an external system, for example a configuration database. All the bare metal tasks can be executed through the Kubernetes API by changing the custom resources describing the bare metal nodes and objects. >>Brilliant. Well, thank you for bringing that to life. It's always good to see it in action. I guess from my understanding, it looks like the operators can use the same tools as DevOps or developers, but for managing their infrastructure. >>Yes, exactly. For example, if you're DevOps and you use Lens to monitor and manage your cluster, the bare metal resources are just another set of custom resources for you. It is possible to visualize and configure them through Lens or any other developer tool for Kubernetes. >>Excellent. So from what I can see, that really could bridge the gap between infrastructure operators and DevOps and developer teams, which is a big thing. >>Yes, that's one of our aspirations: to unify the user experience, because we've seen a lot of situations where the infrastructure is operated by one set of tools, and the container platform is agnostic of it and offers end users a completely different set of tools. So as a DevOps engineer, you have to be proficient in both, and that's not very sustainable for some DevOps teams. >>Sure. Okay, well, thanks for covering that. That's great. I mean, there are obviously other container platforms out there in the market today. It would be great if you could explain some of the differences there in how Docker Enterprise Container Cloud approaches bare metal. >>Yeah, that's an excellent question, Jake. Thank you. 
In Container Cloud, unlike in other container platforms, bare metal management is tightly integrated into the product. It's integrated at the UI and API level and at the back-end implementation level. Other platforms typically rely on the user to provision the bare metal hosts before they can deploy Kubernetes on them. This leaves operating system management, hardware configuration and hardware management mostly with a dedicated infrastructure operators team. Docker Enterprise Container Cloud might help to reduce this burden and these infrastructure management costs by automating it and effectively removing part of the responsibility from the infrastructure operators. And that's because Container Cloud on bare metal is essentially a full-stack solution. It includes hardware configuration and covers operating system life cycle management, especially security updates, or CVE updates. Right now, the only out-of-the-box operating system that we support is Ubuntu. We're looking to expand this, and, as you know, Docker Enterprise Engine makes it possible to run Kubernetes on many different platforms, including even Windows. We plan to leverage this flexibility in Docker Enterprise Container Cloud to the full extent to expand the range of operating systems that we support. >>Excellent. Well, Oleg, we're running out of time, unfortunately. I mean, I've thoroughly enjoyed our conversation today. You've pulled out some excellent points. You talked about potentially up to a 30% performance boost and up to a 60% density boost. You've also talked about how it can help with specialized hardware and make this a lot easier. We also talked about some of the challenges that you could solve by using Docker Enterprise Container Cloud, such as persistent storage and load balancing. 
There's obviously a lot here, but thank you so much for joining us today. It's been fantastic. And I hope that we've given some food for thought to go out and try to deploy Kubernetes on bare metal. So thanks, Oleg. >>Thank you for coming. Bye, Jake.
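
The enrollment parameters listed in the demo (host name, IPMI user name and password, boot MAC address, IPMI address, bucket label) resemble the fields of a Metal3 BareMetalHost custom resource. The following is a minimal Python sketch, assuming the upstream Metal3 schema rather than Container Cloud's own API, which the talk does not show. The helper name `build_enrollment` and the label key `example.com/bucket` are hypothetical, chosen only for illustration.

```python
import base64
import json


def build_enrollment(name, bmc_user, bmc_password, boot_mac, bmc_ip, bucket):
    """Return (secret, host) manifests for enrolling one bare metal server.

    Hypothetical helper: it only builds dictionaries and talks to no cluster.
    """
    secret_name = f"{name}-bmc-credentials"
    secret = {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": secret_name},
        "type": "Opaque",
        # Values under "data" in a Kubernetes Secret are base64-encoded.
        "data": {
            "username": base64.b64encode(bmc_user.encode()).decode(),
            "password": base64.b64encode(bmc_password.encode()).decode(),
        },
    }
    host = {
        "apiVersion": "metal3.io/v1alpha1",
        "kind": "BareMetalHost",
        "metadata": {
            "name": name,
            # Assumed label key; the talk only says a bucket label is assigned.
            "labels": {"example.com/bucket": bucket},
        },
        "spec": {
            "online": True,              # desired power state
            "bootMACAddress": boot_mac,  # NIC used for network boot
            "bmc": {
                "address": f"ipmi://{bmc_ip}",
                "credentialsName": secret_name,
            },
        },
    }
    return secret, host


if __name__ == "__main__":
    secret, host = build_enrollment(
        "worker-0", "admin", "s3cret", "52:54:00:aa:bb:cc", "192.0.2.10", "worker"
    )
    print(json.dumps(host, indent=2))
```

Keeping the credentials in a separate Secret and referencing it by name mirrors how the host resource stays safe to display in tools such as Lens.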

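The interview describes using IPMI to set the boot source and power state of a server. In the product this is handled by Ironic, but the same operations can be illustrated with the `ipmitool` command line. The sketch below only builds the argument lists and does not execute anything; the function name `ipmi_command` and the action names are hypothetical, and the host address is a documentation placeholder.

```python
def ipmi_command(bmc_ip, user, action):
    """Build an ipmitool argument list for a power or boot-device action.

    Hypothetical helper for illustration; it does not run the command.
    """
    actions = {
        "power_on": ["chassis", "power", "on"],
        "power_off": ["chassis", "power", "off"],
        "power_status": ["chassis", "power", "status"],
        "boot_pxe": ["chassis", "bootdev", "pxe"],
        "boot_disk": ["chassis", "bootdev", "disk"],
    }
    if action not in actions:
        raise ValueError(f"unsupported action: {action}")
    # -I lanplus selects IPMI v2.0 over LAN; -E reads the password from the
    # IPMI_PASSWORD environment variable instead of the command line.
    base = ["ipmitool", "-I", "lanplus", "-H", bmc_ip, "-U", user, "-E"]
    return base + actions[action]


if __name__ == "__main__":
    # Network-boot a powered-off server: set the boot device, then power on.
    for step in ("boot_pxe", "power_on"):
        print(" ".join(ipmi_command("192.0.2.10", "admin", step)))
```

Passing the resulting list to `subprocess.run` with `IPMI_PASSWORD` set in the environment would issue the command against a real management controller.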
Published Date : Sep 14 2020



Tammy Butow & Alberto Farronato, Gremlin | CUBE Conversation, April 2020

From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation. >>Hello everyone, welcome to this CUBE Conversation here in our Palo Alto studios of theCUBE. I'm John Furrier, your host. We're here during the COVID-19 crisis doing remote interviews. I come into the studio, we've got a quarantine crew here getting the interviews, getting the stories out there. And of course the story we continue to talk about is the impact of COVID-19 and how we're all getting back to work, either working at home or working remotely and virtually, certainly. But as things start to change, we can start to see events, mostly digital events, and we're here to talk about an event that's coming up called the Failover Conference from Gremlin, which has now gone digital. It's April 21st. But I think what's important about this conversation that I want to get into is not only to talk about the event that's coming up, but to talk about the scale problems that are being highlighted by this change in work environment, working at home. We've been talking about the at-scale problems that we're seeing, whether it's a flood or surge of traffic, and the chaos that's ensuing across the world with this pandemic. So I'm excited to have two great guests: Alberto Farronato, senior vice president of marketing at Gremlin, and Tammy Butow, principal site reliability engineer, or SRE. Guys, thanks for coming on. Appreciate it. >>Thank you. >>Thank you. >>Alberto, I want to get to you first. You know, we've known each other before, you've been in this industry, we've all been talking about cloud native and cloud scale for some time. It's kind of inside the ropes, it's inside baseball. Tammy, you're a site reliability engineer; everyone knows Google, knows how well cloud works. This is large-scale stuff. Now with COVID-19 we're starting to see the average person, my brother, my sister, our family members and people around the world go, oh my god, this is really a high impact, this
change of behavior, the surge of, you know, whether it's traffic on the internet or work-at-home tools that are inadequate. You start to see these statistical things that were planned for not working well, and this actually maps to the things that we've been talking about in our industry. Alberto, you've been on this. How are you guys doing, and what's your take on this situation we're in right now? >>Yeah, we're doing pretty well as a company. We were born as a distributed organization to begin with, so for us working in a distributed environment from all over the world is common practice day-to-day. Personally, you know, I'm originally from Italy. My parents, my family are in Milan and Bergamo, of all places, so I have to follow the news with extra care. And to me it becomes so much clearer nowadays that technology is not just a powerful tool to enable our businesses, but it is also so critical for our day-to-day life. Thanks to, you know, video calls, I can easily talk to my family back there every day, so that's really important. So yes, we've been talking for a long time, as you mentioned, about complex systems at scale and reliability, often in the context of mission-critical applications. But more and more, these systems need to be reliable also when it comes to back office systems that enable people to continue to work on a daily basis. >>Yeah, well, our hearts go out to your family and your friends in Italy, and I hope everyone stays safe there. I know that was a tough situation and continues to be a challenge. Tammy, I want to get your thoughts. How is life going for you? You're a site reliability engineer. What you deal with on the tech side is now happening in the real world. It's almost mind-blowing to me that we're seeing these things happen. It's a paradigm that needs attention. How do you look at it as an SRE, dealing with it mostly from a tech side, now seeing it play out in real life? >>It's such an interesting situation, really terrible. So one of the
things that I specialize in as a site reliability engineer is incident management. So for example, I previously worked at Dropbox, where I was, you know, the incident manager on call for 500 million customers. It's 24/7, and in these large-scale incidents you really need to be able to act fast. There are two very important metrics that we track and care about as site reliability engineers. The first one is mean time to detection: how fast can you detect that something is happening? Obviously, if you detect an issue faster, you've got a better chance of making the impact lower, so you can contain the blast radius. I like to explain it to people like this: if you have a fire in the bin in your kitchen and you put it out, that's way better than waiting until your entire house is on fire. And the other metric is mean time to resolution: how long does it take you to recover from the situation? So yeah, this is a large-scale global incident that we're in right now. >>Yeah, I know you guys do a lot of talk about chaos theory, and there's a lot of math involved, we all know that. But I think when you look at the real world, this is going to be table stakes. There's now a line in the sand here, pre-pandemic and post-pandemic, and I think you guys have an interesting company in Gremlin, in the sense that this is a complex system. If you think about the world we're going to be living in, whether it's digital events, and you guys have one coming up, or how to work at home, or tools that humans are going to be using, it's going to be working with systems, right? So you have this new paradigm that's going to be upon us pretty quickly, and it's not just buying software mechanisms or software. It's a complex system, it's distributed computing and operating. So this is kind of the world. Can you guys talk about the Gremlin situation, how you guys are attacking these new problems and these new opportunities that are emerging? >>One of the things that I've always
specialized in over the last 10 years is chaos engineering. The idea of chaos engineering is that you're injecting failure on purpose to uncover weaknesses. That's really important in distributed systems, with distributed cloud computing and all these different services that you're putting together. The idea is that if you can inject failure, you can figure out what happens when you inject that small failure, and then you can go ahead and fix it. One of the things I like to say to people is, focus on what your top five critical systems are. Let's fix those first. Don't go for low-hanging fruit; fix the biggest problems first, get rid of the biggest amount of pain that you have as a company. If you think about the Pareto principle, the 80/20 rule, if you fix 20% of your biggest problems, you actually solve 80% of your issues. That always works. It's something that I've done while working at National Australia Bank doing chaos engineering, and also at Dropbox and at Gremlin, and I help a lot of our customers do that too. >>Alberto, talk about the mindset involved. It's almost counterintuitive: whoa, risk the biggest systems? I don't want to touch those, they're working fine right now. And then these problems just gestate, they kind of hang around, to the bin in the kitchen fire, you know. Okay, I don't want to touch it, the house is still working. So this is kind of a new mindset. Could you talk about what your take is on that? Is the industry there? I mean, it was kind of a corner case, you know, you had Netflix, you had Chaos Monkey in those days, and now it's a DevOps practice for a lot of folks. You guys are involved in that. What's the appetite, what's the progress of chaos engineering in the mainstream? >>Yeah, it's interesting that you mention DevOps. You know, recently Gartner came up with a new, revisited DevOps framework that has chaos engineering in the middle of the life cycle of your application,
and the reality is that systems have become so complex, with so many layers of abstraction in the infrastructure. You have hundreds of services if you're doing microservices, but even if you're not doing microservices, you have so many applications connected to each other, building really complex workflows and automation flows. It's impossible for traditional QA to really understand where the vulnerabilities are in terms of resiliency, in terms of quality. Too often the production environment is also too different from the staging environment, and so you need a fundamentally different approach to go and find where your weaknesses are, and find them before they happen, before you end up finding yourself in a situation like the one we're in today and you're not prepared. And so much of what we talk about is giving the tools and the methodology for people to go and find these vulnerabilities. It's not so much about creating chaos; it's about managing the chaos that is built into our current systems and exposing those vulnerabilities before they create problems. That's a very scientific methodology and tooling that we bring to market and help customers with. >>Tammy, I want to get your thoughts. You know, we used to riff a lot, it's our 10th year at theCUBE, we've had a lot of conversations over the years. But, you know, when the surge of Amazon Web Services came out, it was pretty obvious the cloud's amazing, and look at the startups that were born. You mentioned Dropbox, you worked there. All these companies born in the cloud, these hyperscale companies built from scratch, a great way to scale up. And we used to joke about Google. People would say, I would like a cloud like Google, but no one has Google's use cases. And Google really pioneered the SRE concept, and you've got to give them a lot of props for that. But now we're getting to a world where it's becoming Google-like. There's more scale now than ever before. It's not a corner case, it's becoming more popular and more of a
preferred architecture, this large scale. What's your assessment of the mainstream enterprises? How far along are they, in your mind? Are they there with chaos engineering, are they close, how are they doing? How does someone develop an SRE practice to get to Google-like scale? Because Google has an amazing network, they've got a large-scale cloud, they have SREs, they've been doing it for years. How does a company that's transforming its IT get that expertise? >>It's a great question. I get asked this a lot as well. One of our goals at Gremlin is to help make the Internet more reliable for everybody, everyone using the Internet and all of the engineers who are trying to build reliable services. And so I'm often asked by companies all over the world, how do we create an SRE practice, and how do we practice chaos engineering? How you can get started rolling out your SRE program, that's based on my experiences, I've done it. When I worked at Dropbox, I worked with a lot of people who had been at Google and at YouTube. They were there when SRE was rolled out across those companies, and then they brought those learnings to Dropbox, and I learned from them. But the interesting thing is, if you look at enterprise companies, so large banks, say, for example, I worked at National Australia Bank for six years, and we actually did a lot of work that I would consider chaos engineering and SRE practices. For example, we would do large-scale disaster recovery, and that's where you fail over an entire data center to a secret data center in an unknown location. The reason is that you're checking to make sure that everything operates okay if there's a nuclear blast. That's actually what you have to do, and you have to do that practice every quarter. But if you think about it, it's not very good to only do it once a quarter. You really want to be practicing chaos engineering and injecting failure on a regular basis. I prefer to do it three times a week. I
do it a lot, but I'm also someone who likes to work out a lot and be fit all the time, so I know that if you do something regularly, you get great results. So that's what I always tell people.
>>Yeah, get the reps in, as we say. Get stronger, build the muscle memory. Guys, talk about the event that's coming up. You had an event that was scheduled as a physical event, you were right in planning mode, and then the crisis hit. You're going digital, going virtual. It's really digital, but it's digital that's on the internet. So how are you guys thinking about this? I know it's out there; it's April 21st. Can you share some specifics around the event, who should be attending, and how they get involved online?
>>Yeah. The event really came together about a month ago, when we started to see all the cancellations happening across the industry because of COVID-19. We are extremely engaged in the community, and we have a lot of talks, and we were seeing a lot of conferences just dropping, and so speakers were losing their opportunity to share their knowledge on how you do reliability and the topics that we focus on. So we quickly pivoted as a company and created a new online event to give everyone in the community the opportunity to, you know, fail over to a new event, as the conference name says, and to give those speakers who have lost their speaking slots a new opportunity to go share their knowledge. That came together really quickly. We shared the idea with a dozen of our partners and everyone liked it, and all of a sudden this thing took off like crazy. In just a month, we are approaching four thousand registrations, and we have over 30 partners signed up and supporting the initiative, with a lot of press partners covering the event as well. So it was impressive to see the amount of interest that we were able to generate in such a short amount of time. Really, this is a conference for anybody who is interested
in resilience. If you want to learn from the best how to build business continuity across systems, people, and processes, this is a great opportunity, at no cost. It's a free conference.
>>And the target persona, the audience you want to have attend, is what? SREs, or folks doing architectural work? What's the target?
>>Yes, the target attendees are architects, SREs, developers, and business leaders who care about the quality and reliability of their applications, who need to help create a framework and a mindset for their organization. That speaks to what Tammy was saying a minute ago: having that constant practice, on a daily basis, of going and finding how to improve things.
>>You know, Tammy, we've been going to physical events with theCUBE, extracting the signal from the noise and distributing it digitally, for ten years. And I've got to ask you, now that those events have gone away: you talk about chaos and injecting failure. Doing these digital events is not as easy as just live streaming. It's hard to replicate the value of a physical event, with its years of experience, standards, roles, and responsibilities. Digital is a different consumption environment. It's asynchronous, and you're trying to create a synchronous environment. It's its own complex system. So I think a lot of people are experimenting and learning from these events, because it's pretty chaotic. I'd love to get your thoughts on how you look at these digital events as a chaos engineer. How should people be looking at these events? How are you looking at it? You also want to get the program going, get people out there, and get the content, but you have to iterate on this. How do you view this?
>>It is really different. I actually like to compare it to fire drills in SRE. Often what you do there is create a fake incident or a fake issue. You're saying, let's have a fire drill, similar to when you're in a building and you have a fire drill that goes
off: you have wardens and everything, and you all have to go outside. We can do that in this new world that we're all in. All of a sudden, a lot of people who have never run an online event now have to. So what I would say is, do a fire drill. Run a fake one before you do the actual one, to make sure that everything works okay. My other tip is to make sure that you have backup plans: backup plans on backup plans on backup plans. As an SRE, I always have at least three to five backup plans. I'm not just saying Plan A and Plan B; there's also a C, D, and E. I think that's very important. Even when you're considering technology, one of the things we say with chaos engineering is, if you're using one service, inject failure and make sure that you can fail over to a different, alternative service in case something goes wrong.
>>Yeah, hence the Failover Conference, which is the name of the conference. Well, we certainly are going to be sending a digital reporter there virtually. If you need any backup plans, obviously we have the remote interviews here; if you need any help, let us know. Really appreciate it. Great to see you guys, and thanks for sharing. Any final thoughts on the conference? What happens when we get through the other side of this? I'll give you guys the final word. We'll start with you first, Alberto.
>>Yeah, I think when we are on the other side of this, we will understand even more the importance of effective resilience architecting and testing. As a provider of tools and methodologies for that, we think we will be able to help customers take a significant leap forward on that side. And the conference is just super exciting. I think it's going to be great, and I encourage everyone to participate. We have a tremendous lineup of speakers who have incredible reputations in their fields, so I'm really happy and excited about the work that the team has
been able to do with our partners to put together this type of event.
>>Okay, Tammy?
>>Yes. I'm actually going to be doing the opening keynote for the conference, and the topic that I'm speaking about is that reliability matters more now than ever. I'll be sharing some bizarre, weird incidents that I've worked on myself, really critical, strange issues that have come up. I'm really looking forward to sharing that with everybody else. So please come along. It's free, you can join from your own home, and we can all be there together to support each other.
>>You've got great community support, and there are a lot of partners, press, media, an ecosystem, and customers. So congratulations. Gremlin is having a conference on April 21st called the Failover Conference. theCUBE and SiliconANGLE will have a digital reporter there covering the news. Thanks for coming on and sharing; I appreciate the time. I'm Jeff. We're here in the Palo Alto studio with a remote interview with Gremlin around their Failover Conference on April 21st. It's really demonstrating, in my opinion, that the at-scale problems we've been working on in the industry are now more applicable than ever before as we get post-pandemic with COVID-19. Thanks for watching. We'll be back. [Music]

Published Date : Apr 7 2020

ENTITIES

Entity	Category	Confidence
Tammy	PERSON	0.99+
April 21st	DATE	0.99+
Milan	LOCATION	0.99+
20%	QUANTITY	0.99+
April 2020	DATE	0.99+
Palo Alto	LOCATION	0.99+
Tammy Bhutto	PERSON	0.99+
six years	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Italy	LOCATION	0.99+
Alberto Farronato	PERSON	0.99+
ten years	QUANTITY	0.99+
Jeff	PERSON	0.99+
Alberto	PERSON	0.99+
National Australia Bank	ORGANIZATION	0.99+
Boston	LOCATION	0.99+
Tammy Butow	PERSON	0.99+
Amazon Web Services	ORGANIZATION	0.99+
two very important metrics	QUANTITY	0.99+
nineteen	QUANTITY	0.99+
Bergen	LOCATION	0.99+
over 30 partners	QUANTITY	0.99+
Dropbox	ORGANIZATION	0.99+
Gartner	ORGANIZATION	0.98+
Tami	PERSON	0.98+
10th	QUANTITY	0.98+
a month	QUANTITY	0.98+
hundreds of services	QUANTITY	0.98+
one	QUANTITY	0.97+
four thousand registrations	QUANTITY	0.97+
three times a week	QUANTITY	0.97+
YouTube	ORGANIZATION	0.97+
first one	QUANTITY	0.97+
gremlin	PERSON	0.96+
Alberto Ferran	PERSON	0.96+
first	QUANTITY	0.96+
Netflix	ORGANIZATION	0.95+
today	DATE	0.94+
once a quarter	QUANTITY	0.93+
ten	QUANTITY	0.93+
one service	QUANTITY	0.93+
pandemic	EVENT	0.92+
code 19	OTHER	0.9+
500 million customers	QUANTITY	0.89+
two great guests	QUANTITY	0.88+
five backup	QUANTITY	0.84+
Bremen	ORGANIZATION	0.84+
about a month ago	DATE	0.83+
lot of people	QUANTITY	0.8+
pandemic post pandemic	EVENT	0.79+
The Cove	ORGANIZATION	0.79+
a minute ago	DATE	0.79+
failover	EVENT	0.78+
a lot of people	QUANTITY	0.78+
80% of your issues	QUANTITY	0.77+
Kovan 19	EVENT	0.76+
pre-	EVENT	0.76+
19	QUANTITY	0.75+
every quarter	QUANTITY	0.75+
failover conference	EVENT	0.75+
Sree Zoar	ORGANIZATION	0.75+
top 5 critical systems	QUANTITY	0.73+
DevOps	TITLE	0.72+
19	DATE	0.7+
one of	QUANTITY	0.7+
