Ashish Palekar & Cami Tavares, AWS | AWS Storage Day 2022
(upbeat music)
>> Okay, we're back covering AWS Storage Day 2022 with Ashish Palekar, who's the general manager of AWS EBS Snapshot and Edge, and Cami Tavares, who's the head of product at Amazon EBS. Thanks for coming back in theCube, guys. Great to see you again.
>> Great to see you as well, Dave.
>> Great to see you, Dave.
>> Ashish, we've been hearing a lot today about companies moving all kinds of applications to the cloud and AWS and using their data in new ways. Resiliency is always top of mind for companies when they think about their workloads generally and the cloud specifically. How should customers think about data resiliency?
>> Yeah, when we think about data resiliency, it's all about making sure that the data your application needs is available when it needs it. It's really the ability for your workload to mitigate disruptions or recover from them. And to build that resilient architecture, you really need to understand what kinds of disruptions your applications can experience, how broad the impact of those disruptions is, and then how quickly you need to recover. And a lot of this is a function of what the application does and how critical it is. And the thing that we constantly tell customers is, this works differently in the cloud than it does in a traditional on-premises environment.
>> What's different about the cloud versus on-prem? Can you explain how it's different?
>> Yeah, let me start with the on-premises one. In the on-premises one, building resilient architectures is really the customer's responsibility, and it's very challenging. You start by thinking about what your single points of failure are. To avoid those, you have to build in redundancy; you might build in replication, as an example, for storage. And doing this means you have to provision more hardware.
And depending on what your availability requirements are, you may even have to start looking for multiple data centers, some in the same region, some in different geographical locations. And you have to ensure that you're fully automated, so that your recovery processes can take place. And as you can see, that's a lot of onus being placed on the customer. One other thing that we hear about is elasticity and how elasticity plays into resiliency for applications. As an example, if you experience a sudden spike in workloads in an on-premises environment, that can lead to resource saturation. And so really you have two choices. One is to throttle the workload and experience resiliency challenges, or your second option becomes buying additional hardware, securing more capacity, and keeping it fallow in case you experience such a spike. And so your two propositions are either experiencing resiliency challenges or paying to have infrastructure that's lying around. And both of those are different when you start thinking about the cloud.
>> Yeah, there's a third option too, which is lose data, which is not an option. Go ahead-
>> Which is not, yeah. Pretty much, as a storage person, that is not an option that we think is reasonable for customers to take. The big contrast in the cloud really comes with how we think about capacity. Fundamentally, the cloud gives you access to capacity, so you are not managing that capacity. The infrastructure complexity and the cost associated with that are also just a function of how infrastructure is built in the cloud. But all of that really starts with the bedrock of how we design for avoiding single points of failure. The best way to explain this is to start thinking about our availability zones.
Typically these availability zones consist of multiple data centers, located in the same regional area to enable high throughput and low latency for applications. But the availability zones themselves are physically independent. They have independent connections to utility power, standalone backup power resources, independent mechanical services, and independent network connectivity. We take availability zone independence extremely seriously, so that when customers are building the availability of their workload, they can architect using these multiple zones. And that is something that, when I'm talking to customers or Cami is talking to customers, we highly encourage customers to keep in mind as they're building resiliency for their applications.
>> Right, so within an availability zone you can have, you know, instantaneous, you know, when you're doing it right, you've captured that data, and you can asynchronously move it outside of that in case there's, it's very low probability but it does happen, some disaster. You're minimizing that RPO. And I don't have to worry about that as a customer and figure out how to do three-site data centers.
>> That's right. Take that even further: now imagine you're expanding globally. All those things that we described, like creating a new footprint, creating a new region, and finding new data centers, as a customer in an on-premises environment you take that on yourself. Whereas with AWS, because of our global presence, you can expand to a region and bring those same operational characteristics to those environments. And so again, bringing resiliency as you're thinking about expanding your workload, that's another benefit that you get from using the availability zone and region architecture that AWS has.
>> And as Charles Phillips, former CEO of Infor, said, "Friends don't let friends build data centers," so I don't have to worry about building the data center.
Let's bring Cami into the discussion here. Cami, think about Elastic Block Store: it gives customers persistent block storage for EC2 instances. So it's foundational for any mission-critical or business-critical application that you're building on AWS. How do you think about data resiliency in EBS specifically? I always ask the question, what happens if something goes wrong? So how should we think about data resiliency in EBS specifically?
>> Yeah, you're right, Dave, block storage is a really foundational piece when we talk to customers about building in the cloud or moving an application to the cloud, and data resiliency is something that comes up all the time. And EBS, you know, is a very large distributed system with many components. And we put a lot of thought and effort into building resiliency into EBS. So we design those components to operate and fail independently. So when customers create an EBS volume, for example, we'll automatically choose the best storage nodes to address the failure domain and the data protection strategy for each of our different volume types. And part of our resiliency strategy also includes separating what we call the volume lifecycle control plane, which covers things like creating a volume or attaching a volume to an EC2 instance. So we separate that control plane from the storage data plane, which includes all the components that are responsible for serving IO to your instance and then persisting it to durable media. So what that means is, once a volume is created and attached to the instance, the operations on that volume are independent from the control plane function. So even in the case of an infrastructure event, like a power issue, for example, you can recreate an EBS volume from a snapshot. And speaking of snapshots, that's the other core pillar of resiliency in EBS. Snapshots are point-in-time copies of EBS volumes that we store in S3. And snapshots are actually a regional service.
And that means internally we use multiple of the availability zones that Ashish was talking about to replicate your data, so that the snapshots can withstand the failure of an availability zone. And so thanks to that availability zone independence, and then this built-in component independence, customers can use that snapshot and recreate an EBS volume in another AZ or even in another region if they need to.
>> Great, so, okay, you touched on some of the things EBS does to build resiliency into the service. Now, thinking about, over your right shoulders, you know, joie de vivre: what can organizations do to build more resilience into their applications on EBS so they can enjoy life without anxiety?
>> (laughs) That is a great question, and also something that we love to talk to customers about. The core thing to think about here is that we don't believe in a one-size-fits-all approach. And so what we are doing in EBS is giving customers different tools so that they can design a resiliency strategy that is custom-tailored for their data. And so to do this resiliency assessment, you have to think about the context of the specific workload and ask questions like: what other critical services depend on this data, what will break if this data's not available, and how long can those systems withstand that, for example. And so the most important step, I'll mention it again, is snapshots; that is a very important step in a recovery plan. Make sure you have a backup of your data. And so we actually recommend that customers take snapshots at least daily. And we have features that make that easier for you. For example, Data Lifecycle Manager, which is a feature that is entirely free. It allows you to create backup policies, and then you can automate the process of creating the snapshot, so it's very low effort.
And then when you want to use that backup to recreate a volume, we have a feature called Fast Snapshot Restore that can expedite the creation of the volume. So if you have a, you know, shorter recovery time objective, you can use that feature to expedite the recovery process. So that's backup. And then the other pillar we talk to customers about is data replication. That's another very important step when you're thinking about your resiliency and your recovery plans. So with EBS, you can use replication tools that work at the level of the operating system. So that's something like DRBD, for example. Or you can use AWS Elastic Disaster Recovery, and that will replicate your data across availability zones or nearby regions too. So we talked about backup and replication, and then the last topic that we recommend customers think about is having a workload monitoring solution in place. And you can do that in EBS using CloudWatch metrics. So you can monitor the health of your EBS volume using those metrics. We have a lot of tips in our documentation on how to measure that performance. And then you can use those performance metrics as triggers for automated recovery workflows that you can build using tools like Auto Scaling groups, for example.
>> Great, thank you for that advice. Just a quick follow-up. So you mentioned your recommendation, at least daily. What kind of granularity? If I want to compress my RPO, can I go to a more granular level?
>> Yes, you can go more granular, and you can use, again, Data Lifecycle Manager to define those policies.
>> Great, thank you. Before we go, I want to just quickly cover what's new with EBS. Ashish, maybe you could talk about it. I understand you've got something new today. You've got an announcement; take us through that.
>> Yeah, thanks for checking in, and I'm so glad you asked. We talked about how snapshots help resilience and are a critical part of building resilient architectures.
So customers like the simplicity of backing up their EC2 instances using multi-volume snapshots. And what they're looking for is the ability to exclude specific volumes from the backup, especially those that don't need backup. So think of applications that have cache data, or applications that have temporary data that really doesn't need backup. So today we are adding a new parameter to the CreateSnapshots API, which creates a crash-consistent set of snapshots for volumes attached to an EC2 instance, where customers can now exclude specific volumes from an instance backup. So customers using Data Lifecycle Manager, which Cami touched on, can automate their backups. And again, they also get to exclude these specific volumes. So really the feature is not just about convenience, but it's also to help customers save on cost, as many of these customers are managing tens of thousands of snapshots. And so we want to make sure they can take it at the granularity that they need it. So super happy to bring that into the hands of customers as well.
>> Yeah, that's a nice option. Okay, Ashish, Cami, thank you so much for coming back in theCube, helping us learn about what's new and what's cool in EBS. Appreciate your time.
>> Thank you for having us, Dave.
>> Thank you for having us, Dave.
>> You're very welcome. Now, if you want to learn more about EBS resilience, stay right here, because coming up we've got a session which is a deep dive on protecting mission-critical workloads with Amazon EBS. Stay right there, you're watching theCube's coverage of AWS Storage Day 2022.
(calm music)
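Editor's illustration (not part of the interview): the volume-exclusion option Ashish describes can be sketched as a small Python helper that builds the request for the EC2 CreateSnapshots call. This is a minimal sketch under stated assumptions: the helper names `build_create_snapshots_request` and `create_instance_backup` and the instance and volume IDs are hypothetical, and the exclusion field is assumed to be `ExcludeDataVolumeIds` inside `InstanceSpecification`, as exposed by boto3's `create_snapshots`.

```python
from typing import Any, Dict, Iterable, List


def build_create_snapshots_request(
    instance_id: str,
    exclude_volume_ids: Iterable[str] = (),
    description: str = "",
) -> Dict[str, Any]:
    """Build keyword arguments for a multi-volume CreateSnapshots request.

    Hypothetical helper: it only assembles a dictionary and does not call AWS.
    """
    instance_spec: Dict[str, Any] = {"InstanceId": instance_id}

    # De-duplicate and sort so the request is deterministic.
    excluded = sorted(set(exclude_volume_ids))
    if excluded:
        # Assumed field name for the data volumes to leave out of the backup,
        # for example volumes holding cache or temporary data.
        instance_spec["ExcludeDataVolumeIds"] = excluded

    request: Dict[str, Any] = {"InstanceSpecification": instance_spec}
    if description:
        request["Description"] = description
    return request


def create_instance_backup(request: Dict[str, Any]) -> List[str]:
    """Send the request with boto3 and return the new snapshot IDs.

    Requires boto3 and AWS credentials; imported lazily so the builder
    above can be used and tested without either.
    """
    import boto3

    ec2 = boto3.client("ec2")
    response = ec2.create_snapshots(**request)
    return [snapshot["SnapshotId"] for snapshot in response["Snapshots"]]
```

A caller would pass the result of `build_create_snapshots_request("i-0123456789abcdef0", ["vol-cache"], "nightly")` to `create_instance_backup`. Keeping the request builder separate from the network call is a design choice made for this sketch, so the exclusion logic can be checked without touching an AWS account.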